This research was supported by a grant from the KRIBB Research Initiative Program and in part by the Korean HapMap Project of MOST (Ministry of Science & Technology).
Functional Element SNPs Database

FESD is a web-based integrated database for selecting sets of SNPs in putative functional elements of human genes.

It provides sets of SNPs located in 10 different functional elements: promoter regions, CpG islands, 5'UTRs (untranslated regions), translation start sites, splice sites, coding exons, introns, translation stop sites, polyadenylation signals (PASes), and 3'UTRs.

Promoter
A promoter is a regulatory region of DNA located upstream (towards the 5' region) of a gene, providing a control point for regulated gene transcription. The promoter contains specific DNA sequences that are recognized by proteins known as transcription factors. These factors bind to the promoter sequences, recruiting RNA polymerase, the enzyme that synthesizes the RNA from the coding region of the gene. Promoters represent critical elements that can work in concert with other regulatory regions (enhancers, silencers, boundary elements/insulators) to direct the level of transcription of a given gene.

CpG islands
The usual formal definition of a CpG island is a region of at least 200 bp with a GC percentage greater than 50% and an observed/expected CpG ratio greater than 0.6. The majority of these islands are associated with genes, and they can be used as recognition sites for restriction enzymes. The length of a CpG island is typically 300-3000 base pairs. These regions are characterized by a CpG dinucleotide content equal to or greater than what would be statistically expected (about 6%), whereas the rest of the genome has a much lower CpG frequency (about 1%), a phenomenon called CG suppression. Unlike CpG sites in the coding region of a gene, the CpG sites in the CpG islands of promoters are in most instances unmethylated if the genes are expressed. This observation led to the speculation that methylation of CpG sites in the promoter of a gene may inhibit its expression. Methylation, alongside histone modifications, is central to imprinting.
CpG islands are typically common near transcription start sites, and may be associated with promoter regions. Normally a C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the Cs in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over evolutionary time methylated Cs tend to turn into Ts because of spontaneous deamination. The result is that CpGs are relatively rare unless there is selective pressure to keep them or a region is not methylated for some reason, perhaps having to do with the regulation of gene expression. CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole.
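The three criteria given above (length, GC percentage, observed/expected CpG ratio) can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the expected CpG count is computed with the commonly used formula (number of C × number of G) / length, which is an assumption because the text does not state how the expected value is obtained.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction and observed/expected CpG ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    observed_cpg = seq.count("CG")  # a C immediately followed by a G
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed formula for the expected CpG count: (#C * #G) / length
    expected_cpg = (c * g) / n if n else 0.0
    obs_exp = observed_cpg / expected_cpg if expected_cpg else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the three thresholds from the formal definition to one region."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_fraction"] > min_gc
            and s["obs_exp"] > min_ratio)
```

In practice such a test is usually run over sliding windows of a longer sequence; the sketch only evaluates a single candidate region.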

5'-UTR
The five prime untranslated region (5' UTR), also known as the leader sequence, is a particular section of messenger RNA (mRNA) and the DNA that codes for it. It starts at the +1 position (where transcription begins) and ends just before the start codon (usually AUG) of the coding region. It usually contains a ribosome binding site (RBS), in bacteria also known as the Shine-Dalgarno sequence (AGGAGGU). The 5' UTR may be a hundred or more nucleotides long, and the 3' UTR may be even longer (up to several kilobases in length). The 5' UTR may also contain binding sites for proteins that affect the mRNA's stability or translation, for example iron responsive elements, which occur in the 5' UTRs (and 3' UTRs) of a small number of eukaryotic mRNAs and regulate gene expression in response to iron.

3'-UTR
The three prime untranslated region (3' UTR) is a particular section of messenger RNA (mRNA). It follows the coding region. In human mRNA, the median length of the 3' UTR is approximately 700 nucleotides. Several regulatory sequences are found in the 3' UTR:
-A polyadenylation signal, usually AAUAAA, or a slight variant. This marks the site of cleavage of the transcript approximately 30 bases past the signal, followed by the addition of several hundred adenosine residues (poly-A tail).
-Binding sites for proteins that may affect the mRNA's stability or location in the cell, like SECIS elements (which direct the ribosome to translate the codon UGA as selenocysteine rather than as a stop codon), or AU rich elements (AREs), stretches consisting mainly of adenine and uridine nucleotides (which can either stabilize or destabilize the mRNA depending on the protein bound to it).
-Binding sites for miRNAs, small RNAs that act in RNA interference (RNAi).

Start/stop codons
Translation starts with a chain initiation codon (start codon). Unlike a stop codon, a start codon alone is not sufficient to begin the process. Nearby sequences and initiation factors are also required to start translation. The standard start codon is AUG, which codes for methionine, so amino acid chains normally start with methionine.
The three stop codons have been given names: UAG is amber, UGA is opal (sometimes also called umber), and UAA is ochre. "Amber" was named by discoverers Richard Epstein and Charles Steinberg after their friend Harris Bernstein, whose last name means "amber" in German. The other two stop codons were named "ochre" and "opal" in order to keep the "color names" theme. Stop codons are also called termination codons, and they signal release of the nascent polypeptide from the ribosome due to binding of release factors in the absence of cognate tRNAs with anticodons complementary to these stop signals.
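As a rough illustration of how the start and stop codons delimit a coding region, the sketch below takes the first AUG in an mRNA string and reads codon by codon until the first in-frame stop codon. It is deliberately simplified: as noted above, a real start site also depends on nearby sequences and initiation factors, which this sketch ignores. The function name `find_orf` is an illustrative assumption.

```python
START_CODON = "AUG"
# Stop codons and the names given in the text above
STOP_CODONS = {"UAG": "amber", "UGA": "opal", "UAA": "ochre"}


def find_orf(mrna: str):
    """Return (start, end, stop_name) for the first AUG and its first in-frame stop.

    Coordinates are 0-based and half-open; end includes the stop codon.
    Returns None if there is no AUG or no in-frame stop codon.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    start = mrna.find(START_CODON)
    if start == -1:
        return None
    # Step through complete codons in the reading frame set by the start codon
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return start, i + 3, STOP_CODONS[codon]
    return None
```

For example, `find_orf("GGAUGAAAUAGCC")` reads AUG, AAA, UAG and reports an amber stop.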

Splice sites
In genetics, splicing is a modification of genetic information after transcription, in which the introns of precursor messenger RNA (pre-mRNA) are removed and its exons are joined. Because introns are rare in prokaryotic genomes, splicing of this kind occurs essentially only in eukaryotes. Splicing prepares the pre-mRNA to produce the mature messenger RNA (mRNA), which then undergoes translation as part of protein synthesis. Splicing includes a series of biochemical reactions, which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs).
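The effect of splicing on a sequence can be sketched as plain string manipulation: given the exon coordinates of a pre-mRNA, the introns are whatever lies between consecutive exons, and the mature mRNA is the exons joined in order. The functions and the 0-based, half-open coordinate convention below are assumptions chosen for illustration; they say nothing about how the spliceosome recognizes splice sites.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exons (0-based, half-open intervals) of a pre-mRNA in order."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


def introns_between(exons: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the intervals lying between consecutive exons, i.e. the introns."""
    ordered = sorted(exons)
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]
```

With a pre-mRNA made of a 6-base exon, a 12-base intron and a second 6-base exon, `splice` returns the 12 exonic bases and `introns_between` returns the single interval (6, 18).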

Intron
Introns are sections of DNA colinear to the mRNA sequence that will be spliced out after transcription, but before the mRNA is translated. Introns are common in eukaryotic RNAs of all types, but are found in prokaryotic tRNA and rRNA genes only. The regions of a gene that remain in spliced mRNA are called exons. The number and length of introns vary widely among species and among genes within the same species. For example, the pufferfish Takifugu rubripes has little intronic DNA. Genes in mammals and flowering plants, on the other hand, often have numerous introns, which can be much longer than the nearby exons.
Introns sometimes allow for alternative splicing of a gene, so that several different proteins that share some sections in common can be produced from a single gene. The control of mRNA splicing, and hence of which alternative is produced, is performed by a wide variety of signal molecules. Introns also sometimes contain "old code," sections of a gene that were probably once translated into protein but which are now discarded.
It was generally assumed that the sequence in any given intron is junk DNA with no function. More recently, however, this assumption has been questioned: introns are known to contain several short sequences that are important for efficient splicing. The exact mechanism of these intronic splicing enhancers is not well understood, but it is thought that they serve as binding sites on the transcript for proteins that stabilize the spliceosome. It is also possible that RNA secondary structure formed by intronic sequences has an effect on splicing. In alternative splicing, moreover, a sequence that is exonic in one product is intronic in another. "Old code" sequences, on the other hand, in most cases indeed seem to be "evolutionary kipple".

Exon
An exon is any region of DNA within a gene that is transcribed to the final messenger RNA (mRNA) molecule, rather than being spliced out from the transcribed RNA molecule. Exons of many eukaryotic genes interleave with segments of non-coding DNA (introns). In many genes, each exon contains part of the open reading frame (ORF) that codes for a specific portion of the complete protein. However, the term exon is often misused to refer only to coding sequences for the final protein. This is incorrect, since many noncoding exons are known in human genes.

Polyadenylation signal sites
Polyadenylation is the process required for the synthesis of messenger RNA (mRNA) in which an endonucleolytic RNA cleavage is coupled with synthesis of polyadenosine [poly(A)] on the newly formed 3' end. The elements required for polyadenylation are the polyadenylation signal and the polyadenylation site.
Polyadenylation is the covalent linkage of a poly(A) tail to a messenger RNA (mRNA) molecule. It is part of the route to producing mature messenger RNA for translation, in the larger process of protein synthesis to produce proteins. In eukaryotic organisms, most messenger RNA molecules end with a poly-A stretch at their 3' ends. The polyadenosine (poly-A) tail protects the mRNA molecule from exonucleases and is important for transcription termination, for export of the mRNA from the nucleus, and for translation. Some prokaryotic mRNAs also are polyadenylated, although the polyadenosine tail's function is different from that in eukaryotes.
Polyadenylation occurs after transcription of DNA into RNA in the nucleus. After the polyadenylation signal has been transcribed, the mRNA chain is cleaved through the action of an endonuclease complex associated with RNA polymerase. The cleavage site is marked by the presence of the base sequence AAUAAA nearby. After the mRNA has been cleaved, 50 to 250 adenosine residues are added to the free 3' end at the cleavage site. This reaction is catalyzed by polyadenylate polymerase.
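The cleavage and tailing steps can be mimicked on a sequence string. The sketch below looks for the AAUAAA signal, cuts the transcript a fixed distance past it, and appends a run of adenosines. The 30-base offset follows the approximate figure given in the 3'-UTR section. Three things are simplifying assumptions made only for illustration: the function name, the default tail length of 200, and the choice of the last AAUAAA in the transcript when several are present.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(transcript: str, offset: int = 30,
                             tail_length: int = 200):
    """Cut the transcript `offset` bases past the signal and add a poly(A) tail.

    Returns the processed RNA, or None if no AAUAAA signal is found.
    """
    rna = transcript.upper().replace("T", "U")  # accept DNA-style input as well
    # Assumption: if several signals occur, the last one is used
    signal_pos = rna.rfind(POLY_A_SIGNAL)
    if signal_pos == -1:
        return None
    # Cleavage site, capped at the end of the transcript
    cleavage_site = min(signal_pos + len(POLY_A_SIGNAL) + offset, len(rna))
    return rna[:cleavage_site] + "A" * tail_length
```

Real cleavage positions vary around the quoted distance, and variant signals exist, so the output should be read as a schematic of the process rather than a prediction.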

In addition, users can get the flanking sequences of SNPs for genotyping experiments through our web interface.
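For readers who want to reproduce this kind of output offline, extracting a flanking sequence amounts to slicing a reference sequence on either side of the SNP position. The sketch below is not FESD's implementation: the function name, the 0-based coordinate, the default flank length of 25 and the `LEFT[A/G]RIGHT` output notation are all assumptions chosen for illustration.

```python
def flanking_sequence(reference: str, snp_pos: int,
                      alleles: tuple[str, ...], flank: int = 25) -> str:
    """Return the SNP with up to `flank` bases of context on each side.

    `snp_pos` is a 0-based index into `reference`. The flanks are shortened
    automatically when the SNP lies near either end of the sequence.
    """
    if not 0 <= snp_pos < len(reference):
        raise ValueError("snp_pos is outside the reference sequence")
    left = reference[max(0, snp_pos - flank):snp_pos]
    right = reference[snp_pos + 1:snp_pos + 1 + flank]
    return f"{left}[{'/'.join(alleles)}]{right}"
```

For example, `flanking_sequence("ACGTACGTAC", 4, ("A", "G"), flank=3)` gives `CGT[A/G]CGT`.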


Source Data
Publication
Hyo Jin Kang, Kyoung Oak Choi, Byung-Dong Kim, Sangsoo Kim and Young Joo Kim. FESD: a Functional Element SNPs Database in human, Nucleic Acids Research, 2005, Vol. 33, Database issue D518-D522
Medical Genomics Research Center, Korea Research Institute of Bioscience and Biotechnology
#52 Eoeun-dong, Yuseong-gu, Daejeon 305-333, Korea
Tel : 82-42-879-8127  |  Fax : 82-42-879-8119  | E-mail : yjkim8@kribb.re.kr
©2007 Korea Research Institute of Bioscience and Biotechnology. All rights reserved.