Nucleotide sequence
The nucleotide sequence is the order of the nucleotides in a nucleic acid, usually deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). For single strands of DNA or RNA it is given as the sequence of their nucleobases, the base sequence.
In written notation, each nucleobase of a nucleotide is represented by the first letter of its name: A for adenine, G for guanine, T for thymine, U for uracil and C for cytosine. DNA contains the four bases adenine, guanine, thymine and cytosine; RNA contains the four bases adenine, guanine, uracil and cytosine.
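The two alphabets described above can be checked mechanically. The following is a minimal sketch, not a standard tool; the function name `classify` and its return labels are chosen here for illustration only.

```python
# Alphabets taken from the text: DNA uses A, C, G, T; RNA uses A, C, G, U.
DNA_BASES = set("ACGT")
RNA_BASES = set("ACGU")


def classify(sequence):
    """Classify a base sequence by the symbols it contains.

    Returns "DNA", "RNA", "DNA or RNA" (neither T nor U present)
    or "invalid" (empty, or contains symbols outside both alphabets).
    The labels are illustrative, not a standard.
    """
    symbols = set(sequence.upper())
    if not symbols:
        return "invalid"
    is_dna = symbols <= DNA_BASES
    is_rna = symbols <= RNA_BASES
    if is_dna and is_rna:
        return "DNA or RNA"
    if is_dna:
        return "DNA"
    if is_rna:
        return "RNA"
    return "invalid"
```

A sequence that contains both T and U, or any other letter, is reported as invalid, because it fits neither four-letter alphabet.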
By convention, the base sequence is written from the 5′ end to the 3′ end of the strand, the same 5′ → 3′ direction in which a polymerase synthesizes the nucleic acid from nucleotides.
Determination
The nucleotide sequence of a DNA molecule is determined by DNA sequencing. Base sequences of DNA are stored, among other places, in large public sequence databases such as GenBank. RNA is usually not sequenced directly. Instead, it is first copied into DNA (cDNA) using reverse transcriptase, and this cDNA is then sequenced.
Statistical analysis
Represented as a sequence of symbols, base sequences of DNA or RNA are easy to analyze. Statistical studies can, for example, compare the frequencies of so-called n-tuples in the sequence, i.e. the occurrences of subsequences of length n. For instance, the sequence CG, or the tuple (C, G), appears in the human genome significantly less often than any of the other fifteen 2-tuples (CG suppression). Local and regional frequency distributions of different nucleotide sequences can also give first indications of possible functions of certain DNA segments. For example, one searches for clusters of CpG dinucleotides in the DNA strand, the so-called CpG islands, and examines their methylation pattern. In this case, the sequence motif sought consists of two bases.
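Counting n-tuples as described above amounts to sliding a window of length n over the sequence. The sketch below is illustrative only: the function names are hypothetical, and the observed/expected ratio shown is one commonly used measure of CG depletion, added here as an assumption rather than taken from the text.

```python
from collections import Counter


def count_tuples(sequence, n):
    """Count all overlapping subsequences (n-tuples) of length n."""
    seq = sequence.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def cpg_obs_exp(sequence):
    """Observed/expected ratio of the dinucleotide CG.

    Assumption: expected CG count is (#C * #G) / length, so the ratio
    is (#CG * length) / (#C * #G). Values well below 1 indicate that
    CG is under-represented. Returns 0.0 if C or G is absent.
    """
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cg_count = count_tuples(seq, 2)["CG"]
    return cg_count * len(seq) / (c_count * g_count)
```

Applied in a sliding window along a chromosome, a ratio like this is one way to flag candidate CpG islands; real analyses use additional criteria such as window length and GC content.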
Sequence motifs of three bases are searched for when possible start codons or stop codons are to be located. Examples of somewhat longer sequences are the possible binding sites for ribosomes, such as the Shine-Dalgarno sequence in prokaryotes or the Kozak sequence in eukaryotes. Certain nucleotide sequences also play an important role for the terminator of transcription, as well as for the starting point at which an RNA polymerase begins transcription, for example the TATA box in the promoter region of a gene.
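A search for three-base motifs can be sketched as a simple scan over every position of the sequence. The example assumes the standard genetic code written in DNA notation (start codon ATG; stop codons TAA, TAG and TGA), which is not spelled out in the text; the function name `find_codons` is hypothetical.

```python
# Assumption: standard genetic code, DNA notation (T instead of U).
START_CODONS = {"ATG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_codons(sequence, codons):
    """Return (position, codon) for each occurrence of a codon in `codons`.

    Positions are 0-based and every offset is checked, so matches in
    all three reading frames of the given strand are reported.
    """
    seq = sequence.upper()
    return [
        (i, seq[i:i + 3])
        for i in range(len(seq) - 2)
        if seq[i:i + 3] in codons
    ]
```

Because every offset is scanned, a hit is only a candidate: whether a stop codon actually terminates a given reading frame depends on whether it lies in the same frame as the start codon.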
References
- D. A. Benson, K. Clark, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, E. W. Sayers: GenBank. In: Nucleic Acids Research. Volume 43, Database issue, January 2015, pp. D30–D35, doi:10.1093/nar/gku1216, PMID 25414350, PMC 4383990 (free full text).
- F. Ozsolak, P. M. Milos: RNA sequencing: advances, challenges and opportunities. In: Nature Reviews Genetics. Volume 12, No. 2, February 2011, pp. 87–98, doi:10.1038/nrg2934, PMID 21191423, PMC 3031867 (free full text).