Protein sequencing


Protein sequencing refers to biochemical methods for determining the amino acid sequence of proteins or peptides. As a form of protein characterization, it is of central importance in proteomics.

De novo sequencing

Sequencing is called de novo when no database information is used or available. In mass spectrometric protein sequencing, the peptides are subjected to an additional fragmentation step and the fragments are separated in a reflectron.
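The idea behind reading a sequence without a database can be sketched as follows: consecutive fragment ions of the same series differ by the mass of exactly one amino acid residue, so each mass difference can be looked up in a table of residue masses. The function name `read_ladder`, the tolerance and the example ion list are illustrative assumptions; the residue masses are the usual monoisotopic values in Da.

```python
# Monoisotopic residue masses in Da (standard values; L and I are isobaric).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(ion_masses, tol=0.01):
    """Map each mass difference between neighbouring ions to residue(s).

    ion_masses: ascending masses of one singly charged ion series.
    Returns one entry per difference; ambiguous hits are joined with "/",
    unexplained differences are reported as "?".
    """
    residues = []
    for lighter, heavier in zip(ion_masses, ion_masses[1:]):
        delta = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append("/".join(matches) if matches else "?")
    return residues


# Hypothetical ladder of four consecutive ions of one series.
print(read_ladder([58.02874, 129.06585, 216.09788, 329.18194]))
```

The last step of the example cannot be resolved to a single residue because leucine and isoleucine have the same mass. The first residue is not covered here, since only differences between neighbouring ions are evaluated.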

Edman degradation

Edman degradation is the classic method for sequencing peptides and proteins, but it is rarely used today. In each reaction cycle of this cyclical process, the N-terminal amino acid is derivatized, cleaved off and identified by HPLC as its phenylthiohydantoin derivative.
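The cyclical character of the method can be illustrated with a purely conceptual simulation in which every cycle removes and reports the current N-terminal residue. This is a sketch of the logic only, not of the chemistry; the function name `edman_cycles` and the example peptide are assumptions made for illustration.

```python
def edman_cycles(peptide, n_cycles):
    """Simulate n_cycles of stepwise N-terminal removal.

    Returns a list of (cycle number, identified residue, remaining peptide).
    Stops early when the peptide is used up.
    """
    results = []
    remaining = peptide
    for cycle in range(1, n_cycles + 1):
        if not remaining:
            break
        # The N-terminal residue is split off and identified in this cycle.
        residue, remaining = remaining[0], remaining[1:]
        results.append((cycle, residue, remaining))
    return results


for cycle, residue, rest in edman_cycles("MKVL", 3):
    print(cycle, residue, rest)
```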

Today, proteins are usually sequenced from mass spectrometric fragment spectra instead. The main advantages of mass spectrometric methods over Edman degradation are a lower sample requirement and a significantly shorter analysis time. In addition, the sample may be less pure, and N-terminally blocked peptides (e.g. by acetylation or formylation) can also be analyzed.

Schlack-Kumpf degradation

In Schlack-Kumpf degradation, the protein is sequenced from the C-terminus.

Nanopore sequencing

When individual linearized proteins pass through a nanopore, the amino acid sequence can be determined from the changes in electrical conductivity at the nanopore. The method is a further development of nanopore DNA sequencing.

Protein identification

Indirect sequencing

To determine the amino acid sequence of a protein indirectly, the gene sequence, taken either from DNA sequencing or from a database of sequenced genomes (searched, for example, with tBLAST), is translated in silico into a protein sequence. The gene sequence of a protein can also be obtained through molecular display. In addition, mass spectrometric methods and Edman degradation (used mainly in the past) are suitable for the direct identification of a protein.
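A minimal sketch of the in silico translation step, assuming the standard genetic code, a coding strand given in the correct reading frame, and translation that ends at the first stop codon. The function name `translate` and the example sequence are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA sequence (frame 0) into a protein sequence.

    Translation stops at the first stop codon; a trailing incomplete
    codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))  # MAK
```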

Mass spectrometric protein analysis

The most important method for the direct sequencing of proteins is mass spectrometry using the peptide mass fingerprint. Its importance has grown rapidly in recent years, in part thanks to improvements in computer technology. In principle, proteins of any size can be sequenced by mass spectrometry, but computing the sequence becomes increasingly difficult as the size increases. In addition, peptides are easier to handle than intact proteins because of their better solubility. The protein is therefore first digested with an endoprotease, and the resulting peptide mixture is separated by HPLC.
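The digestion step and the resulting list of peptide masses can be sketched in silico. The text only speaks of "an endoprotease"; the example below assumes trypsin with the commonly used simplified rule (cleavage after K or R, but not before P). Function names and the example protein are hypothetical; the residue masses are standard monoisotopic values in Da.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def digest_trypsin(protein):
    """Cleave after K or R unless the next residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


for pep in digest_trypsin("MAKRPLGRSV"):
    print(pep, round(peptide_mass(pep), 4))
```

The list of peptide masses obtained in this way is what a peptide mass fingerprint compares against masses predicted from database sequences.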

The peptides must first be ionized in the mass spectrometer. One of the two commonly used ionization methods is electrospray ionization; the other is matrix-assisted laser desorption/ionization (MALDI). In electrospray ionization, the solution is sprayed into the mass spectrometer through a thin nozzle held at a high positive potential. The droplets disintegrate in the vacuum until only individual ions remain. The peptides then fragment, and the mass-to-charge ratio of the fragments is measured. The resulting fragment spectrum is analyzed with software and compared with databases. The process is usually repeated with the protein digested in a different way, so that the original protein can be reconstructed from the overlapping fragments.
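Because the instrument measures a mass-to-charge ratio rather than a mass, a peptide that picks up several protons in positive-mode ionization appears at several m/z values. A small sketch of the conversion, assuming protonation as the only charge carrier; the function names and the example mass are illustrative.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` additional protons."""
    return (neutral_mass + charge * PROTON) / charge


def neutral_mass(mz_value, charge):
    """Inverse conversion: neutral mass from an observed m/z and charge."""
    return mz_value * charge - charge * PROTON


# A hypothetical peptide of 1000 Da observed at charge states 1 to 3.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```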

Analysis of the fragment spectra is helped by the fact that fragmentation occurs primarily at the peptide bonds (b and y fragments in the following figure):

[Figure: scheme of the fragment ions of a peptide]

The figure shows the different ways in which fragments can form from a (protonated) peptide consisting of n amino acids. The nomenclature of the fragments, e.g. b1 fragment or y4 fragment (= y(n-1) fragment for a peptide of n = 5 amino acids), goes back to proposals by Roepstorff, Fohlman and Biemann. The fragments of the N-terminal series are denoted a, b and c, those of the C-terminal series x, y and z. The index gives the number of amino acids the fragment contains. The less frequent fragments arising from the amino acid side chains are denoted d, v and w. These side-chain fragments can make it possible to distinguish between leucine and isoleucine in the chain.
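The masses of the b and y fragments described above can be computed directly from the sequence. The sketch below assumes singly protonated fragments and the usual conventions (b ion: sum of the N-terminal residues plus one proton; y ion: sum of the C-terminal residues plus water plus one proton). The function name `fragment_ions` is illustrative; the masses are standard monoisotopic values in Da.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728
WATER = 18.01056


def fragment_ions(peptide):
    """Return (b, y): lists of singly charged b_i and y_i m/z values.

    Index i runs from 1 to n-1, i.e. b[0] is b1 and y[0] is y1.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


b, y = fragment_ions("GAS")
print([round(m, 4) for m in b])  # b1, b2
print([round(m, 4) for m in y])  # y1, y2
```

Calling `fragment_ions("GL")` and `fragment_ions("GI")` gives identical values, which is why b and y ions alone cannot separate leucine from isoleucine and the side-chain fragments are needed for that purpose.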

References

  1. Pehr Edman, Geoffrey Begg: A protein sequenator. In: European Journal of Biochemistry. 1967, pp. 80-91. doi:10.1111/j.1432-1033.1967.tb00047.x.
  2. M. Wilm, A. Shevchenko, T. Houthaeve, S. Breit, L. Schweigerer, T. Fotsis, M. Mann: Femtomole sequencing of proteins from polyacrylamide gels by nano-electrospray mass spectrometry. In: Nature. 379, No. 6564, 1996, pp. 466-469. doi:10.1038/379466a0.
  3. Y. Yang, R. Liu, H. Xie, Y. Hui, R. Jiao, Y. Gong, Y. Zhang: Advances in nanopore sequencing technology. In: Journal of Nanoscience and Nanotechnology. Volume 13, Number 7, July 2013, ISSN 1533-4880, pp. 4521-4538, PMID 23901471.
  4. Joshua J. Coon: Collisions or Electrons? Protein Sequence Analysis in the 21st Century. In: Anal. Chem. 81, No. 9, April 13, 2009, pp. 3208-3215. doi:10.1021/ac802330b.
  5. P. Roepstorff, Fohlman: Proposal for a common nomenclature for sequence ions in mass spectra of peptides. In: Biomedical Mass Spectrometry. 11, No. 11, 1984, p. 601. doi:10.1002/bms.1200111109.
  6. K. Biemann: Mass spectrometry of peptides and proteins. In: Annu Rev Biochem. 61, 1992, pp. 977-1010.
  7. Christian Schmelzer: Mass spectrometric characterization of protein hydrolysates: Digestion studies on beta-casein and structural studies on elastin. 2007 (uni-halle.de [PDF]).
