Artificial Gene Synthesis

from Wikipedia, the free encyclopedia

Artificial gene synthesis is a method of synthetic biology used to create artificial genes in the laboratory. Based on oligonucleotide synthesis, it differs from molecular cloning and the polymerase chain reaction (PCR) in that the user does not need any pre-existing DNA. It is thus possible to produce a complete, double-stranded DNA molecule (synthetic DNA) with, in principle, no restrictions on sequence or length. The method has been used to make functional bacterial chromosomes containing roughly one million base pairs.

The first synthesis of a complete gene, a yeast tRNA, was accomplished by Har Gobind Khorana and his co-workers in 1972. The first syntheses of peptide- or protein-coding genes were carried out in the laboratories of Herbert Boyer and Alexander Markham.

Commercial gene synthesis services are now offered by numerous companies around the world, some of which have dedicated themselves specifically to this branch of genetics. Current approaches to gene synthesis mostly combine organic chemistry and molecular biology techniques, so that whole genes can be synthesized "de novo", without an existing DNA template. Gene synthesis has become an important tool in many fields of recombinant DNA technology. Synthesizing the required nucleotide sequence is often more economical than traditional cloning or mutagenesis methods.

Gene optimization

As ever longer stretches of DNA can be produced accurately and at ever lower prices, demand for gene synthesis is growing, and more and more attention is being paid to customizing genes. In the early days of genome sequencing, gene synthesis served as an expensive source of cDNAs that had been predicted from genomic or partial cDNA information but were difficult to clone. Once higher-quality sources of cDNA became available, this use was no longer strictly necessary.

Obtaining large amounts of protein from naturally occurring gene sequences, or at least from the protein-coding region of the gene (the open reading frame), can often be difficult. This problem has been the subject of various scientific conferences. Many of the proteins that molecular biologists need are normally regulated so that they are expressed only at very low levels in wild-type cells. By adapting the design of these genes, gene expression can be improved in many cases. Because the genetic code is degenerate and therefore error-tolerant, the open reading frame can be rewritten to a limited extent: up to about a third of the base pairs can be changed while still producing the same protein. The number of possible DNA sequence designs for a given protein is astronomical. For a protein sequence of 300 amino acids, there are over 10^150 codon combinations that would produce an identical protein.

Optimization methods, such as replacing rarely used codons with more common ones, sometimes have a drastic effect. Further optimizations, such as the removal of secondary structures, can also be applied. Finally, in the case of E. coli, protein expression is maximized by predominantly using codons whose matching tRNAs remain charged with their amino acid during amino acid starvation. Computer programs are now used to cope with the complexity of the various simultaneous optimizations. A well-optimized gene can improve protein expression by a factor of 2 to 10, and in some cases improvements by a factor of 100 have been documented. Because of the large number of altered nucleotides, gene synthesis is the only practical way to create the rewritten genes.
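
The size of this design space can be illustrated with a short Python sketch that multiplies, residue by residue, the number of codons available for each amino acid in the standard genetic code. The function names are illustrative and do not come from any published tool; the exact result depends on the amino acid composition of the protein.

```python
import math

# Number of sense codons per amino acid in the standard genetic code
# (61 sense codons in total; the three stop codons are excluded).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_encodings(protein: str) -> int:
    """Return how many distinct coding sequences encode `protein` exactly."""
    total = 1
    for residue in protein.upper():
        total *= CODON_DEGENERACY[residue]
    return total


def log10_encodings(protein: str) -> float:
    """Return log10 of the same count, convenient for long proteins."""
    return sum(math.log10(CODON_DEGENERACY[r]) for r in protein.upper())
```

Calling `log10_encodings` on a one-letter protein sequence gives the order of magnitude directly; sequences rich in leucine, serine or arginine (six codons each) yield larger values than sequences rich in methionine or tryptophan (one codon each).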

Standard Methods

Chemical synthesis of oligonucleotides

Oligonucleotides can be synthesized chemically by coupling nucleoside phosphoramidites to one another in a phosphoramidite synthesis. These building blocks are initially protected, i.e. their amines, hydroxyl groups and phosphate groups carry protective groups that do not react during the oligonucleotide synthesis and are removed afterwards. In each synthesis step, however, the 5'-hydroxyl group at the end of the growing product is deprotected so that the next phosphoramidite can be coupled and a new base added. The chain grows from the 3' end to the 5' end, i.e. in exactly the opposite direction to biosynthesis.

Because these are chemical reactions, the yield of oligonucleotides with the correct sequence decreases as the sequence gets longer: a small probability of error in each synthesis step inevitably compounds over the length of the chain. The technique is therefore best suited to the production of short sequences. The current limit for oligonucleotides of sufficient quality to be used directly in biological applications is about 200 bases. The synthesis product can be purified from incorrect sequences using HPLC.
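
The effect of length on yield can be sketched with a simple model in which every coupling step succeeds independently with the same probability. This is an idealization, and the coupling efficiency of 0.99 used below is an assumed value chosen for illustration, not a figure taken from the sources cited here.

```python
def full_length_fraction(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of chains that received every base.

    Assumes the first nucleoside is already attached to the solid support,
    so an oligonucleotide of n bases needs n - 1 coupling steps, and that
    each step succeeds independently with probability `coupling_efficiency`.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie between 0 and 1")
    return coupling_efficiency ** (length_nt - 1)


# Illustrative comparison across lengths (0.99 is an assumed efficiency).
for length in (20, 50, 100, 200):
    print(length, round(full_length_fraction(length, 0.99), 3))
```

Under this model the full-length fraction falls geometrically with length, which is why long oligonucleotides require purification or are avoided in favour of assembling shorter ones.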

If a large number of different oligonucleotides are synthesized simultaneously on a carrier material (e.g. glass), the product is called a "gene chip" (DNA microarray).

Annealing Oligonucleotides

Usually, a set of individually designed oligonucleotides is produced on automated solid-phase synthesizers, purified, and then joined by specific annealing followed by ligation or a polymerase reaction. To improve the specificity of oligonucleotide annealing, the synthesis step relies on a combination of a thermostable DNA ligase and a polymerase enzyme. Various methods of gene synthesis have been described to date. Examples are the ligation of phosphorylated overlapping oligonucleotides, the FokI method, and a form of the ligase chain reaction adapted for gene synthesis. In addition, several PCR assembly approaches have been described. They usually use overlapping oligonucleotides 40 to 50 bp in length, designed so that together they cover most of the sequence of both strands. The complete molecule is then produced step by step using overlap extension (OE) PCR, thermodynamically balanced inside-out (TBIO) PCR, or combined methods. Synthesized genes are usually 600 to 1,200 bp long, although much longer genes have been generated by joining parts that are each less than 1,000 bp long. At this scale it is necessary to test several candidate clones for each part using automated sequencing methods.
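
The idea of covering both strands with overlapping oligonucleotides can be sketched as follows. This is a minimal illustration, assuming gapless tiling in which each oligonucleotide overlaps its neighbours by half its length and the strands alternate; real design programs also balance melting temperatures and avoid secondary structures. The function names and the default length of 40 are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_overlapping_oligos(gene: str, oligo_len: int = 40) -> list[str]:
    """Tile `gene` with oligos that alternate between the two strands.

    Consecutive oligos are offset by half an oligo length, so each one
    overlaps its neighbours by oligo_len // 2 bases. Even-numbered oligos
    are taken from the top strand, odd-numbered ones from the bottom
    strand (written 5' to 3'). All oligos are returned 5' to 3'.
    """
    if not gene:
        raise ValueError("gene must not be empty")
    gene = gene.upper()
    step = oligo_len // 2
    oligos = []
    # Stop early enough that the last oligo still overlaps its predecessor.
    for index, start in enumerate(range(0, max(len(gene) - step, 1), step)):
        segment = gene[start:start + oligo_len]
        oligos.append(segment if index % 2 == 0 else reverse_complement(segment))
    return oligos
```

For an 80-base input and the default settings this yields three oligonucleotides: two top-strand pieces that together span the whole sequence and one bottom-strand piece that bridges them.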

Limitations

Because the generation of the complete gene depends on the efficient and precise alignment of long, single-stranded oligonucleotides, several parameters are critical for the success of the synthesis: extended sequence regions with secondary structures caused by inverted repeats, exceptionally high or low GC content, and repetitive structures. Such segments of a gene can usually only be created by breaking the gene down into several smaller parts and then joining the individual parts together. This in turn leads to a significant increase in time and effort.

The result of a gene synthesis depends strongly on the quality of the oligonucleotides used. In annealing-based procedures, the quality of the oligonucleotides has a direct and exponential effect on the correctness of the product. Alternatively, if oligonucleotides of lower quality are assembled, more effort must be spent afterwards on ensuring the quality of the gene. This is usually done by standard cloning with subsequent transformation and analysis of the clones by sequencing, which is a time-consuming process.

Another problem that arises with the usual gene synthesis methods is the frequent occurrence of sequence errors due to the use of chemically synthesized oligonucleotides. As a result, the percentage of correct products drops sharply with an increasing number of oligonucleotides used.
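
The sharp drop in correct products can be made concrete with a simple probabilistic sketch. It assumes, purely for illustration, that errors occur independently at every position of the assembled gene with one uniform per-base error rate; the function names and any error rate passed in are hypothetical, not measured values.

```python
import math


def fraction_error_free(gene_length_bp: int, per_base_error_rate: float) -> float:
    """Probability that an assembled gene contains no error at any position,
    assuming independent errors with a uniform per-base rate."""
    return (1.0 - per_base_error_rate) ** gene_length_bp


def clones_to_screen(gene_length_bp: int, per_base_error_rate: float,
                     confidence: float = 0.95) -> int:
    """Smallest number of clones to sequence so that at least one is
    error-free with probability `confidence`."""
    p_correct = fraction_error_free(gene_length_bp, per_base_error_rate)
    if p_correct <= 0.0:
        raise ValueError("no error-free clone can be expected")
    if p_correct >= 1.0:
        return 1
    # Solve 1 - (1 - p_correct)**n >= confidence for the integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_correct))
```

Because the error-free fraction is an exponential function of the total number of synthesized bases, doubling the gene length squares that fraction, and the number of clones that must be sequenced grows accordingly.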

The mutation problem can be reduced by using shorter oligonucleotides as building blocks of the gene. However, all annealing-based assembly methods require the primers to be mixed together in a single tube. As a result, short overlaps do not always allow the complementary primers to anneal precisely and specifically, which in turn impairs the formation of the complete gene.

Designing oligonucleotides manually is laborious and does not guarantee successful synthesis of the desired gene. For almost all annealing-based methods to work optimally, the melting temperatures of the overlapping regions must be similar for all oligonucleotides. The necessary primer optimization should be carried out with specialized oligonucleotide design programs. Several solutions for automated primer design for gene synthesis have been published.
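
The requirement of similar melting temperatures can be checked with a rough estimate. The sketch below uses the Wallace rule (2 °C per A or T, 4 °C per G or C), a crude approximation intended for short oligonucleotides; design programs typically use more accurate thermodynamic models, so this should be read as an illustration only. The function names are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Estimate the melting temperature in degrees Celsius with the
    Wallace rule: 2 * (A + T) + 4 * (G + C)."""
    seq = seq.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def tm_spread(overlaps: list[str]) -> int:
    """Difference between the highest and lowest estimated melting
    temperature among the overlap regions; smaller is better."""
    if not overlaps:
        raise ValueError("overlaps must not be empty")
    temperatures = [wallace_tm(overlap) for overlap in overlaps]
    return max(temperatures) - min(temperatures)
```

A design with a large spread can be adjusted by lengthening AT-rich overlaps or shortening GC-rich ones until the estimates converge.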

Error correcting procedures

Parallel sequencing of large oligonucleotide libraries is used as a means of finding suitable molecules. In one method, the oligonucleotides are sequenced on a 454 pyrosequencing platform, and a robotic system maps the individual beads and picks those that carry the correct sequence.

There is also increasing demand for entire sets of genes with sequences that are similar to one another or with different sequences that have only a few base pair differences. Almost all of the therapeutic proteins under development, such as monoclonal antibodies, are optimized for improved function or expression by testing numerous gene variants.

References

  1. HG Khorana, KL Agarwal, H. Büchi, MH Caruthers, NK Gupta, K. Kleppe, A. Kumar, E. Otsuka, UL RajBhandary, JH Van de Sande, V. Sgaramella, T. Terao, H. Weber, T. Yamada: Studies on polynucleotides. 103. Total synthesis of the structural gene for an alanine transfer ribonucleic acid from yeast. In: Journal of Molecular Biology. Volume 72, Number 2, December 1972, ISSN 0022-2836, pp. 209-217, doi:10.1016/0022-2836(72)90146-5, PMID 4571075.
  2. K. Itakura, T. Hirose, R. Crea, A. Riggs, H. Heyneker, F. Bolivar, H. Boyer: Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin. In: Science. 198, 1977, pp. 1056-1063, doi:10.1126/science.412251, PMID 412251.
  3. MD Edge, AR Green, GR Heathcliffe, PA Meacock, W. Schuch, DB Scanlon, TC Atkinson, CR Newton, AF Markham: Total synthesis of a human leukocyte interferon gene. In: Nature. Volume 292, Number 5825, August 1981, ISSN 0028-0836, pp. 756-762, doi:10.1038/292756a0, PMID 6167861.
  4. DNA 2.0, for example, was founded in Menlo Park in 2003 as a "synthetic genomics company" (quoted page; archived from the original on August 7, 2012, in the Internet Archive).
  5. Difficult to Express Proteins. In: Sixth Annual PEGS Summit. Cambridge Healthtech Institute. 2010. Archived from the original on May 11, 2010. Retrieved May 11, 2010.
  6. Kathy Liszewski: New Tools Facilitate Protein Expression. In: Genetic Engineering & Biotechnology News, Mary Ann Liebert, May 1, 2010, pp. 1, 40-41. Archived from the original on May 9, 2010. Retrieved May 11, 2010.
  7. M. Welch, S. Govindarajan, JE Ness, A. Villalobos, A. Gurney, J. Minshull, C. Gustafsson: Design parameters to control synthetic gene expression in Escherichia coli. In: PLoS ONE. Volume 4, Number 9, 2009, ISSN 1932-6203, p. e7002, doi:10.1371/journal.pone.0007002, PMID 19759823, PMC 2736378 (free full text).
  8. Protein Expression. DNA2.0. Retrieved May 11, 2010.
  9. Fuhrmann M, Oertel W, Hegemann P: A synthetic gene coding for the green fluorescent protein (GFP) is a versatile reporter in Chlamydomonas reinhardtii. In: Plant J. 19, No. 3, August 1999, pp. 353-361, doi:10.1046/j.1365-313X.1999.00526.x, PMID 10476082.
  10. Mandecki W, Bolling TJ: FokI method of gene synthesis. In: Gene. 68, No. 1, August 1988, pp. 101-107, doi:10.1016/0378-1119(88)90603-8, PMID 3265397.
  11. Stemmer WP, Crameri A, Ha KD, Brennan TM, Heyneker HL: Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides. In: Gene. 164, No. 1, October 1995, pp. 49-53, doi:10.1016/0378-1119(95)00511-4, PMID 7590320.
  12. Gao X, Yao P, Keith A, Ragan TJ, Harris TK: Thermodynamically balanced inside-out (TBIO) PCR-based gene synthesis: a novel method of primer design for high-fidelity assembly of longer gene sequences. In: Nucleic Acids Res. 31, No. 22, November 2003, p. e143, doi:10.1093/nar/gng143, PMID 14602936, PMC 275580 (free full text).
  13. Young L, Dong Q: Two-step total gene synthesis method. In: Nucleic Acids Res. 32, No. 7, 2004, p. e59, doi:10.1093/nar/gnh058, PMID 15087491, PMC 407838 (free full text).
  14. Hillson NJ, Rosengarten RD, Keasling JD: j5 DNA Assembly Design Automation Software. In: ACS Synthetic Biology. 1, No. 1, 2012, pp. 14-21, doi:10.1021/sb2000116.
  15. Matzas M et al.: High-fidelity gene synthesis by retrieval of sequence-verified DNA identified using high-throughput pyrosequencing. In: Nature Biotechnology. 28, 2010, pp. 1291-1294, doi:10.1038/nbt.1710, PMID 21113166.