Genetic variation (human)

from Wikipedia, the free encyclopedia

One of the best-known results of the Human Genome Project is that humans, whether closely related or from different regions of the world, have around 99.9 percent of their genetic makeup in common. Even with the chimpanzees, the closest relatives of humans, the commonality is probably more than 98.5 percent. Because of the enormous size of the genome (estimated at around 3 billion base pairs), the remaining variable portion is still considerable: as a rough estimate, it corresponds to about one heterozygous position per 1,300 base pairs. As far as is known today, the human genome contains more than 10 million polymorphisms that occur in more than 1 percent of the total population, and new ones are constantly being discovered. Two randomly selected people who are not closely related thus differ in millions of base pairs; each of us is estimated to differ from a randomly selected other person in about four million base pairs.
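The orders of magnitude in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (3 billion base pairs, 99.9 percent identity, one heterozygous position per 1,300 base pairs); the variable names are purely illustrative.

```python
# Figures taken from the paragraph above; the names are illustrative only.
GENOME_SIZE_BP = 3_000_000_000   # estimated genome size in base pairs
BP_PER_HET_SITE = 1_300          # about one heterozygous position per 1,300 bp

# 99.9 percent shared means 0.1 percent (one base pair in a thousand) differs.
differing_bp = GENOME_SIZE_BP // 1000

# Number of heterozygous positions implied by the 1-in-1,300 estimate.
het_sites = GENOME_SIZE_BP // BP_PER_HET_SITE

print(f"{differing_bp:,} differing base pairs, {het_sites:,} heterozygous sites")
```

Both estimates land in the range of a few million positions, which matches the "millions of base pairs" stated in the text.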

This genetic variation is responsible for the hereditary portion of the total phenotypic variation between different people; it thus affects traits and trait complexes such as body size, skin color, susceptibility to various diseases and possibly also psychological factors.

Background

The human genome comprises around 21,000 protein-coding genes, which correspond to around 1.5 percent of the genome. However, at least about three (up to eight) percent of the base pairs are subject to negative (purifying) selection, which means that mutations persist here less often than would be expected by chance. The proportion of functionally significant sequence outside the protein-coding sections is therefore larger than that of the protein-coding genes themselves. Such significant non-coding DNA segments serve mainly in gene regulation; their sequences represent, in particular, so-called cis elements or are transcribed into regulatory RNA. Another large portion of the genome is occupied by mobile genetic elements (transposons). Contrary to what was previously thought, these are not all just genetic "junk"; some of them take on regulatory tasks and play a role in evolutionary innovations in gene expression. As far as is known today, the rest of the genome has no comparable function; in many cases it consists of short sequences repeated over and over (repetitive DNA), probably without any information content.

Variation produced by mutations in protein-coding genes and in all regulatory components contributes to the variation in human traits, not, as was long assumed, variation in the coding genes alone. Genetic variation (polymorphisms) in the remaining sections presumably has no particular effect as a rule. Studying it can nevertheless be useful, for example for the analysis of family relationships (genetic genealogy). The technique of so-called genetic fingerprinting is likewise based on variation in non-coding sections that are not subject to selection. Variations in repetitive DNA, so-called microsatellites, are also used for these methods. They are not considered further below.

It has also been known for less than 20 years that, under certain circumstances, hereditary variation outside the DNA sequence has to be taken into account in the expression of traits; this is called epigenetics.

All differences between individuals that are neither genetic nor epigenetic in nature must therefore be explained by environmental influences. The part of the range of variation caused by them is called environmental variation.

Types of genetic variation

Variation in the human genome affects different gene loci to different degrees. Some sections never vary, presumably because any resulting variants almost always have lethal effects. These are referred to as "conserved" genes or gene segments. A few regions are highly variable between individuals.

Inherited variants must be distinguished from those that arise "de novo" through spontaneous mutation in an individual; the latter can affect germ cells or cells of the remaining body tissue. The essential distinction here is between hereditary variation anchored in the germline and non-inherited (somatic) variation newly arising in body tissue. Somatic variants can contribute to the development of numerous diseases, for example cancer, but they are not passed on to future generations.

SNPs

SNP: DNA molecule 1 differs from DNA molecule 2 in a single base pair position (C / T polymorphism).

The most common and best understood variations of the human genome involve the exchange of a single base, known as a single nucleotide polymorphism, usually abbreviated as SNP (pronounced "snip"). Because of the redundancy of the genetic code, SNPs can be without consequence ("silent") if the base triplet resulting from the mutation codes for the same amino acid as the original one. Otherwise a single amino acid of a protein is usually exchanged; more complex changes are less common, for example when a new stop codon is created. If SNPs affect the germline, they are inherited. Although numerous very rare SNPs naturally exist, for example because of a recent mutation, millions of SNPs in the genome are widespread in many populations. Such a polymorphism persists either because the corresponding variant is evolutionarily neutral (or almost neutral), i.e. hardly subject to selection, or because balancing selection actively maintains and promotes diversity, or because different variants are selectively advantageous in different regions, for example because of different predominant pathogens or a different climate. Thanks to research projects such as the 1000 Genomes Project and HapMap, extensive databases of the SNPs distributed across the human genome now exist. Because SNPs are inherited in families over several generations, they form the basis of genetic genealogy; put simply, the more SNPs two people have in common, the more closely related they are.
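The distinction between silent SNPs, amino-acid exchanges and newly created stop codons can be illustrated with the standard genetic code. The following is a minimal sketch, not a tool used in the research described here: the function name `classify_snp` and its labels are hypothetical, and it only considers a single codon on the coding strand.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_snp(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base exchange in one codon (position is 0-based)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"      # same amino acid: no effect on the protein
    if after == "*":
        return "nonsense"    # a new stop codon is created
    if before == "*":
        return "stop-loss"   # an existing stop codon is removed
    return "missense"        # a single amino acid is exchanged

print(classify_snp("CTT", 2, "C"))  # CTT and CTC both code for leucine
print(classify_snp("GAG", 1, "T"))  # glutamate becomes valine
print(classify_snp("TGG", 2, "A"))  # tryptophan becomes a stop codon
```

SNPs outside protein-coding sequence, for example in regulatory elements, cannot be classified this way.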

Indels

Another common source of genetic variation is short insertions and deletions of DNA segments, which can arise, for example, from errors and inaccuracies in DNA replication. Because their effects are often comparable, the two kinds of variation are frequently combined into a single class, called indels.

CNVs

Larger and more complex variations also occur in the human genome. They are far less common than SNPs and indels, but much more common than was once assumed. They are mostly grouped under the term copy number variation, abbreviated CNV. CNVs can be thousands, in rare cases even millions, of base pairs long, and they are difficult to detect with standard techniques, which are optimized above all for SNPs. The most frequent case is a change in the number of copies of a gene, which can alter the phenotype by changing the transcription rate and thus the dose of a gene product; complex changes with structural effects are less common. About 5 percent of all genes are normally already present in two or more copies in the genome; in these genes the copy number can vary particularly easily, because misalignment during copying occurs readily as a result of non-homologous pairing. In total, however, CNVs may affect more than 10 percent of the entire human genome.

Particularly pronounced CNVs, though ones that do not affect the germline, are the long-known changes in the number of whole chromosomes, which cause diseases such as Down syndrome or Turner syndrome.

The haplotype

In the diploid genome of a person, every gene (with the exception of some on the sex chromosomes) exists in two copies: one on the chromosome inherited from the father and one on the chromosome inherited from the mother. These two copies do not have to be identical and can occur as different gene variants. If the two copies are different alleles, the relationship to the phenotype is not straightforward and is often complicated. Sometimes one of the alleles is also epigenetically masked (silenced), so that the phenotype is determined entirely by the other.

How gene variants are arranged on the chromosomes also matters for their inheritance. If two variants lie on the same chromosome, the nature of the duplication process makes it much more likely that they will be inherited together and appear together in the next generation. If the combination shapes the phenotype in a particular way, this can have an enormous influence on the frequency of the alleles. The set of alleles carried on one chromosome is therefore given a special technical term: the haplotype. The alleles of a haplotype are inherited together unless sections are exchanged by crossing-over during sexual reproduction (in meiosis).

Linkage and linkage disequilibrium

Since genes on the same chromosome are more often inherited together, they appear linked to one another in a statistical analysis. In classical genetic studies of hereditary diseases, linkage analysis can be used to determine the location and inheritance of disease-causing genes that would otherwise be difficult to find because of the size of the genome. This works best for diseases caused by a single gene of large effect whose inheritance follows Mendel's rules.

If an allele of a haplotype is strongly favored by selection (in a certain environment), not only will this allele itself become more common in future generations, as would be expected, but so will other (neutral or even disadvantageous) alleles that happen to lie close to it on the same chromosome. In the same way, alleles that have a positive or negative effect only in combination may be selected more or less strongly depending on their linkage. The less time has passed since the allele arose, and the closer together the linked alleles lie on the chromosome, the less often this association will have been broken by crossing-over (since proximity makes crossing-over between them less likely). As a result of linkage, individual alleles occur together more often than expected by chance, and a neutral allele can thus be carried along to a higher frequency, as it were. This relationship is called linkage disequilibrium (today, however, the term is used independently of physical linkage for all non-random associations of alleles). Sets of genes held together by linkage disequilibrium are important, for example, in the analysis of hereditary diseases caused by numerous genes of small effect (each of which, viewed in isolation, changes the probability of the disease by only a few percent). Such analyses exploit the fact that the (unknown) disease-promoting allele may be in linkage disequilibrium with an already known SNP, so that a correlation between SNP and disease would be observed even if the SNP allele itself has nothing whatsoever to do with the disease. Standardized probes, so-called SNP chips, have been developed for this purpose.
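Linkage disequilibrium between two loci is commonly quantified by the coefficient D (the observed haplotype frequency minus the frequency expected if the alleles were independent) and by the normalized measure r². The sketch below computes both for two biallelic loci; the allele labels "A"/"a" and "B"/"b", the function name and the sample data are assumptions made purely for illustration.

```python
from collections import Counter

def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    Each haplotype is a two-character string: the first character is the
    allele at locus 1 ("A" or "a"), the second the allele at locus 2
    ("B" or "b").
    """
    counts = Counter(haplotypes)
    n = len(haplotypes)
    p_ab = counts["AB"] / n                      # observed frequency of haplotype AB
    p_a = (counts["AB"] + counts["Ab"]) / n      # frequency of allele A
    p_b = (counts["AB"] + counts["aB"]) / n      # frequency of allele B
    d = p_ab - p_a * p_b                         # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared

# Hypothetical sample: A and B occur together more often than expected by chance.
linked = ["AB"] * 40 + ["ab"] * 40 + ["Ab"] * 10 + ["aB"] * 10
# Hypothetical sample in equilibrium: all four haplotypes equally common.
unlinked = ["AB"] * 25 + ["Ab"] * 25 + ["aB"] * 25 + ["ab"] * 25

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(unlinked))
```

A value of D near zero means the alleles are associated no more often than chance predicts; an r² close to 1 means that one locus is a good proxy for the other, which is the property that marker SNPs rely on.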

Another field of application for linkage disequilibrium is the analysis of the origin of human populations and migration movements.

Variation between individuals and between populations

According to a common measure, Wright's fixation index FST, roughly 15 percent of the variance in the human genome can be traced back to differences between populations, and the remaining 85 percent to differences between individuals within these populations. From this value, which has been known for decades and has been confirmed by modern results, it was concluded that the differences between populations (or, in the usage still prevalent at the time, races) are so small that they can be neglected; this conclusion, however, is by no means compelling. Differences in genetic structure between populations can be used to reconstruct the spread of humans across the globe. They may also be important in the drug treatment of diseases.
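How such a between-population share is obtained can be shown for a single locus. The sketch below uses one common heterozygosity-based definition, FST = (H_T - H_S) / H_T, and assumes a biallelic locus and subpopulations of equal size; the allele frequencies are invented for illustration and are not data from the studies mentioned.

```python
def fst(subpop_freqs):
    """FST for one biallelic locus from per-subpopulation allele frequencies.

    Assumes equally sized subpopulations (a simplification for illustration).
    """
    k = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / k
    # Expected heterozygosity if all subpopulations formed one random-mating pool.
    h_total = 2 * p_bar * (1 - p_bar)
    # Average expected heterozygosity within the subpopulations.
    h_within = sum(2 * p * (1 - p) for p in subpop_freqs) / k
    if h_total == 0:
        return 0.0  # no variation at all at this locus
    return (h_total - h_within) / h_total

# Hypothetical allele frequencies in two populations.
print(fst([0.3, 0.7]))
print(fst([0.5, 0.5]))
```

With the invented frequencies 0.3 and 0.7, about 16 percent of the variation lies between the two populations, the same order as the figure quoted above; with identical frequencies the value is zero.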

How do differences between populations come about?

The fact that different groups of people carry different alleles of their genes is mostly simply a consequence of chance, which in connection with genes is referred to as genetic drift. Because parents with random genetic differences leave different numbers of offspring, certain alleles are lost by chance in some places and other alleles in other places. These differences even out again only if mating is random within the population (panmixia); the homogenizing influence on an individual population is then referred to as gene flow. Studies of human populations have shown that people choose their partners almost exclusively from within a narrow radius (a few kilometers in traditional societies) around their place of birth. This results in a population structure in which traits vary more or less continuously: over somewhat larger distances, noticeable differences arise that resist the homogenizing influence of gene flow, yet no sharp dividing line can be drawn at any point. Population models with such properties are described as "isolation by distance" (a concept that also goes back to the work of Sewall Wright). If only people from distant regions are compared with one another, the clinal nature of the variation is easily overlooked.
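Genetic drift as described here is often illustrated with the Wright-Fisher model, in which each generation draws its gene copies at random from the previous one. The following is a minimal sketch under that assumption (constant population size, no selection, mutation or gene flow); the function name, parameter values and seed are hypothetical.

```python
import random

def simulate_drift(p0, pop_size, generations, rng):
    """Return the allele frequency per generation under pure drift.

    Wright-Fisher assumption: the 2 * pop_size gene copies of each generation
    are drawn independently with the allele frequency of the previous one.
    """
    copies = 2 * pop_size
    freqs = [p0]
    p = p0
    for _ in range(generations):
        carriers = sum(1 for _ in range(copies) if rng.random() < p)
        p = carriers / copies
        freqs.append(p)
    return freqs

# Two isolated populations starting from the same allele frequency.
rng = random.Random(42)
population_1 = simulate_drift(0.5, pop_size=50, generations=100, rng=rng)
population_2 = simulate_drift(0.5, pop_size=50, generations=100, rng=rng)
print(population_1[-1], population_2[-1])
```

Because no gene flow connects the two simulated populations, their allele frequencies wander apart by chance alone; an allele that reaches frequency 0 or 1 stays there, which corresponds to the random loss of alleles described in the text.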

An adaptive value has been demonstrated for only very few trait variations, for example skin color. People who live in colder climates are also proportionally heavier and their limbs (especially the distal segments) are shorter, while there is no overall trend in height. The tendency towards shorter extremities in colder climates follows Allen's rule, which is based on observations of numerous animal species. Another well-known example is the lactose tolerance of Europeans and North Asians, which is explained by descent from cattle herders and which apparently became established in these populations only a few thousand years ago. Another famous example is the sickle cell anemia allele, which gives heterozygous carriers higher resistance to malaria and is therefore found at high frequency in areas severely affected by malaria, despite the severe hereditary disease it causes in homozygotes (balanced polymorphism).

Variation due to lineage

Current genetic variation also reflects the history of migration and population growth. When populations arise from other populations through emigration, with a small group moving into a new habitat (for example the Polynesians on their boat voyages to the Pacific islands), it is to be expected that not all alleles of the original population will be represented in this group. If the population then grows again in the new habitat, its variation is still noticeably lower than that of the source population (even though the number of alleles will gradually increase again through new mutations). This is known as the founder effect. A drastic decline in population that only a small group survives has the same effect. Even if the population later regains its original size, its variation remains reduced for a long time; this is known as a genetic bottleneck.
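Why a small founder group carries only part of the original variation follows from simple probability. The sketch below assumes that the founders are a random sample of the source population and that each person carries two independent gene copies; the frequencies and group sizes are invented for illustration.

```python
def prob_allele_lost(allele_freq, n_founders):
    """Probability that an allele is absent from every founder.

    Assumes founders are sampled at random and each carries two
    independent gene copies (a simplification for illustration).
    """
    gene_copies = 2 * n_founders
    return (1 - allele_freq) ** gene_copies

# A rare allele (1 percent) in a founding group of 20 people ...
print(prob_allele_lost(0.01, 20))
# ... compared with a common allele (30 percent) in the same group.
print(prob_allele_lost(0.30, 20))
```

Under these assumptions a rare allele is more likely than not to be missing from the founding group, whereas a common allele is almost certain to be carried along, so the founder effect removes mainly rare variants.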

Population geneticists have genetically compared individuals from numerous populations all over the world and have mapped, counted and compared their SNPs, microsatellites and other variations in order to reconstruct the spread of humans on this basis. It turns out that the greatest genetic variability is found in Africa (this also applies phenotypically, for example in the classical anthropological technique of skull measurement). Apart from a few new alleles, the populations of the other continents carry only a subset of the African spectrum. The data can be explained well by a series of founder effects following emigration from an original African homeland. If the sequences are grouped by similarity, those from the Middle East are most similar to the African ones, followed by those of Europeans, South and Central Asians and East Asians; the inhabitants of Papua and Melanesia and the indigenous Americans differ most. Even if the similarity of neighboring peoples is partly due to mixing or hybridization, this pattern can be interpreted very convincingly as the reflection of a migration originating in Africa, clearly confirming older studies based on fewer genes and on the relationships among languages and among human parasites. The observable variation is almost entirely driven by chance, i.e. explainable by genetic drift. Hypotheses about different rates of evolution between "races", which are still advocated by racist exponents of the New Right, have no basis in the data.

The same methods can also be used to determine the relationship between human populations within regions. In addition to numerous European studies, for example, the inhabitants of the Pacific islands were analyzed in a large study. Not only was the difference between Melanesians and Polynesians confirmed, it was also shown that the inhabitants of the interior of the large islands are genetically very different from one another (each community exhibiting relatively little variability among themselves). The inhabitants of the coastal regions of the same islands are much more closely related to one another. This shows not only that both groups reached the islands independently, but also that the open ocean was apparently less of a barrier to migration than the rugged mountains of the interior.

The data mentioned allow a branching pattern to be reconstructed, but in this form they provide no values for dating the migrations or for the (effective) population size of the groups involved. Follow-up analyses of the data are quite complex, and their results are inconsistent and depend on the methods (including the statistical methods) used. The fact that variation in the human genome is comparatively small in a comparison between species (it is barely half that of chimpanzees and gorillas, despite a much larger range and population size) suggests a relatively recent origin of all modern human populations.

Although in some cases the incidence of diseases is linked to geographical origin, knowledge of conventional "race" (that is, as a rule, skin color) has so far been of little importance in the diagnosis and treatment of diseases. This applies in particular to immigrant societies such as that of the USA, in which Americans of African origin (classified by self-identification) already carry on average more than 20 percent "European" SNPs. In the course of efforts towards individualized medicine, however, attempts are being made to take genetic origin into account in treatment where possible. At least in research into the genetic causes of diseases, it is not enough to treat studies of a limited, homogeneous group of Europeans or people of European descent as representative of humanity.

Influence of archaic people

It has been known since 2010 that part of the genetic variation in modern humans also originates from the crossing-in (called introgression) of genes from archaic, extinct human lineages into the human gene pool. Such gene flow has been detectable only since it became possible to extract DNA from fossil bones and sequence it, so that archaic and modern genes can be compared directly with one another. According to these findings, modern Europeans carry 1 to 4 percent Neanderthal alleles. Another prehistoric human, the Denisovan, previously known only from a finger bone from a cave in the Altai Mountains, has also contributed to the genome of numerous groups of people, most of all, with 4 to 6 percent, to that of the Melanesians. This can be established on the one hand by comparison with human populations not affected by introgression, especially Africans. There are also sophisticated methods in which, for example, the distribution of the alleles on the chromosomes is analyzed statistically.

Applications

Search for disease-causing genes

The main driving force behind research into human genetic variation today is the search for genes and gene variants that cause or promote common diseases such as cancer, diabetes, various autoimmune diseases or cardiovascular diseases, or that influence the effect of drugs against these diseases. The public and private research, which has cost hundreds of millions of euros in total, is driven by the hoped-for medical benefit; other findings are more of a by-product. Even before the Human Genome Project, it was clear that these diseases cannot each be caused by just a few genes acting alone; otherwise those genes would already have been found by linkage analysis (the OMIM database provides an overview of the relevant genes and diseases). The success of these studies to date, however, has been mixed.

The method by which researchers hope to find the alleles that matter for diseases is the genome-wide association study (abbreviated GWAS). The mapped SNPs, which are widespread in human populations and already known and accessible in databases, serve as markers in the search for disease-promoting alleles. The hope is that these alleles are in linkage disequilibrium with some of the markers. Once suitable candidates have been found, it is in principle possible to narrow down their position on a chromosome according to the strength of the linkage (since linkage disequilibrium should increase with physical proximity on the DNA strand). Marker SNPs for which an association with an interesting, more complex trait (usually a disease) has been found are also stored and documented in a database maintained by the (American) National Human Genome Research Institute. Several thousand such associations have now been found. If an identified allele shows a statistically significant association with a disease, it can usually be assumed that the corresponding gene locus is somehow involved in the development of the disease. In practice, several loci are associated with several diseases (pleiotropy).

The studies to date show that, for common diseases, a few hundred loci with gene variants have been identified that correlate with the frequency of the disease. However, each one typically contributes only 1 to 1.5 percent to the risk of disease, and all together around 20 to 30 percent. In the case of the widespread disease diabetes, all of the SNPs identified so far together account for barely 10 percent of the hereditary disease risk; knowledge of them is worthless for clinical application (compared with other factors that are also much easier and cheaper to determine), even if they provide numerous new leads for research into the disease. In addition, the methodological risk of mistaking random correlations for statistically significant results is extremely high in this kind of research.
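The risk of chance correlations mentioned here follows from the sheer number of markers tested. The sketch below illustrates it with the Bonferroni correction, one standard way of adjusting a significance threshold; the figure of one million tested SNPs and the assumption that the tests are independent are illustrative choices, not values from the text.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of chance hits if no marker is truly associated."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, alpha):
    """Per-test p-value threshold that keeps the overall error rate near alpha."""
    return alpha / n_tests

N_SNPS = 1_000_000   # hypothetical number of marker SNPs tested
ALPHA = 0.05         # conventional significance level for a single test

print(expected_false_positives(N_SNPS, ALPHA))
print(bonferroni_threshold(N_SNPS, ALPHA))
```

At the conventional single-test level, tens of thousands of markers would appear "significant" by chance alone, which is why far stricter per-test thresholds are needed when so many markers are tested at once.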

There are various possible explanations for the discrepancy between the results of classical hereditary analyses, which often suggest a considerable hereditary share of disease risk, and the few alleles that can actually be identified by GWAS:

  • Infinitesimal model: Heredity is determined by the interplay of hundreds, or even thousands, of alleles, each of which typically contributes far less than one percent. The increased risk can then be explained by the coincidence of hundreds of unfavorable alleles.
  • “Rare alleles” model: Heredity results from alleles which, viewed individually, each have a major effect and substantially increase the risk. But each of these alleles is so rare that it only occurs in a fraction of patients, often less than one percent. For any disease, there could be hundreds or thousands of different such rare alleles with great effect, none of which, due to their rarity, would make a significant impact when analyzing the risk to the general population.
  • Extended hereditary model: Genes are not the only factors involved in the inheritance of disease risks. In addition to inherited epigenetic factors (e.g. DNA methylation patterns), interactions between genes ( epistasis ) and between genes and the environment also play an important role.

There is empirical evidence for and against each of the models. Probably all three play a different role in each individual case.

A mass test (screening) for disease-associated alleles or a medical analysis of one's own genome is therefore of no particular use from today's perspective. For the same reason, however, fears are unfounded that third parties (e.g. insurance companies) could identify significant disease risks from the knowledge of the individual genome that they have somehow acquired (apart from a few very rare hereditary diseases). However, knowledge of disease-promoting loci may be able to significantly advance the search for new drugs in the future.

Case study: skin color

Global distribution of skin colors among indigenous peoples, based on von Luschan's color scale

Human skin color is one of the most striking genetic differences between individuals and between human populations. In the past, humanity was divided into human races based largely on skin color. In addition, skin pigmentation is a major factor in the development of diseases such as skin cancer. Only among Europeans is there also variation in hair color and eye color, the genetic basis of which is almost identical to that of skin color.

Although other factors are involved in the development of color, the variation in human skin color is almost exclusively due to differences in the number and distribution of melanosomes, and ultimately to the content of the pigment melanin; the total content is far more important than the ratio between its two forms, pheomelanin and eumelanin. Skin color is a classic polygenic trait, with numerous genes involved in its expression. The idea, derived primarily from research on people with albinism, that variants of the key enzyme tyrosinase could explain the variation has not been confirmed. On the basis of genome-wide association studies (GWAS), however, numerous genes have now been identified whose alleles, usually distinguished from one another only by point mutations (SNPs), can explain most of the variation in the trait.

The influence of the gene MC1R, which codes for a G protein-coupled receptor, is well established. One allele of this gene is strikingly more frequent in people with very light skin and red hair (although, unlike almost all the other genes concerned, it contributes nothing to eye color). SNPs of the SLC24A5 gene are also of great importance; genes homologous to it have been identified as responsible for light coat colors in some animal species (e.g. agouti). Numerous other genes, for example OCA2, MIM, HERC2, ASIP, IRF4, SLC24A4 and many others, have been correlated with skin color by GWAS without their role in regulation being understood in every case; new ones are discovered every year. No single gene has a decisive influence; alleles that predispose to lighter or darker skin can always be overridden by other genes with opposing effects.

Skin color is one of the few traits for which positive selection can be demonstrated in humans. Skin color varies with geographical latitude: the closer to the equator people live, the darker their skin. An interplay between protection against cell damage from UV radiation (which favors dark skin in intense sunlight) and vitamin D synthesis in the skin through sunlight (which favors light skin in weak sunlight) is considered essential; other factors, such as the greater susceptibility of dark skin to frostbite, probably also play a role. Although it is not entirely certain what the original skin color was (chimpanzees have light skin that is covered by black fur), the light variants are most likely due to mutations that became established in human populations migrating northward during the spread of humanity out of Africa. The light skin color of Europeans and of North Asians (the two are the same when measured objectively; the "yellow race" is a fiction of chauvinistic Europeans) arose convergently on independent genetic bases.

Twin research

Under certain conditions, the genetic share of variation can be estimated from phenotypic similarities between relatives (e.g. twins). The method of twin research is to analyze the similarities of identical (monozygotic) and dizygotic twins. Identical twins are genetically identical, while dizygotic twins share about half of their genes. Both kinds of twins share the same pregnancy, and most twins grow up in the same family. To find out to what extent body size is genetically influenced, one can compare the variation in body size among identical twins with that among dizygotic twins. Heritability can then be estimated using the methods of population genetics. Studies have shown that in most populations, slightly more than half of the variation in height can be explained by the genetic relationship between parents and children.
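One classical way to turn such a twin comparison into a number is Falconer's formula, which estimates heritability as twice the difference between the trait correlation in identical twins and that in dizygotic twins. The text does not name this formula; it is shown here as one simple approximation, and the correlation values in the sketch are hypothetical.

```python
def falconer_heritability(r_identical, r_dizygotic):
    """Heritability estimate from twin correlations (Falconer's formula).

    Rests on simplifying assumptions, e.g. that both kinds of twins share
    their environment to the same degree.
    """
    return 2 * (r_identical - r_dizygotic)

# Hypothetical trait correlations, for illustration only.
h2 = falconer_heritability(r_identical=0.8, r_dizygotic=0.5)
print(round(h2, 2))
```

The factor of two reflects the statement above that identical twins share all of their genes and dizygotic twins about half, so the difference between the two correlations captures roughly half of the genetic contribution.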

Literature

  • C. G. Nicholas Mascie-Taylor, Akira Yasukouchi, Stanley Ulijaszek (eds.): Human Variation. From the Laboratory to the Field (= Society for the Study of Human Biology. Symposium Series. Vol. 49). CRC Press, Boca Raton FL et al. 2010, ISBN 978-1-4200-8471-9.
  • Julian C. Knight: Human Genetic Diversity. Functional Consequences for Health and Disease. Oxford University Press, Oxford et al. 2009, ISBN 978-0-19-922770-9.
  • Robert Boyd, Joan B. Silk: How Humans Evolved. 4th edition. Norton, New York NY et al. 2006, ISBN 0-393-92628-1.

Individual evidence

  1. ^ The Chimpanzee Sequencing and Analysis Consortium (2005): Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437: 69-87. doi : 10.1038 / nature04072
  2. a b Eric S. Lander (2011): Initial Impact of the Sequencing of the Human Genome. Nature 470: 187-197. doi : 10.1038 / nature09792
  3. ^ David B. Goldstein & Gianpiero L. Cavalleri (2005): Understanding human diversity. Nature 437: 1241-1242.
  4. Kelly A. Frazer, Sarah S. Murray, Nicholas J. Schork, Eric J. Topol (2009): Human genetic variation and its contribution to complex traits. Nature Reviews Genetics 10: 241-251. doi : 10.1038 / nrg2554
  5. ^ The ENCODE Project Consortium (2012): An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57-74 doi : 10.1038 / nature11247
  6. Pawel Stankiewicz & James R. Lupski (2019): Structural Variation in the Human Genome and its Role in Disease. Annual Review of Medicine 61: 437-455. doi : 10.1146 / annurev-med-100708-204735
  7. Montgomery Slatkin (2008): Linkage disequilibrium - understanding the evolutionary past and mapping the medical future. Nature Reviews Genetics 9: 477-489.
  8. Kristin G. Ardlie, Leonid Kruglyak, Mark Seielstad (2002): Patterns of linkage disequilibrium in the human genome. Nature Reviews Genetics 3: 299-310. doi : 10.1038 / nrg777
  9. Guido Barbujani & Vincenza Colonna (2010): Human genome diversity: frequently asked questions. Trends in Genetics 26: 285-295. doi : 10.1016 / j.tig.2010.04.002
  10. ^ Richard Lewontin (1972): The apportionment of human diversity. In: T. Dobzhansky, MK Hecht, WC Steere (editors): Evolutionary Biology 6. Appleton-Century-Crofts, New York. pp. 381-398.
  11. ^ AWF Edwards (2003): Human genetic diversity: Lewontin's fallacy. BioEssays 25: 798-801.
  12. ^ S. Wright (1942): Isolation by distance. Genetics 28: 114-138.
  13. David Serre & Svante Pääbo (2004): Evidence for Gradients of Human Genetic Diversity Within and Among Continents. Genome Research 14: 1679-1685. doi : 10.1101 / gr.2529604
  14. Christopher Ruff (2002): Variation in human body size and shape. Annual Revue of Anthropology 31: 211-232 doi : 10.1146 / annurev.anthro.31.040402.085407
  15. J. Burger, M. Kirchner, B. Bramanti, W. Haak, MG Thomas (2007): Absence of the lactase-persistence-associated allele in early Neolithic Europeans. Proceedings of the National Academy of Sciences USA 104 (10): 3736-3741. doi:10.1073/pnas.0607187104
  16. Andrea Manica, Bill Amos, François Balloux, Tsunehiko Hanihara (2007): The effect of ancient population bottlenecks on human phenotypic variation. Nature 448 (7151): 346-348. doi:10.1038/nature05951
  17. Jun Z. Li, Devin M. Absher, Hua Tang, Audrey M. Southwick, Amanda M. Casto, Sohini Ramachandran, Howard M. Cann, Gregory S. Barsh, Marcus Feldman, Luigi L. Cavalli-Sforza, Richard M. Myers (2008): Worldwide Human Relationships Inferred from Genome-Wide Patterns of Variation. Science 319: 1100-1104. doi:10.1126/science.1153717
  18. Chaolong Wang, Sebastian Zöllner, Noah A. Rosenberg (2012): A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations. PLoS Genetics 8 (8): e1002886. doi:10.1371/journal.pgen.1002886
  19. L. Luca Cavalli-Sforza & Marcus W. Feldman (2003): The application of molecular genetic approaches to the study of human evolution. Nature Genetics 33: 266-275. doi:10.1038/ng1113
  20. Andreas Vonderach (2008): The Europeans, the others and the asymmetrical evolution. Sezession 26: 10-14.
  21. Jonathan S. Friedlaender, Francoise R. Friedlaender, Floyd A. Reed, Kenneth K. Kidd, Judith R. Kidd, Geoffrey K. Chambers, Rodney A. Lea, Jun-Hun Loo, George Koki, Jason A. Hodgson, D. Andrew Merriwether, James L. Weber (2008): The Genetic Structure of Pacific Islanders. PLoS Genetics 4 (1): e19. doi:10.1371/journal.pgen.0040019
  22. Alan R. Templeton (2005): Haplotype trees and modern human origins. Yearbook of Physical Anthropology 48: 33-59.
  23. Heng Li & Richard Durbin (2011): Inference of Human Population History From Whole Genome Sequence of A Single Individual. Nature 475 (7357): 493-496. doi:10.1038/nature10231
  24. Charles N. Rotimi & Lynn B. Jorde (2010): Ancestry and Disease in the Age of Genomic Medicine. New England Journal of Medicine 363: 1551-1558.
  25. Morris W. Foster & Richard R. Sharp (2004): Beyond race: towards a whole-genome perspective on human populations and genetic variation. Nature Reviews Genetics 5: 790-796. doi:10.1038/nrg1452
  26. Richard E. Green et al. (2010): A Draft Sequence of the Neandertal Genome. Science 328: 710-722. doi:10.1126/science.1188021
  27. David Reich et al. (2010): Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468: 1053-1060. doi:10.1038/nature09710
  28. Explained in Vitor Sousa & Jody Hey (2013): Understanding the origin of species with genome-scale data: modeling gene flow. Nature Reviews Genetics 14: 404-414. doi:10.1038/nrg3446
  29. Peter M. Visscher, Matthew A. Brown, Mark I. McCarthy, Jian Yang (2012): Five Years of GWAS Discovery. American Journal of Human Genetics 90 (1): 7-24. doi:10.1016/j.ajhg.2011.11.029
  30. Catalog of Published Genome-Wide Association Studies
  31. Liana K. Billings & Jose C. Florez (2010): The genetics of type 2 diabetes: what have we learned from GWAS? Annals of the New York Academy of Sciences 1212: 59-77. doi:10.1111/j.1749-6632.2010.05838.x
  32. Joel N. Hirschhorn, Kirk Lohmueller, Edward Byrne, Kurt Hirschhorn (2002): A comprehensive review of genetic association studies. Genetics in Medicine 4: 45-61. doi:10.1097/00125817-200203000-00002
  33. John PA Ioannidis (2005): Why most published research findings are false. PLoS Medicine 2 (8): e124. doi:10.1371/journal.pmed.0020124
  34. Greg Gibson (2012): Rare and common variants: twenty arguments. Nature Reviews Genetics 13: 135-145. doi:10.1038/nrg3118
  35. Dominique Scherer & Rajiv Kumar (2010): Genetics of pigmentation in skin cancer - A review. Mutation Research/Reviews in Mutation Research 705 (2): 141-153. doi:10.1016/j.mrrev.2010.06.002
  36. Jonathan L. Rees (2003): Genetics of hair and skin color. Annual Review of Genetics 37: 67-90. doi:10.1146/annurev.genet.37.110801.143233
  37. A.K. Kalla (2007): Human Skin Color, Its Genetics, Variation and Adaptation: A Review. Anthropologist Special Issue No. 3: 209-214.
  38. Richard A. Sturm (2009): Molecular genetics of human pigmentation diversity. Human Molecular Genetics 18 (Review Issue 1): R9-R17. doi:10.1093/hmg/ddp003
  39. Leonie C. Jacobs, Andreas Wollstein, Oscar Lao, Albert Hofman, Caroline C. Klaver, Andre G. Uitterlinden, Tamar Nijsten, Manfred Kayser, Fan Liu (2012): Comprehensive candidate gene study highlights UGT1A and BNC2 as new genes determining continuous skin color variation in Europeans. Human Genetics 132 (2): 147-158. doi:10.1007/s00439-012-1232-9
  40. Pardis C. Sabeti, Patrick Varilly, Ben Fry, Jason Lohmueller, Elizabeth Hostetter, Chris Cotsapas, Xiaohui Xie, Elizabeth H. Byrne, Steven A. McCarroll, Rachelle Gaudet, Stephen F. Schaffner, Eric S. Lander & The International HapMap Consortium (2007): Genome-wide detection and characterization of positive selection in human populations. Nature 449: 913-919. doi:10.1038/nature06250
  41. Ze'ev Hochberg & Alan R. Templeton (2010): Evolutionary perspective in skin color, vitamin D and its receptor. Hormones 9 (4): 307-311.
  42. Boyd & Silk, pp. 419-420.