Genome-wide association study

from Wikipedia, the free encyclopedia

A genome-wide association study (GWAS) is a study of the genetic variation across the genome of an organism that is designed to associate a particular phenotype (e.g. a disease) with certain haplotypes (or alleles).

The ultimate goal of a GWAS is to identify the alleles (particular variants of a gene) that occur together with a trait. The genes themselves are not necessarily examined directly, mainly for economic reasons; instead, well-defined markers are examined (SNPs, single nucleotide polymorphisms). To detect these markers, methods such as polymerase chain reaction and isothermal DNA amplification with allele-specific oligonucleotides are used.

Overview

A: A small locus on human chromosome 5 with two SNPs. B: The strength of the association between each SNP and the disease, based on the prevalence of each SNP in the disease and control groups. C: A Manhattan plot: the chromosomes are lined up on the abscissa and the ordinate shows the degree of association. Each point represents one SNP. On chromosome 5, a significant association between SNP 1 and the disease is visible.
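
The coordinates of a Manhattan plot like the one in panel C could be computed roughly as follows. This is only a sketch: the function name, the toy chromosome lengths and the p-values are hypothetical, and plotting -log10(p) on the ordinate is a common convention assumed here rather than something stated in the figure.

```python
import math

def manhattan_coordinates(snps, chrom_lengths):
    """Turn (chromosome, position, p_value) records into (x, y) plot points.

    x is a cumulative position, so the chromosomes line up side by side;
    y is -log10(p), so stronger associations appear higher in the plot.
    """
    offsets, running = {}, 0
    for chrom in sorted(chrom_lengths):
        offsets[chrom] = running
        running += chrom_lengths[chrom]
    return [(offsets[c] + pos, -math.log10(p)) for c, pos, p in snps]

# Hypothetical toy data: lengths, positions and p-values are made up.
chrom_lengths = {4: 1000, 5: 900, 6: 800}
snps = [(4, 100, 0.5), (5, 250, 1e-9), (5, 400, 0.01), (6, 10, 0.2)]
points = manhattan_coordinates(snps, chrom_lengths)
```

The SNP with the smallest p-value ends up as the highest point, which is what makes a strongly associated locus stand out like a skyscraper.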

To carry out a GWAS, two groups of test organisms are required: a comparison group (i.e. "normal") and a group that shows the phenotype of interest (i.e. the disease or some other special characteristic). DNA samples are taken from both groups and individually tested for their variation using markers (today, defined SNPs are used for this). The analysis then looks for differences in the variation between the two groups: an accumulation of a certain marker in the group with the phenotype of interest represents an association. Most of the loci of the marker SNPs used are not in a protein-coding region, but lie either in non-coding regions between two genes (i.e. in regulatory regions) or in introns.

A GWAS says nothing about the specific connection between the allele found and the phenotype. It yields a mere association, and for the time being a purely correlative relationship; in particular, the association is only with the polymorphism and not even directly with a coding allele. A possible causal relationship can only be investigated with molecular biological and biochemical methods once such "candidate genes" have been identified.

GWAS have gained in importance in recent years due to the drop in prices for DNA sequencing. In human medicine, the lower costs increasingly enable interested members of the public to have a marker analysis of their own genome carried out privately by specialized providers (e.g. 23andMe). The main aim there is an individual risk assessment (genetic predisposition) for known allele-disease associations, but the growing number of data sets covering a wide range of phenotypes can subsequently be used for GWAS research (provided the DNA donors consent).
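
For a single marker, the comparison between the two groups can be sketched as a chi-square test on a 2x2 table of allele counts. This is one common way to test a case-control association, not necessarily the method used in any particular study; the function name and the counts are hypothetical.

```python
import math

def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value, odds_ratio). All four counts are assumed to be > 0.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    # Closed form of the Pearson chi-square statistic for a 2x2 table.
    chi2 = (n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2) / (
        (case_alt + case_ref) * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt) * (case_ref + ctrl_ref)
    )
    # Upper tail of the chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio

# Hypothetical counts: 1000 cases and 1000 controls, i.e. 2000 alleles per group.
chi2, p, odds = allelic_association(case_alt=600, case_ref=1400,
                                    ctrl_alt=500, ctrl_ref=1500)
```

In a real GWAS this test would be repeated for every marker, so the resulting p-values have to be interpreted with the large number of tests in mind.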

Background

The diploid human genome, for example, comprises a little over six billion base pairs. Although the differences between two humans are extremely small compared to other species, more than 300 million polymorphisms have been found so far (Ensembl Variation database, release 91). The vast majority of these polymorphisms are single nucleotide polymorphisms (SNPs).

Strictly speaking, only differing alleles (in protein-coding and regulatory regions) would be of interest, i.e. differences in regions that directly influence gene function (e.g. the function of the encoded protein or the expression rate). Sequencing all such regions is still too time-consuming and expensive, and such a high resolution is presumably not even necessary. In a first phase, the HapMap project collected and mapped variants of one million SNPs for the human genome; in a second phase, it extended this to a haplotype map of over 3.1 million SNPs. In principle, enough markers have been identified to provide one or more markers for each gene of interest that are linked to the gene, i.e. inherited together with it. Today, GWAS are practically always carried out on the basis of SNPs. In more specific studies (i.e. not genome-wide, but limited to particular DNA segments or genes), other polymorphisms or complete sequence analyses can also be used, depending on their suitability.

What makes GWAS particularly attractive is that they are hypothesis-free, i.e. there is no preselection of genes that might cause the disease or phenotype (no a priori knowledge is introduced); the whole genome is simply examined. This makes the analysis more open-ended, and new and unexpected genes can potentially be associated with phenotypes.
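
How well a marker stands in for a nearby gene variant is commonly quantified as linkage disequilibrium, for example with the r^2 statistic. The sketch below assumes two biallelic loci with known haplotype and allele frequencies; the function name and all frequencies are hypothetical and are not taken from HapMap.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying marker allele A and gene variant B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies.
tight = ld_r_squared(p_ab=0.28, p_a=0.30, p_b=0.30)  # almost always inherited together
none = ld_r_squared(p_ab=0.09, p_a=0.30, p_b=0.30)   # haplotype frequency equals the product
```

A value near 1 means the marker is a good proxy for the variant, while a value near 0 means that an association with the variant would not show up at the marker.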

Limits and Dangers

GWAS have various methodological limits. The greatest limitation is that only associations between common haplotypes and a phenotype can be found; rare variants remain undetected. It should also be emphasized that GWAS provide only correlative results. A certain allele of a gene occurs more frequently together with a phenotype, which means that the gene and the trait are "somehow" related to one another. Causality must first be demonstrated in further investigations. Moreover, what is found today is not the genes themselves but only polymorphisms, which in turn merely co-occur with the genes.

In human medicine, with the increasing popularity of personalized medicine, more and more patient genomes are being sequenced (or genetic tests are carried out in which only sections of the genome are sequenced). The continuing decline in the price of DNA sequencing, driven by increasingly efficient technologies, strongly supports this trend. Providers that address private customers directly have also entered the market; sequencing is now carried out even without illness, purely out of curiosity. The increasing availability of human genomes inevitably raises social questions, e.g. how health insurance companies should deal with such highly specific information, how patients deal with correlative results about the probability of an illness, or how private a person's sequence should be. Online SNP databases such as openSNP already exist.
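
One statistical reason why rare variants are hard to detect can be illustrated with a rough sketch: for the same effect size and sample size, the expected chi-square statistic of an allele-count test shrinks as the allele becomes rarer. The function name, the odds ratio, the group sizes and the allele frequencies below are hypothetical, and expected (fractional) counts are used in place of observed data.

```python
def expected_chi2(ctrl_freq, odds_ratio, alleles_per_group):
    """Chi-square statistic of a 2x2 allele-count table built from expected counts."""
    # Allele frequency in cases implied by the odds ratio.
    case_freq = odds_ratio * ctrl_freq / (1 - ctrl_freq + odds_ratio * ctrl_freq)
    a = case_freq * alleles_per_group   # cases carrying the allele
    b = alleles_per_group - a           # cases not carrying it
    c = ctrl_freq * alleles_per_group   # controls carrying the allele
    d = alleles_per_group - c           # controls not carrying it
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Same hypothetical effect (odds ratio 1.3) and group size, different allele frequencies.
common = expected_chi2(ctrl_freq=0.30, odds_ratio=1.3, alleles_per_group=2000)
rare = expected_chi2(ctrl_freq=0.001, odds_ratio=1.3, alleles_per_group=2000)
```

Under these assumptions the common allele yields a clearly elevated statistic, while the rare allele is indistinguishable from chance at this sample size.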
