The past: monogenic disorder studies, introduction and limitations
Many disorders have a genetic basis, and some, called monogenic diseases, are caused entirely by a mutation in a single gene. Over the past two decades, the wide use of genetic linkage analysis combined with positional cloning has offered an objective and straightforward way to investigate single-gene disorders such as Huntington's disease and maturity-onset diabetes of the young (1 Jimenez-Sanchez G, Childs B, Valle D. Human disease genes. Nature 2001; 409: 853-5). To date, more than 13,000 disease-related genes have been identified, and in most cases the mutations alter the amino acid sequence of the protein product, which greatly increases the risk of disease. Studies of these monogenic disease genes have led to a deeper biological understanding of the disorders and their related phenotypes, which in turn has resulted in the development of clinical measurements and treatments (2 Pearson E, Hattersley A. Genetic aetiology alters response to treatment in diabetes. Diabet Med 2003; 20: 12.).
Monogenic diseases, however, are comparatively rare at the population level. Because linkage analysis has limited power for polygenic diseases, and because the genes discovered normally explain only a small part of the heritability (What will whole genome searches for susceptibility genes H. Lango & M. N. Weedon), the strategies used for monogenic diseases can hardly be applied to common diseases with complex traits (Altmuller, J., Palmer, L. J., Fischer, G., Scherb, H. & Wjst, M. Genomewide scans of complex human diseases: true linkage is hard to find. Am. J. Hum. Genet. 69, 936-950 (2001).). Examples include type 1 and type 2 diabetes (Hakonarson H, Grant SF, Bradfield JP, et al. A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature 2007;448:591-594. Sladek R, Rocheleau G, Rung J, et al. A study identifies novel risk loci for type 2 diabetes. Nature 2007;445:881-885.), hypertension (Wang Y, O'Connell JR, McArdle PF, et al. From the cover: Whole-genome association study identifies STK39 as a hypertension susceptibility gene. Proc Natl Acad Sci USA 2009;106:226-231.), obesity (Scuteri, A., S. Sanna, et al. (2007). Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet 3(7): e115.) and cancer (Easton, D. F. and R. A. Eeles (2008). Genome-wide association studies in cancer. Hum Mol Genet 17(R2): R109-15.), which are thought to be influenced by sets of genes and by gene-environment interactions.
In the past few years, the rise of genome-wide association studies (GWAS) has broken the logjam, uncovering a large number of strong associations between chromosomal loci and complex human disorders (Genome-wide Association Studies and Human Disease). Four major advances made GWAS feasible and widespread in such a short time. The first is the common disease, common variant (CD-CV) hypothesis, or interaction model, developed in the mid-1990s (Lander ES. The new genomics: Global views of biology. Science 1996;274: 536-539.), which proposes that common diseases are influenced by variants that are common in the population, each of which has a comparatively small impact on the risk of developing the disease. This model gives theoretical support for using a case-control design to search for common variants through differences in allele frequencies. Second, the completion of the Human Genome Project in 2003 made it possible to analyze samples for genetic variants across the whole genome [International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437:1299-1320.]. Third, the rapid development of microarray technology and the dramatically decreasing cost of high-throughput genotyping made large-scale analyses possible (Lander ES. Array of hope. Nat Genet 1999;21:3-4.). Lastly, the rapid development of bioinformatics has accelerated biostatistical processing, and public online databases make it possible to reuse existing control data and to perform multiple testing for quality control (Moore, J. H., F. W. Asselbergs, et al. Bioinformatics Challenges for Genome-Wide Association Studies. Bioinformatics.).
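The case-control comparison of allele frequencies described above can be illustrated with a minimal sketch. It assumes a single biallelic SNP, allele counts that have already been tallied in cases and controls, and a plain Pearson chi-square test with one degree of freedom; the function name and the example counts are hypothetical and are not taken from any of the cited studies.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    case_a / case_b: counts of allele A / allele B among cases.
    ctrl_a / ctrl_b: counts of allele A / allele B among controls.
    Returns (chi-square statistic, p-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case = case_a + case_b
    row_ctrl = ctrl_a + ctrl_b
    col_a = case_a + ctrl_a
    col_b = case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column of the table must be non-empty")

    # Shortcut formula for the 2x2 Pearson statistic.
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: allele A is more frequent in cases than in controls.
    chi2, p = allelic_chi_square(60, 40, 40, 60)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In a genome-wide setting a test of this kind is repeated for every genotyped SNP, so the significance threshold has to be far stricter than the conventional 0.05 to account for multiple testing.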
Genome-wide association studies mainly look for associations between DNA sequence variants and phenotypes of interest (Progress and challenges in genome-wide association studies in humans). More specifically, they predominantly study single nucleotide polymorphisms (SNPs), the most common type of genetic variation. The human genome is estimated to contain more than 10 million SNPs (Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature 2001;409:860-921.). That number may seem too large for genome-wide disease association studies to be practical, but most of the human genome, apart from hotspots of high recombination, can be parsed objectively into haplotype blocks whose SNP loci are highly correlated with each other, that is, in linkage disequilibrium (LD) (Gabriel SB, Schaffner SF, Nguyen H, et al. The structure of haplotype blocks in the human genome. Science 2002;296:2225-2229.). This haplotype structure suggested that a carefully selected, parsimonious set of SNPs, called tag SNPs, is able to represent their blocks and distinguish the haplotypic variation in a population. With the help of the HapMap project, which identified tagging SNPs within the block structure, the first full-scale catalogue was made public in 2005 (International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437:1299-1320), followed by SNP microarrays designed by several companies. With that, the curtain was ready to rise on the era of GWAS.
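The idea of tag SNPs can be made concrete with a small sketch. It assumes phased haplotypes coded as lists of 0/1 alleles, measures LD with the pairwise r-squared statistic, and picks tags with a simple greedy rule at a threshold of 0.8. The greedy rule, the threshold and the function names are illustrative assumptions, not the procedure used by the HapMap project.

```python
def r_squared(haps, i, j):
    """Pairwise LD (r^2) between SNP i and SNP j from phased 0/1 haplotypes."""
    n = len(haps)
    p_i = sum(h[i] for h in haps) / n          # frequency of allele 1 at SNP i
    p_j = sum(h[j] for h in haps) / n          # frequency of allele 1 at SNP j
    p_ij = sum(h[i] * h[j] for h in haps) / n  # frequency of the 1-1 haplotype
    d = p_ij - p_i * p_j                       # LD coefficient D
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return d * d / denom


def greedy_tag_snps(haps, threshold=0.8):
    """Greedily choose SNPs until every SNP is tagged at r^2 >= threshold."""
    m = len(haps[0])
    # covers[i] is the set of SNPs that SNP i can stand in for (including itself).
    covers = [
        {j for j in range(m) if j == i or r_squared(haps, i, j) >= threshold}
        for i in range(m)
    ]
    untagged = set(range(m))
    tags = []
    while untagged:
        # Pick the SNP covering the most untagged SNPs; ties go to the lower index.
        best = max(range(m), key=lambda i: (len(covers[i] & untagged), -i))
        tags.append(best)
        untagged -= covers[best]
    return tags


if __name__ == "__main__":
    # Hypothetical haplotypes: SNPs 0, 1 and 2 form one block, SNP 3 is independent.
    haplotypes = [
        [0, 0, 1, 0],
        [0, 0, 1, 1],
        [1, 1, 0, 0],
        [1, 1, 0, 1],
    ]
    print(greedy_tag_snps(haplotypes))
```

In this toy data set, genotyping one SNP from the block plus the independent SNP captures the variation at all four sites, which is the saving that tag SNPs are meant to provide.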
After the first published GWAS reported, in 2005, an association between a functional SNP in the complement factor H gene and age-related macular degeneration (Klein RJ, Zeiss C, Chew EY, et al. Complement factor H polymorphism in age-related macular degeneration. Science 2005;308:385-389.), more than two thousand associations of common genetic variants with over 80 diseases and traits were identified within 5 years (http://www.genome.gov/gwastudies/). In these studies, the power to detect an association at a given SNP decreases for less common variants, which indicates that the findings so far are concentrated on common rather than rare variants.
Furthermore, GWAS have shown that the effect of each variant on disease risk is small in most cases, typically 10-30% (Progress and challenges in genome-wide association studies in humans). Another notable and controversial feature of GWAS is that, despite their great success in detecting a large number of association signals, few details of the functional mechanisms responsible for those signals are yet known and, perhaps surprisingly, most of the loci discovered did not feature on lists of candidate disease genes. Because the mechanisms of association are unknown, the clinical prospects of GWAS are often called into question.
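To give a sense of scale for such small effects, the sketch below computes an allelic odds ratio with an approximate 95% confidence interval. It assumes that the 10-30% figure is read as an increase in the odds of disease per risk allele, which is an interpretation made here for illustration; the counts are hypothetical and the interval uses the standard log-odds (Woolf) approximation.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Odds ratio for the risk allele with an approximate confidence interval.

    Counts are allele counts in cases and controls; all must be positive.
    z = 1.96 gives an approximate 95% interval on the log-odds scale.
    Returns (odds ratio, lower bound, upper bound).
    """
    counts = (case_risk, case_other, ctrl_risk, ctrl_other)
    if min(counts) <= 0:
        raise ValueError("all four counts must be positive")

    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: risk allele frequency 0.55 in cases, 0.50 in controls.
    or_, lo, hi = allelic_odds_ratio(550, 450, 500, 500)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Even with 1,000 alleles per group, the interval around an odds ratio of about 1.2 is wide, which is one reason why effects of this size require large samples to detect reliably.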
Even though the reports so far are just the tip of the iceberg, they have already provided inspiring insights into disease processes and revealed some unsuspected overlaps between disease loci (Manolio, T. et al. A HapMap harvest of insights into the genetics of common disease. J. Clin. Invest. 118, 1590-1605 (2008). McCarthy, M. et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Rev. Genet. 9, 356-369 (2008)).
With further extensive analyses, meta-analyses of published studies and follow-up studies, more and more associated variants will certainly be discovered (Progress and challenges in genome-wide association studies in humans).
Given the difficulties and criticisms of these studies, as well as the extent of their achievements so far, the following two examples of genome-wide association studies of common diseases are used to illustrate the advantages and challenges of this promising but contested technology.