How the vertebrates were made: selective pruning of a double-duplicated genome
© Manning and Scheeff; licensee BioMed Central Ltd. 2010
Received: 6 December 2010
Accepted: 7 December 2010
Published: 13 December 2010
Vertebrates are the result of an ancient double duplication of the genome. A new study published in BMC Biology explores the selective retention of genes after this event, finding an extensive enrichment of signaling proteins and transcription factors. Analysis of their expression patterns, interactions and subsequent history reflects the forces that drove their evolution, and with it the evolution of vertebrate complexity.
See research article: http://www.biomedcentral.com/1741-7007/8/146/abstract
A doubling of the genome, or whole genome duplication (WGD), is usually a cataclysmic event for an organism. Yet this polyploidy has been an important, if rare, event in the evolution of many plant groups, and has also occurred in yeasts, ciliates, fish and frogs. It is now generally accepted that we and all other jawed vertebrates are the product of a remarkable two rounds of WGD, known as 2R, which duplicated every gene up to four-fold (fish and frog genomes have undergone a third duplication more recently). This opened the door to a tremendous expansion in functionality, and while most WGD duplicates, or ohnologs, were rapidly lost, this phenomenon was the genesis of almost one-third of all human genes. Establishing why these duplicates were retained and how they have evolved since then is an important way to advance the understanding of their current functions.
A study by Huminiecki and Heldin in BMC Biology seeks to answer these questions through a global analysis of genes that survived the massive pruning that followed 2R. They identify 2R-derived gene pairs using a combination of sequence similarity (by comparing gene trees with the underlying species trees to identify duplications) and chromosomal location, using syntenic chromosomal regions, in which runs of related gene pairs occur in different loci. They then explore the history of most vertebrate genes through 2R and subsequent gains and losses. They find that retained ohnologs are highly biased towards signaling genes and transcription factors and argue that this large pool of new genes would have enabled the complex regulation required for the development and function of the vertebrate body plan. They integrate these results with expression and pathway data to show that retained ohnologs play important roles in functional categories, such as those required by the nervous system and for locomotion, that are crucial to complex vertebrates.
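The synteny criterion described above can be illustrated with a toy sketch. This is not the authors' pipeline: the function name, the `window` and `min_support` parameters, and the gene labels are all hypothetical, and real analyses work on annotated genomes rather than hand-written coordinates. The idea is only that a candidate duplicate pair is more convincing as an ohnolog when other duplicate pairs sit next to both of its members, in two different loci.

```python
def syntenic_pairs(positions, paralogs, window=5, min_support=2):
    """Keep candidate duplicate pairs that lie in paired syntenic regions.

    positions: gene -> (chromosome, index of the gene along that chromosome)
    paralogs:  list of (gene_a, gene_b) candidate duplicate pairs
    A pair is kept when at least `min_support` other pairs have one member
    within `window` genes of gene_a and the other within `window` of gene_b.
    """
    pairs = [tuple(p) for p in paralogs]
    supported = []
    for a, b in pairs:
        chrom_a, idx_a = positions[a]
        chrom_b, idx_b = positions[b]
        # Copies sitting side by side look like a local (tandem) duplication,
        # not two separate loci, so they are skipped here.
        if chrom_a == chrom_b and abs(idx_a - idx_b) <= window:
            continue
        support = 0
        for c, d in pairs:
            if (c, d) == (a, b):
                continue
            # Try both orientations of the neighbouring pair.
            for x, y in ((c, d), (d, c)):
                chrom_x, idx_x = positions[x]
                chrom_y, idx_y = positions[y]
                near_a = chrom_x == chrom_a and abs(idx_x - idx_a) <= window
                near_b = chrom_y == chrom_b and abs(idx_y - idx_b) <= window
                if near_a and near_b:
                    support += 1
                    break
        if support >= min_support:
            supported.append((a, b))
    return supported


# Hypothetical toy genome: three neighbouring genes on chr1 each have a
# duplicate in the same order on chr2; a1 also has a lone duplicate on chr3.
positions = {
    "a1": ("chr1", 0), "a2": ("chr1", 1), "a3": ("chr1", 2),
    "b1": ("chr2", 0), "b2": ("chr2", 1), "b3": ("chr2", 2),
    "c1": ("chr3", 10),
}
paralogs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a1", "c1")]
found = syntenic_pairs(positions, paralogs)
print(found)
```

In this toy case the three chr1/chr2 pairs support one another and are kept, while the isolated a1/c1 pair has no neighbouring support and is dropped. The sequence-similarity step (reconciling gene trees with the species tree) is assumed to have produced the candidate pairs already and is not modelled here.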
The importance of dosage balance is supported by two other findings from this paper. First, small scale duplications (SSDs) that have occurred after 2R show a very different functional bias from that of WGD duplicates: they contain far fewer signaling proteins and transcription factors, but are enriched in immune functions and chromatin modifiers. This suggests that individual duplication of signaling proteins may be toxic or non-functional, requiring the dosage balance of a WGD to survive. A similar bias is also seen in other studies of SSD following WGD, and ohnologs are also underrepresented in copy number variations in human populations, further reflecting their dosage sensitivity. Second, they show that retained ohnologs are more highly connected in pathway and protein interaction maps, further suggesting that they may be required for dosage balance.
The simplest gene dosage models are based on stoichiometric balance between subunits of a stable protein complex. The Huminiecki and Heldin study highlights the limitations of the simple model, since signaling proteins and transcriptional regulators tend to make relatively transient interactions, consistent with their role in information transfer. This suggests that dynamic balancing of signal flux may be as important as structural balances in protein complexes. For instance, duplication of a phosphatase might balance the increased flux from duplication of a corresponding kinase; accordingly, retained ohnologs are specifically enriched for negative regulatory interactions. Dosage balance may also operate in a positive sense: rather than blocking toxicity, the co-duplication of many interacting genes may aid the development of novel pathways and functions.
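The kinase and phosphatase example can be made concrete with a deliberately simple model. The sketch below assumes first-order (mass-action) kinetics for a single substrate, which is our simplification and not a model taken from the study; the function name and rate constants are hypothetical. At steady state, the phosphorylated fraction of the substrate depends only on the ratio of kinase to phosphatase activity.

```python
def phospho_fraction(kinase, phosphatase, k_kin=1.0, k_phos=1.0):
    """Steady-state phosphorylated fraction of a substrate.

    Assumes d(s)/dt = k_kin * kinase * (1 - s) - k_phos * phosphatase * s,
    which at steady state gives s = forward / (forward + reverse).
    Doses are in arbitrary units (1.0 = one gene copy).
    """
    forward = k_kin * kinase
    reverse = k_phos * phosphatase
    return forward / (forward + reverse)


baseline = phospho_fraction(1.0, 1.0)     # one copy of each enzyme
kinase_only = phospho_fraction(2.0, 1.0)  # kinase duplicated alone (SSD-like)
both = phospho_fraction(2.0, 2.0)         # both duplicated together (WGD-like)

print(f"baseline:          {baseline:.3f}")
print(f"kinase duplicated: {kinase_only:.3f}")
print(f"both duplicated:   {both:.3f}")
```

Under these assumptions, duplicating the kinase alone raises the phosphorylated fraction from one half to two thirds, whereas duplicating both enzymes leaves it unchanged. Real pathways are saturable and multi-step, so this only illustrates why co-duplication of opposing regulators could be less disruptive than duplicating one of them.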
Duplicates as an innovation factory
Another receptor tyrosine kinase (RTK) family, the Ephs, has expanded by WGD and SSD from one gene in invertebrates to 14 in human, giving rise to a similar explosion in complexity through heterodimerization and ligand cross-talk. This richness is used extensively in developmental patterning, and demonstrates continued evolvability. For instance, in chicken, graded expression of EphA3 across the retina provides the basis for spatial mapping of retinal ganglion cells projecting to the tectum. However, in mouse, EphA3 is not expressed in these cells, and instead EphA5 and EphA6 fulfill this role, suggesting that new and swapped functions can emerge from duplicates long after they have acquired essential roles, and that WGD can represent a quantum leap in the potential for new complexity and evolvability within the vertebrates. We estimate that, excluding the Ephs, 2R caused an expansion of RTKs from 20 to 46, but only two new human RTKs have emerged since then (ES and GM, unpublished): the two rounds of WGD thus seem to have been crucially important in shaping human RTK signaling.
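The RTK estimate can be put in proportion with a little arithmetic. The sketch below uses only the two counts quoted above (20 before 2R, 46 after, Ephs excluded) and assumes, for illustration, that all 46 descend from the 20 ancestral genes through the two duplications alone; the variable names are ours.

```python
ancestral_rtks = 20  # non-Eph RTKs before 2R (estimate quoted in the text)
post_2r_rtks = 46    # non-Eph RTKs after 2R (estimate quoted in the text)
rounds = 2           # two rounds of whole genome duplication

# Each round doubles every gene, so two rounds give at most four copies each.
max_copies = ancestral_rtks * 2 ** rounds
retained_fraction = post_2r_rtks / max_copies
mean_copies_per_ancestor = post_2r_rtks / ancestral_rtks

print(f"maximum possible after 2R: {max_copies}")
print(f"fraction of copies retained: {retained_fraction:.3f}")
print(f"mean copies per ancestral gene: {mean_copies_per_ancestor:.1f}")
```

On these assumptions, a little over half of the 80 possible copies were kept, or about 2.3 copies per ancestral gene, compared with only two RTKs gained in all the time since.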
One notable aspect of the patterns reported by Huminiecki and Heldin is how similar they are to those seen in other WGD events [8–10]. Enrichment in signaling proteins and transcription factors has also been seen in WGD from yeast, plants, and fish. Conversely, other genes (mostly those involved in basic cellular processes) preferentially return to singleton status, and similarities in these loss patterns can also be detected across kingdoms. While SSDs show more lineage-specific variability, there are also similarities, such as the increased SSD rate in plant secondary metabolic genes involved in pathogen defense mimicking the increased vertebrate SSD in immune genes.
2R: the future
It is tempting to speculate from these observations that WGD produces a consistent drive towards higher complexity, and the two rounds of vertebrate WGD doubly so. However, it is a vexed question exactly what is meant by complexity. It is not clear, for example, that fish and frogs, which have undergone an extra round of genome duplication, are more complex than humans, which have not.
The kind of molecular archaeology pursued by Huminiecki and Heldin is not just of academic interest: detailed comparison of ohnologs from many species can provide the unique sequence signatures underlying their specific functions, and patterns of gain or loss can help us to understand functional interactions between genes. As more vertebrate genomes become available, we will gain greater precision in determining orthology, synteny and post-2R changes. Knowing the trends in ohnolog retention and the history of human genes will help us to better understand their dosage sensitivity, and the shared and unique functions of all ohnologs.
- Sémon M, Wolfe KH: Consequences of genome duplication. Curr Opin Genet Dev. 2007, 17: 505-512.
- Kasahara M: The 2R hypothesis: an update. Curr Opin Immunol. 2007, 19: 547-552. 10.1016/j.coi.2007.07.009.
- Huminiecki L, Heldin C-H: 2R and modeling of vertebrate signal transduction engine. BMC Biol. 2010, 8: 146.
- Li H, Coghlan A, Ruan J, Coin LJ, Hériché JK, Osmotherly L, Li R, Liu T, Zhang Z, Bolund L, Wong GK, Zheng W, Dehal P, Wang J, Durbin R: TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res. 2006, 34 (Database issue): D572-580. 10.1093/nar/gkj118.
- Makino T, McLysaght A: Ohnologs in the human genome are dosage balanced and frequently associated with disease. Proc Natl Acad Sci USA. 2010, 107: 9270-9274. 10.1073/pnas.0914697107.
- Bublil EM, Yarden Y: The EGF receptor family: spearheading a merger of signaling and therapeutics. Curr Opin Cell Biol. 2007, 19: 124-134. 10.1016/j.ceb.2007.02.008.
- Lemke G, Reber M: Retinotectal mapping: new insights from molecular genetics. Annu Rev Cell Dev Biol. 2005, 21: 551-580. 10.1146/annurev.cellbio.20.022403.093702.
- Blomme T, Vandepoele K, De Bodt S, Simillion C, Maere S, Van de Peer Y: The gain and loss of genes during 600 million years of vertebrate evolution. Genome Biol. 2006, 7: R43. 10.1186/gb-2006-7-5-r43.
- Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, Kuiper M, Van de Peer Y: Modeling gene and genome duplications in eukaryotes. Proc Natl Acad Sci USA. 2005, 102: 5454-5459. 10.1073/pnas.0501102102.
- Paterson AH, Chapman BA, Kissinger JC, Bowers JE, Feltus FA, Estill JC: Many gene and domain families have convergent fates following independent whole-genome duplication events in Arabidopsis, Oryza, Saccharomyces and Tetraodon. Trends Genet. 2006, 22: 597-602. 10.1016/j.tig.2006.09.003.
- Freeling M, Thomas BC: Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Res. 2006, 16: 805-814. 10.1101/gr.3681406.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.