Genome project opens the book on human evolution
February 12, 2001
Like an enormous library, the human genome project now awaits the work of a generation of scientists who will catalogue and organize its contents and begin to read and understand its secrets. Researchers at the University of Chicago open the book on human molecular evolution with a paper in the February 12, 2001, issue of Nature.
Evolutionary genomics, using computational analysis of whole genomes to directly address important questions about evolutionary biology, can now be applied to the understanding of human genes and their regulatory sequences.
"In this first exploration of the human genome data, we addressed questions interesting to molecular evolution that could be answered in some detail in a short time frame," said Wen-Hsiung Li, PhD, George Wells Beadle Distinguished Service Professor in the department of ecology and evolution at the University of Chicago.
One of the puzzles of human evolution has been the much higher percentage of repetitive DNA, stretches of DNA that are not genes but that share the same sequence of base pairs, in the human genome than in invertebrate genomes. The function of this so-called "junk DNA" has been a mystery. These repetitive elements (transposable elements) are so abundant in our genome mainly because they are inserted faster than they can be removed--not because they confer an advantage on us.
The University of Chicago researchers confirmed the very high percentage of repetitive elements in the human genome--their analysis found it to be 43 percent, while repetitive elements in the genomes of organisms as diverse as Drosophila (fruit flies) and Arabidopsis (a mustard plant) average 10 percent. In addition, they were able to look at the location of these elements.
These repetitive elements, particularly the element known as Alu, were found in a surprising number of proteins.
"We have always assumed that insertions of repetitive elements into genes would be deleterious, that they would impair the protein's ability to function," said Li. "Instead we find a surprisingly large number in translated proteins."
The repetitive elements seem to insert into non-coding regions of a gene and be incorporated into protein through alternative splicing. Because the elements contain splicing sites--places where the editing machinery of the cell cuts genes for translation into proteins--new proteins may be created as the coding regions of the old gene are reshuffled, elongated, or truncated. The location and distribution of the human repetitive elements may hint at their role in gene evolution and species differentiation.
Many proteins may also have evolved by picking up structural or functional elements, called domains, from other proteins and mixing and matching these elements to develop altered or improved functions. The percentage of human proteins that are considered mosaics, i.e., proteins that have more than one domain, is quite high--28 percent.
Li's group looked at how often domain sharing is conserved, that is, how often two or more proteins have the same combination of domains. This, too, was very high in human proteins--for example, there are 88 cases where three proteins share two types of domain--indicating that this may be important to protein evolution.
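To make the idea of domain sharing concrete, here is a minimal sketch in Python. It is not the researchers' method: the function name, its parameters, and the toy protein and domain labels are all hypothetical, and it assumes each protein has already been annotated with a list of domain types. It simply groups proteins by their combination of domain types and keeps the combinations found in more than one protein.

```python
from collections import defaultdict


def shared_domain_combinations(protein_domains, min_proteins=2, min_domains=2):
    """Group proteins by their combination of domain types.

    protein_domains maps a protein name to a list of domain type labels.
    Returns only combinations with at least min_domains distinct types
    that occur in at least min_proteins proteins.
    (Hypothetical helper for illustration only.)
    """
    groups = defaultdict(list)
    for protein, domains in protein_domains.items():
        combo = frozenset(domains)  # order and repeats of a domain type are ignored
        if len(combo) >= min_domains:
            groups[combo].append(protein)
    return {
        combo: sorted(names)
        for combo, names in groups.items()
        if len(names) >= min_proteins
    }


# Made-up toy annotations, not data from the study.
toy = {
    "protA": ["SH2", "SH3"],
    "protB": ["SH3", "SH2"],
    "protC": ["SH2", "SH3", "SH2"],
    "protD": ["kinase"],
    "protE": ["kinase", "SH2"],
}

shared = shared_domain_combinations(toy)
for combo, names in shared.items():
    print(sorted(combo), "shared by", names)
```

In this toy set, three proteins share the same two domain types, which is one instance of the kind of case the researchers counted. Treating a combination as an unordered set of types is a simplifying assumption; an analysis that cared about domain order or copy number would need a different key.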
In comparing domain sharing in the human genome with that in three other organisms--fruit flies, nematodes, and yeast--the researchers found that domain sharing is both common and highly conserved.
Olfactory receptors, immunoglobulins, and keratins were among the largest families of proteins. The largest gene family in the human genome was the remnant of an invader, reverse transcriptase, a gene found in the L1 repetitive element--probably an early interloper into the genome capable of copying and reinserting itself millions of times.
"We expected olfactory receptors to be high because even the nematode has a large number," said Li. "But we were surprised by the fourth largest family--keratins."
"Many challenges to our analysis of the human genome remain," said Li. "As the human genome is better annotated and databases for genes and proteins are improved, more rigorous analysis will be possible."
Wen-Hsiung Li, PhD, is the George Wells Beadle Distinguished Service Professor in the department of ecology and evolution at the University of Chicago. Anton Nekrutenko, PhD, is a research associate in the department of ecology and evolution at the University of Chicago. Zhenglong Gu and Haidong Wang are graduate students in the department of ecology and evolution at the University of Chicago.
This research was supported by a grant from the National Institutes of Health.