Thursday, June 23, 2011

What Jeff didn't tell you

Jeff Tomkins has a new paper out in ARJ this week, "How Genomes are Sequenced and Why it Matters: Implications for Studies in Comparative Genomics of Humans and Chimpanzees." Jeff and I go way back, and I think it's fair to say that he helped me tremendously in learning as much as I know about genomics. So I write this post with the utmost respect - respectful disagreement, but definitely respect. His paper is a really decent summary of recent genome sequencing techniques, and it raises some interesting points about how the chimp genome sequences were obtained. He concludes:
A majority of the public and scientific community are not aware of these caveats [how the chimp genome sequence has been generated] and still hold to the dogma that the human genome is 98 to 99% similar to chimpanzee, which is most likely not the case. The fact is that major differences between the structure of the human and a chimpanzee genomes are now being documented as the genomic resources improve.
The "major differences" he cites there are those found in the Y chromosome, which did indeed show significant differences between the human and chimp sequences.

He does leave out one very, very important issue, however: the Y chromosome is unrepresentative of the entire genome, so its differences should not be considered vindication of the opinion that the chimp genome as a whole will turn out to be much more different from the human genome than is generally reported. For example, in an early paper on single nucleotide polymorphism (SNP) discovery in the human genome, the International SNP Map Working Group reported SNP frequencies about four times higher on the human Y chromosome than on the human autosomes (non-sex chromosomes). SNPs are single nucleotide differences that exist within a species' gene pool. SEE CORRECTION We also know that the human Y chromosome is a hotbed of repetitive sequences, which are much denser there than on the autosomes, and that leads to even more variability. So when the chimp Y chromosome was sequenced, it was not surprising to find that it was very different from the human Y chromosome.

In fact, a 2003 research paper by Rozen et al. seemed to say exactly that. Rozen et al. looked at palindromic sequences in the Y chromosomes of humans and apes - sequences that read the same forward and backward. The human Y chromosome contains eight palindromes, which are enormous as far as palindromes go. A typical palindrome sequence might be a few tens of nucleotides, but the ones on the human Y chromosome range from 9,000 nucleotides to 1,450,000 nucleotides. That's HUGE! Rozen et al. tried to find six of these palindromes in the genomes of chimpanzee, bonobo, and gorilla. They found five in chimps, four in bonobos, and only two in gorillas. That was the first hint that the Y chromosomes were going to be very different in the human and chimp genomes. When the final chimp Y chromosome sequence was published, it turned out to have 19 palindromes, with only 7 shared with the human Y chromosome. The key point is that this difference was anticipated because we already knew that the Y chromosome was a really variable chromosome.
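For readers who like to see the palindrome idea concretely, here is a toy sketch in Python. It is my own illustration, not anything from Rozen et al., and the function names are made up for this example. It assumes the usual molecular biology sense of "palindrome," in which a DNA sequence reads the same on both strands (that is, it equals its own reverse complement), and it only handles short, exact matches built from the letters A, C, G, and T. The real Y chromosome palindromes are vastly larger than anything you would test this way.

```python
# Illustrative only: a toy check on short strings, not a genome analysis tool.
# Assumes sequences contain only the letters A, C, G, and T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if the sequence reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)


print(is_dna_palindrome("GAATTC"))   # a classic six-letter palindrome
print(is_dna_palindrome("GATTACA"))  # not a palindrome
```

The first call prints True and the second prints False. Scaling the same idea up to sequences thousands or millions of nucleotides long, with imperfect matches, is a much harder problem than this sketch suggests.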

Why does this matter? Because the Y chromosome is not like the other chromosomes in the human genome. The autosomes are far less variable and have less repetitive sequence. You therefore cannot extrapolate the similarity of autosomes from the similarity of Y chromosomes. So the chimp Y chromosome gives us no hope whatsoever that the true similarity of chimp and human autosomes will be very low.

I'll have more to say about the chimpanzee genome at the CBS conference. I hope to see you there!

Feedback? Email me at toddcharleswood [at] gmail [dot] com.