RTB and the chimp genome Part 5

This week, I've been discussing a recent blog post by Dennis Venema critiquing RTB's published statements on the chimp genome. Previous posts (one, two, three, and four) have examined Fuz Rana's four-part response posted on the RTB website (one, two, three, and four). I've already documented a series of factual errors and misquotations in Rana's posts. Yesterday, I tested Rana's hypothesis about why the human and chimpanzee genomes are not really 98-99% identical, and my results showed that Rana was wrong. Today, I want to address Rana's claim that such similarity measures are "meaningless."

The theme of Rana's third (abbreviated) post responding to Venema is that gross genomic similarity doesn't tell you anything about how similar or different organisms really are. According to Rana,
The crux of his [Venema's] criticism was that... We misrepresent scientific opinion when we claim that the genetic comparisons between humans and chimpanzees that describe the differences (or similarities) in terms of percentages are meaningless.
I should point out that the criticism here attributed to Venema was not Venema's. Looking over Venema's post, I don't find that criticism at all. Venema is concerned primarily with how RTB represented the research on the similarity of the human and chimp genomes, not what that similarity means.

Despite this erroneous attribution, I think Rana is trying to make an interesting point. I definitely sympathize with his claim that percent identity measurements don't tell us much. At the same time, I think we need to be careful about what we mean when we talk about the "meaning" of similarity.

The significance of the chimp/human genome similarity can be interpreted in two different ways. I'm going to distinguish here the biological meaning from the phylogenetic meaning, which kind of overlap, but I think you'll get my point. Biologically, the similarity of the human and chimpanzee genomes is surprising, and this surprise was expressed very early in the history of molecular biology. In a 1975 article, Mary-Claire King and Allan Wilson wrote,
During the past decade, many workers have participated in the development and application of biochemical methods for estimating genetic distance. ... The intriguing result ... is that all the biochemical methods agree in showing that the genetic distance between humans and the chimpanzee is probably too small to account for their substantial organismal differences.
What could account for the biological differences? King and Wilson proposed that it must be differences in the regulatory elements (the switches that turn the genes on and off), rather than the genes themselves, that explained why chimpanzees are so different from humans.

These are the differences that Rana emphasizes when he talks about the "meaning" of the genomic similarity between chimps and humans. In his third response to Venema, Rana directed his readers to a previous post where he discusses what he called the 1% myth. He wrote, "The 99% genetic similarity provides limited biological insight, at best." True enough. In fact, the 99% similarity makes the differences all the more baffling.

Rana's "1% myth" post relies heavily on a piece written by Jon Cohen for Science in 2007 that explores the very same themes. From Cohen's article,
Using novel yardsticks and the flood of sequence data now available for several species, researchers have uncovered a wide range of genomic features that may help explain why we walk upright and have bigger brains - and why chimps remain resistant to AIDS and rarely miscarry. Researchers are finding that on top of the 1% distinction, chunks of missing DNA, extra genes, altered connections in gene networks, and the very structure of chromosomes confound any quantification of "humanness" versus "chimpness."

Once again, though, Rana does not summarize the article accurately. Rana wrote,
Cohen identifies several key differences between human and chimp genomes that went unnoticed until recently because of the fixation on the 1% genetic difference. For example: The true genetic similarity between humans and chimps is not 99% (which is based on substitution mutations). Instead it’s about 90% when indels (insertions and deletions in the DNA sequences) are considered.
The only place Cohen's article mentioned indels is in a paragraph that summarized the chimp genome paper, which put the indels at about 3% of the size of the human genome. Add that to the 1.23% single nucleotide differences, and you get a 4.23% difference, which would imply a similarity of just under 96%. The only way to turn that into 90% similarity is to add the 6% copy number difference that Rana also cites, but if you consult the original paper that the 6% copy number difference is based on, you find that it's a percentage of the number of genes that differ, not a percent sequence difference. In any event, Cohen never wrote what Rana said he wrote.
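For anyone who wants to check the arithmetic, here is a minimal Python sketch. The three figures come from the paragraph above; the variable and function names are my own, and treating the "about 3%" indel figure as exactly 3.0 is an assumption made only for illustration.

```python
# Figures quoted in the text above, all expressed as percentages.
SNP_DIFF = 1.23     # single nucleotide differences (percent of sequence)
INDEL_DIFF = 3.0    # insertions and deletions ("about 3%", assumed exact here)
COPY_NUMBER = 6.0   # percent of GENES differing in copy number, not percent of sequence


def similarity(*differences):
    """Return percent similarity after subtracting the given percent differences."""
    return round(100.0 - sum(differences), 2)


# Adding the two sequence-based figures is a like-for-like sum.
sequence_based = similarity(SNP_DIFF, INDEL_DIFF)

# Getting near 90% requires also subtracting the gene-count figure,
# which mixes two different units in one total.
mixed_units = similarity(SNP_DIFF, INDEL_DIFF, COPY_NUMBER)

print(sequence_based)
print(mixed_units)
```

The first number is what the two sequence-based figures support. The second only lands near 90% because a percentage of genes has been added to percentages of sequence, which is the unit mismatch described above.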

As I said above, I sympathize with the point that Rana wants to make. The striking similarity of the human and chimp genomes simply does not account for why the two species are so different. Looking just at a raw percent identity of DNA sequence really is biologically meaningless by itself, but that's not the same kind of "meaning" as the phylogenetic meaning. To an evolutionary biologist, phylogenetic information tells us which species are closely related and which are more distantly related. In this instance, Rana is concerned with whether or not chimps and humans are related at all, which is obviously related to the phylogenetic meaning of sequence similarity rather than strictly the biological one. From a phylogenetic standpoint, arguing for the common ancestry of the human and chimp genomes does not depend on biological similarities or differences. In fact, as Venema demonstrated in the second post of his critique of RTB's position and his PSCF paper, the parts of the genome with no known biological function are far more potent arguments for human/chimp common ancestry than the characterized functional differences.

Furthermore, the phylogenetic significance of a "percent similarity" doesn't necessarily depend on the precise value of the similarity. Sure, in the case of humans and chimpanzees, it seems like a pretty big thing that we have nearly identical genomes. But if that were not the case, if the chimp genome were only 80% or 75% identical to the human genome, that would still be pretty good evidence of kinship between humans and chimps. Rana seems to think that lower similarity is a significant issue. He wrote, "If a 99% genetic similarity implies a close evolutionary relationship, what does a 90% similarity mean?" It would just mean that humans and chimps had a slightly less close evolutionary relationship.

Don't believe me? Let's do a little thought experiment: Imagine there were no chimpanzees. Evolutionary biologists would then emphasize the genomic similarity to gorillas. If there were no great apes at all, then we'd hear about the similarity to monkeys. If there were no other primates, we'd be confronted with the general similarity of all mammalian genomes. Shoot, humans have a high degree of genomic similarity to elephants, mice, and aardvarks. The chimp genome just sticks out because it is the most similar, not because it's 90% or 95% or 99.999% similar. The actual number (whatever it is) doesn't really matter all that much when two genomes share statistically significant similarity.

So by confusing the biological with the phylogenetic significance of similarity, Rana makes an interesting point that isn't all that relevant to the question of the hypothesized common ancestry between humans and chimps. Ultimately, it seems like Rana just wants to say that chimps and humans are too different to share a common ancestor, which by itself is not a very interesting (or even testable) claim.

I wish at this point I could say that I'm done with RTB's position on the chimp genome, but I'm not. Tomorrow, I need to address part four of Rana's response to Venema, then in the final part of this series, I'm going to try to wrap this up and draw some conclusions from what I've found in Rana's work.

King & Wilson. 1975. Evolution at two levels in humans and chimpanzees. Science 188:107-116.

Feedback? Email me at toddcharleswood [at] gmail [dot] com.