A public effort to sequence the cotton genome was initiated in 2007 by a consortium of public researchers. They agreed on a strategy to sequence the genome of cultivated, tetraploid cotton. "Tetraploid" means that cultivated cotton actually has two separate genomes within its nucleus, referred to as the A and D genomes. The consortium first agreed to sequence the D-genome relative of cultivated cotton (G. raimondii, a wild Central American cotton species) because of its small size and limited number of repetitive elements. Its genome has nearly one-third as many bases as that of tetraploid cotton (AD), and each chromosome is present only once. The A genome of G. arboreum would be sequenced next. Its genome is roughly twice the size of G. raimondii's. Part of the size difference between the two genomes is due to the amplification of retrotransposons (GORGE). Once both diploid genomes were assembled, researchers could begin sequencing the actual genomes of cultivated cotton varieties. This strategy was adopted out of necessity: if one were to sequence the tetraploid genome without model diploid genomes, the euchromatic DNA sequences of the AD genomes would co-assemble, while the repetitive elements of the AD genomes would assemble independently into A and D sequences respectively. There would then be no way to untangle the mess of AD sequences without comparing them to their diploid counterparts.

What is the final goal of sequencing the diploid cotton genomes first?