Assembly

This site presents the 6X whole-genome shotgun assembly of a female Sumatran orangutan (Pongo pygmaeus abelii) named Susie, housed at the Gladys Porter Zoo (Brownsville, TX). The primary donor-derived reads were assembled with PCAP (Huang, 2006) under stringent parameters. Aligning the orangutan genome against the human genome made it possible to identify interchromosomal cross-overs and thus eliminate global mis-assemblies larger than 50kb.

Of the 3.09Gb of total sequence, 3.08Gb are ordered and oriented along the chromosomes. Gap sizes between supercontigs were estimated based on their size in human, with a maximum allowed gap size of 30kb.
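The gap-sizing rule above can be illustrated with a minimal sketch. It assumes the gap between two adjacent supercontigs is taken from the distance between their aligned positions in human and then capped at 30kb. The function name `estimate_gap` and the treatment of negative distances (overlapping human alignments mapped to a zero-length gap) are assumptions made for illustration, not details of the actual assembly pipeline.

```python
MAX_GAP_BP = 30_000  # maximum allowed gap size stated in the text (30kb)


def estimate_gap(human_gap_bp: int, max_gap_bp: int = MAX_GAP_BP) -> int:
    """Estimate the gap between two adjacent supercontigs.

    human_gap_bp is the distance, in base pairs, between the regions of the
    human genome that the two supercontigs align to.
    """
    if human_gap_bp < 0:
        # Assumption: alignments that overlap in human imply no gap at all.
        return 0
    # Use the human-derived size, but never exceed the allowed maximum.
    return min(human_gap_bp, max_gap_bp)


if __name__ == "__main__":
    for human_gap in (12_000, 250_000, -5):
        print(human_gap, "->", estimate_gap(human_gap))
```

Under this rule a large human interval does not inflate the orangutan chromosome length: any gap that would exceed 30kb is recorded as exactly 30kb.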

The orangutan genome has been released in pre-publication status by the Genome Sequencing Center at Washington University, St Louis. The data are freely available to anyone, but the center has asked that other groups publishing on these pre-publication data respect the accepted scientific ethics. These are set out in detail in the Fort Lauderdale agreement. In brief, small-scale analysis, e.g. the analysis of a single locus, is an expected use of the data and can be published without any expectation of coordination. In contrast, large-scale, genome-wide analysis is expected either to be coordinated with the Orangutan Analysis group in some manner or to be published after the initial paper. The reasoning behind this is explained in the Fort Lauderdale document.

Download Orangutan genome sequence (FASTA)

Annotation

Because of the high sequence similarity to the human genome, the orangutan genebuild was based on a projection of human gene structures. The projections were made through chained whole-genome BLASTZ alignments. The projected genes were combined with orangutan-specific proteins, and additional human genes were added using Exonerate where the projection could not produce satisfactory gene models. UTRs were added using orangutan-specific ESTs and cDNAs as well as human cDNAs.
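As a rough illustration of what projecting a gene structure through an alignment involves, the sketch below maps a human exon onto orangutan coordinates using a list of ungapped alignment blocks. It is a simplified model and not the genebuild code: the block representation, the function names `project_interval` and `projected_fraction`, and the restriction to same-strand alignments are all assumptions made for this example.

```python
from typing import List, Tuple

# One ungapped aligned segment: (human_start, orangutan_start, length).
# Coordinates are 0-based and half-open; both sequences are assumed to be
# on the same strand (a simplification).
Block = Tuple[int, int, int]


def project_interval(start: int, end: int, blocks: List[Block]) -> List[Tuple[int, int]]:
    """Project the human interval [start, end) onto orangutan coordinates.

    Returns one (orangutan_start, orangutan_end) piece per block that
    overlaps the interval; parts falling in alignment gaps are dropped.
    """
    if end <= start:
        raise ValueError("interval must have positive length")
    pieces = []
    for h_start, o_start, length in blocks:
        lo = max(start, h_start)
        hi = min(end, h_start + length)
        if lo < hi:
            offset = o_start - h_start  # constant shift within an ungapped block
            pieces.append((lo + offset, hi + offset))
    return pieces


def projected_fraction(start: int, end: int, blocks: List[Block]) -> float:
    """Fraction of the human interval that is covered by aligned blocks."""
    covered = sum(e - s for s, e in project_interval(start, end, blocks))
    return covered / (end - start)


if __name__ == "__main__":
    chain = [(100, 1000, 50), (160, 1055, 40)]  # hypothetical chain of two blocks
    print(project_interval(120, 180, chain))
    print(projected_fraction(120, 180, chain))
```

A coverage measure such as `projected_fraction` is one plausible way to decide when a projected model is unsatisfactory and a fallback alignment of the human gene is worth attempting; the thresholds actually used are not stated in the text.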