Assembly

This is the first draft sequence and assembly of the Atlantic cod (Gadus morhua) genome, provided by the cod genome consortium. Whole-genome shotgun and paired-end data were generated on the Roche 454 FLX Titanium platform. The 0.9 Gb genome was sequenced to 25x coverage and assembled with Newbler, giving an assembly with a contig N50 of 2.3 kb and a scaffold N50 of 687.7 kb.
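For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The function names `fasta_lengths` and `n50` are illustrative assumptions, not part of the consortium's or Newbler's tooling, and the sketch makes no claim about how the published values were actually derived.

```python
from typing import Iterable, Iterator


def fasta_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the length of each record in FASTA-formatted lines.

    Hypothetical helper: a header line starts with '>', and every
    following non-empty line is counted as sequence for that record.
    """
    length = None  # None until the first header has been seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("N50 is undefined for an empty assembly")
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the midpoint.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the loop always returns")


if __name__ == "__main__":
    # Toy example, not cod data: three records of 6, 4 and 2 bases.
    toy = [">s1", "ACGTAC", ">s2", "ACGT", ">s3", "AC"]
    print(n50(fasta_lengths(toy)))
```

To apply this to a downloaded FASTA file, the open file handle can be passed directly to `fasta_lengths`, since it only needs an iterable of text lines. Note that this counts every character on a sequence line, including any `N` gap characters in scaffolds.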

The genome assembly represented here corresponds to EMBL-Bank WGS Master CAEA00000000.1

Download Cod genome sequence (FASTA)

Annotation

Owing to the fragmentary nature of the Atlantic cod assembly, it was necessary to combine the standard protein-evidence-based annotation approach with a complementary annotation method based on a whole-genome alignment to stickleback. Some scaffolds were rearranged into "gene-scaffold" super-structures using our projection method, and 17,920 of the 20,787 protein-coding stickleback genes (about 86%) were mapped onto the reorganized scaffolds. In addition, protein-coding genes, pseudogenes and non-coding RNAs were annotated using the standard protein-evidence-based Ensembl pipeline. Together, these approaches resulted in a final gene set of 20,095 protein-coding genes, 518 pseudogenes, and 1,541 non-coding RNA genes.