
Genomic alignments

BlastZ-net/Lastz-net Pairwise Alignment Analysis

BlastZ-net (Schwartz S et al., Genome Res., 2003;13(1):103-7; Kent WJ et al., Proc Natl Acad Sci U S A., 2003;100(20):11484-9) or, in its newer version, LastZ-net alignments are provided for closely related pairs of species. The alignments are the result of post-processing the raw BlastZ or LastZ output. In the first step, the original blocks are chained according to their location in both genomes. The netting process then chooses, for the reference species, the best sub-chain in each region. In each group below, the reference species of the BlastZ-net or LastZ-net alignments is listed first, on its own line, followed by the species aligned to it:

Human (Homo sapiens)

Gorilla (Gorilla gorilla)
Chimpanzee (Pan troglodytes)
Orangutan (Pongo abelii)
Gibbon (Nomascus leucogenys)
Macaque (Macaca mulatta)
Marmoset (Callithrix jacchus)
Tarsier (Tarsius syrichta)
Mouse Lemur (Microcebus murinus)
Bushbaby (Otolemur garnettii)
Mouse (Mus musculus)
Rat (Rattus norvegicus)
Kangaroo rat (Dipodomys ordii)
Squirrel (Spermophilus tridecemlineatus)
Guinea Pig (Cavia porcellus)
Pika (Ochotona princeps)
Rabbit (Oryctolagus cuniculus)
Tree Shrew (Tupaia belangeri)
Sloth (Choloepus hoffmanni)
Armadillo (Dasypus novemcinctus)
Lesser hedgehog tenrec (Echinops telfairi)
Elephant (Loxodonta africana)
Hyrax (Procavia capensis)
Hedgehog (Erinaceus europaeus)
Shrew (Sorex araneus)
Microbat (Myotis lucifugus)
Megabat (Pteropus vampyrus)
Horse (Equus caballus)
Cat (Felis catus)
Dog (Canis familiaris)
Panda (Ailuropoda melanoleuca)
Dolphin (Tursiops truncatus)
Pig (Sus scrofa)
Cow (Bos taurus)
Alpaca (Vicugna pacos)
Tasmanian devil (Sarcophilus harrisii)
Wallaby (Macropus eugenii)
Opossum (Monodelphis domestica)
Platypus (Ornithorhynchus anatinus)
Chicken (Gallus gallus)

Mouse (Mus musculus)

Rat (Rattus norvegicus)
Dog (Canis familiaris)
Platypus (Ornithorhynchus anatinus)
Medaka (Oryzias latipes)

Dog (Canis familiaris)

Horse (Equus caballus)
Panda (Ailuropoda melanoleuca)

Pig (Sus scrofa)

Cow (Bos taurus)

Opossum (Monodelphis domestica)

Tasmanian devil (Sarcophilus harrisii)
Wallaby (Macropus eugenii)

Chicken (Gallus gallus)

Turkey (Meleagris gallopavo)
Zebra Finch (Taeniopygia guttata)
Anole Lizard (Anolis carolinensis)

Zebrafish (Danio rerio)

Cod (Gadus morhua)

Medaka (Oryzias latipes)

Stickleback (Gasterosteus aculeatus)

Stickleback (Gasterosteus aculeatus)

Cod (Gadus morhua)

C.intestinalis (Ciona intestinalis)

C.savignyi (Ciona savignyi)
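
The chaining and netting steps described at the start of this section can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the actual pipeline code: the `Block` type, the function names and the scores are hypothetical, gap penalties and strand are ignored, and the netting step here drops an overlapping chain entirely instead of trimming it to the best sub-chain.

```python
from typing import NamedTuple

class Block(NamedTuple):
    """One raw alignment block (hypothetical layout, same strand assumed)."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    score: float

def best_chain(blocks):
    """Highest-scoring chain of blocks that are in the same order in both genomes."""
    ordered = sorted(blocks, key=lambda b: (b.ref_start, b.qry_start))
    if not ordered:
        return []
    best = [b.score for b in ordered]   # best chain score ending at each block
    prev = [-1] * len(ordered)          # back-pointers for the traceback
    for i, b in enumerate(ordered):
        for j in range(i):
            a = ordered[j]
            collinear = a.ref_end <= b.ref_start and a.qry_end <= b.qry_start
            if collinear and best[j] + b.score > best[i]:
                best[i] = best[j] + b.score
                prev[i] = j
    i = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    return chain[::-1]

def chain_blocks(blocks):
    """Chaining: repeatedly peel off the best chain until no blocks remain."""
    remaining = list(blocks)
    chains = []
    while remaining:
        chain = best_chain(remaining)
        chains.append(chain)
        used = set(chain)
        remaining = [b for b in remaining if b not in used]
    return chains

def net_chains(chains):
    """Netting (simplified): keep the best-scoring chain per reference region."""
    kept = []
    by_score = sorted(chains, key=lambda c: sum(b.score for b in c), reverse=True)
    for chain in by_score:
        start, end = chain[0].ref_start, chain[-1].ref_end
        if all(end <= k_start or start >= k_end for k_start, k_end, _ in kept):
            kept.append((start, end, chain))
    return [chain for _, _, chain in sorted(kept, key=lambda k: k[0])]

blocks = [
    Block(0, 100, 0, 100, 50),
    Block(120, 220, 5000, 5100, 30),   # same reference region, elsewhere in the query
    Block(150, 250, 160, 260, 40),
    Block(1000, 1100, 900, 1000, 20),
]
chains = chain_blocks(blocks)
net = net_chains(chains)
print(len(chains), "chains,", len(net), "kept after netting")
```

In this example the three collinear blocks form one chain, the out-of-order block forms a second chain, and netting keeps only the first because the second overlaps it on the reference.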

Translated Blat Pairwise Alignment Analysis

Translated blat (Kent W, Genome Res., 2002;12(4):656-64) is used to look for homologous regions between more distantly related pairs of species. We expect to find homologies mainly in coding regions. The raw translated blat results were passed through a chaining and netting procedure similar to that used for the BlastZ-net analyses to produce the best sub-chain for the reference species (Translated Blat Net).
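
To illustrate why a translated search finds homologies mainly in coding regions, the following Python sketch translates DNA in all six reading frames and compares two sequences at the protein level. It only shows the general idea; it is not how blat is implemented, and the function names and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna):
    """Three forward frames followed by three reverse-complement frames."""
    reverse = dna.upper().translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, reverse)
            for offset in range(3)]

# Two diverged DNA sequences that differ at 4 of 9 bases...
seq_a = "CTGAAAGAT"
seq_b = "TTAAAGGAC"
# ...but encode the same peptide, so they still match after translation.
print(translate(seq_a), translate(seq_b))
print(six_frames("ATGGCC"))
```

Because synonymous codon changes disappear after translation, coding sequence stays recognisable over much larger evolutionary distances than non-coding sequence.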

Translated Blat Net

Homo sapiens            H.sap
Mus musculus            -     M.mus
Rattus norvegicus       -     -     R.nor
Gallus gallus           YES   YES   -     G.gal
Meleagris gallopavo     YES   -     -     -     M.gal
Taeniopygia guttata     YES   -     -     -     -     T.gut
Anolis carolinensis     YES   -     -     -     -     -     A.car
Xenopus tropicalis      YES   YES   -     YES   -     -     -     X.tro
Latimeria chalumnae     YES   -     -     -     -     -     -     YES   L.cha
Danio rerio             YES   YES   YES   YES   -     -     -     YES   YES   D.rer
Gadus morhua            YES   -     -     -     -     -     -     -     -     -     G.mor
Oreochromis niloticus   YES   YES   -     -     -     -     -     -     -     YES   -     O.nil
Takifugu rubripes       YES   YES   -     -     -     -     -     -     -     YES   -     -     T.rub
Tetraodon nigroviridis  YES   YES   YES   -     -     -     -     YES   -     YES   -     -     -     T.nig
Oryzias latipes         YES   -     -     -     -     -     -     -     -     YES   -     -     -     -     O.lat
Gasterosteus aculeatus  YES   -     -     -     -     -     -     -     YES   YES   -     -     -     -     -     G.acu
Petromyzon marinus      YES   -     -     -     -     -     -     -     -     YES   -     -     -     -     -     YES   P.mar
Ciona intestinalis      YES   YES   -     -     -     -     -     -     -     YES   -     -     -     -     -     -     YES   C.int
Ciona savignyi          YES   -     -     YES   -     -     -     -     -     YES   -     -     -     -     -     -     -     -     C.sav
                        H.sap M.mus R.nor G.gal M.gal T.gut A.car X.tro L.cha D.rer G.mor O.nil T.rub T.nig O.lat G.acu P.mar C.int C.sav

PECAN Multiple Alignment Analysis

Pecan is used to provide global multiple genomic alignments. First, Mercator is used to build a synteny map between the genomes and then Pecan builds alignments in these syntenic regions.

Pecan is a global multiple sequence alignment program that makes practical the probabilistic consistency methodology for significant numbers of sequences of practically arbitrary length. As input it takes a set of sequences and a phylogenetic tree. The parameters and heuristics it employs are highly user configurable. It is written entirely in Java and also requires the installation of Exonerate. Read more about Pecan.

19 amniota vertebrates Pecan

(method_link_type="PECAN" : species_set_name="amniotes")
Human (Homo sapiens)
Gorilla (Gorilla gorilla)
Chimpanzee (Pan troglodytes)
Orangutan (Pongo abelii)
Macaque (Macaca mulatta)
Marmoset (Callithrix jacchus)
Mouse (Mus musculus)
Rat (Rattus norvegicus)
Rabbit (Oryctolagus cuniculus)
Horse (Equus caballus)
Dog (Canis familiaris)
Pig (Sus scrofa)
Cow (Bos taurus)
Opossum (Monodelphis domestica)
Platypus (Ornithorhynchus anatinus)
Chicken (Gallus gallus)
Turkey (Meleagris gallopavo)
Zebra Finch (Taeniopygia guttata)
Anole Lizard (Anolis carolinensis)

EPO Multiple Alignment Analysis

The new EPO (Enredo, Pecan, Ortheus) pipeline is a three-step pipeline for whole-genome multiple alignments. Enredo produces colinear segments from extant genomes, handling rearrangements, deletions and duplications. Pecan, as described above, is used to align these segments. Finally, Ortheus is used to create genome-wide ancestral sequence reconstructions. Further details on these methods can be found at:

The high coverage eutherian mammal alignments were generated using the recent EPO (Enredo Pecan Ortheus) pipeline.

Each alignment set can be accessed using the Compara API via the Bio::EnsEMBL::Compara::DBSQL::MethodLinkSpeciesSetAdaptor, using the method_link_type and either the list of species or the species_set_name.

3 neognath birds EPO

(method_link_type="EPO" : species_set_name="birds")
Chicken (Gallus gallus)
Turkey (Meleagris gallopavo)
Zebra Finch (Taeniopygia guttata)

5 teleost fish EPO

(method_link_type="EPO" : species_set_name="fish")
Zebrafish (Danio rerio)
Fugu (Takifugu rubripes)
Tetraodon (Tetraodon nigroviridis)
Medaka (Oryzias latipes)
Stickleback (Gasterosteus aculeatus)

6 primates EPO

(method_link_type="EPO" : species_set_name="primates")
Human (Homo sapiens)
Gorilla (Gorilla gorilla)
Chimpanzee (Pan troglodytes)
Orangutan (Pongo abelii)
Macaque (Macaca mulatta)
Marmoset (Callithrix jacchus)

12 eutherian mammals EPO

(method_link_type="EPO" : species_set_name="mammals")
Human (Homo sapiens)
Gorilla (Gorilla gorilla)
Chimpanzee (Pan troglodytes)
Orangutan (Pongo abelii)
Macaque (Macaca mulatta)
Marmoset (Callithrix jacchus)
Mouse (Mus musculus)
Rat (Rattus norvegicus)
Horse (Equus caballus)
Dog (Canis familiaris)
Pig (Sus scrofa)
Cow (Bos taurus)

The full set of eutherian mammal alignments was not generated using the EPO pipeline due to difficulties with running Ortheus on the low-coverage genomes. Instead, the low-coverage genomes were projected onto the high-coverage EPO eutherian mammal alignments using (B)lastZ-net alignments.

35 eutherian mammals EPO_LOW_COVERAGE

(method_link_type="EPO_LOW_COVERAGE" : species_set_name="mammals")
Human (Homo sapiens)
Gorilla (Gorilla gorilla)
Chimpanzee (Pan troglodytes)
Orangutan (Pongo abelii)
Gibbon (Nomascus leucogenys)
Macaque (Macaca mulatta)
Marmoset (Callithrix jacchus)
Tarsier (Tarsius syrichta)
Mouse Lemur (Microcebus murinus)
Bushbaby (Otolemur garnettii)
Mouse (Mus musculus)
Rat (Rattus norvegicus)
Kangaroo rat (Dipodomys ordii)
Squirrel (Spermophilus tridecemlineatus)
Guinea Pig (Cavia porcellus)
Pika (Ochotona princeps)
Rabbit (Oryctolagus cuniculus)
Tree Shrew (Tupaia belangeri)
Sloth (Choloepus hoffmanni)
Armadillo (Dasypus novemcinctus)
Lesser hedgehog tenrec (Echinops telfairi)
Elephant (Loxodonta africana)
Hyrax (Procavia capensis)
Hedgehog (Erinaceus europaeus)
Shrew (Sorex araneus)
Microbat (Myotis lucifugus)
Megabat (Pteropus vampyrus)
Horse (Equus caballus)
Cat (Felis catus)
Dog (Canis familiaris)
Panda (Ailuropoda melanoleuca)
Dolphin (Tursiops truncatus)
Pig (Sus scrofa)
Cow (Bos taurus)
Alpaca (Vicugna pacos)

Ancestral sequences are inferred from the EPO multiple alignments using Ortheus. Ortheus is a probabilistic method for the inference of ancestor (a.k.a. tree) alignments. Its main contribution is the use of a phylogenetic model incorporating gaps to infer insertion and deletion events. Ancestral sequences are predicted for each node of the phylogenetic tree that relates the sequences. Each ancestral sequence is named after the extant species derived from it. For example, a sequence named Hsap, Ptro, Mmul corresponds to the ancestor of the Homo sapiens, Pan troglodytes, and Macaca mulatta genomes.
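
The naming scheme can be sketched in Python as follows. This is an illustration only: the abbreviation rule (first letter of the genus plus the first three letters of the species) and the ", " separator are assumptions inferred from the example above, and the nested-tuple tree and function names are hypothetical.

```python
def abbreviate(binomial):
    """'Homo sapiens' -> 'Hsap' (rule assumed from the example in the text)."""
    genus, species = binomial.split()[:2]
    return genus[0].upper() + species[:3].lower()

def leaves(tree):
    """Extant species under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def ancestor_names(tree, sep=", "):
    """One name per internal node, children before parents."""
    if isinstance(tree, str):
        return []
    names = []
    for child in tree:
        names.extend(ancestor_names(child, sep))
    names.append(sep.join(abbreviate(leaf) for leaf in leaves(tree)))
    return names

tree = (("Homo sapiens", "Pan troglodytes"), "Macaca mulatta")
for name in ancestor_names(tree):
    print(name)
```

For this three-species tree the sketch yields one name for the human-chimpanzee ancestor and one for the ancestor of all three species.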

Conservation Analysis

Additionally, we use Gerp (Cooper GM et al., Genome Res., 2005; 15:901-913) to calculate conservation scores and call constrained elements on the PECAN and EPO_LOW_COVERAGE multiple alignments. Conservation scores are estimated on a column-by-column basis. Constrained elements are stretches of the multiple alignment where the sequences are highly conserved according to these scores.
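
The relationship between per-column scores and constrained elements can be pictured with the Python sketch below. It is a deliberately simplified stand-in, not Gerp's procedure: it uses a fixed score threshold and a minimum length, both of which are hypothetical parameters, and the scores are made-up values.

```python
def constrained_elements(scores, threshold, min_length):
    """Return half-open (start, end) column ranges whose scores all reach threshold."""
    elements = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i                      # a conserved run begins
        else:
            if start is not None and i - start >= min_length:
                elements.append((start, i))    # run ended and is long enough
            start = None
    if start is not None and len(scores) - start >= min_length:
        elements.append((start, len(scores)))  # run reaches the last column
    return elements

# One made-up score per alignment column.
column_scores = [0.1, 2.5, 3.0, 2.8, 0.2, 2.9, 0.0, 2.6, 2.7, 3.1, 2.9]
print(constrained_elements(column_scores, threshold=2.0, min_length=3))
```

Here the isolated high-scoring column at index 5 is not reported, because a constrained element is a stretch of columns, not a single position.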

Synteny Analysis

We calculate syntenic regions using BlastZ-net alignments. We look for stretches where the alignment blocks are in synteny. The search is run in two phases. In the first phase, syntenic alignments that are closer than 200 kbp are grouped. In the second phase, the groups that are in synteny are linked, provided that no more than 2 non-syntenic groups are found between them and they are less than 3 Mbp apart.

Homo sapiens              H.sap
Gorilla gorilla           YES   G.gor
Pan troglodytes           YES   -     P.tro
Pongo abelii              YES   -     -     P.abe
Macaca mulatta            YES   -     -     -     M.mul
Callithrix jacchus        YES   -     -     -     -     C.jac
Mus musculus              YES   -     -     -     -     -     M.mus
Rattus norvegicus         YES   -     -     -     -     -     YES   R.nor
Oryctolagus cuniculus     YES   -     -     -     -     -     -     -     O.cun
Equus caballus            YES   -     -     -     -     -     -     -     -     E.cab
Canis familiaris          YES   -     -     -     -     -     YES   -     -     YES   C.fam
Sus scrofa                YES   -     -     -     -     -     -     -     -     -     -     S.scr
Bos taurus                YES   -     -     -     -     -     -     -     -     -     -     YES   B.tau
Monodelphis domestica     YES   -     -     -     -     -     -     -     -     -     -     -     -     M.dom
Ornithorhynchus anatinus  YES   -     -     -     -     -     -     -     -     -     -     -     -     -     O.ana
Gallus gallus             YES   -     -     -     -     -     -     -     -     -     -     -     -     -     -     G.gal
Meleagris gallopavo       -     -     -     -     -     -     -     -     -     -     -     -     -     -     -     YES   M.gal
Anolis carolinensis       -     -     -     -     -     -     -     -     -     -     -     -     -     -     -     YES   -     A.car
                          H.sap G.gor P.tro P.abe M.mul C.jac M.mus R.nor O.cun E.cab C.fam S.scr B.tau M.dom O.ana G.gal M.gal A.car
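
The two-phase synteny search described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the production code: "in synteny" is taken to mean the same chromosome pair with gaps below the limit in both genomes, orientation is ignored, phase one only compares each alignment with the previous one in reference order, and the `Region` type and function names are hypothetical.

```python
from typing import NamedTuple

class Region(NamedTuple):
    """An alignment block or a group of blocks (hypothetical layout)."""
    ref_chr: str
    ref_start: int
    ref_end: int
    qry_chr: str
    qry_start: int
    qry_end: int

def gap(a_start, a_end, b_start, b_end):
    """Distance between two intervals (0 if they overlap)."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))

def in_synteny(a, b, max_gap):
    """Same chromosome pair and closer than max_gap in both genomes."""
    return (a.ref_chr == b.ref_chr and a.qry_chr == b.qry_chr
            and gap(a.ref_start, a.ref_end, b.ref_start, b.ref_end) < max_gap
            and gap(a.qry_start, a.qry_end, b.qry_start, b.qry_end) < max_gap)

def merge(a, b):
    """Smallest region covering both a and b."""
    return Region(a.ref_chr, min(a.ref_start, b.ref_start), max(a.ref_end, b.ref_end),
                  a.qry_chr, min(a.qry_start, b.qry_start), max(a.qry_end, b.qry_end))

def link(regions, max_gap, max_skipped):
    """Join each region to a recent syntenic one, skipping at most max_skipped others."""
    out = []
    for region in sorted(regions, key=lambda r: (r.ref_chr, r.ref_start)):
        for back in range(1, min(max_skipped + 1, len(out)) + 1):
            if in_synteny(out[-back], region, max_gap):
                out[-back] = merge(out[-back], region)
                break
        else:
            out.append(region)
    return out

def syntenic_regions(alignments):
    groups = link(alignments, max_gap=200_000, max_skipped=0)   # phase 1: 200 kbp
    return link(groups, max_gap=3_000_000, max_skipped=2)       # phase 2: 3 Mbp, 2 groups

alignments = [
    Region("1", 0, 10_000, "2", 0, 10_000),
    Region("1", 50_000, 60_000, "2", 55_000, 65_000),            # grouped in phase 1
    Region("1", 500_000, 510_000, "7", 100, 10_100),             # non-syntenic interruption
    Region("1", 1_000_000, 1_010_000, "2", 1_000_000, 1_010_000),  # linked in phase 2
    Region("1", 9_000_000, 9_010_000, "2", 9_000_000, 9_010_000),  # more than 3 Mbp away
]
for region in syntenic_regions(alignments):
    print(region)
```

In this example the first two alignments are grouped in phase one, the fourth is linked to that group in phase two across the single interrupting group, and the last stays separate because it lies more than 3 Mbp away. A real pipeline would presumably also discard isolated blocks such as the one on chromosome 7; that filtering is not shown.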