Regulatory Features

Ensembl stores regulatory features for human and mouse, inferred from experimental data through the Regulatory Build process. It also provides the experimental evidence supporting the regulatory features, which is derived from publicly available data sources. These data include:

  • Open Chromatin: DNase1-Seq & FAIRE
  • Transcription Factor ChIP-Seq
  • Histone Modification ChIP-Seq
  • Transcription Factor Binding Site Annotations (PWMs)

The Regulatory Build produces a special 'MultiCell' summary set of features, as well as cell-specific feature sets with more detailed structures and classifications, such as:

  • Promoter associated
  • Gene associated
  • Non-gene associated
  • Polymerase III associated
  • Unclassified

Follow the link for more details about the classification process.

Regulatory Segmentation

Genome segmentation analyses from the ENCODE project are also available, providing a genome-wide summary of regulatory function/status. See here for more details.

Other Regulatory Data

Ensembl Regulation databases also store data directly imported from external sources (identified as 'Other regulatory regions'):

  • MicroRNA target predictions for human and mouse (a stringent subset of MiRanda, described here).
  • Experimentally validated human non-coding fragments from the VISTA Enhancer Browser.
  • Genome-wide regulatory module and element prediction from cisRED for human and mouse.
  • Human DNA Methylation DAS tracks, including:
    • MeDIP-Chip methylation for 17 cell lines and tissues (Rakyan et al. (2008)).
    • RRBS (Reduced Representation Bisulfite Sequencing) for 44 cell lines (ENCODE).
      Replicates were merged using mean percentage methylation, requiring a minimum of 20x read coverage in the combined set.
      Methylation state is indicated by a color gradient: dark blue indicates highly methylated areas, green intermediate methylation, and yellow low methylation.
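
The replicate merging described for the RRBS tracks can be illustrated with a minimal Python sketch. This is not the Ensembl or ENCODE pipeline code. It assumes that each replicate provides a percentage methylation and a read count per site, that "mean" is an unweighted mean of the replicate percentages, and that the 20x threshold applies to the summed read count. The function name, the input layout, and the example values are hypothetical.

```python
from statistics import mean

MIN_COMBINED_COVERAGE = 20  # the minimum 20x read coverage stated in the text


def merge_replicates(replicates, min_coverage=MIN_COMBINED_COVERAGE):
    """Merge per-site methylation calls from several replicates.

    `replicates` is a list of dicts, each mapping a site key
    (hypothetical layout: (chromosome, position)) to a tuple of
    (percent_methylated, read_count).
    Returns {site: (mean_percent, combined_reads)} for every site whose
    combined read count reaches `min_coverage`.
    """
    # Collect the calls for each site across all replicates.
    per_site = {}
    for replicate in replicates:
        for site, (percent, reads) in replicate.items():
            per_site.setdefault(site, []).append((percent, reads))

    merged = {}
    for site, calls in per_site.items():
        combined_reads = sum(reads for _, reads in calls)
        if combined_reads < min_coverage:
            continue  # not enough coverage in the combined set
        # Assumption: unweighted mean of the replicate percentages.
        merged[site] = (mean(percent for percent, _ in calls), combined_reads)
    return merged


# Made-up example values for two replicates.
rep1 = {("21", 100): (80.0, 12), ("21", 200): (10.0, 5)}
rep2 = {("21", 100): (90.0, 15), ("21", 200): (30.0, 6)}
merged = merge_replicates([rep1, rep2])
```

In this example the first site is kept because its combined coverage is 27 reads, while the second site is dropped because its replicates only sum to 11 reads.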

Microarray Probe Mappings

Ensembl stores microarray probe mappings for several species and technologies, including:

  • Affymetrix: IVT & ST arrays
  • Codelink: Whole genome arrays
  • Agilent: Whole genome and CGH arrays
  • Illumina: Whole genome and Infinium methylation arrays
  • Phalanx

Regulation displays

Regulation data can be viewed in the browser through pages such as:

  • Gene: Regulatory Regions surrounding a gene (e.g. for all regulatory features around KCNE2). Note that even if a regulatory region is near or within a gene, this does not mean that it is acting on that gene.
  • Location: Region in Detail. By default, 'MultiCell' regulatory features are displayed. Cell-specific annotations and evidence tracks can also be drawn using "Configure this page" at the left. The menu allows display of information from Ensembl databases along with external sources in DAS format, such as methylation data.

Clicking on any regulatory feature on an Ensembl page will open a Regulation tab with information about the evidence supporting the selected regulatory feature, as well as cell-specific annotations. You can choose different views in the Regulation tab:

  • The 'Details by cell line' view is the default view, where you get the generic 'MultiCell' regulatory feature and cell-specific classifications, as well as part of their supporting evidence displayed as tracks. To display more evidence tracks, you can go to the 'Configure this page' link and select tracks within the 'Regulation' subsection. 'DNA Methylation' and external sources ('Other regulatory regions') are not used as evidence for regulatory regions.
  • The 'Summary' view will display the 'MultiCell' regulatory feature, along with its core evidence and cell-specific annotations. By default, this view does not display cell-specific evidence.
  • The 'Feature Context' view displays the regulatory features in a wider context around the chosen regulatory feature.
  • The complete list of supporting core evidence (TF binding and open chromatin) for a regulatory feature can be obtained via the 'Evidence' view.

Display configuration

Configuration of the various regulation displays is available under the 'Regulation' section, with the following subsections:

  • Regulatory Features: These are the output of the Regulatory Build. You can specify regulatory feature tracks for individual cell lines.
  • Open Chromatin and TFBS: Here you can configure the evidence tracks used in the Regulatory Build to build regulatory features.
  • Histones and Polymerases: Here you can configure the evidence tracks used in the Regulatory Build to establish cell-specific boundaries for regulatory features.
  • DNA Methylation: Here you can configure the methylation tracks you wish to visualise.
  • Other regulatory regions: Here you can configure the external tracks you wish to visualise.

When displaying the regulatory feature evidence, there are two visualisations available:

  • Peaks - Shows significantly enriched 'peak' regions.
  • Signal - Shows a plot or wiggle representing the raw signal.

Species-specific microarray probe mappings can be visualised on the Location view by turning on tracks from the 'Oligo Probes' section in the configuration panel.

Data Access

Regulation information stored in Ensembl databases can be accessed through the website, as well as through BioMart (regulation database), the Perl API (regulation API), or by directly accessing the databases. More information about programmatic access is available in the regulation API tutorial page. A more detailed description of the regulation databases can be seen here.