Downloads

Introduction

All data sets generated by the Ensembl project are freely available to download from the ftp.ensembl.org site. Please also see the disclaimer.

Please note: Ensembl supports downloading many more correlation tables via the highly customisable BioMart data mining tool. You may find this web-based tool easier to use than extracting information from our normalised database dumps.

Ensembl also encourages users to extract information directly from our databases via SQL rather than downloading huge flat files. We offer a public MySQL interface at ensembldb.ensembl.org that accepts SQL queries as user 'anonymous'. Client programs for accessing this interface are available from MySQL.
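As a minimal sketch of the access described above, the following Python helper builds the argument list for the standard `mysql` command-line client, using the host and user name given in this section. The helper name `mysql_command` is hypothetical, the server's default port is assumed, and `SHOW DATABASES` is used only because it needs no knowledge of the Ensembl schema.

```python
import shlex

# Host and user name are taken from the text above.
ENSEMBL_DB_HOST = "ensembldb.ensembl.org"
ENSEMBL_DB_USER = "anonymous"


def mysql_command(query, host=ENSEMBL_DB_HOST, user=ENSEMBL_DB_USER):
    """Build the argv list for the `mysql` client (hypothetical helper).

    The command is only constructed here, not executed. The default
    port is assumed; add a --port option if your release needs one.
    """
    return [
        "mysql",
        f"--host={host}",
        f"--user={user}",
        f"--execute={query}",
    ]


# Print a shell-safe version of the command for copy and paste.
print(shlex.join(mysql_command("SHOW DATABASES")))
```

The resulting list can be passed to `subprocess.run` on a machine where the `mysql` client is installed; keeping the arguments as a list avoids shell quoting problems in longer queries.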

It is also possible to install the Ensembl API code locally and configure it to access databases on ensembldb.ensembl.org. Users can develop their own analysis scripts to access Ensembl's object-oriented representation of biological objects. This is much easier than writing SQL queries directly and avoids downloading huge databases.

The Ensembl FTP Site

Ensembl provides sequence databases of gene, transcript and protein predictions. These sequences are suitable for a local installation of a sequence similarity search system. MySQL database table dumps are available for all databases underlying the Ensembl system. These text-format dumps can be imported into a relational database management system, which makes it possible to install a complete Ensembl mirror site.
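For scripted access to the ftp.ensembl.org site mentioned in the introduction, a minimal sketch using Python's standard `ftplib` module might look like the following. The function name `list_directory` and its `ftp_factory` parameter are hypothetical, anonymous login is assumed, and the default path "/" is a placeholder: the actual directory layout is described in the documents linked from this page.

```python
from ftplib import FTP

ENSEMBL_FTP_HOST = "ftp.ensembl.org"  # host named in the introduction


def list_directory(path="/", host=ENSEMBL_FTP_HOST, ftp_factory=FTP):
    """Return the entry names found under `path` on the FTP server.

    `ftp_factory` exists only so the function can be exercised without
    a network connection; by default the real ftplib.FTP class is used.
    """
    ftp = ftp_factory(host)      # connects to the host
    try:
        ftp.login()              # no arguments means anonymous login
        return ftp.nlst(path)    # plain list of names under `path`
    finally:
        ftp.quit()               # always close the session politely


# Example call (requires network access):
#     for name in list_directory("/"):
#         print(name)
```

Listing a directory before downloading makes it easier to fetch only the files needed rather than mirroring a whole species directory.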

There are three types of data dumps for each species on the FTP site:

These additional documents explain the FTP directory structure and the FASTA file naming and header conventions used on this site.

[[INCLUDE::/info/data/download_links.inc]]