SANGER INSTITUTE GENE SET FOR CHROMOSOME 20 - DECEMBER 2001 RELEASE
===================================================================

Further questions concerning this data should be addressed to Panos
Deloukas at the address below. Users are requested to cite Deloukas et al.,
"The DNA sequence and comparative analysis of human chromosome 20",
Nature 414, 865-871 (2001), in all references to this data.

These files contain the transcripts and translations of all genes
annotated on chromosome 20 by the Sanger Institute, as described in the
above Nature publication.

Description of gene sets:
-------------------------

chr20-mRNA.fasta.gz:
Contains the mRNA sequences of all transcripts, including isoforms, of the
individual genes. Each transcript is identified by its HUGO gene name and
isoform number, e.g. ARFRP1-001, or, if no HUGO gene name was assigned, by
the Sanger identifier, e.g. dJ1161H23.1-001. Also given is the gene name
assigned by HUGO after publication of the gene set, e.g. gene=C20orf141.
Lastly, the type identifier refers to the gene index assigned to the Sanger
annotation (for more details refer to Nature 414, p. 866 (2001)).
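
As an illustration of the identifier scheme described above, the following
Python sketch splits a header line into its gene name, isoform number and
key=value fields. The exact header layout is an assumption (identifier first,
then whitespace-separated key=value pairs); the function name parse_header
and the value "type=Known" in the example are hypothetical and should be
checked against the actual files.

```python
import re

# Assumed identifier form: <HUGO name or Sanger identifier>-<3-digit isoform number>,
# e.g. ARFRP1-001 or dJ1161H23.1-001.
ID_PATTERN = re.compile(r"^(?P<locus>.+)-(?P<isoform>\d{3})$")


def parse_header(line):
    """Parse one FASTA header line into its components.

    Assumes the identifier comes first, followed by optional
    whitespace-separated key=value fields (e.g. gene=..., type=...).
    """
    fields = line.strip().lstrip(">").split()
    if not fields:
        raise ValueError("empty FASTA header")
    transcript_id = fields[0]
    match = ID_PATTERN.match(transcript_id)
    if match is None:
        raise ValueError("unexpected identifier: %r" % transcript_id)
    # Collect any key=value annotations that follow the identifier.
    attributes = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
    return {
        "transcript": transcript_id,
        "locus": match.group("locus"),
        "isoform": int(match.group("isoform")),
        "attributes": attributes,
    }


# Hypothetical header line, for illustration only.
example = parse_header(">dJ1161H23.1-001 gene=C20orf141 type=Known")
```

Here example["locus"] holds the Sanger identifier and
example["attributes"]["gene"] the HUGO name assigned after publication.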

chr20-CDS.fasta.gz:
Contains the coding sequences of all transcripts, including isoforms, of
the individual genes. These have the same identifiers as the mRNA gene set.

chr20-Peptide.fasta.gz:
Contains the translations of all isoforms of the individual genes.
These have the same identifiers as the mRNA gene set. NB: some isoforms
have the same translation because they differ only in their 5' UTRs.
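
Because some isoforms share a translation, users may wish to group
identical peptide records before further analysis. The Python sketch below
is one possible way to do this; it assumes a standard FASTA layout in which
the first whitespace-delimited token of each header is the transcript
identifier. The function names read_fasta and shared_translations are
hypothetical.

```python
from collections import defaultdict


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            # The identifier is assumed to be the first token of the header.
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def shared_translations(records):
    """Return lists of identifiers whose sequences are identical."""
    groups = defaultdict(list)
    for name, sequence in records:
        groups[sequence].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Possible usage against the release file (requires "import gzip"):
#
#   with gzip.open("chr20-Peptide.fasta.gz", "rt") as handle:
#       for names in shared_translations(read_fasta(handle)):
#           print(" ".join(names))
```

Each list returned by shared_translations contains the identifiers of
isoforms with the same peptide sequence.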


Panos Deloukas, Chromosome 20 mapping 
Jennifer Ashurst, Human sequence analysis group

The Wellcome Trust Sanger Institute
Wellcome Trust Genome Campus
Hinxton
Cambs
CB10 1SA
Tel 01223 494909
Fax 01223 494919
e-mail panos@sanger.ac.uk or jla1@sanger.ac.uk