10.02.03

Notes on chr1 array data file : sortedchr1arrayset-NCBI31.txt
----------------------------------------------------------------------

The data file contains a sorted list of the chromosome 1 clones on the
array which are still recorded as active (or rather, not cancelled) in
the Sanger Institute database that tracks the sequencing of chromosome 1
clones. The file also contains nineteen X chromosome clones which are
used during normalisation.

The file is produced as a tab delimited plain text file which is
suitable for importing into Excel.

The columns are :-

 1) Estimated left end position of clone (kb) based on fingerprint map
    co-ordinates (please see http://www.sanger.ac.uk/HGP/Chr1 for
    further details on the chromosome 1 fingerprint map). The X
    chromosome clones are arbitrarily assigned a left end position of
    300000 or higher and are therefore listed at the end of the file.
 2) Sanger clone name (name used for the clone on the array).
 3) External clone name - internationally recognised version of the
    Sanger clone name. Please see
    http://www.ncbi.nlm.nih.gov/genome/clone/nomenclature.shtml
    for further details.
 4) Status of clone as recorded in the Sanger sequencing tracking
    database.
 5) Sequence accession number (e.g. EMBL).
 6) Sequence accession version (e.g. EMBL).
 7) NCBI reference contig accession number (NT_xxxxxx).
 8) NCBI reference contig name (prefix consists of organism
    abbreviation followed by chromosome e.g. Hs1).
 9) NCBI start (bp) co-ordinate of accession in NCBI build 31 (on
    relevant chromosome - see NCBI reference contig name).
10) NCBI end (bp) co-ordinate of accession in NCBI build 31 (on
    relevant chromosome - see NCBI reference contig name).
11) Orientation of accession in NCBI contig.

For further details on the NCBI genome build 31 (based on data in
GenBank on November 15, 2002) please see :
http://www.ncbi.nlm.nih.gov/genome/guide/human/HsStats.html
OR
http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi for a graphical view.
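
For anyone reading the file with a script rather than a spreadsheet, the
column layout described above could be parsed roughly as follows. This is
only a sketch: the field names, the helper functions and the handling of
empty values are assumptions made for illustration, and it assumes the
file has no header row and that missing values appear as empty fields.

```python
import csv
from typing import Iterable, List, NamedTuple, Optional

# From the notes: X chromosome clones are given a left end of 300000 or higher.
X_LEFT_END_THRESHOLD_KB = 300000


class CloneRecord(NamedTuple):
    """One row of the file; field names are hypothetical labels for columns 1-11."""
    left_end_kb: Optional[float]
    sanger_name: str
    external_name: str
    status: str
    accession: str
    accession_version: str
    contig_accession: str
    contig_name: str
    ncbi_start_bp: Optional[int]
    ncbi_end_bp: Optional[int]
    orientation: str


def _to_float(text: str) -> Optional[float]:
    """Return the value as a float, or None if the field is empty or not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def _to_int(text: str) -> Optional[int]:
    """Return the value as an int, or None if the field is empty or not numeric."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_clones(lines: Iterable[str]) -> List[CloneRecord]:
    """Parse tab delimited lines into CloneRecord objects, skipping blank lines."""
    records = []
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not any(field.strip() for field in row):
            continue  # blank line
        fields = [field.strip() for field in row[:11]]
        fields += [""] * (11 - len(fields))  # pad short rows (assumption)
        records.append(CloneRecord(
            _to_float(fields[0]), fields[1], fields[2], fields[3], fields[4],
            fields[5], fields[6], fields[7],
            _to_int(fields[8]), _to_int(fields[9]), fields[10],
        ))
    return records


def is_x_control(record: CloneRecord) -> bool:
    """True if the left end marks the clone as one of the X chromosome clones."""
    return (record.left_end_kb is not None
            and record.left_end_kb >= X_LEFT_END_THRESHOLD_KB)


def has_ncbi_position(record: CloneRecord) -> bool:
    """True if both NCBI build 31 co-ordinates (columns 9 and 10) are present."""
    return record.ncbi_start_bp is not None and record.ncbi_end_bp is not None


# Example use (file name taken from the title of these notes):
#   with open("sortedchr1arrayset-NCBI31.txt", newline="") as handle:
#       clones = read_clones(handle)
#   chr1_clones = [c for c in clones if not is_x_control(c)]
```

Keeping the co-ordinates as None when a field is empty, rather than
substituting zero, makes it easy to pick out clones that have no placement
in the build.
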

Ensembl currently use NCBI build 30 but will update to build 31 in the
near future.

The NCBI positions given are the co-ordinates assigned by NCBI to the
accession listed for the clone. Please note that there may be other
accessions (column 5) for a clone (it may have been sequenced by another
centre, or it may have been sequenced in two parts), and generally the
accession does not provide the sequence for the full insert of the clone
(please read the EMBL entry carefully).

You may notice that a small number of clones (30) have no position in
NCBI build 31. This is generally due to problems with data
synchronisation, e.g. accession numbers being replaced, new sequence
being available, or no accession number being available (non tile path
clones). In future NCBI builds most of these problems should be tidied
up.

Please send any questions you have regarding the chromosome 1 array to
Simon Gregory (sgg@sanger.ac.uk).