The GeneFinder Tutorial database contains 38 Test_Sequences together
with regions of homology determined by BlastX, BlastN, TBlastX and
PSsearch. If you would like some practice at GeneFinding, have a
go at these and compare your results to the table below. User segments
of splice branch/acceptor motifs are available in the
GeneFinder_Tutorial.user_segments/ directory for each of
the sequences.
Of the 32 protein coding genes specified in the EMBL feature tables of
the Test_Sequences, GeneFinder correctly predicts 24 (75%), with a single
set of GeneFinder parameters (see below). 39 introns from a total of 41
(95%) are correctly predicted, with only 5 false positives. Two
unannotated genes are found within these sequences. All genes can be
found by using selection and antiselection of features. 29 of the protein
coding genes have the /evidence=EXPERIMENTAL qualifier in the EMBL
feature table.
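
The percentages quoted above can be re-derived from the per-gene rows of
the results table. The following is a minimal Python sketch of that
arithmetic, not part of GeneFinder itself. The names ROWS and summarise
are hypothetical, and two readings of the table are assumptions: a "+1"
entry is taken as one false-positive intron, and a gene is counted as
spliced when it has at least one annotated intron.

```python
# Rows transcribed from the results table. Each tuple is
# (accession, annotated introns, correctly predicted introns,
#  false-positive introns, prediction agrees with annotation).
# Assumption: a "+1" entry in the table marks one false-positive intron.
ROWS = [
    ("D13249", 2, 2, 0, True),   # PRH1
    ("D13447", 0, 0, 0, True),   # dsk1+
    ("D55676", 1, 0, 0, False),  # gamma glutamylcysteine synthase
    ("L00597", 1, 1, 0, True),   # rpb6
    ("L04126", 1, 1, 0, False),  # cdc8
    ("L07734", 1, 1, 1, False),  # pold
    ("L35232", 2, 1, 0, False),  # exo2
    ("L37753", 0, 0, 0, True),   # heat shock protein (sis1;Psi)
    ("L37838", 0, 0, 1, False),  # fus1
    ("M74062", 5, 5, 0, True),   # cdc27+
    ("M83307", 0, 0, 0, True),   # cdc27+ mRNA
    ("U02280", 1, 1, 1, False),  # let1
    ("U06929", 0, 0, 0, True),   # cki1 mRNA
    ("U07830", 0, 0, 0, True),   # HIS1 mRNA
    ("U07831", 0, 0, 0, True),   # HIS5 mRNA
    ("U13389", 0, 0, 0, True),   # cnx1
    ("U13799", 0, 0, 0, True),   # ENO1 mRNA
    ("X56564", 6, 6, 0, True),   # rpb1
    ("X58231", 3, 3, 0, True),   # rad9
    ("X59987", 0, 0, 0, True),   # SSP1
    ("X63549", 0, 0, 1, False),  # sts1+
    ("X68580", 2, 2, 0, True),   # vma1+
    ("X69638", 4, 4, 0, True),   # vma2
    ("X69930", 0, 0, 0, True),   # FIB
    ("X76558", 2, 2, 0, True),   # rad26
    ("Z48166", 0, 0, 0, True),   # gar2
    ("U19978", 0, 0, 0, True),   # sak1
    ("U21154", 0, 0, 0, True),   # IDI1
    ("L13742", 6, 6, 0, True),   # chk1
    ("S59895", 2, 2, 0, True),   # mcs2
    ("S59896", 2, 2, 0, True),   # csk1
    ("M38132", 0, 0, 1, False),  # rad1
]


def summarise(rows):
    """Tally the gene and intron totals reported under the table."""
    # Assumption: "spliced" means at least one annotated intron.
    spliced = [r for r in rows if r[1] > 0]
    genes = len(rows)
    exact = sum(1 for r in rows if r[4])
    annotated = sum(r[1] for r in rows)
    predicted = sum(r[2] for r in rows)
    return {
        "genes": genes,
        "exact": exact,
        "exact_pct": round(100 * exact / genes),
        "spliced": len(spliced),
        "exact_spliced": sum(1 for r in spliced if r[4]),
        "introns_annotated": annotated,
        "introns_predicted": predicted,
        "introns_pct": round(100 * predicted / annotated),
        "false_positives": sum(r[3] for r in rows),
    }


summary = summarise(ROWS)
for key, value in summary.items():
    print(f"{key:18} {value}")
```

Keeping the rows as plain tuples makes it easy to re-run the tally after
trying a different parameter set and editing the affected rows by hand.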
The table below gives the results of GeneFinding in these Test_Sequences.

                                                  annotated  predicted
Gene symbol/Name                 access. spliced  introns    introns    false +  agrees?  notes
================================ ======= =======  =========  =========  =======  =======  =====
PRH1 ATP-dependent RNA Helicase  D13249  yes      2          2                   +
dsk1+                            D13447  no                                      +
gamma glutamylcysteine synthase  D55676  yes      1          0                   -        first exon missing; although evidence > blastx homology
rpb6                             L00597  yes      1          1                   +
cdc8                             L04126  yes      1          1                   -        uses downstream ATG
pold                             L07734  yes      1          1          +1       -        splice AT^G to A > ATA , bug!
exo2                             L35232  yes      2          1                   -        uses rare donor gttgtt
heat shock protein (sis1;Psi)    L37753  no                                      +        another gene at 1897 2133
fus1                             L37838  no                             +1       -        intron internal to orf
cdc27+                           M74062  yes      5          5                   +
cdc27+ mRNA                      M83307  n/a                                     +
let1                             U02280  yes      1          1          +1       -        extra exon
cki1 mRNA                        U06929  n/a                                     +
HIS1 mRNA                        U07830  n/a                                     +
HIS5 mRNA                        U07831  n/a                                     +
cnx1                             U13389  n/a                                     +
ENO1 mRNA                        U13799  no                                      +
rpb1                             X56564  yes      6          6                   +
rad9                             X58231  yes      3          3                   +
SSP1                             X59987  no                                      +
sts1+                            X63549  no                             +1       -        extra exon
vma1+                            X68580  yes      2          2                   +
vma2                             X69638  yes      4          4                   +
FIB                              X69930  no                                      +        another gene at -3088 -1799
rad26                            X76558  yes      2          2                   +
gar2                             Z48166  no                                      +
sak1                             U19978  no                                      +
IDI1                             U21154  no                                      +
chk1                             L13742  yes      6          6                   +
mcs2                             S59895  yes      2          2                   +
csk1                             S59896  yes      2          2                   +
rad1                             M38132  no                             +1       -        extra exon
----------------------------------------------------------------------------------------------------
32  Total
24  Exact predictions
16  Spliced gene predictions
11  Exact spliced predictions

introns                                           41         39         +5

GeneFinder parameters

Feature parameters
    Features range               10000
    3' splice cutoff             2.00
    5' splice cutoff             1.00
    ATG cutoff                   0.00

Autofind parameters
    min intron length            30
    min exon length              3
    intron cost                  -4.5
    intron rate per log bp       -2.0
    coding:intron score ratio    1.00

GeneFinder_Tutorial.user_segments were included for each prediction
using Read Segments.

Correspondence to Sean Walsh, svw@sanger.ac.uk