The End

The GeneFinder Tutorial database contains 38 Test_Sequences together with regions of homology determined by BlastX, BlastN, TBlastX and PSsearch. If you would like some practice at GeneFinding, have a go at these and compare your results to the table below. User segments of splice branch/acceptor motifs are available in the GeneFinder_Tutorial.user_segments/ directory for each of the sequences.

Of the 32 protein-coding genes specified in the EMBL feature tables of the Test_Sequences, GeneFinder correctly predicts 24 (75%) with a single set of GeneFinder parameters (see below). Of a total of 41 introns, 39 (95%) are correctly predicted, with only 5 false positives. Two unannotated genes are found within these sequences. All genes can be found by using selection and antiselection of features. Of the protein-coding genes, 29 have the /evidence=EXPERIMENTAL qualifier in the EMBL feature table.

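The table below gives the results of GeneFinding in these Test_Sequences. In the introns column, the first number is the count of annotated introns, the second is the number of those correctly predicted, and +1 marks a false positive intron.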


Gene symbol/Name			access.	spliced	introns	agrees? 
================			======	=======	=======	=======


PRH1 ATP-dependent RNA Helicase		D13249	yes	2 2	+
dsk1+					D13447	no		+
gamma glutamylcysteine synthase		D55676	yes	1 0	- 	first exon missing; although evidence > blastx homology 
rpb6					L00597	yes	1 1	+
cdc8					L04126	yes	1 1	-	uses downstream ATG
pold					L07734	yes	1 1 +1	- 	splice AT^G to A > ATA , bug!
exo2					L35232	yes	2 1	-	uses rare donor gttgtt
heat shock protein (sis1;Psi)		L37753	no		+	another gene at 1897 2133
fus1					L37838	no	    +1	- 	intron internal to orf
cdc27+                         		M74062	yes	5 5	+
cdc27+ mRNA				M83307	n/a		+
let1					U02280	yes	1 1 +1	-	extra exon
cki1 mRNA				U06929	n/a		+
HIS1 mRNA				U07830	n/a		+
HIS5 mRNA				U07831	n/a		+
cnx1					U13389	n/a		+
ENO1 mRNA				U13799 	no		+
rpb1					X56564	yes	6 6	+
rad9					X58231	yes	3 3	+
SSP1					X59987	no		+
sts1+					X63549	no	    +1	-	extra exon
vma1+					X68580	yes	2 2	+
vma2					X69638	yes	4 4	+
FIB					X69930	no		+	another gene at -3088 -1799
rad26					X76558	yes	2 2	+
gar2					Z48166	no		+
sak1 					U19978	no		+
IDI1					U21154	no		+
chk1					L13742	yes	6 6	+
mcs2					S59895	yes	2 2	+
csk1					S59896	yes 	2 2	+
rad1					M38132	no	    +1	-	extra exon

--------------------------------------------------------------------------------------------------------------------------------

								32 Total
								24 Exact predictions	
								16 Spliced gene predictions
								11 Exact spliced predictions
					introns		41 39	
					false		     +5 
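
The totals above can be re-derived from the table rows. The following is a minimal sketch in Python, not part of GeneFinder: the tuple layout and the function name `summarize` are illustrative assumptions, and the rows are transcribed by hand from the table (accession, spliced, annotated introns, correctly predicted introns, false positive introns, agrees).

```python
# Rows transcribed from the results table above. The tuple layout is an
# assumption made for this sketch:
# (accession, spliced, annotated, correct, false_positive, agrees)
ROWS = [
    ("D13249", "yes", 2, 2, 0, True),  ("D13447", "no", 0, 0, 0, True),
    ("D55676", "yes", 1, 0, 0, False), ("L00597", "yes", 1, 1, 0, True),
    ("L04126", "yes", 1, 1, 0, False), ("L07734", "yes", 1, 1, 1, False),
    ("L35232", "yes", 2, 1, 0, False), ("L37753", "no", 0, 0, 0, True),
    ("L37838", "no", 0, 0, 1, False),  ("M74062", "yes", 5, 5, 0, True),
    ("M83307", "n/a", 0, 0, 0, True),  ("U02280", "yes", 1, 1, 1, False),
    ("U06929", "n/a", 0, 0, 0, True),  ("U07830", "n/a", 0, 0, 0, True),
    ("U07831", "n/a", 0, 0, 0, True),  ("U13389", "n/a", 0, 0, 0, True),
    ("U13799", "no", 0, 0, 0, True),   ("X56564", "yes", 6, 6, 0, True),
    ("X58231", "yes", 3, 3, 0, True),  ("X59987", "no", 0, 0, 0, True),
    ("X63549", "no", 0, 0, 1, False),  ("X68580", "yes", 2, 2, 0, True),
    ("X69638", "yes", 4, 4, 0, True),  ("X69930", "no", 0, 0, 0, True),
    ("X76558", "yes", 2, 2, 0, True),  ("Z48166", "no", 0, 0, 0, True),
    ("U19978", "no", 0, 0, 0, True),   ("U21154", "no", 0, 0, 0, True),
    ("L13742", "yes", 6, 6, 0, True),  ("S59895", "yes", 2, 2, 0, True),
    ("S59896", "yes", 2, 2, 0, True),  ("M38132", "no", 0, 0, 1, False),
]


def summarize(rows):
    """Recompute the summary figures from the per-gene rows."""
    spliced = [r for r in rows if r[1] == "yes"]
    return {
        "total": len(rows),
        "exact": sum(1 for r in rows if r[5]),
        "spliced": len(spliced),
        "exact_spliced": sum(1 for r in spliced if r[5]),
        "introns_annotated": sum(r[2] for r in rows),
        "introns_correct": sum(r[3] for r in rows),
        "introns_false": sum(r[4] for r in rows),
    }


if __name__ == "__main__":
    s = summarize(ROWS)
    for key, value in s.items():
        print(f"{key:18s}{value}")
    print(f"genes exact:     {100 * s['exact'] / s['total']:.0f}%")
    print(f"introns correct: {100 * s['introns_correct'] / s['introns_annotated']:.0f}%")
```

Keeping the rows as plain tuples makes it easy to append your own results for a sequence and see how the totals change.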


GeneFinder parameters

Feature parameters

Features range                  10000 
3' splice cutoff                2.00	
5' splice cutoff                1.00
ATG cutoff                      0.00

Autofind parameters

min intron length               30
min exon length                 3
intron cost                    -4.5
intron rate per log bp         -2.0
coding:intron score ratio       1.00
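
To illustrate what the two length limits mean, here is a small sketch in Python that checks a candidate exon structure against them. It is not GeneFinder's implementation: the function name `passes_length_limits`, the 1-based inclusive coordinate convention, and the choice to treat the limits as inclusive minimums are all assumptions made for this example.

```python
MIN_INTRON_LENGTH = 30  # "min intron length" from the Autofind parameters
MIN_EXON_LENGTH = 3     # "min exon length" from the Autofind parameters


def passes_length_limits(exons,
                         min_intron=MIN_INTRON_LENGTH,
                         min_exon=MIN_EXON_LENGTH):
    """Return True if every exon and intron meets the minimum lengths.

    exons: list of (start, end) pairs, 1-based inclusive, sorted by
    position and non-overlapping (assumed convention for this sketch).
    """
    # Each exon must span at least min_exon bases.
    for start, end in exons:
        if end - start + 1 < min_exon:
            return False
    # Each intron is the gap between consecutive exons.
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start - prev_end - 1 < min_intron:
            return False
    return True
```

For example, `passes_length_limits([(1, 90), (141, 200)])` accepts a 50 bp intron, while `passes_length_limits([(1, 90), (111, 200)])` rejects a 20 bp one.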


GeneFinder_Tutorial.user_segments were included for each prediction
using Read Segments.

Correspondence to Sean Walsh, svw@sanger.ac.uk

