Enhanced by Richard Bruskiewich email rbsk@sanger.ac.uk
Note: beta code. This module is evolving rapidly, so we don't guarantee that future versions will be backwards compatible.
Extra data controlling drawing is added to the individual GeneFeature objects using the addMember method. It is this which allows the definition of (1) mouseover text (2) href links (3) drawing as an intron. There are currently 2 types of introns: type 1 are drawn as solid lines/boxes and type 2 are drawn as dashed lines/boxes. One use of type 2 drawing is to allow 'introns' to be drawn to join up a gene structure, but indicate where there is likely to be a missing exon in the structure. This control is independent of the intronstyle parameter which controls the selection of straight lines or angle links.
Example script (note: make sure to align formatted print <<ENDOFTEXT statements to the left hand margin of the script, to make this script work)
  #!/usr/local/bin/perl

  use strict;
  use GFF::GifGFF;
  use GFF;

  # get GFF (normally this would be read from a file, but here it is
  # hard coded just to show you exactly what is needed)
  my $gff = new GFF::GeneFeatureSet;
  &get_gff($gff);

  # grouping function (used for bumping, to keep features with same name on same line)
  my $group_by_seqname=sub{
      my $self=shift;
      return $self->seqname;
  };

  # label function (specifies what is used for text written to gif)
  my $make_label=sub{
      my $self=shift;
      return $self->getMember('label');
  };

  # to map gff features->layout labels
  my $group_features=sub{
      my $self=shift;
      my $label=$self->feature;
      # exon,intron->gene (alternation grouped so that both anchors apply to both names)
      if($label=~/^(?:exon|intron)$/){
          return 'gene';
      }
      return $label;
  };

  # differential colour function based upon 'Sequenced_by' [group] and strand
  my $seq_colour = sub {
      my $coltab = shift ; # a reference to a hash table of colours
      my $gf = shift ;
      return $coltab->{'black'} if !(defined($gf) and $gf) ;
      my $source = $gf->group_value('Sequenced_by') ;
      my $strand = $gf->strand ;
      if($source =~ /Sanger/) {
          if($strand eq '+') {
              return $coltab->{'blue'} ;
          } else {
              return $coltab->{'darkblue'} ;
          }
      } elsif($source =~ /WUSTL/i) { # case-insensitive: the data below uses 'Wustl'
          if($strand eq '+') {
              return $coltab->{'green'} ;
          } else {
              return $coltab->{'darkgreen'} ;
          }
      } else {
          if($strand eq '+') {
              return $coltab->{'red'} ;
          } else {
              return $coltab->{'darkred'} ;
          }
      }
  };

  # this specifies how to draw each label and their relative positions
  # label names starting with /^label/ have special behaviour (see below)
  # 5th field can be either a colour (for reverse direction) or F/R/P (see doc)
  # (reverse genes are duplicated, to show the effect of this)
  # 6th field can be flag to indicate if feature should be drawn with a box unless
  # label, where this field should contain a function to generate text to be drawn
  my @layout=(
      [-4,15,'label','black','black',$make_label,$group_by_seqname],
      [-3,15,'labelg','red','darkred',$group_by_seqname,$group_by_seqname],
      [-2,6,'gene','red','darkred',1,$group_by_seqname],
      [-1,15],
      [0,10,'contig',$seq_colour,$seq_colour],
      [1,15],
      [2,10,'hom','blue','P',1,$group_by_seqname],
      [3,6,'gene','red','R',1,$group_by_seqname],
  );

  # this specifies row labels in the left hand margin of the plot
  my %rowlabels = (
      'gene'   => 'Gene<http://www.sanger.ac.uk/HGP/Genes/>',
      'contig' => 'Contig Map',
  ) ;

  my $file1='clone1.html';
  my $file2='clone1.gif';
  open(OUT1,">$file1") || die "cannot open $file1";
  print OUT1 <<ENDOFTEXT;
  <html>
  <head>
  <title>Output from example1.pl</title>
  </head>
  <body bgcolor="#FFFFFF">
  <map name="gifmap">
  ENDOFTEXT

  open(OUT2,">$file2") || die "cannot open $file2";
  binmode(OUT2);
  my($x,$y)=&GFF::GifGFF::gff2gif(
      gff => $gff,
      layout => \@layout,
      rowlabels => \%rowlabels,
      filter => $group_features,
      io => \*OUT2,
      width => 600,
      leftmargin => 100,
      scale => 200,
      mio => \*OUT1,
      intronstyle => 'straight',
      margin => 10,
  );

  print OUT1 <<ENDOFTEXT;
  </map>
  <h2>Output from example.pl</h2>
  <img src="$file2" usemap="#gifmap" width=$x height=$y border=0>
  <pre>
  ENDOFTEXT

  $gff->dump(\*OUT1);

  print OUT1 <<ENDOFTEXT;
  </pre>
  </body>
  </html>
  ENDOFTEXT

  close(OUT1);
  close(OUT2);

  sub get_gff{
      my($gff)=@_;

      # make an example gff with a contig + a 2 exon gene

      # make a contig gf
      my $gf=new GFF::GeneFeature;
      my $name='contig1';
      $gf->seqname($name);
      $gf->feature('contig');
      my $start=1;
      my $end=74999;
      $gf->start($start);
      $gf->end($end);
      $gf->strand('+');
      # Flag source of sequence, for differential colouring
      $gf->group_value('Sequenced_by',0,'Sanger') ;
      # add a mouseover label
      my $contig1_label="$name:$start-$end";
      $gf->addMember($contig1_label,'label');
      # add a href to another file (dummy)
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make a second contig gf (variables are reused, not redeclared)
      $gf=new GFF::GeneFeature;
      $name='contig1';
      $gf->seqname($name);
      $gf->feature('contig');
      $start=75000;
      $end=123456;
      $gf->start($start);
      $gf->end($end);
      $gf->strand('+');
      # Flag source of sequence, for differential colouring
      $gf->group_value('Sequenced_by',0,'Whitehead') ;
      # add a mouseover label
      # (concatenation keeps "[Whitehead]" from being parsed as an array subscript)
      $contig1_label=$name."[Whitehead]:$start-$end";
      $gf->addMember($contig1_label,'label');
      # add a href to another file (dummy)
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make a bar (to show that contigs are not linked)
      $gf=new GFF::GeneFeature;
      $name='bar';
      $gf->seqname($name);
      $gf->feature('bar');
      $start=124456;
      $end=124456;
      $gf->start($start);
      $gf->end($end);
      $gff->addGeneFeature($gf);

      # another contig gf (reversed)
      $gf=new GFF::GeneFeature;
      $name='contig2';
      $gf->seqname($name);
      $gf->feature('contig');
      $start=125456;
      $end=234567;
      $gf->start($start);
      $gf->end($end);
      $gf->strand('-');
      # Flag source of sequence, for differential colouring
      $gf->group_value('Sequenced_by',0,'Wustl') ;
      # add a mouseover label
      my $contig2_label="$name:$start-$end";
      $gf->addMember($contig2_label,'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make labels for each contig
      $gf=new GFF::GeneFeature;
      $name='contig';
      $gf->seqname($name);
      $gf->feature('label');
      $start=1; $end=1;
      $gf->start($start); $gf->end($end);
      $gf->addMember($contig1_label,'label');
      $gff->addGeneFeature($gf);

      $gf=new GFF::GeneFeature;
      $name='contig';
      $gf->seqname($name);
      $gf->feature('label');
      $start=125456; $end=125456;
      $gf->start($start); $gf->end($end);
      $gf->addMember($contig2_label,'label');
      $gff->addGeneFeature($gf);

      # make and label some genes

      # label this gene
      $gf=new GFF::GeneFeature;
      $gf->seqname('note');
      $gf->feature('label');
      $start=10000; $end=10000;
      $gf->strand('+');
      $gf->start($start); $gf->end($end);
      $gf->addMember('[genes overlap]','label');
      $gff->addGeneFeature($gf);

      # gene1 (2 exons - features grouped using name)
      $name='gene1';

      # make 1st exon gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('exon');
      $start=100; $end=1150;
      $gf->start($start); $gf->end($end);
      $gf->strand('+');
      # add a mouseover label
      $gf->addMember("Gene $name, Exon 1:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make intron gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('intron');
      $start=1151; $end=10299;
      $gf->start($start); $gf->end($end);
      $gf->strand('+');
      # identify as an intron
      $gf->addMember(1,'intron');
      # add a mouseover label
      $gf->addMember("Gene $name, Intron 1:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make 2nd exon gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('exon');
      $start=10300; $end=11450;
      $gf->start($start); $gf->end($end);
      $gf->strand('+');
      # add a mouseover label
      $gf->addMember("Gene $name, Exon 2:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # label this gene
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('labelg');
      $start=100; $end=100;
      $gf->strand('+');
      $gf->start($start); $gf->end($end);
      $gff->addGeneFeature($gf);

      # gene2 (3 exons, reverse strand)
      $name='gene2';

      # make 2nd exon gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('exon');
      $start=10000; $end=10400;
      $gf->start($start); $gf->end($end);
      $gf->strand('-');
      # add a mouseover label
      $gf->addMember("Gene $name, Exon 2:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make intron gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('intron');
      $start=10401; $end=30000;
      $gf->start($start); $gf->end($end);
      $gf->strand('-');
      # identify as an intron
      $gf->addMember(1,'intron');
      # add a mouseover label
      $gf->addMember("Gene $name, Intron 1:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make 1st exon gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('exon');
      $start=30001; $end=30300;
      $gf->start($start); $gf->end($end);
      $gf->strand('-');
      # add a mouseover label
      $gf->addMember("Gene $name, Exon 1:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make intron gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('intron');
      $start=30001; $end=60000;
      $gf->start($start); $gf->end($end);
      $gf->strand('-');
      # identify as an intron (type 2, drawn dashed)
      $gf->addMember(2,'intron');
      # add a mouseover label
      $gf->addMember("Gene $name, Intron 2:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make 3rd exon gf
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('exon');
      $start=60001; $end=61000;
      $gf->start($start); $gf->end($end);
      $gf->strand('-');
      # add a mouseover label
      $gf->addMember("Gene $name, Exon 3:$start-$end",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      # make a label
      $gf=new GFF::GeneFeature;
      $gf->seqname($name);
      $gf->feature('labelg');
      $start=10000; $end=10000;
      $gf->strand('-');
      $gf->start($start); $gf->end($end);
      $gff->addGeneFeature($gf);

      # make hom features
      $gf=new GFF::HomolGeneFeature;
      $name='hom1';
      $gf->seqname($name);
      $gf->feature('hom');
      $start=10300; $end=13450;
      $gf->start($start); $gf->end($end);
      $gf->start2(14000); $gf->end2(18000);
      $gf->percentid(90);
      $gf->strand('+');
      # add a mouseover label
      $gf->addMember("Homol $name (90%)",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);

      $gf=new GFF::HomolGeneFeature;
      $name='hom2';
      $gf->seqname($name);
      $gf->feature('hom');
      $gf->start(80000); $gf->end(90000);
      $gf->start2(93456); $gf->end2(123456);
      $gf->percentid(20);
      $gf->strand('+');
      # add a mouseover label
      $gf->addMember("Homol $name (20%)",'label');
      # add a href to another file
      $gf->addMember("$name.html",'href');
      $gff->addGeneFeature($gf);
  }

There is special behaviour for any GeneFeature object that is grouped into a set labelled 'bar': it does not need an entry in the layout array, since it causes a vertical bar to be drawn at that coordinate point. Another special drawing mechanism applies to any layout array entry with a label of the form /^label/: text is written at the coordinate of the start of the feature, rather than a box being drawn for the feature itself.

  my $gf=new GFF::GeneFeature;
  $gf->seqname('bar');
  $gf->feature('bar');
  $gf->start(123460);
  $gf->end(123500);
  $gff->addGeneFeature($gf);

'bar' elements are 3 pixels wide. If you wish to have the width defined by start,end (remember that the width in pixels will then depend on the coordinate scaling for that particular gif), set 'barwidth' to zero.

  # layout line
  [-4,15,'label','black','black',$make_label,$group_by_seqname],

  # function for label (take text from seqname)
  my $make_label=sub{
      my $self=shift;
      return $self->seqname;
  };

  my $gf=new GFF::GeneFeature;
  my $text='Title Text';
  $gf->seqname($text);
  $gf->feature('label');
  $gf->start(0);
  $gf->end(0);
  $gff->addGeneFeature($gf);

gff=$gff
io=$TOUT
where $TOUT is set by:

  my $TOUT;
  open(TOUT,">$html_dir/$clone.gif") || die "cannot open $clone.gif";
  $TOUT=\*TOUT;
  binmode(TOUT);

Image will be written to $html_dir/$clone.gif.
filter=$group_features
where an example of a filter is:

  my $group_features=sub{
      my $self=shift;
      return $self->feature;
  };

Features will be displayed according to labels derived from the feature element of each GeneFeature object in the GFF object.
layout=\@layout
where an example of a layout is:

  my @layout=(
      [-3,5,'label','black','black',$make_label,$group_by_name],
      [-2,5,'gene','red','F',1,$group_by_name],
      [-1,5],
      [0,5,'contig','green','darkgreen'],
      [1,5],
      [2,5,'gene','darkred','R',1,$group_by_name],
      [3,15,'hom','red','P',1,$group_by_name],
  );

Each line describes a set of rules for drawing the set of features which have been labelled, in this case 'gene' or 'contig', by the filter function that was passed to the call. The fields are: (1) real vertical position of feature line, (2) int height of feature in pixels, (3) text label identifying this drawing line, (4) text/ref(sub) colour1 [see below] (5) text/ref(sub) colour2 [see below for exceptions], (6) no_box_outline flag [see below for exceptions], (7) bump filter [see description in discussion of bumping].
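The field order described above can be made explicit by unpacking a layout row into named values. The sketch below is for illustration only: describe_row and its hash keys are hypothetical names, not part of GifGFF, and treating two-field rows such as [-1,5] as spacers is an assumption based on the example layout.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of GifGFF): give names to the seven
# fields of one layout row, in the order listed in the text.
sub describe_row {
    my ($row) = @_;
    my ($position, $height, $label, $colour1, $colour2, $flag, $bump) = @$row;
    return {
        position => $position,    # (1) vertical position of the feature line
        height   => $height,      # (2) height in pixels
        label    => $label,       # (3) label identifying this drawing line
        colour1  => $colour1,     # (4) colour name or sub reference
        colour2  => $colour2,     # (5) colour, sub reference or directive
        flag     => $flag,        # (6) no_box_outline flag (text sub for 'label*' rows)
        bump     => $bump,        # (7) grouping function used before bumping
        spacer   => defined $label ? 0 : 1,   # assumption: rows like [-1,5] only add space
    };
}

my $group_by_name = sub { my $self = shift; return $self->seqname; };
my @layout = (
    [-2, 5, 'gene',   'red',   'F', 1, $group_by_name],
    [-1, 5],
    [ 0, 5, 'contig', 'green', 'darkgreen'],
);

for my $row (@layout) {
    my $d = describe_row($row);
    printf "%3d %2dpx %s\n", $d->{position}, $d->{height},
        $d->{spacer} ? '(spacer)' : $d->{label};
}
```

Fields that a row omits simply come back undefined, which matches how the 'contig' row above leaves out fields 6 and 7.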
Only features labelled gene or contig from output of group_features filter will be displayed.
Contigs will be drawn in a single y axis region of the gif, with green for a forward contig and darkgreen for a reverse contig (indicated by the strand element of each GeneFeature object).
Genes will be drawn on two separate y axis regions of the gif depending on the direction (strand). Forward will be drawn above the contig, reverse will be drawn below the contig, in red and darkred respectively.
Both Contigs and Genes will be 5 pixels high and will be separated from each other by 5 pixels.
Features labelled 'gene' will be first grouped according to the function $group_by_name before being bumped, to allow all the elements of a gene (exons, introns) to be kept together on a single line.
Contigs will be drawn as a box coloured green with a black outline. Genes will be coloured in plain red (note that the 'no_box_outline' flag is set).
Layout entries with a label starting with 'label' have special behaviour (not used here). In this case it is assumed that text should be written to the canvas rather than a box drawn. The position of the text is defined by $gf->start. Field 6 ('no_box_outline') has no meaning in the context of writing text, so in this case it is expected to be another filter that returns the text string to be written. This allows text to be built from values in the $gf, or simply from the content of $gf->seqname etc.
Field 5 can take a second colour for reversed features (if forward and reverse are to be drawn on the same line), or it can be 'F' or 'R' to indicate that this line should contain only forward or only reverse features. This allows two alternative display modes: forward and reverse features on opposite sides of the DNA, or all features grouped together.
In addition, for 'label*' rows, field 5 can also take the special directive 'V' (optionally with 'B'), which directs that the text of that row is drawn vertically (i.e. at 90 degrees), reading from bottom to top. By default, the label is 'top' aligned, i.e. with the gene feature start coordinate positioned at the upper left hand corner of the text box (or upper right of the text itself). If the 'B' directive is additionally given, the alignment becomes 'lower left hand corner', that is, 'Bottom' aligned. Since the text characters are 6 x 12 pixels (gdGiantFont) in size, a feature height of n x 6 pixels (where n is the length of the longest text string for the given layout row) should ideally be provided. The colour of the text is that designated by field 4. The 'V' directive (with or without 'B') may be given alongside an 'F' or 'R' directive, in which case strand specific labels are drawn vertically. When 'V' or 'VB' is given, it must precede the 'F' or 'R' directive, i.e. 'VF', 'VR', 'VBF' or 'VBR'.
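The combinations allowed in field 5 can be summarised with a small parser. This is a sketch of the rules as stated in the text, not the module's own code: parse_field5 and its result keys are hypothetical, and any value that is not a recognised directive is simply treated as a colour (or colour function).

```perl
use strict;
use warnings;

# Hypothetical helper: classify a field 5 value.
# 'V' (optionally 'VB') must come before 'F' or 'R'; 'P' selects
# percentage shading; everything else is passed through as a colour.
sub parse_field5 {
    my ($value) = @_;
    my %d = (vertical => 0, bottom => 0, strand => undef, percent => 0, colour => undef);
    if (length $value and $value =~ /^(?:(V)(B)?)?([FR])?$/) {
        $d{vertical} = defined $1 ? 1 : 0;   # draw label text at 90 degrees
        $d{bottom}   = defined $2 ? 1 : 0;   # 'Bottom' aligned instead of 'top'
        $d{strand}   = $3;                   # 'F', 'R' or undef (both strands)
    }
    elsif ($value eq 'P') {
        $d{percent} = 1;                     # shade by percentage homology
    }
    else {
        $d{colour} = $value;                 # colour name or sub reference
    }
    return \%d;
}

for my $v (qw(F VR VBF P darkred)) {
    my $d = parse_field5($v);
    printf "%-8s vertical=%d bottom=%d strand=%s percent=%d colour=%s\n",
        $v, $d->{vertical}, $d->{bottom},
        defined $d->{strand} ? $d->{strand} : '-',
        $d->{percent},
        defined $d->{colour} ? $d->{colour} : '-';
}
```

A misordered value such as 'FV' does not match the directive pattern and falls through to the colour branch, which is one way of surfacing the "V must precede F or R" rule.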
A further exception to Field 5 is when it takes the value 'P' to indicate shading by percentage homology. In this case, colour1 (field 4) must be either 'red', 'green' or 'blue' and the gf's must all be HomolGeneFeature objects. In this case a shape is drawn such that the coordinates of the top line correspond to the values of 'start','end' in the object and the bottom line to 'start2','end2'. The 'percentid' field is used to set the colour such that 100% homol matches are the raw colour fading to a light colour as the homology decreases.
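One way to picture the fading is a linear blend between the raw colour and white. The text does not give the formula the module uses, so the blend below is an assumption made purely for illustration; shade_by_percent is a hypothetical name.

```perl
use strict;
use warnings;

# RGB values of the three colours permitted for field 4 in 'P' mode.
my %primary = (red => [255, 0, 0], green => [0, 255, 0], blue => [0, 0, 255]);

# Hypothetical helper: return (r, g, b) for a given percentid.
# Assumed rule: 100% gives the raw colour, 0% gives white, linear in between.
sub shade_by_percent {
    my ($colour, $percentid) = @_;
    my $rgb = $primary{$colour} or die "colour1 must be red, green or blue\n";
    $percentid = 0   if $percentid < 0;
    $percentid = 100 if $percentid > 100;
    my $fade = 1 - $percentid / 100;    # 0 at 100% identity, 1 at 0%
    return map { int($_ + (255 - $_) * $fade + 0.5) } @$rgb;
}

printf "%3d%% -> (%d,%d,%d)\n", $_, shade_by_percent('blue', $_) for 100, 50, 0;
```

In a real drawing routine each triple would then have to be allocated in the GD image before use; that step is left out here.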
The colour argument to field 4 and/or field 5 (subject to the constraints outlined above) may be a reference to a user defined colour function expecting up to two arguments: the first argument is a module provided reference to a hash of allocated colours ('the palette'), keyed by string labels for each (system or user defined) allocated colour, from which a GD::Image colour handle may be selected and returned to the module point calling the function. The second argument, if present, is a reference to the current GFF::GeneFeature being drawn. The function should be designed to always return a colour value from the palette, including a default colour in the absence of a defined Gene Feature.
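A minimal colour function following this contract might look as follows. To keep the sketch runnable without the GFF modules, the palette is a plain hash of placeholder numbers and Demo::Feature is a stand-in class offering only a strand method; both are assumptions made for illustration.

```perl
use strict;
use warnings;

# Stand-in for the palette passed in by the module. In real use the values
# would be GD colour handles rather than these placeholder numbers.
my %palette = (black => 0, red => 1, darkred => 2);

# Minimal stand-in for a gene feature (hypothetical, demonstration only).
{
    package Demo::Feature;
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub strand { return $_[0]{strand} }
}

# Colour function: first argument is the palette, second (optional) is the
# feature being drawn. A default is returned when no feature is supplied.
my $strand_colour = sub {
    my ($coltab, $gf) = @_;
    return $coltab->{black} unless defined $gf;
    my $strand = $gf->strand;
    return (defined $strand && $strand eq '-') ? $coltab->{darkred} : $coltab->{red};
};

print "no feature: ", $strand_colour->(\%palette), "\n";
print "reverse:    ", $strand_colour->(\%palette, Demo::Feature->new(strand => '-')), "\n";
```

The default branch matters because the function may be called without a feature, for example when a row label takes its colour from field 4.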
It is allowed to have multiple drawing rules for the same label, so complex gifs can be constructed with the same features repeated in different places but drawn in the same or different ways.
With the exception of the constraints outlined above, the colours available for use in gifgff layout specifications (or user defined colour functions) are specified as string scalars taken from the following list:

  white (presumably with a black border!)
  gray
  black
  red lightred darkred
  green lightgreen darkgreen
  blue lightblue darkblue
  yellow
  brown
  magenta (a.k.a. purple) darkmagenta
  cyan darkcyan

gff2gif() function; however, an existing GD::Image() reference handle (i.e. one created by a 'GD::Image->new($width,$height)' call) can be passed to the function, in which case the gff2gif() function writes its features onto the canvas of this existing image. See also xorigin and xorigin below.
colorAllocate()'d by the caller, into the GD::Image() 'image', and indexed by string labels denoting custom colours.
compress=0
turns off (default on) the mode that removes space for empty label types in the gif image. Compression makes the gif smaller.
bump=0
turns off bumping (default on). If there are multiple features (such as gene predictions) that overlap, then normally you don't want these to overwrite each other. Bumping fixes this. Bumping is away from the centre of the display, which is defined as index 0 in the layout array (such that genes and other features lie preferentially close to the clone when displayed on either side).

  my $group_by_name=sub{
      my $self=shift;
      return $self->seqname;
  };

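The idea of grouping first and bumping second can be sketched with plain hashes standing in for GeneFeature objects. This is not the module's algorithm, only a simple first-fit placement under stated assumptions: bump_rows is a hypothetical name, features are {name, start, end} hashes, and row 0 is taken to be the row nearest the centre.

```perl
use strict;
use warnings;

# Hypothetical helper: assign each group of features to a row so that
# groups sharing a row never overlap. Returns { group key => row index }.
sub bump_rows {
    my ($features, $group_by) = @_;

    # 1. Merge all features with the same group key into one extent, so
    #    that the exons and introns of a gene stay on a single line.
    my %extent;
    for my $f (@$features) {
        my $key = $group_by->($f);
        my $e = $extent{$key} ||= { start => $f->{start}, end => $f->{end} };
        $e->{start} = $f->{start} if $f->{start} < $e->{start};
        $e->{end}   = $f->{end}   if $f->{end}   > $e->{end};
    }

    # 2. Place each group on the first row (nearest the centre) where it
    #    overlaps nothing already placed; otherwise bump it outwards.
    my (@rows, %row_of);
    for my $key (sort { $extent{$a}{start} <=> $extent{$b}{start} || $a cmp $b } keys %extent) {
        my $e = $extent{$key};
        my $r = 0;
        $r++ while $rows[$r]
            && grep { $e->{start} <= $_->{end} && $_->{start} <= $e->{end} } @{ $rows[$r] };
        push @{ $rows[$r] }, $e;
        $row_of{$key} = $r;
    }
    return \%row_of;
}

my @features = (
    { name => 'gene1', start => 100,   end => 1150  },
    { name => 'gene1', start => 10300, end => 11450 },
    { name => 'gene2', start => 10000, end => 10400 },
    { name => 'gene2', start => 60001, end => 61000 },
    { name => 'gene3', start => 70000, end => 71000 },
);
my $rows = bump_rows(\@features, sub { $_[0]{name} });
print "$_ -> row $rows->{$_}\n" for sort keys %$rows;
```

Here gene2 overlaps the extent of gene1 and is bumped to the next row, while gene3 fits back on row 0.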
rowlabels=\%rowlabels, indexed upon the @layout row identification field tag (*). Rows without defined rowlabels are left blank.
Optionally, a specified colour and URL may be appended to the label, comma delimited, for example:

  $rowlabels{'GFF'} = 'GFF,red,http://www.sanger.ac.uk/Software/GFF/' ;

If no colour is provided, then the field 4 colour is used. Note, however, that if the field 4 colour is a reference to a user specified colour mapping function, no gene feature is available to the function, hence only the default colour returned by the function will be used.
If a URL is provided, the row label box becomes a clickable area in the image map.
(*) with strand decoration: namely, where the 'reverse' colour for a given @layout identification 'tag' is 'F', 'R', the actual row identification field tag is 'tagF' (or 'tagR').
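The comma-delimited form and the strand-decorated key can be sketched as two small helpers. Both names (parse_rowlabel, rowlabel_key) are hypothetical and the code only mirrors the description above; it is not taken from the module, and it assumes that a 'V' or 'VB' prefix does not take part in the key.

```perl
use strict;
use warnings;

# Hypothetical helper: split 'text,colour,url' into its parts.
# The limit of 3 keeps any commas inside the URL intact.
sub parse_rowlabel {
    my ($spec) = @_;
    my ($text, $colour, $url) = split /,/, $spec, 3;
    return { text => $text, colour => $colour, url => $url };
}

# Hypothetical helper: key under which a layout row's label is looked up.
# Rows restricted to one strand by 'F' or 'R' in field 5 use 'tagF'/'tagR'.
sub rowlabel_key {
    my ($tag, $field5) = @_;
    return $tag . $1
        if defined $field5 && !ref $field5 && $field5 =~ /^(?:VB?)?([FR])$/;
    return $tag;
}

my $label = parse_rowlabel('GFF,red,http://www.sanger.ac.uk/Software/GFF/');
print "text=$label->{text} colour=$label->{colour} url=$label->{url}\n";
print rowlabel_key('gene', 'F'), "\n";
print rowlabel_key('contig', 'darkgreen'), "\n";
```

A label given without colour or URL, such as 'Contig Map', leaves those two parts undefined, at which point the field 4 colour would apply as described above.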
x, y (e.g. so they can be written in the <img> tag to speed loading).
- _colour(): returns undef if undefined, so test for that
- font argument added to provide flexibility in default base font definition
1.08 (10/11/99): th/rbsk - several enhancements in label bumping, 'bar' drawing, etc.
1.07 (26/10/99): rbsk - 'V' may now be combined with 'R' or 'F' ie. 'VR' or 'VF' - added 'B' specifier to direct 'V' alignment
1.06 (26/10/99): rbsk - (bug fix) 'bar' object drawing needed to be offset by topmargin - added 'barcolour' argument
1.05 (18/10/99): rbsk
- User defined palette of GD::Image() colours.
- 'label*' text 'V' field 5 directive, for vertically drawn labels
1.04 (11/10/99): rbsk
- generalized colour specification mechanism in @Layout fields 4 and 5 to allow the use of a reference to a user defined feature discriminator function, so that the colour used in drawing specific features is gene feature content specific (e.g. different clone sources). See example script.
1.03 (30/9/99): rbsk
- added more colours to the GifGFF palette of colours: light versions of the primary colours, dark versions of the alternate colours, plus 'gray'; 'magenta' considered equivalent to 'purple'
1.02 (28/9/99): rbsk
- added arguments: 'image', 'leftmargin', 'rightmargin', 'topmargin', 'bottommargin' and 'rowlabels'
- because pre-existing GD 'image' handle may now be provided, the 'io' argument may be undefined in such situations, leaving the user the responsibility of printing out the Gif image.
- the 'rowlabels' argument provides a means of labelling, within the left margin space, any row generated by a given @Layout entry. See example script.
1.01 (07/99): th - Creation