As of April 2010 glyphs will be changed, the main differences being:
Compatibility with previous versions of ACEDB/Otterlace will be maintained.
Here's a summary of glyph configuration by way of examples.
[my-other-style]
mode=alignment
# here glyphs are sub-features
# and the styles to use are specified here:
sub-features = homology:red-triangles; splice: blue-diamond ;polyA: polyA-tail

[my-glyph-style]
# can be for feature type glyph or sub-feature
glyph = shape                          # shape to draw (both ends)
glyph-3 = shape                        # shape to draw (bottom end)
glyph-5 = shape                        # shape to draw (top end)
glyph-3-rev = shape                    # shape to draw (bottom end) reverse strand
glyph-5-rev = shape                    # shape to draw (top end) reverse strand
glyph-score-mode = width height size   # change size according to score
glyph-score-mode = alt                 # change colour according to threshold
glyph-threshold = X
glyph-strand = flip-x flip-y           # invert if on reverse strand
glyph-align = left right               # (defaults to centre if not specified)
                                       # style width must be set
                                       # see 'Aligning Glyphs' below for some useful comments
glyph-colours = pink                   # for sub feature glyphs
glyph-alt-colours = turquoise
[3-frame-splice]
mode = glyph
width=30.0                  # make enough space for left and right pointing hooks
frame-mode=only-1           # is frame specific so will use the colours, display in 1 column
strand-specific=true
show-reverse-strand=false   # NB there are no rev strand features
show_when_empty=false
score-mode=width
min-score= -2.0             # as in previous ACEDB style
max-score = 4.0
glyph-5 = up-hook
glyph-3 = dn-hook
colours = normal fill grey  # for central vertical line if we draw it
frame0-colours = normal fill red; normal border red
frame1-colours = normal fill green; normal border green
frame2-colours = normal fill blue; normal border blue
bump-mode=unbump
These are a little non-standard. The data is for the forward strand only; the min and max scores can be set in the GeneFinder application and are typically -2.0 to +4.0. These scores are the log probability of there being a splice site, so negative values (hooks on the left) are less likely than random.
If the score for a glyph is negative then the calculated width factor will also be negative, and this will cause the glyph to be reflected about the origin. The origin itself will be off centre in the column because min and max are not symmetrical about zero.
Note that 3-Frame splice markers are generated by the GeneFinder program and currently require ACEDB to be running. They are also explicitly only for the forward strand (meaningless with reverse complemented data) and the style above is designed to hide them when reverse complemented.
Note also that the 5' and 3' feature attributes from GeneFinder are intron-centric and are therefore of the opposite sense to that normally used in ZMap.
The style as defined above is what gets used if you specify in [ZMap] 'legacy_styles = true', and the glyphs are defined as:
[glyphs]
dn-hook = <0,0; 15,0; 15,10>
up-hook = <0,0; 15,0; 15,-10>
[homology-marker]
mode = glyph
glyph-5 = up-tri
glyph-3 = dn-tri
glyph-score-mode = alt
glyph-threshold = 5
glyph-colours = normal fill red        # for >= threshold
glyph-alt-colours = normal fill green  # for < threshold
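The threshold rule used by 'glyph-score-mode = alt' can be sketched as below. The function name is hypothetical; the direction of the comparison is taken from the comments in the stanza above (scores at or above the threshold use the main colour).

```python
def glyph_colour(score, threshold, colour, alt_colour):
    """Pick the colour for a glyph when glyph-score-mode = alt."""
    # Scores at or above glyph-threshold use the main glyph colour,
    # scores below it use the alternative colour.
    return colour if score >= threshold else alt_colour


# With glyph-threshold = 5, red at or above the threshold and green below it:
marker_colour = glyph_colour(7, 5, "red", "green")
```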
[nc-splice]
mode = glyph
glyph = diamond
glyph-colours = blue
mode = glyph (TBD)
glyph-5 = up-dotted-line
glyph-3 = dn-dotted-line
glyph-colours = blue
This could be coded as a separate featureset in the transcripts column.
To keep styles files readable by humans, glyph shapes will be defined by name; each name refers to an entry in a config stanza in the ZMap main config file. This stanza will be called [glyphs] and will consist of a single line per glyph of the form 'name=drawing-spec'.
As we draw glyphs using GDK operations we will define shapes in a way compatible with these, and this means we can specify lines, simple polygons and ellipses. We assume that only one filled polygon is to be specified per glyph in the interests of display speed. However, experiments with GDK show that we can cross lines to generate more complex shapes such as bow-ties.
Glyphs are defined via coordinates relative to an origin (0,0) specified in pixels. The origin is where the shape is anchored to the feature, and the feature's anchor point will be in the centre of its column at its Y-coordinate - it will be possible to define shapes offset from the centre. Points are defined as (signed) coordinate pairs separated by semi-colons.
The GDK drawing primitives draw filled shapes with both border and fill colours, so if we attempted to combine several of these into one glyph (for example a square and a triangle) we would end up with internal lines. We opt not to allow this.
 _____
|     |\
|     | \
|     | /
|_____|/
We use angle brackets to enclose the glyph description.
A list of points is given and by default lines between consecutive points are implied. Adding a '/' between points will signify a break.
If the list of points is unbroken and the first and last points are identical then the stanza defines a polygon, and the shape will be filled with the fill colour specified in the style. There is a hard-coded limit of 16 points per shape, and breaks ('/') count towards this limit.
[glyphs]
up-triangle=<0,-4; 4,0; -4,0; 0,-4>            # 4th point to complete the loop and trigger internal fill
up-walking-stick=<0,0; 8,0; 8,6>
truncated=<0,0; 3,1 / 6,2; 9,3 / 12,4; 15,5>   # a sloping dashed line
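As an illustration of the rules above, here is a minimal sketch of a parser for line glyphs. It is not the ZMap implementation: the function name and the returned structure are assumptions, as is the reading that each '/' break uses up one of the 16 slots.

```python
MAX_POINTS = 16  # hard-coded limit from the text; breaks are assumed to count


def parse_line_glyph(spec):
    """Parse '<x,y; x,y / x,y; ...>' into (segments, is_polygon)."""
    spec = spec.strip()
    if not (spec.startswith("<") and spec.endswith(">")):
        raise ValueError("line glyphs must be enclosed in angle brackets")

    segments = []  # each segment is a list of connected (x, y) points
    slots_used = 0
    for chunk in spec[1:-1].split("/"):
        points = []
        for pair in chunk.split(";"):
            x_text, y_text = pair.split(",")
            points.append((int(x_text), int(y_text)))
        segments.append(points)
        slots_used += len(points)

    slots_used += len(segments) - 1  # one slot per '/' break
    if slots_used > MAX_POINTS:
        raise ValueError("too many points in glyph")

    # An unbroken list whose first and last points coincide is a filled polygon.
    is_polygon = (
        len(segments) == 1
        and len(segments[0]) > 2
        and segments[0][0] == segments[0][-1]
    )
    return segments, is_polygon
```

With the examples above, 'up-triangle' parses to a single closed segment (a polygon), while 'truncated' parses to three separate two-point segments.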
We use round brackets to delimit the description.
Currently ZMap implements only whole circles; we will extend this to allow ellipses and fractions of a circle. To define a circular glyph we specify the bounding box with top left and bottom right coordinates, and then optionally a start and stop angle in degrees. Following GDK, 0/360 is at 3 o'clock and angles count up anticlockwise. It appears that GDK can only draw ellipses whose axes are vertical and horizontal.
[glyphs]
circle=(-4,-4 4,4)               # a full circle
horizontal-ellipse=(-2,-4 2,4)   # a flat ellipse
lr-circle=(0,0 4,4)              # a small circle offset to below and right of the feature
r-half-moon=(-4,-4 4,4 270 90)   # a half moon on the RHS
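The circle syntax above could be read along the following lines. This is a sketch only: the function name and the returned dictionary are hypothetical, and treating equal start and stop angles as a full circle is an assumption. The sweep is expressed as an anticlockwise extent from the start angle, which is the form gdk_draw_arc() expects (it takes both angles in 1/64ths of a degree).

```python
def parse_arc_glyph(spec):
    """Parse '(x1,y1 x2,y2 [start stop])' into a bounding box and angles."""
    spec = spec.strip()
    if not (spec.startswith("(") and spec.endswith(")")):
        raise ValueError("circular glyphs must be enclosed in round brackets")

    fields = spec[1:-1].split()
    if len(fields) not in (2, 4):
        raise ValueError("expected two corners and optionally two angles")

    x1, y1 = (int(v) for v in fields[0].split(","))  # top left
    x2, y2 = (int(v) for v in fields[1].split(","))  # bottom right
    if len(fields) == 4:
        start, stop = int(fields[2]), int(fields[3])
    else:
        start, stop = 0, 360  # no angles given: a whole circle or ellipse

    # Anticlockwise extent from start to stop; 0 is treated as a full turn.
    sweep = (stop - start) % 360 or 360
    return {
        "box": (x1, y1, x2, y2),
        "start": start,
        "sweep": sweep,
        "gdk_angles": (start * 64, sweep * 64),  # units of 1/64 degree
    }
```

For 'r-half-moon' this gives a start of 270 (6 o'clock) and a sweep of 180 degrees anticlockwise through 3 o'clock, i.e. the right-hand half.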
The points used to specify a glyph's size and shape correspond to the maximum size and when 'glyph-score-mode' is set as width or height then the pixel coordinates will be adjusted accordingly.
A glyph that scores less than the minimum will not be displayed, and a glyph exceeding the maximum will be displayed as for max-score. A glyph scoring exactly 'min-score' will likely be one pixel across.
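A minimal sketch of this scaling follows. The linear interpolation between min-score and max-score is an assumption (the text does not give the formula), as is the reading that 'size' scales both axes; the function name is hypothetical. It does not model the signed width factor described for the 3-frame splice markers.

```python
def scale_glyph(points, score, min_score, max_score, mode="width"):
    """Scale glyph points (defined at maximum size) according to score."""
    if score < min_score:
        return None  # below the minimum: the glyph is not displayed

    score = min(score, max_score)  # above the maximum: draw as for max-score
    factor = (score - min_score) / (max_score - min_score)  # assumed linear

    scaled = []
    for x, y in points:
        if mode in ("width", "size"):
            x = x * factor
        if mode in ("height", "size"):
            y = y * factor
        scaled.append((x, y))
    return scaled
```

A real implementation would also need to round to whole pixels and enforce a minimum drawn size, which is where the one-pixel glyph at min-score would come from.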
It's a simple concept but there are a few nuances to consider.
Currently, when a canvas item is drawn, much of the information used is derived from a style structure but at some point gets copied into a local structure. The glyphs code makes use of GObject get() and set() functions, which inevitably impose an overhead. Styles are attached to a view and also to the window structures, and the reason for copying style data is presumed to be to allow styles to be changed at run time. Whenever a new view is created the config files are re-read, and existing views (i.e. each ZMap) must continue to function as before. Arguably there is no need to have features displayed using two versions of the same style in one window (in fact this would be confusing for the user), so the style information could be stored efficiently in the window, or accessed by a reference to the style, instead of being copied into each canvas item. However, it is not possible to address this without a review of the whole view - window - canvas item interface, and it is not appropriate to do this now.
A glyph shape is a simple data structure and can be stored easily in a style object, much like a colour definition. Following current practice, a displayed glyph can have the relevant config choices extracted from the style and stored in its glyph object parameters. This requires the addition of glyph-shape-name, glyph-points and glyph-type (or similar).
In terms of configuration a glyph shape is defined as 'name=shape' where shape includes points and type (lines or circles). For the style object we need to store the name and shape as separate properties, which implies a difference from default config file handling (beyond the fact that we are required to handle user defined names anyway). When reading styles we will have to read the glyphs config first and store this data in the styles - the glyphs data will then be freed as it cannot be accessed from anywhere else due to existing data structure design.
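The read order described above (glyphs first, then styles, then discard the glyph table) can be sketched as follows. This is an illustration using Python's configparser rather than ZMap's own config code; the function name, the choice of keys, and the dictionary layout are all assumptions, and the shape is kept as raw text where ZMap would convert it to points.

```python
import configparser

# Keys whose value names a glyph shape; the -rev variants are omitted for brevity.
GLYPH_KEYS = ("glyph", "glyph-5", "glyph-3")

MAIN_CONFIG = """\
[glyphs]
up-hook = <0,0; 15,0; 15,-10>
dn-hook = <0,0; 15,0; 15,10>
"""

STYLES = """\
[3-frame-splice]
mode = glyph
glyph-5 = up-hook
glyph-3 = dn-hook
"""


def load_styles(main_config_text, styles_text):
    """Read [glyphs] first, then store name and shape in each style."""
    main = configparser.ConfigParser(interpolation=None)
    main.read_string(main_config_text)
    shapes = dict(main["glyphs"]) if main.has_section("glyphs") else {}

    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(styles_text)

    styles = {}
    for style_name in parser.sections():
        style = dict(parser[style_name])
        for key in GLYPH_KEYS:
            if key in style:
                shape_name = style[key]
                # Name and shape are held as separate properties of the style.
                style[key] = {"name": shape_name, "shape": shapes[shape_name]}
        styles[style_name] = style

    # 'shapes' goes out of scope here, mirroring the glyph data being freed.
    return styles


styles = load_styles(MAIN_CONFIG, STYLES)
```

After loading, each style carries its own copy of the shape data, so nothing else needs access to the [glyphs] stanza.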
We wish to avoid continually interpreting text into a list of points. Since the GObjects used for canvas items have no external references, this means we will have to copy the style's shape data structure as a G_PARAM_BOXED parameter. As glyph shapes must be defined in the main ZMap configuration file they could be stored as a hash table of structures globally in the view (like styles), but unfortunately by the time we get to display a glyph that data is not accessible. A CanvasItem has a feature pointer which has a style, and we are constrained to hold the reference there. We can convert the text to data in the style and provide an extra interface to extract this data and copy it to the Glyph canvas item. Doing this via a g_object_get call would require two interfaces to the same parameter, which can be provided if necessary at the cost of slowing down object copy functions. Alternatively, we can provide a C function to extract the data structure directly, which will look like other existing style access functions. This will not require any major re-organisation of existing data structures.
Although we can support different glyphs at the 5' and 3' ends with current styles, there is a need to support multiple sub-features.
Previous code has handled homology and consensus splice site glyphs on alignments by hard coding these. Splice sites were removed due to clutter, and homology glyphs were made configurable by adding an explicit glyph to alignment styles. For alignments we can easily imagine needing three types of glyph to attach to features: a) incomplete homology markers, b) non-consensus splice sites, c) poly A tails. It is not feasible to keep adding data to the styles, although the handling of these glyphs does need to be hard coded as it involves ZMap calculating where and whether or not to display them.
One obvious approach is to code glyphs using separate styles which allow the appearance to be configured but are only used as sub-features. Two simple ways to attach these to feature (alignment) styles would be:
[align-xyz]
mode=alignment
sub-features = homology:red-triangles; splice: blue-diamond ;polyA: polyA-tail
[sub-features]
homology = red-triangles
splice = blue-diamond
polyA = polyA-tail
Here it is necessary to follow existing practices.
When a column is created a list of styles which are necessary to display all the featuresets in the column is created and these are copied to a hash table in the container, and they can be looked up later by unique-id. When reading in a style's data the id quarks are stored in a short array in the style (sub_feature[]) which is indexed by a sub-feature enum. Sub features are all hard coded and we can predict how many there are.
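The enum-indexed array described above can be sketched as follows, using the 'sub-features' value from the earlier example. The enum members and the function name are hypothetical, and plain strings stand in for the id quarks.

```python
from enum import IntEnum


class SubFeature(IntEnum):
    """Hypothetical enum of the hard-coded sub-feature types."""
    HOMOLOGY = 0
    SPLICE = 1
    POLYA = 2


def parse_sub_features(value):
    """Turn 'kind:style; kind:style' into an array indexed by SubFeature."""
    slots = [None] * len(SubFeature)  # fixed size: sub-features are hard coded
    for entry in value.split(";"):
        kind, style_id = (part.strip() for part in entry.split(":"))
        slots[SubFeature[kind.upper()]] = style_id
    return slots


sub_feature = parse_sub_features(
    "homology:red-triangles; splice: blue-diamond ;polyA: polyA-tail"
)
```

Because the set of sub-features is fixed, a lookup such as sub_feature[SubFeature.SPLICE] needs no hashing or searching.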
When creating the column, first a list of all styles needed is created by zmapWindowUtils.c/zmapWindowFeatureSetStyles(). This is later passed to zmapWindowContainerFeatureSet.c/zmapWindowContainerFeatureSetAugment(), which takes a copy of the actual styles data and stores this in a hash table in the container. zmapWindowFeatureSetStyles() has access to a global list of styles for the window, which may be a copy of the styles known to the view.
To add sub-feature styles to this table, zmapWindowFeatureSetStyles() must find these for each style it adds to the list, by looking up the id in the window's style table, and add them to its list. As the ids are simple quarks the style config can be read in without any resource deadlocks; when the container hash table of styles is created all the relevant styles have already been created and are accessible.
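The two steps above (collect the ids needed, then copy the styles into the container) can be sketched as below. The function names only loosely mirror the C functions mentioned in the text, and the dictionary-based style table with a 'sub_features' list is an assumption made for illustration.

```python
def feature_set_styles(window_styles, featureset_style_ids):
    """List every style id a column needs, including sub-feature styles."""
    needed = []
    for style_id in featureset_style_ids:
        style = window_styles[style_id]  # look up in the window's style table
        if style_id not in needed:
            needed.append(style_id)
        for sub_id in style.get("sub_features", []):
            if sub_id is not None and sub_id not in needed:
                needed.append(sub_id)
    return needed


def augment_container(window_styles, needed):
    """Copy the listed styles into a per-container table keyed by id."""
    return {style_id: dict(window_styles[style_id]) for style_id in needed}
```

Since only ids are stored when a style is read, the sub-feature styles need not exist at that point; they only have to be present in the window's table by the time the container's table is built.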
A glyph item is a zmapWindowCanvasItem yet is created by a call to foo_canvas_item_new(), which creates an object of the requested type. Sub-feature glyphs are created in zmapWindowCollectionFeature.c/markMatchIfIncomplete() and also zMapWindowCollectionFeatureAddSpliceMarkers(). Glyph mode features get added as plain zmapWindowCanvasItems. Currently the only glyph mode features used are the splice marker triangles, and there is some old #iffed out code in zmapWindowItemFactory.c/drawGlyphFeature() that uses the zMapDrawGlyph() function in zmapDraw.c; instead a plain canvas item is created.
ACEDB provides a GF_splice style which is defined as mode glyph and glyph_mode ZMAPSTYLE_GLYPH_SPLICE. Other than in config code, this glyph-mode is only referenced in zmapWindowBasicFeature.c/zmap_window_basic_feature_add_interval(), where the RH triangles are hard coded. This function calls foo_canvas_item_new() asking for a ZMAP_TYPE_WINDOW_GLYPH_ITEM. This item type is only referenced once elsewhere, in zmapWindowCanvasItem.c/zmap_window_canvas_item_set_colour(). However, in zmapWindowCollectionFeature.c there are three references to zMapWindowGlyphItemGetType(), which is the same thing. So it appears that glyphs end up as the same kind of object regardless of being features or sub-features, and the existing code can simply be tweaked to use configured data.
zmapFeature.c/addFeatureModeCB() also sets mode glyph in any style called "Gf_Splice", although experiments reveal that this function (which is to patch up some ACEDB issues) does not appear to be called - this is according to some 'has modes' flag.
A new function (zMapWindowGlyphItemCreate()) will be provided by zmapWindowGlyphItem.c which will draw a glyph (via a foo_canvas_item_new() call) given its style and some context parameters.
So...
Note that to access the same data structure in zmapStyle.c and also zmapWindowGlyphItem.c we have to make this public and include zmapStyle.h in the glyph code, which makes it the only CanvasItem module including styles data.
Handling legacy styles data from ACEDB needs some thought (it used to be hard coded) and the following is proposed:
[ZMap]
legacy_glyphs=true