A number of feature requests have been received that may involve significant interaction between Otterlace and ZMap. Those requested following the recent annotation test (2009/10) are collated here.
A common theme is the selection of features from a (marked) region and the display or highlighting of relevant ones according to variable criteria. The intention is to perform the necessary analysis in Otterlace, as its scripting interface allows more rapid development. Similar functions may already exist in ZMap (e.g. colinear lines and homology markers on bumped alignments). Some of these could possibly be implemented in ZMap, and this point is open to discussion.
Some notes are available below.
Some notes about how to do this in ZMap without interaction with otterlace can be found here. There is also the option of otterlace providing separate featuresets for masked and unmasked data, which would be a very neat solution.
Although not mentioned in the RT ticket, there are plans for this to be driven by a button or menu option. An annotator will be working on a marked region (or an implied marked region related to a selected feature?) and will want to see related non-duplicated evidence. For example, given an mRNA there may be hundreds of EST features covering the same region, and many of these may tell nothing new: the ESTs may correspond exactly to some of the mRNA data.
On selecting the feature/region and the function (for example by a right click menu in ZMap or by a menu option in the otterlace Tools menu), otterlace will request, or ZMap will send, information about the features present in the relevant columns in the given range. As ZMap has no way to know which columns/featuresets are relevant (and hard coding this is a poor choice, as the names can change), it makes sense for Otterlace to specify the featuresets (these could be specified in styles) and the range. A protocol already exists for otterlace to know the selected feature or marked region. If necessary, ZMap can send a request to trigger the whole process.
The end result of this process would be a list of features to display within the marked region - effectively like bumping a column (or several) but with only the relevant features displayed.
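As a rough illustration of the selection described above, the following C sketch filters a set of ESTs against one mRNA within a marked region. The `Span` and `Feature` types and both function names are hypothetical and are not ZMap or Otterlace structures. The masking criterion (every EST block lies inside some mRNA block) is an assumption made for the example, not an agreed rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; these are not ZMap structures. */
typedef struct { int start, end; } Span;   /* inclusive coordinates */
typedef struct {
    const char *name;
    const Span *blocks;                     /* aligned blocks or exons */
    size_t n_blocks;
} Feature;

static bool span_within(Span inner, Span outer)
{
    return inner.start >= outer.start && inner.end <= outer.end;
}

static bool span_overlaps(Span a, Span b)
{
    return a.start <= b.end && b.start <= a.end;
}

/* Assumed criterion: an EST tells nothing new if every one of its blocks
 * lies inside some block of the mRNA. */
bool feature_is_masked(const Feature *est, const Feature *mrna)
{
    for (size_t i = 0; i < est->n_blocks; i++) {
        bool covered = false;
        for (size_t j = 0; j < mrna->n_blocks && !covered; j++)
            covered = span_within(est->blocks[i], mrna->blocks[j]);
        if (!covered)
            return false;
    }
    return true;
}

/* Store in out[] the ESTs that touch the marked region and are not masked.
 * out[] must have room for n_ests pointers. Returns the number stored. */
size_t select_unmasked(const Feature *ests, size_t n_ests, const Feature *mrna,
                       Span mark, const Feature **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n_ests; i++) {
        bool in_mark = false;
        for (size_t j = 0; j < ests[i].n_blocks && !in_mark; j++)
            in_mark = span_overlaps(ests[i].blocks[j], mark);
        if (in_mark && !feature_is_masked(&ests[i], mrna))
            out[kept++] = &ests[i];
    }
    return kept;
}
```

The list returned by `select_unmasked` corresponds to the features that would remain on display in the marked region. A real implementation would probably need approximate matching rather than strict containment.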
This process does look complicated: is there any benefit in implementing this instead of having ZMap calculate it?
For a series of ESTs we wish to have an automated process to see if each of them is represented by any of a series of existing annotated transcripts. As the number of combinations can be quite large, it is not practical to do this by hand. The original ticket mentions 'check to see of the dotter is an uninterrupted diagonal line', but the reality is that approximate matches are needed. In most cases there will be a diagonal line, but it will fade out and not be full length, or there will be breaks near the tail, so it is necessary for the annotator to see the dotter output and interpret it.
Initially individual transcripts will be compared with multiple ESTs - this is already implemented in dotter.
To provide a clean interface for selecting features and presenting results the following is suggested:
The above initial solution will be trialed for feedback. Some ideas for improving it have been suggested:
As in 'Apollo' - given a selected transcript light up homologies that match.
Given a transcript we can define the splice sites unambiguously via intron/exon boundaries, in which case any analysis is clear cut. If we can select a transcript and then right click (RC) on a (bumped) alignment column to select a menu item, then we also have a simple user interface.
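The splice-site comparison could be sketched as below. This is a minimal C example under stated assumptions: `Span` and the function names are hypothetical, coordinates are inclusive and sorted, and a gap between two alignment blocks only counts as a match when it coincides exactly with one intron of the transcript (so an alignment that skips an exon does not match).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type for illustration; inclusive coordinates, sorted by start. */
typedef struct { int start, end; } Span;

/* True if the gap between two adjacent alignment blocks coincides exactly
 * with an intron, i.e. the gap between two consecutive exons. */
static bool gap_is_intron(const Span *exons, size_t n_exons, Span left, Span right)
{
    for (size_t k = 0; k + 1 < n_exons; k++)
        if (exons[k].end == left.end && exons[k + 1].start == right.start)
            return true;
    return false;
}

/* Number of gaps in the alignment that coincide with transcript introns. */
size_t count_matching_introns(const Span *exons, size_t n_exons,
                              const Span *blocks, size_t n_blocks)
{
    size_t matched = 0;
    for (size_t i = 0; i + 1 < n_blocks; i++)
        if (gap_is_intron(exons, n_exons, blocks[i], blocks[i + 1]))
            matched++;
    return matched;
}

/* Assumed rule: an alignment matches if it has at least one gap and every
 * gap coincides with an intron. Single-block alignments carry no splice
 * information and are reported as not matching. */
bool alignment_matches_transcript(const Span *exons, size_t n_exons,
                                  const Span *blocks, size_t n_blocks)
{
    return n_blocks >= 2 &&
           count_matching_introns(exons, n_exons, blocks, n_blocks) == n_blocks - 1;
}
```

Homologies for which `alignment_matches_transcript` returns true are the ones that would be lit up for the selected transcript.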
This is an obvious candidate for a new sub-feature glyph type where we can emulate the Apollo display.
Evidence and transcript will both be highlit. As evidence can appear in several columns, a single colour defined in the window will be used; this will identify all the evidence as one group. Defining different (contrasting) colours per style could be confusing for the user. A 'washed out' version of the original feature's colour has been suggested, but this would require more colour information per style, as calculated colours are difficult to make perfect. The default is yellow in black and applies to all features/columns affected.
Configuration is done by:
[ZMapWindow]
colour-evidence-fill=pink
colour-evidence-border=green
ZMap styles provide normal and selected colours. To show related features such as evidence already used it has been suggested that these features 'go grey' which requires another colour spec or style. zmapWindowFocus.c mentions window->colour_item_highlight which is used for items that have no select colour defined and further colours can be defined in the window for 'used as evidence' features. Currently highlighting with window colours only affects the fill colour.
As we are now dealing with two independent sets of highlit features (evidence and focus) we need to define how they interact. Focus is at the top level and should always be visible. It is displayed by changing the fill colour, and the focus column is always shown. If we set the evidence colours to affect border and fill then it will be possible to see both at the same time.
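The interaction between the two highlight sets can be illustrated with a small C sketch. The `Colours` type, the flag names and `resolve_colours` are hypothetical and are not part of the WindowFocus code. The sketch assumes that evidence sets both fill and border and that focus then overrides the fill only.

```c
/* Hypothetical types for illustration; not part of zmapWindowFocus.c. */
typedef struct { const char *fill; const char *border; } Colours;

enum { HL_FOCUS = 1 << 0, HL_EVIDENCE = 1 << 1 };

/* Evidence replaces fill and border; focus is applied last and overrides
 * only the fill, so a feature in both sets keeps the evidence border. */
Colours resolve_colours(unsigned flags, Colours normal,
                        const char *focus_fill, Colours evidence)
{
    Colours out = normal;
    if (flags & HL_EVIDENCE)
        out = evidence;
    if (flags & HL_FOCUS)
        out.fill = focus_fill;
    return out;
}
```

With evidence colours of pink fill and green border, a feature that is both focus and evidence would be drawn with the focus fill and a green border, so both states remain visible.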
Window colours are set in zmapWindowDrawFeatures.c/setColours().
The WindowFocus code will be modified to handle multiple lists of related features; see here.
There is some value in having related features highlit semi-permanently. When do they get un-highlit? a) if another feature is used to request related data, or b) via the RC menu/'hide related features'. If a feature is selected (becomes the focus), is then used to select evidence-related features, and then loses focus, the evidence is still highlit. We need to provide two menu options: 1) show evidence and 2) hide evidence, where 1) will also hide any evidence already highlit. A further option will be provided to add more features to the evidence list.
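The show/hide/add behaviour can be summarised in a short C sketch. The `EvidenceSet` type, the integer feature ids and the function names are assumptions made for illustration. Note that nothing here reacts to a change of focus, which is what keeps the evidence highlit after the focus feature is deselected.

```c
#include <stddef.h>

#define MAX_HIGHLIT 256

/* Hypothetical container of highlit evidence, identified by integer ids. */
typedef struct { int ids[MAX_HIGHLIT]; size_t count; } EvidenceSet;

/* 'Hide evidence': un-highlight everything. */
void evidence_hide(EvidenceSet *set)
{
    set->count = 0;
}

/* 'Add evidence': append to the current list, up to capacity.
 * Returns the number of ids actually added. */
size_t evidence_add(EvidenceSet *set, const int *ids, size_t n)
{
    size_t added = 0;
    while (added < n && set->count < MAX_HIGHLIT)
        set->ids[set->count++] = ids[added++];
    return added;
}

/* 'Show evidence': hide whatever is already highlit, then show the new list. */
size_t evidence_show(EvidenceSet *set, const int *ids, size_t n)
{
    evidence_hide(set);
    return evidence_add(set, ids, n);
}
```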
But it's a little more complicated. Requesting related features is a feature specific function and appears on the feature menu. Hide features must appear if some are highlit but cannot be in the feature menu as the option will not appear if there is no focus feature. The feature menu can offer 'highlight evidence' or 'highlight transcript' depending on the focus feature type. The column menu can offer 'hide xxxx' depending on the type highlit, but to avoid confusion (as hide implies removal from display) this will be expressed as a toggle item with a tick and be called 'highlight XXX', and by selecting this menu item the user is unticking the box.
The existing 'feature details' popup has a section for evidence features and this information could be used to drive this highlighting. Here is an example response from Otterlace:
<response handled="true">
  <notebook>
    <chapter>
      <page name="Details">
        <subsection name="Annotation">
          <paragraph type="tagvalue_table">
            <tagvalue name="Transcript Stable ID" type="simple">OTTHUMT00000326814</tagvalue>
            <tagvalue name="Translation Stable ID" type="simple">OTTHUMP00000201976</tagvalue>
            <tagvalue name="Transcript author" type="simple">jel</tagvalue>
          </paragraph>
          <paragraph columns="'Type' 'Accession.SV'" name="Evidence" type="compound_table" column_types="string string">
            <tagvalue type="compound">EST Em:BI909683.1</tagvalue>
            <tagvalue type="compound">Protein Sw:P43626.1</tagvalue>
            <tagvalue type="compound">Protein Sw:Q14954.1</tagvalue>
            <tagvalue type="compound">cDNA Em:AF022046.1</tagvalue>
            <tagvalue type="compound">cDNA Em:U24078.1</tagvalue>
          </paragraph>
        </subsection>
        <subsection name="Locus">
          <paragraph type="tagvalue_table">
            <tagvalue name="Symbol" type="simple">KIR2DS1</tagvalue>
            ... etc ...
          </paragraph>
        </subsection>
      </page>
    </chapter>
  </notebook>
</response>
The existing message handling code is tied in quite tightly to some other code that drives creating a popup window and this involves using a callback from the window to its owning view to actually request the data - the view is aware of external interfaces but the window is not.
zmapWindowFeatureShow.c/featureShow() supplies stacks of start and end handlers for XML tags which imply parent/child relationships and these mirror the structure of the XML document expected from otterlace. The easiest way to process this data and extract the list of supporting features will be to provide different tag handlers that simply accumulate a list of items to be used by the calling code.
char *content = zMapXMLElementStealContent(element);
This code can be re-used with the addition of a couple of flags and can be made to return a list of accession quarks defining the evidence features. Note that this necessarily creates data structures to drive the feature details popup, and these must be freed.
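The accumulation step might look something like the following C sketch. It does not use the ZMap XML API or GLib quarks: `Evidence`, `EvidenceList` and `evidence_list_add` are hypothetical names, and the sketch assumes a handler has already obtained the text content of each compound tagvalue in the form 'Type Accession.SV', as in the example response.

```c
#include <stddef.h>
#include <string.h>

#define MAX_EVIDENCE 64
#define MAX_FIELD 64

/* Hypothetical containers for illustration; the real code would more likely
 * hold a list of accession quarks. */
typedef struct { char type[MAX_FIELD]; char accession[MAX_FIELD]; } Evidence;
typedef struct { Evidence items[MAX_EVIDENCE]; size_t count; } EvidenceList;

/* Split "Type Accession.SV" at the first space and append it to the list.
 * Returns 1 if an item was added, 0 if the content is malformed, a field is
 * too long, or the list is full. */
int evidence_list_add(EvidenceList *list, const char *content)
{
    const char *sep = strchr(content, ' ');
    size_t type_len, acc_len;
    Evidence *item;

    if (sep == NULL || list->count >= MAX_EVIDENCE)
        return 0;

    type_len = (size_t)(sep - content);
    while (*sep == ' ')                 /* skip the separator */
        sep++;
    acc_len = strlen(sep);

    if (type_len == 0 || acc_len == 0 ||
        type_len >= MAX_FIELD || acc_len >= MAX_FIELD)
        return 0;

    item = &list->items[list->count++];
    memcpy(item->type, content, type_len);
    item->type[type_len] = '\0';
    memcpy(item->accession, sep, acc_len + 1);   /* includes the terminator */
    return 1;
}
```

A tag handler for the 'Evidence' paragraph would call `evidence_list_add` once per tagvalue, and the calling code would then use the accumulated accessions to find the features to highlight.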
There is a need for ZMap to operate on its own and some advanced features can be implemented in several ways. If Otterlace is connected then we wish to have the option of alternate behaviour and a mechanism for this exists in the current 'Feature Show' requests.
ZMap will make requests to Otterlace informing Otterlace what it is about to do and Otterlace will respond with a message like one of the following:
<response handled="false"></response>
<response handled="true"> ... extra data ... </response>
Otterlace will have the choice of replying with a request for ZMap to do nothing and later following up with a request to perform various actions. This may be important if we wish to perform some extensive analysis of data in ZMap by default (e.g. masking ESTs by mRNAs), as it allows a quick decision to be made without having to wait for data.
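The decision ZMap has to make on receiving the reply can be sketched in C as follows. This is a deliberately naive string check and not a real XML parse. `ReplyStatus` and `reply_status` are hypothetical names. The sketch assumes a `handled` value of "false" means ZMap carries on with its default behaviour, while "true" means Otterlace has taken over.

```c
#include <string.h>

typedef enum { REPLY_INVALID, REPLY_NOT_HANDLED, REPLY_HANDLED } ReplyStatus;

/* Inspect the 'handled' attribute of the opening <response> tag.
 * Naive text search for illustration; real code would use an XML parser. */
ReplyStatus reply_status(const char *xml)
{
    static const char key[] = "handled=\"";
    const char *tag = strstr(xml, "<response");
    const char *close, *attr;

    if (tag == NULL)
        return REPLY_INVALID;

    close = strchr(tag, '>');
    attr = strstr(tag, key);
    if (close == NULL || attr == NULL || attr > close)
        return REPLY_INVALID;           /* attribute missing from this tag */

    attr += strlen(key);
    if (strncmp(attr, "true\"", 5) == 0)
        return REPLY_HANDLED;
    if (strncmp(attr, "false\"", 6) == 0)
        return REPLY_NOT_HANDLED;
    return REPLY_INVALID;
}
```

On `REPLY_NOT_HANDLED` ZMap would run its own analysis. On `REPLY_HANDLED` it would either use the extra data supplied or do nothing and wait for a follow-up request.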
NB: All the following is subject to implementation - user requirements and time available.
As most of the data will no longer be stored in ACEDB it becomes necessary for ZMap to provide Otterlace with information about what features are present and this will include:
Actions that Otterlace can request from ZMap will include:
There is possible confusion over the use of the words Featureset and Column. To avoid ambiguity we shall state the following:
The Column 'Repeats' is typically used to display the featuresets 'Repeatmasker_LINE' and 'Repeatmasker_SINE'. Usually all these features are displayed in one column (not strand sensitive).
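The relationship in this example can be expressed as a simple lookup table, sketched here in C. `ColumnDef` and `column_for_featureset` are hypothetical names and do not reflect how ZMap stores its configuration. The sketch only shows that one column can display several featuresets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping of one display column to the featuresets it shows. */
typedef struct {
    const char *column;
    const char *const *featuresets;
    size_t n_featuresets;
} ColumnDef;

/* Return the name of the column that displays the given featureset,
 * or NULL if no column claims it. */
const char *column_for_featureset(const ColumnDef *cols, size_t n_cols,
                                  const char *featureset)
{
    for (size_t i = 0; i < n_cols; i++)
        for (size_t j = 0; j < cols[i].n_featuresets; j++)
            if (strcmp(cols[i].featuresets[j], featureset) == 0)
                return cols[i].column;
    return NULL;
}
```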