Historically ZMap has requested all its data sequentially from ACEDB. More recently this has been expedited by the advent of pipe servers, which allow data from different columns to be requested in parallel and displayed piecemeal. This allows low volume columns to be displayed quickly and allows the user to interact with ZMap while high volume columns are loading (caveat: there is a noticeable speed reduction due to processor use).
Data volumes have increased and we expect these to continue to increase at an accelerating rate. To provide a usably fast response it is probably not feasible to continue with the current strategy, which is to display all the data and allow fast scrolling via X. Note that the existing implementation using the foo canvas appears not to provide this fast scrolling (via an internal static bitmap) and instead draws features via an X expose event.
One strategy is to display summary data for high volume columns, and while this may be a good solution, at some point we have to display the actual features.
This is because every feature loaded is displayed even if not visible due to being hidden by others. If there are 100k features in a column then the same picture can be generated by about 1000 features.
As for the initial display this is due to repainting every feature in the display.
This was due to the foo canvas passing the cursor motion event to every canvas group, resulting in eg 600k function calls for each event. This has been commented out.
These are implemented by adding foo canvas items and in the current implementation this implies a repaint of all canvas groups overlapping the areas involved. For the ruler this means a line across the whole window for the old and new position, and subject to X behaviour could result in an expose event to all groups between the two. For the lasso there will be an expose event for the larger of the old and new rectangles.
It is clear that we have to prevent expose events from repainting features. Using some kind of bitmap to hold the image, or persuading X to blit the ruler and lasso, will solve this.
We expect that visible scrolled items will not be re-painted but the previously hidden ones are. Using a bitmap to hold the image will cure this but at the expense of having to paint it in the first place.
The basic problem is that it takes a long time to display a lot of features and the current strategy is to display all the features ZMap has and allow fast scrolling via bitmap blitting (which does not happen). All the problems to solve can be expressed in these terms but there are a number of aspects to consider:
Items are added to groups and the groups store these in a GList. The known performance problem with g_list_append has been addressed (see foo-canvas.c/group_add()), so other than operating foo data this is not thought to be relevant.
This is certainly true. Due to the way ZMap uses the foo canvas (data arranged in column groups and stored unsorted), if we want to paint items in a small section of the column then these are searched for using GObject methods and every group in the column is asked if it is in the region. As every ZMapCanvasItem is a group, we search every feature, which runs in O(n) time rather than a more reasonable O(log n). We know that this is a real performance problem as it was the cause of the cursor seizing up: this used the GObject signalling mechanism, which implies a double overhead - one function call to send the signal and another to receive it for every item in the column. Note that reverting ZMapCanvasItems to be simple foo_canvas_items will not affect this much as groups have to query each item.
The general approach will be to take existing code using a high volume column and selectively remove various functions while timing operations. Attempts using vtune, for example, have proved useful but are swamped by glib internal routines and apply to a complete run of zmap, and therefore provide imprecise data.
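To illustrate the difference between asking every item and searching a sorted column, here is a minimal sketch in Python. It is not ZMap or foo canvas code: `SortedColumn`, `overlapping` and `overlapping_linear` are hypothetical names, and features are assumed to be simple (start, end) pairs with inclusive coordinates. The sorted version finds the candidate range by binary search and then checks only those candidates.

```python
from bisect import bisect_left, bisect_right


class SortedColumn:
    """Hypothetical column index: features kept sorted by start coordinate."""

    def __init__(self, features):
        # features: iterable of (start, end) pairs, both inclusive
        self.features = sorted(features)
        self.starts = [start for start, _ in self.features]
        # Longest feature, used to bound how far back a match can start
        self.max_len = max((end - start for start, end in self.features), default=0)

    def overlapping(self, lo, hi):
        """Return features overlapping [lo, hi] using two binary searches."""
        first = bisect_left(self.starts, lo - self.max_len)
        last = bisect_right(self.starts, hi)
        return [(s, e) for s, e in self.features[first:last] if e >= lo]


def overlapping_linear(features, lo, hi):
    """The current behaviour in spirit: every feature is asked in turn."""
    return [(s, e) for s, e in features if s <= hi and e >= lo]


features = [(i * 10, i * 10 + 25) for i in range(1000)]
column = SortedColumn(features)
hits = column.overlapping(500, 520)
```

The binary searches cost O(log n) and the filter only touches features that start near the region, so the cost depends on the number of features near the region rather than on the size of the column. A single very long feature widens the candidate range, which is one reason a real implementation might prefer an interval tree.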
We will use a large data set (eg 170k trembl features) so that non-linear functions appear more significant.
Data sizes
gtkobject 16
foocanvasitem 56
foocanvasgroup 80
zmapcanvasitem 108
zmapwindowalignmentfeature 116
zmapwindowcontainergroup 156
foocanvasitem 56 zmapcanvas item 108 - 80 - 16 = 12 (16 bytes saved due to not needing 4 lists of canvas items)
General
Program: ZMap - 0.1.110
User: mh17 (Malcolm Hinsley)
Machine: deskpro18979
Sequence: chr9-03_86271101-87038365

Session Statistics
chr9-03_86271101-87038365: Context children: 0, canvas children: 0
chr9-03_86271101-87038365: Context children: 0, canvas children: 0
chr9-03_86271101-87038365: Context children: 0, canvas children: 0
Novel_CDS: Context children: 2, canvas children: 2
  Transcript features:1 exons:11, introns:10, cds:1 boxes:0 exon_boxes:11 intron_boxes:20 cds_boxes:1
Transcript: Context children: 0, canvas children: 0
Genscan: Context children: 8, canvas children: 8
  Transcript features:4 exons:22, introns:18, cds:4 boxes:0 exon_boxes:22 intron_boxes:36 cds_boxes:4
Halfwise: Context children: 16, canvas children: 16
  Transcript features:8 exons:30, introns:22, cds:8 boxes:0 exon_boxes:30 intron_boxes:44 cds_boxes:8
Genomic_canonical: Context children: 100, canvas children: 100
  Basic features:50 boxes:50
Novel_CDS: Context children: 0, canvas children: 0
Transcript: Context children: 2, canvas children: 2
  Transcript features:1 exons:20, introns:19, cds:0 boxes:0 exon_boxes:20 intron_boxes:38 cds_boxes:0
Genscan: Context children: 8, canvas children: 8
  Transcript features:4 exons:30, introns:26, cds:4 boxes:0 exon_boxes:30 intron_boxes:52 cds_boxes:4
Halfwise: Context children: 18, canvas children: 18
  Transcript features:9 exons:44, introns:35, cds:9 boxes:0 exon_boxes:44 intron_boxes:70 cds_boxes:9
3 Frame Translation: Context children: 0, canvas children: 0
trf: Context children: 2890, canvas children: 2890
  Basic features:126 boxes:126
  Alignment features:144 gapped:0 not perfect gapped:0 ungapped:144 boxes:144 gapped boxes:0 ungapped boxes:144 gapped boxes not drawn:0
  Alignment features:698 gapped:0 not perfect gapped:0 ungapped:698 boxes:698 gapped boxes:0 ungapped boxes:698 gapped boxes not drawn:0
  Alignment features:477 gapped:0 not perfect gapped:0 ungapped:477 boxes:477 gapped boxes:0 ungapped boxes:477 gapped boxes not drawn:0
CpG: Context children: 26, canvas children: 26
  Basic features:5 boxes:5
  Basic features:8 boxes:8
GF_coding_seg: Context children: 0, canvas children: 0
GF_ATG: Context children: 0, canvas children: 0
GF_splice: Context children: 0, canvas children: 0
SwissProt: Context children: 23884, canvas children: 23884
  Alignment features:11942 gapped:0 not perfect gapped:0 ungapped:11942 boxes:11942 gapped boxes:0 ungapped boxes:11942 gapped boxes not drawn:0
TrEMBL: Context children: 402232, canvas children: 402232
  Alignment features:201116 gapped:0 not perfect gapped:0 ungapped:201116 boxes:201116 gapped boxes:0 ungapped boxes:201116 gapped boxes not drawn:0
EST_Human: Context children: 2360, canvas children: 2360
  Alignment features:1180 gapped:0 not perfect gapped:0 ungapped:1180 boxes:1180 gapped boxes:0 ungapped boxes:1180 gapped boxes not drawn:0
EST_Mouse: Context children: 1278, canvas children: 1278
  Alignment features:639 gapped:0 not perfect gapped:0 ungapped:639 boxes:639 gapped boxes:0 ungapped boxes:639 gapped boxes not drawn:0
EST_Pig: Context children: 6920, canvas children: 6920
  Alignment features:3460 gapped:0 not perfect gapped:0 ungapped:3460 boxes:3460 gapped boxes:0 ungapped boxes:3460 gapped boxes not drawn:0
EST_Other: Context children: 4806, canvas children: 4806
  Alignment features:2403 gapped:0 not perfect gapped:0 ungapped:2403 boxes:2403 gapped boxes:0 ungapped boxes:2403 gapped boxes not drawn:0
vertebrate_mRNA: Context children: 5988, canvas children: 5988
  Alignment features:2994 gapped:0 not perfect gapped:0 ungapped:2994 boxes:2994 gapped boxes:0 ungapped boxes:2994 gapped boxes not drawn:0
Saturated_SwissProt: Context children: 258, canvas children: 258
  Basic features:129 boxes:129
Saturated_TrEMBL: Context children: 3292, canvas children: 3292
  Basic features:1646 boxes:1646
Saturated_EST_Human: Context children: 280, canvas children: 280
  Basic features:140 boxes:140
Saturated_EST_Mouse: Context children: 186, canvas children: 186
  Basic features:93 boxes:93
Saturated_EST_Pig: Context children: 1166, canvas children: 1166
  Basic features:583 boxes:583
Saturated_EST_Other: Context children: 796, canvas children: 796
  Basic features:398 boxes:398
Saturated_vertebrate_mRNA: Context children: 560, canvas children: 560
  Basic features:280 boxes:280
DNA: Context children: 1, canvas children: 1
  Basic features:0 boxes:0
Locus: Context children: 12, canvas children: 12
  Basic features:6 boxes:6
There appear to be approximately 3x as many canvas groups as there are features, which is odd. These statistics are incremented when the item factory is run, once per canvas item/feature displayed.
Start 114.975 Merge Context
Stop 115.100 Merge Context
Start 115.102 DrawBlock
Stop 115.103 DrawBlock
...
Start 116.611 DrawFeatureSet trembl
Start 116.611 DrawFeatureSet ProcessFeature
Stop 134.304 DrawFeatureSet ProcessFeature
Start 134.304 DrawFeatureSet Bump
Stop 134.304 DrawFeatureSet Bump
Start 134.304 DrawFeatureSet SetState
Stop 149.438 SetVis true
Stop 149.438 DrawFeatureSet SetVis
Stop 149.438 DrawFeatureSet SetState
Stop 149.446 DrawFeatureSet trembl
...
Stop 154.382 DrawFeatureSet SetVis
Stop 154.382 DrawFeatureSet SetState
Stop 154.382 DrawFeatureSet saturated_est_mouse
expose complete: 0 items picked, 681258 groups drawn
As we can see from this the picture is more complicated for the initial canvas display. There is a process known as updating that appears to set the extents of all canvas groups and this is done via an idle callback if needed, but also triggered by the expose handler if it is pending. In this example it is done before the expose. Note that there are two calls - one for the navigator and one for the main canvas.
Start 0.132 canvas_expose draw
Stop 0.132 canvas_expose draw
expose complete: 0 items picked, 1 groups drawn
Start 0.132 canvas_expose draw
Stop 0.132 canvas_expose draw
expose complete: 0 items picked, 4 groups drawn
Start 0.231 do_update
Stop 0.231 do_update
Start 92.876 Merge Context
Stop 92.999 Merge Context
Start 93.001 DrawBlock
Stop 93.002 DrawBlock
Start 94.622 DrawFeatureSet trembl
Start 94.622 DrawFeatureSet ProcessFeature
Stop 112.091 DrawFeatureSet ProcessFeature
Start 112.091 DrawFeatureSet Bump
Stop 112.091 DrawFeatureSet Bump
Start 112.091 DrawFeatureSet SetState
Stop 130.262 SetVis true
Stop 130.269 DrawFeatureSet SetVis
Stop 130.269 DrawFeatureSet SetState
Stop 130.276 DrawFeatureSet trembl
Start 134.888 do_update
Stop 140.436 do_udate
Start 140.505 do_update
Stop 140.505 do_udate
Start 140.751 canvas_expose draw
Stop 170.883 canvas_expose draw
expose complete: 0 items picked, 681258 groups drawn
Start 170.883 canvas_expose draw
Stop 170.914 canvas_expose draw
expose complete: 0 items picked, 4 groups drawn
Removing the canvas item creation from the display code results in 270ms being needed to drive everything else for the Trembl column, and we could lose 40 seconds by handling all this in the ZMap code if we chose to draw features from the context of a column container object.
This was done by hiding all columns except Trembl, placing the columns dialog over the Trembl column and then minimising it. The process was then repeated, exposing only a small number of pixel rows.
Start 556.541 canvas_expose draw
Stop 579.675 canvas_expose draw
expose complete: 0 items picked, 603359 groups drawn
Start 579.701 canvas_expose draw
Stop 580.297 canvas_expose draw
expose complete: 0 items picked, 12626 groups drawn

Start 937.394 canvas_expose draw
Stop 937.420 canvas_expose draw
expose complete: 0 items picked, 26 groups drawn
Start 937.454 canvas_expose draw
Stop 937.483 canvas_expose draw
expose complete: 0 items picked, 146 groups drawn
Start 937.485 canvas_expose draw
Stop 937.485 canvas_expose draw
expose complete: 0 items picked, 1 groups drawn
Start 937.506 canvas_expose draw
Stop 937.506 canvas_expose draw
expose complete: 0 items picked, 1 groups drawn

Start 370.945 canvas_expose draw
Stop 370.971 canvas_expose draw
expose complete: 0 items picked, 11 groups drawn
Start 370.996 canvas_expose draw
Stop 371.197 canvas_expose draw
expose complete: 0 items picked, 6167 groups drawn
Start 371.200 canvas_expose draw
Stop 371.200 canvas_expose draw
expose complete: 0 items picked, 1 groups drawn
Start 406.691 canvas_expose draw
Stop 406.691 canvas_expose draw
expose complete: 0 items picked, 1 groups drawn
So this gives us:
In rough terms the above stats suggest that displaying data from the feature context via the foo canvas takes approx 12 seconds per 100k features ((36 + 36) seconds / 600k). 25% of this is through adding data to the foo canvas, 25% through setting show/hide status, 8% doing a foo-update and 42% drawing the data via GDK.
This should save 25%. We can also optimise the DrawFeatures code to not draw columns that are not expected to be visible, which should save more time for the initial display but require extra time if columns are shown later. This should be acceptable for user controlled show/hide, but may be irritating for columns configured to be hidden at certain zoom levels. Note that it is only relevant to optimise columns with large amounts of data.
For the initial display we can use column summarise (same picture) or specific summary styles (heatmaps/graphs) for low zoom + high volume columns. This allows a maximum time needed to be determined, as for each column there is never a need to display more features than there are vertical pixels. If we assume 40 columns and a display 1000 pixels tall, that gives us a worst case of 40k features, which would take about 5 seconds, but in practice we would expect this to be much faster as most columns will not have that much data (in our sample data we have 8 columns with more than 1000 features). It may be easiest to control this via styles config.
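The percentages can be checked against the second timing log above. The sketch below is only arithmetic on the Start/Stop times copied from that log; the phase labels are descriptive and the 600k group count is the rounded figure used in the text, not a new measurement.

```python
# (start, stop) times in seconds, copied from the second timing log above
phases = {
    "add items to foo canvas (ProcessFeature)": (94.622, 112.091),
    "set show/hide state (SetState)": (112.091, 130.269),
    "foo canvas update (do_update)": (134.888, 140.436),
    "draw via GDK (canvas_expose)": (140.751, 170.883),
}

durations = {name: stop - start for name, (start, stop) in phases.items()}
total = sum(durations.values())
shares = {name: 100.0 * seconds / total for name, seconds in durations.items()}

for name, seconds in durations.items():
    print(f"{name:45s} {seconds:7.3f} s  {shares[name]:5.1f}%")

GROUPS = 600_000  # rounded group count used in the text
per_100k = total / GROUPS * 100_000
print(f"total {total:.3f} s, about {per_100k:.1f} s per 100k")
```

The four phases sum to a little over 71 seconds, which the text rounds to 36 + 36 = 72. The share for adding items comes out slightly under the quoted 25%.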
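As an illustration of why a column never needs more features than it has vertical pixels, here is a minimal sketch of a "same picture" summarise step. It is not the ZMap implementation: `summarise` and `painted_rows` are hypothetical names, features are assumed to be (start, end) pairs drawn as opaque boxes of one colour in an unbumped column, and the feature data is randomly generated. A feature is kept only if it paints at least one pixel row that no kept feature has already painted.

```python
import random


def pixel_rows(start, end, seq_start, seq_end, pixels):
    """Pixel rows touched by a feature, clamped to the column height."""
    scale = pixels / (seq_end - seq_start + 1)
    top = max(0, int((start - seq_start) * scale))
    bottom = min(pixels - 1, int((end - seq_start) * scale))
    return range(top, bottom + 1)


def summarise(features, seq_start, seq_end, pixels):
    """Keep only features that paint at least one not-yet-painted row."""
    covered = [False] * pixels
    kept = []
    for start, end in features:
        rows = pixel_rows(start, end, seq_start, seq_end, pixels)
        if not all(covered[r] for r in rows):
            kept.append((start, end))
            for r in rows:
                covered[r] = True
    return kept


def painted_rows(features, seq_start, seq_end, pixels):
    """Set of rows painted by a list of features (the resulting picture)."""
    rows = set()
    for start, end in features:
        rows.update(pixel_rows(start, end, seq_start, seq_end, pixels))
    return rows


random.seed(1)
SEQ_START, SEQ_END, PIXELS = 1, 770_000, 1000
features = []
for _ in range(100_000):
    s = random.randint(SEQ_START, 767_000)
    features.append((s, s + random.randint(50, 2000)))

kept = summarise(features, SEQ_START, SEQ_END, PIXELS)
print(len(features), "features reduced to", len(kept))
```

Each kept feature claims at least one new row, so the result can never exceed the pixel height of the column. Features with different colours or scores would need a per-colour or per-score version of the same test.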
Note that this approach has some downside:
If we assume that for the initial Zmap display we only have to display 20k features, then we require only approx 2.5 seconds using existing foo canvas technology, no matter how many features some columns have.
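The estimates above follow directly from the measured rate of roughly 12 seconds per 100k features. The short sketch below just reproduces that arithmetic; `display_seconds` is a hypothetical helper and the rate is assumed to scale linearly, which the measurements only roughly support.

```python
SECONDS_PER_100K = 12.0  # approximate measured rate quoted in the text


def display_seconds(n_features, rate=SECONDS_PER_100K):
    """Estimated display time, assuming cost is linear in feature count."""
    return n_features * rate / 100_000


columns, pixels = 40, 1000
worst_case = display_seconds(columns * pixels)  # every column full height
assumed = display_seconds(20_000)               # the 20k assumption above

print(f"worst case: {worst_case:.1f} s, assumed initial display: {assumed:.1f} s")
```

The results land just under the rounded figures in the text (about 5 seconds and about 2.5 seconds).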
Is this possible or advisable? Let's guess that a 5% improvement is possible; further investigation is needed.
If we construct our own image in a pixmap (or several) and display these via the foo canvas then there will be no noticeable performance issues from the foo canvas (assuming that pixmaps will operate efficiently) as the number of display items will be small (eg less than 100).
This would allow us to remove almost entirely the step of adding items to the foo canvas as all we need to do is to create a mapping from feature context to pixmap - this will be equivalent to the FToIhash operations already in place. The draw process would then consist of writing to the pixmap and triggering a GDK paint. Note that this implicitly treats the display as a representation of the data and the concept of searching the canvas is not relevant.
This would give 24% from not operating the foo canvas except for minimal numbers of items, and 8% from not operating the update process. Gains in drawing should also be possible: a simple test using a glyph drawing function shows that displaying 10k items to a gdk_drawable takes 150ms for background and 120ms for outline. If we have 600k items to display that would take approx 15 seconds, which is half the time spent currently. Obviously with an off screen drawable (pixmap) there can be less interaction with X and items can be drawn as a batch. Note that if we adopt the policy of only displaying what needs to be displayed it is unlikely that we will have to display significantly more than 10 items and it should be possible to engineer a ZMap where all display operations take less than 1 second, even without resorting to off-screen pixmaps.
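The drawing estimate can be reproduced from the glyph test figures. This is only a linear extrapolation of the two quoted timings; `draw_seconds` is a hypothetical helper, and it assumes background and outline are both drawn for every item.

```python
BACKGROUND_MS_PER_10K = 150  # from the glyph drawing test
OUTLINE_MS_PER_10K = 120     # from the glyph drawing test


def draw_seconds(n_items):
    """Linear extrapolation of the per-10k drawing cost, in seconds."""
    per_item_ms = (BACKGROUND_MS_PER_10K + OUTLINE_MS_PER_10K) / 10_000
    return n_items * per_item_ms / 1000


full = draw_seconds(600_000)
small = draw_seconds(10_000)
print(f"600k items: {full:.1f} s, 10k items: {small:.2f} s")
```

Summing both passes gives a little over 16 seconds for 600k items, in line with the rough 15 second figure and about half of the 30 seconds measured for the expose. Ten thousand items come in well under a second.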
It should also be possible to lose the need for long items code.
NOTE that if we are to avoid continually redrawing features whenever another column is displayed then we must implement something like this. See here for a discussion of how to do this with minimum effort.
This (or ZMap supplied pixmaps) is essential to allow efficient scrolling. It is also essential to allow columns to be hidden or shown or moved without repainting half the screen: either we have one foo canvas per column or one pixmap per column.
Using the well known graphics technique of displaying a current view and having adjacent views already prepared (eg like google maps) we can provide smooth scrolling using pixmaps without having to display all the data at high zoom. It would be desirable/necessary to implement a display thread to paint to a cache of pixmaps, but this would unlink control and view and provide a more comfortable user experience. Note that we move much of the drawing to idle CPU time, as we can paint adjacent regions while the user is doing something else, and the appearance will be of a much faster operation.
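A minimal sketch of the "current view plus prepared neighbours" idea follows. It is not ZMap code and uses no GDK calls: `TileCache`, `visible`, `prefetch` and the `render` callback are hypothetical names, and the returned "tiles" are plain strings standing in for pixmaps. Prefetching would be called from idle time or a display thread.

```python
from collections import OrderedDict


class TileCache:
    """Small LRU cache of rendered tiles, indexed by vertical tile number."""

    def __init__(self, render, tile_height, max_tiles=8):
        self.render = render          # hypothetical callback: index -> tile
        self.tile_height = tile_height
        self.max_tiles = max_tiles
        self.tiles = OrderedDict()

    def _span(self, scroll_y, view_height):
        first = scroll_y // self.tile_height
        last = (scroll_y + view_height - 1) // self.tile_height
        return first, last

    def _get(self, index):
        if index in self.tiles:
            self.tiles.move_to_end(index)      # mark as recently used
        else:
            self.tiles[index] = self.render(index)
            while len(self.tiles) > self.max_tiles:
                self.tiles.popitem(last=False)  # evict least recently used
        return self.tiles[index]

    def visible(self, scroll_y, view_height):
        """Tiles needed to show the current view, rendering any missing."""
        first, last = self._span(scroll_y, view_height)
        return [self._get(i) for i in range(first, last + 1)]

    def prefetch(self, scroll_y, view_height):
        """Render the tiles just above and below the view (idle time work)."""
        first, last = self._span(scroll_y, view_height)
        for index in (first - 1, last + 1):
            if index >= 0:
                self._get(index)


rendered = []


def render(index):
    rendered.append(index)  # record which tiles were actually painted
    return f"tile{index}"


cache = TileCache(render, tile_height=500)
cache.visible(1000, 800)          # paints tiles 2 and 3
cache.prefetch(1000, 800)         # paints tiles 1 and 4 while idle
view = cache.visible(1500, 800)   # scroll down: tiles 3 and 4 already cached
```

After the prefetch, scrolling by one tile needs no new painting, which is where the impression of faster operation comes from. A real cache would also need to invalidate tiles on zoom, revcomp or style changes.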
This strategy would allow us to paint only a subset of the data at high zoom and would therefore speed up zoom and revcomp considerably.
Investigations reveal that with approx 230k features we end up with 680k foo canvas groups and there is clearly something to explain.
Currently, to display 100k canvas items requires 12 seconds. By setting column state appropriately we can reduce this to 9 seconds.
If we create and display our own pixmaps instead of displaying features direct on the foo canvas then this gains us another 3 seconds. Tests need to be made to determine if using pixmaps is a workable strategy, and what performance gains we can make in the drawing process.
By displaying summary data rather than all the features for high volume columns we can set a practical limit (related to the number of columns with more than 1000 features) of approx 2.5 seconds (see above for assumptions), and with the improvements above this would be halved (1.25 seconds). Compared with 72 seconds this is a speed up of about 57x. This would require little change to the existing canvas operation but would require new styles (as already being designed for heatmaps etc) and sundry changes to some ZMap code.
With data loaded via pipes the initial 2 minutes of 'Data Loading' will effectively disappear and many columns will be displayed very quickly, with Trembl still taking 2 minutes to arrive. Note that to avoid subsequent long delays on requesting other columns it is essential to prevent repainting of high volume columns, either by using summary data or by implementing pixmaps per column.
Currently all or as much of the feature context as is possible is displayed whenever a zoom (or revcomp etc) is selected. We could opt to set the canvas size to the full size and only paint around the visible region, which would reduce drawing time significantly, depending on the zoom level.
At high zooms summary data is not possible or useful and we cannot achieve any gains from this. However, as we know that the number of visible features is small, we could operate a paint on demand policy and provide usable worst case performance. Tests on a smaller data set show 120k groups painted in 1.7 seconds.
The foo canvas has support for pixbufs, which appear to be static images. A cursory scan of the GDK pixbuf documentation gives the impression these are not much used these days, and they are authored by the same person as the foo canvas. There is a library which is 11 years old.
These are Drawable objects and can be treated as off screen windows and look like a much better option - existing code can be used to draw on them. Note that they are already deprecated and we are advised to use Cairo instead, but given that that applies to most of the foo canvas this is hardly an issue. (NB we would not be able to upgrade to GTK 3).
Without changing the overall structure of the ZMap code it should be possible to create a pixmap per column and have existing code paint features on it. We would have to add in an extra layer to place pixmaps onto the canvas, but then scrolling should be instant and re-ordering and moving columns would not require features to be repainted, which will become important very soon when pipe servers make it out into the real world.
Here we try to preserve existing code and data with the aim of inserting pixmaps with the minimum code written/changed.
These are drawn as foo canvas items and require a redraw of the whole region when changed: ie they are not blitted. Without changing code if we operate pixmaps as the display technology this will be significantly faster.
Another way to speed up the ruler would be to implement a 1-pixel deep tooltip (if possible), assuming that this would be displayed by X with bit-blitting. (if not then no performance improvement).
Currently when a column is selected the canvas items are sorted and a selected feature is highlit a) by changing the colour and b) by moving it to the start of the column's list of features. When unhighlit it is moved back into place, but with multiple highlights this is a problem. Highlit items can be displayed on top of pixmaps in the window's foo canvas and this provides a way to add or remove highlight without affecting the order of features in the canvas. These features would be flagged to not respond to mouse events and pass them on to the underlying pixmap/foo canvas combo.
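The overlay idea can be sketched as follows. This is a toy model rather than ZMap code: `Column`, `toggle_highlight` and `draw_order` are hypothetical names, and features are plain strings. Highlighted features are tracked in a separate set and drawn in a second pass, so the underlying feature order is never touched.

```python
class Column:
    """Toy column: base features keep their order, highlights are an overlay."""

    def __init__(self, features):
        self.features = list(features)  # base draw order, never reordered
        self.highlighted = set()        # overlay membership only

    def toggle_highlight(self, feature):
        if feature in self.highlighted:
            self.highlighted.discard(feature)
        else:
            self.highlighted.add(feature)

    def draw_order(self):
        """Base pass first, then highlighted copies drawn on top."""
        base = [(f, "normal") for f in self.features]
        overlay = [(f, "highlight") for f in self.features if f in self.highlighted]
        return base + overlay


column = Column(["f1", "f2", "f3", "f4"])
column.toggle_highlight("f3")
column.toggle_highlight("f1")
with_highlights = column.draw_order()
column.toggle_highlight("f3")
column.toggle_highlight("f1")
after_clear = column.draw_order()
```

Because nothing is moved, removing several highlights in any order leaves the column exactly as it was, which is the case the current move-to-front scheme handles badly.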