ACEDB Version 4_9
Guide To Blixem
Originally written by
Ed Griffiths <edgrif@sanger.ac.uk>, August 2009
Input File Formats
To do: set out the extended format and the normal format first; the rest can be added later. Lean on GFFv3 where possible, and check how GFFv3 handles strings, in particular whether semicolons must be escaped.
exblx and seqbl
FILL THIS IN....
exblx_x and seqbl_x
These are extended versions of the exblx and seqbl formats which add:
- Strand field for match sequence.
- Gaps data for gapped alignments.
- Tag/value format to allow addition of attributes in a flexible way.
These changes are needed so that Blixem can display the more extensive match data now available.
The format is line based and all lines are either comments or valid feature lines. Comment lines start with a "#" and the first line of the file must be either "# exblx_x" or "# seqbl_x" to specify the format.
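The header and comment rules above can be sketched in a few lines of Python. This is only an illustration, not Blixem's own parser: the function names `read_format` and `feature_lines` are hypothetical, and the sketch assumes the file has already been read into a list of lines.

```python
VALID_HEADERS = ("# exblx_x", "# seqbl_x")


def read_format(lines):
    """Return 'exblx_x' or 'seqbl_x' as declared on the first line."""
    if not lines:
        raise ValueError("empty file: no format header")
    first = lines[0].rstrip("\r\n")
    if first not in VALID_HEADERS:
        raise ValueError("first line must be '# exblx_x' or '# seqbl_x'")
    # Drop the leading "# " to leave just the format name.
    return first[2:]


def feature_lines(lines):
    """Return every non-comment line after the header, without line endings."""
    return [ln.rstrip("\r\n") for ln in lines[1:] if not ln.startswith("#")]
```

Given a file whose first line is "# seqbl_x", `read_format` would return the string "seqbl_x", and `feature_lines` would return only the lines that do not start with "#".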
The feature lines are column based and the format borrows from the GFF version 3 spec, in particular:
- The format consists of 9 columns, separated by tabs (NOT spaces).
- The following characters must be escaped using URL escaping conventions (%XX hex codes):
  - tab
  - newline
  - carriage return
  - control characters
- The following characters have reserved meanings and must be escaped when used in other contexts:
  - ; (semicolon)
  - = (equals)
  - % (percent)
  - & (ampersand)
  - , (comma)
- Unescaped quotation marks, backslashes and other ad-hoc escaping conventions that have been added to the GFF format are explicitly forbidden.
- Note that unescaped spaces are allowed within fields, meaning that parsers must split on tabs, not spaces.
- Undefined fields are replaced with the "." character, as described in the original GFF spec.
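As a rough illustration of the escaping rules listed above, the following Python sketch escapes a literal value before it is written into a field and reverses the escaping on reading. The names `escape_field` and `unescape_field` are hypothetical. The sketch assumes "control characters" means ASCII codes below 0x20 plus 0x7F, and it escapes the reserved characters unconditionally, which is appropriate only when they appear as literal text rather than with their reserved meaning.

```python
from urllib.parse import unquote

# Characters with reserved meanings, escaped when used as literal text.
RESERVED = set(";=%&,")


def escape_field(text):
    """Replace control and reserved characters with %XX hex codes."""
    out = []
    for ch in text:
        code = ord(ch)
        # Assumption: control characters are ASCII < 0x20 and DEL (0x7F);
        # this range also covers tab, newline and carriage return.
        if ch in RESERVED or code < 0x20 or code == 0x7F:
            out.append("%{:02X}".format(code))
        else:
            out.append(ch)  # spaces and ordinary characters pass through
    return "".join(out)


def unescape_field(text):
    """Decode %XX hex codes back to the original characters."""
    return unquote(text)
```

Spaces are deliberately left alone, since the format allows unescaped spaces within fields.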
The 9 columns are:
score reference_strand_frame reference_start reference_end match_start match_end match_strand match_name attributes
The attributes column consists of a list of feature attributes in the format "tag values". Multiple "tag values" pairs are separated by semicolons. URL escaping rules are used for tags or values containing any of the following characters: "," "=" ";". Spaces are allowed in this field, but tabs must be replaced with the %09 URL escape.
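A feature line laid out as described could be split into its nine columns along the following lines. This is a sketch under stated assumptions rather than the actual Blixem code: `Feature`, `parse_attributes` and `parse_feature_line` are hypothetical names, column values are kept as strings because the column types are not specified here, and each attribute is assumed to be a tag followed by a single space and then its value.

```python
from collections import namedtuple
from urllib.parse import unquote

Feature = namedtuple("Feature", [
    "score", "reference_strand_frame", "reference_start", "reference_end",
    "match_start", "match_end", "match_strand", "match_name", "attributes",
])


def parse_attributes(column):
    """Turn 'Tag value ; Tag value ;' into a dict of tag -> value."""
    attrs = {}
    if column == ".":  # undefined field
        return attrs
    for pair in column.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip the empty piece after a trailing semicolon
        tag, _, value = pair.partition(" ")
        attrs[unquote(tag)] = unquote(value.strip())
    return attrs


def parse_feature_line(line):
    """Split one feature line on tabs (not spaces) into a Feature."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    # "." marks an undefined field; represent it as None.
    values = [None if f == "." else f for f in fields[:8]]
    return Feature(*values, parse_attributes(fields[8]))
```

Splitting on the tab character alone keeps names such as "match one" intact, and decoding happens only after the attribute string has been split on semicolons, so an escaped %3B inside a value is not mistaken for a separator.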
For exblx_x the attributes are:
[gaps data] [match_description]
and for seqbl_x:
[gaps data] [match_sequence]
The format of these attributes is:
"Gaps [ref_start ref_end match_start match_end]+ ;"
"Description the sequence description ;"
"Sequence aaagggtttttcccccc ;"
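To make the Gaps attribute concrete, the sketch below turns its value into a list of aligned blocks. It assumes the square brackets and "+" in the format above are notation for "one or more groups of four integers" and do not appear literally in the file, and that the value has already been separated from the "Gaps" tag and the trailing semicolon. The name `parse_gaps` is hypothetical.

```python
def parse_gaps(value):
    """Parse 'r_start r_end m_start m_end ...' into a list of 4-tuples."""
    numbers = [int(token) for token in value.split()]
    # The format requires at least one complete group of four coordinates.
    if not numbers or len(numbers) % 4 != 0:
        raise ValueError("Gaps data must be one or more groups of 4 integers")
    return [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
```

Each tuple holds (ref_start, ref_end, match_start, match_end) for one ungapped block of the alignment, in the order the blocks appear in the attribute.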
Ed Griffiths <edgrif@sanger.ac.uk> Last modified: Tue Aug 18 17:09:15 BST 2009