Self-self blast (farm_blast bsubs itself)

    farm_blast reference.fasta reference.fasta
    act reference.fasta Farm_blast.reference.fasta.reference.fasta.blast_plus.blastn.out/blast.out.gz reference.fasta &

Right-click -> ignore self matches
Look for sequence match start to end (= overlap). Overlap of 6840bp.

Index the fasta

    samtools faidx reference.fasta

reference.fasta.fai file looks like this:

    unitig_0 6402387 10 60 61

Trim overlap from start of sequence

    samtools faidx reference.fasta 'unitig_0':6841-6402387 > trim.fa

Find dnaA, using automated annotation

    annotationfind -t sample -i 3187STDY5863006 -g dnaA -n

Use the --act flag in farm_blast to make sure all sequences are compared

    farm_blast --act output.dnaA.fa trim.fa

dnaA sits at 5950817

Index the trim.fa file

    samtools faidx trim.fa

trim.fa.fai file looks like this:

    unitig_0|trim 6395547 15 60 61

Break the sequence at dnaA and write the two pieces to new_break.fa

    samtools faidx trim.fa 'unitig_0|trim':5950817-6395547 > new_break.fa
    samtools faidx trim.fa 'unitig_0|trim':1-5950816 >> new_break.fa

Use text editor to remove second 'contig'.
Open in Artemis and re-write out all bases, to be sure it's in proper fasta format!

Find raw data with pathfind, and symlink directly to this folder

    pathfind -t sample -i 3187STDY5863006 -f pacbio -l .

Re-map raw data back to this new reference, into output folder named reseq

    pacbio_smrtanalysis --memory 30 --reference new_break.fa RS_Resequencing reseq *.bax.h5

Check join (at 6395547-5950817 = 444730) by looking at reseq/aligned_reads.bam over the reference sequence new_break.fa

Join looks fine, moving on to annotation (for which you use reseq/consensus.fasta)

    bsub -M4000 -R "select[mem>4000] rusage[mem=4000]" 'annotate_bacteria -a consensus.fasta --sample_name PAK --genus Pseudomonas'
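
The coordinates in the steps above all follow from three numbers: the contig length from the .fai file, the overlap length, and the dnaA start position. As a cross-check, the arithmetic can be sketched in plain shell. This is a minimal sketch only: the function names are hypothetical helpers made up for illustration, not part of samtools or farm_blast, and they assume 1-based inclusive coordinates as used by samtools faidx regions.

```shell
#!/bin/sh
# Sketch only: hypothetical helper functions (not samtools or farm_blast
# commands) that reproduce the coordinate arithmetic from the notes above.

# Region left after trimming an overlap of $3 bp from the start of
# contig $1, whose length $2 is the second column of the .fai file.
trim_region() {
    echo "$1:$(( $3 + 1 ))-$2"
}

# Contig length after trimming: original length $1 minus overlap $2.
trimmed_length() {
    echo $(( $1 - $2 ))
}

# The two regions, in output order, that rotate contig $1 (length $2)
# so that it starts at position $3 (here, the dnaA start).
rotate_regions() {
    echo "$1:$3-$2"
    echo "$1:1-$(( $3 - 1 ))"
}

# Number of bases in the first rotated piece, counting both end
# positions (contig length $1, new start $2).
first_piece_length() {
    echo $(( $1 - $2 + 1 ))
}

trim_region unitig_0 6402387 6840               # unitig_0:6841-6402387
trimmed_length 6402387 6840                     # 6395547
rotate_regions 'unitig_0|trim' 6395547 5950817  # unitig_0|trim:5950817-6395547
                                                # unitig_0|trim:1-5950816
first_piece_length 6395547 5950817              # 444731
```

The trimmed length agrees with the 6395547 reported in the trim.fa index. With inclusive counting the first piece is 444731 bases long, one more than the plain difference of 444730 quoted above, so when checking the join in reseq/aligned_reads.bam it is probably safest to inspect a window around that position rather than a single base.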