This article describes how to use the Find Heterozygotes pipeline under the Pre-processing options. Find Heterozygotes can also be chosen as an option when running Batch Assemble Sanger Sequences.
Heterogeneity in the base sequence occurs when a single position within a trace contains more than one peak. In general, base callers either list this position as an 'N' or call the base from the highest observed peak. These heterozygous bases can be identified and annotated with the Find Heterozygotes and Batch Assemble Sanger Sequences operations.
Find heterozygotes without assembly
To identify and annotate heterozygous bases without prior sequence assembly, (1) select your sequences, (2) click Pre-processing and (3) select Find Heterozygotes in the dropdown.
In the Find Heterozygotes dialog, select Annotate to annotate heterozygous bases, or Change bases to ambiguities to modify them, according to the selected peak similarity. With Annotate, the heterozygous bases are annotated but left unchanged; with Change bases to ambiguities, they are replaced with the corresponding ambiguous bases.
You can set the peak similarity percentage to suit your data: a heterozygous base will be called when the alternative peak is at least as high as the set percentage of the best peak. In the example below, the number of ambiguous bases decreases as the peak similarity increases.
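The peak similarity rule can be illustrated with a short Python sketch. This is not the software's actual implementation: the function name call_base, the peaks dictionary of per-base peak heights, and the restriction to two-base ambiguity codes are all assumptions made for illustration. It only shows how raising the percentage makes a heterozygous call less likely.

```python
# Standard IUPAC codes for two-base ambiguities.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def call_base(peaks, similarity_pct):
    """Return a base call for one trace position.

    peaks: hypothetical dict mapping each base (A, C, G, T) to its peak height.
    similarity_pct: peak similarity percentage (0-100).
    """
    # Rank the bases from the highest peak to the lowest.
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (best_base, best_height), (alt_base, alt_height) = ranked[0], ranked[1]

    # Heterozygous when the alternative peak is at least as high as
    # the set percentage of the best peak.
    if best_height > 0 and alt_height >= best_height * similarity_pct / 100:
        return IUPAC_TWO_BASE[frozenset((best_base, alt_base))]
    return best_base


position = {"A": 1000, "G": 600, "C": 20, "T": 10}
print(call_base(position, 50))  # alternative peak is 60% of the best peak
print(call_base(position, 70))  # same position, stricter threshold
```

With the peak heights above, the alternative G peak is 60% of the A peak, so the position is called as an ambiguity at 50% peak similarity and as a plain A at 70%.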
Find heterozygotes during Batch Assembly
To identify and change heterozygous bases in consensus sequences, (1) select the sequences to be assembled, (2) click Pre-processing and (3) select Batch Assemble Sanger Sequences in the dropdown.
To change heterozygous bases in the consensus sequences, enter your preferred value in the Consensus: call Sanger heterozygotes option. When a Sanger trace has an alternative peak that is at least as high as this percentage of the best peak, that sequence will contribute a heterozygous call to the consensus calculation. The higher the percentage, the lower the chance of an ambiguous base being called.
Note: base calls with a quality score of at least 63 (i.e. those manually edited) will not be analysed for heterozygotes.
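The quality score exclusion in the note above amounts to a simple filter. The sketch below is illustrative only: the function name positions_to_analyse and the list-of-integers input are hypothetical, and the only value taken from this article is the threshold of 63.

```python
# Threshold from the note above: calls at or above this quality
# (i.e. manually edited bases) are skipped.
MANUAL_EDIT_QUALITY = 63


def positions_to_analyse(qualities, threshold=MANUAL_EDIT_QUALITY):
    """Return the indices of base calls that would still be checked.

    qualities: hypothetical list of per-base quality scores.
    """
    return [i for i, quality in enumerate(qualities) if quality < threshold]


# The second and fourth bases have a quality of at least 63,
# so only indices 0 and 2 remain.
print(positions_to_analyse([20, 63, 40, 70]))
```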