This article describes how to use the Find Heterozygotes pipeline under the Pre-processing options. Find Heterozygotes can also be chosen as an option when running Batch Assemble Sanger Sequences.
What is heterogeneity?
Heterogeneity in base sequences occurs when a single position within a trace contains more than one peak. Base callers generally either report such a position as an 'N' or call the base from the highest observed peak. These heterozygous bases can be identified and annotated as required with the Find Heterozygotes and Batch Assemble Sanger Sequences operations.
Find Heterozygotes without Assembly
To identify and annotate heterozygous bases without prior sequence assembly, (1) select your sequences, (2) click Pre-processing and (3) select Find Heterozygotes in the dropdown.
To annotate or modify heterozygous bases according to the selected peak similarity, select either:
- Heterozygous bases will be annotated and not changed
Change bases to ambiguities
- Heterozygous bases will be changed to ambiguous bases accordingly.
You can set the peak similarity percentage to suit your data. A heterozygous base is called when the alternative peak is at least as high as the set percentage of the best peak. As the peak similarity percentage is increased, fewer bases are called as ambiguous.
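The peak similarity rule can be sketched in a few lines of Python. This is only an illustration of the rule as described in this article, not the application's own implementation: the function name `call_base`, the dictionary of peak heights and the example values are all hypothetical. The two-base ambiguity codes are the standard IUPAC ones.

```python
# Standard IUPAC codes for two-base ambiguities.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def call_base(peaks, similarity_percent):
    """Return the call for one trace position (hypothetical helper).

    peaks: dict mapping 'A', 'C', 'G', 'T' to peak heights (assumed input format).
    similarity_percent: peak similarity threshold between 0 and 100.
    """
    # Rank the four peaks from highest to lowest.
    ranked = sorted(peaks.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best_height), (alt_base, alt_height) = ranked[0], ranked[1]
    # Heterozygous if the alternative peak reaches the set percentage of the best peak.
    if best_height > 0 and alt_height >= best_height * similarity_percent / 100:
        return IUPAC_TWO_BASE[frozenset((best_base, alt_base))]
    return best_base


# Made-up peak heights: the G peak is 62% as high as the A peak.
example_peaks = {"A": 1000, "C": 20, "G": 620, "T": 15}
call_at_50 = call_base(example_peaks, 50)  # G peak passes the 50% threshold
call_at_70 = call_base(example_peaks, 70)  # G peak fails the 70% threshold
```

With these made-up values the position is reported as the ambiguity R (A or G) at 50% and as a plain A at 70%, which is why raising the percentage reduces the number of ambiguous bases.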
Find Heterozygotes during Batch Assembly
To identify and change heterozygous bases in consensus sequences, (1) select the sequences to be assembled, (2) click Pre-processing and (3) select Batch Assemble Sanger Sequences in the dropdown.
To change heterozygous bases in the consensus sequences, enter your preferred value in the Consensus: call Sanger heterozygotes option. When a Sanger trace has an alternative peak that is at least as high as this percentage of the best peak, that sequence will contribute a heterozygous call to the consensus calculation. The higher the percentage, the lower the chance of an ambiguous base being called.
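As a rough picture of how a per-read heterozygous call can feed into a consensus, the sketch below lets each read contribute its best base, plus its alternative base when the alternative peak passes the threshold. How the contributions are then combined is an assumption made purely for illustration (a simple union of the contributed bases); the actual consensus calculation is not described in this article and may differ. All function names and peak values are hypothetical.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def read_contribution(peaks, het_percent):
    """Bases one read contributes at a column (hypothetical helper)."""
    ranked = sorted(peaks.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best_height), (alt_base, alt_height) = ranked[0], ranked[1]
    bases = {best_base}
    # The read contributes a heterozygous call if the alternative peak is tall enough.
    if best_height > 0 and alt_height >= best_height * het_percent / 100:
        bases.add(alt_base)
    return bases


def consensus_column(reads_peaks, het_percent):
    """ASSUMPTION: combine reads by taking the union of their contributions."""
    combined = set()
    for peaks in reads_peaks:
        combined |= read_contribution(peaks, het_percent)
    return IUPAC[frozenset(combined)]


# Made-up forward and reverse reads covering the same column.
forward = {"A": 900, "C": 10, "G": 540, "T": 5}   # G peak is 60% of the A peak
reverse = {"A": 800, "C": 12, "G": 30, "T": 8}    # no notable alternative peak
consensus_at_50 = consensus_column([forward, reverse], 50)
consensus_at_75 = consensus_column([forward, reverse], 75)
```

Under this assumption the column is called R at 50%, because the forward read contributes both A and G, and A at 75%, because neither read contributes a heterozygous call.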
Note that base calls with a quality score of at least 63 (i.e. those manually edited) will not be analysed for heterozygotes.
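The effect of this exclusion can be illustrated with a small filter. This is a minimal sketch assuming quality scores are available as a plain list of integers, one per base call; the function and constant names are hypothetical.

```python
# Quality score at or above which a base call is treated as manually edited
# (the value 63 is taken from the note above).
MANUAL_EDIT_QUALITY = 63


def positions_to_check(qualities):
    """Return the indices of base calls that would still be examined for heterozygotes."""
    return [i for i, q in enumerate(qualities) if q < MANUAL_EDIT_QUALITY]


# Made-up quality scores: positions 1 and 3 are at or above 63 and are skipped.
example_qualities = [40, 63, 12, 70, 62]
checked = positions_to_check(example_qualities)
```

Here `checked` holds positions 0, 2 and 4, so a manual edit at positions 1 or 3 would be left untouched.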