This article outlines how to take a subset of your sequences from a Biologics Annotator Result document and re-run some of the analysis options on it. For example, you might wish to extract all sequences that meet a certain quality threshold into a new document, re-calculating the existing clusters and adding new ones. If you are unsure what clusters can tell you about your data, please see this article. If you would like to extract or export sequences to a downloadable file without re-running Antibody Annotation, see this article.
How to run Extract and Recluster
First, in the Sequences Table, select all the sequences that you would like to extract from a Biologics Annotator Result document. You may find the Filter function useful for narrowing down your results to the sequences you are interested in. Then click the Export/Extract button, as highlighted below, and select Extract and Recluster from the drop-down menu.
In the above example, a Filter has been applied to pull out all the sequences that are fully annotated and have IGHV4 identified as the closest germline Heavy V Gene. To learn how to apply more complex filters like this, see Filtering your Sequences.
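The logic of the filter in this example can be sketched in code. The following is only an illustration of the selection criteria and not how Biologics implements filtering: the rows, the field names (fully_annotated, heavy_v_gene) and the helper matches_filter are hypothetical and do not reflect the product's export schema.

```python
# Hypothetical rows standing in for entries of the Sequences Table.
# The field names are illustrative assumptions, not a real export format.
rows = [
    {"name": "seq1", "fully_annotated": True, "heavy_v_gene": "IGHV4-34"},
    {"name": "seq2", "fully_annotated": True, "heavy_v_gene": "IGHV3-23"},
    {"name": "seq3", "fully_annotated": False, "heavy_v_gene": "IGHV4-59"},
    {"name": "seq4", "fully_annotated": True, "heavy_v_gene": "IGHV4-39"},
]


def matches_filter(row, gene_family="IGHV4"):
    """Return True if the row is fully annotated and its Heavy V Gene is in gene_family."""
    gene = row["heavy_v_gene"]
    # Accept the family name itself or any member such as "IGHV4-34",
    # but not a different family that merely shares the prefix (e.g. "IGHV40-1").
    in_family = gene == gene_family or gene.startswith(gene_family + "-")
    return row["fully_annotated"] and in_family


# Names of the sequences that would remain after applying the filter.
selected = [row["name"] for row in rows if matches_filter(row)]
print(selected)  # ['seq1', 'seq4']
```

Both conditions must hold for a sequence to be kept, which is why seq3 (IGHV4 but not fully annotated) and seq2 (fully annotated but IGHV3) are excluded.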
The image below shows the Extract and Recluster options. The Clustering Options section allows you to define new clusters or remove clusters from the analysis. If you are unsure about what clusters are, please see this article. To learn more about how to set multi-region or percent threshold clusters, see Clustering Options. Note: All the other outputs from the initial Antibody Annotator result will be carried over (e.g. Protein Statistics and Annotate germline gene differences).
Select Run to start the analysis. This will create a new Biologics Annotator Result document with re-calculated Cluster Tables and Graphs. For tips on checking the quality of your results, see this article: Using graphs for quality assurance. Biologics also produces a number of graphs that help you visualize and find trends within individual clusters or regions (e.g. cluster diversity, HCDR3 lengths, Gene Combination Heat-maps and Amino Acid Distribution charts). To find out more about interpreting these cluster results, see this article: Using graphs to interpret clusters and clonotypes.