Single Cell Antibody Annotator is useful for anyone who is expecting to find one (or more) dominant heavy and light chains in each of their samples. This could be barcoded data where each barcode corresponds to a cell or a well, or fasta/fastq files where each file is expected to have a dominant chain or pair of chains.
See Understanding Single Cell technologies: Barcodes and UMIs if you are unsure what single cell analysis is.
There are three main ways to handle 10X and other barcoded/UMI sequence data in Geneious Biologics. The best path depends on how much preprocessing has already been done prior to upload, which will determine your entry point.
- Method 1 - Unprocessed - Pair, Collapse UMIs, Single Cell Antibody Annotator
- Method 2 - Partially pre-processed - Single Cell Antibody Analysis from Separate Lists
- Method 3 - Fully pre-processed - Antibody Annotator + Add Assay Data
- FAQ: Why do my results for Antibody Annotator and Single Cell Antibody Annotator look different?
Method 1 - Pair, Collapse UMIs, Single Cell Antibody Analysis
If you have a fastq/fasta (or other format) file of raw sequences that contain UMIs and/or barcodes and have not been demultiplexed, this is the method to use. NGS Tutorials 3 and 4 on our website follow a similar workflow, so they may be a good place to start: NGS Tutorial 3: Using Barcodes and UMIs and NGS Tutorial 4: Single Cell Antibody Analysis. The tutorial data is a small sampled dataset; the full dataset can be provided on request.
A typical workflow could look like:
- Set + Merge Paired Reads: Feed raw FASTQ files into 'Set and Merge Paired Reads'. You may wish to choose the Setting Paired Reads option if you are working with 10X data. The UMI/Barcode tool can handle paired, unmerged reads.
- UMI Collapse and Separate Barcodes: Feed paired (or merged) reads into Collapse UMI Duplicates and Separate Barcodes. This merges all similar sequences with the same UMI and groups sequences from the same barcode together. It is useful to leave the 'Trim UMIs/Barcodes' option off while experimenting, but you will want the UMIs and barcodes trimmed off for the next step. The barcode will be saved on each sequence as metadata. See Collapse UMI Duplicates and Separate Barcodes.
- Annotation: Feed the (trimmed) generated consensus sequences into Single Cell Antibody Annotator. Single Cell Antibody Annotator attempts to combine related sequences within a barcode so that you can see the dominant clones. You could also use Antibody Annotator if you wish, which will analyse each of the sequences without merging them. Both tools offer clonotype analysis (our cluster tables, see Understanding "Clusters").
- Discovery: Sequence Alignment can be performed directly from the results. You can choose to align translated regions of interest (e.g. HCDR3) and build trees.
- Export: Extract selected annotated candidate sequences, or export tables, graphs, and images for reporting or record-keeping purposes. See Exporting Annotated Sequences and Sequence Tables.
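To make the UMI collapsing and barcode separation step more concrete, here is a minimal Python sketch of the general idea. It is not the Geneious Biologics implementation. The read layout (barcode first, then UMI, then the insert), the lengths BARCODE_LEN and UMI_LEN, and the per-position majority-vote consensus are all assumptions chosen for illustration. It also groups reads by exact UMI match only, whereas the tool merges similar sequences that share a UMI.

```python
from collections import Counter, defaultdict

# Hypothetical read layout: [barcode][UMI][insert]. Real kits differ.
BARCODE_LEN = 4
UMI_LEN = 3


def split_read(read):
    """Split a raw read into (barcode, umi, insert) using the assumed layout."""
    barcode = read[:BARCODE_LEN]
    umi = read[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    insert = read[BARCODE_LEN + UMI_LEN:]
    return barcode, umi, insert


def consensus(seqs):
    """Majority vote at each position, over the length of the shortest sequence."""
    length = min(len(s) for s in seqs)
    return "".join(
        Counter(s[i] for s in seqs).most_common(1)[0][0] for i in range(length)
    )


def collapse_umis(reads):
    """Return {barcode: [one consensus sequence per UMI]} with UMI/barcode trimmed."""
    groups = defaultdict(list)
    for read in reads:
        barcode, umi, insert = split_read(read)
        groups[(barcode, umi)].append(insert)

    by_barcode = defaultdict(list)
    for (barcode, _umi), inserts in groups.items():
        by_barcode[barcode].append(consensus(inserts))
    return dict(by_barcode)
```

Calling collapse_umis on a list of read strings returns one trimmed consensus per UMI, grouped by barcode, which corresponds to the kind of input the annotation step expects.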
Method 2 - Single Cell Antibody Analysis from Separate Lists
If you or your sequence provider have already done some demultiplexing in another tool, you can feed the preprocessed sequences into Geneious Biologics for annotation and analysis. If your sequences are sorted into sequence lists (such as one fastq file per clone/barcode/well/cell), you can use them directly with Single Cell Antibody Annotator. When you run it on multiple files or sequence lists, it will attempt to find a dominant heavy and light chain for each list. NGS Tutorial 5: Single Cell Analysis from Separate Lists goes over this method.
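As a rough illustration of what "finding a dominant heavy and light chain for each list" means, the sketch below picks the most frequent sequence of each chain type within one list. This is a simplification under stated assumptions, not the tool's algorithm: the chain type of each read is assumed to be known already, reads are counted only when they are exactly identical (the tool combines similar sequences), and the function name dominant_chains is hypothetical.

```python
from collections import Counter


def dominant_chains(records):
    """Pick the most frequent heavy and light sequence in one list.

    records: iterable of (chain, sequence) pairs, where chain is
    "heavy" or "light". Returns {chain: (sequence, read_count) or None}.
    """
    counts = {"heavy": Counter(), "light": Counter()}
    for chain, seq in records:
        if chain in counts:  # ignore anything not labelled heavy/light
            counts[chain][seq] += 1

    result = {}
    for chain, counter in counts.items():
        if counter:
            seq, n = counter.most_common(1)[0]
            result[chain] = (seq, n)
        else:
            result[chain] = None  # no read of this chain type in the list
    return result
```

Running this once per list (for example, once per well) gives one candidate heavy/light pair per list, and a None entry flags lists where a chain was not found at all.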
Method 3 - Antibody Annotator + Add Assay Data
If you have already done some demultiplexing in another tool such as Cell Ranger, but don't have the resulting sequences sorted into sequence lists, you can feed the consensus sequences directly into Antibody Annotator. You can import them as sequences (such as fasta, fastq or genbank format) or as a CSV file using our CSV sequence import tool.
Antibody Annotator will annotate FR and CDR regions and identify liabilities, variants relative to germline, clonotypes and more. The Add Assay Data feature can also be useful for adding any coverage or depth information from CSV or Excel files produced at your demultiplexing stage.
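Conceptually, adding assay data amounts to joining a table of per-sequence values onto the sequences by a shared identifier. The following Python sketch shows that join outside of Geneious Biologics, for example to check before upload that every sequence name has a matching row. The column names used here (name, reads, umis) and both function names are hypothetical; substitute whatever your demultiplexing tool actually exports.

```python
import csv
import io


def load_assay_data(csv_text, key_column="name"):
    """Index the rows of a CSV document by the value in key_column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column]: row for row in reader}


def attach_assay_data(sequence_names, assay_rows, columns):
    """Return {sequence name: {column: value}}.

    Sequences with no matching CSV row get None for every column,
    which makes missing matches easy to spot.
    """
    merged = {}
    for name in sequence_names:
        row = assay_rows.get(name)
        merged[name] = {c: (row[c] if row is not None else None) for c in columns}
    return merged
```

Values come back as strings, because the csv module does not infer types; convert them to numbers if you need to sort or filter on depth.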
FAQ: Why do my results for Antibody Annotator and Single Cell Antibody Annotator look different?
- A single row in the "All Sequences" table may represent many individual reads.
- Sequences are combined based on a similarity threshold (you can set this in the options). If you have the 'De Novo Assembly' option turned on, sequences are assembled together instead. This means that one row in the All Sequences table might include sequences that originally had slightly different VJ protein sequences. The VJ Region amino acid sequence shown is the consensus across all similar reads.
- Only "Fully Annotated" sequences (or assembled contigs) are kept to show in the analysis results. You can define what regions should be used to determine "Fully Annotated".
- Sequences shorter than a certain length may also be filtered out, depending on your options.
- A more detailed explanation of the filtering and options can be found in the main Single Cell Antibody Annotator article.
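The effect of the points above on row counts can be illustrated with a small Python sketch. It drops short reads and then greedily groups the remaining reads by percent identity, so that several reads end up as a single row. This is only an illustration under simplifying assumptions: identity is measured position by position on equal-length sequences, the representative of each group is simply its first read rather than a consensus, the "Fully Annotated" filter is not modelled, and the parameter names min_length and threshold are hypothetical rather than the tool's option names.

```python
def identity(a, b):
    """Fraction of matching positions; 0.0 if lengths differ or input is empty."""
    if not a or len(a) != len(b):
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def summarize(reads, min_length=5, threshold=0.9):
    """Filter short reads, then greedily group similar reads.

    Returns a list of (representative_sequence, read_count) rows.
    """
    kept = [r for r in reads if len(r) >= min_length]

    clusters = []  # each entry is [representative, count]
    for read in kept:
        for cluster in clusters:
            if identity(read, cluster[0]) >= threshold:
                cluster[1] += 1
                break
        else:
            # no existing group was similar enough, so start a new one
            clusters.append([read, 1])
    return [(rep, n) for rep, n in clusters]
```

A per-read tool such as Antibody Annotator would report every input read as its own row, while a grouping step like this one reports fewer rows, each carrying a read count. That difference alone explains much of why the two result sets do not line up one to one.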