This article lists each column in the All Sequences Table produced by a typical Antibody Annotator run, and what each column represents. If you are interested in the Cluster Table columns, please see our other article: Exploring the Cluster Table Columns. If you are unsure what a cluster is, please see our article on Understanding "Clusters".
If you are looking for the output columns produced by Single Cell Antibody Analysis, please see this article.
The All Sequences Table
Each row of the All Sequences Table represents either a single heavy/light sequence, two associated (paired) chains or an scFv, depending on your input data. The table also has columns which contain various data associated with each row, like Assay Data or Sequence Liabilities.
Searching and Filtering the Table
To search for a column, open Table preferences (1), start typing in the search bar (2), then hover over the column you want to navigate to and click the Focus Column button that appears (3).
For more column management options, see How to Customize the Sequences Table.
In addition, you can filter on any cell in the table: right-click the cell and select "Filter..." to pull out sequences of interest.
Once you select a cell to filter on, the filter is added to the filter bar above the table, where it can be edited. Filters can also be layered: right-clicking another cell lets you add a second filter, joined to the first with an AND operator. Filtering uses SQL syntax; see our main article on Filtering your Sequences for more detail and examples.
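As a rough illustration of what layering two filters with AND does, the sketch below runs an equivalent SQL WHERE clause against a small in-memory SQLite table. This is an analogy only: the table, the column names (name, chain, score) and the values are made up for the example, and the exact syntax accepted by the filter bar is described in Filtering your Sequences.

```python
import sqlite3

# Hypothetical rows standing in for a few All Sequences Table entries.
ROWS = [
    ("seq_a", "Heavy", -150),
    ("seq_b", "Light", -50),
    ("seq_c", "Heavy", -600),
    ("seq_d", "Both", -300),
]


def filter_rows(where_clause):
    """Return the names of the example rows that match a SQL WHERE clause."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sequences (name TEXT, chain TEXT, score INTEGER)")
        conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", ROWS)
        cursor = conn.execute(
            "SELECT name FROM sequences WHERE " + where_clause + " ORDER BY name"
        )
        return [row[0] for row in cursor]
    finally:
        conn.close()


# One filter on its own, then the same filter with a second condition
# layered on with AND.
single = filter_rows("score > -400")
layered = filter_rows("score > -400 AND chain = 'Heavy'")
print(single)
print(layered)
```

Each condition added with AND can only narrow the result, never widen it.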
Standard columns
- ID
An automatically assigned number for each sequence, or for each pair of sequences (such as paired chains).
- Name
The sequence name from your original un-annotated sequences. For associated pairs, this takes the form "name1 & name2".
- Labels
Any custom labels you have added to tag your sequences. See Using Custom Labels to learn more.
- Notes
Double-click a cell in this column to type in notes for any sequence.
- Chain
The chain(s) that were identified. This can be Light, Heavy or Both.
- Fully Annotated
Whether the sequence(s) could be fully annotated. This does not necessarily mean that the sequence(s) are in frame, or without stop codons.
- In Frame & Fully Annotated
Whether the sequence(s) are in frame and could be fully annotated. This does not necessarily mean that the sequence(s) are without stop codons.
- Without Stop Codons & In Frame & Fully Annotated
Whether the sequence(s) are without stop codons, in frame and could be fully annotated.
- Sequence Length
The sequence length in nucleotides. If protein sequences were used as the input, the length is in amino acids. For paired sequences, the lengths of both sequences are added together.
- Score
The score for the sequence(s), based on your chosen Liabilities and Assets. Liabilities and assets need to be turned on for this column to be present.
  - See Antibody Sequence Liabilities for our list of default liabilities.
  - See How to Customize Sequence Liabilities and Assets to learn how to specify your own custom liabilities.
  - See Positional Liabilities based on Antibody Numbering to view our default positional liabilities and learn how to specify your own.
- Error
Lists any errors, and the region each error was found in, for the sequence(s). An example value is "Frame Shift (Heavy CDR1)". Liabilities and assets need to be turned on for this column to be present. See the "Score" column above for more information.
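The Fully Annotated, In Frame & Fully Annotated, and Without Stop Codons & In Frame & Fully Annotated columns are cumulative: each one adds a condition to the one before it. A minimal sketch of that relationship, assuming three hypothetical per-sequence booleans (fully_annotated, in_frame, has_stop_codon) that are not themselves columns in the table:

```python
def validity_columns(fully_annotated, in_frame, has_stop_codon):
    """Derive the three cumulative validity flags from hypothetical inputs."""
    in_frame_and_annotated = fully_annotated and in_frame
    return {
        "Fully Annotated": fully_annotated,
        "In Frame & Fully Annotated": in_frame_and_annotated,
        "Without Stop Codons & In Frame & Fully Annotated": (
            in_frame_and_annotated and not has_stop_codon
        ),
    }


# A fully annotated, in-frame sequence that still contains a stop codon:
# the first two flags are true, the third is false.
flags = validity_columns(fully_annotated=True, in_frame=True, has_stop_codon=True)
print(flags)
```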
Region-dependent columns
All these columns will be generated for the various regions of your sequences. The full list includes:
- Light regions:
  - FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4
  - VJ Region, VJC Region
- Heavy regions:
  - FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4
  - VDJ Region, VDJC Region
Note: The All Sequences Table will also contain columns for any clustered regions you specified. These are an ID column (showing the ID of the cluster the sequence belongs to) and a column containing the amino acid sequence of the parent cluster. See this article on the Cluster Tables for more information.
For each region, these columns will be generated:
- Region
The amino acid sequence for that region.
- ID
Each unique region sequence is given a numerical ID. Regions with the same amino acid sequence have the same ID. The number corresponds to a ranking of how common the sequence is, with 1 being the most common sequence for the given region. This is effectively the Cluster ID; see Exploring the Cluster Table Columns for more information.
- Length
The region length in amino acids.
- Nucleotides
The nucleotide sequence of the region.
- DNA Germline/Template Mismatches
This column is only generated if Annotate variants is turned on in Antibody Annotator. It lists the number of DNA mismatches relative to the reference sequence used (either germline or template sequence).
- AA Germline/Template Mismatches
This column is only generated if Annotate variants is turned on in Antibody Annotator. It lists the number of amino acid mismatches relative to the reference sequence used (either germline or template sequence).
- AA HGVS
This column is only generated if Annotate variants is turned on in Antibody Annotator. It lists the amino acid mismatches relative to the reference sequence used (either germline or template sequence), in standard HGVS nomenclature.
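The frequency-based region ID described above (identical amino acid sequences share an ID, and 1 is the most common sequence) can be sketched as follows. This is an illustration of the ranking idea only, not Antibody Annotator's implementation; the function name, the example sequences and the alphabetical tie-break are all assumptions.

```python
from collections import Counter


def assign_region_ids(region_sequences):
    """Map each unique region sequence to a rank-based ID (1 = most common)."""
    counts = Counter(region_sequences)
    # Assumption: ties are broken alphabetically so the result is deterministic.
    ranked = sorted(counts, key=lambda seq: (-counts[seq], seq))
    return {seq: rank for rank, seq in enumerate(ranked, start=1)}


# Hypothetical Heavy CDR3 amino acid sequences from five rows of the table.
heavy_cdr3 = ["ARDYW", "ARGGW", "ARDYW", "ARDYW", "ARGGW"]
ids = assign_region_ids(heavy_cdr3)
row_ids = [ids[seq] for seq in heavy_cdr3]
print(row_ids)
```

Rows that share a region sequence end up with the same ID, which is why the ID can be used as a cluster identifier for that region.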
Gene columns
For each gene (and some gene combinations like Heavy VJ gene), these columns will be generated:
- Gene
The closest matching germline gene (e.g. IGHV1-5). If two genes match equally well, both are listed.
- ID
Each gene (e.g. IGHV1-5) is assigned a numerical ID. The number corresponds to a ranking of how common the gene is, with 1 being the most common gene for the given gene family. This is effectively the Gene Cluster ID; see Exploring the Cluster Table Columns for more information.
- DNA Germline Mismatches
This column is only generated if Annotate variants is turned on in Antibody Annotator. It lists the number of DNA mismatches relative to the identified germline gene.
- AA Germline Mismatches
This column is only generated if Annotate variants is turned on in Antibody Annotator. It lists the number of amino acid mismatches relative to the identified germline gene.
- AA HGVS
This column is only generated if Annotate variants is turned on in Antibody Annotator. It lists the amino acid mismatches relative to the identified germline gene, in standard HGVS nomenclature.
- Identity %
The percent identity between the sequence and the portion of the closest-match germline gene that was found.
- Coverage %
The percentage of the closest-match germline gene that can be found in the sequence. Mismatches within the "covered area" of the gene are not counted against coverage.
- Matches %
The percent identity between the sequence and the entire length of the closest-match germline gene.
Note that gene combination columns (like Heavy VJ Gene) will only list the Gene and ID columns above.
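To see how Identity %, Coverage % and Matches % relate to each other, here is a small worked sketch. It reflects one reading of the definitions above, which is an assumption rather than the documented calculation: Identity % divides matching positions by the found length, Coverage % divides the found length by the full gene length, and Matches % divides matching positions by the full gene length. The function name and the numbers are hypothetical.

```python
def gene_match_percentages(gene_length, found_length, matches):
    """Compute the three percentages under the assumed definitions."""
    if not 0 < found_length <= gene_length:
        raise ValueError("found_length must be between 1 and gene_length")
    if not 0 <= matches <= found_length:
        raise ValueError("matches must be between 0 and found_length")
    return {
        # Matching positions within the part of the gene that was found.
        "Identity %": 100 * matches / found_length,
        # How much of the gene was found, regardless of mismatches.
        "Coverage %": 100 * found_length / gene_length,
        # Matching positions measured against the whole gene.
        "Matches %": 100 * matches / gene_length,
    }


# Hypothetical gene of 300 positions, 270 of them found, 243 of those matching.
result = gene_match_percentages(gene_length=300, found_length=270, matches=243)
print(result)
```

Under this reading, Matches % equals Identity % multiplied by Coverage % and divided by 100, so it can never exceed either of the other two values.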
Additional columns
- Liability columns
Various columns for your specified liabilities, each giving a count of the number of times the liability is found in the sequence and the region(s) the liability is found in. For example, the liability column for Deamidation (SN) might have cell values like 2 (Heavy CDR3, Light CDR1).
Liabilities and assets need to be turned on when running Antibody Annotator for these columns to be present:
  - See Antibody Sequence Liabilities for our list of default liabilities.
  - See How to Customize Sequence Liabilities and Assets to learn how to specify your own custom liabilities.
  - See Positional Liabilities based on Antibody Numbering to view our default positional liabilities and learn how to specify your own.
- Assay Data columns
These will only be present if you have added Assay Data.
- Linker Columns
These will only be present if you have scFv-like data or VHH-VHH data, and selected the "with linker" options when running Antibody Annotator.
  - Linker: The amino acid sequence of the linker
  - Linker ID: An ID assigned to each unique linker, ranked by how prevalent the linker is in the dataset
  - Linker Length: The length of the linker in amino acids
  - Linker Nucleotides: The nucleotide sequence of the linker
  - Linker Match: This will only be generated if you have created and selected a Linker Database when running Antibody Annotator. If you have specified linkers, this column lists the matching linker in the database
- Protein Statistics columns calculated for the VJ and VDJ Regions
These will only be present if Calculate protein statistics is turned on in Antibody Annotator. The values are calculated for full length VDJ or VJ regions.
  - Charge at pH 7
  - Extinction Coefficient
- Feature Database columns
If you are using a feature database (see our main article on Using Feature Databases to identify Constant Regions and Fusion Proteins), additional columns for the sequences in your feature database will appear. These include:
  - Name: The name of the feature, e.g. Signal Peptide
  - Mismatches: The number of amino acid mismatches relative to the feature sequence
  - Identity %: The percent identity between the query sequence and the portion of the feature that was found
  - Coverage %: The percentage of the feature that can be found in your sequence. Mismatches within the "covered area" of the feature are not counted against coverage
  - Translation
  - Translation ID: Each unique translation of the feature is given a numerical ID. The number corresponds to a ranking of how common the translation is, with 1 being the most common translation for the feature
  - Translation Length
  - Nucleotides
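If you process liability columns outside the application, a cell value such as 2 (Heavy CDR3, Light CDR1) can be split into its count and its list of regions. The sketch below assumes cells always follow the "count (region, region)" shape of the Deamidation (SN) example above; the function name is hypothetical, and real exports may contain other shapes (such as empty cells) that this does not handle.

```python
import re

# A count, optionally followed by a parenthesised, comma-separated region list.
CELL_PATTERN = re.compile(r"^\s*(\d+)\s*(?:\((.*)\))?\s*$")


def parse_liability_cell(cell):
    """Split a liability cell into (count, [regions])."""
    match = CELL_PATTERN.match(cell)
    if match is None:
        raise ValueError("Unrecognised liability cell: " + repr(cell))
    count = int(match.group(1))
    regions_text = match.group(2) or ""
    regions = [part.strip() for part in regions_text.split(",") if part.strip()]
    return count, regions


count, regions = parse_liability_cell("2 (Heavy CDR3, Light CDR1)")
print(count, regions)
```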
How to export an Excel file of selected columns
Before exporting the Sequences Table, you may find it useful to both filter your sequence results and select the columns you want using Table Column Preferences. In this example, I have filtered for sequences that have a score above -400 and chosen to display only the following columns:
- Name
- Chain
- Score
- Heavy CDR1
- Heavy CDR2
- Heavy CDR3
- VDJ Region
Note that you can save these table column preferences as Profiles. See this article: How to Customize the Sequences Table.
Click Export Table once you have selected your sequences. This opens a pop-up where you can select the output format (Excel or .csv) and choose whether to export only the visible columns or all columns, including hidden ones.
Make sure to select Only Visible if you would just like to export your selected columns.
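Once exported as .csv, the table can be read with any standard CSV tooling. The following minimal sketch uses Python's built-in csv module to keep a few columns and apply the same "score above -400" cut as in the example above. It assumes the header row of the exported file uses the column names as they appear in the table, which you should confirm against your own export; the sample data is made up.

```python
import csv
import io

# Columns to keep; assumed to match the header row of the exported file.
WANTED = ["Name", "Chain", "Score", "Heavy CDR3"]


def select_rows(csv_file, min_score):
    """Return the wanted columns for rows whose Score is above min_score."""
    reader = csv.DictReader(csv_file)
    selected = []
    for row in reader:
        if float(row["Score"]) > min_score:
            selected.append({column: row[column] for column in WANTED})
    return selected


# Made-up stand-in for an exported file, held in memory for the example.
example_export = io.StringIO(
    "Name,Chain,Score,Heavy CDR3,Notes\n"
    "seq_a,Heavy,-150,ARDYW,\n"
    "seq_b,Heavy,-600,ARGGW,check\n"
)
rows = select_rows(example_export, min_score=-400)
print(rows)
```

For a real export, replace the in-memory example with an opened file, for instance `open("export.csv", newline="")`, where the file name is a placeholder.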