How to Customize Sequence Liabilities and Assets

Updated: June 04, 2024 22:08

Sequence liabilities can reduce an antibody's conformational stability and its ability to bind antigen, and they range in severity. To learn more about the liabilities and the automatic liability options, see Antibody Sequence Liabilities.

Jump to:

How to edit the liabilities options
Saving sets of liabilities for different datasets
Examples

How to edit the liabilities options

Specific liabilities can be selected by checking the Find liabilities and assets option in the Antibody Annotator and Single Clone Analysis pipelines. You can add, edit or remove liabilities and asset scores by editing the text box.


Sequences are annotated with liabilities and assets according to the motifs specified in the text box. Two sections of the default liabilities are shown below as an example of how this works:

-1000 Error:
Frame_Shift
Not_Fully_Annotated

-100 Liability (High):
Deamidation NG NS NA
Isomerization DG DS
Cleavage DP
Oxidation M C
Glycosylation N{P}S{P} N{P}T{P}

General rules

Any line ending with a colon (e.g. -1000 Error: or -100 Liability (High):) starts a section, and all following lines use that level of liability severity. This line should include:

A number (e.g. -1000 and -100), which indicates the score that all motifs in that section will be assigned.
A name (e.g. Liability or Error), which will appear in the table as a column with that name.

Each other line defines a specific liability in the section and should include:

Name of the liability (e.g. Frame_Shift, Deamidation)
A space-separated list of motifs for that liability (e.g. NG NS NA)
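
The general rules above can be sketched as a small parser. This is only an illustration of the text layout described in this article, not how Geneious Biologics reads the text box; the names `Section` and `parse_liabilities` are hypothetical, and the sketch ignores region brackets, quoted property rules and the other special characters.

```python
import re
from dataclasses import dataclass, field

# A section header is a signed score, a label, and a trailing colon,
# e.g. "-100 Liability (High):".
HEADER = re.compile(r"^([+-]?\d+)\s+(.+):$")


@dataclass
class Section:
    score: int                                  # score given to every motif in the section
    label: str                                  # text between the score and the colon
    rules: dict = field(default_factory=dict)   # liability name -> list of motifs


def parse_liabilities(text):
    """Group liability lines under the most recent header line."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            sections.append(Section(int(header.group(1)), header.group(2)))
        elif sections:
            # First token is the liability name, the rest are its motifs.
            name, *motifs = line.split()
            sections[-1].rules[name] = motifs
    return sections


example = """\
-1000 Error:
Frame_Shift
Not_Fully_Annotated

-100 Liability (High):
Deamidation NG NS NA
Isomerization DG DS
"""

sections = parse_liabilities(example)
for section in sections:
    print(section.score, section.label, section.rules)
```

Keywords such as Frame_Shift have no motifs, so they parse to an empty motif list in this sketch.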

Special Keywords

The Frame_Shift keyword has special meaning for identifying frameshifts.
The Not_Fully_Annotated keyword has special meaning for identifying sequences which have not been fully annotated, in which case other liabilities or assets may not have been detected.
Motif names starting with 'Stop_Codon' have slightly different treatment around naming of annotations and column values.
The property called Base Quality has special meaning: it can be used to create liabilities based on the base call quality scores.

Note that the liabilities system allows you to base liabilities on any annotation property. The default "Likely_Sequencing_Error" and "Possible_Sequencing_Error" liability rules use the annotation property "Expected Sequencing Errors", which is created by the antibody annotation pipeline. For more information about how the "Expected Sequencing Errors" are calculated, see here.

Special Characters

N represents any nucleotide base.
X represents any amino acid.
A motif character surrounded by {} matches any amino acid apart from the one specified (e.g. N{P} signifies a motif that starts with N and is followed by any amino acid except for P).
A motif may start with a '.' (period/full stop) to indicate it is a nucleotide rather than amino acid motif.
A region in brackets may start with a '.' (period/full stop) to indicate the area immediately upstream of that region.
A motif may start with a '+' to indicate it is a nucleotide motif that must be on the forward strand.
A motif may start with a '!' to indicate it matches anything apart from the specified motif.
A motif may end with a less than or greater than sign followed by a number, in which case it is only considered a liability if the number of occurrences of the motif satisfies the condition.
A section surrounded by quotes indicates an annotation property based rule instead of a motif matching rule. These rules consist of an annotation property name, a condition, and a value. The condition must be one of = (Equals), < (Less than), > (Greater than), != (Not equals), <= (Less than or equal), >= (Greater than or equal), : (Contains), !: (Not contains).
A number in square brackets after a motif indicates the maximum number of mismatches allowed (gaps are not supported) when matching that motif.
Space-separated region names in brackets after a motif list indicate that the motifs are only applied within those regions. If no region is specified, nucleotide motifs will be found anywhere, while amino acid motifs will only be found in frame in annotated regions.
Use * as the region for amino acid motifs to find them anywhere in any frame.
Prepend or append '...' to an annotation name to only find a motif before or after that annotation (e.g. FR4... only matches downstream of FR4).
Use 'In_Frame' so that a nucleotide motif is only found if it is in frame of a translatable region (e.g. VDJ region).
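
To make the X and {} rules concrete, here is a minimal sketch that translates a plain amino acid motif into a regular expression. It is an illustration only and makes no claim about how the Antibody Annotator matches motifs internally. The function name `motif_to_regex` is hypothetical, and the sketch deliberately ignores prefixes ('.', '+', '!'), occurrence counts, mismatch counts and region brackets.

```python
import re


def motif_to_regex(motif):
    """Translate a plain amino acid motif into a regular expression string."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "{":
            # {P} means "any amino acid except P".
            end = motif.index("}", i)
            parts.append("[^" + re.escape(motif[i + 1:end]) + "]")
            i = end + 1
        elif ch == "X":
            # X means "any amino acid".
            parts.append(".")
            i += 1
        else:
            # Every other character must match literally.
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


# The default Glycosylation motif N{P}S{P}: N, not P, S, not P.
pattern = re.compile(motif_to_regex("N{P}S{P}"))
print(pattern.pattern)                  # N[^P]S[^P]
print(bool(pattern.search("QNGSA")))    # True: N-G-S-A fits the motif
print(bool(pattern.search("QNPSA")))    # False: P follows N
```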

Saving sets of liabilities for different datasets

Geneious Biologics allows you to save Profiles which can be used to record and re-run alternative settings depending on the dataset. This means that you can specify different Antibody Sequence Liabilities and other settings depending on what dataset you are working with.

Profiles can be saved and applied at the bottom of our Annotation analysis pipelines:

[Screenshot: apply or save profile]

Examples

-100 Liability (High):
 Deamidation NG NS NA (CDR3)
-10 Liability (Medium):
 Deamidation NG NS
-5 Liability (Medium):
 Deamidation NA

In the example above, the antibody annotator will treat deamidation-prone motifs in the CDR3 region as high severity and give them a score of -100. In other regions they are treated as medium severity, with a score of -10 for NG or NS and -5 for NA. When multiple motifs of the same length match at a position, only the first in the list is applied.

+10 Asset:
 6His HHHHHH (FR4...)

In the example above, the antibody annotator will find and annotate a run of six histidines only downstream of FR4. Assets can be associated with improved antibody conformational stability.

Not_SAL !SAL (.Light_FR1)
Not_PAMA !PAMA (.Heavy_FR1)

In the example above, the antibody annotator will annotate any region immediately upstream of Light FR1 that does not translate as SAL and it will also annotate any region immediately upstream of Heavy FR1 that does not translate as PAMA.

Note that:

The scores of all motifs in a sequence are summed and will appear in the table in a column called Score.
The liability name must not contain spaces, but it may contain underscores, which are replaced by spaces.
Click the Reset to default button in the Antibody Annotator popup to reset back to the default set of liabilities and assets.
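
As a worked illustration of how the Deamidation example and the summed Score fit together, the sketch below scores two made-up region fragments. Everything in it is an assumption for illustration: the `RULES` table, the `score_sequence` function and the fragments are hypothetical, all motifs are assumed to be two residues long, and this is not the annotator's actual algorithm.

```python
# Rules in list order: (score, liability name, motifs, allowed regions or None for any).
RULES = [
    (-100, "Deamidation", ["NG", "NS", "NA"], {"CDR3"}),
    (-10, "Deamidation", ["NG", "NS"], None),
    (-5, "Deamidation", ["NA"], None),
]


def score_sequence(regions):
    """Return (hits, total score) for a dict of region name -> amino acid string."""
    hits = []
    for region, seq in regions.items():
        for pos in range(len(seq) - 1):
            pair = seq[pos:pos + 2]
            for score, name, motifs, allowed in RULES:
                if pair in motifs and (allowed is None or region in allowed):
                    hits.append((region, pos, pair, score))
                    break  # only the first matching rule in the list applies
    return hits, sum(hit[3] for hit in hits)


# Made-up fragments: NA in FR1 and NG in CDR3.
regions = {"FR1": "QVNAL", "CDR3": "ARNGY"}
hits, total = score_sequence(regions)
print(hits)   # [('FR1', 2, 'NA', -5), ('CDR3', 2, 'NG', -100)]
print(total)  # -105
```

Here NG in CDR3 is caught by the first rule (-100) and is not scored again by the -10 rule, while NA in FR1 falls through to the -5 rule, giving a summed score of -105.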