The filter search bar is very flexible and can be used to pull out sequences that meet a specified criteria - including assay data results if assay data has been added. You can also layer filters to find sequences that meet specific requirements.
Jump to:
- How to Filter
- Examples of simple searches
- Examples of advanced searches
- Examples of layered search queries
- Filtering using Scripts
How to Filter
To filter your results, you can either right click on a cell with the values you want to filter on or type in filters manually.
In the example below, right clicking and selecting the suggested ['ASSAY_DATA_Biophysical_Assays:Fab Tm by DSF (°C)'] = 62.5 will add ['ASSAY_DATA_Biophysical_Assays:Fab Tm by DSF (°C)'] = 62.5 to the Filter bar. You can edit this value in the filter bar, or change the operator of equals (=) to something else.
To learn how to add assay data, see Adding Assay Data to your Analysis Results.
You can also add or layer filters by right clicking on any other columns you want to filter on. This will add another filter with the AND operator. Once you are happy with the filter and made any edits, hit Enter or click Filter.
Note that this right click to build a filter syntax function automatically adds the AND operator but this operator can be easily changed to other operators.
It might be useful to jump quickly to columns of interest. To do this, go to Table preferences (1), start typing into the search bar (2) and hover over the column you would like to navigate to and click on the Focus Column button that appears (3) as shown below:
Note that:
- SQL filter query is case sensitive and single quotes are only used for column type STRING.
- To combine multiple queries for a single search, you can use 'AND' and 'OR'.
- It is not possible to test for NULL values with comparison operators, such as =, > etc.
Filter Syntax
The Biologics Annotator Result table and Comparison table can be filtered using SQL search queries. The following outlines the filtering language, and some examples of how to use it to build your search query.
Filtering can be performed by specifying a valid SQL condition using column headers as the fields to filter on. Comparative operators define how query conditions will be compared and are dependent on the column type. While most columns in the NGS results table are of type STRING, columns may also be of type NUMBER.
The SQL filter syntax supports the following STRING comparative operators:
- =
- Equal to
- !=
- Does not equal
- LIKE
- (search for a pattern - regex)
- NOT LIKE
- (search for a pattern - regex)
Note that '=' and '!=' will only work for exact character queries while 'LIKE' and 'NOT LIKE' will work with ambiguous character queries.
There are two wildcards used in conjunction with the LIKE operator:
- % (zero, one, or multiple unspecified characters)
- _ (a single unspecified character)
The SQL filter syntax supports the following NUMBER comparative operators:
- >
- Greater than
- <
- Less than
- =
- Equal to
- =>
- Greater than or equal to
- <=
- Less than or equal to
- !=
- Does not equal
The SQL filter syntax supports the following logical operators:
- AND
- OR
- BETWEEN
- IN
- IS NULL
- IS NOT NULL
Note that NULL is a special value that signifies 'no value'. Both the 'IS NULL' and 'IS NOT NULL' operators are not followed by a value.
Examples of simple searches
To search for sequences that meet the condition of having a heavy CDR3 sequence which contains the motif AAQTRPRPYGTSRAEGDYGY, use an SQL filter query in the form:
['Heavy CDR3'] = 'AAQTRPRPYGTSRAEGDYGY'
In this example, Heavy CDR3 is the field we are filtering on, = is the operator, and AAQTRPRPYGTSRAEGDYGY is the value.
To search for sequences that meet the condition of having a score of more than -1000, use an SQL filter query in the form:
['Score'] > -1000
In this example, Score is the field we are filtering on, > is the operator, and -1000 is the value.
To search for sequences that meet the condition of having a Heavy CDR3 length between 10 and 12 amino acids long, use an SQL filter query in the form:
['Heavy CDR3 Length'] BETWEEN 10 AND 12
In this example, Heavy CDR3 Length is the field we are filtering on, BETWEEN and AND are the operators, and 10 and 12 are the values.
To search for sequences that meet the condition of having a heavy CDR3 sequence, use an SQL filter query in the form:
['Heavy CDR3'] IS NOT NULL
In this example, Heavy CDR3 is the field we are filtering on, IS NOT NULL is the operator.
Examples of advanced searches
To search for sequences that meet the condition of having a Heavy FR4 sequence which contains an amino acid motif that starts with “W”, use an SQL filter query in the form:
['Heavy FR4'] LIKE 'W%'
In this example, Heavy FR4 is the field we are filtering on, LIKE is the operator, and W% is the value.
To search for sequences that meet the condition of having a Heavy CDR3 sequence which contains an amino acid motif that has a "Y{any amino acid residue}Y” motif, use an SQL filter query in the form:
['Heavy CDR3'] LIKE '%Y_Y%'
In this example, Heavy CDR3 is the field we are filtering on, LIKE is the operator, and %Y_Y% is the value.
To search for sequences that meet the condition of having a Heavy V Gene that comes from the IGHV1 family, use an SQL filter query in the form:
['Heavy V Gene'] LIKE 'IGHV1-%'
In this example, Heavy V Gene is the field we are filtering on, LIKE is the operator, and IGHV1-% is the value.
To search for sequences that meet the condition of having a Heavy FR4 sequence which contains an amino acid motif that has a “TLVTV” motif in any position of the Heavy FR4 sequence, use an SQL filter query in the form:
['Heavy FR4'] LIKE '%TLVTV%'
In this example, Heavy FR4 is the field we are filtering on, LIKE is the operator, and %TLVTV% is the value.
To search for sequences that meet the condition of not having a stop codon (*) in the Heavy CDR3 sequence, use an SQL filter query in the form:
['Heavy CDR3'] NOT LIKE '%\\*%'
In this example, Heavy CDR3 is the field we are filtering on, NOT LIKE is the operator, and %\\*% is the value.
Examples of layered search queries
To search for sequences that meet the conditions of being both a Heavy chain and fully annotated, use an SQL filter query in the form:
['Chain'] = 'Heavy' AND ['Fully Annotated'] = 'Yes'
In this example, Chain and Fully Annotated are the fields we are filtering on, = and AND are the operators, and Heavy and Yes are the values.
To search for sequences that meet the conditions of having either both chains or a heavy chain, use an SQL filter query in the form:
['Chain'] = 'Heavy' OR ['Chain'] = 'Both'
In this example, chain is the field we are filtering on, = and OR are the operators, and Both and Heavy are the values.
To search for sequences that meet the conditions of being fully annotated, with a Heavy CDR3 length between 8 and 10 amino acids long, use an SQL filter query in the form:
['Fully Annotated'] = 'Yes' AND ['Heavy CDR3 Length'] BETWEEN 8 AND 10
In this example, Fully Annotated and Heavy CDR3 Length are the fields we are filtering on, =, AND and BETWEEN are the operators, and Yes, 8, and 10 are the values.
To search for sequences that meet the conditions of having a Heavy V gene from the IGHV1 family and are in frame and fully annotated, use an SQL filter query in the form:
['Heavy V Gene'] LIKE 'IGHV1-%' AND ['In Frame & Fully Annotated'] = 'Yes'
In this example, Heavy V Gene and In Frame & Fully Annotated are the fields we are filtering on, LIKE and AND are the operators, and IGHV1-% and Yes are the values.
The operator IN can also be useful for performing multiple searches within the same column. If you would like to search for a list of particular CDR3 sequences you could use the following SQL filter query:
['Light CDR3'] IN ('QQYGSSQGT', 'QQYYTTPIT', 'QSYDGSLSGSV')
Filtering using Scripts
To create a filter based on a condition between two fields, advanced queries using script can be used. This is especially useful when analyzing comparisons of annotated results. To learn more about Comparison Result Documents, and how to make them, see Comparing Results across Multiple Experiments. To search for sequences that have more normalized counts in focused_library_trimer_pan4 over focused_library_trimer_pan0, use an SQL filter query in the form:
script('doc["Normalized Count focused_library_trimer_pan0"].size() > 0 && doc["Normalized Count focused_library_trimer_pan4"].size() > 0 && doc["Normalized Count focused_library_trimer_pan4"].value > doc["Normalized Count focused_library_trimer_pan0"].value')
In this example, Normalized Count focused_library_trimer_pan0 and Normalized Count focused_library_trimer_pan4 are the two fields we are filtering on, > is the operator.
Scripting also allows for more custom expression using Arithmetic Operators (+, -, *, /, %).