Jump to:
- What are linker databases used for?
- How to create a linker database
- How to select a linker database when running an analysis
- FR/Linker boundary calling
What are linker databases used for?
A linker database will force the boundary FR regions of the two chains on either side of the linker to "snap" to the edges of the defined linker.
If your dataset contains pre-defined linkers between two antibody chains, creating a linker database may help to correctly annotate the boundary regions of the two chains, for example the end of the first chain (e.g. FR4) and the start of the second chain (e.g. FR1).
If you are screening multiple linkers, a linker database can also help you group your dataset by linker sequence.
How to create a linker database
First, prepare a sequence file or sequence list file (e.g. FASTA) containing your desired linker(s). The sequences can be either protein or nucleotide, but not a mixture of both.
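Before uploading, it can be useful to confirm that a linker file does not mix protein and nucleotide sequences. The following is a minimal sketch of such a check, not part of the product: the function names, the example linker names and the alphabet heuristic (a sequence made up only of A, C, G, T, U or N is treated as nucleotide) are all assumptions made for illustration. Short protein linkers composed only of those letters would be misclassified by this heuristic.

```python
# Hypothetical pre-upload check; the alphabet heuristic is an assumption.
NUCLEOTIDE_ALPHABET = set("ACGTUN")


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def sequence_type(seq):
    """Classify a sequence as 'nucleotide' or 'protein' by its letters."""
    if not seq:
        raise ValueError("empty sequence")
    return "nucleotide" if set(seq) <= NUCLEOTIDE_ALPHABET else "protein"


def check_linker_file(text):
    """Return the single sequence type in the file, or raise if mixed."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    types = {sequence_type(seq) for _, seq in records}
    if len(types) > 1:
        raise ValueError("linker file mixes protein and nucleotide sequences")
    return types.pop()


# Example linker names and sequences are illustrative only.
example = """>linker_1
GGGGSGGGGSGGGGS
>linker_2
GSTSGSGKPGSGEGSTKG
"""
print(check_linker_file(example))
```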
The Reference Database section can be found on the navigation panel under Organization Databases. To create a new linker database, click the three vertical dots to bring up the New database option.
The Create Database wizard will prompt you to select the type of database. Select Linker Database from the options.
To upload your linker sequences into this database, you can either:
1. Add them while in the Create Database wizard, or
2. Wait for the reference database folder to be created, then upload the sequences into the linker database folder.
How to select a linker database when running an analysis
Linker databases are optional databases that are used in conjunction with a Germline Gene or Antibody Template reference database. They are only available when you select the "Both chains in single sequence with linker (scFv)" or "Two heavy chains in single sequence with linker" option when running Antibody Annotator or NGS Antibody Annotator.
The Linker database option allows you to:
- Select a pre-made linker database (see "How to create a linker database" above).
- Set the maximum number of mismatches allowed in the linker, expressed as a percentage of the linker length.
- For example, for a 10 amino acid linker, a 10% mismatch setting would allow linkers in the query sequences that differ from the reference linker by 1 aa.
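The mismatch arithmetic in the example above can be sketched as follows. This is an illustration only: the function names are hypothetical, the comparison is assumed to be position by position on equal-length sequences (no insertions or deletions), and fractional allowances are assumed to be rounded down. The article does not state how the software handles linker lengths where the percentage does not give a whole number.

```python
import math


def max_mismatches(linker_length, mismatch_percent):
    """Number of mismatching residues allowed for a linker of this length.

    Assumption: fractional allowances are rounded down.
    """
    return math.floor(linker_length * mismatch_percent / 100)


def count_mismatches(query, reference):
    """Count positions at which two equal-length sequences differ."""
    if len(query) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(q != r for q, r in zip(query, reference))


def linker_matches(query, reference, mismatch_percent):
    """True if the query linker is within the allowed mismatch threshold."""
    allowed = max_mismatches(len(reference), mismatch_percent)
    return count_mismatches(query, reference) <= allowed


reference = "GGGGSGGGGS"  # illustrative 10 aa reference linker

print(max_mismatches(len(reference), 10))          # 10% of 10 aa allows 1 mismatch
print(linker_matches("GGGGSGGGGT", reference, 10))  # 1 mismatch: accepted
print(linker_matches("GGGASGGGGT", reference, 10))  # 2 mismatches: rejected
```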
FR/Linker boundary calling
First, the FR boundary regions are annotated as usual, and the linker region is identified.
- If the linker and the FR1/FR4 region are found to overlap, the start or the end of the FR1/FR4 region will be shortened to "snap" to the edges of the linker.
- If there is non-matching sequence between the linker and the FR1/FR4 regions, the FR1/FR4 region will be extended to snap to the edges of the linker.
In both of the above cases, if the FR1 or FR4 region was found to be truncated, it will be marked as non-truncated.
If the linker region could not be identified, the FR1/FR4 regions are left unaltered.
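The boundary rules above can be sketched in a few lines. This is a simplified model rather than the annotator's actual implementation: the `Region` class, the half-open coordinate convention and the function names are assumptions made for illustration. It covers the FR4 of the first chain and the FR1 of the second chain on either side of one linker.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical annotated region; start is inclusive, end is exclusive."""
    start: int
    end: int
    truncated: bool = False


def snap_to_linker(
    fr4: Region, linker: Optional[Region], fr1: Region
) -> Tuple[Region, Region]:
    """Snap the FR4 end and the FR1 start to the edges of the linker."""
    # No linker identified: leave both regions unaltered.
    if linker is None:
        return fr4, fr1

    # Setting the boundary to the linker edge shortens the region when it
    # overlaps the linker and extends it when there is a gap. A region whose
    # boundary moved is marked as non-truncated.
    if fr4.end != linker.start:
        fr4 = replace(fr4, end=linker.start, truncated=False)
    if fr1.start != linker.end:
        fr1 = replace(fr1, start=linker.end, truncated=False)
    return fr4, fr1


# FR4 overlaps the linker by 5 positions; FR1 starts 5 positions after it.
fr4 = Region(start=330, end=365, truncated=True)
linker = Region(start=360, end=405)
fr1 = Region(start=410, end=485)

new_fr4, new_fr1 = snap_to_linker(fr4, linker, fr1)
print(new_fr4)  # FR4 shortened to end at the linker start
print(new_fr1)  # FR1 extended to start at the linker end
```

Using one assignment for both the overlap and the gap case keeps the sketch short; the direction of the change (shorten or extend) follows from where the original boundary was relative to the linker edge.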