Molecular barcodes are short nucleotide tags added to sequences of interest during sample preparation. In single-cell sequencing analyses, these tags identify which cell each sequence came from. Unique Molecular Identifiers (UMIs) can help identify rare variants, detect differential amplification, and screen out probable sequencing errors. Geneious Biologics can extract and sort barcoded data, and collapse UMIs according to user-specified settings, by running the Collapse UMI Duplicates and Separate Barcodes pipeline before Single Clone Antibody Analysis.
How are Barcodes and UMIs used in 10x Genomics?
For 10x Genomics single-cell 3’ gene expression library preparation, cell barcodes are tags that designate the single-cell source, while UMIs are a specific kind of barcode used, among other applications, to detect differential amplification of transcripts during PCR. 10x Genomics sequencing technology uses individual beads coated with oligonucleotides containing a sequencing primer, a 16 bp cell barcode, a 10 bp UMI, and a poly(dT) primer, in that order. When a bead is paired with a single cell in a droplet, the poly(dT) primer binds to the poly(A) tail of the mRNA. The cell barcode is identical for every oligonucleotide on a bead, so all sequences captured by that bead are marked as coming from the same cell. The UMI sequences are random, with a very low likelihood of a duplicate UMI within a single bead. After amplification, sequences are sorted by barcode to identify which cell each came from, and collapsed by UMI similarity to reduce redundancy and account for PCR bias.
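The sort-then-collapse step described above can be illustrated with a minimal Python sketch. It is not the Geneious Biologics pipeline. It assumes each read starts with the 16 bp cell barcode followed directly by the 10 bp UMI, and it treats UMIs within one mismatch of a more abundant UMI as duplicates. The function names, the one-mismatch threshold, and the greedy merge strategy are all illustrative assumptions.

```python
from collections import defaultdict

BARCODE_LEN = 16  # cell barcode length given in the text
UMI_LEN = 10      # UMI length given in the text


def split_read(read):
    """Split a read into (cell barcode, UMI, remaining sequence).

    Assumes the barcode occupies the first 16 bases and the UMI the next 10.
    """
    if len(read) < BARCODE_LEN + UMI_LEN:
        raise ValueError("read is shorter than barcode + UMI")
    barcode = read[:BARCODE_LEN]
    umi = read[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi, read[BARCODE_LEN + UMI_LEN:]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umi_counts, max_mismatches=1):
    """Greedily merge each UMI into the first more abundant UMI within
    max_mismatches (hypothetical threshold); otherwise keep it as its own group."""
    collapsed = {}
    # Visit the most abundant UMIs first so they act as group representatives.
    for umi, count in sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for kept in collapsed:
            if hamming(umi, kept) <= max_mismatches:
                collapsed[kept] += count
                break
        else:
            collapsed[umi] = count
    return collapsed


def group_and_collapse(reads, max_mismatches=1):
    """Sort reads by cell barcode, then collapse similar UMIs within each cell."""
    per_cell = defaultdict(lambda: defaultdict(int))
    for read in reads:
        barcode, umi, _ = split_read(read)
        per_cell[barcode][umi] += 1
    return {bc: collapse_umis(dict(umis), max_mismatches)
            for bc, umis in per_cell.items()}
```

Calling `group_and_collapse` on a list of read strings returns one dictionary per cell barcode, mapping each representative UMI to the number of reads merged into it. A production tool would typically also compare the downstream sequence, which this sketch ignores.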
During first-strand synthesis, when the reverse transcriptase reaches the 5’ end of the RNA template, it appends additional non-templated nucleotides, mostly deoxycytidine residues, to the 3’ end of the new cDNA strand. This non-templated sequence serves as a binding site for a template switching oligonucleotide (TSO). The reverse transcriptase then switches from copying the RNA template to copying the TSO, and synthesis terminates at the end of the TSO. This marks the end of a complete sequence and improves representation of the 5’ end of the template sequence.
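As a toy illustration of template switching, the sketch below models the strands as plain strings. It assumes the mRNA is written in the DNA alphabet (T instead of U), that the non-templated tail is exactly three cytidines, and that the TSO ends in a matching GGG. The TSO sequence used in the usage note is made up for the example and is not the real 10x Genomics oligonucleotide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def first_strand_with_template_switch(mrna, tso, tail="CCC"):
    """Model first-strand cDNA synthesis followed by a template switch.

    mrna and tso are written 5' to 3'; the returned cDNA is also 5' to 3'.
    The three-base tail is an assumption made for this sketch.
    """
    # The reverse transcriptase copies the mRNA, so the cDNA is its reverse complement.
    cdna = reverse_complement(mrna)
    # On reaching the 5' end of the template, it adds non-templated cytidines.
    cdna += tail
    # The 3' end of the TSO must base-pair with that tail.
    if reverse_complement(tso[-len(tail):]) != tail:
        raise ValueError("TSO 3' end does not pair with the non-templated tail")
    # The enzyme switches templates and copies the rest of the TSO, then stops.
    cdna += reverse_complement(tso[:-len(tail)])
    return cdna
```

For example, with the made-up inputs `mrna = "ATGGCC"` and `tso = "AACCGGG"`, the function returns the reverse complement of the mRNA followed by the reverse complement of the TSO, so every complete cDNA ends in the same known sequence.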