combine_reports_tabset.knit

Navigation

Below is an overview of the experiment, from TCR/BCR nucleotides to reads to clonotypes. The yellow tabs describe the pipeline: the Cellecta DriverMap Adaptive Immune Receptor (AIR) tab describes the DriverMap AIR assay, and the Bioinformatics Workflow tab gives an in-depth description of the computational workflow.

The green tabs contain the experimental data and data analysis. The Experimental Description tab describes the DriverMap Adaptive Immune Receptor TCR/BCR Profiling assay, with details from sample collection through library generation. The Sequencing & Alignment Quality tab covers the initial processing of the sequences, the quality of the reads, and the success of read alignment to TCR/BCR genes. The Clonotype Summary tab provides an overview of the clonotypes identified within each sample. The analysis of these clonotypes is further broken down by chain (IGH, IGK, IGL, TRAD, TRB and TRG). For each chain relevant to the experiment, a variety of metrics are calculated to provide an overall picture of the repertoire characteristics: repertoire statistics, the top clonotypes, clonotype overlaps between samples, gene usage, gene usage overlaps between samples, diversity metrics and kmer analysis.

DriverMap Adaptive Immune Receptor Repertoire Profiling

The DriverMap Adaptive Immune Receptor (AIR) Repertoire Profiling Service from Cellecta provides you with a profile of all TCR and BCR CDR3 or full-length variable regions in blood, cell, or RNA samples. With the DriverMap AIR TCR-BCR Profiling Service, you get a larger complement of clonotypes than with other similar assays; reproducible and comprehensive coverage from a range of immune sample inputs, including total RNA from whole blood; and a rapid, 1-month turnaround from sample submission to an extensive analysis report.

Since T and B cells work synergistically in the adaptive immune response, Cellecta has designed an assay that profiles both T-cell receptor (TCR) and B-cell receptor (BCR) repertoires in a single convenient reaction. Separate assays specific for T- or B-cell chains are also available. The DriverMap AIR-RNA assay quantifies T-cell and B-cell receptor transcripts. It is designed to specifically amplify only functional RNA molecules from human or mouse T and B cells, avoiding non-functional pseudogenes with similar structures. Profiling CDR3 or full-length variable regions from human RNA molecules enables highly sensitive detection of low-frequency, rare TCR and BCR clonotypes and more comprehensive profiling when working with small samples and limited numbers of cells. The DriverMap AIR-DNA assay amplifies receptor genes directly from genomic DNA. The AIR-DNA assay provides a more quantitative measurement of the genetic copies of each CDR3-specific clonotype, which correlates with the number of cells carrying that clonotype in the sample. These data enable the measurement of clonal expansion in T and B cells. Combining data from both the AIR-DNA and AIR-RNA assays enables assessment of both the transcriptional activation and the number of cells with a particular clonotype. The ability to differentiate these two effects provides a quantitative basis for assessing antigen-activated clonotypes.

Applications of BCR sequencing: identify broadly neutralizing antibodies (bNAbs) and map Ig-seq datasets to known antibody structures for antibody and vaccine development; track B-cell migration and development patterns; find markers of autoimmune diseases such as multiple sclerosis and rheumatoid arthritis, and of cancers (e.g. B-cell lymphoma); and contrast naïve and antigenically challenged datasets to understand antibody maturation.

Applications of TCR sequencing: track T-cell clonality and diversity for insights into the mechanisms of action of immune checkpoint inhibitors for immunotherapies; assess TCR overlap between repertoires to define spatial and temporal heterogeneity of the anti-tumoral immune response; and analyze TCR sequence and structure to annotate antigenic specificity for developing personalized cellular immunotherapies.

How is the DriverMap AIR Assay Different from other AIR Assays?

DriverMap™ Multiplex PCR technology uses gene-specific primers which significantly reduce the level of non-specific binding and primer-dimer amplification products, and which are designed to target only TCR/BCR isoforms. Unique Molecular Identifiers (UMIs) facilitate accurate quantitation of the copy number of cDNA or DNA molecules through the amplification steps, as well as detection of low-abundance clonotypes and correction of amplification biases and sequencing errors. A dual-index amplicon labeling strategy minimizes index hopping during NGS, allowing for comprehensive readouts. Full profiles of the antigen-recognition CDR3 region enable assessment of CDR3 length distribution, V(D)J segment usage, isotype composition for BCRs, somatic mutations, and similar characteristics with immune receptor profiling software such as MiXCR (MiLaboratories).

The DriverMap Adaptive Immune Receptor (AIR) profiling assay workflow is as follows:

Pipeline

Below is an overview of the bioinformatics analysis pipeline used on DriverMap Adaptive Immune Receptor (AIR) sequencing data. MiXCR (Bolotin et al., 2015) is used to align the sequencing reads and identify clonotypes and their abundances. The MiXCR Cellecta DNA or RNA preset is used to perform the read alignment. After alignment, a variety of repertoire metrics can be calculated from the resulting clonotype abundances. These include repertoire statistics, the top clonotypes, clonotype overlaps between samples, gene usage, gene usage overlaps between samples, diversity metrics and kmer analysis. These analyses are performed mainly with the Immunarch package, alongside a variety of R packages for data visualization.

Experimental Details

The details of the library generation are described below. The Protocol section describes the sample collection, the profiling assay, PCR amplification, PCR yields, and primer sequences. The PCR Amplification Results section contains the gel image from the PCR. The Sample Description section contains the list of samples in the analysis, including relevant metadata.

5-20-24. Repeat of 306 and FC1, Immunization-Rx_Alex Chenchik (AC) > AIR-CDR3 vs AIR-CDR1-2-3 profiling in whole blood.

38 Samples – flow cells (#350) > 300-n paired-end read (high-throughput) > NextSeq500

Sample Description > please find in the attached Excel file

Experiment description: This is the fourth experiment (repeat of FC298, 306 and FC1), using a reduced starting amount of RNA (25 ng vs the old 100 ng) and a reduced cycle number for sample 7 (Im1), 4 cycles fewer; only immunization 1 is included. The goal is to compare CDR1-2-3 vs CDR3 profiling, as previous data show less variability in controls for CDR1-2-3 profiling (FC1, Dongfang data).

AIR profiling of whole blood RNA samples isolated from AC before and after the following treatments:

Immunization 1: PPSV23 (pneumococcal polysaccharide) plus Td (tetanus + diphtheria) vaccine > 4/27/22

Rx treatment against H. pylori (clarithromycin + lansoprazole + amoxicillin) > 6/21/22

Immunization 2: Shingrix (against shingles (herpes zoster); based on recombinant VZV glycoprotein E antigen) > 7/15/22

Before and after each treatment we collected blood in 2 Tempus test tubes (3 ml). From the Tempus tubes we purified both total RNA (R-T name in the Sample list) and DNA. Please note that Tempus and PAXgene are the two main test tubes used for stabilization/collection of whole blood for RNA/DNA purification.

For most time points we have duplicate samples (D1 and D2):

1) 4/1/22 > control before any treatment

2) 4/6/22 > control before any treatment

3) 4/26/22 > control before any treatment

- 4/27/22 > Immunization 1

5) 4/29/22 > 2 days after Im1

6) 5/2/22 > 5 days after Im1

7) 5/5/22 > 8 days after Im1 (T/B cell fraction sorting)

8) 5/12/22 > 15 days after Im1

9) 5/19/22 > 22 days after Im1

10) 6/17/22 > control before Rx

Additional “control” AC whole blood samples (first 3 from Alex
Chenchik) collected before:

C_R-T_BP > 3/20/22 (back pain condition)

C1_R-T > 1/12/22

C2_R-T > 1/27/22

Observation and data quality: The yield of amplified NGS PCR products was within +/-2-fold for all samples. For sample 7, collected approx. one week after Im1 and amplified with 4 fewer cycles, this is equal to about 16x activation. We therefore decided to combine all amplified products in equal amounts, except sample 7 (4x more) in order to “compensate” for the loss of reads for activated clonotypes, and all CDR1-2-3 products at 1.5x more than CDR3 (considering the difference in amplicon sizes, 500 bp vs 350 bp). No significant background, smear or primer dimers were seen in any sample.
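
The 16x figure for sample 7 can be reproduced with simple PCR arithmetic. The sketch below assumes ideal doubling per cycle, which is an idealization (real efficiency is usually lower), and the function name is hypothetical. It shows that reaching a similar yield with 4 fewer cycles implies roughly 2^4 = 16 times more starting template.

```python
def template_fold_from_cycles(cycles_reference: int, cycles_sample: int,
                              efficiency: float = 1.0) -> float:
    """Fold excess of starting template implied by a similar final yield
    reached with fewer PCR cycles.

    efficiency=1.0 means perfect doubling per cycle (an assumption).
    """
    cycle_difference = cycles_reference - cycles_sample
    return (1.0 + efficiency) ** cycle_difference


# Cycle numbers from the protocol: 19 for most samples, 15 for sample 7.
fold = template_fold_from_cycles(19, 15)
print(f"Implied template excess for sample 7: {fold:.0f}x")
```

With a lower assumed efficiency (for example 0.9) the implied excess drops, so the 16x value is best read as an upper estimate.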

Protocol: 

Step 1. RevGSP binding to mRNA. Total RNA (50 ng for WB for all AC samples) was incubated with a mix of Reverse AIR TCR+BCR GSPs (set10 of 6/23/22; final concentration approx. 10 nM of each primer) in 20 ul of 1xHyb buffer at 70C for 5 min and 60C for 60 min, then cooled down to 25C. Hybridized RNA-RevGSP products were purified with 1.2 volumes (24 ul) of SPRI beads. The bead-bound RNA-RevGSP hybrid was washed twice with 80% ethanol.

Step 2. Rev GSP extension > cDNA synthesis. The washed magnetic beads were resuspended in 45 ul of 1xRT-Ext buffer with dNTP (0.5 mM), reverse transcriptase (RTscript) and RT hot-start aptamer; 41 ul was collected without beads and incubated at 50C for 30 min and 72C for 10 min.

Step 3. Fwd GSP extension. The cDNA product (in 40 ul) was split into two test tubes and extended by adding 20 ul of master mix with a pool of Forward AIR CDR1-2-3 TCR+BCR or AIR CDR3 TCR+BCR GSPs (10 nM final concentration of each primer), incubated at 98C for 1 min and 68C for 10 min, and treated with 2 ul of ExoI at 37C for 20 min and 95C for 5 min.

Step 4. 1st PCR. FwdGSP-extended cDNA was diluted in PCR master mix (60 ul), and anchored cDNA fragments were amplified in 100 ul of Multiplex DNA polymerase reaction mix with universal anchor PCR primers for 19 (most samples) or 15 (sample 7) cycles.

Step 5. 2nd PCR. 2-ul aliquots of the 1st PCR were added to a 96-well plate with 50 ul of master mix for the 2nd PCR step (each well has unique Dual Nextera Index primers), amplified using a unique combination of Nextera Fwd-P5-Index and Rev-P7-Index primers for 8 cycles, and treated with ExoI (1 ul) at 37C for 30 min. PCR products were analyzed on a Fragment Analyzer (see attached file), combined in equal amounts, except sample 7 (4x) and all CDR1-2-3 products (1.5x more than CDR3), and purified using AMPure magnetic beads (1.5X volume). The purified cDNA products were quantitated by Qubit fluorescence measurement and diluted to 10 nM (2.1 ng/ul) for next-generation sequencing on a NextSeq500.

Program:

Read1: eSeqDNA-Fwd > 148c; Ind1: eSeqIND-Fwd > 10c; Ind2: eSeqIND-Rev > 10c; Read2: eSeqDNA-Rev > 148c.

DriverMap AIR assay Amplicon Structure

eSeqDNA-Fwd

FP5  UDPIndex10 AGCAGCAGCACCGACCAGCAGACA F
ACGGCGACCACCGAGATCTACACNNNNNNNNNNAGCAGCAGCACCGACCAGCAGACA-GSP-DNA-

TGCCGCTGGTGGCTCTAGATGTGNNNNNNNNNNTCGTCGTCGTGGCTGGTCGTCTGT-GSP-DNA-

TCGTCGTCGTGGCTGGTCGTCTGT

eSeqIND-Rev

eSeqIND-Fwd

UMI14 TCTGTGCTGGTCGGTGCTCGTCGT

-DNA-GSP-NNNNNNNNNNNNNN-TCTGTGCTGGTCGGTGCTCGTCGTNNNNNNNNNNTATCTCGTATGCCGTCTTCTGCT

-DNA-GSP-NNNNNNNNNNNNNN-AGACACGACCAGCCACGAGCAGCANNNNNNNNNNATAGAGCATACGGCAGAAGACGA

R AGACACGACCAGCCACGAGCAGCA UDPIndex10 RP7

eSeqDNA-Rev
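
The amplicon diagram above lists each adapter region as a pair of strands. As a quick consistency check, the following sketch confirms that each lower strand is the base-by-base complement of the upper strand written above it. The sequences are copied from the diagram; the helper and variable names are hypothetical, and N (index or UMI positions) is treated as pairing with N.

```python
# Base-pairing table; N stands for an index/UMI position and pairs with N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def complement(seq: str) -> str:
    """Return the base-by-base complement (not reversed) of a DNA string."""
    return seq.translate(COMPLEMENT)


# (upper strand, lower strand) pairs copied from the amplicon diagram.
strand_pairs = {
    "P5 side": ("ACGGCGACCACCGAGATCTACAC", "TGCCGCTGGTGGCTCTAGATGTG"),
    "forward anchor": ("AGCAGCAGCACCGACCAGCAGACA", "TCGTCGTCGTGGCTGGTCGTCTGT"),
    "reverse anchor": ("TCTGTGCTGGTCGGTGCTCGTCGT", "AGACACGACCAGCCACGAGCAGCA"),
    "P7 side": ("TATCTCGTATGCCGTCTTCTGCT", "ATAGAGCATACGGCAGAAGACGA"),
}

results = {name: complement(top) == bottom
           for name, (top, bottom) in strand_pairs.items()}
for name, ok in results.items():
    print(f"{name}: {'complementary' if ok else 'MISMATCH'}")
```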

The following samples are included in the analysis.

Samples	Sample_Source	Preset	Experiment	Species	Condition
Control_1	Control_1	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	Control
Control_2	Control_2	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	Control
Control_3	Control_3	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	Control
Immunized_Day5_1	Immunized_Day5_1	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	PPSV23+Td_Immunized
Immunized_Day5_2	Immunized_Day5_2	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	PPSV23+Td_Immunized
Immunized_Day8_1	Immunized_Day8_1	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	PPSV23+Td_Immunized
Immunized_Day8_2	Immunized_Day8_2	cellecta-human-rna-xcr-umi-drivermap-air	CDR3	hsa	PPSV23+Td_Immunized

Sequencing and Alignment Quality

This section contains an overview of the quality of read sequencing and alignment. The sequencing section outlines the total number of sequences in each sample, as well as the FastQC metrics that are relevant to DriverMap AIR libraries. The alignment section outlines the alignment of the reads to the reference TCR/BCR genes. Highlighted in this section are successful alignments, non-TCR/IG alignments, reads with no V and/or J hits, reads with no CDR3 regions, reads with no barcodes, etc. Lastly, the reads per UMI filter section shows the histogram generated by MiXCR of the number of reads per UMI. A cut-off threshold, marked by a red dotted line, is used to filter out erroneous UMIs prior to downstream analysis.

FastQC is used to determine the quality of the sequences. The software reports a variety of metrics to determine whether the samples are suited for downstream bioinformatic analysis. Each metric is marked PASS, WARN or FAIL, indicating success, some cause for concern, or that the sample needs to be evaluated more carefully. Note that the quality metrics were designed for generic purposes. For further explanation on each metric, see the following article (external source):

Sequencing Statistics

sample	pct.gc	tot.seq	seq.length
Control_1_S20_R1_001	58	8952881	308
Control_1_S20_R2_001	57	8952881	308
Control_2_S21_R1_001	57	7328409	308
Control_2_S21_R2_001	57	7328409	308
Control_3_S28_R1_001	58	7205548	308
Control_3_S28_R2_001	57	7205548	308
Immunized_Day5_1_S24_R1_001	57	11659613	308
Immunized_Day5_1_S24_R2_001	57	11659613	308
Immunized_Day5_2_S31_R1_001	57	10644001	308
Immunized_Day5_2_S31_R2_001	57	10644001	308
Immunized_Day8_1_S25_R1_001	58	27375362	308
Immunized_Day8_1_S25_R2_001	57	27375362	308
Immunized_Day8_2_S32_R1_001	57	25610902	308
Immunized_Day8_2_S32_R2_001	57	25610902	308

pct.gc = GC Percentage
tot.seq = Total Number of Reads
seq.length = Sequencing Length (NT)

Poor quality samples

sample	nb_problems	module
Control_1_S20_R1_001	1	Adapter Content
Control_1_S20_R2_001	1	Adapter Content
Control_2_S21_R1_001	1	Adapter Content
Control_2_S21_R2_001	1	Adapter Content
Control_3_S28_R1_001	1	Adapter Content
Control_3_S28_R2_001	1	Adapter Content
Immunized_Day5_1_S24_R1_001	1	Adapter Content
Immunized_Day5_1_S24_R2_001	1	Adapter Content
Immunized_Day5_2_S31_R1_001	1	Adapter Content
Immunized_Day5_2_S31_R2_001	1	Adapter Content
Immunized_Day8_1_S25_R1_001	1	Adapter Content
Immunized_Day8_1_S25_R2_001	1	Adapter Content
Immunized_Day8_2_S32_R1_001	1	Adapter Content
Immunized_Day8_2_S32_R2_001	1	Adapter Content

nb_problems = Number of criteria that failed
module = List of criteria that failed

Summary of FastQC Calls

	Control_1_S20_R1_001	Control_1_S20_R2_001	Control_2_S21_R1_001	Control_2_S21_R2_001	Control_3_S28_R1_001	Control_3_S28_R2_001	Immunized_Day5_1_S24_R1_001	Immunized_Day5_1_S24_R2_001	Immunized_Day5_2_S31_R1_001	Immunized_Day5_2_S31_R2_001	Immunized_Day8_1_S25_R1_001	Immunized_Day8_1_S25_R2_001	Immunized_Day8_2_S32_R1_001	Immunized_Day8_2_S32_R2_001
Basic Statistics	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS
Per base sequence quality	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS
Per tile sequence quality	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN	WARN
Per sequence quality scores	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS
Per base N content	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS
Sequence Length Distribution	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS	PASS
Adapter Content	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL

MiXCR Alignment Calls

	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
Successfully aligned reads:	OK	OK	OK	OK	OK	OK	OK
Off target (non TCR/IG) reads:	OK	OK	OK	OK	OK	OK	OK
Reads with no V or J hits:	OK	OK	OK	OK	OK	OK	OK
Reads with no barcode:	OK	OK	OK	OK	OK	OK	OK
Overlapped paired-end reads:	OK	OK	OK	OK	OK	OK	OK
Alignments that do not cover VDJRegion:	NA	NA	NA	NA	NA	NA	NA
Tag groups that do not cover VDJRegion:	NA	NA	NA	NA	NA	NA	NA
Barcode collisions in clonotype assembly:	OK	OK	OK	OK	OK	ALERT	ALERT
Unassigned alignments in clonotype assembly:	OK	OK	OK	OK	OK	OK	OK
Reads used in clonotypes:	OK	OK	OK	OK	OK	WARN	WARN
Alignments dropped due to low sequence quality:	OK	OK	OK	OK	OK	OK	OK
Alignments clustered in PCR error correction:	NA	NA	NA	NA	NA	NA	NA
Clonotypes clustered in PCR error correction:	NA	NA	NA	NA	NA	NA	NA
Clones dropped in post-filtering:	OK	OK	OK	OK	OK	OK	OK
Alignments dropped in clones post-filtering:	OK	OK	OK	OK	OK	OK	OK
Reads dropped in tags error correction and filtering:	OK	OK	OK	OK	OK	WARN	ALERT
UMIs artificial diversity eliminated:	OK	OK	WARN	OK	OK	OK	OK
Reads dropped in UMI error correction and whitelist:	OK	OK	OK	OK	OK	OK	OK
Reads dropped in tags filtering:	OK	OK	OK	OK	OK	WARN	ALERT

MiXCR Alignment Statistics

	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
Successfully aligned reads:	99.15%	99.01%	99.0%	99.21%	99.22%	99.55%	99.56%
Off target (non TCR/IG) reads:	0.11%	0.12%	0.11%	0.11%	0.1%	0.11%	0.1%
Reads with no V or J hits:	0.73%	0.86%	0.88%	0.67%	0.68%	0.33%	0.33%
Reads with no barcode:	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
Overlapped paired-end reads:	99.5%	99.55%	99.51%	99.51%	99.52%	99.47%	99.47%
Alignments that do not cover VDJRegion:	NA	NA	NA	NA	NA	NA	NA
Tag groups that do not cover VDJRegion:	NA	NA	NA	NA	NA	NA	NA
Barcode collisions in clonotype assembly:	0.28%	0.28%	0.24%	0.6%	0.72%	14.98%	15.81%
Unassigned alignments in clonotype assembly:	0.45%	0.41%	0.44%	0.53%	0.52%	2.6%	2.66%
Reads used in clonotypes:	97.59%	97.3%	97.32%	97.4%	96.81%	87.79%	85.0%
Alignments dropped due to low sequence quality:	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
Alignments clustered in PCR error correction:	NA	NA	NA	NA	NA	NA	NA
Clonotypes clustered in PCR error correction:	NA	NA	NA	NA	NA	NA	NA
Clones dropped in post-filtering:	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
Alignments dropped in clones post-filtering:	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%
Reads dropped in tags error correction and filtering:	0.74%	0.92%	0.86%	0.96%	1.56%	9.19%	11.0%
UMIs artificial diversity eliminated:	28.45%	26.29%	31.44%	25.36%	21.65%	8.1%	7.81%
Reads dropped in UMI error correction and whitelist:	0.27%	0.28%	0.26%	0.25%	0.3%	0.53%	0.59%
Reads dropped in tags filtering:	0.47%	0.64%	0.59%	0.71%	1.26%	8.66%	10.41%

Alignment Percentages

MiXCR automatically sets a filter to identify UMIs supported by enough reads to be called real. Shown below, for each sample, is the number of UMIs with a given number of reads per UMI. The dotted red line marks the filter threshold applied to that sample.
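
To make the reads-per-UMI filter concrete, here is a minimal sketch of the idea: build the histogram and keep only UMIs at or above a cut-off. It does not reproduce how MiXCR chooses the threshold automatically; the threshold value, function names and toy counts are all assumptions made for illustration.

```python
from collections import Counter


def reads_per_umi_histogram(reads_per_umi: dict) -> Counter:
    """Map 'number of reads' -> number of UMIs supported by that many reads."""
    return Counter(reads_per_umi.values())


def filter_umis(reads_per_umi: dict, threshold: int) -> dict:
    """Keep only UMIs supported by at least `threshold` reads."""
    return {umi: n for umi, n in reads_per_umi.items() if n >= threshold}


# Toy counts, invented for illustration (not data from this report).
toy_counts = {"AAAC": 1, "AAAG": 1, "CCGT": 12, "GGTA": 9, "TTAC": 2}

histogram = reads_per_umi_histogram(toy_counts)
kept = filter_umis(toy_counts, threshold=3)  # threshold chosen arbitrarily

print(sorted(histogram.items()))
print(sorted(kept))
```

UMIs seen only once or twice are typically sequencing or PCR errors of a real UMI, which is why they pile up at the low end of the histogram and fall below the cut-off.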

Sample: Control_1

Sample: Control_2

Sample: Control_3

Sample: Immunized_Day5_1

Sample: Immunized_Day5_2

Sample: Immunized_Day8_1

Sample: Immunized_Day8_2

Clonotypes Overview

This section contains an overview of the clonotypes identified in all the datasets across all receptor chain types.

Chain Usage Summary

This section counts the number of reads (nRead) and the number of clonotypes (nClons) in each dataset. The number of clonotypes is further broken down into the receptor chain type to display the repertoire chain composition.

Samples	nRead	nClons	IGH	IGK	IGL	TRA	TRB	TRD	TRG
Control_1	8952881	98216	21390	9896	11301	22996	31730	271	632
Control_2	7328409	97340	20169	9600	10761	23382	32609	244	574
Control_3	7205548	82635	17026	8137	8732	19783	28307	193	457
Immunized_Day5_1	11659613	120534	24569	11409	12652	29417	41619	243	625
Immunized_Day5_2	10644001	137955	27631	12229	13701	34281	49182	312	618
Immunized_Day8_1	27375362	95489	23479	11088	13193	20047	26907	232	543
Immunized_Day8_2	25610902	96094	23480	11246	13451	20180	26985	213	539
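
As a small worked example of the chain breakdown, the sketch below takes the Control_1 row from the table above, checks that the per-chain counts add up to nClons, and converts them to fractions of the repertoire. The function and variable names are hypothetical; the numbers are copied from the table.

```python
CHAINS = ["IGH", "IGK", "IGL", "TRA", "TRB", "TRD", "TRG"]

# Control_1 row of the Chain Usage Summary table.
control_1 = {
    "nRead": 8952881, "nClons": 98216,
    "IGH": 21390, "IGK": 9896, "IGL": 11301,
    "TRA": 22996, "TRB": 31730, "TRD": 271, "TRG": 632,
}


def chain_composition(row: dict) -> dict:
    """Fraction of clonotypes contributed by each receptor chain."""
    total = sum(row[chain] for chain in CHAINS)
    return {chain: row[chain] / total for chain in CHAINS}


chain_total = sum(control_1[chain] for chain in CHAINS)
fractions = chain_composition(control_1)

print(chain_total == control_1["nClons"])
for chain in CHAINS:
    print(f"{chain}: {fractions[chain]:.1%}")
```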

Total Number of Clonotypes

Chain Composition in the Repertoire

IGH

Repertoire Statistics

This section shows repertoire statistical measures in each sample. Described in this section are the total number of unique clonotypes, the sum of all UMI counts for each clonotype, the distribution of clonotype abundance, and the distribution of clonotype CDR3 length.

Unique clonotype in each sample

Total clonotype counts in each sample

Clonotype Abundance

CDR3 Region Length

Top Clonotypes

This section outlines the most abundant clonotypes across the entire dataset. The top 100 clonotypes are identified by taking the mean of each clonotype's frequency across all datasets and sorting in descending order. The top 100 are plotted on a heatmap. For visualization purposes, the top 20 clonotypes are also plotted as a barplot. The final plot in this section shows the percent occupancy of clonotype indices 1-10, 11-100, 101-500, etc.
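
The ranking rule described above (mean frequency across datasets, sorted in descending order) can be sketched as follows. This is an illustration only, not the report's actual R code; the function name is hypothetical, and the four example rows are IGH clonotype frequencies from this section, entered out of order on purpose.

```python
from statistics import mean


def top_clonotypes(freq_by_clonotype: dict, n: int) -> list:
    """Rank clonotypes by mean frequency across samples, highest first."""
    ranked = sorted(freq_by_clonotype.items(),
                    key=lambda item: mean(item[1]),
                    reverse=True)
    return [cdr3 for cdr3, _ in ranked[:n]]


# Frequencies per sample: Control_1..3, Immunized_Day5_1..2, Immunized_Day8_1..2
igh_frequencies = {
    "CTRGWGLDITLVRFDYW": [0.0000136, 0.0000147, 0.00035, 0.0289, 0.0288, 0.0133, 0.0135],
    "CTTDGIHCHASFDYW": [0, 0, 0, 0.0258, 0.0204, 0.0195, 0.0195],
    "CARTNSFDVW": [0.000122, 0.000147, 0.0000184, 0.0583, 0.0622, 0.0608, 0.0612],
    "CAYSGLEGWDTVMAGNFDYW": [0, 0, 0.000184, 0.00471, 0.00587, 0.0398, 0.0385],
}

top3 = top_clonotypes(igh_frequencies, 3)
print(top3)
```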

Most Abundant Clonotypes

Frequency of Most Abundant Clonotypes

	CDR3.aa	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CARTNSFDVW	0.000122	0.000147	0.0000184	0.0583	0.0622	0.0608	0.0612
2	CAYSGLEGWDTVMAGNFDYW	0	0	0.000184	0.00471	0.00587	0.0398	0.0385
3	CTTDGIHCHASFDYW	0	0	0	0.0258	0.0204	0.0195	0.0195
4	CTRGWGLDITLVRFDYW	0.0000136	0.0000147	0.00035	0.0289	0.0288	0.0133	0.0135
5	CTPGSYYKSRGYW	0	0	0.00035	0.00486	0.0052	0.0303	0.0292
6	CARADVALPATMNHW	0	0	0.000443	0.00139	0.00118	0.0334	0.0327
7	CAKDFMGTIPDQFDCW	0	0	0.000184	0.0111	0.0105	0.0231	0.022
8	CTPGSSYKSRGYW	0	0	0	0.00609	0.00529	0.0272	0.0277
9	CGDYHHRGSFPPW	0.000354	0	0.000314	0.0103	0.00841	0.0204	0.0205
10	CARSNAFDVW	0	0	0	0.0176	0.0177	0.0121	0.0123

Showing 1 to 10 of 100 entries

Top 20 Clonotypes

Repertoire Space by Clonotype Index

Repertoire Overlap

This section shows overlap of repertoires between samples. The two metrics used are the overlap of public (or shared) clonotypes and the Morisita overlap index.
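
For readers unfamiliar with the two overlap measures, the sketch below computes the number of public clonotypes and the classic (textbook) Morisita overlap index from clonotype counts. It is not necessarily identical to the implementation used to produce this report; the function names and toy repertoires are assumptions for illustration.

```python
def public_clonotypes(x: dict, y: dict) -> int:
    """Number of clonotypes present in both repertoires."""
    return len(set(x) & set(y))


def simpson_index(counts: list) -> float:
    """Simpson's index: sum n(n-1) / (N(N-1)). Requires N >= 2."""
    total = sum(counts)
    return sum(c * (c - 1) for c in counts) / (total * (total - 1))


def morisita_overlap(x: dict, y: dict) -> float:
    """Classic Morisita overlap index between two count dictionaries.

    Undefined (division by zero) if every clonotype in both samples
    is a singleton.
    """
    cross = sum(x[k] * y[k] for k in set(x) & set(y))
    total_x, total_y = sum(x.values()), sum(y.values())
    d_x = simpson_index(list(x.values()))
    d_y = simpson_index(list(y.values()))
    return 2 * cross / ((d_x + d_y) * total_x * total_y)


# Toy repertoires (clonotype -> count), invented for illustration.
sample_a = {"CDR3_A": 4, "CDR3_B": 2}
sample_b = {"CDR3_A": 2, "CDR3_B": 4}
sample_c = {"CDR3_C": 3, "CDR3_D": 3}

overlap_ab = morisita_overlap(sample_a, sample_b)
overlap_ac = morisita_overlap(sample_a, sample_c)
print(public_clonotypes(sample_a, sample_b), round(overlap_ab, 4))
print(public_clonotypes(sample_a, sample_c), overlap_ac)
```

Unlike a plain count of shared clonotypes, the Morisita index also weighs how similar the abundances of the shared clonotypes are, so two samples can share every clonotype and still score below the maximum.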

Gene Usage Statistics

This section quantifies the usage of receptor genes in the repertoire. The first figure identifies the top 10 used genes by taking the mean value of usage of each gene across datasets and ranking the genes based on this mean value. The accompanying table includes the usage of all observed receptor genes.

Top 10 Used Genes

Gene Usage

	Names	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	IGHV3-21	0.0741	0.0727	0.0675	0.0755	0.0717	0.103	0.0968
2	IGHV3-23	0.0753	0.0732	0.0739	0.0763	0.0725	0.0902	0.0905
3	IGHV3-21, IGHV3-7	0.0714	0.0743	0.0674	0.0703	0.07	0.0543	0.0592
4	IGHV5-51	0.0625	0.0632	0.0612	0.0606	0.0626	0.0475	0.0458
5	IGHV1-69	0.0552	0.0496	0.0553	0.0528	0.052	0.0407	0.0495
6	IGHV4-34	0.0545	0.0568	0.0549	0.0516	0.0538	0.0374	0.0388
7	IGHV3-30-3	0.0497	0.0486	0.046	0.0488	0.0504	0.0411	0.0392
8	IGHV4-59, IGHV4-61	0.0466	0.05	0.0501	0.046	0.0482	0.0378	0.0368
9	IGHV3-15	0.0355	0.0339	0.0364	0.0349	0.0359	0.058	0.0586
10	IGHV1-2	0.0345	0.0353	0.0373	0.0363	0.0356	0.0421	0.0423

Showing 1 to 10 of 506 entries

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.

Gene Usage Overlap

This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.
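
A minimal sketch of the Jensen-Shannon divergence between two gene usage profiles is given below, using base-2 logarithms so the value lies between 0 (identical usage) and 1 (no shared genes). The function names and the toy usage vectors are assumptions for illustration; the report's own figures are computed over all observed genes, not a toy subset.

```python
import math


def normalize(values: list) -> list:
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def kl_divergence(p: list, q: list) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p: list, q: list) -> float:
    """Symmetric divergence between two distributions over the same genes."""
    p, q = normalize(p), normalize(q)
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)


# Toy usage vectors over the same three genes, invented for illustration.
usage_a = [0.5, 0.3, 0.2]
usage_b = [0.2, 0.3, 0.5]

jsd_same = jensen_shannon_divergence(usage_a, usage_a)
jsd_diff = jensen_shannon_divergence(usage_a, usage_b)
jsd_disjoint = jensen_shannon_divergence([1, 0], [0, 1])
print(jsd_same, jsd_diff, jsd_disjoint)
```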

Diversity Metrics

This section quantifies commonly used metrics in species (i.e. clonotype) diversity. Displayed here are the Chao1 diversity index, the D50 diversity index and the true diversity measure.
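
The three diversity measures can be sketched from a list of clonotype counts as shown below. These are common textbook definitions and may differ in detail from the package used for this report: Chao1 is taken as S_obs + f1^2 / (2 f2) with the bias-corrected form when there are no doubletons, D50 as the number of top clonotypes needed to cover 50% of the repertoire, and true diversity as the Hill number of order q. All names and toy counts are assumptions.

```python
import math


def chao1(counts: list) -> float:
    """Chao1 richness estimate from singleton (f1) and doubleton (f2) counts."""
    observed = sum(1 for c in counts if c > 0)
    f1, f2 = counts.count(1), counts.count(2)
    if f2 > 0:
        return observed + f1 * f1 / (2 * f2)
    return observed + f1 * (f1 - 1) / 2  # bias-corrected form when f2 == 0


def d50(counts: list) -> int:
    """Number of most abundant clonotypes covering at least 50% of counts."""
    half, running = sum(counts) / 2, 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= half:
            return rank
    return 0


def true_diversity(counts: list, q: float = 1.0) -> float:
    """Hill number of order q (effective number of clonotypes)."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-12:  # limit q -> 1 is exp(Shannon entropy)
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1 / (1 - q))


toy_counts = [5, 3, 2, 1, 1]  # invented for illustration
print(chao1(toy_counts), d50(toy_counts), true_diversity(toy_counts, q=1))
```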

Kmer Analysis

This section identifies highly represented kmers (5-mer) across all repertoires. The top kmers are identified as the most abundant kmers across the entire experiment.
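
Counting 5-mers amounts to sliding a window of five residues along each CDR3 amino acid sequence and tallying what is seen. The sketch below shows the idea on two IGH CDR3 sequences from this report; it counts each clonotype once and ignores abundance, which is an assumption and may differ from how the report's values were calculated. Function names are hypothetical.

```python
from collections import Counter


def kmers(sequence: str, k: int = 5) -> list:
    """All overlapping substrings of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def count_kmers(sequences: list, k: int = 5) -> Counter:
    """Tally k-mers over a list of CDR3 sequences (one count per occurrence)."""
    counts = Counter()
    for sequence in sequences:
        counts.update(kmers(sequence, k))
    return counts


# Two closely related IGH CDR3 sequences from the Top Clonotypes table.
cdr3_sequences = ["CTPGSYYKSRGYW", "CTPGSSYKSRGYW"]

kmer_counts = count_kmers(cdr3_sequences)
for kmer, count in kmer_counts.most_common(4):
    print(kmer, count)
```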

Top 20 kmers visualized

Top 100 Kmer abundance

	Kmer	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	GMDVW	4160	3922	3310	4760	5587	3282	3460
2	YGMDV	3522	3302	2861	4101	4746	2660	2765
3	YYGMD	2888	2686	2342	3312	3915	2095	2218
4	YYYGM	2383	2256	1975	2759	3323	1736	1831
5	YYYYG	1675	1601	1406	1948	2395	1193	1316
6	YFDYW	1534	1477	1263	1905	2105	1539	1518
7	WFDPW	1638	1508	1311	1758	2112	1239	1257
8	AFDIW	1446	1349	1180	1567	1797	1254	1250
9	YDSSG	850	748	699	901	1151	569	563
10	YYFDY	700	713	590	876	955	723	702

Showing 1 to 10 of 100 entries

Table: Each value is the calculated number of times each kmer is present in each sample. The rows are ordered to show more frequently identified kmers first.

IGK

Repertoire Statistics

This section shows repertoire statistical measures in each sample. Described in this section are the total number of unique clonotypes, the sum of all UMI counts for each clonotype, the distribution of clonotype abundance, and the distribution of clonotype CDR3 length.

Unique clonotype in each sample

Total clonotype counts in each sample

Clonotype Abundance

CDR3 Region Length

Top Clonotypes

This section outlines the most abundant clonotypes across the entire dataset. The top 100 clonotypes are identified by taking the mean of each clonotype's frequency across all datasets and sorting in descending order. The top 100 are plotted on a heatmap. For visualization purposes, the top 20 clonotypes are also plotted as a barplot. The final plot in this section shows the percent occupancy of clonotype indices 1-10, 11-100, 101-500, etc.

Most Abundant Clonotypes

Frequency of Most Abundant Clonotypes

	CDR3.aa	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CQQYGNSPFTF	0.000176	0.000418	0.000325	0.0605	0.0586	0.0613	0.0643
2	CQQYGSSPFTF	0.00142	0.00079	0.000839	0.026	0.0253	0.0221	0.0222
3	CQKYNSPPHTF	0	0	0.000445	0.034	0.0333	0.0136	0.0135
4	CQQYGNTPFTF	0	0.0000155	0	0.0166	0.0221	0.0275	0.0284
5	CQQAYFIPRTF	0	0	0	0.0219	0.0165	0.0262	0.0259
6	CQQYYAPPAAF	0	0	0.0794	0	0	0	0
7	CQQRATWPLTF	0	0	0.000428	0.00255	0.00347	0.0349	0.033
8	CQHHIPGITF	0	0	0	0.00253	0.0021	0.0335	0.0343
9	CMQGTHWPRTF	0.000483	0.00173	0.00115	0.00249	0.00285	0.0318	0.0305
10	CMQGTHWPYTL	0.000644	0.0000155	0.000514	0.00817	0.0064	0.0242	0.0254

Showing 1 to 10 of 100 entries

Top 20 Clonotypes

Repertoire Space by Clonotype Index

Repertoire Overlap

This section shows overlap of repertoires between samples. The two metrics used are the overlap of public (or shared) clonotypes and the Morisita overlap index.

Gene Usage Statistics

This section quantifies the usage of receptor genes in the repertoire. The first figure identifies the top 10 used genes by taking the mean value of usage of each gene across datasets and ranking the genes based on this mean value. The accompanying table includes the usage of all observed receptor genes.

Top 10 Used Genes

Gene Usage

	Names	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	IGKV3-20, IGKV3D-20	0.195	0.183	0.178	0.185	0.176	0.176	0.174
2	IGKV1-39, IGKV1D-39	0.117	0.121	0.119	0.121	0.123	0.117	0.124
3	IGKV4-1	0.0877	0.0982	0.0986	0.0969	0.094	0.0958	0.101
4	IGKV3-11	0.0684	0.0851	0.0687	0.073	0.065	0.0744	0.0736
5	IGKV3-15	0.0688	0.0776	0.0663	0.0708	0.0673	0.0729	0.0702
6	IGKV1-5	0.0719	0.0722	0.073	0.07	0.0714	0.0683	0.0671
7	IGKV2-28, IGKV2D-28	0.0448	0.0433	0.0495	0.0449	0.0469	0.04	0.0417
8	IGKV1-33, IGKV1D-33	0.0372	0.04	0.0383	0.0394	0.043	0.045	0.0447
9	IGKV2-30, IGKV2D-30	0.0323	0.0189	0.032	0.0303	0.0301	0.0377	0.0382
10	IGKV1-27, IGKV1D-27	0.0231	0.0265	0.0236	0.0242	0.0257	0.033	0.0313

Showing 1 to 10 of 153 entries

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.

Gene Usage Overlap

This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.

Diversity Metrics

This section quantifies commonly used metrics in species (i.e. clonotype) diversity. Displayed here are the Chao1 diversity index, the D50 diversity index and the true diversity measure.

Kmer Analysis

This section identifies highly represented kmers (5-mer) across all repertoires. The top kmers are identified as the most abundant kmers across the entire experiment.

Top 20 kmers visualized

Top 100 Kmer abundance

	Kmer	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CQQYN	973	970	877	1126	1177	917	889
2	CQQYG	1015	932	812	1112	1165	901	907
3	CQQYY	648	707	628	872	907	751	809
4	CQQSY	676	684	592	796	897	678	725
5	QQYGS	688	643	562	763	785	547	561
6	QYGSS	656	604	531	723	750	525	522
7	CQQRS	525	481	474	585	638	479	471
8	QQSYS	495	508	435	574	640	475	500
9	CQQYD	437	454	370	488	562	470	479
10	QQYYS	434	453	411	535	554	403	428

Showing 1 to 10 of 100 entries

Table: Each value is the calculated number of times each kmer is present in each sample. The rows are ordered to show more frequently identified kmers first.

IGL

Repertoire Statistics

This section shows repertoire statistical measures in each sample. Described in this section are the total number of unique clonotypes, the sum of all UMI counts for each clonotype, the distribution of clonotype abundance, and the distribution of clonotype CDR3 length.

Unique clonotype in each sample

Total clonotype counts in each sample

Clonotype Abundance

CDR3 Region Length

Top Clonotypes

This section outlines the most abundant clonotypes across the entire dataset. The top 100 clonotypes are identified by taking the mean of each clonotype's frequency across all datasets and sorting in descending order. The top 100 are plotted on a heatmap. For visualization purposes, the top 20 clonotypes are also plotted as a barplot. The final plot in this section shows the percent occupancy of clonotype indices 1-10, 11-100, 101-500, etc.

Most Abundant Clonotypes

Frequency of Most Abundant Clonotypes

	CDR3.aa	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CLLSYNDGWVF	0.0000147	0	0.0000434	0.0207	0.0169	0.0822	0.0858
2	CLLSYSDGWVF	0	0	0.000825	0.0135	0.0167	0.0715	0.0717
3	CCSYTSSYTYVF	0	0	0.000738	0.00323	0.00294	0.0728	0.0715
4	CLLSYSNAWVF	0	0	0	0.0023	0.00234	0.0544	0.0541
5	CCSNRGIPTLVF	0	0	0.000391	0.0185	0.0158	0.0346	0.0323
6	CSAWDSSLSAWVF	0.00209	0.00211	0.00269	0.0174	0.0197	0.021	0.0208
7	CFLSYSDGWVF	0.00104	0	0	0.0187	0.0168	0.0153	0.0159
8	CQAWDSSTVVF	0.00544	0.00564	0.00528	0.00309	0.0047	0.0213	0.0208
9	CLLSYYNGWVF	0	0	0.0000217	0.00555	0.00322	0.0283	0.0288
10	CSSHAGDNNLGVF	0	0	0.000391	0.00322	0.00155	0.0294	0.0281

Showing 1 to 10 of 100 entries

Top 20 Clonotypes

Repertoire Space by Clonotype Index

Repertoire Overlap

This section shows overlap of repertoires between samples. The two metrics used are the overlap of public (or shared) clonotypes and the Morisita overlap index.

Gene Usage Statistics

This section quantifies the usage of receptor genes in the repertoire. The first figure identifies the top 10 used genes by taking the mean value of usage of each gene across datasets and ranking the genes based on this mean value. The accompanying table includes the usage of all observed receptor genes.

Top 10 Used Genes

Gene Usage

	Names	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	IGLV2-14	0.122	0.12	0.127	0.13	0.138	0.153	0.154
2	IGLV1-40	0.0838	0.0801	0.0792	0.078	0.0792	0.058	0.0599
3	IGLV3-1	0.0685	0.0626	0.0725	0.0717	0.0703	0.082	0.0822
4	IGLV1-36, IGLV1-44	0.0716	0.0757	0.0743	0.0742	0.0682	0.0619	0.062
5	IGLV1-51	0.063	0.0613	0.0643	0.0663	0.0558	0.0698	0.0685
6	IGLV3-21	0.0652	0.0654	0.0674	0.0616	0.0664	0.0477	0.0493
7	IGLV2-8	0.049	0.0494	0.0521	0.0506	0.0476	0.0845	0.079
8	IGLV3-25	0.0559	0.0634	0.057	0.0615	0.0607	0.0426	0.0458
9	IGLV2-23	0.0505	0.0538	0.0526	0.0477	0.0551	0.0433	0.0454
10	IGLV2-11	0.049	0.0451	0.0411	0.0444	0.0433	0.049	0.0473

Showing 1 to 10 of 147 entries

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.

Gene Usage Overlap

This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.
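To make the dissimilarity measure concrete, here is a minimal Python sketch of the Jensen-Shannon divergence between two gene usage profiles. It uses base-2 logarithms, so the result lies between 0 (identical usage) and 1 (no genes in common); the log base used by the pipeline is not stated in this report, and the function name is hypothetical. The example values are the Control_1 and Immunized_Day8_1 usage of the three most used IGL genes, renormalized over those three genes only, so the result will not match the figure in the report.

```python
import math

def jensen_shannon_divergence(usage_p, usage_q):
    """JSD (base-2) between two {gene: usage} profiles."""
    sum_p = sum(usage_p.values())
    sum_q = sum(usage_q.values())
    jsd = 0.0
    for gene in usage_p.keys() | usage_q.keys():
        p = usage_p.get(gene, 0) / sum_p   # normalize to a distribution
        q = usage_q.get(gene, 0) / sum_q
        m = (p + q) / 2                    # mixture distribution
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd

control_1 = {"IGLV2-14": 0.122, "IGLV1-40": 0.0838, "IGLV3-1": 0.0685}
day8_1 = {"IGLV2-14": 0.153, "IGLV1-40": 0.058, "IGLV3-1": 0.082}

divergence = jensen_shannon_divergence(control_1, day8_1)
```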

Diversity Metrics

This section reports metrics commonly used to quantify species (here, clonotype) diversity: the Chao1 diversity index, the D50 diversity index and the true diversity measure.
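The three metrics can be sketched from a list of per-clonotype counts as follows. This is an illustration under stated assumptions rather than the pipeline's code: Chao1 uses the classic singleton/doubleton estimator, D50 is taken to be the number of top-ranked clonotypes needed to cover 50% of the total count, and true diversity is the Hill number of order `q` (the order used in the report is not stated, so `q=2` here is an arbitrary choice). The counts are hypothetical.

```python
import math
from collections import Counter

def chao1(counts):
    """Chao1 richness estimate from per-clonotype counts."""
    observed = len(counts)
    freq_of_freq = Counter(counts)
    f1, f2 = freq_of_freq[1], freq_of_freq[2]   # singletons, doubletons
    if f2 > 0:
        return observed + f1 * f1 / (2 * f2)
    return observed + f1 * (f1 - 1) / 2         # bias-corrected form when f2 == 0

def d50(counts):
    """Number of most abundant clonotypes covering 50% of the total count."""
    half = sum(counts) / 2
    running = 0
    for rank, n in enumerate(sorted(counts, reverse=True), start=1):
        running += n
        if running >= half:
            return rank
    return 0

def true_diversity(counts, q=2):
    """Hill number of order q (effective number of clonotypes)."""
    total = sum(counts)
    props = [n / total for n in counts]
    if q == 1:
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1 / (1 - q))

counts = [50, 20, 10, 5, 2, 2, 1, 1, 1]   # hypothetical UMI counts
richness = chao1(counts)
```

A repertoire dominated by a few expanded clones has a low D50 and a low true diversity even when Chao1 suggests many rare clonotypes remain unobserved.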

Kmer Analysis

This section identifies highly represented kmers (5-mers) across all repertoires. The top kmers are those that are most abundant across the entire experiment.
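Counting 5-mers amounts to sliding a window of length 5 along each CDR3 amino-acid sequence. The sketch below assumes each clonotype contributes once per occurrence of a kmer; whether the pipeline additionally weights by clonotype abundance is not stated in this report. The function name is hypothetical and the two sequences are taken from the clonotype table of this chain.

```python
from collections import Counter

def count_kmers(cdr3_sequences, k=5):
    """Count every length-k substring across the given CDR3 sequences."""
    counts = Counter()
    for seq in cdr3_sequences:
        # Sequences shorter than k produce no windows.
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts

kmers = count_kmers(["CLLSYNDGWVF", "CLLSYSDGWVF"])
shared_kmers = [kmer for kmer, n in kmers.items() if n == 2]
```

Here the two sequences differ at a single position, so only the windows that avoid that position (the leading `CLLSY` and trailing `DGWVF`) are counted twice.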

Top 20 kmers visualized

Top 100 Kmer abundance

	Kmer	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CSSYT	832	780	655	1047	1207	1239	1260
2	CQSYD	902	849	665	956	1028	759	758
3	WDDSL	851	740	646	907	924	738	786
4	CAAWD	759	682	558	828	849	695	737
5	CCSYA	751	685	519	750	871	718	720
6	SYAGS	719	653	536	753	862	734	742
7	CSYAG	688	610	470	672	796	630	646
8	AAWDD	671	597	491	738	748	561	606
9	DSSLS	611	537	467	681	718	622	660
10	SSYTS	585	535	487	681	786	603	574

Showing 1 to 10 of 100 entries

Table: Each value is the calculated number of times each kmer is present in each sample. The rows are ordered to show more frequently identified kmers first.

TRAD

Repertoire Statistics

This section shows repertoire statistical measures in each sample. Described in this section are the total number of unique clonotypes, the sum of all UMI counts for each clonotype, the distribution of clonotype abundance, and the distribution of clonotype CDR3 length.
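The per-sample statistics listed above can be sketched as follows, assuming a sample is available as a list of (CDR3 sequence, UMI count) pairs. The function name and the UMI counts are hypothetical; the CDR3 sequences are taken from this report. The length distribution here counts each clonotype once, which may differ from how the report's figure is weighted.

```python
from collections import Counter

def repertoire_stats(clonotypes):
    """Summarize one sample given (cdr3_aa, umi_count) pairs.

    Returns (number of unique clonotypes, total UMI count,
    {CDR3 length: number of clonotypes}).
    """
    unique = len({cdr3 for cdr3, _ in clonotypes})
    total_umis = sum(count for _, count in clonotypes)
    length_distribution = Counter(len(cdr3) for cdr3, _ in clonotypes)
    return unique, total_umis, dict(length_distribution)

sample = [
    ("CAVSPGGYQKVTF", 120),       # hypothetical UMI counts
    ("CVVNSGGYQKVTF", 30),
    ("CALRVRPGRSGGSYIPTF", 25),
]
unique, total_umis, length_distribution = repertoire_stats(sample)
```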

Unique clonotype in each sample

Total clonotype counts in each sample

Clonotype Abundance

CDR3 Region Length

Top Clonotypes

This section outlines the most abundant clonotypes across the entire dataset. The top 100 clonotypes are identified by taking the mean frequency of each clonotype across all samples and sorting the clonotypes in descending order of that mean. The top 100 are plotted on a heatmap and, for readability, the top 20 are also plotted as a barplot. The final plot in this section shows the percentage of the repertoire occupied by clonotypes at rank indices 1-10, 11-100, 101-500, etc.
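The occupancy plot can be understood as binning clonotypes by abundance rank and summing each bin's share of the repertoire. The sketch below is a simplified illustration: the report lists the bins 1-10, 11-100 and 101-500 followed by "etc.", so collapsing all remaining ranks into a single open-ended bin is an assumption made here, as are the function name and the counts.

```python
def occupancy_by_rank(counts, edges=(10, 100, 500)):
    """Percent of total count held by clonotypes in each rank bin."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    occupancy = {}
    start = 0
    for end in edges:
        occupancy[f"{start + 1}-{end}"] = 100 * sum(ranked[start:end]) / total
        start = end
    # Everything ranked beyond the last edge goes into one open bin.
    occupancy[f"{start + 1}+"] = 100 * sum(ranked[start:]) / total
    return occupancy

# Hypothetical sample: 10 expanded clonotypes and 100 singletons.
counts = [10] * 10 + [1] * 100
occupancy = occupancy_by_rank(counts)
```

In this made-up sample the ten largest clonotypes hold half of all counts, which is the kind of skew the plot is designed to expose.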

Most Abundant Clonotypes

Frequency of Most Abundant Clonotypes

	CDR3.aa	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CAVSPGGYQKVTF	0.138	0.138	0.0845	0.108	0.109	0.13	0.133
2	CVVNSGGYQKVTF	0.0316	0.0285	0.0201	0.0224	0.0239	0.0262	0.0278
3	CAVFMDSNYQLIW	0.0291	0.0249	0.018	0.0221	0.0202	0.0203	0.0196
4	CALRVRPGRSGGSYIPTF	0.0274	0.0251	0.0145	0.0195	0.0187	0.0233	0.0254
5	CAENVQAGTALIF	0.0194	0.0195	0.0149	0.0158	0.0157	0.0165	0.0171
6	CAVNAPGGSQGNLIF	0.019	0.0163	0.0141	0.0158	0.0148	0.0169	0.0168
7	CGTDNIPNTGFQKLVF	0.019	0.0164	0.0125	0.0141	0.0135	0.0171	0.0155
8	CVVYTGRRALTF	0.017	0.0158	0.0139	0.0157	0.0147	0.0144	0.0162
9	CACDTLGTTRADKLIF	0.0186	0.0173	0.013	0.0137	0.0141	0.0138	0.0145
10	CACDKWGIRADKLIF	0.0152	0.0135	0.0112	0.0111	0.011	0.0118	0.0109

Showing 1 to 10 of 100 entries

Top 20 Clonotypes

Repertoire Space by Clonotype Index

Repertoire Overlap

This section shows overlap of repertoires between samples. The two metrics used are the overlap of public (or shared) clonotypes and the Morisita overlap index.

Gene Usage Statistics

This section quantifies the usage of receptor genes in the repertoire. The first figure identifies the top 10 used genes by taking the mean value of usage of each gene across datasets and ranking the genes based on this mean value. The accompanying table includes the usage of all observed receptor genes.

Top 10 Used Genes

Gene Usage

	Names	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	TRAV13-1	0.0656	0.0647	0.0653	0.0655	0.0638	0.0611	0.0635
2	TRAV9-2	0.0647	0.0619	0.0612	0.0629	0.0636	0.065	0.0672
3	TRAV12-1	0.0524	0.0498	0.0489	0.0466	0.0482	0.0457	0.0494
4	TRAV12-2	0.0448	0.047	0.0456	0.0467	0.0458	0.0456	0.0447
5	TRAV17	0.0443	0.0456	0.0485	0.0422	0.0434	0.0447	0.0433
6	TRAV12-3	0.0435	0.0422	0.0413	0.0442	0.0439	0.0403	0.0406
7	TRAV21	0.0412	0.0421	0.0449	0.0422	0.0415	0.0401	0.0421
8	TRAV29DV5	0.0369	0.0376	0.042	0.0388	0.0389	0.0381	0.0379
9	TRAV8-4	0.0373	0.0375	0.0359	0.0368	0.0367	0.0333	0.0346
10	TRAV23DV6	0.0315	0.0324	0.0289	0.0298	0.0305	0.0297	0.0292

Showing 1 to 10 of 89 entries

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.

Gene Usage Overlap

This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.

Diversity Metrics

This section reports metrics commonly used to quantify species (here, clonotype) diversity: the Chao1 diversity index, the D50 diversity index and the true diversity measure.

Kmer Analysis

This section identifies highly represented kmers (5-mers) across all repertoires. The top kmers are those that are most abundant across the entire experiment.

Top 20 kmers visualized

Top 100 Kmer abundance

	Kmer	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	GKLIF	1296	1302	1020	1636	1800	1127	1076
2	NKLTF	918	933	841	1171	1452	830	793
3	GNKLT	914	930	839	1166	1449	826	792
4	QKLVF	876	923	805	1152	1321	824	818
5	NKLIF	862	887	751	1112	1293	715	767
6	TGKLI	745	743	580	921	1054	662	627
7	NQFYF	721	715	640	928	1090	585	617
8	NTGKL	704	717	549	884	1012	631	607
9	GNQFY	693	685	614	896	1050	567	593
10	GNLIF	678	651	581	848	1044	537	593

Showing 1 to 10 of 100 entries

Table: Each value is the calculated number of times each kmer is present in each sample. The rows are ordered to show more frequently identified kmers first.

TRB

Repertoire Statistics

This section shows repertoire statistical measures in each sample. Described in this section are the total number of unique clonotypes, the sum of all UMI counts for each clonotype, the distribution of clonotype abundance, and the distribution of clonotype CDR3 length.

Unique clonotype in each sample

Total clonotype counts in each sample

Clonotype Abundance

CDR3 Region Length

Top Clonotypes

This section outlines the most abundant clonotypes across the entire dataset. The top 100 clonotypes are identified by taking the mean frequency of each clonotype across all samples and sorting the clonotypes in descending order of that mean. The top 100 are plotted on a heatmap and, for readability, the top 20 are also plotted as a barplot. The final plot in this section shows the percentage of the repertoire occupied by clonotypes at rank indices 1-10, 11-100, 101-500, etc.

Most Abundant Clonotypes

Frequency of Most Abundant Clonotypes

	CDR3.aa	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CATSDPSGGALETQYF	0.0649	0.064	0.0368	0.0501	0.0501	0.0656	0.0658
2	CASSWREFEQYF	0.0439	0.0344	0.023	0.0289	0.0282	0.0343	0.0351
3	CASSQAAGGHYNEQFF	0.0345	0.0306	0.0231	0.0262	0.026	0.0268	0.0279
4	CASSLVGRDYNEQFF	0.0294	0.024	0.0133	0.0206	0.02	0.0235	0.0232
5	CSATYQGPSDEQFF	0.0247	0.0236	0.0143	0.0176	0.0179	0.0215	0.0219
6	CASSLGQRTDTQYF	0.0229	0.0205	0.0171	0.0163	0.0164	0.0188	0.0184
7	CAWRVTWGTEAFF	0.0192	0.0181	0.0125	0.0142	0.0148	0.0176	0.0172
8	CASRDRLGLTYEQYF	0.0123	0.0105	0.00569	0.00772	0.00762	0.00863	0.00922
9	CASSPRLAVGLQETQYF	0.0117	0.00996	0.00609	0.00702	0.00664	0.00791	0.00876
10	CASSQRNRETQYF	0.00819	0.00764	0.00496	0.00547	0.00606	0.00677	0.00804

Showing 1 to 10 of 100 entries

Top 20 Clonotypes

Repertoire Space by Clonotype Index

Repertoire Overlap

This section shows overlap of repertoires between samples. The two metrics used are the overlap of public (or shared) clonotypes and the Morisita overlap index.

Gene Usage Statistics

This section quantifies the usage of receptor genes in the repertoire. The first figure identifies the top 10 used genes by taking the mean value of usage of each gene across datasets and ranking the genes based on this mean value. The accompanying table includes the usage of all observed receptor genes.

Top 10 Used Genes

Gene Usage

	Names	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	TRBV20-1	0.0801	0.0817	0.0822	0.0867	0.0853	0.0933	0.0939
2	TRBV5-1	0.0705	0.0704	0.0732	0.072	0.0727	0.07	0.0691
3	TRBV12-3, TRBV12-4	0.0638	0.0646	0.0637	0.0639	0.0656	0.0703	0.0706
4	TRBV7-2	0.0587	0.0605	0.0613	0.0601	0.0599	0.0568	0.06
5	TRBV29-1	0.0572	0.0579	0.0558	0.055	0.0547	0.0542	0.0514
6	TRBV7-9	0.0503	0.0505	0.0508	0.0501	0.0511	0.0495	0.0481
7	TRBV19	0.0502	0.0476	0.0475	0.0495	0.0478	0.0497	0.0492
8	TRBV2	0.0327	0.0318	0.0325	0.0321	0.0323	0.0336	0.0337
9	TRBV11-2	0.031	0.0321	0.033	0.0314	0.031	0.0298	0.0304
10	TRBV6-1	0.0294	0.0298	0.03	0.0299	0.0295	0.0284	0.0274

Showing 1 to 10 of 140 entries

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.

Gene Usage Overlap

This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.
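For the correlation metric, a plain Pearson correlation between two samples' gene usage vectors is one natural reading; the report does not state which correlation coefficient the pipeline uses, so treat the choice below as an assumption. The values are the Control_1 and Immunized_Day8_1 columns of the ten TRB genes shown in the gene usage table of this chain, and the function name is hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Usage of TRBV20-1 ... TRBV6-1, in table order.
control_1 = [0.0801, 0.0705, 0.0638, 0.0587, 0.0572,
             0.0503, 0.0502, 0.0327, 0.031, 0.0294]
day8_1 = [0.0933, 0.07, 0.0703, 0.0568, 0.0542,
          0.0495, 0.0497, 0.0336, 0.0298, 0.0284]

r = pearson(control_1, day8_1)
```

Restricting the vectors to the ten most used genes inflates the apparent agreement; a full comparison would use all observed genes.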

Diversity Metrics

This section reports metrics commonly used to quantify species (here, clonotype) diversity: the Chao1 diversity index, the D50 diversity index and the true diversity measure.

Kmer Analysis

This section identifies highly represented kmers (5-mers) across all repertoires. The top kmers are those that are most abundant across the entire experiment.

Top 20 kmers visualized

Top 100 Kmer abundance

	Kmer	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CASSL	5483	5651	5011	7187	8566	4519	4623
2	YEQYF	3541	3639	3192	4692	5458	3107	3075
3	NEQFF	3494	3692	3258	4659	5426	3063	3071
4	DTQYF	3141	3341	2802	4251	5044	2694	2655
5	CASSP	2854	2901	2447	3609	4315	2307	2312
6	TEAFF	2746	2751	2484	3552	4359	2303	2284
7	ETQYF	2560	2619	2248	3340	3923	2110	2141
8	GELFF	2433	2368	2207	3081	3716	1926	1988
9	CASSQ	2318	2445	2146	3214	3749	1948	1886
10	QPQHF	2085	2101	1800	2721	3108	1730	1729

Showing 1 to 10 of 100 entries

Table: Each value is the calculated number of times each kmer is present in each sample. The rows are ordered to show more frequently identified kmers first.

TRG

Repertoire Statistics

This section shows repertoire statistical measures in each sample. Described in this section are the total number of unique clonotypes, the sum of all UMI counts for each clonotype, the distribution of clonotype abundance, and the distribution of clonotype CDR3 length.

Unique clonotype in each sample

Total clonotype counts in each sample

Clonotype Abundance

CDR3 Region Length

Top Clonotypes

This section outlines the most abundant clonotypes across the entire dataset. The top 100 clonotypes are identified by taking the mean frequency of each clonotype across all samples and sorting the clonotypes in descending order of that mean. The top 100 are plotted on a heatmap and, for readability, the top 20 are also plotted as a barplot. The final plot in this section shows the percentage of the repertoire occupied by clonotypes at rank indices 1-10, 11-100, 101-500, etc.

Most Abundant Clonotypes

Frequency of Most Abundant Clonotypes

	CDR3.aa	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CATWDGSGRTTGWFKIF	0.14	0.15	0.122	0.142	0.147	0.169	0.16
2	CALWEVVELGKKIKVF	0.094	0.0908	0.0855	0.0731	0.0883	0.0811	0.0867
3	CALWAQELGKKIKVF	0.0726	0.0639	0.0703	0.0727	0.0802	0.0775	0.0725
4	CALWEEQELGKKIKVF	0.0683	0.073	0.0722	0.0661	0.0637	0.0624	0.0675
5	CATWDGHYYKKLF	0.0698	0.0734	0.0566	0.0648	0.0664	0.0634	0.0558
6	CATWDGLNYKKLF	0.0564	0.0611	0.0437	0.0611	0.0574	0.0688	0.0708
7	CATWDDYYKKLF	0.0612	0.0603	0.0527	0.0611	0.0624	0.0588	0.0599
8	CATDGSDWIKTF	0.0466	0.0443	0.0633	0.0495	0.0516	0.0386	0.0368
9	CALWEVRKELGKKIKVF	0.0506	0.0417	0.0539	0.0434	0.0415	0.0445	0.0383
10	CATWPYYKKLF	0.0379	0.0352	0.0375	0.0379	0.029	0.0312	0.0396

Showing 1 to 10 of 100 entries

Top 20 Clonotypes

Repertoire Space by Clonotype Index

Repertoire Overlap

This section shows overlap of repertoires between samples. The two metrics used are the overlap of public (or shared) clonotypes and the Morisita overlap index.

Gene Usage Statistics

This section quantifies the usage of receptor genes in the repertoire. The first figure identifies the top 10 used genes by taking the mean value of usage of each gene across datasets and ranking the genes based on this mean value. The accompanying table includes the usage of all observed receptor genes.

Top 10 Used Genes

Gene Usage

	Names	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	TRGV9	0.301	0.305	0.315	0.322	0.292	0.316	0.336
2	TRGV2	0.295	0.259	0.267	0.276	0.292	0.294	0.262
3	TRGV8	0.143	0.165	0.136	0.13	0.136	0.132	0.133
4	TRGV4	0.129	0.116	0.121	0.108	0.131	0.135	0.117
5	TRGV3	0.0871	0.116	0.103	0.1	0.111	0.0839	0.0895
6	TRGV5	0.0449	0.0396	0.0549	0.0623	0.0362	0.0355	0.0617
7	TRGV4, TRGV8			0.00366
8	TRGV3, TRGV5						0.00323

Showing 1 to 8 of 8 entries

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.

Gene Usage Overlap

This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.

Diversity Metrics

This section reports metrics commonly used to quantify species (here, clonotype) diversity: the Chao1 diversity index, the D50 diversity index and the true diversity measure.

Kmer Analysis

This section identifies highly represented kmers (5-mers) across all repertoires. The top kmers are those that are most abundant across the entire experiment.

Top 20 kmers visualized

Top 100 Kmer abundance

	Kmer	Control_1	Control_2	Control_3	Immunized_Day5_1	Immunized_Day5_2	Immunized_Day8_1	Immunized_Day8_2
1	CATWD	192	161	137	189	194	159	161
2	YKKLF	165	144	125	159	167	143	148
3	YYKKL	116	107	95	117	110	102	99
4	CALWE	89	82	70	94	85	77	90
5	ATWDG	95	63	62	90	91	76	73
6	KIKVF	70	71	61	78	69	65	71
7	GKKIK	70	71	61	78	68	65	71
8	KKIKV	70	71	61	78	68	65	71
9	LGKKI	68	68	59	77	66	63	68
10	ELGKK	62	63	55	69	60	59	64

Showing 1 to 10 of 100 entries

Table: Each value is the calculated number of times each kmer is present in each sample. The rows are ordered to show more frequently identified kmers first.

Appendix

How to decompress files

Compressed files in *.gz format:

Unix/Linux/Mac users can run “gunzip *.gz” (equivalently, “gzip -d *.gz”)

Windows users can use decompression software such as WinRAR or 7-Zip

Compressed files in *.zip format:

Unix/Linux/Mac users can run “unzip file.zip”; to extract several archives at once, quote the pattern: “unzip '*.zip'”

Windows users can use decompression software such as WinRAR or 7-Zip

How to open the different data file formats

*.fastq: read sequence files in FASTQ format. These files are large, so they are slow to open in a standard editor.

Unix/Linux/Mac users can use the “less” or “more” commands

Windows users can use an editor such as EditPlus or Notepad++

*.xls, *.txt, *.tsv: tabular result files; columns are separated by tabs

Unix/Linux/Mac users can use the “less” or “more” commands

Windows users can use an editor such as EditPlus or Notepad++, or open the files in Microsoft Excel
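For users who prefer scripting to an editor, large FASTQ files can also be inspected without fully decompressing them. The sketch below reads only the first few records with Python's standard library; the function name `read_fastq_records` and the file name `example.fastq.gz` are hypothetical, and the demo file is a tiny made-up example written to a temporary directory.

```python
import gzip
import os
import tempfile
from itertools import islice

def read_fastq_records(path, limit=2):
    """Return up to `limit` (header, sequence, quality) tuples from a FASTQ file.

    Works for both plain and gzip-compressed files; only the requested
    records are read, so large files are not loaded into memory.
    """
    opener = gzip.open if path.endswith(".gz") else open
    records = []
    with opener(path, "rt") as handle:
        while len(records) < limit:
            # Each FASTQ record spans four lines.
            block = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(block) < 4:
                break
            header, sequence, _plus, quality = block
            records.append((header, sequence, quality))
    return records

# Demo on a tiny, made-up FASTQ file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "example.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nFFFF\n")
    demo_records = read_fastq_records(demo_path)
```

The same pattern works for the tab-separated result tables by replacing the four-line block with a split on the tab character.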

Software catalog:

FastQC v0.11.9

MiXCR v4.5.0

R v4.3.1

References

Cock PJA, Fields CJ, Goto N, et al. (2010). The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, 38, 1767-1771. (FASTQ)

Bolotin DA, Poslavsky S, Mitrophanov I, Shugay M, Mamedov IZ, et al. (2015). MiXCR: software for comprehensive adaptive immunity profiling. Nature Methods, 12, 380-381. doi:10.1038/nmeth.3364

Shugay M, Bagaev DV, Turchaninova MA, Bolotin DA, Britanova OV, Putintseva EV, et al. (2015). VDJtools: unifying post-analysis of T cell receptor repertoires. PLoS Computational Biology, 11, e1004503.

Erlich Y, Mitra PP, delaBastide M, et al. (2008). Alta-Cyclic: a self-optimizing base caller for next-generation sequencing. Nature Methods, 5(8), 679-682. (sequencing error rate distribution)

Jiang L, Schlesinger F, Davis CA, et al. (2011). Synthetic spike-in standards for RNA-seq experiments. Genome Research, 21(9), 1543-1551. (sequencing error rate distribution)

König J, Zarnack K, Rot G, et al. (2010). iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nature Structural & Molecular Biology, 17(7), 909-915.

Parekh S, Ziegenhain C, Vieth B, et al. (2016). The impact of amplification on differential expression analyses by RNA-seq. Scientific Reports, 6(1), 1-11.

Fu Y, Wu PH, Beane T, et al. (2018). Elimination of PCR duplicates in RNA-seq and small RNA-seq using unique molecular identifiers. BMC Genomics, 19(1), 1-14.

Kennedy SR, Schmitt MW, Fox EJ, et al. (2014). Detecting ultralow-frequency mutations by Duplex Sequencing. Nature Protocols, 9(11), 2586-2606.

Smith T, Heger A, Sudbery I. (2017). UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Research, 27(3), 491-499.

#> R version 4.3.2 (2023-10-31)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Sonoma 14.4.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/Los_Angeles
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] pheatmap_1.0.12   fastqcr_0.1.3     yaml_2.3.10       htmltools_0.5.8.1
#>  [5] lubridate_1.9.3   forcats_1.0.0     purrr_1.0.2       readr_2.1.5      
#>  [9] tidyr_1.3.1       tibble_3.2.1      tidyverse_2.0.0   xfun_0.46        
#> [13] cowplot_1.1.3     gginnards_0.2.0   DT_0.33           stringr_1.5.1    
#> [17] kableExtra_1.4.0  immunarch_0.9.1   patchwork_1.2.0   data.table_1.15.4
#> [21] dtplyr_1.3.1      dplyr_1.1.4       ggplot2_3.5.1     knitr_1.48       
#> [25] rmarkdown_2.27   
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3  rstudioapi_0.16.0   jsonlite_1.8.8     
#>   [4] shape_1.4.6.1       magrittr_2.0.3      modeltools_0.2-23  
#>   [7] farver_2.1.2        GlobalOptions_0.1.2 vctrs_0.6.5        
#>  [10] memoise_2.0.1       rstatix_0.7.2       broom_1.0.6        
#>  [13] cellranger_1.1.0    sass_0.4.9          bslib_0.7.0        
#>  [16] htmlwidgets_1.6.4   plyr_1.8.9          cachem_1.1.0       
#>  [19] uuid_1.2-1          igraph_2.0.3        mime_0.12          
#>  [22] lifecycle_1.0.4     iterators_1.0.14    pkgconfig_2.0.3    
#>  [25] Matrix_1.6-5        R6_2.5.1            fastmap_1.2.0      
#>  [28] shiny_1.8.1.1       digest_0.6.36       colorspace_2.1-1   
#>  [31] ggpubr_0.6.0        fansi_1.0.6         timechange_0.3.0   
#>  [34] polyclip_1.10-7     abind_1.4-5         compiler_4.3.2     
#>  [37] withr_3.0.0         doParallel_1.0.17   backports_1.5.0    
#>  [40] carData_3.0-5       viridis_0.6.5       UpSetR_1.4.0       
#>  [43] highr_0.11          ggforce_0.4.2       ggsignif_0.6.4     
#>  [46] MASS_7.3-60.0.1     tools_4.3.2         ape_5.8            
#>  [49] prabclus_2.3-3      httpuv_1.6.15       ggseqlogo_0.2      
#>  [52] nnet_7.3-19         glue_1.7.0          quadprog_1.5-8     
#>  [55] nlme_3.1-165        promises_1.3.0      grid_4.3.2         
#>  [58] stringdist_0.9.12   cluster_2.1.6       reshape2_1.4.4     
#>  [61] generics_0.1.3      gtable_0.3.5        tzdb_0.4.0         
#>  [64] class_7.3-22        hms_1.1.3           tidygraph_1.3.1    
#>  [67] xml2_1.3.6          car_3.1-2           utf8_1.2.4         
#>  [70] flexmix_2.3-19      ggrepel_0.9.5       foreach_1.5.2      
#>  [73] pillar_1.9.0        later_1.3.2         robustbase_0.99-3  
#>  [76] circlize_0.4.16     tweenr_2.0.3        lattice_0.22-6     
#>  [79] tidyselect_1.2.1    gridExtra_2.3       svglite_2.1.3      
#>  [82] stats4_4.3.2        graphlayouts_1.1.1  diptest_0.77-1     
#>  [85] factoextra_1.0.7    DEoptimR_1.1-3      stringi_1.8.4      
#>  [88] evaluate_0.24.0     codetools_0.2-20    kernlab_0.9-32     
#>  [91] ggraph_2.2.1        cli_3.6.3           shinythemes_1.2.0  
#>  [94] xtable_1.8-4        systemfonts_1.1.0   jquerylib_0.1.4    
#>  [97] munsell_0.5.1       Rcpp_1.0.13         readxl_1.4.3       
#> [100] parallel_4.3.2      mclust_6.1.1        ggalluvial_0.12.5  
#> [103] phangorn_2.11.1     viridisLite_0.4.2   rlist_0.4.6.2      
#> [106] scales_1.3.0        fpc_2.2-12          rlang_1.1.4        
#> [109] fastmatch_1.1-4

Project: AIR_demo_immunization

Pipeline Version: DriverMapAIR_v1.2.0 Cellecta, Inc

2024-09-25
