TY - JOUR
T1 - High-throughput sequencing of murine immunoglobulin heavy chain repertoires using single side unique molecular identifiers on an Ion Torrent PGM
AU - Bürckert, Jean Philippe
AU - Faison, William J.
AU - Mustin, Danielle E.
AU - Dubois, Axel R.S.X.
AU - Sinner, Regina
AU - Hunewald, Oliver
AU - Wienecke-Baldacchino, Anke
AU - Brieger, Anne
AU - Muller, Claude P.
N1 - Funding Information:
J.-P. Bürckert and A.R.S.X. Dubois were supported by the AFR (Aides à la Formation Recherche) fellowships #7039209 and #1196376, respectively, from the FNR (Fonds National de la Recherche), Luxembourg.
Publisher Copyright:
©Bürckert et al.
PY - 2018/7/1
Y1 - 2018/7/1
AB - With the advent of high-throughput sequencing (HTS), profiling immunoglobulin (IG) repertoires has become an essential part of immunological research. Advances in sequencing technology enable the Ion Torrent Personal Genome Machine (PGM) to cover the full length of IG mRNA transcripts. Nucleotide insertions and deletions (indels) are the dominant errors of the PGM sequencing platform and can critically influence IG repertoire assessments. Here, we present a PGM-tailored IG repertoire sequencing approach combining error correction through unique molecular identifier (UID) barcoding and indel detection through ImMunoGeneTics (IMGT), the most commonly used sequence alignment database for IG sequences. Using artificially falsified sequences for benchmarking, we found that IMGT's underlying algorithms efficiently detect 98% of the introduced indels. Undetected indels are either located at the end of the sequences or produce masked frameshifts with an insertion and deletion in close proximity. The complementarity-determining regions 3 (CDR3s) are returned correctly for up to 3 insertions or 3 deletions through conservative culling. We further show that our PGM-tailored unique molecular identifiers result in highly accurate HTS data if combined with the presented processing strategy. In this regard, considering sequences with at least two copies from datasets with UID families of minimum 3 reads results in correct sequences with over 99% confidence. Finally, we show that the protocol can readily be used to generate homogeneous datasets for bulk sequencing of murine bone marrow samples. Taken together, this approach will help to establish benchtop-scale sequencing of IG heavy chain transcripts in the field of IG repertoire research.
KW - Database benchmarking
KW - High-throughput sequencing
KW - IMGT
KW - Immunology
KW - Murine IG repertoire
KW - Unique molecular barcoding
UR - http://www.scopus.com/inward/record.url?scp=85049925246&partnerID=8YFLogxK
U2 - 10.18632/oncotarget.25493
DO - 10.18632/oncotarget.25493
M3 - Article
AN - SCOPUS:85049925246
SN - 1949-2553
VL - 9
SP - 30225
EP - 30239
JO - Oncotarget
JF - Oncotarget
IS - 54
ER -