TY - JOUR
T1 - The rate and role of pseudogenes of the Mycobacterium tuberculosis complex
AU - Soler-Camargo, Naila Cristina
AU - Silva-Pereira, Taiana Tainá
AU - Zimpel, Cristina Kraemer
AU - Camacho, Maurício F.
AU - Zelanis, André
AU - Aono, Alexandre H.
AU - Patané, José Salvatore
AU - Dos Santos, Andrea Pires
AU - Guimarães, Ana Marcia Sá
N1 - Publisher Copyright: © 2022 The Authors.
PY - 2022
Y1 - 2022
N2 - Whole-genome sequence analyses have significantly contributed to the understanding of virulence and evolution of the Mycobacterium tuberculosis complex (MTBC), the causative pathogens of tuberculosis. Most MTBC evolutionary studies are focused on single nucleotide polymorphisms and deletions, but rare studies have evaluated gene content, whereas none has comprehensively evaluated pseudogenes. Accordingly, we describe an extensive study focused on quantifying and predicting possible functions of MTBC and Mycobacterium canettii pseudogenes. Using NCBI’s PGAP-detected pseudogenes, we analysed 25 837 pseudogenes from 158 MTBC and M. canettii strains and combined transcriptomics and proteomics of M. tuberculosis H37Rv to gain insights about pseudogenes' expression. Our results indicate significant variability concerning rate and conservancy of in silico predicted pseudogenes among different ecotypes and lineages of tuberculous mycobacteria and pseudogenization of important virulence factors and genes of the metabolism and antimicrobial resistance/tolerance. We show that in silico predicted pseudogenes contribute considerably to MTBC genetic diversity at the population level. Moreover, the transcription machinery of M. tuberculosis can fully transcribe most pseudogenes, indicating intact promoters and recent pseudogene evolutionary emergence. Proteomics of M. tuberculosis and close evaluation of mutational lesions driving pseudogenization suggest that few in silico predicted pseudogenes are likely capable of neofunctionalization, nonsense mutation reversal, or phase variation, contradicting the classical definition of pseudogenes. Such findings indicate that genome annotation should be accompanied by proteomics and protein function assays to improve its accuracy. While indels and insertion sequences are the main drivers of the observed mutational lesions in these species, population bottlenecks and genetic drift are likely the evolutionary processes acting on pseudogenes' emergence over time. Our findings unveil a new perspective on MTBC’s evolution and genetic diversity.
AB - Whole-genome sequence analyses have significantly contributed to the understanding of virulence and evolution of the Mycobacterium tuberculosis complex (MTBC), the causative pathogens of tuberculosis. Most MTBC evolutionary studies are focused on single nucleotide polymorphisms and deletions, but rare studies have evaluated gene content, whereas none has comprehensively evaluated pseudogenes. Accordingly, we describe an extensive study focused on quantifying and predicting possible functions of MTBC and Mycobacterium canettii pseudogenes. Using NCBI’s PGAP-detected pseudogenes, we analysed 25 837 pseudogenes from 158 MTBC and M. canettii strains and combined transcriptomics and proteomics of M. tuberculosis H37Rv to gain insights about pseudogenes' expression. Our results indicate significant variability concerning rate and conservancy of in silico predicted pseudogenes among different ecotypes and lineages of tuberculous mycobacteria and pseudogenization of important virulence factors and genes of the metabolism and antimicrobial resistance/tolerance. We show that in silico predicted pseudogenes contribute considerably to MTBC genetic diversity at the population level. Moreover, the transcription machinery of M. tuberculosis can fully transcribe most pseudogenes, indicating intact promoters and recent pseudogene evolutionary emergence. Proteomics of M. tuberculosis and close evaluation of mutational lesions driving pseudogenization suggest that few in silico predicted pseudogenes are likely capable of neofunctionalization, nonsense mutation reversal, or phase variation, contradicting the classical definition of pseudogenes. Such findings indicate that genome annotation should be accompanied by proteomics and protein function assays to improve its accuracy. While indels and insertion sequences are the main drivers of the observed mutational lesions in these species, population bottlenecks and genetic drift are likely the evolutionary processes acting on pseudogenes' emergence over time. Our findings unveil a new perspective on MTBC’s evolution and genetic diversity.
KW - comparative genomics
KW - frameshift
KW - loss of function mutations
KW - Mycobacterium tuberculosis complex
KW - phase variation
KW - pseudogenes
UR - http://www.scopus.com/inward/record.url?scp=85140271524&partnerID=8YFLogxK
U2 - 10.1099/mgen.0.000876
DO - 10.1099/mgen.0.000876
M3 - Article
C2 - 36250787
AN - SCOPUS:85140271524
SN - 2057-5858
VL - 8
JO - Microbial Genomics
JF - Microbial Genomics
IS - 10
M1 - 000876
ER -