TY - JOUR
T1 - A bioinformatics tool for ensuring the backwards compatibility of Legionella pneumophila typing in the genomic era
AU - on behalf of the ESCMID Study Group for Legionella Infections
AU - Gordon, M.
AU - Yakunin, E.
AU - Valinsky, L.
AU - Chalifa-Caspi, V.
AU - Moran-Gilad, J.
N1 - Publisher Copyright: © 2017 European Society of Clinical Microbiology and Infectious Diseases
PY - 2017/5/1
Y1 - 2017/5/1
N2 - Objectives: Whole genome sequencing (WGS) has revolutionized the subtyping of Legionella pneumophila, but calling the traditional sequence-based type from genomic data is hampered by multiple copies of the mompS locus. We propose a novel bioinformatics solution for rectifying that limitation, ensuring the feasibility of WGS for cluster investigation. Methods: We designed a novel approach based on the alignment of raw reads with a reference sequence. With WGS, reads originating from either of the two mompS copies cannot be differentiated. Therefore, when non-identical copies were present, we applied a read-filtering strategy based on read alignment to a reference sequence via unique ‘anchors’. If minimal read coverage (≥3X) was achieved after filtration, a consensus sequence was built from the mapped reads and the sequence-based typing allele was then called. The entire procedure was implemented as a Perl script. Results: The method was validated using a diverse sample of 265 L. pneumophila genomes, comprising 59 different sequence types (STs) and 23 mompS variants; 57 of the 265 (22%) had non-identical mompS copies. In 237 of the 265 samples (89.4%), mompS calling was successful, and no erroneous calling occurred. A 98.1% success rate was recorded among 109 samples meeting quality requirements. The method was superior to alternative approaches. Conclusions: As WGS becomes more accessible, technical difficulties in routine clinical and surveillance work will arise. The case of mompS in L. pneumophila serves as an example of such limitations, which necessitate the development of novel computational solutions that meet end-user demands.
KW - Backwards compatibility
KW - Bioinformatics
KW - Legionella
KW - Sequencing
KW - Typing
UR - http://www.scopus.com/inward/record.url?scp=85013485261&partnerID=8YFLogxK
U2 - 10.1016/j.cmi.2017.01.002
DO - 10.1016/j.cmi.2017.01.002
M3 - Article
C2 - 28082190
AN - SCOPUS:85013485261
SN - 1198-743X
VL - 23
SP - 306
EP - 310
JO - Clinical Microbiology and Infection
JF - Clinical Microbiology and Infection
IS - 5
ER -