GeneMates: an R package for detecting horizontal gene co-transfer between bacteria using gene-gene associations controlled for population structure
journal contribution
posted on 2020-12-21, 06:31 authored by Yu Wan, Ryan R Wick, Justin Zobel, Danielle J Ingle, Michael Inouye, Kathryn E Holt
Abstract
Background
Horizontal gene transfer contributes to bacterial evolution through mobilising genes across various taxonomical boundaries. It is frequently mediated by mobile genetic elements (MGEs), which may capture, maintain, and rearrange mobile genes and co-mobilise them between bacteria, causing horizontal gene co-transfer (HGcoT). This physical linkage between mobile genes poses a great threat to public health as it facilitates dissemination and co-selection of clinically important genes amongst bacteria. Although rapid accumulation of bacterial whole-genome sequencing data since the 2000s enables study of HGcoT at the population level, results based on genetic co-occurrence counts and simple association tests are usually confounded by bacterial population structure when sampled bacteria belong to the same species, leading to spurious conclusions.
Results
We have developed a network approach to explore whole-genome sequencing (WGS) data for evidence of intraspecies HGcoT and have implemented it in the R package GeneMates (github.com/wanyuac/GeneMates). The package takes as input an allelic presence-absence matrix of genes of interest and a matrix of core-genome single-nucleotide polymorphisms, performs association tests with linear mixed models controlled for population structure, produces a network of significantly associated alleles, and identifies clusters within the network as plausible co-transferred alleles. GeneMates users may choose to score consistency of allelic physical distances measured in genome assemblies using a novel approach we have developed and overlay the scores onto the network for further evidence of HGcoT. Validation studies of GeneMates on known acquired antimicrobial resistance genes in Escherichia coli and Salmonella Typhimurium show advantages of our network approach over simple association analysis: (1) distinguishing between allelic co-occurrence driven by HGcoT and that driven by clonal reproduction, (2) evaluating effects of population structure on allelic co-occurrence, and (3) directly linking allele clusters in the network to MGEs when physical distances are incorporated.
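The general shape of the workflow (presence-absence matrix, pairwise association, network, clusters) can be illustrated with a minimal Python sketch. It is not the GeneMates implementation or its API. Instead of the linear mixed models described above, it uses a plain phi coefficient with no control for population structure, which is exactly the kind of simple association measure the abstract says is prone to confounding. Connected components stand in for the package's cluster identification. The allele names, the toy data, the 0.7 threshold, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def phi_coefficient(x, y):
    """Phi coefficient between two binary presence/absence vectors.

    NOTE: a simple association measure with no population-structure
    control; it is a stand-in for, not an instance of, the linear mixed
    models used by GeneMates.
    """
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def association_network(pam, threshold):
    """Return edges (allele pairs) whose phi coefficient meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(pam), 2)
        if phi_coefficient(pam[a], pam[b]) >= threshold
    ]


def clusters(alleles, edges):
    """Group alleles into connected components with more than one member."""
    adjacency = {allele: set() for allele in alleles}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > 1:
            components.append(sorted(component))
    return components


# Hypothetical toy matrix: rows are alleles, columns are eight isolates.
pam = {
    "alleleA": [1, 1, 1, 1, 0, 0, 0, 0],
    "alleleB": [1, 1, 1, 1, 0, 0, 0, 0],
    "alleleC": [1, 1, 1, 0, 0, 0, 0, 0],
    "alleleD": [0, 1, 0, 1, 0, 1, 0, 1],
}

edges = association_network(pam, threshold=0.7)  # threshold is arbitrary
print(clusters(pam, edges))
```

In this toy matrix, alleleA, alleleB and alleleC fall into one cluster and alleleD stays unlinked. On real data such a cluster could reflect clonal expansion just as easily as co-transfer, which is why the approach described above tests associations with models controlled for population structure and can add physical distance evidence.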
Conclusion
GeneMates offers an effective approach to detection of intraspecies HGcoT using WGS data.
Funding
YW was supported by a Melbourne International Research Scholarship from the University of Melbourne. KEH was supported by the Bill & Melinda Gates Foundation, Seattle and a Senior Medical Research Fellowship from the Viertel Foundation of Australia.
History
Publication Date
2020-09-24
Journal
BMC Genomics
Volume
21
Issue
1
Article Number
658
Pagination
14p. (p. 1-14)
Publisher
BioMed Central
ISSN
1471-2164
Rights Statement
The Author reserves all moral rights over the deposited text and must be credited if any re-use occurs. Documents deposited in OPAL are the Open Access versions of outputs published elsewhere. Changes resulting from the publishing process may therefore not be reflected in this document. The final published version may be obtained via the publisher’s DOI. Please note that additional copyright and access restrictions may apply to the published version.