CLASSIFICATION - Word Related Documents

# | Rank | Similarity | Title + Abs. | Year | PMID
9074 | 0 | 0.9963 | BacAnt: A Combination Annotation Server for Bacterial DNA Sequences to Identify Antibiotic Resistance Genes, Integrons, and Transposable Elements. Whole genome sequencing (WGS) of bacteria has become a routine method in diagnostic laboratories. One of the clinically most useful advantages of WGS is the ability to predict antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) in bacterial sequences. This allows comprehensive investigations of such genetic features but can also be used for epidemiological studies. A plethora of software programs have been developed for the detailed annotation of bacterial DNA sequences, such as rapid annotation using subsystem technology (RAST), Resfinder, ISfinder, INTEGRALL and The Transposon Registry. Unfortunately, to this day, a reliable annotation tool of the combination of ARGs and MGEs is not available, and the generation of genbank files requires much manual input. Here, we present a new webserver which allows the annotation of ARGs, integrons and transposable elements at the same time. The pipeline generates genbank files automatically, which are compatible with Easyfig for comparative genomic analysis. Our BacAnt code and standalone software package are available at https://github.com/xthua/bacant with an accompanying web application at http://bacant.net. | 2021 | 34367079
8463 | 1 | 0.9962 | Safety assessment of five candidate probiotic lactobacilli using comparative genome analysis. Micro-organisms belonging to the Lactobacillus genus complex are often used for oral consumption and are generally considered safe but can exhibit pathogenicity in rare and specific cases. Therefore, screening and understanding genetic factors that may contribute to pathogenicity can yield valuable insights regarding probiotic safety. Limosilactobacillus mucosae LM1, Lactiplantibacillus plantarum SK151, Lactiplantibacillus plantarum BS25, Limosilactobacillus fermentum SK152 and Lactobacillus johnsonii PF01 are current probiotics of interest; however, their safety profiles have not been explored. The genome sequences of LM1, SK151, SK152 and PF01 were downloaded from the NCBI GenBank, while that of L. plantarum BS25 was newly sequenced. These genomes were then annotated using the Rapid Annotation using Subsystem Technology tool kit pipeline. Subsequently, a command line blast was performed against the Virulence Factor Database (VFDB) and the Comprehensive Antibiotic Resistance Database (CARD) to identify potential virulence factors and antibiotic resistance (AR) genes. Furthermore, ResFinder was used to detect acquired AR genes. The query against the VFDB identified genes that have a role in bacterial survivability, platelet aggregation, surface adhesion, biofilm formation and immunoregulation; and no acquired AR genes were detected using CARD and ResFinder. The study shows that the query strains exhibit genes identical to those present in pathogenic bacteria with the genes matched primarily having roles related to survival and surface adherence. Our results contribute to the overall strategies that can be employed in pre-clinical safety assessments of potential probiotics. Gene mining using whole-genome data, coupled with experimental validation, can be implemented in future probiotic safety assessment strategies. | 2024 | 38361650
9076 | 2 | 0.9962 | ResiDB: An automated database manager for sequence data. The amount of publicly available DNA sequence data is drastically increasing, making it a tedious task to create sequence databases necessary for the design of diagnostic assays. The selection of appropriate sequences is especially challenging in genes affected by frequent point mutations such as antibiotic resistance genes. To overcome this issue, we have designed the webtool resiDB, a rapid and user-friendly sequence database manager for bacteria, fungi, viruses, protozoa, invertebrates, plants, archaea, environmental and whole genome shotgun sequence data. It automatically identifies and curates sequence clusters to create custom sequence databases based on user-defined input sequences. A collection of helpful visualization tools gives the user the opportunity to easily access, evaluate, edit, and download the newly created database. Consequently, researchers no longer have to manually manage sequence data retrieval, deal with hardware limitations, and run multiple independent software tools, each having its own requirements, input and output formats. Our tool was developed within the H2020 project FAPIC aiming to develop a single diagnostic assay targeting all sepsis-relevant pathogens and antibiotic resistance mechanisms. ResiDB is freely accessible to all users through https://residb.ait.ac.at/. | 2021 | 33495705
9072 | 3 | 0.9961 | PanGeT: Pan-genomics tool. A decade after the concept of Pan-genome was first introduced, research in this field has spread its tentacles to areas such as pathogenesis of diseases, bacterial evolutionary studies and drug resistance. Gene content-based differentiation of virulent and avirulent strains of bacteria and identification of pathogen specific genes is imperative to understand their physiology and gain insights into the mechanism of genome evolution. Subsequently, this will aid in identifying diagnostic targets and in developing and selecting vaccines. The root of pan-genomic studies, however, is to identify the core genes, dispensable genes and strain specific genes across the genomes belonging to a clade. To this end, we have developed a tool, "PanGeT - Pan-genomics Tool" to compute the 'pan-genome' based on comparisons at the genome as well as the proteome levels. This automated tool is implemented using LaTeX libraries for effective visualization of overall pan-genome through graphical plots. Links to retrieve sequence information and functional annotations have also been provided. PanGeT can be downloaded from http://pranag.physics.iisc.ernet.in/PanGeT/ or https://github.com/PanGeTv1/PanGeT. | 2017 | 27851981
5148 | 4 | 0.9959 | Unveiling the whole genomic features and potential probiotic characteristics of novel Lactiplantibacillus plantarum HMX2. This study investigates the genomic features and probiotic potential of Lactiplantibacillus plantarum HMX2, isolated from Chinese Sauerkraut, using whole-genome sequencing (WGS) and bioinformatics for the first time. This study also aims to find genetic diversity, antibiotic resistance genes, and functional capabilities to help us better understand its food safety applications and potential as a probiotic. L. plantarum HMX2 was cultured, and DNA was extracted for WGS. Genomic analysis comprised average nucleotide identity (ANI) prediction, genome annotation, pangenome, and synteny analysis. Bioinformatics techniques were used to identify CoDing Sequences (CDSs), transfer RNA (tRNA) and ribosomal RNA (rRNA) genes, and antibiotic resistance genes, as well as to conduct phylogenetic analysis to establish genetic diversity and evolution. The study found a significant genetic similarity (99.17% ANI) between L. plantarum HMX2 and the reference strain. Genome annotation revealed 3,242 coding sequences, 65 tRNA genes, and 16 rRNA genes. Significant genetic variety was found, including 25 antibiotic resistance genes. A phylogenetic study placed L. plantarum HMX2 among closely related bacteria, emphasizing its potential for probiotic and food safety applications. The genomic investigation of L. plantarum showed essential genes, including plnJK and plnEF, which contribute to antibacterial action against foodborne pathogens. Furthermore, genes such as MurA, Alr, and MprF improve food safety and probiotic potential by promoting bacterial survival under stress conditions in food and the gastrointestinal tract. This study introduces the new genomic features of L. plantarum HMX2 about specific genetics and its possibility of relevant uses in food security and technologies. These findings of specific genes involved in antimicrobial activity provide fresh possibilities for exploiting this strain in forming probiotic preparations and food preservation methods. The future research should focus on the experimental validation of antibiotic resistance genes, comparative genomics to investigate functional diversity, and the development of novel antimicrobial therapies that take advantage of L. plantarum's capabilities. | 2024 | 39611087
9075 | 5 | 0.9959 | CamPype: an open-source workflow for automated bacterial whole-genome sequencing analysis focused on Campylobacter. BACKGROUND: The rapid expansion of Whole-Genome Sequencing has revolutionized the fields of clinical and food microbiology. However, its implementation as a routine laboratory technique remains challenging due to the growth of data at a faster rate than can be effectively analyzed and critical gaps in bioinformatics knowledge. RESULTS: To address both issues, CamPype was developed as a new bioinformatics workflow for the genomics analysis of sequencing data of bacteria, especially Campylobacter, which is the main cause of gastroenteritis worldwide making a negative impact on the economy of the public health systems. CamPype allows full customization of stages to run and tools to use, including read quality control filtering, read contamination, reads extension and assembly, bacterial typing, genome annotation, searching for antibiotic resistance genes, virulence genes and plasmids, pangenome construction and identification of nucleotide variants. All results are processed and summarized in an interactive HTML report for best data visualization and interpretation. CONCLUSIONS: The minimal user intervention of CamPype makes this workflow an attractive resource for microbiology laboratories with no expertise in bioinformatics as a first line method for bacterial typing and epidemiological analyses, that would help to reduce the costs of disease outbreaks, or for comparative genomic analyses. CamPype is publicly available at https://github.com/JoseBarbero/CamPype. | 2023 | 37474912
5114 | 6 | 0.9959 | Datasets for benchmarking antimicrobial resistance genes in bacterial metagenomic and whole genome sequencing. Whole genome sequencing (WGS) is a key tool in identifying and characterising disease-associated bacteria across clinical, agricultural, and environmental contexts. One increasingly common use of genomic and metagenomic sequencing is in identifying the type and range of antimicrobial resistance (AMR) genes present in bacterial isolates in order to make predictions regarding their AMR phenotype. However, there are a large number of alternative bioinformatics software and pipelines available, which can lead to dissimilar results. It is, therefore, vital that researchers carefully evaluate their genomic and metagenomic AMR analysis methods using a common dataset. To this end, as part of the Microbial Bioinformatics Hackathon and Workshop 2021, a 'gold standard' reference genomic and simulated metagenomic dataset was generated containing raw sequence reads mapped against their corresponding reference genome from a range of 174 potentially pathogenic bacteria. These datasets and their accompanying metadata are freely available for use in benchmarking studies of bacteria and their antimicrobial resistance genes and will help improve tool development for the identification of AMR genes in complex samples. | 2022 | 35705638
9070 | 7 | 0.9958 | Automated annotation of mobile antibiotic resistance in Gram-negative bacteria: the Multiple Antibiotic Resistance Annotator (MARA) and database. BACKGROUND: Multiresistance in Gram-negative bacteria is often due to acquisition of several different antibiotic resistance genes, each associated with a different mobile genetic element, that tend to cluster together in complex conglomerations. Accurate, consistent annotation of resistance genes, the boundaries and fragments of mobile elements, and signatures of insertion, such as DR, facilitates comparative analysis of complex multiresistance regions and plasmids to better understand their evolution and how resistance genes spread. OBJECTIVES: To extend the Repository of Antibiotic resistance Cassettes (RAC) web site, which includes a database of 'features', and the Attacca automatic DNA annotation system, to encompass additional resistance genes and all types of associated mobile elements. METHODS: Antibiotic resistance genes and mobile elements were added to RAC, from existing registries where possible. Attacca grammars were extended to accommodate the expanded database, to allow overlapping features to be annotated and to identify and annotate features such as composite transposons and DR. RESULTS: The Multiple Antibiotic Resistance Annotator (MARA) database includes antibiotic resistance genes and selected mobile elements from Gram-negative bacteria, distinguishing important variants. Sequences can be submitted to the MARA web site for annotation. A list of positions and orientations of annotated features, indicating those that are truncated, DR and potential composite transposons is provided for each sequence, as well as a diagram showing annotated features approximately to scale. CONCLUSIONS: The MARA web site (http://mara.spokade.com) provides a comprehensive database for mobile antibiotic resistance in Gram-negative bacteria and accurately annotates resistance genes and associated mobile elements in submitted sequences to facilitate comparative analysis. | 2018 | 29373760
9073 | 8 | 0.9958 | EpitoCore: Mining Conserved Epitope Vaccine Candidates in the Core Proteome of Multiple Bacteria Strains. In reverse vaccinology approaches, complete proteomes of bacteria are submitted to multiple computational prediction steps in order to filter proteins that are possible vaccine candidates. Most available tools perform such analysis only in a single strain, or a very limited number of strains. But the vast amount of genomic data had shown that most bacteria contain pangenomes, i.e., their genomic information contains core, conserved genes, and random accessory genes specific to each strain. Therefore, in reverse vaccinology methods it is of the utmost importance to define core proteins and core epitopes. EpitoCore is a decision-tree pipeline developed to fulfill that need. It provides surfaceome prediction of proteins from related strains, defines core proteins within those, calculates their immunogenicity, predicts epitopes for a given set of MHC alleles defined by the user, and then reports if epitopes are located extracellularly and if they are conserved among the core homologs. Pipeline performance is illustrated by mining peptide vaccine candidates in Mycobacterium avium hominissuis strains. From a total proteome of ~4,800 proteins per strain, EpitoCore predicted 103 highly immunogenic core homologs located at cell surface, many of those related to virulence and drug resistance. Conserved epitopes identified among these homologs allow the users to define sets of peptides with potential to immunize the largest coverage of tested HLA alleles using peptide-based vaccines. Therefore, EpitoCore is able to provide automated identification of conserved epitopes in bacterial pangenomic datasets. | 2020 | 32431712
9077 | 9 | 0.9958 | The PLSDB 2025 update: enhanced annotations and improved functionality for comprehensive plasmid research. Plasmids are extrachromosomal DNA molecules in bacteria and archaea, playing critical roles in horizontal gene transfer, antibiotic resistance, and pathogenicity. Since its first release in 2018, our database on plasmids, PLSDB, has significantly grown and enhanced its content and scope. From 34 513 records contained in the 2021 version, PLSDB now hosts 72 360 entries. Designed to provide life scientists with convenient access to extensive plasmid data and to support computer scientists by offering curated datasets for artificial intelligence (AI) development, this latest update brings more comprehensive and accurate information for plasmid research, with interactive visualization options. We enriched PLSDB by refining the identification and classification of plasmid host ecosystems and host diseases. Additionally, we incorporated annotations for new functional structures, including protein-coding genes and biosynthetic gene clusters. Further, we enhanced existing annotations, such as antimicrobial resistance genes and mobility typing. To accommodate these improvements and to host the increased plasmid sets, the webserver architecture and underlying data structures of PLSDB have been reconstructed, resulting in decreased response times and enhanced visualization of features while ensuring that users have access to a more efficient and user-friendly interface. The latest release of PLSDB is freely accessible at https://www.ccb.uni-saarland.de/plsdb2025. | 2025 | 39565221
8405 | 10 | 0.9957 | Mapping Major Disease Resistance Genes in Soybean by Genome-Wide Association Studies. Soybean is one of the most valuable agricultural crops in the world. Besides, this legume is constantly attacked by a wide range of pathogens (fungi, bacteria, viruses, and nematodes) compromising yield and increasing production costs. One of the major disease management strategies is the genetic resistance provided by single genes and quantitative trait loci (QTL). Identifying the genomic regions underlying the resistance against these pathogens on soybean is one of the first steps performed by molecular breeders. In the past, genetic mapping studies have been widely used to discover these genomic regions. However, over the last decade, advances in next-generation sequencing technologies and their subsequent cost decreasing led to the development of cost-effective approaches to high-throughput genotyping. Thus, genome-wide association studies applying thousands of SNPs in large sets composed of diverse soybean accessions have been successfully done. In this chapter, a comprehensive review of the majority of GWAS for soybean diseases published since this approach was developed is provided. Important diseases caused by Heterodera glycines, Phytophthora sojae, and Sclerotinia sclerotiorum have been the focus of the several GWAS. However, other bacterial and fungi diseases also have been targets of GWAS. As such, this GWAS summary can serve as a guide for future studies of these diseases. The protocol begins by describing several considerations about the pathogens and bringing different procedures of molecular characterization of them. Advice to choose the best isolate/race to maximize the discovery of multiple R genes or to directly map an effective R gene is provided. A summary of protocols, methods, and tools to phenotyping the soybean panel is given to several diseases. We also give details of options of DNA extraction protocols and genotyping methods, and we describe parameters of SNP quality to soybean data. Websites and their online tools to obtain genotypic and phenotypic data for thousands of soybean accessions are highlighted. Finally, we report several tricks and tips in Subheading 4, especially related to composing the soybean panel as well as generating and analyzing the phenotype data. We hope this protocol will be helpful to achieve GWAS success in identifying resistance genes on soybean. | 2022 | 35641772
9744 | 11 | 0.9957 | PARGT: a software tool for predicting antimicrobial resistance in bacteria. With the ever-increasing availability of whole-genome sequences, machine-learning approaches can be used as an alternative to traditional alignment-based methods for identifying new antimicrobial-resistance genes. Such approaches are especially helpful when pathogens cannot be cultured in the lab. In previous work, we proposed a game-theory-based feature evaluation algorithm. When using the protein characteristics identified by this algorithm, called 'features' in machine learning, our model accurately identified antimicrobial resistance (AMR) genes in Gram-negative bacteria. Here we extend our study to Gram-positive bacteria showing that coupling game-theory-identified features with machine learning achieved classification accuracies between 87% and 90% for genes encoding resistance to the antibiotics bacitracin and vancomycin. Importantly, we present a standalone software tool that implements the game-theory algorithm and machine-learning model used in these studies. | 2020 | 32620856
9079 | 12 | 0.9957 | Review, Evaluation, and Directions for Gene-Targeted Assembly for Ecological Analyses of Metagenomes. Shotgun metagenomics has greatly advanced our understanding of microbial communities over the last decade. Metagenomic analyses often include assembly and genome binning, computationally daunting tasks especially for big data from complex environments such as soil and sediments. In many studies, however, only a subset of genes and pathways involved in specific functions are of interest; thus, it is not necessary to attempt global assembly. In addition, methods that target genes can be computationally more efficient and produce more accurate assembly by leveraging rich databases, especially for those genes that are of broad interest such as those involved in biogeochemical cycles, biodegradation, and antibiotic resistance or used as phylogenetic markers. Here, we review six gene-targeted assemblers with unique algorithms for extracting and/or assembling targeted genes: Xander, MegaGTA, SAT-Assembler, HMM-GRASPx, GenSeed-HMM, and MEGAN. We tested these tools using two datasets with known genomes, a synthetic community of artificial reads derived from the genomes of 17 bacteria, shotgun sequence data from a mock community with 48 bacteria and 16 archaea genomes, and a large soil shotgun metagenomic dataset. We compared assemblies of a universal single copy gene (rplB) and two N cycle genes (nifH and nirK). We measured their computational efficiency, sensitivity, specificity, and chimera rate and found Xander and MegaGTA, which both use a probabilistic graph structure to model the genes, have the best overall performance with all three datasets, although MEGAN, a reference matching assembler, had better sensitivity with synthetic and mock community members chosen from its reference collection. Also, Xander and MegaGTA are the only tools that include post-assembly scripts tuned for common molecular ecology and diversity analyses. Additionally, we provide a mathematical model for estimating the probability of assembling targeted genes in a metagenome for estimating required sequencing depth. | 2019 | 31749830
9071 | 13 | 0.9957 | RAC: Repository of Antibiotic resistance Cassettes. Antibiotic resistance in bacteria is often due to acquisition of resistance genes associated with different mobile genetic elements. In Gram-negative bacteria, many resistance genes are found as part of small mobile genetic elements called gene cassettes, generally found integrated into larger elements called integrons. Integrons carrying antibiotic resistance gene cassettes are often associated with mobile elements and here are designated 'mobile resistance integrons' (MRIs). More than one cassette can be inserted in the same integron to create arrays that contribute to the spread of multi-resistance. In many sequences in databases such as GenBank, only the genes within cassettes, rather than whole cassettes, are annotated and the same gene/cassette may be given different names in different entries, hampering analysis. We have developed the Repository of Antibiotic resistance Cassettes (RAC) website to provide an archive of gene cassettes that includes alternative gene names from multiple nomenclature systems and allows the community to contribute new cassettes. RAC also offers an additional function that allows users to submit sequences containing cassettes or arrays for annotation using the automatic annotation system Attacca. Attacca recognizes features (gene cassettes, integron regions) and identifies cassette arrays as patterns of features and can also distinguish minor cassette variants that may encode different resistance phenotypes (aacA4 cassettes and bla cassettes-encoding β-lactamases). Gaps in annotations are manually reviewed and those found to correspond to novel cassettes are assigned unique names. While there are other websites dedicated to integrons or antibiotic resistance genes, none includes a complete list of antibiotic resistance gene cassettes in MRI or offers consistent annotation and appropriate naming of all of these cassettes in submitted sequences. RAC thus provides a unique resource for researchers, which should reduce confusion and improve the quality of annotations of gene cassettes in integrons associated with antibiotic resistance. DATABASE URL: http://www2.chi.unsw.edu.au/rac. | 2011 | 22140215
5098 | 14 | 0.9956 | Feature selection and aggregation for antibiotic resistance GWAS in Mycobacterium tuberculosis: a comparative study. INTRODUCTION: Drug resistance (DR) of pathogens remains a global healthcare concern. In contrast to other bacteria, acquiring mutations in the core genome is the main mechanism of drug resistance for Mycobacterium tuberculosis (MTB). For some antibiotics, the resistance of a particular isolate can be reliably predicted by identifying specific mutations, while for other antibiotics the knowledge of resistance mechanisms is limited. Statistical machine learning (ML) methods are used to infer new genes implicated in drug resistance leveraging large collections of isolates with known whole-genome sequences and phenotypic states for different drugs. However, high correlations between the phenotypic states for commonly used drugs complicate the inference of true associations of mutations with drug phenotypes by ML approaches. METHODS: Recently, several new methods have been developed to select a small subset of reliable predictors of the dependent variable, which may help reduce the number of spurious associations identified. In this study, we evaluated several such methods, namely, logistic regression with different regularization penalty functions, a recently introduced algorithm for solving the best-subset selection problem (ABESS) and "Hungry, Hungry SNPos" (HHS) a heuristic algorithm specifically developed to identify resistance-associated genetic variants in the presence of resistance co-occurrence. We assessed their ability to select known causal mutations for resistance to a specific drug while avoiding the selection of mutations in genes associated with resistance to other drugs, thus we compared selected ML models for their applicability for MTB genome wide association studies. RESULTS AND DISCUSSION: In our analysis, ABESS significantly outperformed the other methods, selecting more relevant sets of mutations. Additionally, we demonstrated that aggregating rare mutations within protein-coding genes into markers indicative of changes in PFAM domains improved prediction quality, and these markers were predominantly selected by ABESS, suggesting their high informativeness. However, ABESS yielded lower prediction accuracy compared to logistic regression methods with regularization. | 2025 | 40606161
4354 | 15 | 0.9956 | ARDB--Antibiotic Resistance Genes Database. The treatment of infections is increasingly compromised by the ability of bacteria to develop resistance to antibiotics through mutations or through the acquisition of resistance genes. Antibiotic resistance genes also have the potential to be used for bio-terror purposes through genetically modified organisms. In order to facilitate the identification and characterization of these genes, we have created a manually curated database--the Antibiotic Resistance Genes Database (ARDB)--unifying most of the publicly available information on antibiotic resistance. Each gene and resistance type is annotated with rich information, including resistance profile, mechanism of action, ontology, COG and CDD annotations, as well as external links to sequence and protein databases. Our database also supports sequence similarity searches and implements an initial version of a tool for characterizing common mutations that confer antibiotic resistance. The information we provide can be used as compendium of antibiotic resistance factors as well as to identify the resistance genes of newly sequenced genes, genomes, or metagenomes. Currently, ARDB contains resistance information for 13,293 genes, 377 types, 257 antibiotics, 632 genomes, 933 species and 124 genera. ARDB is available at http://ardb.cbcb.umd.edu/. | 2009 | 18832362
5119 | 16 | 0.9956 | ROCker models for reliable detection and typing of short-read sequences carrying mcr, erm, mph, and lnu antibiotic resistance genes. Quantitative monitoring of emerging antimicrobial resistance genes (ARGs) using short-read sequences remains challenging due to the high frequency of amino acid functional domains and motifs shared with related but functionally distinct (non-target) proteins. To facilitate ARG monitoring efforts using unassembled short reads, we present novel ROCker models for mcr, mph, erm, and lnu ARG families, as well as models for variants of special public health concern within these families, including mcr-1, mphA, ermB, lnuF, lnuB, and lnuG genes. For this, we curated target gene sequence sets for model training and built these models using the recently updated ROCker V2 pipeline (Gerhardt et al., in review). To validate our models, we simulated reads from the whole genome of ARG-carrying isolates spanning a range of common read lengths and used them to challenge the filtering efficacy of ROCker versus common static filtering approaches, such as similarity searches using BLASTx with various e-value thresholds or hidden Markov models. ROCker models consistently showed F1 scores up to 10× higher (31% higher on average) and lower false-positive (by 30%, on average) and false-negative (by 16%, on average) rates based on 250 bp reads compared to alternative methods. The ROCker models and all related reference materials and data are freely available through http://enve-omics.ce.gatech.edu/rocker/models, further expanding the available model collection previously developed for other genes. Their application to short-read metagenomes, metatranscriptomes, and PCR amplicon data should facilitate more accurate classification and quantification of unassembled short-read sequences for these ARG families and specific genes. IMPORTANCE: Antimicrobial resistance gene families encoding erm and mph genes confer resistance to the macrolide class of antimicrobials, which are used to treat a wide range of infections. Similarly, the mcr gene family confers resistance to polymyxin E (colistin), a drug of last resort for many serious drug-resistant bacterial infections, and the lnu gene family confers resistance to lincomycin, which is reserved for patients allergic to penicillin or where bacteria have developed resistance to other antimicrobials. Assessing the prevalence of these genes in clinical or environmental samples and monitoring their spread to new pathogens are thus important for quantifying the associated public health risk. However, detecting these and other resistance genes in short-read sequence data is technically challenging. Our ROCker bioinformatic pipeline achieves reliable detection and typing of broad-range target gene sequences in complex data sets, thus contributing toward solving an important problem in ongoing surveillance efforts of antimicrobial resistance. | 2025 | 41143534
9078 | 17 | 0.9955 | MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. MOTIVATION: Antibiotic resistance is an important global public health problem. Human gut microbiota is an accumulator of resistance genes potentially providing them to pathogens. It is important to develop tools for identifying the mechanisms of how resistance is transmitted between gut microbial species and pathogens. RESULTS: We developed MetaCherchant, an algorithm for extracting the genomic environment of antibiotic resistance genes from metagenomic data in the form of a graph. The algorithm was validated on a number of simulated and published datasets, as well as applied to new 'shotgun' metagenomes of gut microbiota from patients with Helicobacter pylori who underwent antibiotic therapy. Genomic context was reconstructed for several major resistance genes. Taxonomic annotation of the context suggests that within a single metagenome, the resistance genes can be contained in genomes of multiple species. MetaCherchant allows reconstruction of mobile elements with resistance genes within the genomes of bacteria using metagenomic data. Application of MetaCherchant in differential mode produced specific graph structures suggesting the evidence of possible resistance gene transmission within a mobile element that occurred as a result of the antibiotic therapy. MetaCherchant is a promising tool giving researchers an opportunity to get an insight into dynamics of resistance transmission in vivo basing on metagenomic data. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at https://github.com/ctlab/metacherchant. The code is written in Java and is platform-independent. CONTACT: ulyantsev@rain.ifmo.ru. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. | 2018 | 29092015
9083180.9955ARGNet: using deep neural networks for robust identification and classification of antibiotic resistance genes from sequences. BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components to define bacterial resistance and their spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by choice of reference databases and may potentially miss novel ARGs. The similarity thresholds are usually simple and could not accommodate variations across different gene families and regions. It is also difficult to scale up when sequence data are increasing. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs that do not depend on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length protein or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios especially quasi-negative test and the analysis of prediction consistency with phylogenetic tree. ARGNet has a reduced inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk.202438725076
9081190.9955Identification and reconstruction of novel antibiotic resistance genes from metagenomes. BACKGROUND: Environmental and commensal bacteria maintain a diverse and largely unknown collection of antibiotic resistance genes (ARGs) that, over time, may be mobilized and transferred to pathogens. Metagenomics enables cultivation-independent characterization of bacterial communities but the resulting data is noisy and highly fragmented, severely hampering the identification of previously undescribed ARGs. We have therefore developed fARGene, a method for identification and reconstruction of ARGs directly from shotgun metagenomic data. RESULTS: fARGene uses optimized gene models and can therefore with high accuracy identify previously uncharacterized resistance genes, even if their sequence similarity to known ARGs is low. By performing the analysis directly on the metagenomic fragments, fARGene also circumvents the need for a high-quality assembly. To demonstrate the applicability of fARGene, we reconstructed β-lactamases from five billion metagenomic reads, resulting in 221 ARGs, of which 58 were previously not reported. Based on 38 ARGs reconstructed by fARGene, experimental verification showed that 81% provided a resistance phenotype in Escherichia coli. Compared to other methods for detecting ARGs in metagenomic data, fARGene has superior sensitivity and the ability to reconstruct previously unknown genes directly from the sequence reads. CONCLUSIONS: We conclude that fARGene provides an efficient and reliable way to explore the unknown resistome in bacterial communities. The method is applicable to any type of ARGs and is freely available via GitHub under the MIT license.201930935407