ARGNet: using deep neural networks for robust identification and classification of antibiotic resistance genes from sequences. - Related Documents

#	Rank	Similarity	Title + Abs.	Year	PMID
9083	0	1.0000	ARGNet: using deep neural networks for robust identification and classification of antibiotic resistance genes from sequences. BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components to define bacterial resistance and their spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by choice of reference databases and may potentially miss novel ARGs. The similarity thresholds are usually simple and could not accommodate variations across different gene families and regions. It is also difficult to scale up when sequence data are increasing. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs that do not depend on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length protein or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios especially quasi-negative test and the analysis of prediction consistency with phylogenetic tree. ARGNet has a reduced inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. 
ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk.	2024	38725076
5115	1	0.9992	Search Engine for Antimicrobial Resistance: A Cloud Compatible Pipeline and Web Interface for Rapidly Detecting Antimicrobial Resistance Genes Directly from Sequence Data. BACKGROUND: Antimicrobial resistance remains a growing and significant concern in human and veterinary medicine. Current laboratory methods for the detection and surveillance of antimicrobial resistant bacteria are limited in their effectiveness and scope. With the rapidly developing field of whole genome sequencing beginning to be utilised in clinical practice, the ability to interrogate sequencing data quickly and easily for the presence of antimicrobial resistance genes will become increasingly important and useful for informing clinical decisions. Additionally, use of such tools will provide insight into the dynamics of antimicrobial resistance genes in metagenomic samples such as those used in environmental monitoring. RESULTS: Here we present the Search Engine for Antimicrobial Resistance (SEAR), a pipeline and web interface for detection of horizontally acquired antimicrobial resistance genes in raw sequencing data. The pipeline provides gene information, abundance estimation and the reconstructed sequence of antimicrobial resistance genes; it also provides web links to additional information on each gene. The pipeline utilises clustering and read mapping to annotate full-length genes relative to a user-defined database. It also uses local alignment of annotated genes to a range of online databases to provide additional information. We demonstrate SEAR's application in the detection and abundance estimation of antimicrobial resistance genes in two novel environmental metagenomes, 32 human faecal microbiome datasets and 126 clinical isolates of Shigella sonnei. 
CONCLUSIONS: We have developed a pipeline that contributes to the improved capacity for antimicrobial resistance detection afforded by next generation sequencing technologies, allowing for rapid detection of antimicrobial resistance genes directly from sequencing data. SEAR uses raw sequencing data via an intuitive interface so can be run rapidly without requiring advanced bioinformatic skills or resources. Finally, we show that SEAR is effective in detecting antimicrobial resistance genes in metagenomic and isolate sequencing data from both environmental metagenomes and sequencing data from clinical isolates.	2015	26197475
9554	2	0.9992	A multi-label learning framework for predicting antibiotic resistance genes via dual-view modeling. The increasing prevalence of antibiotic resistance has become a global health crisis. For the purpose of safety regulation, it is of high importance to identify antibiotic resistance genes (ARGs) in bacteria. Although culture-based methods can identify ARGs relatively more accurately, the identifying process is time-consuming and specialized knowledge is required. With the rapid development of whole genome sequencing technology, researchers attempt to identify ARGs by computing sequence similarity from public databases. However, these computational methods might fail to detect ARGs due to the low sequence identity to known ARGs. Moreover, existing methods cannot effectively address the issue of multidrug resistance prediction for ARGs, which is a great challenge to clinical treatments. To address the challenges, we propose an end-to-end multi-label learning framework for predicting ARGs. More specifically, the task of ARGs prediction is modeled as a problem of multi-label learning, and a deep neural network-based end-to-end framework is proposed, in which a specific loss function is introduced to employ the advantage of multi-label learning for ARGs prediction. In addition, a dual-view modeling mechanism is employed to make full use of the semantic associations among two views of ARGs, i.e. sequence-based information and structure-based information. Extensive experiments are conducted on publicly available data, and experimental results demonstrate the effectiveness of the proposed framework on the task of ARGs prediction.	2022	35272349
9081	3	0.9992	Identification and reconstruction of novel antibiotic resistance genes from metagenomes. BACKGROUND: Environmental and commensal bacteria maintain a diverse and largely unknown collection of antibiotic resistance genes (ARGs) that, over time, may be mobilized and transferred to pathogens. Metagenomics enables cultivation-independent characterization of bacterial communities but the resulting data is noisy and highly fragmented, severely hampering the identification of previously undescribed ARGs. We have therefore developed fARGene, a method for identification and reconstruction of ARGs directly from shotgun metagenomic data. RESULTS: fARGene uses optimized gene models and can therefore with high accuracy identify previously uncharacterized resistance genes, even if their sequence similarity to known ARGs is low. By performing the analysis directly on the metagenomic fragments, fARGene also circumvents the need for a high-quality assembly. To demonstrate the applicability of fARGene, we reconstructed β-lactamases from five billion metagenomic reads, resulting in 221 ARGs, of which 58 were previously not reported. Based on 38 ARGs reconstructed by fARGene, experimental verification showed that 81% provided a resistance phenotype in Escherichia coli. Compared to other methods for detecting ARGs in metagenomic data, fARGene has superior sensitivity and the ability to reconstruct previously unknown genes directly from the sequence reads. CONCLUSIONS: We conclude that fARGene provides an efficient and reliable way to explore the unknown resistome in bacterial communities. The method is applicable to any type of ARGs and is freely available via GitHub under the MIT license.	2019	30935407
7698	4	0.9991	Detecting horizontal gene transfer with metagenomics co-barcoding sequencing. Horizontal gene transfer (HGT) is the process through which genetic information is transferred between different genomes, and it has played a crucial role in bacterial evolution. HGT can enable bacteria to rapidly acquire antibiotic resistance, and bacteria that have acquired resistance are spreading within the microbiome. Conventional methods of characterizing HGT patterns include short-read metagenomic sequencing (short-reads mNGS), long-read sequencing, and single-cell sequencing. These approaches present several limitations, such as short-read fragments, high amounts of input DNA, and sequencing costs, respectively. Here, we attempt to circumvent present limitations to detect HGT by developing a metagenomics co-barcode sequencing workflow (MECOS) and applying it to the human and mouse gut microbiomes. With this workflow, we increased contig length more than 10-fold compared to short-reads mNGS and obtained more than 30 million paired reads with co-barcode information. Applying the novel bioinformatic pipeline, we integrated this co-barcoding information and the context information from long reads, and observed over 50-fold HGT events after we corrected the potentially erroneous HGT events. Specifically, we detected approximately 3,000 HGT blocks in individual samples, encompassing ~6,000 genes and ~100 taxonomic groups, including loci conferring tetracycline resistance through ribosomal protection. 
MECOS provides a valuable tool for investigating HGT and advances our understanding of the evolution of natural microbial communities within hosts. IMPORTANCE: In this study, to better identify horizontal gene transfer (HGT) in individual samples, we introduce a new co-barcoding sequencing system called metagenomics co-barcoding sequencing (MECOS), which has four significant improvements: (i) long DNA fragment extraction, (ii) a special transposome insertion, (iii) hybridization of DNA to barcode beads, and (iv) an integrated bioinformatic pipeline. Using our approach, we increased contig length more than 10-fold compared to short-reads mNGS, and observed over 50-fold HGT events after we corrected the potentially erroneous HGT events. Our results indicate the presence of approximately 3,000 HGT blocks, involving roughly 6,000 genes and 100 taxonomic groups in individual samples. Notably, these HGT events are predominantly enriched in genes that confer tetracycline resistance via ribosomal protection. MECOS is a useful tool for investigating HGT and the evolution of natural microbial communities within hosts, thereby advancing our understanding of microbial ecology and evolution.	2024	38315121
5118	5	0.9991	Automated extraction of genes associated with antibiotic resistance from the biomedical literature. The detection of bacterial antibiotic resistance phenotypes is important when carrying out clinical decisions for patient treatment. Conventional phenotypic testing involves culturing bacteria which requires a significant amount of time and work. Whole-genome sequencing is emerging as a fast alternative to resistance prediction, by considering the presence/absence of certain genes. A lot of research has focused on determining which bacterial genes cause antibiotic resistance and efforts are being made to consolidate these facts in knowledge bases (KBs). KBs are usually manually curated by domain experts to be of the highest quality. However, this limits the pace at which new facts are added. Automated relation extraction of gene-antibiotic resistance relations from the biomedical literature is one solution that can simplify the curation process. This paper reports on the development of a text mining pipeline that takes in English biomedical abstracts and outputs genes that are predicted to cause resistance to antibiotics. To test the generalisability of this pipeline it was then applied to predict genes associated with Helicobacter pylori antibiotic resistance, that are not present in common antibiotic resistance KBs or publications studying H. pylori. These genes would be candidates for further lab-based antibiotic research and inclusion in these KBs. For relation extraction, state-of-the-art deep learning models were used. These models were trained on a newly developed silver corpus which was generated by distant supervision of abstracts using the facts obtained from KBs. The top performing model was superior to a co-occurrence model, achieving a recall of 95%, a precision of 60% and F1-score of 74% on a manually annotated holdout dataset. 
To our knowledge, this project was the first attempt at developing a complete text mining pipeline that incorporates deep learning models to extract gene-antibiotic resistance relations from the literature. Additional related data can be found at https://github.com/AndreBrincat/Gene-Antibiotic-Resistance-Relation-Extraction.	2022	35134132
5102	6	0.9990	Pipeline for Antimicrobial Resistance Gene Quantification from Host Tissue. Antibiotics are frequently used in food production animals to control disease and improve productivity, but this promotes the development of antimicrobial resistance (AMR) and the subsequent broader spread of AMR bacteria throughout the food chain, endangering the well-being and health of both animals and humans. In humans, the gut microbiome harbors a diverse range of AMR bacteria, known as the resistome. Effectively mitigating AMR in food animals requires first determining the expression and abundance of AMR-related genes in the gut resistome. Currently, such knowledge in regard to food animals is largely lacking. Gut tissue RNA sequencing (GTRS) can capture metabolically active transcripts from both the host and the microbes attached to the gut epithelium. Ideally, AMR genes can be quantified using GTRS data, making it possible to study the relationship between host and microbe. For the majority of these GTRS studies, only host transcriptome changes have been reported, while the microbial AMR remains largely unexamined, mainly due to the lack of easily implementable bioinformatics tools. Here we present a straightforward workflow to accomplish that using common command-line bioinformatics tools. With this pipeline, the host is considered noise, and host data are filtered out from the microbial reads. The pipeline then continues through AMR transcript quantification, differential gene expression, and SNP analysis. Using open-source tools, we made this analytical pipeline easy to implement and able to generate results ready to be incorporated into publishable reports. Published 2025. This article is a U.S. Government work and is in the public domain in the USA. 
Basic Protocol: Running the gene quantification pipeline Support Protocol 1: Downloading FASTQ files from the NCBI database Support Protocol 2: Building a genome reference index of the host Support Protocol 3: Differential gene expression analysis Support Protocol 4: Single-nucleotide polymorphism (SNP) analysis.	2025	40145236
9080	7	0.9990	Comparison of de-novo assembly tools for plasmid metagenome analysis. BACKGROUND: With the advent of next-generation sequencing techniques, culture-independent metagenome approaches have now made it possible to predict possible presence of genes in the environmental bacteria most of which may be non-cultivable. Short reads obtained from the deep sequencing can be assembled into long contigs some of which include plasmids. Plasmids are the circular double stranded DNA in bacteria and known as one of the major carriers of antibiotic resistance genes. OBJECTIVE: Metagenomic analyses, especially focused on plasmids, could help us predict dissemination mechanisms of antibiotic resistance genes in the environment. However, with the availability of a myriad of metagenomic assemblers, the selection of the most appropriate metagenome assembler for the plasmid metagenome study might be challenging. Therefore, in this study, we compared five open source assemblers to suggest most effective way of plasmid metagenome analysis. METHODS: IDBA-UD, MEGAHIT, SPAdes, SOAPdenovo2, and Velvet are compared for conducting plasmid metagenome analyses using two water samples. RESULTS: Our results clearly showed that abundance and types of antibiotic resistance genes on plasmids varied depending on the selection of assembly tools. IDBA-UD and MEGAHIT demonstrated the overall best assembly statistics with high N50 values with higher portion of longer contigs. CONCLUSION: These two assemblers also detected more diverse plasmids. Among the two, MEGAHIT showed more memory efficient assembly, therefore we suggest that the use of MEGAHIT for plasmid metagenome analysis may offer more diverse plasmids with less computer resource required. Here, we also summarized a fundamental plasmid metagenome work flow, especially for antibiotic resistance gene investigation.	2019	31187446
5100	8	0.9990	DeepPBI-KG: a deep learning method for the prediction of phage-bacteria interactions based on key genes. Phages, the natural predators of bacteria, were discovered more than 100 years ago. However, increasing antimicrobial resistance rates have revitalized phage research. Methods that are less time-consuming and more efficient than wet-laboratory experiments are needed to help screen phages quickly for therapeutic use. Traditional computational methods usually ignore the fact that phage-bacteria interactions are achieved by key genes and proteins. Methods for intraspecific prediction are rare since almost all existing methods consider only interactions at the species and genus levels. Moreover, most strains in existing databases contain only partial genome information because whole-genome information for species is difficult to obtain. Here, we propose a new approach for interaction prediction by constructing new features from key genes and proteins via the application of K-means sampling to select high-quality negative samples for prediction. Finally, we develop DeepPBI-KG, a corresponding prediction tool based on feature selection and a deep neural network. The results show that the average area under the curve for prediction reached 0.93 for each strain, and the overall AUC and area under the precision-recall curve reached 0.89 and 0.92, respectively, on the independent test set; these values are greater than those of other existing prediction tools. The forward and reverse validation results indicate that key genes and key proteins regulate and influence the interaction, which supports the reliability of the model. In addition, intraspecific prediction experiments based on Klebsiella pneumoniae data demonstrate the potential applicability of DeepPBI-KG for intraspecific prediction. 
In summary, the feature engineering and interaction prediction approaches proposed in this study can effectively improve the robustness and stability of interaction prediction, can achieve high generalizability, and may provide new directions and insights for rapid phage screening for therapy.	2024	39344712
5119	9	0.9990	ROCker models for reliable detection and typing of short-read sequences carrying mcr, erm, mph, and lnu antibiotic resistance genes. Quantitative monitoring of emerging antimicrobial resistance genes (ARGs) using short-read sequences remains challenging due to the high frequency of amino acid functional domains and motifs shared with related but functionally distinct (non-target) proteins. To facilitate ARG monitoring efforts using unassembled short reads, we present novel ROCker models for mcr, mph, erm, and lnu ARG families, as well as models for variants of special public health concern within these families, including mcr-1, mphA, ermB, lnuF, lnuB, and lnuG genes. For this, we curated target gene sequence sets for model training and built these models using the recently updated ROCker V2 pipeline (Gerhardt et al., in review). To validate our models, we simulated reads from the whole genome of ARG-carrying isolates spanning a range of common read lengths and used them to challenge the filtering efficacy of ROCker versus common static filtering approaches, such as similarity searches using BLASTx with various e-value thresholds or hidden Markov models. ROCker models consistently showed F1 scores up to 10× higher (31% higher on average) and lower false-positive (by 30%, on average) and false-negative (by 16%, on average) rates based on 250 bp reads compared to alternative methods. The ROCker models and all related reference materials and data are freely available through http://enve-omics.ce.gatech.edu/rocker/models, further expanding the available model collection previously developed for other genes. 
Their application to short-read metagenomes, metatranscriptomes, and PCR amplicon data should facilitate more accurate classification and quantification of unassembled short-read sequences for these ARG families and specific genes. IMPORTANCE: Antimicrobial resistance gene families encoding erm and mph genes confer resistance to the macrolide class of antimicrobials, which are used to treat a wide range of infections. Similarly, the mcr gene family confers resistance to polymyxin E (colistin), a drug of last resort for many serious drug-resistant bacterial infections, and the lnu gene family confers resistance to lincomycin, which is reserved for patients allergic to penicillin or where bacteria have developed resistance to other antimicrobials. Assessing the prevalence of these genes in clinical or environmental samples and monitoring their spread to new pathogens are thus important for quantifying the associated public health risk. However, detecting these and other resistance genes in short-read sequence data is technically challenging. Our ROCker bioinformatic pipeline achieves reliable detection and typing of broad-range target gene sequences in complex data sets, thus contributing toward solving an important problem in ongoing surveillance efforts of antimicrobial resistance.	2025	41143534
5114	10	0.9990	Datasets for benchmarking antimicrobial resistance genes in bacterial metagenomic and whole genome sequencing. Whole genome sequencing (WGS) is a key tool in identifying and characterising disease-associated bacteria across clinical, agricultural, and environmental contexts. One increasingly common use of genomic and metagenomic sequencing is in identifying the type and range of antimicrobial resistance (AMR) genes present in bacterial isolates in order to make predictions regarding their AMR phenotype. However, there are a large number of alternative bioinformatics software and pipelines available, which can lead to dissimilar results. It is, therefore, vital that researchers carefully evaluate their genomic and metagenomic AMR analysis methods using a common dataset. To this end, as part of the Microbial Bioinformatics Hackathon and Workshop 2021, a 'gold standard' reference genomic and simulated metagenomic dataset was generated containing raw sequence reads mapped against their corresponding reference genome from a range of 174 potentially pathogenic bacteria. These datasets and their accompanying metadata are freely available for use in benchmarking studies of bacteria and their antimicrobial resistance genes and will help improve tool development for the identification of AMR genes in complex samples.	2022	35705638
4943	11	0.9990	Targeted sequencing of Enterobacterales bacteria using CRISPR-Cas9 enrichment and Oxford Nanopore Technologies. Sequencing DNA directly from patient samples enables faster pathogen characterization compared to traditional culture-based approaches, but often yields insufficient sequence data for effective downstream analysis. CRISPR-Cas9 enrichment is designed to improve the yield of low abundance sequences but has not been thoroughly explored with Oxford Nanopore Technologies (ONT) for use in clinical bacterial epidemiology. We designed CRISPR-Cas9 guide RNAs to enrich the human pathogen Klebsiella pneumoniae, by targeting multi-locus sequence type (MLST) and transfer RNA (tRNA) genes, as well as common antimicrobial resistance (AMR) genes and the resistance-associated integron gene intI1. We validated enrichment performance in 20 K. pneumoniae isolates, finding that guides generated successful enrichment across all conserved sites except for one AMR gene in two isolates. Enrichment of MLST genes led to a correct allele call in all seven loci for 8 out of 10 isolates that had depth of 30× or more in these regions. We then compared enriched and unenriched sequencing of three human fecal samples spiked with K. pneumoniae at varying abundance. Enriched sequencing generated 56× and 11.3× the number of AMR and MLST reads, respectively, compared to unenriched sequencing, and required approximately one-third of the computational storage space. Targeting the intI1 gene often led to detection of 10-20 proximal resistance genes due to the long reads produced by ONT sequencing. We demonstrated that CRISPR-Cas9 enrichment combined with ONT sequencing enabled improved genomic characterization outcomes over unenriched sequencing of patient samples. This method could be used to inform infection control strategies by identifying patients colonized with high-risk strains. 
IMPORTANCE: Understanding bacteria in complex samples can be challenging due to their low abundance, which often results in insufficient data for analysis. To improve the detection of harmful bacteria, we implemented a technique aimed at increasing the amount of data from target pathogens when combined with modern DNA sequencing technologies. Our technique uses CRISPR-Cas9 to target specific gene sequences in the bacterial pathogen Klebsiella pneumoniae and improve recovery from human stool samples. We found our enrichment method to significantly outperform traditional methods, generating far more data originating from our target genes. Additionally, we developed new computational techniques to further enhance the analysis, providing a thorough method for characterizing pathogens from complex biological samples.	2025	39772804
5101	12	0.9990	Identification of Key Features Pivotal to the Characteristics and Functions of Gut Bacteria Taxa through Machine Learning Methods. BACKGROUND: Gut bacteria critically influence digestion, facilitate the breakdown of complex food substances, aid in essential nutrient synthesis, and contribute to immune system balance. However, current knowledge regarding intestinal bacteria remains insufficient. OBJECTIVE: This study aims to discover essential differences for different intestinal bacteria. METHODS: This study was conducted by investigating a total of 1478 gut bacterial samples comprising 235 Actinobacteria, 447 Bacteroidetes, and 796 Firmicutes, by utilizing sophisticated machine learning algorithms. By building on the dataset provided by Chen et al., we engaged sophisticated machine learning techniques to further investigate and analyze the gut bacterial samples. Each sample in the dataset was described by 993 unique features associated with gut bacteria, including 342 features annotated by the Antibiotic Resistance Genes Database, Comprehensive Antibiotic Resistance Database, Kyoto Encyclopedia of Genes and Genomes, and Virulence Factors of Pathogenic Bacteria. We employed incremental feature selection methods within a computational framework to identify the optimal features for classification. RESULTS: Eleven feature ranking algorithms selected several key features as pivotal to the characteristics and functions of gut bacteria. These features appear to facilitate the identification of specific gut bacterial species. Additionally, we established quantitative rules for identifying Actinobacteria, Bacteroidetes, and Firmicutes. CONCLUSION: This research underscores the significant potential of machine learning in studying gut microbes and enhances our understanding of the multifaceted roles of gut bacteria.	2025	40671232
9744	13	0.9990	PARGT: a software tool for predicting antimicrobial resistance in bacteria. With the ever-increasing availability of whole-genome sequences, machine-learning approaches can be used as an alternative to traditional alignment-based methods for identifying new antimicrobial-resistance genes. Such approaches are especially helpful when pathogens cannot be cultured in the lab. In previous work, we proposed a game-theory-based feature evaluation algorithm. When using the protein characteristics identified by this algorithm, called 'features' in machine learning, our model accurately identified antimicrobial resistance (AMR) genes in Gram-negative bacteria. Here we extend our study to Gram-positive bacteria showing that coupling game-theory-identified features with machine learning achieved classification accuracies between 87% and 90% for genes encoding resistance to the antibiotics bacitracin and vancomycin. Importantly, we present a standalone software tool that implements the game-theory algorithm and machine-learning model used in these studies.	2020	32620856
6600	14	0.9990	Metagenomic approaches for the quantification of antibiotic resistance genes in swine wastewater treatment system: a systematic review. This systematic review aims to identify the metagenomic methodological approaches employed for the detection of antimicrobial resistance genes (ARGs) in swine wastewater treatment systems. The search terms used were metagenome AND bacteria AND ("antimicrobial resistance gene" OR resistome OR ARG) AND wastewater AND (swine OR pig), and the search was conducted across the following electronic databases: PubMed, Scopus, ScienceDirect, Web of Science, Embase, and Cochrane Library. The search was limited to studies published between 2020 and 2024. Of the 220 studies retrieved, eight met the eligibility criteria for full-text analysis. The number of publications in this research area has increased in recent years, with China contributing the highest number of studies. ARGs are typically identified using bioinformatics pipelines that include steps such as quality trimming, assembly, metagenome-assembled genome (MAG) reconstruction, open reading frame (ORF) prediction, and ARG annotation. However, comparing ARGs quantification across studies remains challenging due to methodological differences and variability in quantification approaches. Therefore, this systematic review highlights the need for methodological standardization to facilitate comparison and enhance our understanding of antimicrobial resistance in swine wastewater treatment systems through metagenomic approaches.	2025	40788461
5113	15	0.9990	Identification of bacterial antibiotic resistance genes in next-generation sequencing data (review of literature). The spread of antibiotic-resistant human bacterial pathogens is a serious threat to modern medicine. Antibiotic susceptibility testing is essential for optimizing treatment regimens and preventing the dissemination of antibiotic resistance. Therefore, the development of antibiotic susceptibility testing methods is a priority challenge of laboratory medicine. The aim of this review is to analyze the capabilities of bioinformatics tools for processing bacterial whole genome sequence data. The PubMed database, the Russian scientific electronic library eLIBRARY, and the information networks of the World Health Organization and the European Society of Clinical Microbiology and Infectious Diseases (ESCMID) were used during the analysis. In this review, the platforms for whole genome sequencing that are suitable for detection of bacterial genetic resistance determinants are described. The classic step in searching for genetic resistance determinants is an alignment between the query nucleotide/protein sequence and the subject (database) nucleotide/protein sequence, which is performed using nucleotide and protein sequence databases. The most commonly used databases are ResFinder, CARD, and the Bacterial Antimicrobial Resistance Reference Gene Database. The results of searching for resistance determinants in genome assemblies are more accurate than the results of searching in contigs. New bioinformatics tools for finding resistance genes, such as neural networks and machine learning, are discussed in the review. After critical appraisal of the current antibiotic resistance databases, we designed a protocol for predicting antibiotic resistance using whole genome sequence data. The designed protocol can be used as the basis of an algorithm for qualitative and quantitative antimicrobial susceptibility testing based on whole genome sequence data.	2021	34882354
6581	16	0.9990	Do wastewater treatment plants increase antibiotic resistant bacteria or genes in the environment? Protocol for a systematic review. BACKGROUND: Antibiotic resistance is a global public health threat. Water from human activities is collected at wastewater treatment plants where processes often do not sufficiently neutralize antibiotic resistant bacteria and genes, which are further shed into the local environment. This protocol outlines the steps to conduct a systematic review based on the Population, Exposure, Comparator and Outcome (PECO) framework, aiming at answering the question "Are antimicrobial-resistant enterobacteriaceae and antimicrobial resistance genes present (O) in air and water samples (P) taken either near or downstream or downwind or down-gradient from wastewater treatment plants (E), as compared to air and water samples taken either further away or upstream or upwind or up-gradient from such wastewater treatment plant (C)?" Presence of antimicrobial-resistant bacteria and genes will be quantitatively measured by extracting their prevalence or concentration, depending on the reviewed study. METHODS: We will search PubMed, EMBASE, the Cochrane database and Web of Science for original articles published from 1 Jan 2000 to 3 Sep 2018 with language restriction. Articles will undergo a relevance and a design screening process. Data from eligible articles will be extracted by two independent reviewers. Further, we will perform a risk of bias assessment using a decision matrix. We will synthesize and present results in narrative and tabular form and will perform a meta-analysis if heterogeneity of results allows it. DISCUSSION: Antibiotic resistance in environmental samples around wastewater treatment plants may pose a risk of exposure to workers and nearby residents. Results from the systematic review outlined in this protocol will make it possible to estimate the extent of exposure, inform policy making, and help design future studies.	2019	31806019
9075	17	0.9990	CamPype: an open-source workflow for automated bacterial whole-genome sequencing analysis focused on Campylobacter. BACKGROUND: The rapid expansion of Whole-Genome Sequencing has revolutionized the fields of clinical and food microbiology. However, its implementation as a routine laboratory technique remains challenging due to the growth of data at a faster rate than can be effectively analyzed and critical gaps in bioinformatics knowledge. RESULTS: To address both issues, CamPype was developed as a new bioinformatics workflow for the genomics analysis of sequencing data of bacteria, especially Campylobacter, which is the main cause of gastroenteritis worldwide and has a negative impact on the economy of public health systems. CamPype allows full customization of the stages to run and the tools to use, including read quality control filtering, read contamination, reads extension and assembly, bacterial typing, genome annotation, searching for antibiotic resistance genes, virulence genes and plasmids, pangenome construction and identification of nucleotide variants. All results are processed and summarized in an interactive HTML report for best data visualization and interpretation. CONCLUSIONS: The minimal user intervention of CamPype makes this workflow an attractive resource for microbiology laboratories with no expertise in bioinformatics as a first-line method for bacterial typing and epidemiological analyses, which would help to reduce the costs of disease outbreaks, or for comparative genomic analyses. CamPype is publicly available at https://github.com/JoseBarbero/CamPype .	2023	37474912
4296	18	0.9990	Twenty-first century molecular methods for analyzing antimicrobial resistance in surface waters to support One Health assessments. Antimicrobial resistance (AMR) in the environment is a growing global health concern, especially the dissemination of AMR into surface waters due to human and agricultural inputs. Within recent years, research has focused on trying to understand the impact of AMR in surface waters on human, agricultural and ecological health (One Health). While surface water quality assessments and surveillance of AMR have historically utilized culture-based methods, culturing bacteria has limitations due to difficulty in isolating environmental bacteria and the need for a priori information about the bacteria for selective isolation. The use of molecular techniques to analyze AMR at the genetic level has helped to overcome the difficulties with culture-based techniques since they do not require advance knowledge of the bacterial population and can analyze uncultivable environmental bacteria. The aim of this review is to provide an overview of common contemporary molecular methods available for analyzing AMR in surface waters, which include high throughput real-time polymerase chain reaction (HT-qPCR), metagenomics, and whole genome sequencing. This review will also feature how these methods may provide information on human and animal health risks. HT-qPCR works at the nanoliter scale, requires only a small amount of DNA, and can analyze numerous gene targets simultaneously, but may lack in analytical sensitivity and the ability to optimize individual assays compared to conventional qPCR. Metagenomics offers more detailed genomic information and taxonomic resolution than PCR by sequencing all the microbial genomes within a sample. Its open format allows for the discovery of new antibiotic resistance genes; however, the quantity of DNA necessary for this technique can be a limiting factor for surface water samples that typically have low numbers of bacteria per sample volume. Whole genome sequencing provides the complete genomic profile of a single environmental isolate and can identify all genetic elements that may confer AMR. However, a main disadvantage of this technique is that it only provides information about one bacterial isolate and is challenging to utilize for community analysis. While these contemporary techniques can quickly provide a vast array of information about AMR in surface waters, one technique does not fully characterize AMR nor its potential risks to human, animal, or ecological health. Rather, a combination of techniques (including both molecular- and culture-based) is necessary to fully understand AMR in surface waters from a One Health perspective.	2021	33774111
6544	19	0.9990	A rapid approach with machine learning for quantifying the relative burden of antimicrobial resistance in natural aquatic environments. The massive use and discharge of antibiotics have led to increasing concerns about antimicrobial resistance (AMR) in natural aquatic environments. Since the dose-response mechanisms of pathogens with AMR have not yet been fully understood, and the antibiotic resistance genes and bacteria-related data collection via field sampling and laboratory testing is time-consuming and expensive, designing a rapid approach to quantify the burden of AMR in the natural aquatic environment has become a challenge. To cope with such a challenge, a new approach involving an integrated machine-learning framework was developed by investigating the associations between the relative burden of AMR and easily accessible variables (i.e., relevant environmental variables and adjacent land-use patterns). The results, based on a real-world case analysis, demonstrate that the quantification time has been reduced from 3-7 days, which is typical for traditional measurement procedures with field sampling and laboratory testing, to approximately 0.5 hours using the new approach. Moreover, all five metrics for AMR relative burden quantification exceed the threshold level of 85%, with F1-score surpassing 0.92. Compared to logistic regression, decision trees, and basic random forest, the adaptive random forest model within the framework significantly improves quantification accuracy without sacrificing model interpretability. Two environmental variables, dissolved oxygen and resistivity, along with the proportion of green areas were identified as three key feature variables for the rapid quantification. This study contributes to the enrichment of burden analyses and management practices for rapid quantification of the relative burden of AMR without dose-response information.	2024	39047454