| # | Rank | Similarity | Title + Abs. | Year | PMID |
|---|---|---|---|---|---|
| 9076 | 0 | 0.9838 | ResiDB: An automated database manager for sequence data. The amount of publicly available DNA sequence data is drastically increasing, making it a tedious task to create sequence databases necessary for the design of diagnostic assays. The selection of appropriate sequences is especially challenging in genes affected by frequent point mutations such as antibiotic resistance genes. To overcome this issue, we have designed the webtool resiDB, a rapid and user-friendly sequence database manager for bacteria, fungi, viruses, protozoa, invertebrates, plants, archaea, environmental and whole genome shotgun sequence data. It automatically identifies and curates sequence clusters to create custom sequence databases based on user-defined input sequences. A collection of helpful visualization tools gives the user the opportunity to easily access, evaluate, edit, and download the newly created database. Consequently, researchers no longer have to manually manage sequence data retrieval, deal with hardware limitations, and run multiple independent software tools, each having its own requirements, input and output formats. Our tool was developed within the H2020 project FAPIC aiming to develop a single diagnostic assay targeting all sepsis-relevant pathogens and antibiotic resistance mechanisms. ResiDB is freely accessible to all users through https://residb.ait.ac.at/. | 2021 | 33495705 |
| 9075 | 1 | 0.9832 | CamPype: an open-source workflow for automated bacterial whole-genome sequencing analysis focused on Campylobacter. BACKGROUND: The rapid expansion of Whole-Genome Sequencing has revolutionized the fields of clinical and food microbiology. However, its implementation as a routine laboratory technique remains challenging due to the growth of data at a faster rate than can be effectively analyzed and critical gaps in bioinformatics knowledge. RESULTS: To address both issues, CamPype was developed as a new bioinformatics workflow for the genomics analysis of sequencing data of bacteria, especially Campylobacter, which is the main cause of gastroenteritis worldwide making a negative impact on the economy of the public health systems. CamPype allows full customization of stages to run and tools to use, including read quality control filtering, read contamination, reads extension and assembly, bacterial typing, genome annotation, searching for antibiotic resistance genes, virulence genes and plasmids, pangenome construction and identification of nucleotide variants. All results are processed and summarized in an interactive HTML report for best data visualization and interpretation. CONCLUSIONS: The minimal user intervention of CamPype makes this workflow an attractive resource for microbiology laboratories with no expertise in bioinformatics as a first line method for bacterial typing and epidemiological analyses, that would help to reduce the costs of disease outbreaks, or for comparative genomic analyses. CamPype is publicly available at https://github.com/JoseBarbero/CamPype. | 2023 | 37474912 |
| 9074 | 2 | 0.9832 | BacAnt: A Combination Annotation Server for Bacterial DNA Sequences to Identify Antibiotic Resistance Genes, Integrons, and Transposable Elements. Whole genome sequencing (WGS) of bacteria has become a routine method in diagnostic laboratories. One of the clinically most useful advantages of WGS is the ability to predict antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) in bacterial sequences. This allows comprehensive investigations of such genetic features but can also be used for epidemiological studies. A plethora of software programs have been developed for the detailed annotation of bacterial DNA sequences, such as rapid annotation using subsystem technology (RAST), Resfinder, ISfinder, INTEGRALL and The Transposon Registry. Unfortunately, to this day, a reliable annotation tool of the combination of ARGs and MGEs is not available, and the generation of genbank files requires much manual input. Here, we present a new webserver which allows the annotation of ARGs, integrons and transposable elements at the same time. The pipeline generates genbank files automatically, which are compatible with Easyfig for comparative genomic analysis. Our BacAnt code and standalone software package are available at https://github.com/xthua/bacant with an accompanying web application at http://bacant.net. | 2021 | 34367079 |
| 9083 | 3 | 0.9831 | ARGNet: using deep neural networks for robust identification and classification of antibiotic resistance genes from sequences. BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components to define bacterial resistance and their spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by choice of reference databases and may potentially miss novel ARGs. The similarity thresholds are usually simple and could not accommodate variations across different gene families and regions. It is also difficult to scale up when sequence data are increasing. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs that do not depend on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length protein or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios especially quasi-negative test and the analysis of prediction consistency with phylogenetic tree. ARGNet has a reduced inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk. Video Abstract. | 2024 | 38725076 |
| 9072 | 4 | 0.9826 | PanGeT: Pan-genomics tool. A decade after the concept of Pan-genome was first introduced, research in this field has spread its tentacles to areas such as pathogenesis of diseases, bacterial evolutionary studies and drug resistance. Gene content-based differentiation of virulent and avirulent strains of bacteria and identification of pathogen specific genes is imperative to understand their physiology and gain insights into the mechanism of genome evolution. Subsequently, this will aid in identifying diagnostic targets and in developing and selecting vaccines. The root of pan-genomic studies, however, is to identify the core genes, dispensable genes and strain specific genes across the genomes belonging to a clade. To this end, we have developed a tool, "PanGeT - Pan-genomics Tool" to compute the 'pan-genome' based on comparisons at the genome as well as the proteome levels. This automated tool is implemented using LaTeX libraries for effective visualization of overall pan-genome through graphical plots. Links to retrieve sequence information and functional annotations have also been provided. PanGeT can be downloaded from http://pranag.physics.iisc.ernet.in/PanGeT/ or https://github.com/PanGeTv1/PanGeT. | 2017 | 27851981 |
| 9067 | 5 | 0.9825 | PIPdb: a comprehensive plasmid sequence resource for tracking the horizontal transfer of pathogenic factors and antimicrobial resistance genes. Plasmids, as independent genetic elements, carrying resistance or virulence genes and transfer them among different pathogens, posing a significant threat to human health. Under the 'One Health' approach, it is crucial to control the spread of plasmids carrying such genes. To achieve this, a comprehensive characterization of plasmids in pathogens is essential. Here we present the Plasmids in Pathogens Database (PIPdb), a pioneering resource that includes 792 964 plasmid segment clusters (PSCs) derived from 1 009 571 assembled genomes across 450 pathogenic species from 110 genera. To our knowledge, PIPdb is the first database specifically dedicated to plasmids in pathogenic bacteria, offering detailed multi-dimensional metadata such as collection date, geographical origin, ecosystem, host taxonomy, and habitat. PIPdb also provides extensive functional annotations, including plasmid type, insertion sequences, integron, oriT, relaxase, T4CP, virulence factors genes, heavy metal resistance genes and antibiotic resistance genes. The database features a user-friendly interface that facilitates studies on plasmids across diverse host taxa, habitats, and ecosystems, with a focus on those carrying antimicrobial resistance genes (ARGs). We have integrated online tools for plasmid identification and annotation from assembled genomes. Additionally, PIPdb includes a risk-scoring system for identifying potentially high-risk plasmids. The PIPdb web interface is accessible at https://nmdc.cn/pipdb. | 2025 | 39460620 |
| 9068 | 6 | 0.9822 | TnCentral: a Prokaryotic Transposable Element Database and Web Portal for Transposon Analysis. We describe here the structure and organization of TnCentral (https://tncentral.proteininformationresource.org/ [or the mirror link at https://tncentral.ncc.unesp.br/]), a web resource for prokaryotic transposable elements (TE). TnCentral currently contains ∼400 carefully annotated TE, including transposons from the Tn3, Tn7, Tn402, and Tn554 families; compound transposons; integrons; and associated insertion sequences (IS). These TE carry passenger genes, including genes conferring resistance to over 25 classes of antibiotics and nine types of heavy metal, as well as genes responsible for pathogenesis in plants, toxin/antitoxin gene pairs, transcription factors, and genes involved in metabolism. Each TE has its own entry page, providing details about its transposition genes, passenger genes, and other sequence features required for transposition, as well as a graphical map of all features. TnCentral content can be browsed and queried through text- and sequence-based searches with a graphic output. We describe three use cases, which illustrate how the search interface, results tables, and entry pages can be used to explore and compare TE. TnCentral also includes downloadable software to facilitate user-driven identification, with manual annotation, of certain types of TE in genomic sequences. Through the TnCentral homepage, users can also access TnPedia, which provides comprehensive reviews of the major TE families, including an extensive general section and specialized sections with descriptions of insertion sequence and transposon families. TnCentral and TnPedia are intuitive resources that can be used by clinicians and scientists to assess TE diversity in clinical, veterinary, and environmental samples. IMPORTANCE The ability of bacteria to undergo rapid evolution and adapt to changing environmental circumstances drives the public health crisis of multiple antibiotic resistance, as well as outbreaks of disease in economically important agricultural crops and animal husbandry. Prokaryotic transposable elements (TE) play a critical role in this. Many carry "passenger genes" (not required for the transposition process) conferring resistance to antibiotics or heavy metals or causing disease in plants and animals. Passenger genes are spread by normal TE transposition activities and by insertion into plasmids, which then spread via conjugation within and across bacterial populations. Thus, an understanding of TE composition and transposition mechanisms is key to developing strategies to combat bacterial pathogenesis. Toward this end, we have developed TnCentral, a bioinformatics resource dedicated to describing and exploring the structural and functional features of prokaryotic TE whose use is intuitive and accessible to users with or without bioinformatics expertise. | 2021 | 34517763 |
| 9077 | 7 | 0.9822 | The PLSDB 2025 update: enhanced annotations and improved functionality for comprehensive plasmid research. Plasmids are extrachromosomal DNA molecules in bacteria and archaea, playing critical roles in horizontal gene transfer, antibiotic resistance, and pathogenicity. Since its first release in 2018, our database on plasmids, PLSDB, has significantly grown and enhanced its content and scope. From 34 513 records contained in the 2021 version, PLSDB now hosts 72 360 entries. Designed to provide life scientists with convenient access to extensive plasmid data and to support computer scientists by offering curated datasets for artificial intelligence (AI) development, this latest update brings more comprehensive and accurate information for plasmid research, with interactive visualization options. We enriched PLSDB by refining the identification and classification of plasmid host ecosystems and host diseases. Additionally, we incorporated annotations for new functional structures, including protein-coding genes and biosynthetic gene clusters. Further, we enhanced existing annotations, such as antimicrobial resistance genes and mobility typing. To accommodate these improvements and to host the increased plasmid sets, the webserver architecture and underlying data structures of PLSDB have been reconstructed, resulting in decreased response times and enhanced visualization of features while ensuring that users have access to a more efficient and user-friendly interface. The latest release of PLSDB is freely accessible at https://www.ccb.uni-saarland.de/plsdb2025. | 2025 | 39565221 |
| 5125 | 8 | 0.9816 | Do we still need Illumina sequencing data? Evaluating Oxford Nanopore Technologies R10.4.1 flow cells and the Rapid v14 library prep kit for Gram negative bacteria whole genome assemblies. The best whole genome assemblies are currently built from a combination of highly accurate short-read sequencing data and long-read sequencing data that can bridge repetitive and problematic regions. Oxford Nanopore Technologies (ONT) produce long-read sequencing platforms and they are continually improving their technology to obtain higher quality read data that is approaching the quality obtained from short-read platforms such as Illumina. As these innovations continue, we evaluated how much ONT read coverage produced by the Rapid Barcoding Kit v14 (SQK-RBK114) is necessary to generate high-quality hybrid and long-read-only genome assemblies for a panel of carbapenemase-producing Enterobacterales bacterial isolates. We found that 30× long-read coverage is sufficient if Illumina data are available, and that more (at least 100×) long-read coverage is recommended for long-read-only assemblies. Illumina polishing is still improving single nucleotide variants (SNVs) and INDELs in long-read-only assemblies. We also examined if antimicrobial resistance genes could be accurately identified in long-read-only data, and found that Flye assemblies regardless of ONT coverage detected >96% of resistance genes at 100% identity and length. Overall, the Rapid Barcoding Kit v14 and long-read-only assemblies can be an optimal sequencing strategy (i.e., plasmid characterization and AMR detection) but finer-scale analyses (i.e., SNV) still benefit from short-read data. | 2024 | 38354391 |
| 9079 | 9 | 0.9815 | Review, Evaluation, and Directions for Gene-Targeted Assembly for Ecological Analyses of Metagenomes. Shotgun metagenomics has greatly advanced our understanding of microbial communities over the last decade. Metagenomic analyses often include assembly and genome binning, computationally daunting tasks especially for big data from complex environments such as soil and sediments. In many studies, however, only a subset of genes and pathways involved in specific functions are of interest; thus, it is not necessary to attempt global assembly. In addition, methods that target genes can be computationally more efficient and produce more accurate assembly by leveraging rich databases, especially for those genes that are of broad interest such as those involved in biogeochemical cycles, biodegradation, and antibiotic resistance or used as phylogenetic markers. Here, we review six gene-targeted assemblers with unique algorithms for extracting and/or assembling targeted genes: Xander, MegaGTA, SAT-Assembler, HMM-GRASPx, GenSeed-HMM, and MEGAN. We tested these tools using two datasets with known genomes, a synthetic community of artificial reads derived from the genomes of 17 bacteria, shotgun sequence data from a mock community with 48 bacteria and 16 archaea genomes, and a large soil shotgun metagenomic dataset. We compared assemblies of a universal single copy gene (rplB) and two N cycle genes (nifH and nirK). We measured their computational efficiency, sensitivity, specificity, and chimera rate and found Xander and MegaGTA, which both use a probabilistic graph structure to model the genes, have the best overall performance with all three datasets, although MEGAN, a reference matching assembler, had better sensitivity with synthetic and mock community members chosen from its reference collection. Also, Xander and MegaGTA are the only tools that include post-assembly scripts tuned for common molecular ecology and diversity analyses. Additionally, we provide a mathematical model for estimating the probability of assembling targeted genes in a metagenome for estimating required sequencing depth. | 2019 | 31749830 |
| 9066 | 10 | 0.9810 | VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria. VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, as well as extends these trait transfer-related genetic contexts, in newly sequenced pathogenic bacterial genomes. The used backend database MobilomeDB was firstly built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. With the integration of the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than the other widely used methods. In addition, VRprofile also provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters, or a variety set of bacterial genomes. VRprofile might contribute to meet the increasing demands of re-annotations of bacterial variable regions, and aid in the real-time definitions of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. | 2018 | 28077405 |
| 9069 | 11 | 0.9808 | Pdif-mediated antibiotic resistance genes transfer in bacteria identified by pdifFinder. Modules consisting of antibiotic resistance genes (ARGs) flanked by inverted repeat Xer-specific recombination sites were thought to be mobile genetic elements that promote horizontal transmission. Less frequently, the presence of mobile modules in plasmids, which facilitate a pdif-mediated ARGs transfer, has been reported. Here, numerous ARGs and toxin-antitoxin genes have been found in pdif site pairs. However, the mechanisms underlying this apparent genetic mobility are currently not understood, and the studies relating to pdif-mediated ARGs transfer onto most bacterial genera are lacking. We developed the web server pdifFinder based on an algorithm called PdifSM that allows the prediction of diverse pdif-ARGs modules in bacterial genomes. Using a test set consisting of almost 32 thousand plasmids from 717 species, PdifSM identified 481 plasmids from various bacteria containing pdif sites with ARGs. We found 28-bp-long elements from different genera with clear base preferences. The data we obtained indicate that XerCD-dif site-specific recombination mechanism may have evolutionarily adapted to facilitate the pdif-mediated ARGs transfer. Through multiple sequence alignment and evolutionary analyses of duplicated pdif-ARGs modules, we discovered that pdif sites allow an interspecies transfer of ARGs but also across different genera. Mutations in pdif sites generate diverse arrays of modules which mediate multidrug-resistance, as these contain variable numbers of diverse ARGs, insertion sequences and other functional genes. The identification of pdif-ARGs modules and studies focused on the mechanism of ARGs co-transfer will help us to understand and possibly allow controlling the spread of MDR bacteria in clinical settings. The pdifFinder code, standalone software package and description with tutorials are available at https://github.com/mjshao06/pdifFinder. | 2023 | 36470841 |
| 5126 | 12 | 0.9808 | Blanket antimicrobial resistance gene database with structural information, BOARDS, provides insights on historical landscape of resistance prevalence and effects of mutations in enzyme structure. Antimicrobial resistance (AMR) in pathogenic bacteria poses a significant threat to public health, yet there is still a need for development in the tools to deeply understand AMR genes based on genetic or structural information. In this study, we present an interactive web database named Blanket Overarching Antimicrobial-Resistance gene Database with Structural information (BOARDS, sbml.unist.ac.kr), a database that comprehensively includes 3,943 reported AMR gene information for 1,997 extended spectrum beta-lactamase (ESBL) and 1,946 other genes as well as a total of 27,395 predicted protein structures. These structures, which include both wild-type AMR genes and their mutants, were derived from 80,094 publicly available whole-genome sequences. In addition, we developed the rapid analysis and detection tool of antimicrobial-resistance (RADAR), a one-stop analysis pipeline to detect AMR genes across whole-genome sequencing (WGSs). By integrating BOARDS and RADAR, the AMR prevalence landscape for eight multi-drug resistant pathogens was reconstructed, leading to unexpected findings such as the pre-existence of the MCR genes before their official reports. Enzymatic structure prediction-based analysis revealed that the occurrence of mutations found in some ESBL genes was found to be closely related to the binding affinities with their antibiotic substrates. Overall, BOARDS can play a significant role in performing in-depth analysis on AMR. IMPORTANCE While the increasing antibiotic resistance (AMR) in pathogen has been a burden on public health, effective tools for deep understanding of AMR based on genetic or structural information remain limited. In this study, a blanket overarching antimicrobial-resistance gene database with structure information (BOARDS), a web-based database that comprehensively collected AMR gene data with predictive protein structural information, was constructed. Additionally, we report the development of a RADAR pipeline that can analyze whole-genome sequences as well. BOARDS, which includes sequence and structural information, has shown the historical landscape and prevalence of the AMR genes and can provide insight into single-nucleotide polymorphism effects on antibiotic degrading enzymes within protein structures. | 2024 | 38085058 |
| 9078 | 13 | 0.9807 | MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. MOTIVATION: Antibiotic resistance is an important global public health problem. Human gut microbiota is an accumulator of resistance genes potentially providing them to pathogens. It is important to develop tools for identifying the mechanisms of how resistance is transmitted between gut microbial species and pathogens. RESULTS: We developed MetaCherchant, an algorithm for extracting the genomic environment of antibiotic resistance genes from metagenomic data in the form of a graph. The algorithm was validated on a number of simulated and published datasets, as well as applied to new 'shotgun' metagenomes of gut microbiota from patients with Helicobacter pylori who underwent antibiotic therapy. Genomic context was reconstructed for several major resistance genes. Taxonomic annotation of the context suggests that within a single metagenome, the resistance genes can be contained in genomes of multiple species. MetaCherchant allows reconstruction of mobile elements with resistance genes within the genomes of bacteria using metagenomic data. Application of MetaCherchant in differential mode produced specific graph structures suggesting the evidence of possible resistance gene transmission within a mobile element that occurred as a result of the antibiotic therapy. MetaCherchant is a promising tool giving researchers an opportunity to get an insight into dynamics of resistance transmission in vivo based on metagenomic data. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at https://github.com/ctlab/metacherchant. The code is written in Java and is platform-independent. CONTACT: ulyantsev@rain.ifmo.ru. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. | 2018 | 29092015 |
| 8259 | 14 | 0.9807 | Secondary Metabolite Transcriptomic Pipeline (SeMa-Trap), an expression-based exploration tool for increased secondary metabolite production in bacteria. For decades, natural products have been used as a primary resource in drug discovery pipelines to find new antibiotics, which are mainly produced as secondary metabolites by bacteria. The biosynthesis of these compounds is encoded in co-localized genes termed biosynthetic gene clusters (BGCs). However, BGCs are often not expressed under laboratory conditions. Several genetic manipulation strategies have been developed in order to activate or overexpress silent BGCs. Significant increases in production levels of secondary metabolites were indeed achieved by modifying the expression of genes encoding regulators and transporters, as well as genes involved in resistance or precursor biosynthesis. However, the abundance of genes encoding such functions within bacterial genomes requires prioritization of the most promising ones for genetic manipulation strategies. Here, we introduce the 'Secondary Metabolite Transcriptomic Pipeline' (SeMa-Trap), a user-friendly web-server, available at https://sema-trap.ziemertlab.com. SeMa-Trap facilitates RNA-Seq based transcriptome analyses, finds co-expression patterns between certain genes and BGCs of interest, and helps optimize the design of comparative transcriptomic analyses. Finally, SeMa-Trap provides interactive result pages for each BGC, allowing the easy exploration and comparison of expression patterns. In summary, SeMa-Trap allows a straightforward prioritization of genes that could be targeted via genetic engineering approaches to (over)express BGCs of interest. | 2022 | 35580059 |
| 3771 | 15 | 0.9806 | RFPlasmid: predicting plasmid sequences from short-read assembly data using machine learning. Antimicrobial-resistance (AMR) genes in bacteria are often carried on plasmids and these plasmids can transfer AMR genes between bacteria. For molecular epidemiology purposes and risk assessment, it is important to know whether the genes are located on highly transferable plasmids or in the more stable chromosomes. However, draft whole-genome sequences are fragmented, making it difficult to discriminate plasmid and chromosomal contigs. Current methods that predict plasmid sequences from draft genome sequences rely on single features, like k-mer composition, circularity of the DNA molecule, copy number or sequence identity to plasmid replication genes, all of which have their drawbacks, especially when faced with large single-copy plasmids, which often carry resistance genes. With our newly developed prediction tool RFPlasmid, we use a combination of multiple features, including k-mer composition and databases with plasmid and chromosomal marker proteins, to predict whether the likely source of a contig is plasmid or chromosomal. The tool RFPlasmid supports models for 17 different bacterial taxa, including Campylobacter, Escherichia coli and Salmonella, and has a taxon agnostic model for metagenomic assemblies or unsupported organisms. RFPlasmid is available both as a standalone tool and via a web interface. | 2021 | 34846288 |
| 5116 | 16 | 0.9806 | Prediction of Antimicrobial Resistance in Gram-Negative Bacteria From Whole-Genome Sequencing Data. BACKGROUND: Early detection of antimicrobial resistance in pathogens and prescription of more effective antibiotics is a fast-emerging need in clinical practice. High-throughput sequencing technology, such as whole genome sequencing (WGS), may have the capacity to rapidly guide the clinical decision-making process. The prediction of antimicrobial resistance in Gram-negative bacteria, often the cause of serious systemic infections, is more challenging as genotype-to-phenotype (drug resistance) relationship is more complex than for most Gram-positive organisms. METHODS AND FINDINGS: We have used NCBI BioSample database to train and cross-validate eight XGBoost-based machine learning models to predict drug resistance to cefepime, cefotaxime, ceftriaxone, ciprofloxacin, gentamicin, levofloxacin, meropenem, and tobramycin tested in Acinetobacter baumannii, Escherichia coli, Enterobacter cloacae, Klebsiella aerogenes, and Klebsiella pneumoniae. The input is the WGS data in terms of the coverage of known antibiotic resistance genes by shotgun sequencing reads. Models demonstrate high performance and robustness to class imbalanced datasets. CONCLUSION: Whole genome sequencing enables the prediction of antimicrobial resistance in Gram-negative bacteria. We present a tool that provides an in silico antibiogram for eight drugs. Predictions are accompanied with a reliability index that may further facilitate the decision making process. The demo version of the tool with pre-processed samples is available at https://vancampn.shinyapps.io/wgs2amr/. The stand-alone version of the predictor is available at https://github.com/pieterjanvc/wgs2amr/. | 2020 | 32528441 |
| 8263 | 17 | 0.9806 | CRISPR/Cas9: A Novel Weapon in the Arsenal to Combat Plant Diseases. Plant pathogens like virus, bacteria, and fungi incur a huge loss of global productivity. Targeting the dominant R gene resulted in the evolution of resistance in pathogens, which shifted plant pathologists' attention toward host susceptibility factors (or S genes). Herein, the application of sequence-specific nucleases (SSNs) for targeted genome editing are gaining more importance, which utilize the use of meganucleases (MN), zinc finger nucleases (ZFNs), transcription activator-like effector-based nucleases (TALEN) with the latest one namely clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9). The first generation of genome editing technologies, due to their cumbersome nature, is becoming obsolete. Owing to its simple and inexpensive nature the use of CRISPR/Cas9 system has revolutionized targeted genome editing technology. CRISPR/Cas9 system has been exploited for developing resistance against virus, bacteria, and fungi. For resistance to DNA viruses (mainly single-stranded DNA viruses), different parts of the viral genome have been targeted transiently and by the development of transgenic plants. For RNA viruses, mainly the host susceptibility factors and very recently the viral RNA genome itself have been targeted. Fungal and bacterial resistance has been achieved mainly by targeting the host susceptibility genes through the development of transgenics. In spite of these successes CRISPR/Cas9 system suffers from off-targeting. This and other problems associated with this system are being tackled by the continuous discovery/evolution of new variants. Finally, the regulatory standpoint regarding CRISPR/Cas9 will determine the fate of using this versatile tool in developing pathogen resistance in crop plants. | 2018 | 30697226 |
| 9070 | 18 | 0.9805 | Automated annotation of mobile antibiotic resistance in Gram-negative bacteria: the Multiple Antibiotic Resistance Annotator (MARA) and database. BACKGROUND: Multiresistance in Gram-negative bacteria is often due to acquisition of several different antibiotic resistance genes, each associated with a different mobile genetic element, that tend to cluster together in complex conglomerations. Accurate, consistent annotation of resistance genes, the boundaries and fragments of mobile elements, and signatures of insertion, such as DR, facilitates comparative analysis of complex multiresistance regions and plasmids to better understand their evolution and how resistance genes spread. OBJECTIVES: To extend the Repository of Antibiotic resistance Cassettes (RAC) web site, which includes a database of 'features', and the Attacca automatic DNA annotation system, to encompass additional resistance genes and all types of associated mobile elements. METHODS: Antibiotic resistance genes and mobile elements were added to RAC, from existing registries where possible. Attacca grammars were extended to accommodate the expanded database, to allow overlapping features to be annotated and to identify and annotate features such as composite transposons and DR. RESULTS: The Multiple Antibiotic Resistance Annotator (MARA) database includes antibiotic resistance genes and selected mobile elements from Gram-negative bacteria, distinguishing important variants. Sequences can be submitted to the MARA web site for annotation. A list of positions and orientations of annotated features, indicating those that are truncated, DR and potential composite transposons is provided for each sequence, as well as a diagram showing annotated features approximately to scale. CONCLUSIONS: The MARA web site (http://mara.spokade.com) provides a comprehensive database for mobile antibiotic resistance in Gram-negative bacteria and accurately annotates resistance genes and associated mobile elements in submitted sequences to facilitate comparative analysis. | 2018 | 29373760 |
| 9082 | 19 | 0.9805 | GeneMates: an R package for detecting horizontal gene co-transfer between bacteria using gene-gene associations controlled for population structure. BACKGROUND: Horizontal gene transfer contributes to bacterial evolution through mobilising genes across various taxonomical boundaries. It is frequently mediated by mobile genetic elements (MGEs), which may capture, maintain, and rearrange mobile genes and co-mobilise them between bacteria, causing horizontal gene co-transfer (HGcoT). This physical linkage between mobile genes poses a great threat to public health as it facilitates dissemination and co-selection of clinically important genes amongst bacteria. Although rapid accumulation of bacterial whole-genome sequencing data since the 2000s enables study of HGcoT at the population level, results based on genetic co-occurrence counts and simple association tests are usually confounded by bacterial population structure when sampled bacteria belong to the same species, leading to spurious conclusions. RESULTS: We have developed a network approach to explore WGS data for evidence of intraspecies HGcoT and have implemented it in R package GeneMates ( github.com/wanyuac/GeneMates ). The package takes as input an allelic presence-absence matrix of interested genes and a matrix of core-genome single-nucleotide polymorphisms, performs association tests with linear mixed models controlled for population structure, produces a network of significantly associated alleles, and identifies clusters within the network as plausible co-transferred alleles. GeneMates users may choose to score consistency of allelic physical distances measured in genome assemblies using a novel approach we have developed and overlay scores to the network for further evidence of HGcoT. Validation studies of GeneMates on known acquired antimicrobial resistance genes in Escherichia coli and Salmonella Typhimurium show advantages of our network approach over simple association analysis: (1) distinguishing between allelic co-occurrence driven by HGcoT and that driven by clonal reproduction, (2) evaluating effects of population structure on allelic co-occurrence, and (3) direct links between allele clusters in the network and MGEs when physical distances are incorporated. CONCLUSION: GeneMates offers an effective approach to detection of intraspecies HGcoT using WGS data. | 2020 | 32972363 |