| # | Rank | Similarity | Title + Abstract | Year | PMID |
|---|---|---|---|---|---|
| 5098 | 0 | 1.0000 | Feature selection and aggregation for antibiotic resistance GWAS in Mycobacterium tuberculosis: a comparative study. INTRODUCTION: Drug resistance (DR) of pathogens remains a global healthcare concern. In contrast to other bacteria, acquiring mutations in the core genome is the main mechanism of drug resistance for Mycobacterium tuberculosis (MTB). For some antibiotics, the resistance of a particular isolate can be reliably predicted by identifying specific mutations, while for other antibiotics the knowledge of resistance mechanisms is limited. Statistical machine learning (ML) methods are used to infer new genes implicated in drug resistance leveraging large collections of isolates with known whole-genome sequences and phenotypic states for different drugs. However, high correlations between the phenotypic states for commonly used drugs complicate the inference of true associations of mutations with drug phenotypes by ML approaches. METHODS: Recently, several new methods have been developed to select a small subset of reliable predictors of the dependent variable, which may help reduce the number of spurious associations identified. In this study, we evaluated several such methods, namely, logistic regression with different regularization penalty functions, a recently introduced algorithm for solving the best-subset selection problem (ABESS) and "Hungry, Hungry SNPos" (HHS) a heuristic algorithm specifically developed to identify resistance-associated genetic variants in the presence of resistance co-occurrence. We assessed their ability to select known causal mutations for resistance to a specific drug while avoiding the selection of mutations in genes associated with resistance to other drugs, thus we compared selected ML models for their applicability for MTB genome wide association studies. RESULTS AND DISCUSSION: In our analysis, ABESS significantly outperformed the other methods, selecting more relevant sets of mutations. 
Additionally, we demonstrated that aggregating rare mutations within protein-coding genes into markers indicative of changes in PFAM domains improved prediction quality, and these markers were predominantly selected by ABESS, suggesting their high informativeness. However, ABESS yielded lower prediction accuracy compared to logistic regression methods with regularization. | 2025 | 40606161 |
| 4394 | 1 | 0.9996 | Signatures of Selection at Drug Resistance Loci in Mycobacterium tuberculosis. Tuberculosis (TB) is the leading cause of death by an infectious disease, and global TB control efforts are increasingly threatened by drug resistance in Mycobacterium tuberculosis. Unlike most bacteria, where lateral gene transfer is an important mechanism of resistance acquisition, resistant M. tuberculosis arises solely by de novo chromosomal mutation. Using whole-genome sequencing data from two natural populations of M. tuberculosis, we characterized the population genetics of known drug resistance loci using measures of diversity, population differentiation, and convergent evolution. We found resistant subpopulations to be less diverse than susceptible subpopulations, consistent with ongoing transmission of resistant M. tuberculosis. A subset of resistance genes ("sloppy targets") were characterized by high diversity and multiple rare variants; we posit that a large genetic target for resistance and relaxation of purifying selection contribute to high diversity at these loci. For "tight targets" of selection, the path to resistance appeared narrower, evidenced by single favored mutations that arose numerous times in the phylogeny and segregated at markedly different frequencies in resistant and susceptible subpopulations. These results suggest that diverse genetic architectures underlie drug resistance in M. tuberculosis and that combined approaches are needed to identify causal mutations. Extrapolating from patterns observed for well-characterized genes, we identified novel candidate variants involved in resistance. The approach outlined here can be extended to identify resistance variants for new drugs, to investigate the genetic architecture of resistance, and when phenotypic data are available, to find candidate genetic loci underlying other positively selected traits in clonal bacteria. 
IMPORTANCE Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), is a significant burden on global health. Antibiotic treatment imposes strong selective pressure on M. tuberculosis populations. Identifying the mutations that cause drug resistance in M. tuberculosis is important for guiding TB treatment and halting the spread of drug resistance. Whole-genome sequencing (WGS) of M. tuberculosis isolates can be used to identify novel mutations mediating drug resistance and to predict resistance patterns faster than traditional methods of drug susceptibility testing. We have used WGS from natural populations of drug-resistant M. tuberculosis to characterize effects of selection for advantageous mutations on patterns of diversity at genes involved in drug resistance. The methods developed here can be used to identify novel advantageous mutations, including new resistance loci, in M. tuberculosis and other clonal pathogens. | 2018 | 29404424 |
| 5112 | 2 | 0.9995 | Genome-Based Prediction of Bacterial Antibiotic Resistance. Clinical microbiology has long relied on growing bacteria in culture to determine antimicrobial susceptibility profiles, but the use of whole-genome sequencing for antibiotic susceptibility testing (WGS-AST) is now a powerful alternative. This review discusses the technologies that made this possible and presents results from recent studies to predict resistance based on genome sequences. We examine differences between calling antibiotic resistance profiles by the simple presence or absence of previously known genes and single-nucleotide polymorphisms (SNPs) against approaches that deploy machine learning and statistical models. Often, the limitations to genome-based prediction arise from limitations of accuracy of culture-based AST in addition to an incomplete knowledge of the genetic basis of resistance. However, we need to maintain phenotypic testing even as genome-based prediction becomes more widespread to ensure that the results do not diverge over time. We argue that standardization of WGS-AST by challenge with consistently phenotyped strain sets of defined genetic diversity is necessary to compare the efficacy of methods of prediction of antibiotic resistance based on genome sequences. | 2019 | 30381421 |
| 9610 | 3 | 0.9995 | The evolutionary rate of antibacterial drug targets. BACKGROUND: One of the major issues in the fight against infectious diseases is the notable increase in multiple drug resistance in pathogenic species. For that reason, newly acquired high-throughput data on virulent microbial agents attract the attention of many researchers seeking potential new drug targets. Many approaches have been used to evaluate proteins from infectious pathogens, including, but not limited to, similarity analysis, reverse docking, statistical 3D structure analysis, machine learning, topological properties of interaction networks or a combination of the aforementioned methods. From a biological perspective, most essential proteins (knockout lethal for bacteria) or highly conserved proteins (broad spectrum activity) are potential drug targets. Ribosomal proteins comprise such an example. Many of them are well-known drug targets in bacteria. It is intuitive that we should learn from nature how to design good drugs. Firstly, known antibiotics mainly originate from natural products of microorganisms targeting other microorganisms. Secondly, paleontological data suggest that antibiotics have been used by microorganisms for millions of years. Thus, we have hypothesized that good drug targets are evolutionarily constrained and are subject to evolutionary selection. This means that mutations in such proteins are deleterious and removed by selection, which makes them less susceptible to random development of resistance. Analysis of the speed of evolution seems to be a good approach to test this hypothesis. RESULTS: In this study we show that the pN/pS ratio of genes coding for known drug targets is significantly lower than the genome average and also lower than that for essential genes identified by experimental methods. Similar results are observed in the case of dN/dS analysis. 
Both analyses suggest that drug targets tend to evolve slowly and that the rate of evolution is a better predictor of druggability than essentiality. CONCLUSIONS: Evolutionary rate can be used to score and find potential drug targets. The results presented here may become a useful addition to a repertoire of drug target prediction methods. As a proof of concept, we analyzed GO enrichment among the slowest evolving genes. These may become the starting point in the search for antibiotics with a novel mechanism. | 2013 | 23374913 |
| 5100 | 4 | 0.9995 | DeepPBI-KG: a deep learning method for the prediction of phage-bacteria interactions based on key genes. Phages, the natural predators of bacteria, were discovered more than 100 years ago. However, increasing antimicrobial resistance rates have revitalized phage research. Methods that are less time-consuming and more efficient than wet-laboratory experiments are needed to help screen phages quickly for therapeutic use. Traditional computational methods usually ignore the fact that phage-bacteria interactions are achieved by key genes and proteins. Methods for intraspecific prediction are rare since almost all existing methods consider only interactions at the species and genus levels. Moreover, most strains in existing databases contain only partial genome information because whole-genome information for species is difficult to obtain. Here, we propose a new approach for interaction prediction by constructing new features from key genes and proteins via the application of K-means sampling to select high-quality negative samples for prediction. Finally, we develop DeepPBI-KG, a corresponding prediction tool based on feature selection and a deep neural network. The results show that the average area under the curve for prediction reached 0.93 for each strain, and the overall AUC and area under the precision-recall curve reached 0.89 and 0.92, respectively, on the independent test set; these values are greater than those of other existing prediction tools. The forward and reverse validation results indicate that key genes and key proteins regulate and influence the interaction, which supports the reliability of the model. In addition, intraspecific prediction experiments based on Klebsiella pneumoniae data demonstrate the potential applicability of DeepPBI-KG for intraspecific prediction. 
In summary, the feature engineering and interaction prediction approaches proposed in this study can effectively improve the robustness and stability of interaction prediction, can achieve high generalizability, and may provide new directions and insights for rapid phage screening for therapy. | 2024 | 39344712 |
| 9553 | 5 | 0.9995 | A machine learning framework to predict antibiotic resistance traits and yet unknown genes underlying resistance to specific antibiotics in bacterial strains. Recently, the frequency of observing bacterial strains without known genetic components underlying phenotypic resistance to antibiotics has increased. There are several strains of bacteria lacking known resistance genes; however, they demonstrate resistance phenotype to drugs of that family. Although such strains are fewer compared to the overall population, they pose grave emerging threats to an already heavily challenged area of antimicrobial resistance (AMR), where death tolls have reached ~700 000 per year and a grim projection of ~10 million deaths per year by 2050 looms. Considering the fact that development of novel antibiotics is not keeping pace with the emergence and dissemination of resistance, there is a pressing need to decipher yet unknown genetic mechanisms of resistance, which will enable developing strategies for the best use of available interventions and show the way for the development of new drugs. In this study, we present a machine learning framework to predict novel AMR factors that are potentially responsible for resistance to specific antimicrobial drugs. The machine learning framework utilizes whole-genome sequencing AMR genetic data and antimicrobial susceptibility testing phenotypic data to predict resistance phenotypes and rank AMR genes by their importance in discriminating the resistance from the susceptible phenotypes. In summary, we present here a bioinformatics framework for training machine learning models, evaluating their performances, selecting the best performing model(s) and finally predicting the most important AMR loci for the resistance involved. | 2021 | 34015806 |
| 8377 | 6 | 0.9994 | Genome-Wide Association Analyses in the Model Rhizobium Ensifer meliloti. Genome-wide association studies (GWAS) can identify genetic variants responsible for naturally occurring and quantitative phenotypic variation. Association studies therefore provide a powerful complement to approaches that rely on de novo mutations for characterizing gene function. Although bacteria should be amenable to GWAS, few GWAS have been conducted on bacteria, and the extent to which nonindependence among genomic variants (e.g., linkage disequilibrium [LD]) and the genetic architecture of phenotypic traits will affect GWAS performance is unclear. We apply association analyses to identify candidate genes underlying variation in 20 biochemical, growth, and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 11 traits, we find genotype-phenotype associations that are stronger than expected by chance, with the candidates in relatively small linkage groups, indicating that LD does not preclude resolving association candidates to relatively small genomic regions. The significant candidates show an enrichment for nucleotide polymorphisms (SNPs) over gene presence-absence variation (PAV), and for five traits, candidates are enriched in large linkage groups, a possible signature of epistasis. Many of the variants most strongly associated with symbiosis phenotypes were in genes previously identified as being involved in nitrogen fixation or nodulation. For other traits, apparently strong associations were not stronger than the range of associations detected in permuted data. In sum, our data show that GWAS in bacteria may be a powerful tool for characterizing genetic architecture and identifying genes responsible for phenotypic variation. However, careful evaluation of candidates is necessary to avoid false signals of association. IMPORTANCE Genome-wide association analyses are a powerful approach for identifying gene function. 
These analyses are becoming commonplace in studies of humans, domesticated animals, and crop plants but have rarely been conducted in bacteria. We applied association analyses to 20 traits measured in Ensifer meliloti, an agriculturally and ecologically important bacterium because it fixes nitrogen when in symbiosis with leguminous plants. We identified candidate alleles and gene presence-absence variants underlying variation in symbiosis traits, antibiotic resistance, and use of various carbon sources; some of these candidates are in genes previously known to affect these traits whereas others were in genes that have not been well characterized. Our results point to the potential power of association analyses in bacteria, but also to the need to carefully evaluate the potential for false associations. | 2018 | 30355664 |
| 3829 | 7 | 0.9994 | Associations among Antibiotic and Phage Resistance Phenotypes in Natural and Clinical Escherichia coli Isolates. The spread of antibiotic resistance is driving interest in new approaches to control bacterial pathogens. This includes applying multiple antibiotics strategically, using bacteriophages against antibiotic-resistant bacteria, and combining both types of antibacterial agents. All these approaches rely on or are impacted by associations among resistance phenotypes (where bacteria resistant to one antibacterial agent are also relatively susceptible or resistant to others). Experiments with laboratory strains have shown strong associations between some resistance phenotypes, but we lack a quantitative understanding of associations among antibiotic and phage resistance phenotypes in natural and clinical populations. To address this, we measured resistance to various antibiotics and bacteriophages for 94 natural and clinical Escherichia coli isolates. We found several positive associations between resistance phenotypes across isolates. Associations were on average stronger for antibacterial agents of the same type (antibiotic-antibiotic or phage-phage) than different types (antibiotic-phage). Plasmid profiles and genetic knockouts suggested that such associations can result from both colocalization of resistance genes and pleiotropic effects of individual resistance mechanisms, including one case of antibiotic-phage cross-resistance. Antibiotic resistance was predicted by core genome phylogeny and plasmid profile, but phage resistance was predicted only by core genome phylogeny. Finally, we used observed associations to predict genes involved in a previously uncharacterized phage resistance mechanism, which we verified using experimental evolution. 
Our data suggest that susceptibility to phages and antibiotics are evolving largely independently, and unlike in experiments with lab strains, negative associations between antibiotic resistance phenotypes in nature are rare. This is relevant for treatment scenarios where bacteria encounter multiple antibacterial agents. IMPORTANCE Rising antibiotic resistance is making it harder to treat bacterial infections. Whether resistance to a given antibiotic spreads or declines is influenced by whether it is associated with altered susceptibility to other antibiotics or other stressors that bacteria encounter in nature, such as bacteriophages (viruses that infect bacteria). We used natural and clinical isolates of Escherichia coli, an abundant species and key pathogen, to characterize associations among resistance phenotypes to various antibiotics and bacteriophages. We found associations between some resistance phenotypes, and in contrast to past work with laboratory strains, they were exclusively positive. Analysis of bacterial genome sequences and horizontally transferred genetic elements (plasmids) helped to explain this, as well as our finding that there was no overall association between antibiotic resistance and bacteriophage resistance profiles across isolates. This improves our understanding of resistance evolution in nature, potentially informing new rational therapies that combine different antibacterials, including bacteriophages. | 2017 | 29089428 |
| 5099 | 8 | 0.9994 | A machine learning-based strategy to elucidate the identification of antibiotic resistance in bacteria. Microorganisms, crucial for environmental equilibrium, could be destructive, resulting in detrimental pathophysiology to the human host. Moreover, with the emergence of antibiotic resistance (ABR), the microbial communities pose the century's largest public health challenges in terms of effective treatment strategies. Furthermore, given the large diversity and number of known bacterial strains, describing treatment choices for infected patients using experimental methodologies is time-consuming. An alternative technique, gaining popularity as sequencing prices fall and technology advances, is to use bacterial genotype rather than phenotype to determine ABR. Complementing machine learning into clinical practice provides a data-driven platform for categorization and interpretation of bacterial datasets. In the present study, k-mers were generated from nucleotide sequences of pathogenic bacteria resistant to antibiotics. Subsequently, they were clustered into groups of bacteria sharing similar genomic features using the Affinity propagation algorithm with a Silhouette coefficient of 0.82. Thereafter, a prediction model based on Random Forest algorithm was developed to explore the prediction capability of the k-mers. It yielded an overall specificity of 0.99 and a sensitivity of 0.98. Additionally, the genes and ABR drivers related to the k-mers were identified to explore their biological relevance. Furthermore, a multilayer perceptron model with a hamming loss of 0.05 was built to classify the bacterial strains into resistant and non-resistant strains against various antibiotics. Segregating pathogenic bacteria based on genomic similarities could be a valuable approach for assessing the severity of diseases caused by new bacterial strains. 
Utilization of this strategy could aid in enhancing our understanding of ABR patterns, paving the way for more informed and effective treatment options. | 2024 | 39816256 |
| 8932 | 9 | 0.9994 | Alternative Evolutionary Paths to Bacterial Antibiotic Resistance Cause Distinct Collateral Effects. When bacteria evolve resistance against a particular antibiotic, they may simultaneously gain increased sensitivity against a second one. Such collateral sensitivity may be exploited to develop novel, sustainable antibiotic treatment strategies aimed at containing the current, dramatic spread of drug resistance. To date, the presence and molecular basis of collateral sensitivity has only been studied in few bacterial species and is unknown for opportunistic human pathogens such as Pseudomonas aeruginosa. In the present study, we assessed patterns of collateral effects by experimentally evolving 160 independent populations of P. aeruginosa to high levels of resistance against eight commonly used antibiotics. The bacteria evolved resistance rapidly and expressed both collateral sensitivity and cross-resistance. The pattern of such collateral effects differed to those previously reported for other bacterial species, suggesting interspecific differences in the underlying evolutionary trade-offs. Intriguingly, we also identified contrasting patterns of collateral sensitivity and cross-resistance among the replicate populations adapted to the same drug. Whole-genome sequencing of 81 independently evolved populations revealed distinct evolutionary paths of resistance to the selective drug, which determined whether bacteria became cross-resistant or collaterally sensitive towards others. Based on genomic and functional genetic analysis, we demonstrate that collateral sensitivity can result from resistance mutations in regulatory genes such as nalC or mexZ, which mediate aminoglycoside sensitivity in β-lactam-adapted populations, or the two-component regulatory system gene pmrB, which enhances penicillin sensitivity in gentamicin-resistant populations. 
Our findings highlight substantial variation in the evolved collateral effects among replicates, which in turn determine their potential in antibiotic therapy. | 2017 | 28541480 |
| 9662 | 10 | 0.9994 | Species-Scale Genomic Analysis of Staphylococcus aureus Genes Influencing Phage Host Range and Their Relationships to Virulence and Antibiotic Resistance Genes. Phage therapy has been proposed as a possible alternative treatment for infections caused by the ubiquitous bacterial pathogen Staphylococcus aureus. However, successful therapy requires understanding the genetic basis of host range-the subset of strains in a species that could be killed by a particular phage. We searched diverse sets of S. aureus public genome sequences against a database of genes suggested from prior studies to influence host range to look for patterns of variation across the species. We found that genes encoding biosynthesis of molecules that were targets of S. aureus phage adsorption to the outer surface of the cell were the most conserved in the pangenome. Putative phage resistance genes that were core components of the pangenome genes had similar nucleotide diversity, ratio of nonsynonymous to synonymous substitutions, and functionality (measured by delta-bitscore) to other core genes. However, phage resistance genes that were not part of the core genome were significantly less consistent with the core genome phylogeny than all noncore genes in this set, suggesting more frequent movement between strains by horizontal gene transfer. Only superinfection immunity genes encoded by temperate phages inserted in the genome correlated with experimentally determined temperate phage resistance. Taken together, these results suggested that, while phage adsorption genes are heavily conserved in the S. aureus species, HGT may play a significant role in strain-specific evolution of host range patterns. IMPORTANCE Staphylococcus aureus is a widespread, hospital- and community-acquired pathogen that is commonly antibiotic resistant. It causes diverse diseases affecting both the skin and internal organs. 
Its ubiquity, antibiotic resistance, and disease burden make new therapies urgent, such as phage therapy, in which viruses specific to infecting bacteria clear infection. S. aureus phage host range not only determines whether phage therapy will be successful by killing bacteria but also horizontal gene transfer through transduction of host genetic material by phages. In this work, we comprehensively reviewed existing literature to build a list of S. aureus phage resistance genes and searched our database of almost 43,000 S. aureus genomes for these genes to understand their patterns of evolution, finding that prophages' superinfection immunity correlates best with phage resistance and HGT. These findings improved our understanding of the relationship between known phage resistance genes and phage host range in the species. | 2022 | 35040700 |
| 9657 | 11 | 0.9994 | Machine Learning Leveraging Genomes from Metagenomes Identifies Influential Antibiotic Resistance Genes in the Infant Gut Microbiome. Antibiotic resistance in pathogens is extensively studied, and yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leveraged genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We found that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants than C. difficile strains lacking this gene. Organisms with genes for major facilitator superfamily drug efflux pumps have higher replication rates under all conditions, even in the absence of antibiotic therapy. Using a machine learning approach, we identified genes that are predictive of an organism's direction of change in relative abundance after administration of vancomycin and cephalosporin antibiotics. The most accurate results were obtained by reducing annotated genomic data to five principal components classified by boosted decision trees. Among the genes involved in predicting whether an organism increased in relative abundance after treatment are those that encode subclass B2 beta-lactamases and transcriptional regulators of vancomycin resistance. This demonstrates that machine learning applied to genome-resolved metagenomics data can identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration. 
IMPORTANCE The process of reconstructing genomes from environmental sequence data (genome-resolved metagenomics) allows unique insight into microbial systems. We apply this technique to investigate how the antibiotic resistance genes of bacteria affect their ability to flourish in the gut under various conditions. Our analysis reveals that strain-level selection in formula-fed infants drives enrichment of beta-lactamase genes in the gut resistome. Using genomes from metagenomes, we built a machine learning model to predict how organisms in the gut microbial community respond to perturbation by antibiotics. This may eventually have clinical applications. | 2018 | 29359195 |
| 9603 | 12 | 0.9994 | Resistance signatures manifested in early drug response across cancer types and species. Aim: Growing evidence points to non-genetic mechanisms underlying long-term resistance to cancer therapies. These mechanisms involve pre-existing or therapy-induced transcriptional cell states that confer resistance. However, the relationship between early transcriptional responses to treatment and the eventual emergence of resistant states remains poorly understood. Furthermore, it is unclear whether such early resistance-associated transcriptional responses are evolutionarily conserved. In this study, we examine the similarity between early transcriptional responses and long-term resistant states, assess their clinical relevance, and explore their evolutionary conservation across species. Methods: We integrated datasets on early drug responses and long-term resistance from multiple cancer cell lines, bacteria, and yeast to identify early transcriptional changes predictive of long-term resistance and assess their evolutionary conservation. Using genome-wide CRISPR-Cas9 knockout screens, we evaluated the impact of genes associated with resistant transcriptional states on drug sensitivity. Clinical datasets were analyzed to explore the prognostic value of the identified resistance-associated gene signatures. Results: We found that transcriptional states observed in drug-naive cells and shortly after treatment overlapped with those seen in fully resistant populations. Some of these shared features appear to be evolutionarily conserved. Knockout of genes marking resistant states sensitized ovarian cancer cells to Prexasertib. Moreover, early resistance gene signatures effectively distinguished therapy responders from non-responders in multiple clinical cancer trials and differentiated premalignant breast lesions that progressed to malignancy from those that remained benign. 
Conclusion: Early cellular transcriptional responses to therapy exhibit key similarities to fully resistant states across different drugs, cancer types, and species. Gene signatures defining these early resistance states have prognostic value in clinical settings. | 2025 | 41019980 |
| 5101 | 13 | 0.9994 | Identification of Key Features Pivotal to the Characteristics and Functions of Gut Bacteria Taxa through Machine Learning Methods. BACKGROUND: Gut bacteria critically influence digestion, facilitate the breakdown of complex food substances, aid in essential nutrient synthesis, and contribute to immune system balance. However, current knowledge regarding intestinal bacteria remains insufficient. OBJECTIVE: This study aims to discover essential differences for different intestinal bacteria. METHODS: This study was conducted by investigating a total of 1478 gut bacterial samples comprising 235 Actinobacteria, 447 Bacteroidetes, and 796 Firmicutes, by utilizing sophisticated machine learning algorithms. By building on the dataset provided by Chen et al., we engaged sophisticated machine learning techniques to further investigate and analyze the gut bacterial samples. Each sample in the dataset was described by 993 unique features associated with gut bacteria, including 342 features annotated by the Antibiotic Resistance Genes Database, Comprehensive Antibiotic Research Database, Kyoto Encyclopedia of Genes and Genomes, and Virulence Factors of Pathogenic Bacteria. We employed incremental feature selection methods within a computational framework to identify the optimal features for classification. RESULTS: Eleven feature ranking algorithms selected several key features as pivotal to the characteristics and functions of gut bacteria. These features appear to facilitate the identification of specific gut bacterial species. Additionally, we established quantitative rules for identifying Actinobacteria, Bacteroidetes, and Firmicutes. CONCLUSION: This research underscores the significant potential of machine learning in studying gut microbes and enhances our understanding of the multifaceted roles of gut bacteria. | 2025 | 40671232 |
| 4390 | 14 | 0.9994 | Integrated Co-functional Network Analysis on the Resistance and Virulence Features in Acinetobacter baumannii. Acinetobacter baumannii is one of the most troublesome bacterial pathogens and poses a major public health threat due to its rapidly increasing drug resistance. It is not only found in clinical settings but also emerges from aquaculture as a fish pathogen, which could pass resistance genes along the food chain. Understanding the mechanisms of antibiotic resistance development and pathogenesis will aid our battle against the infections caused by A. baumannii. In this study, we constructed a co-functional network by integrating multiple sources of data from A. baumannii and then used k-shell decomposition to analyze the co-functional network. We found that genes involved in basic cellular physiological functions, including genes for antibiotic resistance, tended to have high k-shell values and to be located in the internal layer of our network. In contrast, non-essential genes, such as genes associated with virulence, tended to have lower k-shell values and to be located in the external layer. This finding allows us to fish out potential antibiotic resistance factors and virulence factors. In addition, we constructed an online platform, ABviresDB (https://acba.shinyapps.io/ABviresDB/), for visualization of the network and the features of each gene in A. baumannii. The network analysis in this study will not only aid the study of A. baumannii but could also serve as a reference for research on antibiotic resistance and pathogenesis in other bacteria. | 2020 | 33224132 |
| 5102 | 15 | 0.9994 | Pipeline for Antimicrobial Resistance Gene Quantification from Host Tissue. Antibiotics are frequently used in food production animals to control disease and improve productivity, but this promotes the development of antimicrobial resistance (AMR) and the subsequent broader spread of AMR bacteria throughout the food chain, endangering the well-being and health of both animals and humans. In humans, the gut microbiome harbors a diverse range of AMR bacteria, known as the resistome. Effectively mitigating AMR in food animals requires first determining the expression and abundance of AMR-related genes in the gut resistome. Currently, such knowledge in regard to food animals is largely lacking. Gut tissue RNA sequencing (GTRS) can capture metabolically active transcripts from both the host and the microbes attached to the gut epithelium. Ideally, AMR genes can be quantified using GTRS data, making it possible to study the relationship between host and microbe. For the majority of these GTRS studies, only host transcriptome changes have been reported, while the microbial AMR remains largely unexamined, mainly due to the lack of easily implementable bioinformatics tools. Here we present a straightforward workflow to accomplish that using common command-line bioinformatics tools. With this pipeline, the host is considered noise, and host data are filtered out from the microbial reads. Transcript quantification of the AMR genes is then performed. The pipeline then continues through AMR transcript quantification, differential gene expression, and SNP analysis. Using open-source tools, we made this analytical pipeline easy to implement and able to generate results ready to be incorporated into publishable reports. Published 2025. This article is a U.S. Government work and is in the public domain in the USA. Basic Protocol: Running the gene quantification pipeline; Support Protocol 1: Downloading FASTQ files from the NCBI database; Support Protocol 2: Building a genome reference index of the host; Support Protocol 3: Differential gene expression analysis; Support Protocol 4: Single-nucleotide polymorphism (SNP) analysis. | 2025 | 40145236 |
| 4264 | 16 | 0.9994 | Mutational Evolution of Pseudomonas aeruginosa Resistance to Ribosome-Targeting Antibiotics. The present work examines the evolutionary trajectories of replicate Pseudomonas aeruginosa cultures in the presence of the ribosome-targeting antibiotics tobramycin and tigecycline. It is known that a large number of mutations across different genes - and therefore a large number of potential pathways - may be involved in resistance to any single antibiotic. Thus, evolution toward resistance might, to a large degree, rely on stochasticity, which might preclude the use of predictive strategies for fighting antibiotic resistance. However, the present results show that P. aeruginosa populations evolving in parallel in the presence of antibiotics (either tobramycin or tigecycline) follow a set of trajectories that present common elements. In addition, the pattern of resistance mutations involved includes common elements for these two ribosome-targeting antimicrobials. This indicates that mutational evolution toward resistance (and perhaps other properties) is to a certain degree deterministic and, consequently, predictable. These findings are of interest not just for P. aeruginosa, but also for understanding the general rules involved in the evolution of antibiotic resistance. In addition, the results indicate that bacteria can evolve toward higher levels of resistance to antibiotics against which they are considered to be intrinsically resistant, such as tigecycline in the case of P. aeruginosa, and that this may confer cross-resistance to other antibiotics of therapeutic value. Our results are particularly relevant in the case of patients under empiric treatment with tigecycline, who frequently suffer P. aeruginosa superinfections. | 2018 | 30405685 |
| 4624 | 17 | 0.9994 | Deciphering the distance to antibiotic resistance for the pneumococcus using genome sequencing data. Advances in genome sequencing technologies and genome-wide association studies (GWAS) have provided unprecedented insights into the molecular basis of microbial phenotypes and enabled the identification of the underlying genetic variants in real populations. However, utilization of genome sequencing in clinical phenotyping of bacteria is challenging due to the lack of reliable and accurate approaches. Here, we report a method for predicting microbial resistance patterns using genome sequencing data. We analyzed whole genome sequences of 1,680 Streptococcus pneumoniae isolates from four independent populations using GWAS and identified probable hotspots of genetic variation which correlate with phenotypes of resistance to essential classes of antibiotics. With the premise that accumulation of putative resistance-conferring SNPs, potentially in combination with specific resistance genes, precedes full resistance, we retrogressively surveyed the hotspot loci and quantified the number of SNPs and/or genes which, if accumulated, would confer full resistance to an otherwise susceptible strain. We name this approach the 'distance to resistance'. It can be used to identify the creep towards complete antibiotic resistance in bacteria using genome sequencing. This approach serves as a basis for the development of future sequencing-based methods for predicting resistance profiles of bacterial strains in hospital microbiology and public health settings. | 2017 | 28205635 |
| 5118 | 18 | 0.9994 | Automated extraction of genes associated with antibiotic resistance from the biomedical literature. The detection of bacterial antibiotic resistance phenotypes is important when carrying out clinical decisions for patient treatment. Conventional phenotypic testing involves culturing bacteria which requires a significant amount of time and work. Whole-genome sequencing is emerging as a fast alternative to resistance prediction, by considering the presence/absence of certain genes. A lot of research has focused on determining which bacterial genes cause antibiotic resistance and efforts are being made to consolidate these facts in knowledge bases (KBs). KBs are usually manually curated by domain experts to be of the highest quality. However, this limits the pace at which new facts are added. Automated relation extraction of gene-antibiotic resistance relations from the biomedical literature is one solution that can simplify the curation process. This paper reports on the development of a text mining pipeline that takes in English biomedical abstracts and outputs genes that are predicted to cause resistance to antibiotics. To test the generalisability of this pipeline it was then applied to predict genes associated with Helicobacter pylori antibiotic resistance, that are not present in common antibiotic resistance KBs or publications studying H. pylori. These genes would be candidates for further lab-based antibiotic research and inclusion in these KBs. For relation extraction, state-of-the-art deep learning models were used. These models were trained on a newly developed silver corpus which was generated by distant supervision of abstracts using the facts obtained from KBs. The top performing model was superior to a co-occurrence model, achieving a recall of 95%, a precision of 60% and F1-score of 74% on a manually annotated holdout dataset. To our knowledge, this project was the first attempt at developing a complete text mining pipeline that incorporates deep learning models to extract gene-antibiotic resistance relations from the literature. Additional related data can be found at https://github.com/AndreBrincat/Gene-Antibiotic-Resistance-Relation-Extraction. | 2022 | 35134132 |
| 4342 | 19 | 0.9994 | Evolution and diversity of clonal bacteria: the paradigm of Mycobacterium tuberculosis. BACKGROUND: Mycobacterium tuberculosis complex species display relatively static genomes and 99.9% nucleotide sequence identity. Studying the evolutionary history of such monomorphic bacteria is a difficult and challenging task. PRINCIPAL FINDINGS: We found that single-nucleotide polymorphism (SNP) analysis of DNA repair, recombination and replication (3R) genes in a comprehensive selection of M. tuberculosis complex strains from across the world, yielded surprisingly high levels of polymorphisms as compared to house-keeping genes, making it possible to distinguish between 80% of clinical isolates analyzed in this study. Bioinformatics analysis suggests that a large number of these polymorphisms are potentially deleterious. Site frequency spectrum comparison of synonymous and non-synonymous variants and Ka/Ks ratio analysis suggest a general negative/purifying selection acting on these sets of genes that may lead to suboptimal 3R system activity. In turn, the relaxed fidelity of 3R genes may allow the occurrence of adaptive variants, some of which will survive. Furthermore, 3R-based phylogenetic trees are a new tool for distinguishing between M. tuberculosis complex strains. CONCLUSIONS/SIGNIFICANCE: This situation, and the consequent lack of fidelity in genome maintenance, may serve as a starting point for the evolution of antibiotic resistance, fitness for survival and pathogenicity, possibly conferring a selective advantage in certain stressful situations. These findings suggest that 3R genes may play an important role in the evolution of highly clonal bacteria, such as M. tuberculosis. They also facilitate further epidemiological studies of these bacteria, through the development of high-resolution tools. With many more microbial genomes being sequenced, our results open the door to 3R gene-based studies of adaptation and evolution of other, highly clonal bacteria. | 2008 | 18253486 |