ResiDB: An automated database manager for sequence data. - Related Documents




# | Rank | Similarity | Title + Abs. | Year | PMID
9076 | 0 | 1.0000 | ResiDB: An automated database manager for sequence data. The amount of publicly available DNA sequence data is drastically increasing, making it a tedious task to create sequence databases necessary for the design of diagnostic assays. The selection of appropriate sequences is especially challenging in genes affected by frequent point mutations such as antibiotic resistance genes. To overcome this issue, we have designed the webtool resiDB, a rapid and user-friendly sequence database manager for bacteria, fungi, viruses, protozoa, invertebrates, plants, archaea, environmental and whole genome shotgun sequence data. It automatically identifies and curates sequence clusters to create custom sequence databases based on user-defined input sequences. A collection of helpful visualization tools gives the user the opportunity to easily access, evaluate, edit, and download the newly created database. Consequently, researchers no longer have to manually manage sequence data retrieval, deal with hardware limitations, and run multiple independent software tools, each having its own requirements, input and output formats. Our tool was developed within the H2020 project FAPIC aiming to develop a single diagnostic assay targeting all sepsis-relevant pathogens and antibiotic resistance mechanisms. ResiDB is freely accessible to all users through https://residb.ait.ac.at/. | 2021 | 33495705
9077 | 1 | 0.9991 | The PLSDB 2025 update: enhanced annotations and improved functionality for comprehensive plasmid research. Plasmids are extrachromosomal DNA molecules in bacteria and archaea, playing critical roles in horizontal gene transfer, antibiotic resistance, and pathogenicity. Since its first release in 2018, our database on plasmids, PLSDB, has significantly grown and enhanced its content and scope. From 34 513 records contained in the 2021 version, PLSDB now hosts 72 360 entries. Designed to provide life scientists with convenient access to extensive plasmid data and to support computer scientists by offering curated datasets for artificial intelligence (AI) development, this latest update brings more comprehensive and accurate information for plasmid research, with interactive visualization options. We enriched PLSDB by refining the identification and classification of plasmid host ecosystems and host diseases. Additionally, we incorporated annotations for new functional structures, including protein-coding genes and biosynthetic gene clusters. Further, we enhanced existing annotations, such as antimicrobial resistance genes and mobility typing. To accommodate these improvements and to host the increased plasmid sets, the webserver architecture and underlying data structures of PLSDB have been reconstructed, resulting in decreased response times and enhanced visualization of features while ensuring that users have access to a more efficient and user-friendly interface. The latest release of PLSDB is freely accessible at https://www.ccb.uni-saarland.de/plsdb2025. | 2025 | 39565221
9078 | 2 | 0.9991 | MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. MOTIVATION: Antibiotic resistance is an important global public health problem. Human gut microbiota is an accumulator of resistance genes potentially providing them to pathogens. It is important to develop tools for identifying the mechanisms of how resistance is transmitted between gut microbial species and pathogens. RESULTS: We developed MetaCherchant, an algorithm for extracting the genomic environment of antibiotic resistance genes from metagenomic data in the form of a graph. The algorithm was validated on a number of simulated and published datasets, as well as applied to new 'shotgun' metagenomes of gut microbiota from patients with Helicobacter pylori who underwent antibiotic therapy. Genomic context was reconstructed for several major resistance genes. Taxonomic annotation of the context suggests that within a single metagenome, the resistance genes can be contained in genomes of multiple species. MetaCherchant allows reconstruction of mobile elements with resistance genes within the genomes of bacteria using metagenomic data. Application of MetaCherchant in differential mode produced specific graph structures suggesting the evidence of possible resistance gene transmission within a mobile element that occurred as a result of the antibiotic therapy. MetaCherchant is a promising tool giving researchers an opportunity to get an insight into dynamics of resistance transmission in vivo based on metagenomic data. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at https://github.com/ctlab/metacherchant. The code is written in Java and is platform-independent. CONTACT: ulyantsev@rain.ifmo.ru. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. | 2018 | 29092015
9744 | 3 | 0.9991 | PARGT: a software tool for predicting antimicrobial resistance in bacteria. With the ever-increasing availability of whole-genome sequences, machine-learning approaches can be used as an alternative to traditional alignment-based methods for identifying new antimicrobial-resistance genes. Such approaches are especially helpful when pathogens cannot be cultured in the lab. In previous work, we proposed a game-theory-based feature evaluation algorithm. When using the protein characteristics identified by this algorithm, called 'features' in machine learning, our model accurately identified antimicrobial resistance (AMR) genes in Gram-negative bacteria. Here we extend our study to Gram-positive bacteria showing that coupling game-theory-identified features with machine learning achieved classification accuracies between 87% and 90% for genes encoding resistance to the antibiotics bacitracin and vancomycin. Importantly, we present a standalone software tool that implements the game-theory algorithm and machine-learning model used in these studies. | 2020 | 32620856
9074 | 4 | 0.9990 | BacAnt: A Combination Annotation Server for Bacterial DNA Sequences to Identify Antibiotic Resistance Genes, Integrons, and Transposable Elements. Whole genome sequencing (WGS) of bacteria has become a routine method in diagnostic laboratories. One of the clinically most useful advantages of WGS is the ability to predict antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) in bacterial sequences. This allows comprehensive investigations of such genetic features but can also be used for epidemiological studies. A plethora of software programs have been developed for the detailed annotation of bacterial DNA sequences, such as rapid annotation using subsystem technology (RAST), Resfinder, ISfinder, INTEGRALL and The Transposon Registry. Unfortunately, to this day, a reliable annotation tool of the combination of ARGs and MGEs is not available, and the generation of genbank files requires much manual input. Here, we present a new webserver which allows the annotation of ARGs, integrons and transposable elements at the same time. The pipeline generates genbank files automatically, which are compatible with Easyfig for comparative genomic analysis. Our BacAnt code and standalone software package are available at https://github.com/xthua/bacant with an accompanying web application at http://bacant.net. | 2021 | 34367079
8405 | 5 | 0.9990 | Mapping Major Disease Resistance Genes in Soybean by Genome-Wide Association Studies. Soybean is one of the most valuable agricultural crops in the world. Besides, this legume is constantly attacked by a wide range of pathogens (fungi, bacteria, viruses, and nematodes) compromising yield and increasing production costs. One of the major disease management strategies is the genetic resistance provided by single genes and quantitative trait loci (QTL). Identifying the genomic regions underlying the resistance against these pathogens on soybean is one of the first steps performed by molecular breeders. In the past, genetic mapping studies have been widely used to discover these genomic regions. However, over the last decade, advances in next-generation sequencing technologies and their subsequent decrease in cost led to the development of cost-effective approaches to high-throughput genotyping. Thus, genome-wide association studies applying thousands of SNPs in large sets composed of diverse soybean accessions have been successfully done. In this chapter, a comprehensive review of the majority of GWAS for soybean diseases published since this approach was developed is provided. Important diseases caused by Heterodera glycines, Phytophthora sojae, and Sclerotinia sclerotiorum have been the focus of several GWAS. However, other bacterial and fungal diseases also have been targets of GWAS. As such, this GWAS summary can serve as a guide for future studies of these diseases. The protocol begins by describing several considerations about the pathogens and presenting different procedures for their molecular characterization. Advice to choose the best isolate/race to maximize the discovery of multiple R genes or to directly map an effective R gene is provided. A summary of protocols, methods, and tools for phenotyping the soybean panel is given for several diseases. We also give details of options of DNA extraction protocols and genotyping methods, and we describe parameters of SNP quality for soybean data. Websites and their online tools to obtain genotypic and phenotypic data for thousands of soybean accessions are highlighted. Finally, we report several tricks and tips in Subheading 4, especially related to composing the soybean panel as well as generating and analyzing the phenotype data. We hope this protocol will be helpful to achieve GWAS success in identifying resistance genes on soybean. | 2022 | 35641772
9184 | 6 | 0.9989 | Unlocking the potential of phages: Innovative approaches to harnessing bacteriophages as diagnostic tools for human diseases. Phages, viruses that infect bacteria, have been explored as promising tools for the detection of human disease. By leveraging the specificity of phages for their bacterial hosts, phage-based diagnostic tools can rapidly and accurately detect bacterial infections in clinical samples. In recent years, advances in genetic engineering and biotechnology have enabled the development of more sophisticated phage-based diagnostic tools, including those that express reporter genes or enzymes, or target specific virulence factors or antibiotic resistance genes. However, despite these advancements, there are still challenges and limitations to the use of phage-based diagnostic tools, including concerns over phage safety and efficacy. This review aims to provide a comprehensive overview of the current state of phage-based diagnostic tools, including their advantages, limitations, and potential for future development. By addressing these issues, we hope to contribute to the ongoing efforts to develop safe and effective phage-based diagnostic tools for the detection of human disease. | 2023 | 37770168
5115 | 7 | 0.9989 | Search Engine for Antimicrobial Resistance: A Cloud Compatible Pipeline and Web Interface for Rapidly Detecting Antimicrobial Resistance Genes Directly from Sequence Data. BACKGROUND: Antimicrobial resistance remains a growing and significant concern in human and veterinary medicine. Current laboratory methods for the detection and surveillance of antimicrobial resistant bacteria are limited in their effectiveness and scope. With the rapidly developing field of whole genome sequencing beginning to be utilised in clinical practice, the ability to interrogate sequencing data quickly and easily for the presence of antimicrobial resistance genes will become increasingly important and useful for informing clinical decisions. Additionally, use of such tools will provide insight into the dynamics of antimicrobial resistance genes in metagenomic samples such as those used in environmental monitoring. RESULTS: Here we present the Search Engine for Antimicrobial Resistance (SEAR), a pipeline and web interface for detection of horizontally acquired antimicrobial resistance genes in raw sequencing data. The pipeline provides gene information, abundance estimation and the reconstructed sequence of antimicrobial resistance genes; it also provides web links to additional information on each gene. The pipeline utilises clustering and read mapping to annotate full-length genes relative to a user-defined database. It also uses local alignment of annotated genes to a range of online databases to provide additional information. We demonstrate SEAR's application in the detection and abundance estimation of antimicrobial resistance genes in two novel environmental metagenomes, 32 human faecal microbiome datasets and 126 clinical isolates of Shigella sonnei. CONCLUSIONS: We have developed a pipeline that contributes to the improved capacity for antimicrobial resistance detection afforded by next generation sequencing technologies, allowing for rapid detection of antimicrobial resistance genes directly from sequencing data. SEAR uses raw sequencing data via an intuitive interface so can be run rapidly without requiring advanced bioinformatic skills or resources. Finally, we show that SEAR is effective in detecting antimicrobial resistance genes in metagenomic and isolate sequencing data from both environmental metagenomes and sequencing data from clinical isolates. | 2015 | 26197475
9557 | 8 | 0.9989 | Antimicrobial Resistance Profile by Metagenomic and Metatranscriptomic Approach in Clinical Practice: Opportunity and Challenge. The burden of bacterial resistance to antibiotics affects several key sectors in the world, including healthcare, the government, and the economic sector. Resistant bacterial infection is associated with prolonged hospital stays, direct costs, and costs due to loss of productivity, which will cause policy makers to adjust their policies. Current widely performed procedures for the identification of antibiotic-resistant bacteria rely on culture-based methodology. However, some resistance determinants, such as free-floating DNA of resistance genes, are outside the bacterial genome, which could be potentially transferred under antibiotic exposure. Metagenomic and metatranscriptomic approaches to profiling antibiotic resistance offer several advantages to overcome the limitations of the culture-based approach. These methodologies enhance the probability of detecting resistance determinant genes inside and outside the bacterial genome and novel resistance genes yet pose inherent challenges in availability, validity, expert usability, and cost. Despite these challenges, such molecular-based and bioinformatics technologies offer an exquisite advantage in improving clinicians' diagnoses and the management of resistant infectious diseases in humans. This review provides a comprehensive overview of next-generation sequencing technologies, metagenomics, and metatranscriptomics in assessing antimicrobial resistance profiles. | 2022 | 35625299
9554 | 9 | 0.9989 | A multi-label learning framework for predicting antibiotic resistance genes via dual-view modeling. The increasing prevalence of antibiotic resistance has become a global health crisis. For the purpose of safety regulation, it is of high importance to identify antibiotic resistance genes (ARGs) in bacteria. Although culture-based methods can identify ARGs relatively more accurately, the identifying process is time-consuming and specialized knowledge is required. With the rapid development of whole genome sequencing technology, researchers attempt to identify ARGs by computing sequence similarity from public databases. However, these computational methods might fail to detect ARGs due to the low sequence identity to known ARGs. Moreover, existing methods cannot effectively address the issue of multidrug resistance prediction for ARGs, which is a great challenge to clinical treatments. To address the challenges, we propose an end-to-end multi-label learning framework for predicting ARGs. More specifically, the task of ARGs prediction is modeled as a problem of multi-label learning, and a deep neural network-based end-to-end framework is proposed, in which a specific loss function is introduced to employ the advantage of multi-label learning for ARGs prediction. In addition, a dual-view modeling mechanism is employed to make full use of the semantic associations among two views of ARGs, i.e. sequence-based information and structure-based information. Extensive experiments are conducted on publicly available data, and experimental results demonstrate the effectiveness of the proposed framework on the task of ARGs prediction. | 2022 | 35272349
9075 | 10 | 0.9989 | CamPype: an open-source workflow for automated bacterial whole-genome sequencing analysis focused on Campylobacter. BACKGROUND: The rapid expansion of Whole-Genome Sequencing has revolutionized the fields of clinical and food microbiology. However, its implementation as a routine laboratory technique remains challenging due to the growth of data at a faster rate than can be effectively analyzed and critical gaps in bioinformatics knowledge. RESULTS: To address both issues, CamPype was developed as a new bioinformatics workflow for the genomics analysis of sequencing data of bacteria, especially Campylobacter, which is the main cause of gastroenteritis worldwide making a negative impact on the economy of the public health systems. CamPype allows full customization of stages to run and tools to use, including read quality control filtering, read contamination, reads extension and assembly, bacterial typing, genome annotation, searching for antibiotic resistance genes, virulence genes and plasmids, pangenome construction and identification of nucleotide variants. All results are processed and summarized in an interactive HTML report for best data visualization and interpretation. CONCLUSIONS: The minimal user intervention of CamPype makes this workflow an attractive resource for microbiology laboratories with no expertise in bioinformatics as a first line method for bacterial typing and epidemiological analyses, that would help to reduce the costs of disease outbreaks, or for comparative genomic analyses. CamPype is publicly available at https://github.com/JoseBarbero/CamPype. | 2023 | 37474912
5112 | 11 | 0.9989 | Genome-Based Prediction of Bacterial Antibiotic Resistance. Clinical microbiology has long relied on growing bacteria in culture to determine antimicrobial susceptibility profiles, but the use of whole-genome sequencing for antibiotic susceptibility testing (WGS-AST) is now a powerful alternative. This review discusses the technologies that made this possible and presents results from recent studies to predict resistance based on genome sequences. We examine differences between calling antibiotic resistance profiles by the simple presence or absence of previously known genes and single-nucleotide polymorphisms (SNPs) against approaches that deploy machine learning and statistical models. Often, the limitations to genome-based prediction arise from limitations of accuracy of culture-based AST in addition to an incomplete knowledge of the genetic basis of resistance. However, we need to maintain phenotypic testing even as genome-based prediction becomes more widespread to ensure that the results do not diverge over time. We argue that standardization of WGS-AST by challenge with consistently phenotyped strain sets of defined genetic diversity is necessary to compare the efficacy of methods of prediction of antibiotic resistance based on genome sequences. | 2019 | 30381421
5114 | 12 | 0.9988 | Datasets for benchmarking antimicrobial resistance genes in bacterial metagenomic and whole genome sequencing. Whole genome sequencing (WGS) is a key tool in identifying and characterising disease-associated bacteria across clinical, agricultural, and environmental contexts. One increasingly common use of genomic and metagenomic sequencing is in identifying the type and range of antimicrobial resistance (AMR) genes present in bacterial isolates in order to make predictions regarding their AMR phenotype. However, there are a large number of alternative bioinformatics software and pipelines available, which can lead to dissimilar results. It is, therefore, vital that researchers carefully evaluate their genomic and metagenomic AMR analysis methods using a common dataset. To this end, as part of the Microbial Bioinformatics Hackathon and Workshop 2021, a 'gold standard' reference genomic and simulated metagenomic dataset was generated containing raw sequence reads mapped against their corresponding reference genome from a range of 174 potentially pathogenic bacteria. These datasets and their accompanying metadata are freely available for use in benchmarking studies of bacteria and their antimicrobial resistance genes and will help improve tool development for the identification of AMR genes in complex samples. | 2022 | 35705638
8399 | 13 | 0.9988 | SYN-View: A Phylogeny-Based Synteny Exploration Tool for the Identification of Gene Clusters Linked to Antibiotic Resistance. The development of new antibacterial drugs has become one of the most important tasks of the century in order to overcome the posing threat of drug resistance in pathogenic bacteria. Many antibiotics originate from natural products produced by various microorganisms. Over the last decades, bioinformatical approaches have facilitated the discovery and characterization of these small compounds using genome mining methodologies. A key part of this process is the identification of the most promising biosynthetic gene clusters (BGCs), which encode novel natural products. In 2017, the Antibiotic Resistant Target Seeker (ARTS) was developed in order to enable an automated target-directed genome mining approach. ARTS identifies possible resistant target genes within antibiotic gene clusters, in order to detect promising BGCs encoding antibiotics with novel modes of action. Although ARTS can predict promising targets based on multiple criteria, it provides little information about the cluster structures of possible resistant genes. Here, we present SYN-view. Based on a phylogenetic approach, SYN-view allows for easy comparison of gene clusters of interest and distinguishing genes with regular housekeeping functions from genes functioning as antibiotic resistant targets. Our aim is to implement our proposed method into the ARTS web-server, further improving the target-directed genome mining strategy of the ARTS pipeline. | 2020 | 33396183
5113 | 14 | 0.9988 | Identification of bacterial antibiotic resistance genes in next-generation sequencing data (review of literature). The spread of antibiotic-resistant human bacterial pathogens is a serious threat to modern medicine. Antibiotic susceptibility testing is essential for optimizing treatment regimens and preventing dissemination of antibiotic resistance. Therefore, development of antibiotic susceptibility testing methods is a priority challenge of laboratory medicine. The aim of this review is to analyze the capabilities of the bioinformatics tools for bacterial whole genome sequence data processing. The PubMed database, Russian scientific electronic library eLIBRARY, information networks of the World Health Organization and European Society of Clinical Microbiology and Infectious Diseases (ESCMID) were used during the analysis. In this review, the platforms for whole genome sequencing, which are suitable for detection of bacterial genetic resistance determinants, are described. The classic step in searching for genetic resistance determinants is an alignment between the query nucleotide/protein sequence and the subject (database) nucleotide/protein sequence, which is performed using the nucleotide and protein sequence databases. The most commonly used databases are Resfinder, CARD, Bacterial Antimicrobial Resistance Reference Gene Database. The results of searching for resistance determinants in genome assemblies are more accurate than the results of searching in contigs. New bioinformatics tools for finding resistance genes, such as neural networks and machine learning, are discussed in the review. After critical appraisal of the current antibiotic resistance databases we designed a protocol for predicting antibiotic resistance using whole genome sequence data. The designed protocol can be used as a basis of the algorithm for qualitative and quantitative antimicrobial susceptibility testing based on whole genome sequence data. | 2021 | 34882354
9560 | 15 | 0.9988 | The History of Colistin Resistance Mechanisms in Bacteria: Progress and Challenges. Since 2015, the discovery of colistin resistance genes has been limited to the characterization of new mobile colistin resistance (mcr) gene variants. However, given the complexity of the mechanisms involved, there are many colistin-resistant bacterial strains whose mechanism remains unknown and whose exploitation requires complementary technologies. In this review, through the history of colistin, we underline the methods used over the last decades, both old and recent, to facilitate the discovery of the main colistin resistance mechanisms and how new technological approaches may help to improve the rapid and efficient exploration of new target genes. To accomplish this, a systematic search was carried out via PubMed and Google Scholar on published data concerning polymyxin resistance from 1950 to 2020 using terms most related to colistin. This review first explores the history of the discovery of the mechanisms of action and resistance to colistin, based on the technologies deployed. Then we focus on the most advanced technologies used, such as MALDI-TOF-MS, high throughput sequencing or the genetic toolbox. Finally, we outline promising new approaches, such as omics tools and CRISPR-Cas9, as well as the challenges they face. Much has been achieved since the discovery of polymyxins, through several innovative technologies. Nevertheless, colistin resistance mechanisms remain very complex. | 2021 | 33672663
5100 | 16 | 0.9988 | DeepPBI-KG: a deep learning method for the prediction of phage-bacteria interactions based on key genes. Phages, the natural predators of bacteria, were discovered more than 100 years ago. However, increasing antimicrobial resistance rates have revitalized phage research. Methods that are less time-consuming and more efficient than wet-laboratory experiments are needed to help screen phages quickly for therapeutic use. Traditional computational methods usually ignore the fact that phage-bacteria interactions are achieved by key genes and proteins. Methods for intraspecific prediction are rare since almost all existing methods consider only interactions at the species and genus levels. Moreover, most strains in existing databases contain only partial genome information because whole-genome information for species is difficult to obtain. Here, we propose a new approach for interaction prediction by constructing new features from key genes and proteins via the application of K-means sampling to select high-quality negative samples for prediction. Finally, we develop DeepPBI-KG, a corresponding prediction tool based on feature selection and a deep neural network. The results show that the average area under the curve for prediction reached 0.93 for each strain, and the overall AUC and area under the precision-recall curve reached 0.89 and 0.92, respectively, on the independent test set; these values are greater than those of other existing prediction tools. The forward and reverse validation results indicate that key genes and key proteins regulate and influence the interaction, which supports the reliability of the model. In addition, intraspecific prediction experiments based on Klebsiella pneumoniae data demonstrate the potential applicability of DeepPBI-KG for intraspecific prediction. In summary, the feature engineering and interaction prediction approaches proposed in this study can effectively improve the robustness and stability of interaction prediction, can achieve high generalizability, and may provide new directions and insights for rapid phage screening for therapy. | 2024 | 39344712
5103 | 17 | 0.9988 | Revolutionising bacteriology to improve treatment outcomes and antibiotic stewardship. Laboratory investigation of bacterial infections generally takes two days: one to grow the bacteria and another to identify them and to test their susceptibility. Meanwhile the patient is treated empirically, based on likely pathogens and local resistance rates. Many patients are over-treated to prevent under-treatment of a few, compromising antibiotic stewardship. Molecular diagnostics have potential to improve this situation by accelerating precise diagnoses and the early refinement of antibiotic therapy. They include: (i) the use of 'biomarkers' to swiftly distinguish patients with bacterial infection, and (ii) molecular bacteriology to identify pathogens and their resistance genes in clinical specimens, without culture. Biomarker interest centres on procalcitonin, which has given good results particularly for pneumonias, though broader biomarker arrays may prove superior in the future. PCRs already are widely used to diagnose a few infections (e.g. tuberculosis) whilst multiplexes are becoming available for bacteraemia, pneumonia and gastrointestinal infection. These detect likely pathogens, but are not comprehensive, particularly for resistance genes; there is also the challenge of linking pathogens and resistance genes when multiple organisms are present in a sample. Next-generation sequencing offers more comprehensive profiling, but obstacles include sensitivity when the bacterial load is low, as in bacteraemia, and the imperfect correlation of genotype and phenotype. In short, rapid molecular bacteriology presents great potential to improve patient treatments and antibiotic stewardship but faces many technical challenges; moreover it runs counter to the current nostrum of defining resistance in pharmacodynamic terms, rather than by the presence of a mechanism, and the policy of centralising bacteriology services. | 2013 | 24265945
9566 | 18 | 0.9988 | Computational resources in the management of antibiotic resistance: Speeding up drug discovery. This article reviews more than 50 computational resources developed in the past two decades for forecasting of antibiotic resistance (AR)-associated mutations, genes and genomes. More than 30 databases have been developed for AR-associated information, but only a fraction of them are updated regularly. A large number of methods have been developed to find AR genes, mutations and genomes, with most of them based on similarity-search tools such as BLAST and HMMER. In addition, methods have been developed to predict the inhibition potential of antibiotics against a bacterial strain from the whole-genome data of bacteria. This review also discusses computational resources that can be used to manage the treatment of AR-associated diseases. | 2021 | 33892146
5099 | 19 | 0.9988 | A machine learning-based strategy to elucidate the identification of antibiotic resistance in bacteria. Microorganisms, crucial for environmental equilibrium, could be destructive, resulting in detrimental pathophysiology to the human host. Moreover, with the emergence of antibiotic resistance (ABR), the microbial communities pose the century's largest public health challenges in terms of effective treatment strategies. Furthermore, given the large diversity and number of known bacterial strains, describing treatment choices for infected patients using experimental methodologies is time-consuming. An alternative technique, gaining popularity as sequencing prices fall and technology advances, is to use bacterial genotype rather than phenotype to determine ABR. Complementing machine learning into clinical practice provides a data-driven platform for categorization and interpretation of bacterial datasets. In the present study, k-mers were generated from nucleotide sequences of pathogenic bacteria resistant to antibiotics. Subsequently, they were clustered into groups of bacteria sharing similar genomic features using the Affinity propagation algorithm with a Silhouette coefficient of 0.82. Thereafter, a prediction model based on Random Forest algorithm was developed to explore the prediction capability of the k-mers. It yielded an overall specificity of 0.99 and a sensitivity of 0.98. Additionally, the genes and ABR drivers related to the k-mers were identified to explore their biological relevance. Furthermore, a multilayer perceptron model with a hamming loss of 0.05 was built to classify the bacterial strains into resistant and non-resistant strains against various antibiotics. Segregating pathogenic bacteria based on genomic similarities could be a valuable approach for assessing the severity of diseases caused by new bacterial strains. Utilization of this strategy could aid in enhancing our understanding of ABR patterns, paving the way for more informed and effective treatment options. | 2024 | 39816256