LABOR - Word Related Documents

#	Rank	Similarity	Title + Abstract	Year	PMID
9076	0	0.9784	ResiDB: An automated database manager for sequence data. The amount of publicly available DNA sequence data is drastically increasing, making it a tedious task to create sequence databases necessary for the design of diagnostic assays. The selection of appropriate sequences is especially challenging in genes affected by frequent point mutations such as antibiotic resistance genes. To overcome this issue, we have designed the webtool resiDB, a rapid and user-friendly sequence database manager for bacteria, fungi, viruses, protozoa, invertebrates, plants, archaea, environmental and whole genome shotgun sequence data. It automatically identifies and curates sequence clusters to create custom sequence databases based on user-defined input sequences. A collection of helpful visualization tools gives the user the opportunity to easily access, evaluate, edit, and download the newly created database. Consequently, researchers no longer have to manually manage sequence data retrieval, deal with hardware limitations, or run multiple independent software tools, each having its own requirements, input and output formats. Our tool was developed within the H2020 project FAPIC aiming to develop a single diagnostic assay targeting all sepsis-relevant pathogens and antibiotic resistance mechanisms. ResiDB is freely accessible to all users through https://residb.ait.ac.at/.	2021	33495705
5164	1	0.9778	Genome sequencing analysis of the pncA, rpsA and panD genes responsible for pyrazinamide resistance of Mycobacterium tuberculosis from Indonesian isolates. BACKGROUND: Developing the most suitable treatment against tuberculosis based on resistance profiles is imperative to effectively cure tuberculosis patients. Whole-genome sequencing is a molecular method that allows for the rapid and cost-effective detection of mutations in multiple genes associated with anti-tuberculosis drug resistance. This sequencing approach addresses the limitations of culture-based methods, which may not apply to certain anti-TB drugs, such as pyrazinamide, because of their specific culture medium requirements, potentially leading to biased resistance culture results. METHODS: Thirty-four M. tuberculosis isolates were subcultured on a Lowenstein-Jensen medium. The genome of these bacteria was subsequently isolated using cetyltrimethylammonium bromide. Genome sequencing was performed with Novaseq Illumina 6000 (Illumina), and the data were analysed using the GenTB and Mykrobe applications. We also conducted a de novo analysis to compare the two methods and performed mutation analysis of other genes encoding pyrazinamide resistance, namely rpsA and panD. RESULTS: The results revealed mutations in the pncA gene, which were identified based on the databases accessed through GenTB and Mykrobe. Two discrepancies between the drug susceptibility testing and sequencing results may suggest potential instability in the drug susceptibility testing culture, specifically concerning PZA. The de novo analysis identified the same pncA mutations as GenTB and Mykrobe; in addition, silent mutations in rpsA were found in several isolates, along with a point mutation, and no mutations were found in the panD gene. However, the mutations in the genes encoding pyrazinamide resistance require further in-depth study to understand their relationship to the phenotypic profile.
CONCLUSIONS: Compared to the conventional culture method, the whole-genome sequencing method has advantages in determining anti-tuberculosis resistance profiles, especially in reduced time and bias.	2024	39397216
5163	2	0.9777	Multi-omics data elucidate parasite-host-microbiota interactions and resistance to Haemonchus contortus in sheep. BACKGROUND: The integration of molecular data from hosts, parasites, and microbiota can enhance our understanding of the complex biological interactions underlying the resistance of hosts to parasites. Haemonchus contortus, the predominant sheep gastrointestinal parasite species in the tropics, causes significant production and economic losses, which are further compounded by the diminishing efficiency of chemical control owing to anthelmintic resistance. Knowledge of how the host responds to infection and how the parasite, in combination with microbiota, modulates host immunity can guide selection decisions to breed animals with improved parasite resistance. This understanding will help refine management practices and advance the development of new therapeutics for long-term helminth control. METHODS: Eggs per gram (EPG) of feces were obtained from Morada Nova sheep subjected to two artificial infections with H. contortus and used as a proxy to select animals with high resistance or susceptibility for transcriptome sequencing (RNA-seq) of the abomasum and 50 K single-nucleotide genotyping. Additionally, RNA-seq data for H. contortus were generated, and amplicon sequence variants (ASV) were obtained using polymerase chain reaction amplification and sequencing of bacterial and archaeal 16S ribosomal RNA genes from sheep feces and rumen content. RESULTS: The heritability estimate for EPG was 0.12. GAST, GNLY, IL13, MGRN1, FGF14, and RORC genes and transcripts were differentially expressed between resistant and susceptible animals. A genome-wide association study identified regions on chromosomes 2 and 11 that harbor candidate genes for resistance, immune response, body weight, and adaptation. Trans-expression quantitative trait loci were found between significant variants and differentially expressed transcripts. 
Functional co-expression modules based on sheep genes and ASVs correlated with resistance to H. contortus, showing enrichment in pathways of response to bacteria, immune and inflammatory responses, and hub features of the Christensenellaceae, Bacteroides, and Methanobrevibacter genera; Prevotellaceae family; and Verrucomicrobiota phylum. In H. contortus, some mitochondrial, collagen-, and cuticle-related genes were expressed only in parasites isolated from susceptible sheep. CONCLUSIONS: The present study identified chromosome regions, genes, transcripts, and pathways involved in the elaborate interactions between the sheep host, its gastrointestinal microbiota, and the H. contortus parasite. These findings will assist in the development of animal selection strategies for parasite resistance and interdisciplinary approaches to control H. contortus infection in sheep.	2024	38429820
8401	3	0.9775	LSTrAP-Crowd: prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data. BACKGROUND: Bacterial resistance to antibiotics is a growing health problem that is projected to cause more deaths than cancer by 2050. Consequently, novel antibiotics are urgently needed. Since more than half of the available antibiotics target the structurally conserved bacterial ribosomes, factors involved in protein synthesis are thus prime targets for the development of novel antibiotics. However, experimental identification of these potential antibiotic target proteins can be labor-intensive and challenging, as these proteins are likely to be poorly characterized and specific to few bacteria. Here, we use a bioinformatics approach to identify novel components of protein synthesis. RESULTS: In order to identify these novel proteins, we established a Large-Scale Transcriptomic Analysis Pipeline in Crowd (LSTrAP-Crowd), where 285 individuals processed 26 terabytes of RNA-sequencing data of the 17 most notorious bacterial pathogens. In total, the crowd processed 26,269 RNA-seq experiments and used the data to construct gene co-expression networks, which were used to identify more than a hundred uncharacterized genes that were transcriptionally associated with protein synthesis. We provide the identity of these genes together with the processed gene expression data. CONCLUSIONS: We identified genes related to protein synthesis in common bacterial pathogens and thus provide a resource of potential antibiotic development targets for experimental validation. The data can be used to explore additional vulnerabilities of bacteria, while our approach demonstrates how the processing of gene expression data can be easily crowd-sourced.	2020	32883264
8463	4	0.9774	Safety assessment of five candidate probiotic lactobacilli using comparative genome analysis. Micro-organisms belonging to the Lactobacillus genus complex are often used for oral consumption and are generally considered safe but can exhibit pathogenicity in rare and specific cases. Therefore, screening and understanding genetic factors that may contribute to pathogenicity can yield valuable insights regarding probiotic safety. Limosilactobacillus mucosae LM1, Lactiplantibacillus plantarum SK151, Lactiplantibacillus plantarum BS25, Limosilactobacillus fermentum SK152 and Lactobacillus johnsonii PF01 are current probiotics of interest; however, their safety profiles have not been explored. The genome sequences of LM1, SK151, SK152 and PF01 were downloaded from the NCBI GenBank, while that of L. plantarum BS25 was newly sequenced. These genomes were then annotated using the Rapid Annotation using Subsystem Technology tool kit pipeline. Subsequently, a command line blast was performed against the Virulence Factor Database (VFDB) and the Comprehensive Antibiotic Resistance Database (CARD) to identify potential virulence factors and antibiotic resistance (AR) genes. Furthermore, ResFinder was used to detect acquired AR genes. The query against the VFDB identified genes that have a role in bacterial survivability, platelet aggregation, surface adhesion, biofilm formation and immunoregulation; and no acquired AR genes were detected using CARD and ResFinder. The study shows that the query strains exhibit genes identical to those present in pathogenic bacteria with the genes matched primarily having roles related to survival and surface adherence. Our results contribute to the overall strategies that can be employed in pre-clinical safety assessments of potential probiotics. Gene mining using whole-genome data, coupled with experimental validation, can be implemented in future probiotic safety assessment strategies.	2024	38361650
9072	5	0.9773	PanGeT: Pan-genomics tool. A decade after the concept of the pan-genome was first introduced, research in this field has spread its tentacles to areas such as pathogenesis of diseases, bacterial evolutionary studies and drug resistance. Gene content-based differentiation of virulent and avirulent strains of bacteria and identification of pathogen-specific genes are imperative to understand their physiology and gain insights into the mechanism of genome evolution. Subsequently, this will aid in identifying diagnostic targets and in developing and selecting vaccines. The root of pan-genomic studies, however, is to identify the core genes, dispensable genes and strain-specific genes across the genomes belonging to a clade. To this end, we have developed a tool, "PanGeT - Pan-genomics Tool", to compute the 'pan-genome' based on comparisons at the genome as well as the proteome levels. This automated tool is implemented using LaTeX libraries for effective visualization of overall pan-genome through graphical plots. Links to retrieve sequence information and functional annotations have also been provided. PanGeT can be downloaded from http://pranag.physics.iisc.ernet.in/PanGeT/ or https://github.com/PanGeTv1/PanGeT.	2017	27851981
9073	6	0.9770	EpitoCore: Mining Conserved Epitope Vaccine Candidates in the Core Proteome of Multiple Bacteria Strains. In reverse vaccinology approaches, complete proteomes of bacteria are submitted to multiple computational prediction steps in order to filter proteins that are possible vaccine candidates. Most available tools perform such analysis only in a single strain, or a very limited number of strains. But the vast amount of genomic data has shown that most bacteria contain pangenomes, i.e., their genomic information contains core, conserved genes, and random accessory genes specific to each strain. Therefore, in reverse vaccinology methods it is of the utmost importance to define core proteins and core epitopes. EpitoCore is a decision-tree pipeline developed to fulfill that need. It provides surfaceome prediction of proteins from related strains, defines core proteins within those, calculates their immunogenicity, predicts epitopes for a given set of MHC alleles defined by the user, and then reports if epitopes are located extracellularly and if they are conserved among the core homologs. Pipeline performance is illustrated by mining peptide vaccine candidates in Mycobacterium avium hominissuis strains. From a total proteome of ~4,800 proteins per strain, EpitoCore predicted 103 highly immunogenic core homologs located at the cell surface, many of those related to virulence and drug resistance. Conserved epitopes identified among these homologs allow users to define sets of peptides with potential to immunize the largest coverage of tested HLA alleles using peptide-based vaccines. Therefore, EpitoCore is able to provide automated identification of conserved epitopes in bacterial pangenomic datasets.	2020	32431712
8759	7	0.9768	Genetic and transcriptomic dissection of host defense to Goss's bacterial wilt and leaf blight of maize. Goss's wilt, caused by the Gram-positive actinobacterium Clavibacter nebraskensis, is an important bacterial disease of maize. The molecular and genetic mechanisms of resistance to the bacterium, or, in general, Gram-positive bacteria causing plant diseases, remain poorly understood. Here, we examined the genetic basis of Goss's wilt through differential gene expression, standard genome-wide association mapping (GWAS), extreme phenotype (XP) GWAS using highly resistant (R) and highly susceptible (S) lines, and quantitative trait locus (QTL) mapping using 3 bi-parental populations, identifying 11 disease association loci. Three loci were validated using near-isogenic lines or recombinant inbred lines. Our analysis indicates that Goss's wilt resistance is highly complex and major resistance genes are not commonly present. RNA sequencing of samples separately pooled from R and S lines with or without bacterial inoculation was performed, enabling identification of common and differential gene responses in R and S lines. Based on expression, in both R and S lines, the photosynthesis pathway was silenced upon infection, while stress-responsive pathways and phytohormone pathways, namely, abscisic acid, auxin, ethylene, jasmonate, and gibberellin, were markedly activated. In addition, 65 genes showed differential responses (up- or down-regulated) to infection in R and S lines. Combining genetic mapping and transcriptional data, individual candidate genes conferring Goss's wilt resistance were identified. Collectively, aspects of the genetic architecture of Goss's wilt resistance were revealed, providing foundational data for mechanistic studies.	2023	37652038
5162	8	0.9768	Genomic identification and characterization of Streptococcus oralis group that causes intraamniotic infection. BACKGROUND: Intraamniotic infection is a cause of spontaneous preterm labor. Streptococcus mitis is a common pathogen identified in intraamniotic infection, with the possible route of hematogenous dissemination from the oral cavity or migration from the vaginal canal. However, there are few reports on Streptococcus oralis, a member of the S. mitis group, as a causative pathogen in intraamniotic infection. We reported herein whole genome sequencing and comparative genomic analysis of S. oralis strain RAOG5826 that causes intraamniotic infection. RESULTS: Streptococcus mitis was initially identified from amniotic fluid, vaginal swab, and fetal blood of a patient presenting with preterm prelabor rupture of membranes with intraamniotic infection by the use of conventional microbiological methods (biochemical phenotype, MALDI-ToF, 16S rRNA). This strain was subsequently identified as S. oralis RAOG5826 by whole-genome hybrid sequencing. Genes involved in macrolide and tetracycline resistance, namely ermB and tet(M), and mutations in penicillin-binding protein were present in the genome. Moreover, potential virulence genes were predicted and compared with other Streptococcal species. CONCLUSION: We reported a comprehensive genomic analysis of S. oralis, which causes intraamniotic infection. S. mitis was initially identified by conventional microbiological identification. However, whole-genome hybrid sequencing demonstrates S. oralis with complete profiles of antimicrobial resistance genes and potential virulence factors. This study highlights the limitations of traditional techniques and underscores the importance of genomic sequencing for accurate diagnosis and tailored antimicrobial treatment. The study also suggests that S. oralis may be an underestimated pathogen in intraamniotic infection.	2025	41023353
8405	9	0.9768	Mapping Major Disease Resistance Genes in Soybean by Genome-Wide Association Studies. Soybean is one of the most valuable agricultural crops in the world. However, this legume is constantly attacked by a wide range of pathogens (fungi, bacteria, viruses, and nematodes), compromising yield and increasing production costs. One of the major disease management strategies is the genetic resistance provided by single genes and quantitative trait loci (QTL). Identifying the genomic regions underlying the resistance against these pathogens in soybean is one of the first steps performed by molecular breeders. In the past, genetic mapping studies have been widely used to discover these genomic regions. However, over the last decade, advances in next-generation sequencing technologies and their subsequent decrease in cost have led to the development of cost-effective approaches to high-throughput genotyping. Thus, genome-wide association studies applying thousands of SNPs in large sets composed of diverse soybean accessions have been successfully done. In this chapter, a comprehensive review of the majority of GWAS for soybean diseases published since this approach was developed is provided. Important diseases caused by Heterodera glycines, Phytophthora sojae, and Sclerotinia sclerotiorum have been the focus of several GWAS. However, other bacterial and fungal diseases have also been targets of GWAS. As such, this GWAS summary can serve as a guide for future studies of these diseases. The protocol begins by describing several considerations about the pathogens and presenting different procedures for their molecular characterization. Advice on choosing the best isolate/race to maximize the discovery of multiple R genes or to directly map an effective R gene is provided. A summary of protocols, methods, and tools for phenotyping the soybean panel is given for several diseases.
We also give details of options for DNA extraction protocols and genotyping methods, and we describe parameters of SNP quality for soybean data. Websites and their online tools to obtain genotypic and phenotypic data for thousands of soybean accessions are highlighted. Finally, we report several tricks and tips in Subheading 4, especially related to composing the soybean panel as well as generating and analyzing the phenotype data. We hope this protocol will be helpful to achieve GWAS success in identifying resistance genes in soybean.	2022	35641772
9083	10	0.9768	ARGNet: using deep neural networks for robust identification and classification of antibiotic resistance genes from sequences. BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components to define bacterial resistance and their spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by choice of reference databases and may potentially miss novel ARGs. The similarity thresholds are usually simple and could not accommodate variations across different gene families and regions. It is also difficult to scale up when sequence data are increasing. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs that do not depend on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length protein or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios especially quasi-negative test and the analysis of prediction consistency with phylogenetic tree. ARGNet has a reduced inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. 
ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk.	2024	38725076
5161	11	0.9767	Genomic analysis of contaminant Stenotrophomonas maltophilia, from placental swab culture, carrying antibiotic resistance: a potential hospital laboratory contaminant. Acute chorioamnionitis has been considered as reflective of amniotic fluid infection. The standard microbiological work-up for the causative microorganism of intra-amniotic infection is based on microbial identification. However, the frequency of positive placental cultures varies depending on placental sampling techniques, contaminations, methods of microbiologic work-up or comprehensive microbiologic work-up. In this report, we performed hybrid whole genome sequencing of a proven bacterial contaminant obtained from placental culture in a patient with preterm labor and acute chorioamnionitis, in order to unveil the genetic characteristics of contaminant Stenotrophomonas maltophilia harbouring antibiotic resistance genes. Stenotrophomonas maltophilia was proven to be a bacterial contaminant since Ureaplasma urealyticum was subsequently demonstrated in amniotic fluid by 16S rRNA gene Sanger sequencing. Cultures from other sources showed no growth. We identified Stenotrophomonas maltophilia strain RAOG732, which carried several antibiotic resistance genes, including genes for aminoglycoside, fluoroquinolone and beta-lactam resistance. Biofilm production genes were also identified in this genome. We utilized, for the first time, a hybrid sequencing approach to investigate the genome of S. maltophilia, a proven bacterial laboratory contaminant, in a patient with preterm labor and acute chorioamnionitis. The analysis revealed several antibiotic resistance-associated and biofilm-associated genes. The detection of S. maltophilia raised awareness of the colonization of biofilm-producing bacteria in hospitals, where surveillance for decontamination is necessary.	2025	40594762
9074	12	0.9767	BacAnt: A Combination Annotation Server for Bacterial DNA Sequences to Identify Antibiotic Resistance Genes, Integrons, and Transposable Elements. Whole genome sequencing (WGS) of bacteria has become a routine method in diagnostic laboratories. One of the clinically most useful advantages of WGS is the ability to predict antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) in bacterial sequences. This allows comprehensive investigations of such genetic features but can also be used for epidemiological studies. A plethora of software programs have been developed for the detailed annotation of bacterial DNA sequences, such as rapid annotation using subsystem technology (RAST), Resfinder, ISfinder, INTEGRALL and The Transposon Registry. Unfortunately, to this day, a reliable annotation tool of the combination of ARGs and MGEs is not available, and the generation of genbank files requires much manual input. Here, we present a new webserver which allows the annotation of ARGs, integrons and transposable elements at the same time. The pipeline generates genbank files automatically, which are compatible with Easyfig for comparative genomic analysis. Our BacAnt code and standalone software package are available at https://github.com/xthua/bacant with an accompanying web application at http://bacant.net.	2021	34367079
9075	13	0.9766	CamPype: an open-source workflow for automated bacterial whole-genome sequencing analysis focused on Campylobacter. BACKGROUND: The rapid expansion of Whole-Genome Sequencing has revolutionized the fields of clinical and food microbiology. However, its implementation as a routine laboratory technique remains challenging due to the growth of data at a faster rate than can be effectively analyzed and critical gaps in bioinformatics knowledge. RESULTS: To address both issues, CamPype was developed as a new bioinformatics workflow for the genomics analysis of sequencing data of bacteria, especially Campylobacter, which is the main cause of gastroenteritis worldwide, with a negative impact on the economy of public health systems. CamPype allows full customization of the stages to run and the tools to use, including read quality control filtering, read contamination, reads extension and assembly, bacterial typing, genome annotation, searching for antibiotic resistance genes, virulence genes and plasmids, pangenome construction and identification of nucleotide variants. All results are processed and summarized in an interactive HTML report for best data visualization and interpretation. CONCLUSIONS: The minimal user intervention required by CamPype makes this workflow an attractive resource for microbiology laboratories with no expertise in bioinformatics as a first-line method for bacterial typing and epidemiological analyses, which would help to reduce the costs of disease outbreaks, or for comparative genomic analyses. CamPype is publicly available at https://github.com/JoseBarbero/CamPype.	2023	37474912
9744	14	0.9765	PARGT: a software tool for predicting antimicrobial resistance in bacteria. With the ever-increasing availability of whole-genome sequences, machine-learning approaches can be used as an alternative to traditional alignment-based methods for identifying new antimicrobial-resistance genes. Such approaches are especially helpful when pathogens cannot be cultured in the lab. In previous work, we proposed a game-theory-based feature evaluation algorithm. When using the protein characteristics identified by this algorithm, called 'features' in machine learning, our model accurately identified antimicrobial resistance (AMR) genes in Gram-negative bacteria. Here we extend our study to Gram-positive bacteria showing that coupling game-theory-identified features with machine learning achieved classification accuracies between 87% and 90% for genes encoding resistance to the antibiotics bacitracin and vancomycin. Importantly, we present a standalone software tool that implements the game-theory algorithm and machine-learning model used in these studies.	2020	32620856
8469	15	0.9765	Probiogenomic analysis of Lactiplantibacillus plantarum SPS109: A potential GABA-producing and cholesterol-lowering probiotic strain. Lactiplantibacillus plantarum SPS109, a strain of lactic acid bacteria (LAB) isolated from fermented foods, showed remarkable potential as a probiotic with dual capabilities in γ-aminobutyric acid (GABA) production and cholesterol reduction. This study employs genomic and comparative analyses to delve into the strain's genetic profile, safety features, and probiotic attributes. The safety assessment reveals the absence of virulence factors and antimicrobial resistance genes, while the genome uncovers bacteriocin-related elements, including sactipeptides and a cluster for putative plantaricins, strengthening its ability to combat diverse pathogens. Pangenome analysis revealed unique bacteriocin-related genes, specifically lcnD and bcrA, distinguishing SPS109 from four other L. plantarum strains producing GABA. In addition, the genomic study emphasizes the distinctive features of strain SPS109: two GABA-related genes responsible for GABA production and a bile tolerance gene (cbh) crucial for cholesterol reduction. Additionally, the analysis highlights several genes associated with potential probiotic properties, including stress tolerance, vitamin production, and antioxidant activity. In summary, L. plantarum SPS109 emerges as a promising probiotic candidate with versatile applications in the food and beverage industries, supported by its unique genomic features and safety profile.	2024	39044985
9079	16	0.9765	Review, Evaluation, and Directions for Gene-Targeted Assembly for Ecological Analyses of Metagenomes. Shotgun metagenomics has greatly advanced our understanding of microbial communities over the last decade. Metagenomic analyses often include assembly and genome binning, computationally daunting tasks especially for big data from complex environments such as soil and sediments. In many studies, however, only a subset of genes and pathways involved in specific functions are of interest; thus, it is not necessary to attempt global assembly. In addition, methods that target genes can be computationally more efficient and produce more accurate assembly by leveraging rich databases, especially for those genes that are of broad interest such as those involved in biogeochemical cycles, biodegradation, and antibiotic resistance or used as phylogenetic markers. Here, we review six gene-targeted assemblers with unique algorithms for extracting and/or assembling targeted genes: Xander, MegaGTA, SAT-Assembler, HMM-GRASPx, GenSeed-HMM, and MEGAN. We tested these tools using two datasets with known genomes, a synthetic community of artificial reads derived from the genomes of 17 bacteria, shotgun sequence data from a mock community with 48 bacteria and 16 archaea genomes, and a large soil shotgun metagenomic dataset. We compared assemblies of a universal single copy gene (rplB) and two N cycle genes (nifH and nirK). We measured their computational efficiency, sensitivity, specificity, and chimera rate and found Xander and MegaGTA, which both use a probabilistic graph structure to model the genes, have the best overall performance with all three datasets, although MEGAN, a reference matching assembler, had better sensitivity with synthetic and mock community members chosen from its reference collection. Also, Xander and MegaGTA are the only tools that include post-assembly scripts tuned for common molecular ecology and diversity analyses. 
Additionally, we provide a mathematical model for estimating the probability of assembling targeted genes in a metagenome for estimating required sequencing depth.	2019	31749830
9617	17	0.9765	Multiplex CRISPRi System Enables the Study of Stage-Specific Biofilm Genetic Requirements in Enterococcus faecalis. Enterococcus faecalis is an opportunistic pathogen, which can cause multidrug-resistant life-threatening infections. Gaining a complete understanding of enterococcal pathogenesis is a crucial step in identifying a strategy to effectively treat enterococcal infections. However, bacterial pathogenesis is a complex process often involving a combination of genes and multilevel regulation. Compared to established knockout methodologies, CRISPR interference (CRISPRi) approaches enable the rapid and efficient silencing of genes to interrogate gene products and pathways involved in pathogenesis. As opposed to traditional gene inactivation approaches, CRISPRi can also be quickly repurposed for multiplexing or used to study essential genes. Here, we have developed a novel dual-vector nisin-inducible CRISPRi system in E. faecalis that can efficiently silence via both nontemplate and template strand targeting. Since the nisin-controlled gene expression system is functional in various Gram-positive bacteria, the developed CRISPRi tool can be extended to other genera. This system can be applied to study essential genes, genes involved in antimicrobial resistance, and genes involved in biofilm formation and persistence. The system is robust and can be scaled up for high-throughput screens or combinatorial targeting. This tool substantially enhances our ability to study enterococcal biology and pathogenesis, host-bacterium interactions, and interspecies communication. IMPORTANCE: Enterococcus faecalis causes multidrug-resistant life-threatening infections and is often coisolated with other pathogenic bacteria from polymicrobial biofilm-associated infections. Genetic tools to dissect complex interactions in mixed microbial communities are largely limited to transposon mutagenesis and traditional time- and labor-intensive allelic-exchange methods.
Built upon streptococcal dCas9, we developed an easily modifiable, inducible CRISPRi system for E. faecalis that can efficiently silence single and multiple genes. This system can silence genes involved in biofilm formation and antibiotic resistance and can be used to interrogate gene essentiality. Uniquely, this tool is optimized to study genes important for biofilm initiation, maturation, and maintenance and can be used to perturb preformed biofilms. This system will be valuable to rapidly and efficiently investigate a wide range of aspects of complex enterococcal biology.	2020	33082254
8448	18	0.9765	Genome-Wide Association Analysis for Resistance to Coniothyrium glycines Causing Red Leaf Blotch Disease in Soybean. Soybean is a high oil and protein-rich legume with several production constraints. Globally, several fungi, viruses, nematodes, and bacteria cause significant yield losses in soybean. Coniothyrium glycines (CG), the causal pathogen for red leaf blotch disease, is the least researched and causes severe damage to soybean. The identification of resistant soybean genotypes and mapping of genomic regions associated with resistance to CG is critical for developing improved cultivars for sustainable soybean production. This study used single nucleotide polymorphism (SNP) markers generated from a Diversity Arrays Technology (DArT) platform to conduct a genome-wide association (GWAS) analysis of resistance to CG using 279 soybean genotypes grown in three environments. A total of 6395 SNPs were used to perform the GWAS, applying the multilocus model Fixed and random model Circulating Probability Unification (FarmCPU) with correction of the population structure and a statistical test p-value threshold of 5%. A total of 19 significant marker-trait associations for resistance to CG were identified on chromosomes 1, 5, 6, 9, 10, 12, 13, 15, 16, 17, 19, and 20. Approximately 113 putative genes associated with significant markers for resistance to red leaf blotch disease were identified across the soybean genome. Positional candidate genes were identified that are associated with significant SNP loci and encode proteins involved in plant defense responses, and that could be associated with soybean defenses against CG infection. The results of this study provide valuable insight for further dissection of the genetic architecture of resistance to CG in soybean. They also highlight SNP variants and genes useful for genomics-informed selection decisions in the breeding process for improving resistance traits in soybean.	2023	37372451
5098	19	0.9764	Feature selection and aggregation for antibiotic resistance GWAS in Mycobacterium tuberculosis: a comparative study. INTRODUCTION: Drug resistance (DR) of pathogens remains a global healthcare concern. In contrast to other bacteria, acquiring mutations in the core genome is the main mechanism of drug resistance for Mycobacterium tuberculosis (MTB). For some antibiotics, the resistance of a particular isolate can be reliably predicted by identifying specific mutations, while for other antibiotics the knowledge of resistance mechanisms is limited. Statistical machine learning (ML) methods are used to infer new genes implicated in drug resistance leveraging large collections of isolates with known whole-genome sequences and phenotypic states for different drugs. However, high correlations between the phenotypic states for commonly used drugs complicate the inference of true associations of mutations with drug phenotypes by ML approaches. METHODS: Recently, several new methods have been developed to select a small subset of reliable predictors of the dependent variable, which may help reduce the number of spurious associations identified. In this study, we evaluated several such methods, namely, logistic regression with different regularization penalty functions, a recently introduced algorithm for solving the best-subset selection problem (ABESS), and "Hungry, Hungry SNPos" (HHS), a heuristic algorithm specifically developed to identify resistance-associated genetic variants in the presence of resistance co-occurrence. We assessed their ability to select known causal mutations for resistance to a specific drug while avoiding the selection of mutations in genes associated with resistance to other drugs; in this way, we compared the selected ML models for their applicability to MTB genome-wide association studies. RESULTS AND DISCUSSION: In our analysis, ABESS significantly outperformed the other methods, selecting more relevant sets of mutations. 
Additionally, we demonstrated that aggregating rare mutations within protein-coding genes into markers indicative of changes in PFAM domains improved prediction quality, and these markers were predominantly selected by ABESS, suggesting their high informativeness. However, ABESS yielded lower prediction accuracy compared to logistic regression methods with regularization.	2025	40606161