    Author  Title  Year  Journal/Proceedings  Reftype  DOI/URL
    Abraham and Chain An enzyme from bacteria able to destroy penicillin. 1940. 1988 Rev Infect Dis
    Vol. 10(4), pp. 677-678 
    article  
    Akerley et al. A genome-scale analysis for identification of genes required for growth or survival of Haemophilus influenzae. 2002 Proc Natl Acad Sci U S A
    Vol. 99(2), pp. 966-971 
    article DOI URL 
    Abstract: A high-density transposon mutagenesis strategy was applied to the Haemophilus influenzae genome to identify genes required for growth or viability. This analysis detected putative essential roles for the products of 259 ORFs of unknown function. Comparisons between complete genomes defined a subset of these proteins in H. influenzae having homologs in Mycobacterium tuberculosis that are absent in Saccharomyces cerevisiae, a distribution pattern that favors their use in development of antimicrobial therapeutics. Three genes within this set are essential for viability in other bacteria. Interfacing the set of essential gene products in H. influenzae with the distribution of homologs in other microorganisms can detect components of unrecognized cellular pathways essential in diverse bacteria. This genome-scale phenotypic analysis identifies potential roles for a large set of genes of unknown function.
    Altschul Evaluating the statistical significance of multiple distinct local alignments 1997 Theoretical and Computational Methods in Genome Research, pp. 1-14  article  
    Altschul et al. Basic local alignment search tool. 1990 J Mol Biol
    Vol. 215(3), pp. 403-410 
    article DOI URL 
    Abstract: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score. Recent mathematical results on the stochastic properties of MSP scores allow an analysis of the performance of this method as well as the statistical significance of alignments it generates. The basic algorithm is simple and robust; it can be implemented in a number of ways and applied in a variety of contexts including straightforward DNA and protein sequence database searches, motif searches, gene identification searches, and in the analysis of multiple regions of similarity in long DNA sequences. In addition to its flexibility and tractability to mathematical analysis, BLAST is an order of magnitude faster than existing sequence comparison tools of comparable sensitivity.
    Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997 Nucleic Acids Res
    Vol. 25(17), pp. 3389-3402 
    article  
    Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
    Alvarez Virtual Screening in Drug Discovery 2005   book  
    Alvarez High-throughput docking as a source of novel drug leads. 2004 Curr Opin Chem Biol
    Vol. 8(4), pp. 365-370 
    article DOI URL 
    Abstract: Receptor-based virtual screening has become a viable source of novel leads in the pharmaceutical industry. The rapidly growing availability of structural information across protein families, the accessibility to increased computational power at affordable cost, as well as an improved understanding on how to effectively apply virtual screening technologies has contributed to their emergence. Nonetheless, continued improvement in the accuracy of scoring functions and a greater understanding of protein mobility is critical to advance the technology further.
    Archer Staphylococcus aureus: a well-armed pathogen. 1998 Clin Infect Dis
    Vol. 26(5), pp. 1179-1181 
    article  
    Abstract: Staphylococcus aureus is a virulent pathogen that is currently the most common cause of infections in hospitalized patients. S. aureus infection can involve any organ system. The success of S. aureus as a pathogen and its ability to cause such a wide range of infections are the result of its extensive virulence factors. The increase in the resistance of this virulent pathogen to antibacterial agents, coupled with its increasing prevalence as a nosocomial pathogen, is of major concern. The core resistance phenotype that seems to be most associated with the persistence of S. aureus in the hospital is methicillin resistance. Methicillin resistance in nosocomial S. aureus isolates has been increasing dramatically in United States hospitals and is also associated with resistance to other useful antistaphylococcal compounds. Possible ways to decrease the incidence of nosocomial S. aureus infections include instituting more effective infection control, decreasing nasal colonization, developing vaccines, and developing new or improved antimicrobials.
    Arigoni et al. A genome-based approach for the identification of essential bacterial genes. 1998 Nat Biotechnol
    Vol. 16(9), pp. 851-856 
    article DOI URL 
    Abstract: We have used comparative genomics to identify 26 Escherichia coli open reading frames that are both of unknown function (hypothetical open reading frames or y-genes) and conserved in the compact genome of Mycoplasma genitalium. Not surprisingly, these genes are broadly conserved in the bacterial world. We used a markerless knockout strategy to screen for essential E. coli genes. To verify this phenotype, we constructed conditional mutants in genes for which no null mutants could be obtained. In total we identified six genes that are essential for E. coli (yhbZ, ygjD, ycfB, yfil, yihA, and yjeQ). The respective orthologs of the genes yhbZ, ygjD, ycfB, yjeQ, and yihA are also essential in Bacillus subtilis. This low number of essential genes was unexpected and might be due to a characteristic of the versatile genomes of E. coli and B. subtilis that is comparable to the phenomenon of nonorthologous gene displacement. The gene ygjD, encoding a sialoglycoprotease, was eliminated from a minimal genome computationally derived from a comparison of the Haemophilus influenzae and M. genitalium genomes. We show that ygjD and its ortholog ydiE are essential in E. coli and B. subtilis, respectively. Thus, we include this gene in a minimal genome. This study systematically integrates comparative genomics and targeted gene disruptions to identify broadly conserved bacterial genes of unknown function required for survival on complex media.
    Baba et al. Genome sequence of Staphylococcus aureus strain Newman and comparative analysis of staphylococcal genomes: polymorphism and evolution of two major pathogenicity islands. 2008 J Bacteriol
    Vol. 190(1), pp. 300-310 
    article DOI URL 
    Abstract: Strains of Staphylococcus aureus, an important human pathogen, display up to 20% variability in their genome sequence, and most sequence information is available for human clinical isolates that have not been subjected to genetic analysis of virulence attributes. S. aureus strain Newman, which was also isolated from a human infection, displays robust virulence properties in animal models of disease and has already been extensively analyzed for its molecular traits of staphylococcal pathogenesis. We report here the complete genome sequence of S. aureus Newman, which carries four integrated prophages, as well as two large pathogenicity islands. In agreement with the view that S. aureus Newman prophages contribute important properties to pathogenesis, fewer virulence factors are found outside of the prophages than for the highly virulent strain MW2. The absence of drug resistance genes reflects the general antibiotic-susceptible phenotype of S. aureus Newman. Phylogenetic analyses reveal clonal relationships between the staphylococcal strains Newman, COL, NCTC8325, and USA300 and a greater evolutionary distance to strains MRSA252, MW2, MSSA476, N315, Mu50, JH1, JH9, and RF122. However, polymorphism analysis of two large pathogenicity islands distributed among these strains shows that the two islands were acquired independently from the evolutionary pathway of the chromosomal backbones of staphylococcal genomes. Prophages and pathogenicity islands play central roles in S. aureus virulence and evolution.
    Badger et al. Structural analysis of a set of proteins resulting from a bacterial genomics project. 2005 Proteins
    Vol. 60(4), pp. 787-796 
    article DOI URL 
    Abstract: The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB.
    Bi and Lutkenhaus Cell division inhibitors SulA and MinCD prevent formation of the FtsZ ring. 1993 J Bacteriol
    Vol. 175(4), pp. 1118-1125 
    article  
    Abstract: Immunoelectron microscopy was used to assess the effects of inhibitors of cell division on formation of the FtsZ ring in Escherichia coli. Induction of the cell division inhibitor SulA, a component of the SOS response, or the inhibitor MinCD, a component of the min system, blocked formation of the FtsZ ring and led to filamentation. Reversal of SulA inhibition by blocking protein synthesis in SulA-induced filaments led to a resumption of FtsZ ring formation and division. These results suggested that these inhibitors block cell division by preventing FtsZ localization into the ring structure. In addition, analysis of min mutants demonstrated that FtsZ ring formation was also associated with minicell formation, indicating that all septation events in E. coli involve the FtsZ ring.
    Black Microbiology 2008   book  
    Bleicher et al. Hit and lead generation: beyond high-throughput screening. 2003 Nat Rev Drug Discov
    Vol. 2(5), pp. 369-378 
    article DOI URL 
    Abstract: The identification of small-molecule modulators of protein function, and the process of transforming these into high-content lead series, are key activities in modern drug discovery. The decisions taken during this process have far-reaching consequences for success later in lead optimization and even more crucially in clinical development. Recently, there has been an increased focus on these activities due to escalating downstream costs resulting from high clinical failure rates. In addition, the vast emerging opportunities from efforts in functional genomics and proteomics demands a departure from the linear process of identification, evaluation and refinement activities towards a more integrated parallel process. This calls for flexible, fast and cost-effective strategies to meet the demands of producing high-content lead series with improved prospects for clinical success.
    Brown and Warren Antibiotic discovery: is it all in the genes? 1998 Drug Discovery Today
    Vol. 3, pp. 564-566 
    article  
    Bruccoleri et al. Concordance analysis of microbial genomes. 1998 Nucleic Acids Res
    Vol. 26(19), pp. 4482-4486 
    article  
    Abstract: The set of proteins which are conserved across families of microbes contain important targets of new anti-microbial agents. We have developed a simple and efficient computational tool which determines concordances of putative gene products that show sets of proteins conserved across one set of user specified genomes and not present in another set of user specified genomes. The thresholds and the homology scoring criterion are selectable to allow the user to decide the stringency of the homologies. The system uses a relational database to store protein coding regions from different genomes, and to store the results of a complete comparison of all sequences against all sequences using the FASTA program. Using Web technology, the display of all the related proteins for a given sequence and calculation of multiple sequence alignments (using CLUSTALW) can be performed with the click of a button. The current database holds 97 365 sequences from 19 complete or partial genomes and 8798905 FASTA comparison results. An example concordance is presented which demonstrates that the target of the quinolone antibiotics could have been identified using this tool.
    Buysse The role of genomics in antibacterial target discovery. 2001 Curr Med Chem
    Vol. 8(14), pp. 1713-1726 
    article  
    Abstract: Complete DNA sequence information has now been obtained for several prokaryotic genomes, defining the entire genetic complement of these organisms. The collection of genomic data has provided new insights into the molecular architecture of bacterial cells, revealing the basic genetic and metabolic structures that support viability of the organisms. Genomic information has also revealed new avenues for inhibition of bacterial growth and viability, expanding the number of possible drug targets for antibiotic discovery. This review examines how genomic sciences and experimental tools are applied to antibacterial target discovery, the necessary first step in the development of new antibiotic classes. Significant advances have been realized in the development of functional genomic, comparative genomic, and proteomic methods for the analysis of completed genomes. The combination of these methods can be used to systematically parse the genome and identify targets worthy of inhibitor screens. Two basic categories of targets emerge from this exercise, comprising in vitro essential targets required for bacterial viability on synthetic media and in vivo essential targets required to establish and maintain infection within a host organism. Current use of genomic information is focused primarily on a definition of all in vitro essential targets that satisfy criteria of selectivity, spectrum, and novelty. As the genomes of additional bacterial pathogens are solved, it will be possible to select in vivo essential targets common to groups of select pathogens (e.g., bacterial agents of community acquired pneumonia) or even pathogen-specific targets. Consideration of host-pathogen interactions, defined at the level of gene expression for each organism, might provide novel therapeutic options in the future.
    Canchaya et al. The impact of prophages on bacterial chromosomes. 2004 Mol Microbiol
    Vol. 53(1), pp. 9-18 
    article DOI URL 
    Abstract: Prophages were automatically localized in sequenced bacterial genomes by a simple semantic script leading to the identification of 190 prophages in 115 investigated genomes. The distribution of prophages with respect to presence or absence in a given bacterial species, the location and orientation of the prophages on the replichore was not homogeneous. In bacterial pathogens, prophages are particularly prominent. They frequently encoded virulence genes and were major contributors to the genetic individuality of the strains. However, some commensal and free-living bacteria also showed prominent prophage contributions to the bacterial genomes. Lysogens containing multiple sequence-related prophages can experience rearrangements of the bacterial genome across prophages, leading to prophages with new gene constellations. Transfer RNA genes are the preferred chromosomal integration sites, and a number of prophages also carry tRNA genes. Prophage integration into protein coding sequences can lead to either gene disruption or new proteins. The phage repressor, immunity and lysogenic conversion genes are frequently transcribed from the prophage. The expression of the latter is sometimes integrated into control circuits linking prophages, the lysogenic bacterium and its animal host. Prophages are apparently as easily acquired as they are lost from the bacterial chromosome. Fixation of prophage genes seems to be restricted to those with functions that have been co-opted by the bacterial host.
    Canchaya et al. Phage as agents of lateral gene transfer. 2003 Curr Opin Microbiol
    Vol. 6(4), pp. 417-424 
    article  
    Abstract: When establishing lysogeny, temperate phages integrate their genome as a prophage into the bacterial chromosome. Prophages thus constitute in many bacteria a substantial part of laterally acquired DNA. Some prophages contribute lysogenic conversion genes that are of selective advantage to the bacterial host. Occasionally, phages are also involved in the lateral transfer of other mobile DNA elements or bacterial DNA. Recent advances in the field of genomics have revealed a major impact by phages on bacterial chromosome evolution.
    Canchaya et al. Prophage genomics. 2003 Microbiol Mol Biol Rev
    Vol. 67(2), pp. 238-76, table of contents 
    article  
    Abstract: The majority of the bacterial genome sequences deposited in the National Center for Biotechnology Information database contain prophage sequences. Analysis of the prophages suggested that after being integrated into bacterial genomes, they undergo a complex decay process consisting of inactivating point mutations, genome rearrangements, modular exchanges, invasion by further mobile DNA elements, and massive DNA deletion. We review the technical difficulties in defining such altered prophage sequences in bacterial genomes and discuss theoretical frameworks for the phage-bacterium interaction at the genomic level. The published genome sequences from three groups of eubacteria (low- and high-G+C gram-positive bacteria and gamma-proteobacteria) were screened for prophage sequences. The prophages from Streptococcus pyogenes served as test case for theoretical predictions of the role of prophages in the evolution of pathogenic bacteria. The genomes from further human, animal, and plant pathogens, as well as commensal and free-living bacteria, were included in the analysis to see whether the same principles of prophage genomics apply for bacteria living in different ecological niches and coming from distinct phylogenetical affinities. The effect of selection pressure on the host bacterium is apparently an important force shaping the prophage genomes in low-G+C gram-positive bacteria and gamma-proteobacteria.
    Carver et al. Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database. 2008 Bioinformatics
    Vol. 24(23), pp. 2672-2676 
    article DOI URL 
    Abstract: MOTIVATION: Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences. RESULTS: Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text. AVAILABILITY: Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/
    Carver et al. ACT: the Artemis Comparison Tool. 2005 Bioinformatics
    Vol. 21(16), pp. 3422-3423 
    article DOI URL 
    Abstract: The Artemis Comparison Tool (ACT) allows an interactive visualisation of comparisons between complete genome sequences and associated annotations. The comparison data can be generated with several different programs; BLASTN, TBLASTX or Mummer comparisons between genomic DNA sequences, or orthologue tables generated by reciprocal FASTA comparison between protein sets. It is possible to identify regions of similarity, insertions and rearrangements at any level from the whole genome to base-pair differences. ACT uses Artemis components to display the sequences and so inherits powerful searching and analysis tools. ACT is part of the Artemis distribution and is similarly open source, written in Java and can run on any Java enabled platform, including UNIX, Macintosh and Windows.
    Chan et al. Finding the gems using genomic discovery strategies -- the successes and the challenges 2004 Drug Discovery Today: Therapeutic Strategies
    Vol. 1, pp. 519-527 
    article  
    Chan et al. Novel antibacterials: a genomics approach to drug discovery. 2002 Curr Drug Targets Infect Disord
    Vol. 2(4), pp. 291-308 
    article  
    Abstract: The appearance of antibiotic resistant pathogens, including vancomycin resistant Staphylococcus aureus, in the clinic has necessitated the development of new antibiotics. The golden age of antibiotic discovery, in which potent selective compounds were readily extracted from natural product extracts is over and novel approaches need to be implemented to cover the therapeutic shortfall. The generation of huge quantities of bacterial sequence data has allowed the identification of all the possible targets for therapeutic intervention and allowed the development of screens to identify inhibitors. Here, we described a number of target classes in which genomics has contributed to its identification. As a result of analyzing sequence data, all of the tRNA synthetases and all of the two-component signal transduction systems were readily isolated; which would not have been easily identified if whole genome sequences were not available. Fatty acid biosynthesis is a known antibacterial target, but genomics showed which genes in that pathway had the appropriate spectrum to be considered as therapeutic targets. Genes of unknown function may seem untractable targets, but if those that are broad spectrum and essential are identified, it becomes valuable to invest time and effort to determine their cellular role. In addition, we discuss the role of genomics in developing technologies that assist in the discovery of new antibiotics including microarray gridding technology. Genomics can also increase the chemical diversity against which the novel targets can be screened.
    Chang et al. Virtual Screening for HIV Protease Inhibitors: A Comparison of AutoDock 4 and Vina 2010 PLoS ONE
    Vol. 5(8), pp. e11955 
    article  
    Abstract: Background

    The AutoDock family of software has been widely used in protein-ligand docking research. This study compares AutoDock 4 and AutoDock Vina in the context of virtual screening by using these programs to select compounds active against HIV protease.

    Methodology/Principal Findings

    Both programs were used to rank the members of two chemical libraries, each containing experimentally verified binders to HIV protease. In the case of the NCI Diversity Set II, both AutoDock 4 and Vina were able to select active compounds significantly better than random (AUC = 0.69 and 0.68, respectively; p<0.001). The binding energy predictions were highly correlated in this case, with r = 0.63 and i = 0.82. For a set of larger, more flexible compounds from the Directory of Universal Decoys, the binding energy predictions were not correlated, and only Vina was able to rank compounds significantly better than random.

    Conclusions/Significance

    In ranking smaller molecules with few rotatable bonds, AutoDock 4 and Vina were equally capable, though both exhibited a size-related bias in scoring. However, as Vina executes more quickly and is able to more accurately rank larger molecules, researchers should look to it first when undertaking a virtual screen.

    Chetouani et al. FindTarget: software for subtractive genome analysis. 2001 Microbiology
    Vol. 147(Pt 10), pp. 2643-2649 
    article  
    Abstract: In silico subtractive/differential genome analysis is a powerful approach for identifying genus- or species-specific genes, or groups of genes that are responsible for a unique phenotype. By this method, one searches for genes present in one group of bacteria and absent in another group. A software package has been developed, named FindTarget, that has a user-friendly web interface to facilitate differential genome analysis. The user chooses the genomes to compare, the similarity criteria and the thresholds to decide if a gene has a counterpart in another genome. The searches are based on BLASTP comparisons of proteomes. FindTarget also includes access to sequences, coloured multiple alignments, phylogenetic trees of conserved proteins and links to public annotated databases which provide a means for validation of the results. To illustrate this approach, a FindTarget search for genes putatively involved in the specificity of cell envelope synthesis of Gram-negative bacteria is presented. The results show that most of the identified genes are clearly involved in cell wall processes, underlining the power of such an approach in general and that of FindTarget in particular.
    Clamp et al. The Jalview Java alignment editor. 2004 Bioinformatics
    Vol. 20(3), pp. 426-427 
    article DOI URL 
    Abstract: Multiple sequence alignment remains a crucial method for understanding the function of groups of related nucleic acid and protein sequences. However, it is known that automatic multiple sequence alignments can often be improved by manual editing. Therefore, tools are needed to view and edit multiple sequence alignments. Due to growth in the sequence databases, multiple sequence alignments can often be large and difficult to view efficiently. The Jalview Java alignment editor is presented here, which enables fast viewing and editing of large multiple sequence alignments.
    Cole et al. The Jpred 3 secondary structure prediction server. 2008 Nucleic Acids Res
    Vol. 36(Web Server issue), pp. W197-W201 
    article DOI URL 
    Abstract: Jpred (http://www.compbio.dundee.ac.uk/jpred) is a secondary structure prediction server powered by the Jnet algorithm. Jpred performs over 1000 predictions per week for users in more than 50 countries. The recently updated Jnet algorithm provides a three-state (alpha-helix, beta-strand and coil) prediction of secondary structure at an accuracy of 81.5%. Given either a single protein sequence or a multiple sequence alignment, Jpred derives alignment profiles from which predictions of secondary structure and solvent accessibility are made. The predictions are presented as coloured HTML, plain text, PostScript, PDF and via the Jalview alignment editor to allow flexibility in viewing and applying the data. The new Jpred 3 server includes significant usability improvements that include clearer feedback of the progress or failure of submitted requests. Functional improvements include batch submission of sequences, summary results via email and updates to the search databases. A new software pipeline will enable Jnet/Jpred to continue to be updated in sync with major updates to SCOP and UniProt and so ensures that Jpred 3 will maintain high-accuracy predictions.
    Dale and Schantz From Genes to Genomes: concepts and applications of DNA technology 2007   book  
    Das et al. Global perspectives on proteins: comparing genomes in terms of folds, pathways and beyond. 2001 Pharmacogenomics J
    Vol. 1(2), pp. 115-125 
    article  
    Abstract: The sequencing of complete genomes provides us with a global view of all the proteins in an organism. Proteomic analysis can be done on a purely sequence-based level, with a focus on finding homologues and grouping them into families and clusters of orthologs. However, incorporating protein structure into this analysis provides valuable simplification; it allows one to collect together very distantly related sequences, thus condensing the proteome into a minimal number of 'parts.' We describe issues related to surveying proteomes in terms of structural parts, including methods for fold assignment and formats for comparisons (eg top-10 lists and whole-genome trees), and show how biases in the databases and in sampling can affect these surveys. We illustrate our main points through a case study on the unique protein properties evident in many thermophile genomes (eg more salt bridges). Finally, we discuss metabolic pathways as an even greater simplification of genomes. In comparison to folds these allow the organization of many more genes into coherent systems, yet can nevertheless be understood in many of the same terms.
    Davidov et al. Advancing drug discovery through systems biology. 2003 Drug Discov Today
    Vol. 8(4), pp. 175-183 
    article  
    Abstract: Pharmaceutical companies are facing an urgent need to both increase their lead compound and clinical candidate portfolios and satisfy market demands for continued innovation and revenue growth. Here, we outline an emerging approach that attempts to facilitate and alleviate many of the current drug discovery issues and problems. This is, in part, achieved through the systematic integration of technologies, which results in a superior output of data and information, thereby enhancing our understanding of biological function, chemico-biological interactions and, ultimately, drug discovery. Systems biology is one new discipline that is positioned to significantly impact this process.
    Davies Origins and evolution of antibiotic resistance. 1996 Microbiologia
    Vol. 12(1), pp. 9-16 
    article  
    Abstract: The massive prescription of antibiotics and their non-regulated and extensive usage has resulted in the development of extensive antibiotic resistance in microorganisms; this has been of great clinical significance. Antibiotic resistance occurs not only by mutation of microbial genes which code for antibiotic uptake into cells or the binding sites for antibiotics, but mostly by the acquisition of heterologous resistance genes from external sources. The physical characteristics of the microbial community play a major role in gene exchange, but antimicrobial agents provide the selective pressure for the development of resistance and promote the transfer of resistance genes among bacteria. The control of antibiotic usage is essential to prevent the development of resistance to new antibiotics.
    DeLano The PyMOL Molecular Graphics System 2002   electronic URL 
    DeLeo and Chambers Reemergence of antibiotic-resistant Staphylococcus aureus in the genomics era. 2009 J Clin Invest
    Vol. 119(9), pp. 2464-2474 
    article DOI URL 
    Abstract: Staphylococcus aureus is the leading cause of bacterial infections in developed countries and produces a wide spectrum of diseases, ranging from minor skin infections to fatal necrotizing pneumonia. Although S. aureus infections were historically treatable with common antibiotics, emergence of drug-resistant organisms is now a major concern. Methicillin-resistant S. aureus (MRSA) was endemic in hospitals by the late 1960s, but it appeared rapidly and unexpectedly in communities in the 1990s and is now prevalent worldwide. This Review focuses on progress made toward understanding the success of community-associated MRSA as a human pathogen, with an emphasis on genome-wide approaches and virulence determinants.
    Diep et al. Complete genome sequence of USA300, an epidemic clone of community-acquired meticillin-resistant Staphylococcus aureus. 2006 Lancet
    Vol. 367(9512), pp. 731-739 
    article DOI URL 
    Abstract: BACKGROUND: USA300, a clone of meticillin-resistant Staphylococcus aureus, is a major source of community-acquired infections in the USA, Canada, and Europe. Our aim was to sequence its genome and compare it with those of other strains of S aureus to try to identify genes responsible for its distinctive epidemiological and virulence properties. METHODS: We ascertained the genome sequence of FPR3757, a multidrug resistant USA300 strain, by random shotgun sequencing, then compared it with the sequences of ten other staphylococcal strains. FINDINGS: Compared with closely related S aureus, we noted that almost all of the unique genes in USA300 clustered in novel allotypes of mobile genetic elements. Some of the unique genes are involved in pathogenesis, including Panton-Valentine leucocidin and molecular variants of enterotoxin Q and K. The most striking feature of the USA300 genome is the horizontal acquisition of a novel mobile genetic element that encodes an arginine deiminase pathway and an oligopeptide permease system that could contribute to growth and survival of USA300. We did not detect this element, termed arginine catabolic mobile element (ACME), in other S aureus strains. We noted a high prevalence of ACME in S epidermidis, suggesting not only that ACME transfers into USA300 from S epidermidis, but also that this element confers a selective advantage to this ubiquitous commensal of the human skin. INTERPRETATION: USA300 has acquired mobile genetic elements that encode resistance and virulence determinants that could enhance fitness and pathogenicity.
    Drews Drug discovery: a historical perspective. 2000 Science
    Vol. 287(5460), pp. 1960-1964 
    article  
    Abstract: Driven by chemistry but increasingly guided by pharmacology and the clinical sciences, drug research has contributed more to the progress of medicine during the past century than any other scientific factor. The advent of molecular biology and, in particular, of genomic sciences is having a deep impact on drug discovery. Recombinant proteins and monoclonal antibodies have greatly enriched our therapeutic armamentarium. Genome sciences, combined with bioinformatic tools, allow us to dissect the genetic basis of multifactorial diseases and to determine the most suitable points of attack for future medicines, thereby increasing the number of treatment options. The dramatic increase in the complexity of drug research is enforcing changes in the institutional basis of this interdisciplinary endeavor. The biotech industry is establishing itself as the discovery arm of the pharmaceutical industry. In bridging the gap between academia and large pharmaceutical companies, the biotech firms have been effective instruments of technology transfer.
    Edgar MUSCLE: multiple sequence alignment with high accuracy and high throughput. 2004 Nucleic Acids Res
    Vol. 32(5), pp. 1792-1797 
    article DOI URL 
    Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
    Edgar MUSCLE: a multiple sequence alignment method with reduced time and space complexity. 2004 BMC Bioinformatics
    Vol. 5, pp. 113 
    article DOI URL 
    Abstract: BACKGROUND: In a previous paper, we introduced MUSCLE, a new program for creating multiple alignments of protein sequences, giving a brief summary of the algorithm and showing MUSCLE to achieve the highest scores reported to date on four alignment accuracy benchmarks. Here we present a more complete discussion of the algorithm, describing several previously unpublished techniques that improve biological accuracy and/or computational complexity. We introduce a new option, MUSCLE-fast, designed for high-throughput applications. We also describe a new protocol for evaluating objective functions that align two profiles. RESULTS: We compare the speed and accuracy of MUSCLE with CLUSTALW, Progressive POA and the MAFFT script FFTNS1, the fastest previously published program known to the author. Accuracy is measured using four benchmarks: BAliBASE, PREFAB, SABmark and SMART. We test three variants that offer highest accuracy (MUSCLE with default settings), highest speed (MUSCLE-fast), and a carefully chosen compromise between the two (MUSCLE-prog). We find MUSCLE-fast to be the fastest algorithm on all test sets, achieving average alignment accuracy similar to CLUSTALW in times that are typically two to three orders of magnitude less. MUSCLE-fast is able to align 1,000 sequences of average length 282 in 21 seconds on a current desktop computer. CONCLUSIONS: MUSCLE offers a range of options that provide improved speed and/or alignment accuracy compared with currently available programs. MUSCLE is freely available at http://www.drive5.com/muscle.
    Edgar and Batzoglou Multiple sequence alignment. 2006 Curr Opin Struct Biol
    Vol. 16(3), pp. 368-373 
    article DOI URL 
    Abstract: Multiple sequence alignments are an essential tool for protein structure and function prediction, phylogeny inference and other common tasks in sequence analysis. Recently developed systems have advanced the state of the art with respect to accuracy, ability to scale to thousands of proteins and flexibility in comparing proteins that do not share the same domain architecture. New multiple alignment benchmark databases include PREFAB, SABMARK, OXBENCH and IRMBASE. Although CLUSTALW is still the most popular alignment tool to date, recent methods offer significantly better alignment quality and, in some cases, reduced computational cost.
    Eisen et al. Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. 2006 PLoS Biol
    Vol. 4(9), pp. e286 
    article DOI URL 
    Abstract: The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance.
    Errington et al. Cytokinesis in bacteria. 2003 Microbiol Mol Biol Rev
    Vol. 67(1), pp. 52-65 
    article  
    Abstract: Work on two diverse rod-shaped bacteria, Escherichia coli and Bacillus subtilis, has defined a set of about 10 conserved proteins that are important for cell division in a wide range of eubacteria. These proteins are directed to the division site by the combination of two negative regulatory systems. Nucleoid occlusion is a poorly understood mechanism whereby the nucleoid prevents division in the cylindrical part of the cell, until chromosome segregation has occurred near midcell. The Min proteins prevent division in the nucleoid-free spaces near the cell poles in a manner that is beginning to be understood in cytological and biochemical terms. The hierarchy whereby the essential division proteins assemble at the midcell division site has been worked out for both E. coli and B. subtilis. They can be divided into essentially three classes depending on their position in the hierarchy and, to a certain extent, their subcellular localization. FtsZ is a cytosolic tubulin-like protein that polymerizes into an oligomeric structure that forms the initial ring at midcell. FtsA is another cytosolic protein that is related to actin, but its precise function is unclear. The cytoplasmic proteins are linked to the membrane by putative membrane anchor proteins, such as ZipA of E. coli and possibly EzrA of B. subtilis, which have a single membrane span but a cytoplasmic C-terminal domain. The remaining proteins are either integral membrane proteins or transmembrane proteins with their major domains outside the cell. The functions of most of these proteins are unclear with the exception of at least one penicillin-binding protein, which catalyzes a key step in cell wall synthesis in the division septum.
    Essoussi et al. A comparison of MSA tools. 2008 Bioinformation
    Vol. 2(10), pp. 452-455 
    article  
    Abstract: Multiple sequence alignment (MSA) is essential in phylogenetic, evolutionary and functional analysis. Several MSA tools are available in the literature. Here, we use several MSA tools such as ClustalX, Align-m, T-Coffee, SAGA, ProbCons, MAFFT, MUSCLE and DIALIGN to illustrate comparative phylogenetic trees analysis for two datasets. Results show that there is no single MSA tool that consistently outperforms the rest in producing reliable phylogenetic trees.
    Feng and Doolittle Progressive sequence alignment as a prerequisite to correct phylogenetic trees. 1987 J Mol Evol
    Vol. 25(4), pp. 351-360 
    article  
    Abstract: A progressive alignment method is described that utilizes the Needleman and Wunsch pairwise alignment algorithm iteratively to achieve the multiple alignment of a set of protein sequences and to construct an evolutionary tree depicting their relationship. The sequences are assumed a priori to share a common ancestor, and the trees are constructed from difference matrices derived directly from the multiple alignment. The thrust of the method involves putting more trust in the comparison of recently diverged sequences than in those evolved in the distant past. In particular, this rule is followed: "once a gap, always a gap." The method has been applied to three sets of protein sequences: 7 superoxide dismutases, 11 globins, and 9 tyrosine kinase-like sequences. Multiple alignments and phylogenetic trees for these sets of sequences were determined and compared with trees derived by conventional pairwise treatments. In several instances, the progressive method led to trees that appeared to be more in line with biological expectations than were trees obtained by more commonly used methods.
    Fischer and Hubbard Fragment-based ligand discovery. 2009 Mol Interv
    Vol. 9(1), pp. 22-30 
    article DOI URL 
    Abstract: From home building and decor to mass production, modular design is a standard feature of the modern age. The concept also promises to define drug discovery efforts in the near future, as a wide range of methodologies, from NMR to X-ray crystallography, are being adapted to high-throughput platforms. In particular, "fragment-based ligand discovery" describes the laboratory-driven evolution of drugs from libraries of chemical building blocks. "Evolution" is an apt word for the process, as a wide array of methods are used to define how compound fragments can be best fit into the binding sites of medically relevant target biomolecules. A number of compounds that evolved from fragments have entered the clinic, and the approach is increasingly accepted as an additional route to identifying new hit compounds in pharmaceutical discovery and inhibitor design.
    Fleming On the antibacterial action of cultures of a penicillium, with special reference to their use in the isolation of B. influenzae 1929 Br. J. Exp. Pathol.
    Vol. 10, pp. 226-236 
    article  
    Galperin and Koonin Searching for drug targets in microbial genomes. 1999 Curr Opin Biotechnol
    Vol. 10(6), pp. 571-578 
    article  
    Abstract: Comparative analysis of the complete genome sequences of 10 bacterial pathogens available in the public databases offers the first insights into the drug discovery approaches of the near future. Genes that are conserved in different genomes often turn out to be essential, which makes them attractive targets for new broad-spectrum antibiotics. Subtractive genome analysis reveals the genes that are conserved in all or most of the pathogenic bacteria but not in eukaryotes; these are the most obvious candidates for drug targets. Species-specific genes, on the other hand, may offer the possibility to design drugs against a particular, narrow group of pathogens.
    Geysen et al. Combinatorial compound libraries for drug discovery: an ongoing challenge. 2003 Nat Rev Drug Discov
    Vol. 2(3), pp. 222-230 
    article DOI URL 
    Abstract: Almost 20 years of combinatorial chemistry have emphasized the power of numbers, a key issue for drug discovery in the current genomic era, in which it has been estimated that there might be more than 10,000 potential targets for which it would be desirable to have small-molecule modulators. Combinatorial chemistry is best described as the industrialization of chemistry; the chemistry has not changed, just the way in which it is now carried out, which is principally by exploiting instrumentation and robotics coupled to the extensive use of computers to efficiently control the process and analyse the vast amounts of resulting data. Many researchers have contributed to the general concepts as well as to the technologies in present use. However, some interesting challenges still remain to be solved, and these are discussed here in the context of the application of combinatorial chemistry to drug discovery.
    Glass et al. Essential genes of a minimal bacterium. 2006 Proc Natl Acad Sci U S A
    Vol. 103(2), pp. 425-430 
    article DOI URL 
    Abstract: Mycoplasma genitalium has the smallest genome of any organism that can be grown in pure culture. It has a minimal metabolism and little genomic redundancy. Consequently, its genome is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. Using global transposon mutagenesis, we isolated and characterized gene disruption mutants for 100 different nonessential protein-coding genes. None of the 43 RNA-coding genes were disrupted. Herein, we identify 382 of the 482 M. genitalium protein-coding genes as essential, plus five sets of disrupted genes that encode proteins with potentially redundant essential functions, such as phosphate transport. Genes encoding proteins of unknown function constitute 28% of the essential protein-coding genes set. Disruption of some genes accelerated M. genitalium growth.
    Glavas and Tanner Active site residues of glutamate racemase. 2001 Biochemistry
    Vol. 40(21), pp. 6199-6204 
    article  
    Abstract: Glutamate racemase, MurI, catalyzes the interconversion of glutamate enantiomers in a cofactor-independent fashion and provides bacteria with a source of D-Glu for use in peptidoglycan biosynthesis. The enzyme uses a "two-base" mechanism involving a deprotonation of the substrate at the alpha-position to form an anionic intermediate, followed by a reprotonation in the opposite stereochemical sense. In the Lactobacillus fermenti enzyme, Cys73 is responsible for the deprotonation of D-glutamate, and Cys184 is responsible for the deprotonation of L-glutamate; however, very little is known about the roles of other active site residues. This work describes the preparation of four mutants in which strictly conserved residues containing ionizable side chains were modified (D10N, D36N, E152Q, and H186N). During the course of this research, the structural analysis of a crystallized glutamate racemase indicated that three of these residues (D10, E152, and H186) are in the active site of the enzyme [Hwang, K. Y., Cho, C.-S., Kim, S. S., Sung, H.-C., Yu, Y. G., and Cho, Y. (1999) Nat. Struct. Biol. 6, 422-426]. Two of the mutants, D10N and H186N, displayed a marked decrease in the values of k(cat), but not K(M), and are therefore implicated as important catalytic residues. Further analysis of the primary kinetic isotope effects observed with alpha-deuterated substrates showed that a significant asymmetry was introduced into the free energy profile by these two mutations. This is interpreted as evidence that the mutated residues normally assist the catalytic thiols in acting as bases (D10 with C73 and H186 with C184). An alternate possibility is that the residues may serve to stabilize the carbanionic intermediate in the racemization reaction.
    Goehring and Beckwith Diverse paths to midcell: assembly of the bacterial cell division machinery. 2005 Curr Biol
    Vol. 15(13), pp. R514-R526 
    article DOI URL 
    Abstract: At the heart of bacterial cell division is a dynamic ring-like structure of polymers of the tubulin homologue FtsZ. This ring forms a scaffold for assembly of at least ten additional proteins at midcell, the majority of which are likely to be involved in remodeling the peptidoglycan cell wall at the division site. Together with FtsZ, these proteins are thought to form a cell division complex, or divisome. In Escherichia coli, the components of the divisome are recruited to midcell according to a strikingly linear hierarchy that predicts a step-wise assembly pathway. However, recent studies have revealed unexpected complexity in the assembly steps, indicating that the apparent linearity does not necessarily reflect a temporal order. The signals used to recruit cell division proteins to midcell are diverse and include regulated self-assembly, protein-protein interactions, and the recognition of specific septal peptidoglycan substrates. There is also evidence for a complex web of interactions among these proteins and at least one distinct subcomplex of cell division proteins has been defined, which is conserved among E. coli, Bacillus subtilis and Streptococcus pneumoniae.
    Golubchik et al. Mind the gaps: evidence of bias in estimates of multiple sequence alignments. 2007 Mol Biol Evol
    Vol. 24(11), pp. 2433-2442 
    article DOI URL 
    Abstract: Multiple sequence alignment (MSA) is a crucial first step in the analysis of genomic and proteomic data. Commonly occurring sequence features, such as deletions and insertions, are known to affect the accuracy of MSA programs, but the extent to which alignment accuracy is affected by the positions of insertions and deletions has not been examined independently of other sources of sequence variation. We assessed the performance of 6 popular MSA programs (ClustalW, DIALIGN-T, MAFFT, MUSCLE, PROBCONS, and T-COFFEE) and one experimental program, PRANK, on amino acid sequences that differed only by short regions of deleted residues. The analysis showed that the absence of residues often led to an incorrect placement of gaps in the alignments, even though the sequences were otherwise identical. In data sets containing sequences with partially overlapping deletions, most MSA programs preferentially aligned the gaps vertically at the expense of incorrectly aligning residues in the flanking regions. Of the programs assessed, only DIALIGN-T was able to place overlapping gaps correctly relative to one another, but this was usually context dependent and was observed only in some of the data sets. In data sets containing sequences with non-overlapping deletions, both DIALIGN-T and MAFFT (G-INS-I) were able to align gaps with near-perfect accuracy, but only MAFFT produced the correct alignment consistently. The same was true for data sets that comprised isoforms of alternatively spliced gene products: both DIALIGN-T and MAFFT produced highly accurate alignments, with MAFFT being the more consistent of the 2 programs. Other programs, notably T-COFFEE and ClustalW, were less accurate. For all data sets, alignments produced by different MSA programs differed markedly, indicating that reliance on a single MSA program may give misleading results. It is therefore advisable to use more than one MSA program when dealing with sequences that may contain deletions or insertions, particularly for high-throughput and pipeline applications where manual refinement of each alignment is not practicable.
    Goodsell and Olson Automated docking of substrates to proteins by simulated annealing. 1990 Proteins
    Vol. 8(3), pp. 195-202 
    article DOI URL 
    Abstract: The Metropolis technique of conformation searching is combined with rapid energy evaluation using molecular affinity potentials to give an efficient procedure for docking substrates to macromolecules of known structure. The procedure works well on a number of crystallographic test systems, functionally reproducing the observed binding modes of several substrates.
    Gotoh An improved algorithm for matching biological sequences. 1982 J Mol Biol
    Vol. 162(3), pp. 705-708 
    article  
    Grasso and Lee Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems. 2004 Bioinformatics
    Vol. 20(10), pp. 1546-1556 
    article DOI URL 
    Abstract: MOTIVATION: Partial order alignment (POA) has been proposed as a new approach to multiple sequence alignment (MSA), which can be combined with existing methods such as progressive alignment. This is important for addressing problems both in the original version of POA (such as order sensitivity) and in standard progressive alignment programs (such as information loss in complex alignments, especially surrounding gap regions). RESULTS: We have developed a new Partial Order-Partial Order alignment algorithm that optimally aligns a pair of MSAs and which therefore can be applied directly to progressive alignment methods such as CLUSTAL. Using this algorithm, we show the combined Progressive POA alignment method yields results comparable with the best available MSA programs (CLUSTALW, DIALIGN2, T-COFFEE) but is far faster. For example, depending on the level of sequence similarity, aligning 1000 sequences, each 500 amino acids long, took 15 min (at 90% average identity) to 44 min (at 30% identity) on a standard PC. For large alignments, Progressive POA was 10-30 times faster than the fastest of the three previous methods (CLUSTALW). These data suggest that POA-based methods can scale to much larger alignment problems than possible for previous methods. AVAILABILITY: The POA source code is available at http://www.bioinformatics.ucla.edu/poa
    Guzmán et al. Completely sequenced genomes of pathogenic bacteria: a review. 2008 Enferm Infecc Microbiol Clin
    Vol. 26(2), pp. 88-98 
    article  
    Abstract: Six out of ten completely sequenced bacterial genomes are pathogenic or opportunistic bacteria. The genome sequence of at least one strain of all the principal pathogenic bacteria will soon be available. This information should enable us to identify genes that encode virulence factors. As these genes are potential targets for drugs and vaccines, their identification should have considerable repercussions on prevention, diagnosis, and treatment of the main bacterial infectious diseases. Comparison of genome sequences of several strains of the same species should allow identification of the genetic clues responsible for the differing behavior of related bacterial pathogens. This article reviews the genomes from pathogenic bacteria that have been or are currently being sequenced, describes the main tasks to be accomplished after a genome sequence becomes available, and discusses the benefits of having the genome sequence of bacterial pathogens.
    Hacker and Kaper Pathogenicity islands and the evolution of microbes. 2000 Annu Rev Microbiol
    Vol. 54, pp. 641-679 
    article DOI URL 
    Abstract: Virulence factors of pathogenic bacteria (adhesins, toxins, invasins, protein secretion systems, iron uptake systems, and others) may be encoded by particular regions of the prokaryotic genome termed pathogenicity islands. Pathogenicity islands were first described in human pathogens of the species Escherichia coli, but have recently been found in the genomes of various pathogens of humans, animals, and plants. Pathogenicity islands comprise large genomic regions [10-200 kilobases (kb) in size] that are present on the genomes of pathogenic strains but absent from the genomes of nonpathogenic members of the same or related species. The finding that the G+C content of pathogenicity islands often differs from that of the rest of the genome, the presence of direct repeats at their ends, the association of pathogenicity islands with transfer RNA genes, the presence of integrase determinants and other mobility loci, and their genetic instability argue for the generation of pathogenicity islands by horizontal gene transfer, a process that is well known to contribute to microbial evolution. In this article we review these and other aspects of pathogenicity islands and discuss the concept that they represent a subclass of genomic islands. Genomic islands are present in the majority of genomes of pathogenic as well as nonpathogenic bacteria and may encode accessory functions which have been previously spread among bacterial populations.
    Handford et al. Conserved network of proteins essential for bacterial viability. 2009 J Bacteriol
    Vol. 191(15), pp. 4732-4749 
    article DOI URL 
    Abstract: The yjeE, yeaZ, and ygjD genes are highly conserved in the genomes of eubacteria, and ygjD orthologs are also found throughout the Archaea and eukaryotes. In this study, we have constructed conditional expression strains for each of these genes in the model organism Escherichia coli K12. We show that each gene is essential for the viability of E. coli under laboratory growth conditions. Growth of the conditional strains under nonpermissive conditions results in dramatic changes in cell ultrastructure. Deliberate repression of the expression of yeaZ results in cells with highly condensed nucleoids, while repression of yjeE and ygjD expression results in at least a proportion of very enlarged cells with an unusual peripheral distribution of DNA. Each of the three conditional expression strains can be complemented by multicopy clones harboring the rstA gene, which encodes a two-component-system response regulator, strongly suggesting that these proteins are involved in the same essential cellular pathway. The results of bacterial two-hybrid experiments show that YeaZ can interact with both YjeE and YgjD but that YgjD is the preferred interaction partner. The results of in vitro experiments indicate that YeaZ mediates the proteolysis of YgjD, suggesting that YeaZ and YjeE act as regulators to control the activity of this protein. Our results are consistent with these proteins forming a link between DNA metabolism and cell division.
    Hawkey The origins and molecular basis of antibiotic resistance. 1998 BMJ
    Vol. 317(7159), pp. 657-660 
    article  
    Haydon et al. An inhibitor of FtsZ with potent and selective anti-staphylococcal activity. 2008 Science
    Vol. 321(5896), pp. 1673-1675 
    article DOI URL 
    Abstract: FtsZ is an essential bacterial guanosine triphosphatase and homolog of mammalian beta-tubulin that polymerizes and assembles into a ring to initiate cell division. We have created a class of small synthetic antibacterials, exemplified by PC190723, which inhibits FtsZ and prevents cell division. PC190723 has potent and selective in vitro bactericidal activity against staphylococci, including methicillin- and multi-drug-resistant Staphylococcus aureus. The putative inhibitor-binding site of PC190723 was mapped to a region of FtsZ that is analogous to the Taxol-binding site of tubulin. PC190723 was efficacious in an in vivo model of infection, curing mice infected with a lethal dose of S. aureus. The data validate FtsZ as a target for antibacterial intervention and identify PC190723 as suitable for optimization into a new anti-staphylococcal therapy.
    He and Zhang Why do hubs tend to be essential in protein networks? 2006 PLoS Genet
    Vol. 2(6), pp. e88 
    article DOI URL 
    Abstract: The protein-protein interaction (PPI) network has a small number of highly connected protein nodes (known as hubs) and many poorly connected nodes. Genome-wide studies show that deletion of a hub protein is more likely to be lethal than deletion of a non-hub protein, a phenomenon known as the centrality-lethality rule. This rule is widely believed to reflect the special importance of hubs in organizing the network, which in turn suggests the biological significance of network architectures, a key notion of systems biology. Despite the popularity of this explanation, the underlying cause of the centrality-lethality rule has never been critically examined. We here propose the concept of essential PPIs, which are PPIs that are indispensable for the survival or reproduction of an organism. Our network analysis suggests that the centrality-lethality rule is unrelated to the network architecture, but is explained by the simple fact that hubs have large numbers of PPIs, therefore high probabilities of engaging in essential PPIs. We estimate that approximately 3% of PPIs are essential in the yeast, accounting for approximately 43% of essential genes. As expected, essential PPIs are evolutionarily more conserved than nonessential PPIs. Considering the role of essential PPIs in determining gene essentiality, we find the yeast PPI network functionally more robust than random networks, yet far less robust than the potential optimum. These and other findings provide new perspectives on the biological relevance of network structure and robustness.
    Heringa Local weighting schemes for protein multiple sequence alignment. 2002 Comput Chem
    Vol. 26(5), pp. 459-477 
    article  
    Abstract: This paper describes three weighting schemes for improving the accuracy of progressive multiple sequence alignment methods: (1) global profile pre-processing, to capture for each sequence information about other sequences in a profile before the actual multiple alignment takes place; (2) local pre-processing, which incorporates a new protocol to only use non-overlapping local sequence regions to construct the pre-processed profiles; and (3) local-global alignment, a weighting scheme based on the double dynamic programming (DDP) technique to softly bias global alignment to local sequence motifs. The first two schemes allow the compilation of residue-specific multiple alignment reliability indices, which can be used in an iterative fashion. The schemes have been implemented with associated iterative modes in the PRALINE multiple sequence alignment method, and have been evaluated using the BAliBASE benchmark alignment database. These tests indicate that PRALINE is a toolbox able to build alignments with very high quality. We found that local profile pre-processing raises the alignment quality by 5.5% compared to PRALINE alignments generated under default conditions. Iteration enhances the quality by a further percentage point. The implications of multiple alignment scoring functions and iteration in relation to alignment quality and benchmarking are discussed.
    Hermans et al. Distribution of prophages and SGI-1 antibiotic-resistance genes among different Salmonella enterica serovar Typhimurium isolates. 2006 Microbiology
    Vol. 152(Pt 7), pp. 2137-2147 
    article DOI URL 
    Abstract: Recently, the authors identified Salmonella enterica serovar Typhimurium (S. Typhimurium) definitive type (DT)104-specific sequences of mainly prophage origin by genomic subtractive hybridization. In the present study, the distribution of the prophages identified, ST104 and ST64B, and the novel prophage remnant designated prophage ST104B, was tested among 23 non-DT104 S. Typhimurium isolates of different phage types and 19 isolates of the DT104 subtypes DT104A, DT104B low and DT104L, and the DT104-related type U302. The four S. Typhimurium prophages Gifsy-1, Gifsy-2, Fels-1 and Fels-2 were also included. Analysis of prophage distribution in different S. Typhimurium isolates may supply additional information to enable development of a molecular method as an alternative to phage typing. Furthermore, the presence of the common DT104 antibiotic resistance genes for the penta-resistance type ACSSuT, aadA2, floR, pse-1, sul1 and tet(G), was also studied because of the authors' focus on this emerging type. Based on differences in prophage presence within their genome, it was possible to divide S. Typhimurium isolates into 12 groups. Although no clear relationship was found between different phage type and prophage presence, discrimination could be made between the different DT104 subtypes based on diversity in the presence of prophages ST104, ST104B and ST64B. The novel prophage remnant ST104B, which harbours a homologue of the Escherichia coli O157 : H7 HldD LPS assembly-related protein, was identified only in the 14 DT104L isolates and in the DT104-related U302 isolate. In conclusion, the presence of the genes for penta-resistance type ACSSuT, the HldD homologue containing ST104 prophage remnant and phage type DT104L are most likely common features of the emerging subtype of S. Typhimurium DT104.
    Higgins and Sharp CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. 1988 Gene
    Vol. 73(1), pp. 237-244 
    article  
    Abstract: An approach for performing multiple alignments of large numbers of amino acid or nucleotide sequences is described. The method is based on first deriving a phylogenetic tree from a matrix of all pairwise sequence similarity scores, obtained using a fast pairwise alignment algorithm. Then the multiple alignment is achieved from a series of pairwise alignments of clusters of sequences, following the order of branching in the tree. The method is sufficiently fast and economical with memory to be easily implemented on a microcomputer, and yet the results obtained are comparable to those from packages requiring mainframe computer facilities.
    Higgins et al. Using CLUSTAL for multiple sequence alignments. 1996 Methods Enzymol
    Vol. 266, pp. 383-402 
    article  
    Abstract: We have tested CLUSTAL W in a wide variety of situations, and it is capable of handling some very difficult protein alignment problems. If the data set consists of enough closely related sequences so that the first alignments are accurate, then CLUSTAL W will usually find an alignment that is very close to ideal. Problems can still occur if the data set includes sequences of greatly different lengths or if some sequences include long regions that are impossible to align with the rest of the data set. Trying to balance the need for long insertions and deletions in some alignments with the need to avoid them in others is still a problem. The default values for our parameters were tested empirically using test cases of sets of globular proteins where some information as to the correct alignment was available. The parameter values may not be very appropriate with nonglobular proteins. We have argued that using one weight matrix and two gap penalties is too simplistic to be of general use in the most difficult cases. We have replaced these parameters with a large number of new parameters designed primarily to help encourage gaps in loop regions. Although these new parameters are largely heuristic in nature, they perform surprisingly well and are simple to implement. The underlying speed of the progressive alignment approach is not adversely affected. The disadvantage is that the parameter space is now huge; the number of possible combinations of parameters is more than can easily be examined by hand. We justify this by asking the user to treat CLUSTAL W as a data exploration tool rather than as a definitive analysis method. It is not sensible to automatically derive multiple alignments and to trust particular algorithms as being capable of always getting the correct answer. One must examine the alignments closely, especially in conjunction with the underlying phylogenetic tree (or estimate of it) and try varying some of the parameters. 
Outliers (sequences that have no close relatives) should be aligned carefully, as should fragments of sequences. The program will automatically delay the alignment of any sequences that are less than 40% identical to any others until all other sequences are aligned, but this can be set from a menu by the user. It may be useful to build up an alignment of closely related sequences first and to then add in the more distant relatives one at a time or in batches, using the profile alignments and weighting scheme described earlier and perhaps using a variety of parameter settings. We give one example using SH2 domains. SH2 domains are widespread in eukaryotic signalling proteins where they function in the recognition of phosphotyrosine-containing peptides. In the chapter by Bork and Gibson ([11], this volume), Blast and pattern/profile searches were used to extract the set of known SH2 domains and to search for new members. (Profiles used in database searches are conceptually very similar to the profiles used in CLUSTAL W: see the chapters [11] and [13] for profile search methods.) The profile searches detected SH2 domains in the JAK family of protein tyrosine kinases, which were thought not to contain SH2 domains. Although the JAK family SH2 domains are rather divergent, they have the necessary core structural residues as well as the critical positively charged residue that binds phosphotyrosine, leaving no doubt that they are bona fide SH2 domains. The five new JAK family SH2 domains were added sequentially to the existing alignment of 65 SH2 domains using the CLUSTAL W profile alignment option. Figure 6 shows part of the resulting alignment. Despite their divergent sequences, the new SH2 domains have been aligned nearly perfectly with the old set. No insertions were placed in the original SH2 domains. In this example, the profile alignment procedure has produced better results than a one-step full alignment of all 70 SH2 domains, and in considerably less time. 
(ABSTRACT TRUNCATED)
    Hogan Directed combinatorial chemistry. 1996 Nature
    Vol. 384(6604 Suppl), pp. 17-19 
    article  
    Abstract: Combinatorial chemistry has given chemists access to vast numbers of molecules, but selecting the right one has proved more difficult. As chemists have gained experience in the technique, however, it has become possible to use solid- or solution-phase syntheses with different chemistries and scaffolds to produce libraries tailor-made for finding or optimizing a lead directed at almost any class of target.
    Hopkins Network pharmacology: the next paradigm in drug discovery. 2008 Nat Chem Biol
    Vol. 4(11), pp. 682-690 
    article DOI URL 
    Abstract: The dominant paradigm in drug discovery is the concept of designing maximally selective ligands to act on individual drug targets. However, many effective drugs act via modulation of multiple proteins rather than single targets. Advances in systems biology are revealing a phenotypic robustness and a network structure that strongly suggests that exquisitely selective compounds, compared with multitarget drugs, may exhibit lower than desired clinical efficacy. This new appreciation of the role of polypharmacology has significant implications for tackling the two major sources of attrition in drug development--efficacy and toxicity. Integrating network biology and polypharmacology holds the promise of expanding the current opportunity space for druggable targets. However, the rational design of polypharmacology faces considerable challenges in the need for new methods to validate target combinations and optimize multiple structure-activity relationships while maintaining drug-like properties. Advances in these areas are creating the foundation of the next paradigm in drug discovery: network pharmacology.
    Howe et al. Vancomycin susceptibility within methicillin-resistant Staphylococcus aureus lineages. 2004 Emerg Infect Dis
    Vol. 10(5), pp. 855-857 
    article  
    Abstract: Methicillin-resistant Staphylococcus aureus (MRSA) with reduced vancomycin susceptibility (vancomycin-intermediate S. aureus, VISA) has been reported from many countries. Whether resistance is evolving regularly in different genetic backgrounds or in a single clone with a genetic predisposition, as early results suggest, is unclear. We have studied 101 MRSA with reduced vancomycin susceptibility from nine countries by multilocus sequence typing (MLST), characterization of SCCmec (staphylococcal chromosomal cassette mec), and agr (accessory gene regulator). We found nine genotypes by MLST, with isolates within all five major hospital MRSA lineages. Most isolates (88/101) belonged to two of the earliest MRSA clones that have global prevalence. Our results show that reduced susceptibility to vancomycin has emerged in many successful epidemic lineages with no clear clonal disposition. Increasing antimicrobial resistance in genetically distinct pandemic clones may lead to MRSA infections that will become increasingly difficult to treat.
    Huey et al. A semiempirical free energy force field with charge-based desolvation. 2007 J Comput Chem
    Vol. 28(6), pp. 1145-1152 
    article DOI URL 
    Abstract: The authors describe the development and testing of a semiempirical free energy force field for use in AutoDock4 and similar grid-based docking methods. The force field is based on a comprehensive thermodynamic model that allows incorporation of intramolecular energies into the predicted free energy of binding. It also incorporates a charge-based method for evaluation of desolvation designed to use a typical set of atom types. The method has been calibrated on a set of 188 diverse protein-ligand complexes of known structure and binding energy, and tested on a set of 100 complexes of ligands with retroviral proteases. The force field shows improvement in redocking simulations over the previous AutoDock3 force field.
    Irwin and Shoichet ZINC--a free database of commercially available compounds for virtual screening. 2005 J Chem Inf Model
    Vol. 45(1), pp. 177-182 
    article DOI URL 
    Abstract: A critical barrier to entry into structure-based virtual screening is the lack of a suitable, easy to access database of purchasable compounds. We have therefore prepared a library of 727,842 molecules, each with 3D structure, using catalogs of compounds from vendors (the size of this library continues to grow). The molecules have been assigned biologically relevant protonation states and are annotated with properties such as molecular weight, calculated LogP, and number of rotatable bonds. Each molecule in the library contains vendor and purchasing information and is ready for docking using a number of popular docking programs. Within certain limits, the molecules are prepared in multiple protonation states and multiple tautomeric forms. In one format, multiple conformations are available for the molecules. This database is available for free download (http://zinc.docking.org) in several common file formats including SMILES, mol2, 3D SDF, and DOCK flexibase format. A Web-based query tool incorporating a molecular drawing interface enables the database to be searched and browsed and subsets to be created. Users can process their own molecules by uploading them to a server. Our hope is that this database will bring virtual screening libraries to a wide community of structural biologists and medicinal chemists.
    Jagusztyn-Krynicka and Wyszynska The decline of antibiotic era--new approaches for antibacterial drug discovery. 2008 Pol J Microbiol
    Vol. 57(2), pp. 91-98 
    article  
    Abstract: Infectious diseases still remain the main cause of premature human deaths, especially in developing countries. The emergence and spread of pathogenic bacteria resistant to many antibiotics (multidrug-resistant strains) have created the need for the development of novel therapeutic agents. Only two new classes of antibiotics with novel mechanisms of action (linezolid and daptomycin) have been introduced into the market during the last three decades. The recent progress in molecular biology and bacterial genome analysis has had an enormous impact on antibacterial drug research. This review presents new achievements in the search for new essential bacterial genes, which are potential targets for antibacterial drugs. The application of a metagenomics strategy is also shown. Some recent technologies aimed at the development of anti-pathogenic drugs, such as inhibitors of the quorum sensing process or of histidine kinases, are also discussed. Extensive research efforts have provided many details concerning the structure of bacterial proteins playing an important role in pathogenesis, such as adherence proteins or toxins, which has allowed the search for antitoxin drugs or drugs interfering with bacterial adhesion. As an example, the review focuses on anthrax therapies under development. Additionally, the article presents the progress in phage therapy, which uses bacteriophages or their products, such as lysins, in antibacterial therapy.
    Ji et al. Identification of critical staphylococcal genes using conditional phenotypes generated by antisense RNA. 2001 Science
    Vol. 293(5538), pp. 2266-2269 
    article DOI URL 
    Abstract: Comprehensive genomic analysis of the important human pathogen Staphylococcus aureus was achieved by a strategy involving antisense technology in a regulatable gene expression system. In addition to known essential genes, many genes of unknown or poorly defined biological function were identified. This methodology allowed gene function to be characterized in a comprehensive, defined set of conditionally growth-defective/lethal isogenic strains. Quantitative titration of the conditional growth effect was performed either in bacterial culture or in an animal model of infection. This genomic strategy offers an approach to the identification of staphylococcal gene products that could serve as targets for antibiotic discovery.
    Jones et al. Epidemiologic trends in nosocomial and community-acquired infections due to antibiotic-resistant gram-positive bacteria: the role of streptogramins and other newer compounds. 1999 Diagn Microbiol Infect Dis
    Vol. 33(2), pp. 101-112 
    article  
    Abstract: The Gram-positive cocci have clearly re-emerged as important pathogens world-wide in the past two decades. Staphylococci, including the coagulase-negative staphylococci and Staphylococcus aureus, and the enterococci account for approximately one-third of all blood stream infections and as much as 50% of nosocomial blood stream infections. Although Streptococcus pneumoniae is often considered a community-acquired pathogen, it is also an important cause of nosocomial infection. The hallmark of these Gram-positive pathogens is increasing resistance to available antimicrobial agents. Of particular note is resistance to glycopeptides (vancomycin and teicoplanin), aminoglycosides (high-level), and penicillins among the enterococci (especially E. faecium), resistance to penicillinase-resistant penicillins (oxacillin and methicillin) and fluoroquinolones (ciprofloxacin and ofloxacin) among staphylococci, and resistance to penicillin, other beta-lactams and macrolides among the pneumococci. The recent detection of decreased susceptibility to vancomycin among S. aureus is also quite ominous. In many instances the ability of the clinical laboratory to accurately characterize these resistant isolates is suboptimal, further compounding the problem. Increased understanding of resistance mechanisms and correlations of resistance genes with the phenotypic expression of resistance has allowed for modifications and improvements of reference susceptibility tests and interpretive breakpoints. New compounds for effective therapy of infection with multi-resistant Gram-positive species are clearly needed. To this end, the streptogramin combination, quinupristin/dalfopristin, has demonstrated significant activity against oxacillin-resistant staphylococci, penicillin-resistant streptococci, and vancomycin-resistant E. faecium. 
Other candidate drugs including Gram-positive active fluoroquinolones (clinafloxacin, grepafloxacin, moxifloxacin, gatifloxacin, and trovafloxacin) and novel compounds such as the everninomicin derivatives (SCH27899), ketolides, and oxazolidinones (linezolid) have been shown to be active against these organisms and are under rapid clinical development.
    Jordan et al. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. 2002 Genome Res
    Vol. 12(6), pp. 962-968 
    article DOI URL 
    Abstract: The "knockout-rate" prediction holds that essential genes should be more evolutionarily conserved than are nonessential genes. This is because negative (purifying) selection acting on essential genes is expected to be more stringent than that for nonessential genes, which are more functionally dispensable and/or redundant. However, a recent survey of evolutionary distances between Saccharomyces cerevisiae and Caenorhabditis elegans proteins did not reveal any difference between the rates of evolution for essential and nonessential genes. An analysis of mouse and rat orthologous genes also found that essential and nonessential genes evolved at similar rates when genes thought to evolve under directional selection were excluded from the analysis. In the present study, we combine genomic sequence data with experimental knockout data to compare the rates of evolution and the levels of selection for essential versus nonessential bacterial genes. In contrast to the results obtained for eukaryotic genes, essential bacterial genes appear to be more conserved than are nonessential genes over both relatively short (microevolutionary) and longer (macroevolutionary) time scales.
    Jorgensen Efficient drug lead discovery and optimization. 2009 Acc Chem Res
    Vol. 42(6), pp. 724-733 
    article DOI URL 
    Abstract: During the 1980s, advances in the abilities to perform computer simulations of chemical and biomolecular systems and to calculate free energy changes led to the expectation that such methodology would soon show great utility for guiding molecular design. Important potential applications included design of selective receptors, catalysts, and regulators of biological function including enzyme inhibitors. This time also saw the rise of high-throughput screening and combinatorial chemistry along with complementary computational methods for de novo design and virtual screening including docking. These technologies appeared poised to deliver diverse lead compounds for any biological target. As with many technological advances, realization of the expectations required significant additional effort and time. However, as summarized here, striking success has now been achieved for computer-aided drug lead generation and optimization. De novo design using both molecular growing and docking are illustrated for lead generation, and lead optimization features free energy perturbation calculations in conjunction with Monte Carlo statistical mechanics simulations for protein-inhibitor complexes in aqueous solution. The specific applications are to the discovery of non-nucleoside inhibitors of HIV reverse transcriptase (HIV-RT) and inhibitors of the binding of the proinflammatory cytokine MIF to its receptor CD74. A standard protocol is presented that includes scans for possible additions of small substituents to a molecular core, interchange of heterocycles, and focused optimization of substituents at one site. Initial leads with activities at low-micromolar concentrations have been advanced rapidly to low-nanomolar inhibitors.
    Justice et al. Cell division inhibitors SulA and MinC/MinD block septum formation at different steps in the assembly of the Escherichia coli division machinery. 2000 Mol Microbiol
    Vol. 37(2), pp. 410-423 
    article  
    Abstract: SulA and MinCD are specific inhibitors of cell division in Escherichia coli. In this paper, size exclusion chromatography was used to study the effect of the SulA and MinCD division inhibitors on the oligomerization state of endogenous FtsZ in cytoplasmic extracts, and immunofluorescence microscopy was used to determine the effect of SulA and MinCD on the formation of FtsZ, FtsA and ZipA rings at potential division sites. SulA prevented the formation of high-molecular-weight FtsZ polymers by interfering with FtsZ dimerization and subsequent oligomerization. In contrast, the MinCD division inhibitor did not prevent the oligomerization of FtsZ in the cell extracts or the formation of FtsZ and ZipA ring structures in vivo. However, MinCD did prevent the formation of FtsA rings. Increased expression of ftsA suppressed MinCD-induced division inhibition, but had no effect on SulA-induced division inhibition. These results indicate that MinCD blocks the assembly of the septation machinery at a later step than SulA, at the stage at which FtsA is added to the FtsZ ring.
    Kalman et al. Comparative genomes of Chlamydia pneumoniae and C. trachomatis. 1999 Nat Genet
    Vol. 21(4), pp. 385-389 
    article DOI URL 
    Abstract: Chlamydia are obligate intracellular eubacteria that are phylogenetically separated from other bacterial divisions. C. trachomatis and C. pneumoniae are both pathogens of humans but differ in their tissue tropism and spectrum of diseases. C. pneumoniae is a newly recognized species of Chlamydia that is a natural pathogen of humans, and causes pneumonia and bronchitis. In the United States, approximately 10% of pneumonia cases and 5% of bronchitis cases are attributed to C. pneumoniae infection. Chronic disease may result following respiratory-acquired infection, such as reactive airway disease, adult-onset asthma and potentially lung cancer. In addition, C. pneumoniae infection has been associated with atherosclerosis. C. trachomatis infection causes trachoma, an ocular infection that leads to blindness, and sexually transmitted diseases such as pelvic inflammatory disease, chronic pelvic pain, ectopic pregnancy and epididymitis. Although relatively little is known about C. trachomatis biology, even less is known concerning C. pneumoniae. Comparison of the C. pneumoniae genome with the C. trachomatis genome will provide an understanding of the common biological processes required for infection and survival in mammalian cells. Genomic differences are implicated in the unique properties that differentiate the two species in disease spectrum. Analysis of the 1,230,230-nt C. pneumoniae genome revealed 214 protein-coding sequences not found in C. trachomatis, most without homologues to other known sequences. Prominent comparative findings include expansion of a novel family of 21 sequence-variant outer-membrane proteins, conservation of a type-III secretion virulence system, three serine/threonine protein kinases and a pair of paralogous phospholipase-D-like proteins, additional purine and biotin biosynthetic capability, a homologue for aromatic amino acid (tryptophan) hydroxylase and the loss of tryptophan biosynthesis genes.
    Kemena and Notredame Upcoming challenges for multiple sequence alignment methods in the high-throughput era. 2009 Bioinformatics
    Vol. 25(19), pp. 2455-2465 
    article DOI URL 
    Abstract: This review focuses on recent trends in multiple sequence alignment tools. It describes the latest algorithmic improvements including the extension of consistency-based methods to the problem of template-based multiple sequence alignments. Some results are presented suggesting that template-based methods are significantly more accurate than simpler alternative methods. The validation of existing methods is also discussed at length with the detailed description of recent results and some suggestions for future validation strategies. The last part of the review addresses future challenges for multiple sequence alignment methods in the genomic era, most notably the need to cope with very large sequences, the need to integrate large amounts of experimental data, the need to accurately align non-coding and non-transcribed sequences and finally, the need to integrate many alternative methods and approaches.
    Kersey et al. Integr8 and Genome Reviews: integrated views of complete genomes and proteomes. 2005 Nucleic Acids Res
    Vol. 33(Database issue), pp. D297-D302 
    article DOI URL 
    Abstract: Integr8 is a new web portal for exploring the biology of organisms with completely deciphered genomes. For over 190 species, Integr8 provides access to general information, recent publications, and a detailed statistical overview of the genome and proteome of the organism. The preparation of this analysis is supported through Genome Reviews, a new database of bacterial and archaeal DNA sequences in which annotation has been upgraded (compared to the original submission) through the integration of data from many sources, including the EMBL Nucleotide Sequence Database, the UniProt Knowledgebase, InterPro, CluSTr, GOA and HOGENOM. Integr8 also allows the users to customize their own interactive analysis, and to download both customized and prepared datasets for their own use. Integr8 is available at http://www.ebi.ac.uk/integr8.
    Kitchen et al. Docking and scoring in virtual screening for drug discovery: methods and applications. 2004 Nat Rev Drug Discov
    Vol. 3(11), pp. 935-949 
    article DOI URL 
    Abstract: Computational approaches that 'dock' small molecules into the structures of macromolecular targets and 'score' their potential complementarity to binding sites are widely used in hit identification and lead optimization. Indeed, there are now a number of drugs whose development was heavily influenced by or based on structure-based design and screening strategies, such as HIV protease inhibitors. Nevertheless, there remain significant challenges in the application of these approaches, in particular in relation to current scoring schemes. Here, we review key concepts and specific features of small-molecule-protein docking methods, highlight selected applications and discuss recent advances that aim to address the acknowledged limitations of established approaches.
    Klebe Virtual ligand screening: strategies, perspectives and limitations. 2006 Drug Discov Today
    Vol. 11(13-14), pp. 580-594 
    article DOI URL 
    Abstract: In contrast to high-throughput screening, in virtual ligand screening (VS), compounds are selected using computer programs to predict their binding to a target receptor. A key prerequisite is knowledge about the spatial and energetic criteria responsible for protein-ligand binding. The concepts and prerequisites to perform VS are summarized here, and explanations are sought for the enduring limitations of the technology. Target selection, analysis and preparation are discussed, as well as considerations about the compilation of candidate ligand libraries. The tools and strategies of a VS campaign, and the accuracy of scoring and ranking of the results, are also considered.
    Knowles New strategies for antibacterial drug design. 1997 Trends Microbiol
    Vol. 5(10), pp. 379-383 
    article DOI URL 
    Kobayashi et al. Essential Bacillus subtilis genes. 2003 Proc Natl Acad Sci U S A
    Vol. 100(8), pp. 4678-4683 
    article DOI URL 
    Abstract: To estimate the minimal gene set required to sustain bacterial life in nutritious conditions, we carried out a systematic inactivation of Bacillus subtilis genes. Among approximately 4,100 genes of the organism, only 192 were shown to be indispensable by this or previous work. Another 79 genes were predicted to be essential. The vast majority of essential genes were categorized in relatively few domains of cell metabolism, with about half involved in information processing, one-fifth involved in the synthesis of cell envelope and the determination of cell shape and division, and one-tenth related to cell energetics. Only 4% of essential genes encode unknown functions. Most essential genes are present throughout a wide range of Bacteria, and almost 70% can also be found in Archaea and Eucarya. However, essential genes related to cell envelope, shape, division, and respiration tend to be lost from bacteria with small genomes. Unexpectedly, most genes involved in the Embden-Meyerhof-Parnas pathway are essential. Identification of unknown and unexpected essential genes opens research avenues to better understanding of processes that sustain bacterial life.
    Koch Bacterial wall as target for attack: past, present, and future research. 2003 Clin Microbiol Rev
    Vol. 16(4), pp. 673-687 
    article  
    Abstract: When Bacteria, Archaea, and Eucarya separated from each other, a great deal of evolution had taken place. Only then did extensive diversity arise. The bacteria split off with the new property that they had a sacculus that protected them from their own turgor pressure. The saccular wall of murein (or peptidoglycan) was an effective solution to the osmotic pressure problem, but it then was a target for other life-forms, which created lysozymes and beta-lactams. The beta-lactams, with their four-member strained rings, are effective agents in nature and became the first antibiotic in human medicine. But that is by no means the end of the story. Over evolutionary time, bacteria challenged by beta-lactams evolved countermeasures such as beta-lactamases, and the producing organisms evolved variant beta-lactams. The biology of both classes became evident as the pharmaceutical industry isolated, modified, and produced new chemotherapeutic agents and as the properties of beta-lactams and beta-lactamases were examined by molecular techniques. This review attempts to fit the wall biology of current microbes and their clinical context into the way organisms developed on this planet as well as the changes arising since the work done by Fleming. It also outlines the scientific advances in our understanding of this broad area of biology.
    Kohanski et al. How antibiotics kill bacteria: from targets to networks. 2010 Nat Rev Microbiol
    Vol. 8(6), pp. 423-435 
    article DOI URL 
    Abstract: Antibiotic drug-target interactions, and their respective direct effects, are generally well characterized. By contrast, the bacterial responses to antibiotic drug treatments that contribute to cell death are not as well understood and have proven to be complex as they involve many genetic and biochemical pathways. In this Review, we discuss the multilayered effects of drug-target interactions, including the essential cellular processes that are inhibited by bactericidal antibiotics and the associated cellular response mechanisms that contribute to killing. We also discuss new insights into these mechanisms that have been revealed through the study of biological networks, and describe how these insights, together with related developments in synthetic biology, could be exploited to create new antibacterial therapies.
    Koonin and Galperin Sequence - Evolution - Function: Computational Approaches in Comparative Genomics 2003   book  
    Korf et al. BLAST 2003   book  
    Kuroda et al. Whole genome sequencing of meticillin-resistant Staphylococcus aureus. 2001 Lancet
    Vol. 357(9264), pp. 1225-1240 
    article  
    Abstract: BACKGROUND: Staphylococcus aureus is one of the major causes of community-acquired and hospital-acquired infections. It produces numerous toxins including superantigens that cause unique disease entities such as toxic-shock syndrome and staphylococcal scarlet fever, and has acquired resistance to practically all antibiotics. Whole genome analysis is a necessary step towards future development of countermeasures against this organism. METHODS: Whole genome sequences of two related S aureus strains (N315 and Mu50) were determined by shot-gun random sequencing. N315 is a meticillin-resistant S aureus (MRSA) strain isolated in 1982, and Mu50 is an MRSA strain with vancomycin resistance isolated in 1997. The open reading frames were identified by use of GAMBLER and GLIMMER programs, and annotation of each was done with a BLAST homology search, motif analysis, and protein localisation prediction. FINDINGS: The Staphylococcus genome was composed of a complex mixture of genes, many of which seem to have been acquired by lateral gene transfer. Most of the antibiotic resistance genes were carried either by plasmids or by mobile genetic elements including a unique resistance island. Three classes of new pathogenicity islands were identified in the genome: a toxic-shock-syndrome toxin island family, exotoxin islands, and enterotoxin islands. In the latter two pathogenicity islands, clusters of exotoxin and enterotoxin genes were found closely linked with other gene clusters encoding putative pathogenic factors. The analysis also identified 70 candidates for new virulence factors. INTERPRETATION: The remarkable ability of S aureus to acquire useful genes from various organisms was revealed through the observation of genome complexity and evidence of lateral gene transfer. Repeated duplication of genes encoding superantigens explains why S aureus is capable of infecting humans of diverse genetic backgrounds, eliciting severe immune reactions. Investigation of many newly identified gene products, including the 70 putative virulence factors, will greatly improve our understanding of the biology of staphylococci and the processes of infectious diseases caused by S aureus.
    Köppen Virtual screening - what does it give us? 2009 Curr Opin Drug Discov Devel
    Vol. 12(3), pp. 397-407 
    article  
    Abstract: In current pharmaceutical research, lead compounds of high quality and structural diversity are key to the successful optimization of development candidates. In-house compound libraries at pharmaceutical companies, tested using HTS assays, are the major source of leads for new projects. However, these physically existing compounds, stored in microtiter plates in dispensaries, represent only a tiny fraction of the drug-like chemical space. Virtual screening offers many possibilities for new structures beyond those found in in-house libraries. During the last decade, a huge number of different virtual screening methods have been reported and used to search for novel bioactive compounds for many targets. This review addresses the current status of virtual screening, highlighting achievements as well as challenges, along with the value of virtual screening, and recent examples of successful applications.
    Lange et al. The targets of currently used antibacterial agents: lessons for drug discovery. 2007 Curr Pharm Des
    Vol. 13(30), pp. 3140-3154 
    article  
    Abstract: Based on the mode of action of antibacterial drugs currently used, targets can be defined as distinct cellular constituents such as enzymes, enzyme substrates, RNA, DNA, and membranes which exhibit very specific binding sites at the surface of these components or at the interface of macromolecular complexes assembled in the cell. Intriguingly, growth inhibition or even complete loss of bacterial viability is often the result of a cascade of events elicited upon treatment with an antibacterial agent. In addition, their mode of action frequently involves more than one single target. A comprehensive description of the targets exploited so far by commercialized antibacterial agents, including anti-mycobacterial agents, is given. The number of targets exploited so far by commercial antibacterial agents is estimated to be about 40. The most important biosynthetic pathways and cellular structures affected by antibacterial drugs are the cell wall biosynthesis, protein biosynthesis, DNA per se, replication, RNA per se, transcription and the folate biosynthetic pathway. The disillusionment with the genomics driven antibacterial drug discovery is a result of the restrictive definition of targets as products of essential and conserved genes. Emphasis is made to not only focus on proteins as potential drug targets, but increase efforts and devise screening technologies to discover new agents interacting with different RNA species, DNA, new protein families or macromolecular complexes of these constituents.
    Larkin et al. Clustal W and Clustal X version 2.0. 2007 Bioinformatics
    Vol. 23(21), pp. 2947-2948 
    article DOI URL 
    Abstract: SUMMARY: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems. AVAILABILITY: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/
    Lassmann and Sonnhammer Quality assessment of multiple alignment programs 2002 FEBS Letters
    Vol. 529, pp. 126-130 
    article  
    Leaver et al. Life without a wall or division machine in Bacillus subtilis. 2009 Nature
    Vol. 457(7231), pp. 849-853 
    article DOI URL 
    Abstract: The cell wall is an essential structure for virtually all bacteria, forming a tough outer shell that protects the cell from damage and osmotic lysis. It is the target of our best antibiotics. L-form strains are wall-deficient derivatives of common bacteria that have been studied for decades. However, they are difficult to generate and typically require growth for many generations on osmotically protective media with antibiotics or enzymes that kill walled forms. Despite their potential importance for understanding antibiotic resistance and pathogenesis, little is known about their basic cell biology or their means of propagation. We have developed a controllable system for generating L-forms in the highly tractable model bacterium Bacillus subtilis. Here, using genome sequencing, we identify a single point mutation that predisposes cells to grow without a wall. We show that propagation of L-forms does not require the normal FtsZ-dependent division machine but occurs by a remarkable extrusion-resolution mechanism. This novel form of propagation provides insights into how early forms of cellular life may have proliferated.
    Lee Generating consensus sequences from partial order multiple sequence alignment graphs. 2003 Bioinformatics
    Vol. 19(8), pp. 999-1008 
    article  
    Abstract: MOTIVATION: Consensus sequence generation is important in many kinds of sequence analysis ranging from sequence assembly to profile-based iterative search methods. However, how can a consensus be constructed when its inherent assumption-that the aligned sequences form a single linear consensus-is not true? RESULTS: Partial Order Alignment (POA) enables construction and analysis of multiple sequence alignments as directed acyclic graphs containing complex branching structure. Here we present a dynamic programming algorithm (heaviest_bundle) for generating multiple consensus sequences from such complex alignments. The number and relationships of these consensus sequences reveals the degree of structural complexity of the source alignment. This is a powerful and general approach for analyzing and visualizing complex alignment structures, and can be applied to any alignment. We illustrate its value for analyzing expressed sequence alignments to detect alternative splicing, reconstruct full length mRNA isoform sequences from EST fragments, and separate paralog mixtures that can cause incorrect SNP predictions. AVAILABILITY: The heaviest_bundle source code is available at http://www.bioinformatics.ucla.edu/poa
    Lee et al. Multiple sequence alignment using partial order graphs. 2002 Bioinformatics
    Vol. 18(3), pp. 452-464 
    article  
    Abstract: MOTIVATION: Progressive Multiple Sequence Alignment (MSA) methods depend on reducing an MSA to a linear profile for each alignment step. However, this leads to loss of information needed for accurate alignment, and gap scoring artifacts. RESULTS: We present a graph representation of an MSA that can itself be aligned directly by pairwise dynamic programming, eliminating the need to reduce the MSA to a profile. This enables our algorithm (Partial Order Alignment (POA)) to guarantee that the optimal alignment of each new sequence versus each sequence in the MSA will be considered. Moreover, this algorithm introduces a new edit operator, homologous recombination, important for multidomain sequences. The algorithm has improved speed (linear time complexity) over existing MSA algorithms, enabling construction of massive and complex alignments (e.g. an alignment of 5000 sequences in 4 h on a Pentium II). We demonstrate the utility of this algorithm on a family of multidomain SH2 proteins, and on EST assemblies containing alternative splicing and polymorphism. AVAILABILITY: The partial order alignment program POA is available at http://www.bioinformatics.ucla.edu/poa.
    Lesk Introduction to Bioinformatics 2008   book  
    Lesk Introduction to Protein Architecture 2001   book  
    Liang et al. Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand design. 1998 Protein Sci
    Vol. 7(9), pp. 1884-1897 
    article DOI URL 
    Abstract: Identification and size characterization of surface pockets and occluded cavities are initial steps in protein structure-based ligand design. A new program, CAST, for automatically locating and measuring protein pockets and cavities, is based on precise computational geometry methods, including alpha shape and discrete flow theory. CAST identifies and measures pockets and pocket mouth openings, as well as cavities. The program specifies the atoms lining pockets, pocket openings, and buried cavities; the volume and area of pockets and cavities; and the area and circumference of mouth openings. CAST analysis of over 100 proteins has been carried out; proteins examined include a set of 51 monomeric enzyme-ligand structures, several elastase-inhibitor complexes, the FK506 binding protein, 30 HIV-1 protease-inhibitor complexes, and a number of small and large protein inhibitors. Medium-sized globular proteins typically have 10-20 pockets/cavities. Most often, binding sites are pockets with 1-2 mouth openings; much less frequently they are cavities. Ligand binding pockets vary widely in size, most within the range 10^2-10^3 Å^3. Statistical analysis reveals that the number of pockets and cavities is correlated with protein size, but there is no correlation between the size of the protein and the size of binding sites. Most frequently, the largest pocket/cavity is the active site, but there are a number of instructive exceptions. Ligand volume and binding site volume are somewhat correlated when binding site volume is ≤700 Å^3, but the ligand seldom occupies the entire site. Auxiliary pockets near the active site have been suggested as additional binding surface for designed ligands (Mattos C et al., 1994, Nat Struct Biol 1:55-58). Analysis of elastase-inhibitor complexes suggests that CAST can identify ancillary pockets suitable for recruitment in ligand design strategies. Analysis of the FK506 binding protein, and of compounds developed in SAR by NMR (Shuker SB et al., 1996, Science 274:1531-1534), indicates that CAST pocket computation may provide a priori identification of target proteins for linked-fragment design. CAST analysis of 30 HIV-1 protease-inhibitor complexes shows that the flexible active site pocket can vary over a range of 853-1,566 Å^3, and that there are two pockets near or adjoining the active site that may be recruited for ligand design.
    Lock and Harry Cell-division inhibitors: new insights for future antibiotics. 2008 Nat Rev Drug Discov
    Vol. 7(4), pp. 324-338 
    article DOI URL 
    Abstract: The growing problem of antibiotic resistance has been exacerbated by the use of new drugs that are merely variants of older overused antibiotics. While it is naive to expect to restrain the spread of resistance without controlling antibacterial usage, the desperate need for drugs with novel targets has been recognized by health organizations, industry and academia alike. The wealth of knowledge available about the bacterial cell-division pathway has aided target-driven approaches to identify novel inhibitors. Here, we discuss the therapeutic potential of inhibiting bacterial cell division, and review the progress made in this exciting new area of antibacterial discovery.
    Makhlin et al. Staphylococcus aureus ArcR controls expression of the arginine deiminase operon. 2007 J Bacteriol
    Vol. 189(16), pp. 5976-5986 
    article DOI URL 
    Abstract: We identified a single open reading frame that is strongly similar to ArcR, a member of the Crp/Fnr family of bacterial transcriptional regulators, in all sequenced Staphylococcus aureus genomes. The arcR gene encoding ArcR forms an operon with the arginine deiminase (ADI) pathway genes arcABDC that enable the utilization of arginine as a source of energy for growth under anaerobic conditions. In this report, we show that under anaerobic conditions, S. aureus growth is subject to glucose catabolic repression and is enhanced by arginine. Likewise, glucose and arginine have reciprocal effects on the transcription of the arcABDCR genes. Furthermore, we show using a mutant deleted for arcR that the transcription of the arc operon under anaerobic conditions depends strictly on a functional ArcR. These findings are supported by proteome analyses, which showed that under anaerobic conditions the expression of the ADI catabolic proteins depends on ArcR. Bioinformatic analysis of S. aureus ArcR predicts an N-terminal nucleotide binding domain and a C-terminal helix-turn-helix DNA binding motif. ArcR binds to a conserved Crp-like sequence motif, TGTGA-N(6)-TCACA, present in the arc promoter region and thereby activates the expression of the ADI pathway genes. Crp-like sequence motifs were also found in the regulatory regions of some 30 other S. aureus genes mostly encoding anaerobic enzymatic systems, virulence factors, and regulatory systems. ArcR was tested and found to bind to the regulatory regions of four such genes, adh1, lctE, srrAB, and lukM. In one case, for lctE, encoding l-lactate dehydrogenase, ArcR was able to bind only in the presence of cyclic AMP. These observations suggest that ArcR is likely to play an important role in the expression of numerous genes required for anaerobic growth.
    Margulies Confidence in comparative genomics. 2008 Genome Res
    Vol. 18(2), pp. 199-200 
    article DOI URL 
    Margulies et al. Identification and characterization of multi-species conserved sequences. 2003 Genome Res
    Vol. 13(12), pp. 2507-2518 
    article DOI URL 
    Abstract: Comparative sequence analysis has become an essential component of studies aiming to elucidate genome function. The increasing availability of genomic sequences from multiple vertebrates is creating the need for computational methods that can detect highly conserved regions in a robust fashion. Towards that end, we are developing approaches for identifying sequences that are conserved across multiple species; we call these "Multi-species Conserved Sequences" (or MCSs). Here we report two strategies for MCS identification, demonstrating their ability to detect virtually all known actively conserved sequences (specifically, coding sequences) but very little neutrally evolving sequence (specifically, ancestral repeats). Importantly, we find that a substantial fraction of the bases within MCSs (approximately 70%) resides within non-coding regions; thus, the majority of sequences conserved across multiple vertebrate species has no known function. Initial characterization of these MCSs has revealed sequences that correspond to clusters of transcription factor-binding sites, non-coding RNA transcripts, and other candidate functional elements. Finally, the ability to detect MCSs represents a valuable metric for assessing the relative contribution of a species' sequence to identifying genomic regions of interest, and our results indicate that the currently available genome sequences are insufficient for the comprehensive identification of MCSs in the human genome.
    Mauser and Guba Recent developments in de novo design and scaffold hopping. 2008 Curr Opin Drug Discov Devel
    Vol. 11(3), pp. 365-374 
    article  
    Abstract: This review covers the developments in the fields of de novo ligand design and scaffold hopping since 2006. De novo ligand design was introduced in 1991 as a purely structure-based method to suggest ligands for synthesis and was later augmented by ligand-based approaches. Both structure-based and ligand-based methods identify pharmacophores, as well as shape constraints, and subsequently match these with complementary features embedded into small-molecule topologies. Recently, significant attention has been paid to de novo ligand design in combination with biophysical fragment screening and X-ray structure elucidation. Scaffold hopping has evolved from a niche application of de novo design into a rapidly expanding suite of different software tools, which are used extensively in the pharmaceutical industry.
    May et al. Structural and functional analysis of two glutamate racemase isozymes from Bacillus anthracis and implications for inhibitor design. 2007 J Mol Biol
    Vol. 371(5), pp. 1219-1237 
    article DOI URL 
    Abstract: Glutamate racemase (RacE) is responsible for converting l-glutamate to d-glutamate, which is an essential component of peptidoglycan biosynthesis, and the primary constituent of the poly-gamma-d-glutamate capsule of the pathogen Bacillus anthracis. RacE enzymes are essential for bacterial growth and lack a human homolog, making them attractive targets for the design and development of antibacterial therapeutics. We have cloned, expressed and purified the two glutamate racemase isozymes, RacE1 and RacE2, from the B. anthracis genome. Through a series of steady-state kinetic studies, and based upon the ability of both RacE1 and RacE2 to catalyze the rapid formation of d-glutamate, we have determined that RacE1 and RacE2 are bona fide isozymes. The X-ray structures of B. anthracis RacE1 and RacE2, in complex with d-glutamate, were determined to resolutions of 1.75 A and 2.0 A. Both enzymes are dimers with monomers arranged in a "tail-to-tail" orientation, similar to the B. subtilis RacE structure, but differing substantially from the Aquifex pyrophilus RacE structure. The differences in quaternary structures produce differences in the active sites of racemases among the various species, which has important implications for structure-based, inhibitor design efforts within this class of enzymes. We found a Val to Ala variance at the entrance of the active site between RacE1 and RacE2, which results in the active site entrance being less sterically hindered for RacE1. Using a series of inhibitors, we show that this variance results in differences in the inhibitory activity against the two isozymes and suggest a strategy for structure-based inhibitor design to obtain broad-spectrum inhibitors for glutamate racemases.
    McDevitt and Rosenberg Exploiting genomics to discover new antibiotics. 2001 Trends Microbiol
    Vol. 9(12), pp. 611-617 
    article  
    Abstract: There is an urgent need to develop new classes of antibiotics to tackle the increase in resistance in many common bacterial pathogens. One strategy to develop new antibiotics is to identify and exploit new molecular targets and this strategy is being driven by the wealth of new genome sequence information now available. Additionally, new technologies have been developed to validate new antibacterial targets, for example, new technologies have been developed to enable rapid determination of whether a gene is essential and to assess the transcription status of a putative target during infection. As a result, many novel validated targets have now been identified and for some, appropriate high-throughput screens against diverse compound collections have been carried out. Novel antibiotic leads are emerging from these genomics-derived targeted screens and the challenge now is to optimize and develop these leads to become part of the next generation of antibiotics.
    Mills When will the genomics investment pay off for antibacterial discovery? 2006 Biochem Pharmacol
    Vol. 71(7), pp. 1096-1102 
    article DOI URL 
    Abstract: Effective solutions to antibacterial resistance are among the key unmet medical needs driving the antibacterial industry. A major thrust in a number of companies is the development of agents with new modes of action in order to bypass the increasing emergence of antibacterial resistance. However, few antibacterials marketed in the last 30 years have novel modes of action. Most recently, genomics and target-based screening technologies have been emphasized as a means to facilitate this and expedite the antibacterial discovery process. And although no new antibacterials have yet been marketed as result of these technologies, genomics has delivered well-validated novel bacterial targets as well as a host of genetic approaches to support the antibacterial discovery process. Likewise, high throughput screening technologies have delivered the capacity to perform robust screenings of large compound collections to identify target inhibitors for lead generation. One of the principal challenges still facing antibacterial discovery is to become proficient at optimizing target inhibitors into broad-spectrum antibacterials with appropriate in vivo properties. Genomics-based technologies clearly have the potential for additional application throughout the discovery process especially in the areas of structural biology and safety assessment.
    Mills The role of genomics in antimicrobial discovery. 2003 J Antimicrob Chemother
    Vol. 51(4), pp. 749-752 
    article DOI URL 
    Moretti et al. The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods. 2007 Nucleic Acids Res
    Vol. 35(Web Server issue), pp. W645-W648 
    article DOI URL 
    Abstract: The M-Coffee server is a web server that makes it possible to compute multiple sequence alignments (MSAs) by running several MSA methods and combining their output into one single model. This allows the user to simultaneously run all his methods of choice without having to arbitrarily choose one of them. The MSA is delivered along with a local estimation of its consistency with the individual MSAs it was derived from. The computation of the consensus multiple alignment is carried out using a special mode of the T-Coffee package [Notredame, Higgins and Heringa (T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302: 205-217); Wallace, O'Sullivan, Higgins and Notredame (M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006; 34: 1692-1699)] Given a set of sequences (DNA or proteins) in FASTA format, M-Coffee delivers a multiple alignment in the most common formats. M-Coffee is a freeware open source package distributed under a GPL license and it is available either as a standalone package or as a web service from www.tcoffee.org.
    Moretti et al. R-Coffee: a web server for accurately aligning noncoding RNA sequences. 2008 Nucleic Acids Res
    Vol. 36(Web Server issue), pp. W10-W13 
    article DOI URL 
    Abstract: The R-Coffee web server produces highly accurate multiple alignments of noncoding RNA (ncRNA) sequences, taking into account predicted secondary structures. R-Coffee uses a novel algorithm recently incorporated in the T-Coffee package. R-Coffee works along the same lines as T-Coffee: it uses pairwise or multiple sequence alignment (MSA) methods to compute a primary library of input alignments. The program then computes an MSA highly consistent with both the alignments contained in the library and the secondary structures associated with the sequences. The secondary structures are predicted using RNAplfold. The server provides two modes. The slow/accurate mode is restricted to small datasets (less than 5 sequences less than 150 nucleotides) and combines R-Coffee with Consan, a very accurate pairwise RNA alignment method. For larger datasets a fast method can be used (RM-Coffee mode), that uses R-Coffee to combine the output of the three packages which combines the outputs from programs found to perform best on RNA (MUSCLE, MAFFT and ProbConsRNA). Our BRAliBase benchmarks indicate that the R-Coffee/Consan combination is one of the best ncRNA alignment methods for short sequences, while the RM-Coffee gives comparable results on longer sequences. The R-Coffee web server is available at http://www.tcoffee.org.
    Morgenstern DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. 1999 Bioinformatics
    Vol. 15(3), pp. 211-218 
    article  
    Abstract: MOTIVATION: The performance and time complexity of an improved version of the segment-to-segment approach to multiple sequence alignment is discussed. In this approach, alignments are composed from gap-free segment pairs, and the score of an alignment is defined as the sum of so-called weights of these segment pairs. RESULTS: A modification of the weight function used in the original version of the alignment program DIALIGN has two important advantages: it can be applied to both globally and locally related sequence sets, and the running time of the program is considerably improved. The time complexity of the algorithm is discussed theoretically, and the program running time is reported for various test examples. AVAILABILITY: The program is available on-line at the Bielefeld University Bioinformatics Server (BiBiServ) http://bibiserv.TechFak.Uni-Bielefeld.DE/dialign/
    Morgenstern et al. Multiple DNA and protein sequence alignment based on segment-to-segment comparison. 1996 Proc Natl Acad Sci U S A
    Vol. 93(22), pp. 12098-12103 
    article  
    Abstract: In this paper, a new way to think about, and to construct, pairwise as well as multiple alignments of DNA and protein sequences is proposed. Rather than forcing alignments to either align single residues or to introduce gaps by defining an alignment as a path running right from the source up to the sink in the associated dot-matrix diagram, we propose to consider alignments as consistent equivalence relations defined on the set of all positions occurring in all sequences under consideration. We also propose constructing alignments from whole segments exhibiting highly significant overall similarity rather than by aligning individual residues. Consequently, we present an alignment algorithm that (i) is based on segment-to-segment comparison instead of the commonly used residue-to-residue comparison and which (ii) avoids the well-known difficulties concerning the choice of appropriate gap penalties: gaps are not treated explicitly, but remain as those parts of the sequences that do not belong to any of the aligned segments. Finally, we discuss the application of our algorithm to two test examples and compare it with commonly used alignment methods. As a first example, we aligned a set of 11 DNA sequences coding for functional helix-loop-helix proteins. Though the sequences show only low overall similarity, our program correctly aligned all of the 11 functional sites, which was a unique result among the methods tested. As a by-product, the reading frames of the sequences were identified. Next, we aligned a set of ribonuclease H proteins and compared our results with alignments produced by other programs as reported by McClure et al. [McClure, M. A., Vasi, T. K. & Fitch, W. M. (1994) Mol. Biol. Evol. 11, 571-592]. Our program was one of the best scoring programs. However, in contrast to other methods, our protein alignments are independent of user-defined parameters.
    Morgenstern et al. DIALIGN: finding local similarities by multiple sequence alignment. 1998 Bioinformatics
    Vol. 14(3), pp. 290-294 
    article  
    Abstract: MOTIVATION: DIALIGN is a new method for pairwise as well as multiple alignment of nucleic acid and protein sequences. While standard alignment programs rely on comparing single residues and imposing gap penalties, DIALIGN constructs alignments by comparing whole segments of the sequences. No gap penalty is employed. This point of view is especially adequate if sequences are not globally related, but share only local similarities, as is the case in genomic DNA sequences and in many protein families. RESULTS: Using four different data sets, we show that DIALIGN is able correctly to align conserved motifs in protein sequences. Alignments produced by DIALIGN are compared systematically to the results of five other alignment programs. AVAILABILITY: DIALIGN is available to the scientific community free of charge for non-commercial use. Executables for various UNIX platforms including LINUX can be downloaded at http://www.gsf.de/biodv/dialign.html Contact: werner, morgenstern@gsf.de
    Morris et al. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function 1998 Journal of Computational Chemistry
    Vol. 19, pp. 1639-1662 
    article  
    Abstract: A novel and robust automated docking method that predicts the bound conformations of flexible ligands to macromolecular targets has been developed and tested, in combination with a new scoring function that estimates the free energy change upon binding. Interestingly, this method applies a Lamarckian model of genetics, in which environmental adaptations of an individual's phenotype are reverse transcribed into its genotype and become heritable traits (sic). We consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein-ligand test systems having known three-dimensional structure. We show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three. The empirical free energy function was calibrated using a set of 30 structurally known protein-ligand complexes with experimentally determined binding constants. Linear regression analysis of the observed binding constants in terms of a wide variety of structure-derived molecular properties was performed. The final model had a residual standard error of 9.11 kJ mol-1 (2.177 kcal mol-1) and was chosen as the new energy function. The new search methods and empirical free energy function are available in AUTODOCK, version 3.0. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 1639-1662, 1998
    Morris et al. Distributed automated docking of flexible ligands to proteins: parallel applications of AutoDock 2.4. 1996 J Comput Aided Mol Des
    Vol. 10(4), pp. 293-304 
    article  
    Abstract: AutoDock 2.4 predicts the bound conformations of a small, flexible ligand to a nonflexible macromolecular target of known structure. The technique combines simulated annealing for conformation searching with a rapid grid-based method of energy evaluation based on the AMBER force field. AutoDock has been optimized in performance without sacrificing accuracy; it incorporates many enhancements and additions, including an intuitive interface. We have developed a set of tools for launching and analyzing many independent docking jobs in parallel on a heterogeneous network of UNIX-based workstations. This paper describes the current release, and the results of a suite of diverse test systems. We also present the results of a systematic investigation into the effects of varying simulated-annealing parameters on molecular docking. We show that even for ligands with a large number of degrees of freedom, root-mean-square deviations of less than 1 Å from the crystallographic conformation are obtained for the lowest-energy dockings, although fewer dockings find the crystallographic conformation when there are more degrees of freedom.
    Morris et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. 2009 J Comput Chem
    Vol. 30(16), pp. 2785-2791 
    article DOI URL 
    Abstract: We describe the testing and release of AutoDock4 and the accompanying graphical user interface AutoDockTools. AutoDock4 incorporates limited flexibility in the receptor. Several tests are reported here, including a redocking experiment with 188 diverse ligand-protein complexes and a cross-docking experiment using flexible sidechains in 87 HIV protease complexes. We also report its utility in analysis of covalently bound ligands, using both a grid-based docking method and a modification of the flexible sidechain technique.
    Mount Bioinformatics: Sequence and Genome Analysis 2004   book  
    Msadek Grasping at shadows: revealing the elusive nature of essential genes. 2009 J Bacteriol
    Vol. 191(15), pp. 4701-4704 
    article DOI URL 
    Mushegian and Koonin A minimal gene set for cellular life derived by comparison of complete bacterial genomes. 1996 Proc Natl Acad Sci U S A
    Vol. 93(19), pp. 10268-10273 
    article  
    Abstract: The recently sequenced genome of the parasitic bacterium Mycoplasma genitalium contains only 468 identified protein-coding genes that have been dubbed a minimal gene complement [Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., et al. (1995) Science 270, 397-403]. Although the M. genitalium gene complement is indeed the smallest among known cellular life forms, there is no evidence that it is the minimal self-sufficient gene set. To derive such a set, we compared the 468 predicted M. genitalium protein sequences with the 1703 protein sequences encoded by the other completely sequenced small bacterial genome, that of Haemophilus influenzae. M. genitalium and H. influenzae belong to two ancient bacterial lineages, i.e., Gram-positive and Gram-negative bacteria, respectively. Therefore, the genes that are conserved in these two bacteria are almost certainly essential for cellular function. It is this category of genes that is most likely to approximate the minimal gene set. We found that 240 M. genitalium genes have orthologs among the genes of H. influenzae. This collection of genes falls short of comprising the minimal set as some enzymes responsible for intermediate steps in essential pathways are missing. The apparent reason for this is the phenomenon that we call nonorthologous gene displacement when the same function is fulfilled by nonorthologous proteins in two organisms. We identified 22 nonorthologous displacements and supplemented the set of orthologs with the respective M. genitalium genes. After examining the resulting list of 262 genes for possible functional redundancy and for the presence of apparently parasite-specific genes, 6 genes were removed. We suggest that the remaining 256 genes are close to the minimal gene set that is necessary and sufficient to sustain the existence of a modern-type cell. Most of the proteins encoded by the genes from the minimal set have eukaryotic or archaeal homologs but seven key proteins of DNA replication do not. We speculate that the last common ancestor of the three primary kingdoms had an RNA genome. Possibilities are explored to further reduce the minimal set to model a primitive cell that might have existed at a very early stage of life evolution.
    Needleman and Wunsch A general method applicable to the search for similarities in the amino acid sequence of two proteins. 1970 J Mol Biol
    Vol. 48(3), pp. 443-453 
    article  
    Notredame et al. T-Coffee: A novel method for fast and accurate multiple sequence alignment. 2000 J Mol Biol
    Vol. 302(1), pp. 205-217 
    article DOI URL 
    Abstract: We describe a new method (T-Coffee) for multiple sequence alignment that provides a dramatic improvement in accuracy with a modest sacrifice in speed as compared to the most commonly used alternatives. The method is broadly based on the popular progressive approach to multiple alignment but avoids the most serious pitfalls caused by the greedy nature of this algorithm. With T-Coffee we pre-process a data set of all pair-wise alignments between the sequences. This provides us with a library of alignment information that can be used to guide the progressive alignment. Intermediate alignments are then based not only on the sequences to be aligned next but also on how all of the sequences align with each other. This alignment information can be derived from heterogeneous sources such as a mixture of alignment programs and/or structure superposition. Here, we illustrate the power of the approach by using a combination of local and global pair-wise alignments to generate the library. The resulting alignments are significantly more reliable, as determined by comparison with a set of 141 test cases, than any of the popular alternatives that we tried. The improvement, especially clear with the more difficult test cases, is always visible, regardless of the phylogenetic spread of the sequences in the tests.
    Nygaard et al. Community-associated methicillin-resistant Staphylococcus aureus skin infections: advances toward identifying the key virulence factors. 2008 Curr Opin Infect Dis
    Vol. 21(2), pp. 147-152 
    article DOI URL 
    Abstract: PURPOSE OF REVIEW: In recent years there has been an increase in the incidence of community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) infections in healthy individuals, the cause of which is largely unknown. CA-MRSA primarily causes skin and soft-tissue infections but certain strains are also associated with unusually severe pathology. The purpose of this review is to provide a critical analysis of our current knowledge of virulence factors contributing to skin and soft-tissue infections caused by CA-MRSA. RECENT FINDINGS: Isolates classified as pulsed-field gel electrophoresis type USA300 have emerged as the predominant CA-MRSA genotype and in most geographic areas account for 97% or more of CA-MRSA infections. Recent key studies, such as those reporting the complete genome sequence of USA300, and the discovery of cytolytic peptides that contribute significantly to CA-MRSA virulence, lead the way for future investigations. SUMMARY: Although we have only a cursory understanding of the molecular mechanisms of CA-MRSA virulence, studies using clinically relevant CA-MRSA isolates are beginning to identify virulence determinants specific to this pathogen. Identifying CA-MRSA virulence determinants and the concerted regulation of these factors will foster development of vaccines and therapeutics designed to control CA-MRSA skin infections.
    Ochman et al. Lateral gene transfer and the nature of bacterial innovation. 2000 Nature
    Vol. 405(6784), pp. 299-304 
    article DOI URL 
    Abstract: Unlike eukaryotes, which evolve principally through the modification of existing genetic information, bacteria have obtained a significant proportion of their genetic diversity through the acquisition of sequences from distantly related organisms. Horizontal gene transfer produces extremely dynamic genomes in which substantial amounts of DNA are introduced into and deleted from the chromosome. These lateral transfers have effectively changed the ecological and pathogenic character of bacterial species.
    Palmer et al. A primer on screening data management. 2009 J Biomol Screen
    Vol. 14(8), pp. 999-1007 
    article DOI URL 
    Abstract: A drug discovery startup company or academic lab entering the screening arena faces numerous challenges as it tries to manage the large quantity of data generated by a typical drug discovery screening campaign. Although there are sophisticated off-the-shelf software solutions available, their use requires substantial forethought and attention to detail if the data they capture are to be of sufficient quality to serve the various purposes to which it will be put. For newcomers to the field of screening data management in particular, the problem is compounded by a lack of literature covering the practical aspects of managing screening data. The authors provide some practical advice based on their experience of using a commercially available software suite. They discuss issues ranging from the organizational aspects to examples of how the form and content of metadata can have a big impact on whether results can be easily queried, pivoted, and reported. It is also hoped that their experiences might provide an opportunity for reflection to data management practitioners operating in established environments.
    Payne et al. Drugs for bad bugs: confronting the challenges of antibacterial discovery. 2007 Nat Rev Drug Discov
    Vol. 6(1), pp. 29-40 
    article DOI URL 
    Payne et al. Genomic approaches to antibacterial discovery. 2004 Methods Mol Biol
    Vol. 266, pp. 231-259 
    article DOI URL 
    Abstract: This chapter describes two key strategies for the discovery of new antibacterial agents and illustrates the critical role played by genomics in each. The first approach is genomic target-based screening. Comparative genomics and bioinformatics are used to identify novel, selective antibacterial targets of the appropriate antibacterial spectrum. Genetic technologies integral for the success of this approach, such as essentiality testing, are also described. An unprecedented number of novel targets have been discovered via this approach, and a plethora of examples are discussed. This section concludes with the case history of a target successfully progressed from identification by genomics, to high-throughput screening, and onto proof of concept in curing experimental infections. The second approach is based on screening for compounds with antibacterial activity and then employing a broad variety of newer technologies to identify the molecular target of the antibacterial agent. The advantage of this approach is that compounds already possess antibacterial activity, which is a property often challenging to engineer into molecules obtained from enzyme-based screening approaches. The recent development of novel biochemical and genomic technologies that facilitate identification and characterization of the mode of action of these agents has made this approach as attractive as the genomic target-based screening strategy.
    Pearson and Lipman Improved tools for biological sequence comparison. 1988 Proc Natl Acad Sci U S A
    Vol. 85(8), pp. 2444-2448 
    article  
    Abstract: We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
    Pei et al. PCMA: fast and accurate multiple sequence alignment based on profile consistency. 2003 Bioinformatics
    Vol. 19(3), pp. 427-428 
    article  
    Abstract: PCMA (profile consistency multiple sequence alignment) is a progressive multiple sequence alignment program that combines two different alignment strategies. Highly similar sequences are aligned in a fast way as in ClustalW, forming pre-aligned groups. The T-Coffee strategy is applied to align the relatively divergent groups based on profile-profile comparison and consistency. The scoring function for local alignments of pre-aligned groups is based on a novel profile-profile comparison method that is a generalization of the PSI-BLAST approach to profile-sequence comparison. PCMA balances speed and accuracy in a flexible way and is suitable for aligning large numbers of sequences. AVAILABILITY: PCMA is freely available for non-commercial use. Pre-compiled versions for several platforms can be downloaded from ftp://iole.swmed.edu/pub/PCMA/.
    Pfaller et al. Bacterial pathogens isolated from patients with bloodstream infection: frequencies of occurrence and antimicrobial susceptibility patterns from the SENTRY antimicrobial surveillance program (United States and Canada, 1997). 1998 Antimicrob Agents Chemother
    Vol. 42(7), pp. 1762-1770 
    article  
    Abstract: The SENTRY Program was established in January 1997 to measure the predominant pathogens and antimicrobial resistance patterns of nosocomial and community-acquired infections over a broad network of sentinel hospitals in the United States (30 sites), Canada (8 sites), South America (10 sites), and Europe (24 sites). During the first 6-month study period (January to June 1997), a total of 5,058 bloodstream infections (BSI) were reported by North American SENTRY participants (4,119 from the United States and 939 from Canada). In both the United States and Canada, Staphylococcus aureus and Escherichia coli were the most common BSI isolates, followed by coagulase-negative staphylococci and enterococci. Klebsiella spp., Enterobacter spp., Pseudomonas aeruginosa, Streptococcus pneumoniae, and beta-hemolytic streptococci were also among the 10 most frequently reported species in both the United States and Canada. Although the rank orders of pathogens in the United States and Canada were similar, distinct differences were noted in the antimicrobial susceptibilities of several pathogens. Overall, U.S. isolates were considerably more resistant than those from Canada. The differences in the proportions of oxacillin-resistant S. aureus isolates (26.2 versus 2.7% for U.S. and Canadian isolates, respectively), vancomycin-resistant enterococcal isolates (17.7 versus 0% for U.S. and Canadian isolates, respectively), and ceftazidime-resistant Enterobacter sp. isolates (30.6 versus 6.2% for U.S. and Canadian isolates, respectively) dramatically emphasize the relative lack of specific antimicrobial resistance genes (mecA, vanA, and vanB) in the Canadian microbial population. Among U.S. isolates, resistance to oxacillin among staphylococci, to vancomycin among enterococci, to penicillin among pneumococci, and to ceftazidime among Enterobacter spp. was observed in both nosocomial and community-acquired pathogens, although in almost every instance the proportion of resistant strains was higher among nosocomial isolates. Antimicrobial resistance continues to increase, and ongoing surveillance of microbial pathogens and resistance profiles is essential on national and international scales.
    Prakhov et al. VSDocker: a tool for parallel high-throughput virtual screening using AutoDock on Windows-based computer clusters. 2010 Bioinformatics
    Vol. 26(10), pp. 1374-1375 
    article DOI URL 
    Abstract: SUMMARY: VSDocker is an original program that allows using AutoDock4 for optimized virtual ligand screening on computer clusters or multiprocessor workstations. This tool is the first implementation of parallel high-performance virtual screening of ligands for MS Windows-based computer systems. AVAILABILITY: VSDocker 2.0 is freely available for non-commercial use at http://www.bio.nnov.ru/projects/vsdocker2/ CONTACT: nikita.prakhov@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Projan Whither antibacterial drug discovery? 2008 Drug Discov Today
    Vol. 13(7-8), pp. 279-280 
    article DOI URL 
    Projan Why is big Pharma getting out of antibacterial drug discovery? 2003 Curr Opin Microbiol
    Vol. 6(5), pp. 427-430 
    article  
    Abstract: Since the advent of the antibiotic era in the late 1940s drug discovery and development has evolved into an expensive, time consuming, cumbersome and bureaucratic process involving multiple interest groups such as pharmaceutical manufacturers, governmental regulatory authorities, patent officers, academic and clinical researchers and trial lawyers. It would seem that the least involved among the interest groups are the consumers of health care themselves. Politicians and the public alike complain loudly about drug prices although fewer and fewer new therapies are being developed. The cost and complexities of drug discovery and development have shifted the investment equation away from the development of drugs targeting short course therapies for acute diseases and towards long-term treatment of chronic conditions. Coupled with the failure of large investments into target-based approaches to produce novel antibacterial agents, companies large and small have exited from this field despite a growing clinical need.
    Pucci Use of genomics to select antibacterial targets. 2006 Biochem Pharmacol
    Vol. 71(7), pp. 1066-1072 
    article DOI URL 
    Abstract: The problem of antibiotic resistance has eroded the usefulness of our arsenal of effective antibiotics. There is a need for new strategies to discover and develop new, effective drugs. The advent of the microbial genomics era has provided a wealth of information on a variety of microorganisms. This has allowed the identification and/or validation of a number of gene products that could serve as targets for the discovery of novel antibacterial agents. New genetic techniques and approaches have arisen in an attempt to exploit this newly available genomic data. Both random and targeted gene disruption efforts have proven effective in this process. Many of these methods would have been difficult to accomplish without DNA sequence and bioinformatics analyses. Several targets have been selected to further characterize and screen for inhibitors and one has yielded two clinical candidates.
    Rasko and Sperandio Anti-virulence strategies to combat bacteria-mediated disease. 2010 Nat Rev Drug Discov
    Vol. 9(2), pp. 117-128 
    article DOI URL 
    Abstract: Antibiotic resistance is one of the greatest challenges of the twenty-first century. However, the increasing understanding of bacterial pathogenesis and intercellular communication has revealed many potential strategies to develop novel drugs to treat bacteria-mediated disease. Interference with bacterial virulence and/or cell-to-cell signalling pathways is an especially compelling approach, as it is thought to apply less selective pressure for the development of bacterial resistance than traditional strategies, which are aimed at killing bacteria or preventing their growth. Here, we discuss the mechanisms of bacterial virulence and present promising anti-virulence strategies and compounds for the future treatment of bacterial infections.
    Rausch et al. Segment-based multiple sequence alignment. 2008 Bioinformatics
    Vol. 24(16), pp. i187-i192 
    article DOI URL 
    Abstract: MOTIVATION: Many multiple sequence alignment tools have been developed in the past, progressing either in speed or alignment accuracy. Given the importance and wide-spread use of alignment tools, progress in both categories is a contribution to the community and has driven research in the field so far. RESULTS: We introduce a graph-based extension to the consistency-based, progressive alignment strategy. We apply the consistency notion to segments instead of single characters. The main problem we solve in this context is to define segments of the sequences in such a way that a graph-based alignment is possible. We implemented the algorithm using the SeqAn library and report results on amino acid and DNA sequences. The benefit of our approach is threefold: (1) sequences with conserved blocks can be rapidly aligned, (2) the implementation is conceptually easy, generic and fast and (3) the consistency idea can be extended to align multiple genomic sequences. AVAILABILITY: The segment-based multiple sequence alignment tool can be downloaded from http://www.seqan.de/projects/msa.html. A novel version of T-Coffee interfaced with the tool is available from http://www.tcoffee.org. The usage of the tool is described in both documentations.
    Rost Twilight zone of protein sequence alignments. 1999 Protein Eng
    Vol. 12(2), pp. 85-94 
    article  
    Abstract: Sequence alignments unambiguously distinguish between protein pairs of similar and non-similar structure when the pairwise sequence identity is high (>40% for long alignments). The signal gets blurred in the twilight zone of 20-35% sequence identity. Here, more than a million sequence alignments were analysed between protein pairs of known structures to re-define a line distinguishing between true and false positives for low levels of similarity. Four results stood out. (i) The transition from the safe zone of sequence alignment into the twilight zone is described by an explosion of false negatives. More than 95% of all pairs detected in the twilight zone had different structures. More precisely, above a cut-off roughly corresponding to 30% sequence identity, 90% of the pairs were homologous; below 25% less than 10% were. (ii) Whether or not sequence homology implied structural identity depended crucially on the alignment length. For example, if 10 residues were similar in an alignment of length 16 (>60%), structural similarity could not be inferred. (iii) The 'more similar than identical' rule (discarding all pairs for which percentage similarity was lower than percentage identity) reduced false positives significantly. (iv) Using intermediate sequences for finding links between more distant families was almost as successful: pairs were predicted to be homologous when the respective sequence families had proteins in common. All findings are applicable to automatic database searches.
    Ruzheinikov et al. Substrate-induced conformational changes in Bacillus subtilis glutamate racemase and their implications for drug discovery. 2005 Structure
    Vol. 13(11), pp. 1707-1713 
    article DOI URL 
    Abstract: D-glutamate is an essential building block of the peptidoglycan layer in bacterial cell walls and can be synthesized from L-glutamate by glutamate racemase (RacE). The structure of a complex of B. subtilis RacE with D-glutamate reveals that the glutamate is buried in a deep pocket, whose formation at the interface of the enzyme's two domains involves a large-scale conformational rearrangement. These domains are related by pseudo-2-fold symmetry, which superimposes the two catalytic cysteine residues, which are located at equivalent positions on either side of the alpha carbon of the substrate. The structural similarity of these two domains suggests that the racemase activity of RacE arose as a result of gene duplication. The structure of the complex is dramatically different from that proposed previously and provides new insights into the RacE mechanism and an explanation for the potency of a family of RacE inhibitors, which have been developed as novel antibiotics.
    Sanner A component-based software environment for visualizing large macromolecular assemblies. 2005 Structure
    Vol. 13(3), pp. 447-462 
    article DOI URL 
    Abstract: The interactive visualization of large biological assemblies poses a number of challenging problems, including the development of multiresolution representations and new interaction methods for navigating and analyzing these complex systems. An additional challenge is the development of flexible software environments that will facilitate the integration and interoperation of computational models and techniques from a wide variety of scientific disciplines. In this paper, we present a component-based software development strategy centered on the high-level, object-oriented, interpretive programming language: Python. We present several software components, discuss their integration, and describe some of their features that are relevant to the visualization of large molecular assemblies. Several examples are given to illustrate the interoperation of these software components and the integration of structural data from a variety of experimental sources. These examples illustrate how combining visual programming with component-based software development facilitates the rapid prototyping of novel visualization tools.
    Sanner Python: a programming language for software integration and development. 1999 J Mol Graph Model
    Vol. 17(1), pp. 57-61 
    article  
    Schneider Virtual screening: an endless staircase? 2010 Nat Rev Drug Discov
    Vol. 9(4), pp. 273-276 
    article DOI URL 
    Abstract: Computational chemistry--in particular, virtual screening--can provide valuable contributions in hit- and lead-compound discovery. Numerous software tools have been developed for this purpose. However, despite the applicability of virtual screening technology being well established, it seems that there are relatively few examples of drug discovery projects in which virtual screening has been the key contributor. Has virtual screening reached its peak? If not, what aspects are limiting its potential at present, and how can significant progress be made in the future?
    Schneider and Sahl An oldie but a goodie - cell wall biosynthesis as antibiotic target pathway. 2010 Int J Med Microbiol
    Vol. 300(2-3), pp. 161-169 
    article DOI URL 
    Abstract: Bacterial cell wall biosynthesis represents the target pathway for penicillin, the first antibiotic that was clinically applied on a large scale. Penicillin, by means of its beta-lactam ring, inhibits a number of enzymes which participate in inserting monomeric cell wall building blocks into the cell wall polymer and which have been termed penicillin-binding proteins (PBPs). Ever since the introduction of penicillin, hundreds of beta-lactam antibiotics have been developed and details of their molecular activities elaborated. Meanwhile, various additional classes of antibiotics have been described, which inhibit the same pathway, yet use target molecules others than the PBPs. Such classes include the glycopeptide antibiotics, lipopeptide and lipodepsipeptide antibiotics, the lantibiotics and various other natural product antibiotics with comparatively complex structures. They usually target the membrane-bound steps of the biosynthesis pathway and the highly conserved lipid-bound intermediates of the building block such as lipid II, which represents a particular "Achilles' heel" for antibiotic attack. With in-depth analysis of the activity of more recently identified inhibitors and with the availability of novel techniques for studying prokaryotic cell biology, new insights were obtained into the molecular organisation of the cell wall biosynthesis machinery and its interconnections with other vital cellular processes such as cell division. This, in turn, provides hints for new targets to be exploited and for the development of novel cell wall biosynthesis inhibitors.
    Shoichet Virtual screening of chemical libraries. 2004 Nature
    Vol. 432(7019), pp. 862-865 
    article DOI URL 
    Abstract: Virtual screening uses computer-based methods to discover new ligands on the basis of biological structures. Although widely heralded in the 1970s and 1980s, the technique has since struggled to meet its initial promise, and drug discovery remains dominated by empirical screening. Recent successes in predicting new ligands and their receptor-bound structures, and better rates of ligand discovery compared to empirical screening, have re-ignited interest in virtual screening, which is now widely used in drug discovery, albeit on a more limited scale than empirical screening.
    Silver Does the cell wall of bacteria remain a viable source of targets for novel antibiotics? 2006 Biochem Pharmacol
    Vol. 71(7), pp. 996-1005 
    article DOI URL 
    Abstract: Whether the bacterial cell wall remains a viable source of novel antibacterials is addressed here by reviewing screen and design strategies for discovery of antibacterials with a focus on their output. Inhibitors for which antibacterial activity has been shown to be due to specific inhibition of a reaction (antibacterially validated inhibitors) are known for 8 of the 14 conserved essential steps of the pathway. Antibacterially validated enzyme inhibitors exist for six of these steps. The possible obstacles to finding validated inhibitors of the remaining enzymes are discussed and some strategies are suggested.
    Silver Novel inhibitors of bacterial cell wall synthesis. 2003 Curr Opin Microbiol
    Vol. 6(5), pp. 431-438 
    article  
    Abstract: Over the past forty years, efforts to discover antibacterials have yielded a wide variety of chemical structures, almost exclusively natural products, which inhibit many steps in cell wall synthesis. Although screening for new cell wall inhibitors has been continuous during that period, there have been few reports of new drugs. With the advent of genomics, high resolution X-ray crystallography and the recognition of the need for new antibiotics to combat resistant organisms, there has been a resurgence in interest in this validated target area.
    Smith and Waterman Identification of common molecular subsequences. 1981 J Mol Biol
    Vol. 147(1), pp. 195-197 
    article  
    Song et al. Recent advances in computer-aided drug design. 2009 Brief Bioinform
    Vol. 10(5), pp. 579-591 
    article DOI URL 
    Abstract: Modern drug discovery is characterized by the production of vast quantities of compounds and the need to examine these huge libraries in short periods of time. The need to store, manage and analyze these rapidly increasing resources has given rise to the field known as computer-aided drug design (CADD). CADD represents computational methods and resources that are used to facilitate the design and discovery of new therapeutic solutions. Digital repositories, containing detailed information on drugs and other useful compounds, are goldmines for the study of chemical reactions capabilities. Design libraries, with the potential to generate molecular variants in their entirety, allow the selection and sampling of chemical compounds with diverse characteristics. Fold recognition, for studying sequence-structure homology between protein sequences and structures, are helpful for inferring binding sites and molecular functions. Virtual screening, the in silico analog of high-throughput screening, offers great promise for systematic evaluation of huge chemical libraries to identify potential lead candidates that can be synthesized and tested. In this article, we present an overview of the most important data sources and computational methods for the discovery of new molecular entities. The workflow of the entire virtual screening campaign is discussed, from data collection through to post-screening analysis.
    Song and Ko Detection of essential genes in Streptococcus pneumoniae using bioinformatics and allelic replacement mutagenesis. 2008 Methods Mol Biol
    Vol. 416, pp. 401-408 
    article DOI URL 
    Abstract: Although the emergence and spread of antimicrobial resistance in major bacterial pathogens for the past decades poses a growing challenge to public health, discovery of novel antimicrobial agents from natural products or modification of existing antibiotics cannot circumvent the problem of antimicrobial resistance. The recent development of bacterial genomics and the availability of genome sequences allow the identification of potentially novel antimicrobial agents. The cellular targets of new antimicrobial agents must be essential for the growth, replication, or survival of the bacterium. Conserved genes among different bacterial genomes often turn out to be essential (1, 2). Thus, the combination of comparative genomics and the gene knock-out procedure can provide effective ways to identify the essential genes of bacterial pathogens (3). Identification of essential genes in bacteria may be utilized for the development of new antimicrobial agents because common essential genes in diverse pathogens could constitute novel targets for broad-spectrum antimicrobial agents.
    Song et al. Identification of essential genes in Streptococcus pneumoniae by allelic replacement mutagenesis. 2005 Mol Cells
    Vol. 19(3), pp. 365-374 
    article  
    Abstract: To find potential targets of novel antimicrobial agents, we identified essential genes of Streptococcus pneumoniae using comparative genomics and allelic replacement mutagenesis. We compared the genome of S. pneumoniae R6 with those of Bacillus subtilis, Enterococcus faecalis, Escherichia coli, and Staphylococcus aureus, and selected 693 candidate target genes with > 40% amino acid sequence identity to the corresponding genes in at least two of the other species. The 693 genes were disrupted and 133 were found to be essential for growth. Of these, 32 encoded proteins of unknown function, and we were able to identify orthologues of 22 of these genes by genomic comparisons. The experimental method used in this study is easy to perform, rapid and efficient for identifying essential genes of bacterial pathogens.
    Sousa et al. Protein-ligand docking: current status and future challenges. 2006 Proteins
    Vol. 65(1), pp. 15-26 
    article DOI URL 
    Abstract: Understanding the ruling principles whereby protein receptors recognize, interact, and associate with molecular substrates and inhibitors is of paramount importance in drug discovery efforts. Protein-ligand docking aims to predict and rank the structure(s) arising from the association between a given ligand and a target protein of known 3D structure. Despite the breathtaking advances in the field over the last decades and the widespread application of docking methods, several downsides still exist. In particular, protein flexibility--a critical aspect for a thorough understanding of the principles that guide ligand binding in proteins--is a major hurdle in current protein-ligand docking efforts that needs to be more efficiently accounted for. In this review the key concepts of protein-ligand docking methods are outlined, with major emphasis being given to the general strengths and weaknesses that presently characterize this methodology. Despite the size of the field, the principal types of search algorithms and scoring functions are reviewed and the most popular docking tools are briefly depicted. Recent advances that aim to address some of the traditional limitations associated with molecular docking are also described. A selection of hand-picked examples is used to illustrate these features.
    Subramanian et al. DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. 2008 Algorithms Mol Biol
    Vol. 3, pp. 6 
    article DOI URL 
    Abstract: BACKGROUND: DIALIGN-T is a reimplementation of the multiple-alignment program DIALIGN. Due to several algorithmic improvements, it produces significantly better alignments on locally and globally related sequence sets than previous versions of DIALIGN. However, like the original implementation of the program, DIALIGN-T uses a straight-forward greedy approach to assemble multiple alignments from local pairwise sequence similarities. Such greedy approaches may be vulnerable to spurious random similarities and can therefore lead to suboptimal results. In this paper, we present DIALIGN-TX, a substantial improvement of DIALIGN-T that combines our previous greedy algorithm with a progressive alignment approach. RESULTS: Our new heuristic produces significantly better alignments, especially on globally related sequences, without increasing the CPU time and memory consumption exceedingly. The new method is based on a guide tree; to detect possible spurious sequence similarities, it employs a vertex-cover approximation on a conflict graph. We performed benchmarking tests on a large set of nucleic acid and protein sequences. For protein benchmarks we used the benchmark database BALIBASE 3 and an updated release of the database IRMBASE 2 for assessing the quality on globally and locally related sequences, respectively. For alignment of nucleic acid sequences, we used BRAliBase II for global alignment and a newly developed database of locally related sequences called DIRM-BASE 1. IRMBASE 2 and DIRMBASE 1 are constructed by implanting highly conserved motives at random positions in long unalignable sequences. CONCLUSION: On BALIBASE3, our new program performs significantly better than the previous program DIALIGN-T and outperforms the popular global aligner CLUSTAL W, though it is still outperformed by programs that focus on global alignment like MAFFT, MUSCLE and T-COFFEE. On the locally related test sets in IRMBASE 2 and DIRM-BASE 1, our method outperforms all other programs while MAFFT E-INSi is the only method that comes close to the performance of DIALIGN-TX.
    Subramanian et al. DIALIGN-T: an improved algorithm for segment-based multiple sequence alignment. 2005 BMC Bioinformatics
    Vol. 6, pp. 66 
    article DOI URL 
    Abstract: BACKGROUND: We present a complete re-implementation of the segment-based approach to multiple protein alignment that contains a number of improvements compared to the previous version 2.2 of DIALIGN. This previous version is superior to Needleman-Wunsch-based multi-alignment programs on locally related sequence sets. However, it is often outperformed by these methods on data sets with global but weak similarity at the primary-sequence level. RESULTS: In the present paper, we discuss strengths and weaknesses of DIALIGN in view of the underlying objective function. Based on these results, we propose several heuristics to improve the segment-based alignment approach. For pairwise alignment, we implemented a fragment-chaining algorithm that favours chains of low-scoring local alignments over isolated high-scoring fragments. For multiple alignment, we use an improved greedy procedure that is less sensitive to spurious local sequence similarities. To evaluate our method on globally related protein families, we used the well-known database BAliBASE. For benchmarking tests on locally related sequences, we created a new reference database called IRMBASE which consists of simulated conserved motifs implanted into non-related random sequences. CONCLUSION: On BAliBASE, our new program performs significantly better than the previous version of DIALIGN and is comparable to the standard global aligner CLUSTAL W, though it is outperformed by some newly developed programs that focus on global alignment. On the locally related test sets in IRMBASE, our method outperforms all other programs that we evaluated.
    Tatusov et al. Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks. 1994 Proc Natl Acad Sci U S A
    Vol. 91(25), pp. 12091-12095 
    article  
    Abstract: We describe an approach to analyzing protein sequence databases that, starting from a single uncharacterized sequence or group of related sequences, generates blocks of conserved segments. The procedure involves iterative database scans with an evolving position-dependent weight matrix constructed from a coevolving set of aligned conserved segments. For each iteration, the expected distribution of matrix scores under a random model is used to set a cutoff score for the inclusion of a segment in the next iteration. This cutoff may be calculated to allow the chance inclusion of either a fixed number or a fixed proportion of false positive segments. With sufficiently high cutoff scores, the procedure converged for all alignment blocks studied, with varying numbers of iterations required. Different methods for calculating weight matrices from alignment blocks were compared. The most effective of those tested was a logarithm-of-odds, Bayesian-based approach that used prior residue probabilities calculated from a mixture of Dirichlet distributions. The procedure described was used to detect novel conserved motifs of potential biological importance.
    Tenover Mechanisms of antimicrobial resistance in bacteria. 2006 Am J Med
    Vol. 119(6 Suppl 1), pp. S3-10; discussion S62-70 
    article DOI URL 
    Abstract: The treatment of bacterial infections is increasingly complicated by the ability of bacteria to develop resistance to antimicrobial agents. Antimicrobial agents are often categorized according to their principal mechanism of action. Mechanisms include interference with cell wall synthesis (e.g., beta-lactams and glycopeptide agents), inhibition of protein synthesis (macrolides and tetracyclines), interference with nucleic acid synthesis (fluoroquinolones and rifampin), inhibition of a metabolic pathway (trimethoprim-sulfamethoxazole), and disruption of bacterial membrane structure (polymyxins and daptomycin). Bacteria may be intrinsically resistant to ≥1 class of antimicrobial agents, or may acquire resistance by de novo mutation or via the acquisition of resistance genes from other organisms. Acquired resistance genes may enable a bacterium to produce enzymes that destroy the antibacterial drug, to express efflux systems that prevent the drug from reaching its intracellular target, to modify the drug's target site, or to produce an alternative metabolic pathway that bypasses the action of the drug. Acquisition of new genetic material by antimicrobial-susceptible bacteria from resistant strains of bacteria may occur through conjugation, transformation, or transduction, with transposons often facilitating the incorporation of the multiple resistance genes into the host's genome or plasmids. Use of antibacterial agents creates selective pressure for the emergence of resistant strains. Herein 3 case histories--one involving Escherichia coli resistance to third-generation cephalosporins, another focusing on the emergence of vancomycin-resistant Staphylococcus aureus, and a third detailing multidrug resistance in Pseudomonas aeruginosa--are reviewed to illustrate the varied ways in which resistant bacteria develop.
    Tenover et al. Characterization of a strain of community-associated methicillin-resistant Staphylococcus aureus widely disseminated in the United States. 2006 J Clin Microbiol
    Vol. 44(1), pp. 108-118 
    article DOI URL 
    Tenover et al. Vancomycin-resistant Staphylococcus aureus isolate from a patient in Pennsylvania. 2004 Antimicrob Agents Chemother
    Vol. 48(1), pp. 275-280 
    article  
    Abstract: A vancomycin-resistant Staphylococcus aureus (VRSA) isolate was obtained from a patient in Pennsylvania in September 2002. Species identification was confirmed by standard biochemical tests and analysis of 16S ribosomal DNA, gyrA, and gyrB sequences; all of the results were consistent with the S. aureus identification. The MICs of a variety of antimicrobial agents were determined by broth microdilution and macrodilution methods following National Committee for Clinical Laboratory Standards (NCCLS) guidelines. The isolate was resistant to vancomycin (MIC = 32 µg/ml), aminoglycosides, beta-lactams, fluoroquinolones, macrolides, and tetracycline, but it was susceptible to linezolid, minocycline, quinupristin-dalfopristin, rifampin, teicoplanin, and trimethoprim-sulfamethoxazole. The isolate, which was originally detected by using disk diffusion and a vancomycin agar screen plate, was vancomycin susceptible by automated susceptibility testing methods. Pulsed-field gel electrophoresis (PFGE) of SmaI-digested genomic DNA indicated that the isolate belonged to the USA100 lineage (also known as the New York/Japan clone), the most common staphylococcal PFGE type found in hospitals in the United States. The VRSA isolate contained two plasmids of 120 and 4 kb and was positive for mecA and vanA by PCR amplification. The vanA sequence was identical to the vanA sequence present in Tn1546. A DNA probe for vanA hybridized to the 120-kb plasmid. This is the second VRSA isolate reported in the United States.
    Thain et al. Distributed computing in practice: the Condor experience. 2005 Concurrency - Practice and Experience
    Vol. 17, pp. 323-356 
    article  
    Abstract: Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational grid. In this chapter, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course traveled by research ideas as they grow into production systems.
    Thanassi et al. Identification of 113 conserved essential genes using a high-throughput gene disruption system in Streptococcus pneumoniae. 2002 Nucleic Acids Res
    Vol. 30(14), pp. 3152-3162 
    article  
    Abstract: The recent availability of bacterial genome sequence information permits the identification of conserved genes that are potential targets for novel antibiotic drug discovery. Using a coupled bioinformatic/experimental approach, a list of candidate conserved genes was generated using a Microbial Concordance bioinformatics tool followed by a targeted disruption campaign. Pneumococcal sequence data allowed for the design of precise PCR primers to clone the desired gene target fragments into the pEVP3 'suicide vector'. An insertion-duplication approach was employed that used the pEVP3 constructs and resulted in the introduction of a selectable chloramphenicol resistance marker into the chromosome. In the case of non-essential genes, cells can survive the disruption and form chloramphenicol-resistant colonies. A total of 347 candidate reading frames were subjected to disruption analysis, with 113 presumed to be essential due to lack of recovery of antibiotic-resistant colonies. In addition to essentiality determination, the same high-throughput methodology was used to overexpress gene products and to examine possible polarity effects for all essential genes.
    Thompson et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. 1994 Nucleic Acids Res
    Vol. 22(22), pp. 4673-4680 
    article  
    Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W which is freely available.
    Tinsley et al. Bacteriophages and pathogenicity: more than just providing a toxin? 2006 Microbes Infect
    Vol. 8(5), pp. 1365-1371 
    article DOI URL 
    Abstract: An increasing number of pathogenicity factors carried by bacteriophages have been discovered. This review considers bacteriophage-bacterium interaction and its relation to disease processes. We discuss the search for new bacteriophage-associated pathogenicity factors, with emphasis on recent advances brought by the use of genomic sequence data and the techniques of genomic epidemiology.
    Trott and Olson AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. 2010 J Comput Chem
    Vol. 31(2), pp. 455-461 
    article DOI URL 
    Abstract: AutoDock Vina, a new program for molecular docking and virtual screening, is presented. AutoDock Vina achieves an approximately two orders of magnitude speed-up compared with the molecular docking software previously developed in our lab (AutoDock 4), while also significantly improving the accuracy of the binding mode predictions, judging by our tests on the training set used in AutoDock 4 development. Further speed-up is achieved from parallelism, by using multithreading on multicore machines. AutoDock Vina automatically calculates the grid maps and clusters the results in a way transparent to the user.
    Uchiyama MBGD: a platform for microbial comparative genomics based on the automated construction of orthologous groups. 2007 Nucleic Acids Res
    Vol. 35(Database issue), pp. D343-D346 
    article DOI URL 
    Abstract: The microbial genome database for comparative analysis (MBGD) is a comprehensive platform for microbial comparative genomics. The central function of MBGD is to create orthologous groups among multiple genomes from precomputed all-against-all similarity relationships using the DomClust algorithm. The database now contains >300 published genomes and the number continues to grow. For researchers who are interested in ongoing genome projects, we have now started a new service called 'My MBGD,' which allows users to add their own genome sequences to MBGD for the purpose of identifying orthologs among both the new and the existing genomes. Furthermore, in order to make available the rapidly accumulating information on closely related genome sequences, we enhanced the interface for pairwise genome comparisons using the CGAT interface, which allows users to see nucleotide sequence alignments of non-coding as well as coding regions. MBGD is available at http://mbgd.genome.ad.jp/.
    Uchiyama MBGD: microbial genome database for comparative analysis. 2003 Nucleic Acids Res
    Vol. 31(1), pp. 58-62 
    article  
    Abstract: MBGD is a workbench system for comparative analysis of completely sequenced microbial genomes. The central function of MBGD is to create an orthologous gene classification table using precomputed all-against-all similarity relationships among genes in multiple genomes. In MBGD, an automated classification algorithm has been implemented so that users can create their own classification table by specifying a set of organisms and parameters. This feature is especially useful when the user's interest is focused on some taxonomically related organisms. The created classification table is stored into the database and can be explored combining with the data of individual genomes as well as similarity relationships among genomes. Using these data, users can carry out comparative analyses from various points of view, such as phylogenetic pattern analysis, gene order comparison and detailed gene structure comparison. MBGD is accessible at http://mbgd.genome.ad.jp/.
    Uchiyama et al. MBGD update 2010: toward a comprehensive resource for exploring microbial genome diversity. 2010 Nucleic Acids Res
    Vol. 38(Database issue), pp. D361-D365 
    article DOI URL 
    Abstract: The microbial genome database (MBGD) for comparative analysis is a platform for microbial comparative genomics based on automated ortholog group identification. A prominent feature of MBGD is that it allows users to create ortholog groups using a specified subgroup of organisms. The database is constantly updated and now contains almost 1000 genomes. To utilize the MBGD database as a comprehensive resource for investigating microbial genome diversity, we have developed the following advanced functionalities: (i) enhanced assignment of functional annotation, including external database links to each orthologous group, (ii) interface for choosing a set of genomes to compare based on phenotypic properties, (iii) the addition of more eukaryotic microbial genomes (fungi and protists) and some higher eukaryotes as references and (iv) enhancement of the MyMBGD mode, which allows users to add their own genomes to MBGD and now accepts raw genomic sequences without any annotation (in such a case, it runs a gene-finding procedure before identifying the orthologs). Some analysis functions, such as the function to find orthologs with similar phylogenetic patterns, have also been improved. MBGD is accessible at http://mbgd.genome.ad.jp/.
    Walsh Where will new antibiotics come from? 2003 Nat Rev Microbiol
    Vol. 1(1), pp. 65-70 
    article DOI URL 
    Abstract: There is a constant need for new antibacterial drugs owing to the inevitable development of resistance that follows the introduction of antibiotics to the clinic. When a new class of antibiotic is introduced, it is effective at first, but will eventually select for survival of the small fraction of bacterial populations that have an intrinsic or acquired resistance mechanism. Pathogens that are resistant to multiple drugs emerge around the globe, so how robust are antibiotic discovery processes?
    Walsh Molecular mechanisms that confer antibacterial drug resistance. 2000 Nature
    Vol. 406(6797), pp. 775-781 
    article DOI URL 
    Abstract: Antibiotics--compounds that are literally 'against life'--are typically antibacterial drugs, interfering with some structure or process that is essential to bacterial growth or survival without harm to the eukaryotic host harbouring the infecting bacteria. We live in an era when antibiotic resistance has spread at an alarming rate and when dire predictions concerning the lack of effective antibacterial drugs occur with increasing frequency. In this context it is apposite to ask a few simple questions about these life-saving molecules. What are antibiotics? Where do they come from? How do they work? Why do they stop being effective? How do we find new antibiotics? And can we slow down the development of antibiotic-resistant superbugs?
    Warren et al. A critical assessment of docking programs and scoring functions. 2006 J Med Chem
    Vol. 49(20), pp. 5912-5931 
    article DOI URL 
    Abstract: Docking is a computational technique that samples conformations of small molecules in protein binding sites; scoring functions are used to assess which of these conformations best complements the protein binding site. An evaluation of 10 docking programs and 37 scoring functions was conducted against eight proteins of seven protein types for three tasks: binding mode prediction, virtual screening for lead identification, and rank-ordering by affinity for lead optimization. All of the docking programs were able to generate ligand conformations similar to crystallographically determined protein/ligand complex structures for at least one of the targets. However, scoring functions were less successful at distinguishing the crystallographic conformation from the set of docked poses. Docking programs identified active compounds from a pharmaceutically relevant pool of decoy compounds; however, no single program performed well for all of the targets. For prediction of compound affinity, none of the docking programs or scoring functions made a useful prediction of ligand binding affinity.
    Waterhouse et al. Jalview Version 2--a multiple sequence alignment editor and analysis workbench. 2009 Bioinformatics
    Vol. 25(9), pp. 1189-1191 
    article DOI URL 
    Abstract: Jalview Version 2 is a system for interactive WYSIWYG editing, analysis and annotation of multiple sequence alignments. Core features include keyboard and mouse-based editing, multiple views and alignment overviews, and linked structure display with Jmol. Jalview 2 is available in two forms: a lightweight Java applet for use in web applications, and a powerful desktop application that employs web services for sequence alignment, secondary structure prediction and the retrieval of alignments, sequences, annotation and structures from public databases and any DAS 1.53 compliant sequence or annotation server. Availability: The Jalview 2 Desktop application and JalviewLite applet are made freely available under the GPL, and can be downloaded from www.jalview.org.
    Weigel et al. Genetic analysis of a high-level vancomycin-resistant isolate of Staphylococcus aureus. 2003 Science
    Vol. 302(5650), pp. 1569-1571 
    article DOI URL 
    Abstract: Vancomycin is usually reserved for treatment of serious infections, including those caused by multidrug-resistant Staphylococcus aureus. A clinical isolate of S. aureus with high-level resistance to vancomycin (minimal inhibitory concentration = 1024 µg/ml) was isolated in June 2002. This isolate harbored a 57.9-kilobase multiresistance conjugative plasmid within which Tn1546 (vanA) was integrated. Additional elements on the plasmid encoded resistance to trimethoprim (dfrA), beta-lactams (blaZ), aminoglycosides (aacA-aphD), and disinfectants (qacC). Genetic analyses suggest that the long-anticipated transfer of vancomycin resistance to a methicillin-resistant S. aureus occurred in vivo by interspecies transfer of Tn1546 from a co-isolate of Enterococcus faecalis.
    Wiley Genomics in the real world. 1998 Curr Pharm Des
    Vol. 4(5), pp. 417-422 
    article  
    Abstract: The term genomics has evolved into a catch-all term for a variety of information intensive biological methodologies. While the promise of genomics in the bio/pharmaceutical industry is great, its impact on the drug discovery pipeline has not yet been realized, excluding a few notable exceptions. As companies acquire several years of experience in working with genomic data, it is likely that the impact on the discovery process will slowly emerge as we learn to integrate these new technologies into individual discovery programs. It is clear that extracting novel biologically valid targets from exponentially growing amounts of sequence data requires time and considerable investment in biological research infrastructure. In order to accelerate the process of target validation, a variety of functional genomics technologies are also being developed to try to predict the effect of inhibitory compounds in advance of development. Resources spent on early stage exploratory efforts such as these can pay off by improving the success rate for screening and medicinal chemistry.
    Yoon et al. A computational approach for identifying pathogenicity islands in prokaryotic genomes. 2005 BMC Bioinformatics
    Vol. 6, pp. 184 
    article DOI URL 
    Abstract: BACKGROUND: Pathogenicity islands (PAIs), distinct genomic segments of pathogens encoding virulence factors, represent a subgroup of genomic islands (GIs) that have been acquired by horizontal gene transfer event. Up to now, computational approaches for identifying PAIs have been focused on the detection of genomic regions which only differ from the rest of the genome in their base composition and codon usage. These approaches often lead to the identification of genomic islands, rather than PAIs. RESULTS: We present a computational method for detecting potential PAIs in complete prokaryotic genomes by combining sequence similarities and abnormalities in genomic composition. We first collected 207 GenBank accessions containing either part or all of the reported PAI loci. In sequenced genomes, strips of PAI-homologs were defined based on the proximity of the homologs of genes in the same PAI accession. An algorithm reminiscent of sequence-assembly procedure was then devised to merge overlapping or adjacent genomic strips into a large genomic region. Among the defined genomic regions, PAI-like regions were identified by the presence of homolog(s) of virulence genes. Also, GIs were postulated by calculating G+C content anomalies and codon usage bias. Of 148 prokaryotic genomes examined, 23 pathogenic and 6 non-pathogenic bacteria contained 77 candidate PAIs that partly or entirely overlap GIs. CONCLUSION: Supporting the validity of our method, included in the list of candidate PAIs were thirty four PAIs previously identified from genome sequencing papers. Furthermore, in some instances, our method was able to detect entire PAIs for those only partial sequences are available. Our method was proven to be an efficient method for demarcating the potential PAIs in our study. Also, the function(s) and origin(s) of a candidate PAI can be inferred by investigating the PAI queries comprising it. Identification and analysis of potential PAIs in prokaryotic genomes will broaden our knowledge on the structure and properties of PAIs and the evolution of bacterial pathogenesis.
    Zhang and Zhang Gene essentiality analysis based on DEG, a database of essential genes. 2008 Methods Mol Biol
    Vol. 416, pp. 391-400 
    article DOI URL 
    Abstract: Essential genes are the genes that are indispensable for the survival of an organism. The genome-scale identification of essential genes has been performed in various organisms, and we consequently constructed DEG, a Database that contains currently available essential genes. Here we analyzed functional distributions of essential genes in DEG, and found that some essential-gene functions are even conserved between the prokaryote (bacteria) and the eukaryote (yeast), e.g., genes involved in information storage and processing are overrepresented, whereas those involved in metabolism are underrepresented in essential genes compared with non-essential ones. In bacteria, species specificity in functional distribution of essential genes is mainly due to those involved in cellular processes. Furthermore, within the category of information storage and processing, function of translation, ribosomal structure, and biogenesis are predominant in essential genes. Finally, some potential pitfalls for analyzing gene essentiality based on DEG are discussed.
    Zhang and Lin DEG 5.0, a database of essential genes in both prokaryotes and eukaryotes. 2009 Nucleic Acids Res
    Vol. 37(Database issue), pp. D455-D458 
    article DOI URL 
    Abstract: Essential genes are those indispensable for the survival of an organism, and their functions are therefore considered a foundation of life. Determination of a minimal gene set needed to sustain a life form, a fundamental question in biology, plays a key role in the emerging field, synthetic biology. Five years after we constructed DEG, a database of essential genes, DEG 5.0 has significant advances over the 2004 version in both the number of essential genes and the number of organisms in which these genes are determined. The number of prokaryotic essential genes in DEG has increased about 10-fold, mainly owing to genome-wide gene essentiality screens performed in a wide range of bacteria. The number of eukaryotic essential genes has increased more than 5-fold, because DEG 1.0 only had yeast ones, but DEG 5.0 also has those in humans, mice, worms, fruit flies, zebrafish and the plant Arabidopsis thaliana. These updates not only represent significant advances of DEG, but also represent the rapid progress of the essential-gene field. DEG is freely available at the website http://tubic.tju.edu.cn/deg or http://www.essentialgene.org.
    Zhang et al. DEG: a database of essential genes. 2004 Nucleic Acids Res
    Vol. 32(Database issue), pp. D271-D272 
    article DOI URL 
    Abstract: Essential genes are genes that are indispensable to support cellular life. These genes constitute a minimal gene set required for a living cell. We have constructed a Database of Essential Genes (DEG), which contains all the essential genes that are currently available. The functions encoded by essential genes are considered a foundation of life and therefore are likely to be common to all cells. Users can BLAST the query sequences against DEG. If homologous genes are found, it is possible that the queried genes are also essential. Users can search for essential genes by their function or name. Users can also browse and extract all the records in DEG. Essential gene products comprise excellent targets for antibacterial drugs. Analysis of essential genes could help to answer the question of what are the basic functions necessary to support cellular life. DEG is freely accessible from the website http://tubic.tju.edu.cn/deg/.
    Zvelebil and Baum Understanding Bioinformatics 2008   book  
    Virtual Screening for Bioactive Molecules 2000   book  
    Sherris Medical Microbiology 2004 , pp. 409-412  book  

    Created by JabRef on 31/08/2010.