
September 30, 2006

Virtual Screening Methods

Specific binding interactions are central to many biological processes and pathways. Similarly, most drugs act by binding specifically to a site on a target protein, thereby modulating protein activity. The quest for new drugs relies on many approaches, including computer-based virtual screening and docking. Over the past fifteen years, in parallel with the exponential increase in the number of available high-resolution protein structures, many screening and docking methods and programs of use in the drug discovery process have emerged. Understanding the similarities and differences among these methods, as well as their capabilities and limitations, is both important and increasingly challenging.

The main objective of our Virtual Screening eCheminfo Community of Practice activity is to foster discussion amongst researchers working on both development of screening and docking methods and the application of such methods to drug discovery. This interaction is intended to lead to a better understanding of the current state-of-the-art, improved screening and docking tools in the future, and enhanced awareness of how to apply the current set of tools.

On Tuesday 17th October 2006 a number of leading screening experts and practitioners will meet at the joint eCheminfo and InnovationWell Community of Practice meeting at Bryn Mawr College, Philadelphia to discuss virtual screening and docking methods.

On the afternoons of both Monday 16th and Tuesday 17th October we will also hold a number of workshops on the latest virtual screening and docking methods and software.

On the afternoon of Tuesday 17th a forum will discuss current virtual screening and docking methods and software, results of existing validation and comparison studies, and procedures for useful independent comparative studies that could be undertaken by the community of practice.

High-throughput screening (HTS) data are complex: the data sets are large, and the active compounds usually act through different mechanisms, so a single QSAR model will not fit them all. Some statistical analysis methods are also complex enough that biologists and chemists can be reluctant to jump in, even when no chemometric help is available. There is a need for a simple method of analysis that can deal with large data sets and multiple mechanisms. This lecture will present the underlying concepts and analysis strategies of recursive partitioning (RP). We will use the ChemTree® software from Golden Helix, which is particularly easy to use because it was designed from the ground up for the analysis of chemistry data; the lessons learned with ChemTree are largely transferable to other RP codes. Some biological targets have relatively expensive assays, and resource limitations often prevent organizations from scaling up assays to massive HTS for all of their targets. For these targets, sequential screening is a viable way to make progress. In sequential screening, a modest set of compounds is screened, say 5k to 15k, and statistical analysis is then used to select additional compounds for screening in an iterative optimization process. We will cover the use of RP in sequential screening.
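As a rough illustration of the recursive partitioning idea, the sketch below grows a small tree on binary structural descriptors by repeatedly choosing the descriptor that most reduces the within-group sum of squared activity deviations. This is a generic textbook-style criterion chosen for brevity, not the statistic used by ChemTree, and the function names and toy data are assumptions made for this example.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, activities, min_gain=1e-9):
    """Return (feature_index, gain) for the binary feature that most reduces SSE.

    Returns (None, 0.0) when no feature separates the compounds usefully.
    """
    parent = sse(activities)
    best = (None, 0.0)
    for f in range(len(rows[0])):
        absent = [a for r, a in zip(rows, activities) if r[f] == 0]
        present = [a for r, a in zip(rows, activities) if r[f] == 1]
        if not absent or not present:
            continue  # this feature does not divide the node
        gain = parent - sse(absent) - sse(present)
        if gain > max(best[1], min_gain):
            best = (f, gain)
    return best


def grow_tree(rows, activities, depth=0, max_depth=3):
    """Recursively partition compounds; each node records its size and mean activity."""
    node = {"n": len(rows), "mean": fmean(activities)}
    if depth >= max_depth or len(rows) < 2:
        return node
    f, _gain = best_split(rows, activities)
    if f is None:
        return node
    node["feature"] = f
    for key, bit in (("absent", 0), ("present", 1)):
        idx = [i for i, r in enumerate(rows) if r[f] == bit]
        node[key] = grow_tree([rows[i] for i in idx],
                              [activities[i] for i in idx],
                              depth + 1, max_depth)
    return node


# Toy data (hypothetical): features 0 and 1 mark two different active
# chemotypes, feature 2 is unrelated to activity.
rows = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1],
        [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 1]]
activities = [9.0, 9.0, 8.0, 8.0, 1.0, 1.0, 1.0, 1.0]
tree = grow_tree(rows, activities)
```

Because each branch is split separately, the two active chemotypes end up in different leaves, which is how a tree can accommodate multiple mechanisms where a single global model cannot.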

Structure-based virtual screening is now the most widely-used approach to leverage structure for ligand discovery. Whereas the ultimate test of docking is prospective prediction of novel ligands, a pragmatic approach for routine testing of algorithmic developments is to use experimentally-observed poses of selected ligands and enrichment of known actives as performance evaluation criteria. Here we report datasets that may be used to benchmark docking programs. We are making available a set of actives drawn from the literature and a corresponding Database of Universal Decoys (DUD) for forty drug targets. Decoys were selected from commercially-available, ‘drug like’ compounds to have similar physicochemical properties to known actives, while having dissimilar chemical structure. To facilitate routine testing of our docking program against these forty systems we developed a high throughput virtual screening pipeline. We have docked DUD and its actives against the forty drug targets to investigate the performance of our docking program, our automated docking procedure, and the database itself. We have also compared docking against DUD with docking against other databases such as the MDDR. Our results show that enrichment depends on the database used for docking, and suggest strongly that a carefully calibrated decoy database is important for effectively evaluating docking enrichment. To control for interference between annotated lists for different targets we report cross-docking experiments for each of the forty systems. We are making DUD and the database of actives available for free download in ready-to-dock formats in the hope that DUD will be a useful community resource for improving docking methods.
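For readers unfamiliar with enrichment as an evaluation criterion, here is a minimal sketch of one common definition of the enrichment factor: the hit rate among the top-ranked fraction of the database divided by the hit rate in the whole database. The function name, the score convention (lower docking score is better) and the toy ranking are assumptions for illustration and are not taken from the DUD study.

```python
def enrichment_factor(scores, is_active, fraction=0.1, higher_is_better=False):
    """Hit rate in the top `fraction` of the ranked list over the overall hit rate."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equally long")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("at least one active is required")
    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0],
                    reverse=higher_is_better)
    n_top = max(1, round(fraction * len(ranked)))
    actives_in_top = sum(1 for _, a in ranked[:n_top] if a)
    return (actives_in_top / n_top) / (total_actives / len(ranked))


# Toy ranking (hypothetical): 20 compounds, 4 of them known actives.
scores = [-(20 - i) for i in range(20)]           # -20 is the best score
is_active = [i in (0, 2, 5, 11) for i in range(20)]
ef_10 = enrichment_factor(scores, is_active, fraction=0.1)
ef_50 = enrichment_factor(scores, is_active, fraction=0.5)
```

A value of 1 corresponds to random selection. Since the baseline hit rate is set by the ratio of actives to decoys, and decoys that differ from the actives in simple physicochemical properties are easier to separate, the same program can show very different enrichment on different decoy sets.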

We present a method for simultaneous 3D structure generation and pharmacophore-based alignment using a self-organizing algorithm called Stochastic Proximity Embedding (SPE). Current flexible molecular alignment methods either start from a single low-energy structure for each molecule and then tweak bonds or torsion angles, or choose from multiple conformations of each molecule. Methods that generate structures and align them iteratively (e.g., genetic algorithms) are often slow.

In earlier work (2003), we used SPE to generate 3D structures by iteratively adjusting pairwise distances between atoms based on a set of rules, and showed that it samples conformational space better and runs faster than earlier programs. In this work, we run SPE on the entire ensemble of molecules to be aligned. Additional information on which atoms or groups of atoms in each molecule correspond to points of the pharmacophore can come from an automatically generated hypothesis or be specified manually. We add distance terms to SPE to bring pharmacophore points from different molecules closer, and also to line up normal/direction vectors associated with these points. We also permit individual atoms to be constrained to lie near external coordinates from a protein binding site. The 3D structures of each molecule in the resulting alignment are nearly correct if the pharmacophore hypothesis was chemically feasible; post-processing by BFGS minimization of the distance and energy functions further improves the structures and weeds out infeasible hypotheses.

The new tools can be used to develop and test 3D pharmacophores for a diverse set of known active compounds from a screening run, starting from only 1D correspondences between atoms derived from a pharmacophore hypothesis. The 3D pharmacophore extracted from a successful alignment can be used for 3D database searching.
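The pairwise distance adjustment at the heart of SPE can be sketched in a few lines. The version below is deliberately simplified: it uses exact target distances rather than rule-derived distance bounds, and it omits the pharmacophore, direction-vector and binding-site terms described above. The function name, parameter defaults and the tetrahedron test case are assumptions made for this example.

```python
import math
import random


def spe_embed(targets, n_points, dim=3, cycles=2000, steps=30,
              lr_start=1.0, lr_end=0.01, seed=0):
    """Embed points so that pairwise distances approach the given targets.

    `targets` maps index pairs (i, j) to a desired distance.
    """
    rng = random.Random(seed)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
              for _ in range(n_points)]
    pairs = list(targets)
    eps = 1e-9  # guards against division by zero for coincident points
    for c in range(cycles):
        # Learning rate decreases linearly from lr_start to lr_end.
        lr = lr_start + (lr_end - lr_start) * c / (cycles - 1)
        for _ in range(steps):
            i, j = rng.choice(pairs)
            diff = [a - b for a, b in zip(coords[i], coords[j])]
            d = math.sqrt(sum(x * x for x in diff))
            # Move both points along their connecting line, each by half
            # of the (scaled) discrepancy between target and current distance.
            scale = lr * 0.5 * (targets[(i, j)] - d) / (d + eps)
            for k in range(dim):
                coords[i][k] += scale * diff[k]
                coords[j][k] -= scale * diff[k]
    return coords


# Toy case (hypothetical): four points that should all be 1.0 apart,
# i.e. the corners of a regular tetrahedron.
tetra_targets = {(i, j): 1.0 for i in range(4) for j in range(i + 1, 4)}
coords = spe_embed(tetra_targets, n_points=4)
max_error = max(abs(math.dist(coords[i], coords[j]) - r)
                for (i, j), r in tetra_targets.items())
```

Each update touches only one pair of points, so the cost per step does not grow with the size of the system; extra terms such as pharmacophore correspondences between molecules can be expressed as additional entries in the same pair list.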

Virtual Ligand Screening (VLS) has become an integral part of the drug design process for many pharmaceutical companies. In protein-structure-based VLS, the aim is to find a ligand that has a high binding affinity for a target receptor whose 3D structure is known. Ligand similarity searches also provide a very powerful method of quickly screening large databases of ligands to identify possible hits. This presentation will describe the docking tool eHiTS and its seamless integration with a new ligand-based pre-screening filter tool, eHiTS_Filter. eHiTS_Filter uses 23 surface point types (chemical property identifiers) to create a feature vector of active and presumed inactive ligands. The filter is then trained to recognize active ligands and can be used to screen large databases of ligands extremely rapidly (5-7 ligands per second per CPU). eHiTS_Filter has been integrated into eHiTS to allow docking poses to be generated for the top N% of the database as ranked by eHiTS_Filter. Enrichment results obtained over a wide range of receptor families consistently show that eHiTS_Filter is able to recover ~80% of the actives in the top 10% of a screened database.
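The "filter first, dock the top N%" workflow can be pictured as a simple two-stage funnel. The sketch below is generic and does not call eHiTS or eHiTS_Filter: `cheap_score` and `expensive_score` are hypothetical stand-ins for a fast ligand-based filter (higher is better) and a slower docking score (lower is better), and the stand-in scoring functions at the bottom are invented purely to make the example runnable.

```python
import math


def two_stage_screen(ligands, cheap_score, expensive_score, top_percent=10.0):
    """Rank all ligands with a fast filter, then rescore only the top N%.

    Returns (ligand, expensive_score) pairs sorted from best to worst,
    assuming a lower expensive score is better.
    """
    ranked = sorted(ligands, key=cheap_score, reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * top_percent / 100.0))
    shortlist = ranked[:n_keep]
    rescored = [(lig, expensive_score(lig)) for lig in shortlist]
    return sorted(rescored, key=lambda pair: pair[1])


# Stand-in scoring functions (hypothetical) on ligands identified by integers.
def fake_filter(ligand_id):
    return ligand_id          # pretend larger ids look more "active-like"


def fake_dock(ligand_id):
    return -float(ligand_id % 7)   # pretend docking score, lower is better


results = two_stage_screen(list(range(100)), fake_filter, fake_dock,
                           top_percent=10.0)
```

The expensive step is run on only N% of the database, so the overall cost is dominated by the filter when N is small; the price is that any active the filter ranks below the cutoff is never docked.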

In an effort to understand the strengths and weaknesses of docking programs and scoring functions, an evaluation of ten docking programs and 37 scoring functions was conducted against eight proteins of seven protein types for three tasks: binding mode prediction, virtual screening for lead identification, and rank-ordering by affinity for lead optimization. All of the docking programs were able to generate ligand conformations similar to crystallographically determined protein/ligand complex structures for at least one of the targets. However, scoring functions were less successful at distinguishing the crystallographic conformation from the set of docked poses. For virtual screening, docking programs identified active compounds from a pharmaceutically relevant pool of decoy compounds, but no single program or scoring function performed well for all of the targets. For prediction of compound affinity, none of the docking programs or scoring functions made a useful prediction of ligand binding affinity.

Development of Thalidomide as an Angiogenesis Inhibitor - from Screening to the Clinic
William D. Figg, National Cancer Institute

Angiogenesis, the development of new blood vessels, is essential for the growth, invasion, and metastasis of solid tumors. Inhibition of this process represents a promising therapeutic strategy for metastatic diseases such as advanced-stage hormone-refractory prostate cancer (HRPC). In 2004, the concept of angiogenesis inhibition was validated with the FDA approval of the first antiangiogenic agent, bevacizumab (Avastin). This spurred the search for novel inhibitors, and several antiangiogenic compounds are currently in the preclinical or clinical phase of drug development. The challenge in discovering and characterizing antiangiogenic targets remains the development of efficient in vitro and/or in vivo preclinical angiogenesis screening systems and the effective design of clinical trials with measurable endpoints. Despite its teratogenicity, thalidomide has emerged as a cancer treatment whose clinical efficacy is being evaluated on the basis of its antiangiogenic properties. This talk will present the development of thalidomide as an angiogenesis inhibitor, including the synthesis and screening of novel thalidomide analogs; the molecular pharmacology, toxicity, and metabolism of the drug; and a summary of the results of trials of thalidomide in HRPC.

This workshop will review a number of recent advancements that have been made by researchers at Schrödinger. These new methods include the accurate treatment of both ligand and receptor flexibility in docking (induced-fit), the use of polarizable ligand charges derived from quantum mechanics for docking and scoring, and docking to conformational ensembles to reduce the rate of false negatives and to improve enrichment factors in database screens. The instructors will provide training sessions to familiarize users with these new tools and demonstrate how they can be used in real-world applications. Topics will include:

* Induced Fit Docking (J. Med. Chem., 2006, 49, 534-553)
* Quantum Polarizable Docking (J. Comput. Chem. 2005, 26, 915-931)
* Virtual Screening Workflow: Automating the process of screening large databases of millions of compounds with a hierarchical approach that leverages three levels of Glide docking accuracy
* Use of conformational receptor ensembles in virtual screening
* Incorporation of ADME properties into virtual screening and lead optimization research

Given the availability of crystallography data for a drug target, it is possible to generate a large number of reasonable docked poses using modern software. This workshop will address the use of protein:ligand interaction fingerprints, combined with activity data, to reduce the noise which is inherent in docking results. A combination of clustering methods and 2D visualization can be used to produce model hypotheses, which can be applied to subsequent screening of compound databases.
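To make the fingerprint-and-clustering idea concrete, the sketch below represents each docked pose as a set of residue-level interaction labels, compares poses with the Tanimoto coefficient, and groups them with a simple leader clustering pass. The residue labels, the 0.6 threshold and the choice of leader clustering are assumptions for illustration; the workshop may well use other fingerprint encodings and clustering methods.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two interaction fingerprints stored as sets."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def leader_cluster(fingerprints, threshold=0.6):
    """Assign each pose to the first cluster whose leader is similar enough.

    Returns a list of clusters, each a list of pose indices.
    """
    clusters = []  # list of (leader_fingerprint, member_indices)
    for index, fp in enumerate(fingerprints):
        for leader, members in clusters:
            if tanimoto(fp, leader) >= threshold:
                members.append(index)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            clusters.append((fp, [index]))
    return [members for _, members in clusters]


# Made-up interaction fingerprints for four docked poses.
poses = [
    {"ASP189:hbond", "SER195:hbond", "TRP215:hydrophobic"},
    {"ASP189:hbond", "SER195:hbond"},
    {"HIS57:hbond", "GLY216:hbond"},
    {"HIS57:hbond", "GLY216:hbond", "TRP215:hydrophobic"},
]
pose_clusters = leader_cluster(poses)
```

Once poses are grouped by binding pattern, attaching activity data to each cluster shows which interaction patterns are shared by active compounds, and those patterns can serve as the model hypotheses for later screening.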

Tuesday Afternoon Workshops

Applications of Filtering and Similarity in Virtual Screening
Paul Hawkins, OpenEye

This workshop will address the application of shape-based, ligand-centric virtual screening using the OpenEye tool ROCS (Rapid Overlay of Chemical Structures).

The tool will be introduced, its underlying theory presented and its coupling to other tools in the OpenEye suite will be illustrated. The bulk of the presentation will focus on new approaches to using ROCS, with particular attention paid to the use of multiple molecules in a single ROCS query.
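As background for the theory portion, the sketch below illustrates the general idea of Gaussian shape comparison: each atom is modelled as a spherical Gaussian, the overlap of two molecules is the sum of pairwise Gaussian overlap integrals, and similarity is reported as a shape Tanimoto. This is not OpenEye's implementation. It keeps only pairwise (first-order) terms, uses a single illustrative amplitude and width for every atom, and scores the molecules as given, without the overlay optimization a real tool performs.

```python
import math

P = 2.7       # illustrative Gaussian amplitude (assumption)
ALPHA = 0.8   # illustrative Gaussian exponent in 1/Angstrom^2 (assumption)


def overlap(mol_a, mol_b, p=P, alpha=ALPHA):
    """Sum of pairwise overlap integrals between two sets of atomic Gaussians.

    For two Gaussians p*exp(-alpha*r^2) a distance d apart, the integral is
    p^2 * (pi / (2*alpha))^(3/2) * exp(-alpha * d^2 / 2).
    """
    prefactor = p * p * (math.pi / (2.0 * alpha)) ** 1.5
    total = 0.0
    for atom_a in mol_a:
        for atom_b in mol_b:
            d2 = sum((x - y) ** 2 for x, y in zip(atom_a, atom_b))
            total += prefactor * math.exp(-alpha * d2 / 2.0)
    return total


def shape_tanimoto(mol_a, mol_b):
    """Overlap of A with B divided by the combined self-overlap minus the overlap."""
    o_ab = overlap(mol_a, mol_b)
    return o_ab / (overlap(mol_a, mol_a) + overlap(mol_b, mol_b) - o_ab)


def translate(mol, dx):
    """Shift every atom of a molecule along the x axis."""
    return [(x + dx, y, z) for x, y, z in mol]


# Toy "molecule" (hypothetical): three atom centres on a line, in Angstroms.
mol = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
same = shape_tanimoto(mol, mol)
shifted = shape_tanimoto(mol, translate(mol, 1.0))
far = shape_tanimoto(mol, translate(mol, 100.0))
```

The score depends on the relative placement of the two molecules, which is why a shape-based search has to optimize the overlay before reporting a similarity.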

Docking and Screening
Darryl Reid, SimBioSys

In this workshop you will learn how to use eHiTS for accurate pose prediction as well as virtual screening. The computations will be performed on a Linux PC, using CheVi, the integrated eHiTS chemical visualizer. We will first present some examples that illustrate the most common steps in using the eHiTS flexible ligand docking tool: preparing the input files, setting up the run, and analyzing and visualizing the results. You will then investigate virtual screening using eHiTS docking alone and in combination with eHiTS Filter. eHiTS Filter is a ligand-based method that describes chemical features on the interaction surface of the ligands. Because it works purely with ligand information, eHiTS Filter can also be used for virtual screening when no receptor structure is available for the target.

Summary: You will learn about docking, accurate pose prediction, VHTS filtering, and chemical features on the interaction surface of the ligands. Participants will also take home a CD with a full suite of SimBioSys software and a license valid for 30 days from the end of the meeting.