Archive for July, 2013

Modern drug discovery is characterized by the production of vast quantities of compounds and the need to examine these huge libraries in short periods of time. The need to store, manage and analyze these rapidly increasing resources has given rise to the field known as computer-aided drug design (CADD). CADD comprises the computational methods and resources that are used to facilitate the design and discovery of new therapeutic solutions. Digital repositories, containing detailed information on drugs and other useful compounds, are goldmines for studying chemical reaction capabilities. Design libraries, with the potential to generate molecular variants in their entirety, allow the selection and sampling of chemical compounds with diverse characteristics. Fold recognition, which studies sequence-structure homology between protein sequences and structures, is helpful for inferring binding sites and molecular functions. Virtual screening, the in silico analog of high-throughput screening, offers great promise for the systematic evaluation of huge chemical libraries to identify potential lead candidates that can be synthesized and tested.

A Brief History of CADD
1900: The receptor and lock-and-key concepts
P. Ehrlich (1909) and E. Fischer (1894)

1970s: Quantitative structure-activity relationships (QSAR)
Limitations: 2-Dimensional, retrospective analysis

1980s: Beginning of CADD
Molecular Biology
X-ray crystallography, multi-dimensional NMR
Molecular modeling, computer graphics

1990s: Human genome
Combinatorial chemistry
High-throughput screening

The cost of the drug discovery and development process increased significantly over a period of thirty-four years, from $4 million in 1962 to over $350 million in 1996 (Fig. 1).
Fig. 1: The cost of drug development, from $4 million in 1962 to over $350 million in 1996


Moreover, during this process, only a small number of candidates will be examined in the clinic, and few will be marketed. In 1950, it was estimated that 7,000 compounds had to be isolated or synthesized and then tested for therapeutic activity for each one that became a pharmaceutical product. The challenge is becoming more difficult: 10,000 compounds had to be evaluated in 1979, and this number could be as high as 20,000 today (Fig. 2).

Fig. 2: Timeline in a drug discovery project


There are several reasons for this. The market for so-called high value-added compounds is very competitive, so a new compound must offer improved characteristics to be worth commercializing. There are also serious hurdles regarding ease and cost of synthesis, patentability, safety, and the social need for the new compound.

CADD Strategy Towards Drug Discovery
Computer-aided drug design (CADD) is one of the tools that can be used to increase the efficiency of the drug discovery process. CADD does not reach its full utility in isolation. Rather, it can form a valuable partnership with experiment by providing estimates when experiments are difficult, expensive, or impossible, and by coordinating the experimental data available. A close coupling between computational chemists and experimentalists allows information to flow immediately and directly between the two. This helps CADD chemists to better understand the details of the problem and to refine their approach.

CADD in Lead generation
In the early stage of a drug discovery process, researchers may be faced with little or no structure-activity relationship (SAR) information. At this point, assay development and screening should be undertaken immediately by the high-throughput screening (HTS) group. The aim of these analyses is to select and test fewer compounds, whilst gaining as much information as possible about the dataset. However, if a lead is known, then a more focused approach can be adopted by searching for compounds with similar (two- or three-dimensional) structures to the lead candidate or by substructure searching [4]. In substructure searching, the query retrieves those structures from the database that contain groups present in the primary lead. These molecules can then be screened in a biological assay.
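
As a rough illustration of the similarity and substructure-style searches described above, the following Python sketch ranks library compounds against a lead using the Tanimoto coefficient. It is only a minimal sketch under stated assumptions: the feature sets stand in for real molecular fingerprints, the compound names and features are hypothetical, and `contains_all` is a crude set-based stand-in for true substructure searching, which in practice requires graph matching on the molecular structure.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_similarity(lead, library, threshold=0.5):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(lead, feats)) for name, feats in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def contains_all(query_groups, feats):
    """Crude substructure-style filter: every query group must be present."""
    return query_groups <= feats


# Hypothetical lead and library, described by sets of structural features.
lead = {"catechol", "ester", "phenyl", "alkene"}
library = {
    "cmpd_A": {"catechol", "ester", "phenyl", "alkene", "methoxy"},
    "cmpd_B": {"catechol", "amide", "phenyl"},
    "cmpd_C": {"pyridine", "sulfonamide"},
    "cmpd_D": {"catechol", "ester", "phenyl"},
}

hits = rank_by_similarity(lead, library, threshold=0.5)
substructure_hits = sorted(
    name for name, feats in library.items()
    if contains_all({"catechol", "phenyl"}, feats)
)
print(hits)
print(substructure_hits)
```

Only the compounds that pass the threshold would go forward to the biological assay, which is the sense in which such searches let fewer compounds be selected and tested.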

CADD in Lead Optimization
In medicinal chemistry, the lead optimization process concerns many aspects, such as optimization of the affinity for the biological target, the toxicity, the oral bioavailability, the cell permeability, the plasma binding, and the ease of metabolism [6]. The principle employed is that any incremental change in the chemical structure produces incremental (positive or negative) changes in bioactivity, and a systematic study of such cause-and-effect relationships is called a structure-activity relationship (SAR) study. The process is highly iterative and traditionally based on trial and error. When no structural data about the target is available, the lead optimization process can be made more methodical by using quantitative structure-activity relationship (QSAR) studies. QSAR methods attempt to correlate structural or physicochemical descriptors of compounds with their biological activity. Two different approaches can be used in QSAR depending on the available compounds:

(1) Two-dimensional QSAR (2D-QSAR) and
(2) Three-dimensional QSAR (3D-QSAR).

3D-QSAR methods such as comparative molecular field analysis (CoMFA) require the molecules to be superimposed in a common alignment, and this requirement limits the applicability of CoMFA. To overcome this problem, some new approaches that do not depend on a common alignment of the molecules have recently been developed. Comparative molecular moment analysis (CoMMA) [9], EVA, and WHIM are used because they provide three-dimensional descriptors that are independent of the orientation of the molecules in space, so the molecules do not have to be aligned.
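
To make the QSAR idea more concrete, here is a minimal Python sketch of a one-descriptor 2D-QSAR model: an ordinary least-squares line relating a single descriptor (logP) to activity. The data values, the choice of logP as the descriptor, and the function names are assumptions made purely for illustration; real QSAR studies typically use several descriptors and validate the model on compounds held out of the fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical training set: logP descriptor and activity as log(1/C).
log_p = [1.0, 1.5, 2.0, 2.5, 3.0]
activity = [4.1, 4.6, 5.0, 5.6, 6.0]

slope, intercept = fit_line(log_p, activity)
r2 = r_squared(log_p, activity, slope, intercept)

# Predict the activity of an untested analog with logP = 2.2.
predicted = slope * 2.2 + intercept
print(f"activity = {slope:.2f} * logP + {intercept:.2f}, R^2 = {r2:.3f}")
print(f"predicted activity at logP 2.2: {predicted:.2f}")
```

A model of this kind is retrospective: it summarizes the compounds already tested and is only trustworthy for analogs whose descriptors fall within the range covered by the training set.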

Software for Molecular Modeling

• General purpose molecular modeling (large & small molecules)
molecular mechanics, dynamics and multifunctional programs

• Quantum chemistry calculations (small molecules)
molecular orbital or quantum mechanical calculations

• Database of molecular structures (large & small molecules)
software for storage and retrieval of molecular structure data

• Molecular graphics (large & small molecules)
programs to visualize molecules

• QSAR (small molecules)

Software for General Purpose Molecular Modeling

For workstations, minicomputers, and supercomputers (SGI, Sun, Cray, etc.)

AMBER — Peter Kollman and coworkers, UCSF. Computer-assisted model building, energy minimization, molecular dynamics, and free energy perturbation calculations.

Midas Plus — UCSF Computer Graphics Laboratory

CHARMM — Martin Karplus and coworkers, Harvard

QUANTA/CHARMm — Molecular Simulations Inc. (MSI). Molecular/drug design, QSAR, quantum chemistry, X-ray & NMR data analysis.

Insight/DISCOVER — Biosym, Inc. (MSI and Biosym have since become Accelrys Inc.)

SYBYL — Tripos, Inc.

ECEPP — Harold Scheraga and coworkers, Cornell
MM3 — Norman Allinger and coworkers, Georgia

For personal computers (Apple, Compaq, IBM, etc.)

Alchemy III — Tripos, Inc. Structure building and manipulation.

Chem3D Pro — CambridgeSoft Corp.

Desktop Molecular Modeller — Oxford Elec. Publishing

Molecular Modeling Pro — WindowChem Software
Energy minimization, QSAR (surface area, volume, logP), etc.

PC MODEL — Serena Software

Molecular Modeling:

1. Data Analysis
Structural data (X-ray, NMR structure determination)
Biological data (bioinformatics)
Chemical data (QSAR of conventional compound synthesis and combinatorial chemistry)

2. Theory and Prediction
Molecular energy (structure and folding)
Molecular dynamics (conformational changes)
Molecular recognition (ligand and drug design)


Case Study #1
Human immunodeficiency virus type 1 integrase (HIV-1 IN) inhibitors. HIV-1 IN is one of the most important enzymes in the HIV-1 replication cycle. Because it is essential to the replicative cycle of HIV-1, HIV-1 IN is an attractive target for the development of anti-AIDS drugs.
Starting from a pharmacophore hypothesis derived from a known inhibitor of HIV-1 IN, caffeic acid phenethyl ester (CAPE), a three-dimensional search of the NCI database was performed. From this search, 267 structures were found to match the pharmacophore; 60 of those were tested in an in vitro assay against HIV-1 IN, and 19 were found to inhibit both the 3' processing and strand transfer.
The relevance of the proposed pharmacophore was then tested using a small three-dimensional validation database of known HIV-1 IN inhibitors, which had no overlap with the group of compounds found in the initial search. This search strongly supported the existence of the postulated pharmacophore and, in addition, hinted at the existence of a possible second pharmacophore relevant to binding to IN. Using the second pharmacophore in a three-dimensional search of the NCI database, 10 novel, structurally diverse HIV-1 IN inhibitors were found.
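
The kind of three-dimensional pharmacophore search used in this case study can be sketched in Python as a distance-matching test. This is a simplified illustration and not the procedure used in the study: the three-point query, its feature types, the distances, the tolerance, and the database entries are all hypothetical, and a real search would also consider multiple conformers of each molecule.

```python
from itertools import permutations
from math import dist


def matches_pharmacophore(features, query_types, query_dists, tol=0.5):
    """Check whether a molecule can satisfy a distance-based pharmacophore.

    features:    list of (feature_type, (x, y, z)) for one conformer
    query_types: feature type required at each pharmacophore point
    query_dists: {(i, j): distance} between pharmacophore points i < j
    tol:         allowed deviation per distance, same units as coordinates
    """
    k = len(query_types)
    for combo in permutations(range(len(features)), k):
        # Every assigned molecular feature must have the required type.
        if any(features[idx][0] != query_types[q] for q, idx in enumerate(combo)):
            continue
        # Every query distance must be reproduced within the tolerance.
        if all(
            abs(dist(features[combo[i]][1], features[combo[j]][1]) - d) <= tol
            for (i, j), d in query_dists.items()
        ):
            return True
    return False


# Hypothetical three-point query (distances in angstroms).
query_types = ["acceptor", "acceptor", "aromatic"]
query_dists = {(0, 1): 3.0, (0, 2): 5.0, (1, 2): 4.0}

# Hypothetical database of single-conformer feature lists.
database = {
    "mol_1": [("acceptor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("aromatic", (3, 4, 0))],
    "mol_2": [("acceptor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("aromatic", (10, 0, 0))],
    "mol_3": [("acceptor", (0, 0, 0)), ("donor", (3, 0, 0))],
}

matching = [
    name for name, feats in database.items()
    if matches_pharmacophore(feats, query_types, query_dists)
]
print(matching)
```

Trying all assignments by permutation is only practical for small queries like this one; it is used here because it keeps the matching logic easy to follow.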

Case Study #2
A pioneering study was published by the group of Fesik. In this work, they elegantly combined the advantages of rational design and combinatorial chemistry in a new procedure called "SAR by NMR". In the first step of this process, a library of low molecular weight compounds is screened to identify molecules that bind to the protein. Addition of a compound with sufficient affinity to the 15N-enriched protein in solution yields a shift of the HSQC NMR signals for all groups near the binding site.
In the next step, once a lead is identified, analogs are screened to optimize binding to this site. Searching for a second binding site is then undertaken in either the original screen or a screen conducted in the presence of the first fragment. The second ligand is then optimized. When the two optimized fragments have been selected, their location and orientation are determined experimentally by NMR or X-ray crystallography. Finally, on the basis of this structural information, suitable linkers for the two ligands are modeled on the computer.

The advantage of the Fesik approach is that only weak binding is needed for the single ligands. Linking such weak binders yields not only the product of the binding constants of the individual fragments but also an additional entropic contribution, which can produce highly active compounds.
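
The arithmetic behind this statement can be sketched in Python. If binding free energies are assumed to be additive, the dissociation constant of the linked compound is the product of the fragment dissociation constants (each relative to a 1 M standard state), scaled by a linker term. The fragment affinities and the linker contribution `dg_link` below are hypothetical values chosen for illustration, not data from the study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in M."""
    return R_KCAL * temp_k * math.log(kd_molar)


def linked_kd(kd_a, kd_b, dg_link=0.0, temp_k=298.15):
    """Estimated Kd (M) of two linked fragments, assuming additive free energies.

    dg_link is an assumed extra contribution from linking (kcal/mol);
    negative values model a favorable entropic gain.
    """
    dg_total = (
        binding_free_energy(kd_a, temp_k)
        + binding_free_energy(kd_b, temp_k)
        + dg_link
    )
    return math.exp(dg_total / (R_KCAL * temp_k))


# Hypothetical weak binders: 1 mM and 0.1 mM fragments.
kd_product = linked_kd(1e-3, 1e-4)                 # product of the two constants
kd_with_gain = linked_kd(1e-3, 1e-4, dg_link=-1.0)  # plus an assumed linking gain
print(f"no linker term: {kd_product:.2e} M")
print(f"with -1 kcal/mol linker term: {kd_with_gain:.2e} M")
```

Under these assumptions, two fragments that bind only in the millimolar range combine to an estimated affinity in the sub-micromolar range, which is why weak binding of the single ligands is sufficient.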

CADD approaches aim to increase the speed and efficiency of the drug discovery process by providing a somewhat more detailed map to the goal. The hope is that, by providing bits and pieces of information and by helping to coordinate that information, CADD will make the drug design process more rational. The many success stories of CADD in the discovery of new drugs show the utility of such analyses when used in close coupling with traditional medicinal chemistry techniques.
CADD is now widely recognized as a viable alternative and complement to high-throughput screening. The search for new molecular entities has led to the construction of high-quality datasets and design libraries that may be optimized for molecular diversity or similarity. At the same time, advances in molecular docking algorithms, combined with improvements in computational infrastructure, are enabling rapid improvement in screening throughput. Propelled by increasingly powerful technology, distributed computing is gaining popularity for large-scale screening initiatives. Recent examples include the European Union-funded WISDOM (World-wide In Silico Docking on Malaria) project, which analyzed over 41 million malaria-relevant compounds in about 1 month using 1700 computers from 15 countries, and the Chinese-funded Drug Discovery Grid (DDGrid) for anti-SARS and anti-diabetes research, with a calculation capacity of >1 Tflops. Combined with concerted efforts towards the design of more detailed physical models of properties such as solubility and protein solvation, these advancements will, for the first time, allow the realization of the full potential of lead discovery by design.