LINKING CHEMINFORMATICS TOOLS FOR AUTOMATED LEAD OPTIMISATION
Simon Folkertsma, Martin Ott, Gijs Schaftenaar, Gert Vriend; Centre for Molecular and Biomolecular Informatics, University of Nijmegen, The Netherlands
When the 3-D structure of a receptor or ligand binding site is known, in-silico docking studies can be performed with actual or potential ligand structures. In the process of lead refinement, structural variants of a lead compound are created in one way or another and then docked to check their suitability. The method by which variants are created is crucial to the success of this procedure. Random attachment of substituents and/or functional groups to the lead structure often leads to disappointing results when the structural variants cannot be readily synthesised, either from the lead compound or otherwise. We propose to overcome this drawback by generating the variants in such a way that they are likely to be synthetically accessible. In particular, we generate virtual reaction products by applying known chemical reaction types to the lead compound. These reaction products are then docked automatically to the ligand binding site. The procedure can be repeated on promising structures, allowing a stepwise lead refinement.
We have developed a Web-accessible tool to screen the synthetically accessible structural variations of a given lead compound. Several programs are used in our scheme:
- The WHAT IF program is used to `clean up' the receptor structure. It optimises the static coordinates of the structure, models missing atoms, and optimises the hydrogen bond network. Finally, the structure is prepared for energy minimisation.
- GROMOS performs an energy minimisation of the resulting receptor structures.
- Originally developed to perform retrosynthetic analyses, the LHASA program can now also be used in the "synthetic" or "forward" mode, applying the reactions from its knowledge base in the forward sense. LHASA searches for reactions that add small groups or make other minor modifications to the lead compound structure, thus generating a number of variants (usually 10-100).
- The 3-D model building program Corina then constructs a 3-D model of each reaction product; these models are needed as input for the next step.
- CONCOORD emulates a molecular dynamics program. It finds a small number of highly divergent structures, such as would occur as steps in a long molecular dynamics run, in a relatively short time.
- Finally, FlexX is used to dock the newly generated structures into the ligand binding site. It treats the ligand as a flexible structure but keeps the binding site rigid, so FlexX cannot deal with induced fits or other forms of receptor flexibility. To partly overcome this problem, FlexX may call CONCOORD to build alternative pocket geometries, each of which is tried for docking.
The receptor or ligand binding site must first be pre-processed by WHAT IF and GROMOS so that it can be used by CONCOORD and FlexX later on. The initial lead compound can be input either as an MDL MOLfile or as a Tripos MOL2 file. The bond connectivity (`topology' or `2-D structure') must be fully described. The 3-D description is converted to the 2-D description used by LHASA. After the reaction products have been generated, their 3-D models are built and subsequently optimised using a simple force field. At this stage, the potential ligands are presented to the user, who can select a subset or all of them for docking into the receptor. After docking, the user can again make a selection, based on the docking scores and visual inspection of the docked ligands. The selected ligands may then be submitted as improved lead compounds to the next iteration of the process described above.
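The iterative scheme described above can be sketched as a small driver loop. This is a minimal illustration and not the authors' implementation: `refine_lead`, `Candidate`, and the three toy stage functions are hypothetical names, the stages stand in for LHASA/Corina (variant generation), FlexX (docking) and the user's selection, and the assumption that a lower docking score is better is ours.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    """A docked structure; lower score is ASSUMED to mean a better fit."""
    name: str
    score: float


def refine_lead(
    lead: str,
    generate_variants: Callable[[str], List[str]],
    dock: Callable[[str], float],
    select: Callable[[List[Candidate]], List[Candidate]],
    rounds: int,
) -> List[List[Candidate]]:
    """Run up to `rounds` generate/dock/select cycles; return the selection of each round."""
    leads = [lead]
    history: List[List[Candidate]] = []
    for _ in range(rounds):
        # Stand-in for forward reaction generation plus 3-D model building.
        variants = [v for current in leads for v in generate_variants(current)]
        if not variants:
            break
        # Stand-in for automated docking of every variant.
        docked = [Candidate(v, dock(v)) for v in variants]
        # Stand-in for the user's choice based on scores and inspection.
        chosen = select(docked)
        if not chosen:
            break
        history.append(chosen)
        # The selected ligands become the leads of the next iteration.
        leads = [c.name for c in chosen]
    return history


# Toy stages (hypothetical, for illustration only).
def toy_variants(structure: str) -> List[str]:
    return [structure + "-Me", structure + "-OH"]


def toy_dock(structure: str) -> float:
    return -float(structure.count("OH"))


def keep_best(candidates: List[Candidate]) -> List[Candidate]:
    return sorted(candidates, key=lambda c: c.score)[:1]


history = refine_lead("lead", toy_variants, toy_dock, keep_best, rounds=2)
```

Passing the stages in as callables keeps the loop independent of any particular program, so a real deployment would wrap each external tool behind one of these functions.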
A DATA BASE FOR TRANSITION STATES. RANKING OF SYNTHESIS ROUTES BY USING A SYSTEM COMBINED COMPUTATIONAL WITH INFORMATION CHEMISTRY
Kenzi Hori; Yamaguchi University, Ube, Japan
Computer-assisted synthesis design systems such as AIPHOS and EROS can now be used in practice to create synthesis routes for compounds. However, these systems produce many synthesis routes without indicating which route should be tried first in experiments. Experimental chemists usually choose one of the routes based on their knowledge and experience. If they lack sufficient skill in organic synthesis to identify the best route, they cannot begin to synthesize the target compound.
It is well known that transition states (TSs) are the key to estimating how easily a chemical reaction proceeds, and computational chemistry can provide this information. However, obtaining it usually takes a long time if organic chemists try to perform the theoretical calculations themselves. Few organic chemists can therefore use theoretical calculations to rank synthesis routes for their experiments.
We have developed a system that makes it possible to rank the routes easily by using a data base. The data base system, called the “Data Base for Transition States” (TSDB), combines computational and information chemistry. The TSDB consists of three programs. The first is a molecular modeling program, for which we use Jmol, developed by The Open Science Project. The second, TS_Search, makes transition states tractable even for organic chemists: it lets them search for transition states easily and analyze chemical reactions in detail. The program includes three standard methods, i.e., the minimum energy path, the saddle and the contour methods. TS_Search also offers a further method, the substituent method, which finds TSs by exploiting the fact that it is very easy to introduce a substituent into a TS geometry that we already have. This method can also form a ring in part of a TS geometry. For this purpose, we have to build a library, called TSLB, that contains many TS geometries as well as data related to reactions.
The third program searches and handles the data in the TSLB. Although many journals, such as JACS and JOC, contain a great deal of TS data, we cannot use these data directly to rank synthesis routes from, for example, the AIPHOS system, because the TSs reported in the journals are usually not the same as those appearing in a synthesis route. The substituent method makes it possible to find the required TS starting from data in academic journals. The TSLB, which gathers TS geometries, is therefore very useful for ranking synthesis routes.
In the present paper, we describe the concept of the TSDB and how to use it to rank synthesis routes created by computer-assisted synthesis route design systems such as AIPHOS and TOSP.
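To make the idea of ranking routes by their transition states concrete, the following sketch orders routes by their highest activation barrier. It is an illustration under our own assumptions and not the TSDB algorithm: the abstract does not state the ranking criterion, so the choice of the largest single-step barrier (TS energy minus reactant energy) is assumed, `rank_routes` is a hypothetical name, and the energies are invented numbers in arbitrary units.

```python
from typing import Dict, List, Tuple

# Each step is (reactant_energy, transition_state_energy) in the same units.
Step = Tuple[float, float]


def rank_routes(routes: Dict[str, List[Step]]) -> List[Tuple[str, float]]:
    """Return (route name, highest barrier) pairs, easiest route first.

    ASSUMPTION: a route is judged by its largest single-step barrier.
    Raises ValueError if a route has no steps.
    """
    ranked = []
    for name, steps in routes.items():
        if not steps:
            raise ValueError(f"route {name!r} has no steps")
        # Barrier of a step = E(transition state) - E(reactant).
        highest = max(e_ts - e_reactant for e_reactant, e_ts in steps)
        ranked.append((name, highest))
    # Lower limiting barrier is taken to mean an easier route.
    return sorted(ranked, key=lambda item: item[1])


# Invented example data (not from any calculation or journal).
routes = {
    "A": [(0.0, 20.0), (-5.0, 10.0)],
    "B": [(0.0, 12.0), (-3.0, 8.0)],
}
ranked = rank_routes(routes)
```

With these invented numbers, route B is ranked ahead of route A because its limiting barrier is lower. Other criteria, such as the sum of barriers, could be substituted by changing the single `max` expression.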
ENHANCED RETRIEVAL OF SYNTHETIC INFORMATION VIA SCIFINDER
Linda Toler, Roger Schenck; CAS, Columbus OH 43210, USA
Scientists expect more than answers from today's information tools. Good decisions still rely on good data, but scientists expect tools that go beyond the scope of the question asked and provide a means to analyze the retrieved data effectively and extract the information relevant to their work. Providing current, comprehensive, reliable chemical information, coupled with tools to convert that information into knowledge effectively, is key to the popularity of the SciFinder® desktop research tool for chemists. This talk will focus on how SciFinder is addressing some of the challenges inherent in reaction retrieval and post-processing. It will show examples of how SciFinder tools can be used to gain insights from the synthetic information covered by Chemical Abstracts.