SIV (SELECTION THROUGH INTERACTIVE VISUALIZATION) - AN APPROACH TO COMPOUND PROGRESSION BASED ON HTS DATA
Gavin Harper, Gianpaolo Bravi, Darren VS Green, Andrew R Leach, Stephen Pickett, Andrew R Whittington; Discovery Research, GlaxoSmithKline, Stevenage, UK
When high-throughput screening was introduced at many pharmaceutical companies, an automatic progression strategy was put in place in which compounds with an observed potency above a threshold value were progressed. Downstream capacity was often one of the main criteria in deciding the level at which such a threshold was set. At GSK it became apparent that this might not be the most effective method of progressing compounds. Correlation between one-shot HTS data and IC50 data was often poor, many progressed compounds were not considered particularly desirable or chemically tractable by the chemists doing lead optimisation, and clusters of apparently low-potency compounds fell beneath the potency threshold despite strong evidence that their activity was real. In addition, some compounds were false positives for structural reasons that a chemist could spot after the screen.
In this presentation we discuss, with reference to in-house HTS data, an alternative strategy currently being employed at GSK that is interactive and does not rely upon activity thresholds. This process has been of great value in identifying compounds showing low-micromolar and sub-micromolar potency in secondary screening that would otherwise have been missed.
Computational and statistical methods such as recursive partitioning and kernel discrimination have been evaluated as fully automatic methods for identifying true hits from the data. Through consideration of real screening data we show that, although such methods identify compounds that would not normally be progressed, they fall well short of being exhaustive. In particular, these methods are poorly suited to identifying interesting singletons in the data.
Our solution has been to make those compounds that might plausibly be considered hits available in an intuitive and browsable form, enabling project chemists to interpret the data using their expert knowledge and select the compounds to progress - human rather than algorithmic selection of compounds. "Plausible" hits are defined on the basis of what constitutes statistically significant activity rather than on screening capacity constraints. Clearly, a human selection process introduces an element of "subjectivity" - no two teams of chemists will progress exactly the same compounds, whereas running the same non-randomised mathematical algorithm twice will generate the same list for progression. However, human selection allows expert opinion on issues such as chemical tractability to inform the decision. It also ensures buy-in from the chemistry team. Computational and statistical methods complement the human selection process and assist expert user knowledge rather than replacing it.
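The abstract does not state how statistically significant activity is assessed. Purely as an illustration, the sketch below assumes one common choice: a robust z-score computed from the median and the median absolute deviation (MAD) of the single-shot responses, with compounds at or above a cutoff treated as plausible hits. The function names, the cutoff of 3, and the use of the 1.4826 scaling factor (which makes the MAD comparable to a standard deviation for normally distributed data) are all assumptions, not the authors' method.

```python
from statistics import median


def robust_z_scores(responses):
    """Return robust z-scores for a list of single-shot responses.

    Assumption: activity is scored against the bulk of the plate or screen,
    using the median and the median absolute deviation (MAD).
    """
    centre = median(responses)
    mad = median([abs(r - centre) for r in responses])
    if mad == 0:
        raise ValueError("MAD is zero; responses cannot be scaled")
    scale = 1.4826 * mad  # MAD rescaled to approximate a standard deviation
    return [(r - centre) / scale for r in responses]


def plausible_hits(compound_ids, responses, z_cutoff=3.0):
    """Return the ids whose robust z-score is at or above z_cutoff.

    The cutoff reflects statistical significance only; it is not tuned
    to downstream screening capacity.
    """
    scores = robust_z_scores(responses)
    return [cid for cid, z in zip(compound_ids, scores) if z >= z_cutoff]
```

Under these assumptions the size of the resulting list is set by the data rather than by capacity; deciding which of the plausible hits to progress is left to the chemists.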
Based on the definition of a plausible hit, we calculate a number of standard clusterings, physical descriptors and markers for reactive groups (written in the SMARTS language). These properties can be accessed through a web page, where the user simply selects the screen of interest from a drop-down list. We use the Spotfire package to deliver the data to the user and to provide an interface for compound selection. More generally, an expert user may wish to add other descriptions of the data thought to be relevant to a particular screen, such as whether compounds match a 3-D pharmacophore model. Our process allows the user to add these descriptions as additional columns to the Spotfire data set, assisting the decision-making process.
A tool has been written for Spotfire to enable straightforward and intuitive selection of compounds for progression. As points are highlighted in a plot, the corresponding structures are displayed; equally, as structures are highlighted, so are the corresponding points in the plot. Chemists can interactively select structures for progression. Problem compounds can also be selected, and this information can be used later to improve the automatic process by, for example, suggesting suitable additional SMARTS filters. By clustering compounds effectively, and making the selection process immediate and intuitive, chemists are able to visualise and process compounds in large groups at a time, allowing straightforward processing of a few tens of thousands of compounds. By allowing users to see in real time what would be excluded by filters based on physical properties, preconceptions can be checked against the actual effect that such filters have on a particular data set, and the filters adjusted as desired.
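The idea of seeing what a property filter would exclude before applying it can be sketched outside Spotfire as follows. This is a minimal illustration, not the tool described above: the function name, the dictionary keys ("id", "mol_weight", "logp") and the default limits of 500 and 5 are hypothetical choices made for the example.

```python
def preview_filter(compounds, max_mol_weight=500.0, max_logp=5.0):
    """Split compounds into kept and excluded without discarding anything.

    Each compound is a dict with the hypothetical keys "id", "mol_weight"
    and "logp". Excluded entries carry the names of the limits they
    exceeded, so a chemist can review them before committing to the filter.
    """
    kept, excluded = [], []
    for compound in compounds:
        reasons = []
        if compound["mol_weight"] > max_mol_weight:
            reasons.append("mol_weight")
        if compound["logp"] > max_logp:
            reasons.append("logp")
        if reasons:
            excluded.append((compound["id"], reasons))
        else:
            kept.append(compound["id"])
    return kept, excluded
```

Calling the function again with different limits shows how the excluded set changes, which is the kind of check against preconceptions that the paragraph above describes.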
MAKING WEB TOOLS WORK FOR CHEMISTS
David Wild (1), Rob Goulet (1), Sherry Marcy (1), Mike Coble (2); (1) Pfizer, Ann Arbor, MI, USA, (2) Tripos Inc., St. Louis, MO, USA
Company intranets have made deployment of web-based software very easy, and many pharmaceutical R&D organizations, including Pfizer Ann Arbor, have taken advantage of this to give chemists and biologists tools and techniques that had previously been available only to computational chemists and modelers. However, differences in "world view" and experience between these groups of people have limited the success of such deployments. In this presentation we shall give examples of how, through a partnership between Pfizer and Tripos, we have overhauled our approach to the design of tools, employing three significant techniques - Contextual Design, Interaction Design and a Science-based Deployment Model. As a result, we are boosting the impact and usefulness of these tools to scientists.
Two books have had a major influence on us. Contextual Design (Beyer, H. and Holtzblatt, K., Morgan Kaufmann Publishers, San Francisco, 1998) introduces two major concepts: contextual inquiry, which puts designers and programmers directly in the scientists' work context so that the work flow in which a tool is to be used can be properly understood and modeled, and paper prototyping, which enables design ideas to be tried out with the scientists without committing to code. These have been combined with use testing, the analysis of users performing set tasks, to form a three-stage design framework. We have applied this to a number of our Web applications, which will be illustrated in the talk.
The second book, The Inmates Are Running the Asylum (Cooper, A., Sams Publishing, Indianapolis, 1999), introduces Interaction Design, a collection of techniques focused on designing for real people rather than abstract "users". A major component of this is the development of personas, which are personified stereotypes based on interviews with actual users of the software. We have found this to be useful in designing for the diverse community of scientists at Ann Arbor.
The final change was to introduce a five-step deployment model, which is focused on the development of complex scientific programs. The model consists of Ideas Development, Exploratory Deployment, Full Development, Launch, and Support. The Ideas Development and Exploratory Deployment stages are focused on the prototyping and generation of expert-level tools, whilst the final three stages focus on development (including the methods described above), formal launch and marketing, and support of tools for scientists. We shall give examples of how this model is helping us research and develop new algorithms and methods, and then deploy and market tools based on these methods appropriately to scientists.
AN EARLY EVALUATION OF THE EXPERIMENTAL CHEMISTRY PREPRINT SERVER
Wendy A Warr; Wendy Warr & Associates, Holmes Chapel, UK
A preprint is a research article made publicly available prior to formal publication. A preprint server is a freely available and permanent archive and distribution medium for preprints, allowing rapid dissemination and the use of multimedia and supporting files. E-prints have been widely adopted in certain fields (notably high energy physics) but, until recently, the preprint concept has not been received with enthusiasm by chemists. Although preprints have the advantage of rapid publication, chemists have been reluctant to produce them because they could be viewed as “unallowable” for research assessment or tenure exercises and because editors of certain prestigious journals will not publish papers that have already appeared on the Web. In theory, preprints, together with version control and online discussion, could be a useful compromise: rapid pre-publication followed by open peer review, before publication in a traditional journal.
Organisations such as ACS and IUPAC have rejected the idea of mounting a preprint server, but Elsevier Science decided to start an experiment in August 2000, when the Chemistry Preprint Server (CPS), hosted by ChemWeb.com (http://preprint.chemweb.com), was launched. CPS provides a freely available and permanent archive for pre-publication research articles in the field of chemistry. Users can upload their completed or work-in-progress articles, search across the server, browse papers by chemistry classifications, and discuss articles of interest.
This paper will evaluate the CPS in its second year of publication (allowing for the fact that the first year was highly experimental and editorial guidelines were tightened in July 2001), with particular reference to chemistry disciplines of interest to attendees of the Noordwijkerhout meeting. How does CPS differ from an electronic journal? Who are the chemists using CPS and why do they use it? Is there genuinely an open peer review process? Is it possible to assess the “quality” of the papers? Is it even necessary? How many papers go on to be published in “standard” journals? In summary, what lessons can be learned from this experiment to date?