CLASSPHARMER SUITE, A DESKTOP SOLUTION FOR SCREENING DATA ANALYSIS
Vincent Vivien, Christos A. Nicolaou, Patricia A. Bacha; Bioreason, 67300 Schiltigheim, France
Bioreason's ClassPharmer software suite is designed to provide an automated mechanism for the generation and archiving of class-based knowledge derived from primary and secondary screening data. The Suite is made up of a core unit that serves as the primary interface and one or more functional modules.
ClassViewer allows the user to view, manipulate, and export information about identified classes and compounds in formats that can be understood by other analytical tools and serves as the core unit for the Suite.
CompoundClassifier takes small molecule training data and identifies 2D structural classes of compounds that are homogeneous with respect to a learned maximum common substructure.
ClassImporter converts user-defined classes of compounds into a format that is readable by the Suite and calculates class maximum common substructures for use with all of the other modules.
ClassProfiler characterizes identified chemical classes by adding other tested compounds to the learned classes based upon substructure similarity and by importing or calculating additional class and compound attributes. In addition, it allows users to select priority classes based upon attribute values.
CompoundSelector generates lists of compounds for primary or secondary screening based upon compound attributes and/or substructure similarity with families of previously tested compounds.
The complete ClassPharmer™ Suite, a native Microsoft Windows application, has been designed for a high level of interactivity, user-friendliness and robustness. The use of both structural classes and class-based reasoning enables scientists to discover significant pieces of knowledge very quickly and to decide with more confidence which compounds show the most promise.
This presentation aims to highlight some of the key features of the ClassPharmer Suite and to familiarize the audience with the workflow of the system using real data examples.
VISUAL ANALYSIS OF HIGH THROUGHPUT SCREENING DATA
Andreas Witte; Tripos GmbH, D-81829 München
High Throughput Screening (HTS) of ever-growing chemical libraries poses a new challenge to data analysis. Problems arise both from the large amount of data obtained and from the structural diversity of the compounds.
High-dimensional descriptors like substructural fingerprints have proven very useful for chemical library and diversity analysis. However, their high dimensionality makes them poorly suited to standard projection techniques like principal components analysis (PCA) or non-linear mapping (NLM).
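As a minimal illustration of how substructural fingerprints are typically compared, the sketch below computes the Tanimoto similarity between two fingerprints stored as sets of on-bit indices. The abstract does not name a similarity measure, so the choice of Tanimoto is an assumption, and the function names and bit values are hypothetical.

```python
def tanimoto_similarity(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def tanimoto_distance(fp_a: frozenset, fp_b: frozenset) -> float:
    """Dissimilarity in [0, 1]; usable as the input distance for a projection."""
    return 1.0 - tanimoto_similarity(fp_a, fp_b)


# Hypothetical fingerprints: each integer is the index of a substructure bit that is set.
fp_1 = frozenset({1, 4, 9, 17})
fp_2 = frozenset({1, 4, 23})

print(tanimoto_similarity(fp_1, fp_2))
print(tanimoto_distance(fp_1, fp_2))
```

Real fingerprints have hundreds or thousands of bits, which is the dimensionality problem the abstract refers to: each compound is a point in a space with one axis per bit.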
By combining optimizable K-dissimilarity selection (OptiSim) with a modified stress function that suppresses the effects of distances beyond a characteristic horizon, it is possible to relax PCA coordinates into more consistently meaningful projections, typically from fingerprint space into two dimensions.
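The idea of a stress function with a horizon can be sketched as follows. The abstract does not give the exact formula, so the form used here is an assumption: a pair of compounds that is farther apart than the horizon both in fingerprint space and on the map contributes nothing, and a distant pair that is mapped too close is penalised only for falling inside the horizon. The name `horizon_stress` and the example data are hypothetical.

```python
import math


def horizon_stress(high_dists, coords, horizon):
    """Sum of squared distance errors with a horizon cutoff (assumed form).

    high_dists: symmetric matrix of distances in the original (fingerprint) space.
    coords:     low-dimensional map coordinates, one tuple per compound.
    horizon:    distance beyond which exact separation is treated as irrelevant.
    """
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            target = high_dists[i][j]
            mapped = math.dist(coords[i], coords[j])
            if target > horizon and mapped > horizon:
                continue  # far apart in both spaces: no penalty
            # Clip the target so distant pairs only need to reach the horizon.
            total += (min(target, horizon) - mapped) ** 2
    return total


# Hypothetical data: compounds 0 and 1 are similar, compound 2 is distant from both.
dists = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.95],
    [0.9, 0.95, 0.0],
]
good_map = [(0.0, 0.0), (0.2, 0.0), (2.0, 0.0)]  # keeps compound 2 far away
poor_map = [(0.0, 0.0), (0.2, 0.0), (0.3, 0.0)]  # pulls compound 2 into the cluster

print(horizon_stress(dists, good_map, horizon=0.5))
print(horizon_stress(dists, poor_map, horizon=0.5))
```

A relaxation in the spirit of the abstract would start from PCA coordinates and adjust them to reduce this quantity. Because long distances are not forced to be reproduced exactly, the map can spend its two dimensions on preserving local neighbourhoods.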
The non-linear maps so obtained are useful for characterizing combinatorial libraries, for comparing sub-libraries, and for exploring the distribution of biological properties across structural space. By means of this approach it is possible to identify clusters of activity which will allow the chemist or biologist to pick a lead compound (or series of compounds) for further testing.
This talk will demonstrate how exploratory data analysis of HTS data can be achieved and introduce the underlying concepts. The implementation in SARNavigator will be shown using public data sets.
ADVANCED ALGORITHM BUILDER
P. Jurgutis; Advanced Pharma Algorithms, Toronto, Canada
Advanced Pharma Algorithms, innovators in silico, is a technology company developing methodologies and software applications for the optimisation and pharmacokinetic screening of early lead molecules. Advanced Pharma Algorithms introduces ‘Advanced Algorithm Builder’ for Windows NT/2000. This novel application is a multitasking software system that enables discovery teams to create high throughput computational algorithms from a variety of experimental data, resulting in qualitative filters/screens or quantitative property/activity predictors. A unique and comprehensive set of fragmentation tools derives structural descriptors that retain their chemical significance throughout data optimisation in QSAR, QSPR and SAR analysis. Statistical tools include regression techniques, cross-validation, hierarchical clustering and recursive partitioning. The object-based computational environment requires no programming and is flexible, highly integrated and distributable throughout the research effort, affording drug discovery specialists the opportunity to move from structure and data to theory validation in a fraction of the time previously required.
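To illustrate how recursive partitioning can turn a fragment-based descriptor into a qualitative filter, the sketch below finds the single best threshold split on one descriptor using Gini impurity. This is a generic textbook step, not the product's implementation; the function names, the descriptor (a hypothetical fragment count per compound) and the activity labels are all assumptions made for the example.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = active, 0 = inactive)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule 'value <= threshold'.

    Returns (None, impurity_of_all_labels) if no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for k in range(1, n):
        lo, hi = pairs[k - 1][0], pairs[k][0]
        if lo == hi:
            continue  # cannot place a threshold between equal values
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best


# Hypothetical data: count of some fragment in each compound, and measured activity.
fragment_counts = [0, 1, 1, 3, 4, 5]
is_active = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(fragment_counts, is_active)
print(threshold, impurity)
```

Recursive partitioning repeats this step on each resulting subset, choosing among all available descriptors at every node, until the subsets are pure enough or too small to split further.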