![]()
CINF 1:
Dixel modeling of gene expression
N Sukumar1, Curt M. Breneman1, Kristin
P. Bennett2, Charles Lawrence3, and Inna Vitol3.
(1) Department of Chemistry, Rensselaer Polytechnic Institute, Cogswell
Laboratory, 110 8th Street, Troy, NY 12180-3590, Fax: 518-276-4045, nagams@rpi.edu,
brenec@rpi.edu, (2) Department of Mathematics, Rensselaer Polytechnic
Institute, (3) Wadsworth Center
Abstract
Sequence-specific binding of proteins to DNA is arguably the most important
foundation of cellular function, since it exerts fundamental control over
the abundance of virtually all cellular functional macromolecules.
Identification of promoter sequences and transcription factor binding sites
in the genome thus represents one of the grand challenges of the
post-genomic era. The most successful bioinformatics methods today are based
on models that represent DNA by sequences of letters (motif methods).
Unfortunately, the sequence data used for training and validation is quite
limited. Motif models are thus hampered both by small sample sizes and by an
abstract representation that has little to do with the energetics of
binding. It is here that cheminformatics can supply additional information
and introduce a more accurate and sensitive chemical representation of
DNA-protein interactions. Drawing upon our experience with E. coli
transcription factors and sigma factors, we show how characterization of DNA
through features of electron densities sampled on the vdW surfaces of the
major and minor grooves (“Dixels”) captures the effects of environmental
perturbations of neighboring base pairs, without requiring additional
sequence data for training.
![]()
CINF 2:
Integration of biological and chemical information: Faster decisions from
linked data and visualizations
Gavin M Fischer, Application Scientist, OmniViz Inc, 2 Clocktower
Place, Suite 600, Maynard, MA 01754, gfischer@omniviz.com
Abstract
Visualizations are the best way for people to understand data. Presenting
anyone with long lists of numbers rarely helps the understanding of the
data, never mind the interconnectedness within that data. This is even more
true when crossing between domains, such as between chemistry and biology.
Both sides understand, in theory if not practice, what the other is doing.
However, the lack of a common language between them necessitates new
approaches for integrating analysis; visualizations are a key to this.
Viewing HTS data alongside linked biological pathways that illustrate the
context in which the target is being tested, and microarrays that show how
responses map against the genome, allows for more rapid decisions. Both
chemists and biologists have analysis techniques that can, and should, aid
the other. I will show some examples of this integration working, and talk
about linking this with literature analysis to understand the BIG picture,
whilst not losing sight of the details on either side.
![]()
CINF 3:
The BioPrint®
pharmaco-informatics platform: A large profile database for the development
of relevant predictive models
Frédérique Barbosa, Molecular Modelling, Cerep, 128, rue Danton,
92500 Rueil Malmaison, France, Fax: 33 1 55 94 84 10, F.Barbosa@cerep.fr
Abstract
Linking biological and chemical information for use in computational
approaches in order to predict biological activity, ADME profiles and
adverse drug reactions (ADR) is critical for enhancing the drug discovery
process. However, modeling approaches have been hampered by the lack of
large, robust and standardized training datasets. In an extensive effort to
build such a dataset, the BioPrint® database is continuously constructed by
systematic profiling of drugs available on the market, as well as numerous
reference compounds (at present, BioPrint includes more than 2,200 compounds
and 172 different assays). The database is composed of several large
datasets: compound pharmacology profiles, and complementary clinical data
including therapeutic use information, pharmacokinetics profiles and ADR
profiles. These data have allowed the development of predictive QSPR and
QSAR models. Models based on chemical structure are strengthened by in vitro
results that can be used as additional compound descriptors to predict
complex in vivo endpoints.
![]()
CINF 4:
Keeping up with the
changing face of Medline and MeSH - 3 keys to improving searches
Soaring Bear, MeSH, NLM/NIH, 8600 Rockville Pike B2E17, Bethesda, MD
20894, Fax: 301-402-2002, bears@mail.nlm.nih.gov
Abstract
The National Library of Medicine provides dozens of medical, chemical, sequence,
and structural databases, which can all be searched at one time with the new
Entrez interface (http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi). The
information explosion requires prudent search strategies to find more quickly
the data gems you are seeking in the growing haystack of science results.
Ambiguities of word meanings confound and frustrate. To help, the MeSH group
of the National Library of Medicine is continually updating the terms and
concept structure of the MeSH indexing vocabulary (http://www.nlm.nih.gov/mesh/2003/MBrowser.html)
used for Medline (http://Pubmed.gov). Some recent examples of these changes
in biology and chemistry are described, along with how you can keep up with
and use these changes for better search results. Three easy steps to better Medline
searches will be presented by an NLM expert. A balance of widening (with OR
terms) and narrowing (with NOT terms) can be facilitated with three tools
provided by PubMed: Details, Display Citation and MeSH Browser.
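The widening and narrowing described above can be sketched as a small query builder. This is only an illustration: the function name `build_query` and the example terms are hypothetical, and the sketch assumes nothing beyond the uppercase Boolean operators OR and NOT mentioned in the abstract.

```python
def build_query(any_of, none_of=()):
    """Combine search terms: OR widens the result set, NOT narrows it.

    `any_of` and `none_of` are plain term strings; both names are
    illustrative and not part of any NLM tool.
    """
    if not any_of:
        raise ValueError("at least one search term is required")
    # Widen: a record matches if it contains any of these terms.
    query = "(" + " OR ".join(any_of) + ")"
    # Narrow: drop records that contain an unwanted term.
    for term in none_of:
        query += " NOT " + term
    return query


# Hypothetical terms, chosen only to show the shape of the query.
print(build_query(["aspirin", "salicylates"]))
print(build_query(["aspirin", "salicylates"], none_of=["veterinary"]))
```

The resulting string can be pasted into the search box, and the Details view then shows how the terms were interpreted.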
![]()
CINF 5:
Steric and
electronic requirements of enzyme reactions
Johann Gasteiger1, Martin Reitz1, and Oliver
Sacher2. (1) Computer-Chemie-Centrum and Institute of Organic
Chemistry, University of Erlangen-Nuremberg, Naegelsbachstr. 25, Erlangen
91052, Germany, Fax: +49-9131-85 26566, Gasteiger@chemie.uni-erlangen.de,
(2) Molecular Networks GmbH
Abstract
Genes express proteins, enzymes, that govern biochemical reactions. A more
detailed understanding of these reactions requires an analysis of how the
substrates fit into the enzymes and of the physicochemical effects
influencing the bond breaking and making in enzyme reactions. In order to
advance such studies we have built a database of biochemical pathways that
represents chemical structures and reactions on the atomic level giving
access to each atom and bond of the substrates of enzyme reactions. This
database allows the study of transition state hypotheses of enzyme
reactions. Furthermore, the analysis of the physicochemical effects
operating at the reaction site allows a classification of enzyme reactions
that goes beyond the traditional EC code for enzymes.
![]()
CINF 6:
Linking chemical
scaffolds to gene families to help elucidate molecular mechanisms
Chihae Yang1, Paul E. Blower1, Kevin Cross1,
Glenn Myatt1, Wolfgang Sadée2, and Ying Huang2.
(1) Leadscope, Inc, Columbus, OH 43212, Fax: 614-675-3732, cyang@leadscope.com,
(2) College of Medicine and Public Health, The Ohio State University
Abstract
The significant investment in “omics” technologies and the large amount of
information generated by these new paradigms have not yet led to dramatic
productivity increases in the drug discovery process. Linking biology to
chemistry still remains the bottleneck. To link the vast amount of genomics
information to small molecule discovery, we previously correlated the gene
expression profiles of 60 NCI cancer cell lines to compound activity
patterns of the same cell lines, resulting in many possible gene-compound
pairs. In this paper, genes in specific biological process pathways were
correlated with active chemical scaffolds, whose associations were used to
build molecular hypotheses. Gene hierarchical classifications, based on
biological process, were used to differentiate gene expression patterns of
various cell types. The results from the gene hierarchy analysis are
compared to other computational methods for extracting subsets of
differentiating genes. This methodology allows us to extend our hypotheses
from individual gene-compound pair mappings to a systems approach of linking
gene families to compound scaffolds.
![]()
CINF 7:
Streamlining drug
discovery informatics: Accelerating the flow from gene to structure to
pre-clinical candidate
Dean R. Artis, Informatics, Plexxikon Inc, 91 Bolivar Drive,
Berkeley, CA 94710, Fax: (510) 548-4785, drartis@plexxikon.com
Abstract
Plexxikon’s Scaffold-Based Drug Discovery™ platform relies on a unique
combination of low-affinity biochemical screening of a proprietary
target-neutral compound library and structural characterization via
high-throughput x-ray crystallography, coupled to a powerful infrastructure
for computational analysis and design that bridges traditional
bioinformatics and cheminformatics. Use of these integrated systems has
resulted in the identification of many novel chemical starting points with
facile synthetic approaches and a target structure-directed optimization
path. This has enabled the efficient synthesis of lead compounds with
compelling bioactivity against proteins of interest in the kinase,
phosphodiesterase and nuclear receptor families. Examples highlighting the
role of Informatics approaches in Plexxikon’s efforts will be discussed,
including efforts leading to the rapid development of a new class of
anti-diabetic compounds with excellent potency, selectivity, pharmaceutical
properties and in vivo efficacy.
![]()
CINF 8:
Linking
bioinformatics to cheminformatics in biological networks
Barbara A. Eckman, Life Sciences, IBM, 1475 Phoenixville Pike, West
Chester, PA 19380, baeckman@us.ibm.com, and Julia E. Rice, IBM Almaden
Research Center
Abstract
As high-throughput biology generates large volumes of data about the
"parts list" of living organisms, the need grows for robust,
efficient systems to manage metabolic and signaling pathways, chemical
reaction networks, protein interaction networks, etc. Network data is
arguably best represented as graphs, which are not well supported by
standard relational database management systems. IBM Research is extending
DB2 with advanced graph operations, to support such queries as: “Find
all proteins related to protein A (i.e. within a given path length of A) in
a protein interaction graph, and retrieve related assay results and compound
structures.” “Find all pathways where compound x inhibits or slows a
reaction, and retrieve Gene Ontology classifications for all proteins
involved in the reaction.” “Find a subgraph of a large pathway that has
the same structure and involves the same enzyme as the subgraph that I have
circled, and retrieve associated protein and compound annotations.”
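The first query above ("within a given path length of A") amounts to a bounded breadth-first search. The following is a minimal sketch in plain Python, not the DB2 graph extension itself; the function name, the adjacency-list layout, and the toy interaction graph are all assumptions made for illustration.

```python
from collections import deque


def within_path_length(graph, start, max_len):
    """Return {node: distance} for nodes reachable from `start`
    in at most `max_len` edges, excluding `start` itself."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_len:
            continue  # do not expand beyond the allowed path length
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    del dist[start]
    return dist


# Toy, made-up protein interaction graph as an adjacency list.
interactions = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}
print(within_path_length(interactions, "A", 2))
```

In a database setting, the returned protein identifiers would then be joined against assay and compound tables to retrieve the related results.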
![]()
CINF 9:
Technical and
people disconnects hindering knowledge exchange between chemistry and
biology
Christopher A. Lipinski, Exploratory Medicinal Sciences, Pfizer
Global Research and Development, Groton Laboratories (retired), Eastern
Point Road, mail stop 8200-36, Groton, CT 06340, Fax: 860-715-3149,
christopher_a_lipinski@groton.pfizer.com
Abstract
Both technical and people factors hinder knowledge exchange between
chemistry and biology. For both disciplines software effort is expended on
data with little value. For example, capture and subsequent analysis of
large volumes of primary HTS data is difficult because of the very high
noise factor and hence is not very useful. Public access to primary
literature data is very different between the disciplines. Much of
searchable biology data is in the public domain while most of chemistry
structural data is not. Batch mode data searching is feasible in biology but
in chemistry batch mode searching capability is primitive. Chemistry's need
for batch mode chemical structure searching capability remains a problem,
for example with CAS SciFinder, a leading software search tool. The time
course of data capture and the very different complexity levels of gene and
protein structure representation compared to chemical structure
representation contribute to this issue. On the people side, software lags
in capture of high-level metadata, i.e. why decisions are made. Metadata
capture is complicated by people issues, particularly those between chemists
and biologists. Discipline based disconnects occur distressingly often and
are frequently overlooked as a cause of lost productivity. Many of the
problems between chemists and biologists are directly traceable to
differences in training and hence in attitudes and outlook. Most synthetic
chemists are math-averse, and any communication to chemists that relies
on mathematical equations will be underappreciated or even ignored.
Chemists are superb at pattern recognition but biologists are not. This
causes confusion and conflict with biology when a medicinal chemist makes a
judgment in just a few seconds as to the quality of a compound structure.
Expert systems that could capture the pattern recognition skills of
medicinal chemists are badly needed.
![]()
CINF 10:
Relating chemical
and biological space: An in-silico platform technology approach to
accelerate the discovery of novel medicinally relevant small molecules
Stephan C. Schürer, Director, Content Development, Sertanty, Inc,
1735 N. First Street, Suite 102, San Jose, CA 95112, Fax: 408 487 4011,
sschurer@sertanty.com
Abstract
In the post-genomic era of drug discovery, a promising approach appears to
be the systematic exploration of target families. It is critical in this
process to utilize all available and relevant SAR data and consider various
synthetic methodologies to most efficiently arrive at novel molecules that
have desired properties and are also amenable to further optimization.
Sertanty, Inc. has developed a discovery informatics platform – LUCIA™
– that facilitates archival, sharing, integration, and exploration of
synthetic methods and biological activity data. Using LUCIA, novel small
molecules can be generated in-silico and prioritized against computationally
efficient eScreens™ and ADMET models. eScreens are derived from
an integrated gene family-wide SAR knowledge base and can improve as new
experimental data is generated. Successful application of the technology has
resulted in the identification of novel ABL Kinase inhibitors in a four
month project and offers promise in both accelerating and enriching the
success-rate of collaborative hit identification and lead optimization. Our
next generation ChIP (Chemical Intelligence Platform) system explores
chemical space in-silico based on forward analysis of synthetic pathways.
Utilizing dynamic transforms that are generated from common representations
of chemical reactions, ChIP prospectively “mix-n-matches” compatible
synthetic strategies to generate novel compositions of matter with probable
improvements in potency, selectivity and ADMET profiles.
![]()
CINF 11:
Critical
assessment of chemo- and bio-informatics applications development, or,
"It's the infrastructure, stupid"
Doron Chema, Department of Medicinal Chemistry, Hebrew University of
Jerusalem, School of Pharmacy, Jerusalem 91120, Israel, doron_chema@md.huji.ac.il
Abstract
The increasing need for bridging chemo- and bio-informatics is an excellent
opportunity to reassess the development of applications in these fields and
the expected consequences of bridging together these disciplines.
Examination of the current situation may lead to the conclusion that both
fields currently suffer from a software crisis. This crisis involves several
aspects of the application development process. The data format
standardization problem is a well-known aspect of this crisis, as many
similar file and database formats co-exist, sharing similar goals. Another
aspect of this crisis may be called “too many tools for too small
missions.” Even a modest project usually requires
developers to manage several code environments, each of which was designed
and implemented with specific scientific goals in mind. Ironically, the
existence of many niche tools effectively causes a lack of appropriate
development tools. As a result, much of the development work is often
done from scratch, causing a huge waste of resources. It
is our belief that these major difficulties, which occur with high
frequency in both fields, are already causing major bottlenecks that have
even higher potential to block or delay any significant progress of the
integrated field. In this talk an approach for overcoming these barriers at
the infrastructure level will be described, followed by introduction of a
new infrastructure technology.
![]()
CINF 12:
Cross-discipline
analysis made possible with data pipelining
J.R. Tozer, SciTegic, Inc, 9665 Chesapeake Dr. #401, San Diego, CA
92123, Fax: 858 279 8804, jtozer@scitegic.com
Abstract
While cheminformatics and bioinformatics use completely different data
formats and analysis tools, the data pipelining approach makes it possible
to apply them together. Chemical compound structures and activities can be
processed in the same computing environment that analyzes gene expression
profiles or protein sequences. We will discuss some interesting research
questions that can only be addressed by the coordinated analysis in
bioinformatics and cheminformatics (e.g., clustering gene targets using the
correlation of their expression levels in a series of cells with the
biological activity on those cells of a set of test compounds).
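The example in parentheses above can be sketched as follows: for each gene, correlate its expression across a series of cell lines with each compound's activity on those same cell lines, giving one correlation profile per gene that a clustering method could then group. This is a plain-Python sketch, not the pipelining product's own components; the function names and the toy numbers are invented for illustration, and Pearson correlation is an assumed choice of measure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_profiles(expression, activity):
    """Map each gene to its correlations with every compound.

    Both inputs map a name to values measured over the same ordered
    series of cell lines. Compounds are taken in sorted-name order.
    """
    compounds = sorted(activity)
    return {
        gene: [pearson(levels, activity[c]) for c in compounds]
        for gene, levels in expression.items()
    }


# Made-up measurements over four cell lines.
expression = {"gene_x": [1, 2, 3, 4], "gene_y": [4, 3, 2, 1]}
activity = {"compound_1": [2, 4, 6, 8], "compound_2": [1, 3, 2, 5]}
for gene, profile in correlation_profiles(expression, activity).items():
    print(gene, [round(r, 2) for r in profile])
```

Genes whose profiles are close to one another respond to the compound set in a similar way, which is the basis for clustering them.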
![]()
CINF 13:
Informatics
integration at Arena Pharmaceuticals
Gareth Jones, Arena Pharmaceuticals, Inc, 6166 Nancy Ridge Drive, San
Diego, CA 92121, Fax: 8584537210, gjones@arenapharm.com
Abstract
The development of platform-independent web-based computing allows ordinary
users unprecedented access to corporate information. At Arena we have
developed a web-based informatics system that allows all employees access to
chemical, screening, genomic and gene-expression data. This system was
designed specifically to allow users with little or no computing experience
the ability to browse, analyze, update and edit chemical and biological
data. This results in real-time distribution of experimental data and allows
on-the-fly analysis and search of information. Additionally, communication
between disparate groups working on the same project has been greatly
facilitated.
The data system is based on a three-tier architecture with an Oracle database in the back end. The middle tier comprises a web server with Perl CGI and Java programs. Extensive use has been made of Java applets in the client web browser. A separate Linux cluster provides cheminformatics services to the middle tier, which are accessed using the XML-RPC protocol.
![]()
CINF 14:
Systematic
bioactivity classification of ligands onto a protein target ontology:
Application for library design and virtual profiling of a compound
collection
Mark A. Hermsmeier1, Dora Schnur2, and Bradley
C. Pearce1. (1) New Leads Chemistry, Bristol-Myers Squibb, P.O.
Box 4000, Princeton, NJ 08543, Fax: 609-252-7446, (2) Computer Assisted Drug
Design, Bristol-Myers Squibb
Abstract
Profiling the in-silico biological content of our screening deck and the
ability to create target class libraries are greatly facilitated using a
data platform that integrates ligand databases and a protein target
ontology. The data platform that has been developed integrates the
non-proprietary Gene Ontology from the GO Consortium with three commercially
available Ligand databases. The structures in these ligand databases have in
turn been linked to the screening compounds by atom pairs similarity. The
activity associations and similarity results are stored in a relational
database for rapid retrieval of results. A web interface has been deployed
that allows browsing the Protein Target Ontology and drilling down to view
associated ligands in the commercial databases and similar structures in the
screening deck. The data platform also allows rapid in-silico profiling of
the screening compounds.
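Linking screening compounds to annotated ligands by similarity can be sketched as below. The abstract does not say which coefficient is applied to the atom pairs, so the Tanimoto coefficient here is an assumption, as are the function names and the hand-written feature sets. Each feature stands for an atom pair written as (atom type, atom type, bond separation), and computing such features from real structures is outside this sketch.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


def similar_ligands(query, ligands, threshold):
    """Return (name, similarity) for ligands at or above `threshold`,
    most similar first."""
    scored = [(name, tanimoto(query, feats)) for name, feats in ligands.items()]
    hits = [item for item in scored if item[1] >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical atom-pair features: (atom type, atom type, bond separation).
screening_compound = {("C", "N", 2), ("C", "O", 3), ("N", "O", 4)}
annotated_ligands = {
    "ligand_1": {("C", "N", 2), ("C", "O", 3), ("C", "C", 1)},
    "ligand_2": {("S", "O", 1)},
}
print(similar_ligands(screening_compound, annotated_ligands, threshold=0.4))
```

A screening compound that passes the threshold would inherit the target annotations of the matching ligand, which is what allows the deck to be profiled in silico.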
![]()
CINF 15:
Proteomica™ –
An integrated system for analysis of biological and chemical data
Michael Farnum1, Sergei Izrailev1, and Dimitris
Agrafiotis2. (1) 3-Dimensional Pharmaceuticals, Inc, 665 Stockton
Dr, Exton, PA 19341, Fax: 610-458-8249, michael.farnum@3dp.com, (2)
Research Informatics, 3-Dimensional Pharmaceuticals, Inc
Abstract
In recent years, there has been an explosion in the amount of chemical and
genomic data. Chemical information has been driven by high-throughput
screening and analysis of large libraries of chemical compounds, both
physical and virtual, while genomic information has been generated through
full genome sequencing and annotation as well as by DNA microarray and other
high-throughput experiments. The number of protein crystal structures
deposited in the Protein Data Bank has also grown at an unprecedented rate.
Much effort has been made to relate the structure and properties of chemical
compounds to the structure and function of genes and proteins. However,
chemical and protein sequence information has been largely analyzed
separately, in part because very few databases and software packages provide
the connectivity required for analyzing and browsing the data
simultaneously. Proteomica™ is an architecture designed to integrate both
types of information. It is leveraged by advanced dimensionality reduction
techniques and provides the capability to visualize similarity in both the
property space of small molecules and the sequence space of target proteins.
Proteomica™ enables scientists to ask iterative questions about
biochemical experiments by combining information from external and in-house
sources. This presentation will demonstrate both the principles and
implementation of the system.
![]()
CINF 16:
Fedora: Federated
access to chemical and biological data
Scott Dixon, Vera Povolna, and David Weininger, Metaphorics, 441 Greg
Ave, Santa Fe, NM 87501, scott@metaphorics.com, vera@metaphorics.com
Abstract
Fedora is a technology which enables the rapid development of special
purpose HTTP servers designed for the analysis and integration of biological
and chemical information. These servers containing seemingly disparate data
can communicate with one another via a web browser and provide the
capability to mine data for complex relationships. The Fedora servers
include a metabolic pathway network (Empath), Protein-Ligand Association
Network (Planet), Traditional Chinese Medicines (TCM), the World Drug Index
(WDI), and others.
![]()
CINF 17:
Case study of IP
information management at a small pharmaceutical company
Susan Wollowitz, Wollowitz Associates, 455 Moraga Rd, Suite C,
Moraga, CA 94556, Fax: 925-247-1289, sue@wollowitz.com
Abstract
A case study will be presented of how a small pharmaceutical company
addressed their intellectual property information acquisition and document
management needs. The situation was initially evaluated including the demand
for IP creation and prosecution, the current capabilities and the
operational constraints. Issues identified were a need for an improved
document tracking system, better access to patent information and an ability
to proactively monitor the competitive landscape. The presentation will
discuss the options considered and selected as well as a retrospective
evaluation of the decision success.
![]()
CINF 18:
Low-income patent
management
John Santacruz, Division of Small Chemical Businesses, 1263 Fulton
Street, Rahway, NJ 07065, santacr2@aol.com
Abstract
Patent management on a low-income budget is a growing concern for Small
Chemical Businesses due to limited resources and multitasking of personnel.
Two methods of legal representation that significantly reduce the annual
costs of patent management will be discussed. The two methods will be
compared to the traditional method of private law firm representation. The
literature and laws in this area will be briefly reviewed.
![]()
CINF 19:
Minimizing
intellectual property cost - maximizing intellectual property return
Gianna Arnold, and Corinne Marie Pouliquen, Epstein Becker and Green,
1227 25th Street, NW, Suite 700, Washington, DC 20037-1175, Fax:
202-296-2882, garnold@ebglaw.com
Abstract
Today’s small business owner faces a vast array of decisions related to
the appropriate protection, utilization, and management of intellectual
assets. This discussion will focus on tools and strategies to maximize the
use of intellectual property dollars, by minimizing actual cost, and by
maximizing return. Topics addressed include establishing a scientific
advisory board; establishing process and screening criteria to
obtain/maintain patents; promoting and easing the burden of invention
disclosure; reducing costs associated with use of outside counsel;
capitalizing on intellectual property as a business asset; and aligning
intellectual property resources with corporate strategy.
![]()
CINF 20:
Patent searching
for small chemical businesses
Barbara Hurwitz, Barbara Hurwitz, consulting, 36 Waverly Street,
Portland, ME 04103, Fax: 207-228-6418
Abstract
Patent searches are run for small chemical companies either directly for the company or through the company’s outside counsel. Using three small businesses as case studies, we can see how interacting with these small companies differs from working with the staff of a large chemical and pharmaceutical company.
![]()
CINF 21:
Information
sources for small companies
Sandy Burcham, Service Is Our Business, Inc, 111 Lincoln Terrace,
Norristown, PA 19403-3317, Fax: 610-630-0863, cass123@earthlink.net
Abstract
This paper will discuss the various sources available to small companies -
in order to aid in the determination of the ways to best spend their
resources.
![]()
CINF 22:
Comparison of free
Internet-based intellectual property (IP) tools with contracting IP research
to third party information professionals
Michael I. Montembeau, and Gerri B. Potash, Nerac, Inc, 1 Technology
Drive, Tolland, CT 06084, Fax: 860-872-7856, mmontembeau@nerac.com
Abstract
Chemical businesses, whether large or small, have an enormous need for
intellectual property information. This need is particularly burdensome for
small chemical businesses which often cannot afford to hire full-time
information staff, let alone full-time patent information staff. As a
result, the small chemical businesses are left to appointing a lead IP
person, who must juggle their new IP duties with their research tasks and
other duties.
This presentation will: 1) outline the tools and capabilities of the free internet-based intellectual property resources; 2) compare the internet-based resources with those of a third-party information provider, such as Nerac.com; and 3) discuss the advantages and disadvantages of each resource and how one would make effective use of these resources.
This presentation will also describe how chemical businesses can benefit, not only from the Intellectual Property resources at Nerac, but also from the use of the extensive chemical and engineering related databases Nerac has compiled as a research and analysis tool.
![]()
CINF 23:
Professional tools
and services supporting the small to medium enterprise
Anthony J. Trippe, Science IP/Chemical Abstracts Service, 2540 Olentangy
River Rd., Columbus, OH 43210, atrippe@cas.org, and Rebecca A. Wolff,
Product Marketing Management, Chemical Abstracts Service, 2540 Olentangy
River Road, Columbus, OH 43202-1505, Fax: 614-461-7149, rwolff@cas.org
Abstract
Employees at small to medium enterprises must wear many different hats. With
each “hat” that they wear, they also strive to optimize their time,
present a professional image, and add value to their work. CAS provides a
number of tools and services that can assist the multi-hat wearer to not
only meet these needs, but to also meet the needs of both their internal and
external customers.
This presentation will explore how to use the latest STN software to:
1) take advantage of the patent content available on STN, 2) analyze the results to meet business critical needs, and 3) create professional-looking reports and tables.
For smaller organizations in particular, without the benefit of a sizable staff of information professionals, certain projects may require additional expertise or outside assistance to meet a critical deadline. For these situations, CAS has created Science IP, the CAS Search Service. This function is staffed with searching and analysis experts who can assist on a project-by-project basis. During this presentation, examples of searches with legal ramifications will be discussed and details will be provided on the advantages of working with Science IP on these types of requests.
![]()
CINF 24:
The Questel-Orbit
alternative for chemical information
Elliott Linder, Questel*Orbit, Inc, 7925 Jones Branch Drive, Fax:
703.873.4701, ELinder@questel.orbit.com, and Joseph M Terlizzi,
Questel-Orbit, 8000 Westpark Drive, jterlizzi@questel.orbit.com
Abstract
For over 25 years, Questel·Orbit has offered information specialists an
extensive collection of online patent databases containing chemical
information. For broad subject searching, the European, International, and
US classifications in our exclusive PlusPat database can be used, with easy
lookup using the ECLA and USPCL dictionary files. Narrower searching can be
conducted using the US, EP, and PCT full-text databases. For specific
chemical searching, our exclusive Merged Markush Service (MMS) for chemical
structure searching is available, as are codes and indexing in databases
produced by Derwent, IFI, CAS, INPI, and others. Special features allow the
creation of “super” display records composed of fields from any database
on the system. The standardization of patent numbers system-wide makes
cross-file searching for complementary information simple. Built-in
statistical analysis tools are easy-to-use and valuable for competitive
intelligence. This presentation will review how the techniques and features
outlined above are applicable for small chemical businesses.
![]()
CINF 25:
Instruments on the
Grid: UK national crystallography grid service
Jeremy G. Frey, Chemistry, University of Southampton, Department of
Chemistry, Highfield, Southampton SO17 1BJ, United Kingdom, Fax: +44 23 8059
3781, j.g.frey@soton.ac.uk
Abstract
We will describe the processes and infrastructure needed to develop and
deploy a grid service for access to and interaction with the UK EPSRC
National Crystallography Service (NCS) developed as part of the CombeChem
e-Science Pilot Project and with the assistance of the Centre of Excellence
in Combinatorial Chemistry, all largely based at the University of
Southampton, UK. Special consideration will be given to a discussion of the
sample tracking database and the implementation needed to run this national
service, the implications for the security of the service, and the system
employed to meet these requirements. The user interface, archiving methods
and notification systems will also be described along with the results of
the initial users' experience.
![]()
CINF 26:
Computational
science and engineering online: A web-based grid-computing environment for
research and education in computational science and engineering
Thanh N. Truong, Department of Chemistry, University of Utah, 315 S,
1400 E, Room 2020, Salt Lake City, UT 84112, Fax: 801-581-4354, truong@chemistry.chem.utah.edu
Abstract
We present the development of an integrated extendable web-based simulation
environment called Computational Science and Engineering On-line (CSEO) that
allows computational scientists to perform research using state-of-the-art
tools, query data from personal or public databases, discuss results with
colleagues, and access resources beyond those available locally from a web
browser. Currently, CSEO provides an integrated environment for multi-scale
modeling of complex reacting systems. A unique feature of CSEO is in its
framework that allows data to flow from one application to another in a
transparent manner. A particular example is demonstrated to show how results
from fundamental quantum chemistry simulations are used to calculate
thermodynamic and kinetic properties of a chemical reaction, which
subsequently are used in the simulation of a combustion reactor. Advantages,
disadvantages, and future prospects of a web-based simulation approach are
then discussed. CSEO can be accessed at http://cseo.net.
![]()
CINF 27:
Grid computing:
How applications are finally catching up to the technology
Chris Crafford, Engineering, United Devices, 12675 Research Blvd.,
Bldg. A, Austin, TX 78759, Fax: 512-331-6235, chris@ud.com, and Seetharamulu
Peddaiahgari, Director, Life Sciences Applications, United Devices
Abstract
The completion of the human genome has transformed drug discovery and
molecular targeting, vastly increasing the potential number of druggable
targets as well as information about their possible binding sites. Computer
power is essential to identifying and learning more about these targets.
With the appropriate grid solution, researchers can explore drug actions,
speed the development cycle and reduce costs, without sacrificing precision.
Several research organizations and top pharmaceutical companies are already
using the technology to gain a competitive edge. Multiple case studies will
be presented illustrating how researchers, with the help of top application
providers, are using grid computing now to achieve success.
![]()
CINF 28:
Virtual screening
using grid computing
W Graham Richards, Central Chemistry Lab, University of Oxford, South
Parks Road, Oxford, OX1 3QH, United Kingdom, graham.richards@chem.ox.ac.uk
Abstract
The screen saver project of the Chemistry Department at the
University of Oxford, United Devices Inc and Accelrys Inc now involves some
2.5 million PCs in over 220 countries and has provided more than 250,000
years of CPU time: an effective 100 teraflop facility. Such power permits
the virtual screening of billions of drug-like molecules against defined
protein targets within days or weeks. A review of the project, the
results obtained so far, and future opportunities will be presented.
![]()
CINF 29:
OpenMolGRID, a
Grid-based large-scale drug design system
Laszlo Urge1, Ákos Papp1, István Bágyi1,
Géza Ambrus2, and Ferenc Darvas1. (1) ComGenex Inc,
33-34 Bem rpk, Budapest, H-1027, Hungary, Fax: +361-214-2310, laszlo.urge@comgenex.hu,
(2) RecomGenex, Ltd
Abstract
Pharmaceutical companies face the challenge that modern drug
discovery requires precise "high-throughput" in silico systems
that can not only handle millions of structures but also give
accurate predictions for the requested properties. In addition,
mergers in the pharmaceutical industry demand the integration of
geographically distributed information and computation resources. These
challenges make the use of GRID systems indispensable. As a consequence,
chemical applications developed for traditional environments have to be
redesigned to meet the requirements of this new technology. OpenMolGRID
will be one of the first realizations of GRID technology in drug
design. The system is designed to build forward- and reverse-QSAR models,
and generate novel structures with favorable properties. The lecture details
how traditional chemical IT tools are implemented to solve
large-scale library design problems. The development of OpenMolGRID is
partly funded by the European Commission (IST-2001-37238).
![]()
CINF 30:
BioSimGRID: A
distributed database for biomolecular simulations
Jonathan W Essex1, Kaihsu Tai2, Stuart Murdock1,
Muan Hong Ng3, Bing Wu4, Steve Johnston3,
Hans Fangohr3, Paul Jeffreys4, Simon Cox3,
and Mark Sansom2. (1) School of Chemistry, University of
Southampton, Highfield, Southampton SO17 1BJ, United Kingdom, Fax: +44 (0)23
8059 3781, jwe1@soton.ac.uk, (2) Department of Biochemistry, University of
Oxford, (3) e-Science Centre, University of Southampton, (4) e-Science
Centre, University of Oxford
Abstract
Biomolecular simulations provide data on the conformational dynamics and
energetics of complex biomolecular systems. We aim to exploit the Grid
infrastructure developing in the UK to enable large scale analysis of the
results of such simulations. The BioSimGRID project (www.biosimgrid.org)
will provide a generic database for comparative analysis of simulations of
biomolecules of biological and pharmaceutical interest. The system will have
a service-oriented computing model using Grid-based Web service technology
to deliver analysis. Data mining services will be provided for the
biomolecular simulation and structural biology communities, using a Python
scripting environment. To address the security problem of the heterogeneous
BioSimGRID environment, a Grid certificate-based and a user/password-based
authentication mechanism will be integrated across the system. The back-end
of BioSimGRID is based on a relational database, with appropriate indexing
to optimize performance of the analysis package.
![]()
CINF 31:
Comb-e-Chem:
GRID-enabled chemical crystallography and a new opportunity for structural
chemistry
Michael B. Hursthouse, Department of Chemistry, University of
Southampton, Southampton SO17 1BJ, United Kingdom, Fax: 44-2380-596723,
M.B.Hursthouse@soton.ac.uk
Abstract
We are exploring the feasibility of an e-Science approach to provide an
integrated, GRID-enabled, Chemical Structure and Property Environment,
incorporating a co-ordinated high-throughput crystal structure determination
and property measurement capability, with distributed structure and property
calculations and data-base mining. We are developing new software for automated
pattern searching in crystal structures, with a view to learning more about
crystal structure assembly, polymorphism and materials properties. In a
related E-Bank project, we are developing procedures for automated archiving
and dissemination of fundamental data, subsequent processing and
calculations, and the derived knowledge, so that publications in which the
new information is assessed and presented are not compromised by the
need to carry the data with them. This presentation will report and review the
status of these activities.
![]()
CINF 32:
Semantic Grid
computing - the WorldWideMolecularMatrix
Yong Zhang1, Robert C. Glen2, Peter Murray-Rust3,
Henry S. Rzepa4, and Joe A Townsend2. (1) Unilever
Centre for Molecular Sciences Informatics, University of Cambridge,
Lensfield Road, Cambridge, United Kingdom, yz237@cam.ac.uk, (2) Department
of Chemistry, Unilever Centre for Molecular Science Informatics, (3)
Unilever Centre for Molecular Informatics, University of Cambridge, (4)
Chemistry, Imperial College
Abstract
The Semantic Web is Tim Berners-Lee's vision of knowledge-based computing for the Web. We have shown how this can be adapted to chemistry. Our implementation uses XML-CML for molecules and properties and the new IChI as a unique key calculated directly from the connection table. A molecule can be precisely differentiated from any other and retrieved by conventional database methods.
The NCI database has ca. 250,000 molecules, which we converted into CML using openbabel. These are stored in a native XML database, Xindice, and searched with the XPath language. We can retrieve molecules within 50 milliseconds.
Molecular properties were calculated using MOPAC2003, using Condor and the spare CPU time on 24 PCs. Times per molecule varied from 0.5 seconds to 500,000 seconds; the calculations took 4 months.
The XML results are openly available on our WorldWideMolecularMatrix, WWMM. A chemist submits a molecule. If its properties already exist they are returned; otherwise the computation is run. For new molecules the results are provided through an RSS system (CMLRSS).
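The look-up-or-compute workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element and attribute names, the keys, and the `compute_properties` placeholder are hypothetical and are not taken from the WWMM software, and an in-memory XML tree stands in for the native XML database.

```python
import xml.etree.ElementTree as ET

# Hypothetical CML-like store; element names, attribute names and keys
# are illustrative only and do not reproduce the real CML schema.
STORE = ET.fromstring(
    "<molecules>"
    "<molecule id='key-A'><property title='heatOfFormation'>-17.9</property></molecule>"
    "<molecule id='key-B'><property title='heatOfFormation'>12.5</property></molecule>"
    "</molecules>"
)

def compute_properties(key):
    """Placeholder for a long-running calculation submitted to a job queue."""
    return {"heatOfFormation": 0.0}

def get_properties(key):
    """Return (properties, was_cached) for the molecule with this unique key."""
    # ElementTree supports a limited XPath subset, including attribute predicates.
    hit = STORE.find(f"./molecule[@id='{key}']")
    if hit is not None:
        props = {p.get("title"): float(p.text) for p in hit.findall("property")}
        return props, True
    # Not stored yet: run the calculation and add the result to the store.
    props = compute_properties(key)
    new = ET.SubElement(STORE, "molecule", id=key)
    for title, value in props.items():
        ET.SubElement(new, "property", title=title).text = str(value)
    return props, False

print(get_properties("key-A"))  # found in the store
print(get_properties("key-C"))  # computed, then stored for later requests
```

Using the unique identifier as the look-up key is what lets a repeated submission of the same molecule return stored results instead of triggering a new calculation.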
The system is a peer-to-peer Grid for chemical information and computation. The software can be downloaded, and we invite other groups to run servers with varied functions so that a Semantic Grid for chemistry becomes possible.
We thank the DTI and Unilever PLC.
![]()
CINF 33:
Adaptive
informatics infrastructure for multi-scale chemical science
James D. Myers1, Larry Rahn2, David Leahy2,
Carmen M. Pancerella2, Gregor von Laszewski3, Branko
Ruscic4, and William H. Green Jr.5. (1) Collaboratory
Group Leader, Battelle / Pacific Northwest National Laboratory, Battelle
Blvd. MS K1-87, Richland, WA 99352, Fax: 509-375-6631, jim.myers@pnl.gov,
(2) Sandia National Laboratories, (3) Mathematics and Computer Science
Division, Argonne National Laboratory, (4) Chemistry Division, Argonne
National Laboratory, (5) Department of Chemical Engineering, Massachusetts
Institute of Technology
Abstract
The Collaboratory for Multi-scale Chemical Sciences (CMCS, cmcs.org) is
enabling the flow of information across physical scales and scientific
disciplines ranging from subatomic quantum chemistry to predictive
simulations of chemical processes such as combustion. CMCS is using advanced
collaboration and metadata-based data management technologies to develop a
portal providing distributed research support, community interactions, and
data discovery, management, and annotation capabilities. The portal assists
in documenting and browsing data pedigree and in communicating dependencies
between data produced at one scale and computations using it at the next. A
variety of standards-based mechanisms for extracting metadata from files,
translating between schema, converting data formats, and integrating
external applications (such as Active Thermochemical Tables) are being
developed to minimize the work required to adopt CMCS capabilities. These
capabilities are being piloted by involving key national chemistry resources
(data and software) and by supporting distributed groups performing
informatics-based chemical research in combustion science.
![]()
CINF 34:
The application of
distributed computing to computer simulations
Jonathan W Essex1, Christopher J. Woods1,
Adrian P. Willey1, Luca A. Fenu1, Andrew C. Good2,
Andrew R. Leach3, Richard A. Lewis4, and Jeremy G.
Frey1. (1) School of Chemistry, University of Southampton,
Highfield, Southampton SO17 1BJ, United Kingdom, Fax: +44 (0)23 8059 3781,
jwe1@soton.ac.uk, (2) Structural Biology and Modeling, Bristol-Myers Squibb,
(3) Computational Chemistry and Informatics, GlaxoSmithKline Research and
Development, (4) Lilly Research Centre
Abstract
Distributed computing is a very popular, and potentially very powerful,
approach for accessing large amounts of computational power. Under the
umbrella of the Comb-e-Chem project, we have examined both freely available
and commercial distributed computing software. In this paper, our
experiences will be described. The performance of coarsely parallel
computations, such as protein-ligand docking, and more tightly coupled
replica-exchange molecular dynamics computer simulations will be assessed.
Issues of security will also be discussed, and in particular how security
determines the availability and utility of computers within a large
organisation.
![]()
CINF 35:
Virtual Research
Parks enable multi-organizational collaboration
Gary G Benesko, Life Sciences, IBM, 755 Cypress Rd., St. Augustine,
FL 32086, Fax: 419-735-6288, benesko@us.ibm.com
Abstract
A Virtual Research Park (VRP) is a secure, state-of-the-art, Web-based
research environment that supports and facilitates joint R&D,
collaboration, and commercial activities among Life Science Communities
whose boundaries extend beyond any one enterprise or geography. Each
Community can consist of multiple related organizations and individuals
united by common interests, such as:
- Accelerating innovation using an advanced set of collaboration tools across an extended team
- Leveraging external expertise through Virtual Consulting services
- Streamlining the R&D process through access to Best Practice applications, a wide range of data sources, and state-of-the-art R&D tools
- Organizing and managing common projects and common resources
- Sharing of common data and applications
- Leveraging external resources "On Demand" (e.g. compute grids, storage grids, external applications)
- Decreasing mutual costs via a common commercial platform with access to external suppliers and vendors of goods and services
![]()
CINF 36:
Structure-activity
relationships for the design of molecules (STARDoM): The development and
implementation of grid-enabled, automated predictive QSAR modeling
Alexander Tropsha1, Scott Oloff2, Alexander
Golbraikh1, Chi-Duen Poon3, Terry O'Brien4,
Michael Blocksome4, Rich Dulaney4, Madhu Gombar4,
and Virinder Batra4. (1) Laboratory of Molecular Modeling, School
of Pharmacy, The University of North Carolina at Chapel Hill, 301 Beard
Hall, CB# 7360, UNC-CH, Chapel Hill, NC 27599, tropsha@email.unc.edu, (2)
Department of Pharmacology, University of North Carolina at Chapel Hill, (3)
Department of Chemistry, University of North Carolina, (4) IBM Life Sciences
Abstract
QSAR models are typically generated with a single modeling technique. Our
research has demonstrated that multiple models should be generated for any
dataset to ensure their statistical significance and predictive power. We
have developed a combinatorial QSAR approach which explores all possible
combinations of various descriptor sets and optimization methods coupled
with external model validation. This approach required integration of
multiple individual protocols dealing with descriptor generation, model
development and validation, and model application to external database
mining to identify potentially active hits. The integration of the protocols
developed at UNC was achieved in collaboration with the IBM Life
Sciences team using the WebSphere framework and implemented on the North
Carolina BioGrid through the Globus Toolkit. This solution is automated,
efficient, and accessible to users via a web interface. It was successfully
applied to the discovery of novel anticonvulsant agents as well as novel
ligands of the P2Y12 receptor.
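The combinatorial idea in this abstract, trying every pairing of descriptor set and modeling method and keeping only models that pass external validation, can be sketched as follows. Everything here is an assumption made for illustration: the toy descriptors, the two toy methods, the data, and the 0.6 acceptance threshold are hypothetical and are not the protocols developed at UNC.

```python
from itertools import product

def r_squared(observed, predicted):
    """Coefficient of determination on an external (held-out) set."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot

def fit_mean(X, y):
    """Toy method 1: always predict the training mean."""
    mean = sum(y) / len(y)
    return lambda x: mean

def fit_nearest_neighbour(X, y):
    """Toy method 2: predict the activity of the closest training compound."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[dists.index(min(dists))]
    return predict

def combinatorial_qsar(descriptor_sets, methods, train, external, min_r2=0.6):
    """Try every descriptor-set/method pair; keep models passing external validation."""
    accepted = []
    for (d_name, featurize), (m_name, fit) in product(descriptor_sets.items(),
                                                      methods.items()):
        predict = fit([featurize(m) for m, _ in train], [y for _, y in train])
        score = r_squared([y for _, y in external],
                          [predict(featurize(m)) for m, _ in external])
        if score >= min_r2:  # hypothetical acceptance threshold
            accepted.append((d_name, m_name, score))
    return sorted(accepted, key=lambda t: -t[2])

# Hypothetical data: activity depends on feature "a"; feature "b" is noise.
train = [({"a": 1, "b": 5}, 2), ({"a": 2, "b": 1}, 4),
         ({"a": 3, "b": 4}, 6), ({"a": 4, "b": 2}, 8)]
external = [({"a": 1.1, "b": 2}, 2.2), ({"a": 2.9, "b": 5}, 5.8),
            ({"a": 4.1, "b": 1}, 8.2)]
descriptor_sets = {"a_only": lambda m: [m["a"]], "b_only": lambda m: [m["b"]]}
methods = {"mean": fit_mean, "nearest_neighbour": fit_nearest_neighbour}

models = combinatorial_qsar(descriptor_sets, methods, train, external)
print(models)
```

On this toy data only the pairing of the informative descriptor with the nearest-neighbour method survives external validation, which is the kind of filtering the combinatorial approach is meant to provide.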
![]()
CINF 37:
Development of a
personal computing environment for molecular design on Grid
Umpei Nagashima1, Takeshi Nishikawa1, Satoshi
Sekiguchi1, Sumie Tajima2, Toru Yagi2,
Takeshi Kitayama2, and Makoto Haraguchi2. (1) Grid
Technology Research Center, National Institute of Advanced Industrial
Science and Technology, Umezono 1-1-1, Tsukuba, Japan, Fax: +81-29-861-5301,
u.nagashima@aist.go.jp, (2) Bestsystems Inc
Abstract
We are developing a personal computing environment for molecular design on
the Grid as an attempt at computational chemistry in a Grid environment. In this
talk, we introduce two products: MolWorks (http://www.molworks.com) and
Gaussian Portal. MolWorks supports molecular modeling, input data
generation, output analysis, and job control for molecular orbital
calculations on the Grid. Estimation of molecular properties is also
supported. Gaussian Portal is an attempt to construct a framework for
a Grid-enabled application service provider. These two products are expected
to realize a desktop virtual laboratory for chemists and achieve high
throughput by integrating PC clusters, supercomputers, and databases with
an intelligent scheduler.
![]()
CINF 38:
Heterojunctions of
nanomaterials and organic-inorganic nanoassemblies
Cengiz S. Ozkan, Electrical and Chemical Engineering, Biomaterials
and Nanotechnology Laboratory, Center for Nanoscience Innovation for
Defense, University of California, Riverside, CA 92521, cengiz.ozkan@ucr.edu
Abstract
Nanomaterials including carbon nanotubes and nanocrystals have considerable
potential as building blocks in future nanoelectronics and
bio-nanotechnology applications. The unique electrical, mechanical, and
chemical properties of CNTs have made them intensively studied materials
in the field of nanotechnology within the last decade. Nanocrystals or
quantum dots provide a remarkable opportunity for designing artificial
solids, since they possess unique and controllable physical and chemical
properties based on their composition, structure and size. Another heavily
investigated area includes the conjugation of inorganic nanomaterials with
biomolecules including DNA and protein for various applications in
bio-nanotechnology. In this talk, I will first describe approaches for the
synthesis of nano-assemblies of carbon nanotubes and quantum dots. Such
functional nanostructures could become better alternatives for the
fabrication of nanoscale electronic and photonic devices. They could also be
useful for the bottom-up assembly of nanosystems as part of larger or
microsystem technologies. Detailed chemical and physical characterization of
the nanostructures will be presented via transmission electron microscopy
and Fourier transform infrared spectroscopy. Next, approaches for
encapsulating biological molecules including DNA inside carbon nanotubes,
which could be useful for a number of applications including novel
electronics, DNA sequencing and drug delivery systems, will be presented.
DNA oligos labeled with nano-colloid particles are encapsulated into
multiwalled carbon nanotubes and the nanoassemblies are characterized via
transmission electron microscopy and energy dispersive spectroscopy.
![]()
CINF 39:
Effects of the
presence of nanotubes on heat transfer in microfluidics
Nishitha Thummala, and Dimitrios V Papavassiliou, School of Chemical
Engineering and Materials Science, The University of Oklahoma, 100 E Boyd,
SEC T-335, Norman, OK 73019-1004, Fax: 405-325-5813, nishitha@ou.edu
Abstract
The drive for technical advancements in the micro/nano world, emerging from
the desire to manipulate flow fields at smaller and smaller scales, is
indeed challenging. An effective and reliable numerical tool for the
analysis of transport properties in microfluidics is the Lattice Boltzmann
Method (LBM). It can efficiently link the microscopic and macroscopic
phenomena. Our group is using LBM to simulate single-phase flow in
configurations such as parallel plates and porous media. The paper will focus on
simulation of heat transport from surfaces that have nanotubes aligned
vertically as line sources or horizontally as point sources. Lagrangian
Scalar Tracking (LST) methods are used to track the trajectories of heat
particles released in the flow field, and to synthesize the behavior of the
mean temperature profile from the behavior of the instantaneous sources of
heat. The effect of the presence of nanotubes on the heat transfer
characteristics will be discussed.
![]()
CINF 40:
Computational
nanotechnology: Bridging lengthscales with Materials Studio
Amitesh Maiti, Gerhard Goldbeck-Wood, and Scott Kahn, Accelrys
Inc, 9685 Scranton Road, San Diego, CA 92121, Fax: 858-799-5100, amaiti@accelrys.com,
scott@accelrys.com
Abstract
Nanotechnology holds tremendous economic and scientific potential, yet it
will cost industry a considerable amount of time, money, and resources to
research and develop new processes, devices, and synthesis techniques. The
use of rational materials discovery software tools in conjunction with
experimentation can lower this barrier significantly, and lead to new
insights that may not be possible otherwise. Technologically important
nanomaterials come in all shapes and sizes. They can range from small
molecules to complex composites and mixtures. Depending upon the spatial
dimensions of the system and properties under investigation, computer
modeling of such materials can range from first-principles Quantum
Mechanics, to Forcefield-based Molecular Mechanics, to mesoscale simulation
methods, to the prediction of structure-property relationships. All of the
above computational techniques are available in Accelrys’ integrated PC
platform Materials Studio™, as illustrated through a number of recent
applications: (1) carbon nanotubes (CNTs) as nano electromechanical sensors
(NEMS); (2) metal-oxide nanoribbons as chemical sensors; (3) mesoscale
modeling of polymer-CNT nanocomposites; and (4) mesoscale diffusion of drug
molecules across cell membranes.
Another big challenge for the nanotechnologist is the very large space of possible material parameters and processing routes. Recent developments in Materials Informatics provide crucial knowledge management and data mining tools for better, cheaper and faster materials development. Design of Experiment, Combinatorial and High Throughput materials design software help to focus research and development on the most promising areas.
![]()
CINF 41:
Chemical
information resources for nanotechnology
Robert A Stembridge, Global Marketing Services, Thomson Scientific,
14 Great Queen Street, London, United Kingdom, bob.stembridge@thomson.com
Abstract
Nanotechnology is a young area dating back to Richard Feynman's intellectual
demonstration in 1959 of the possibility of placing a facsimile of the
entire Encyclopaedia Britannica on a pin-head. Much information is still in
the realm of research papers published in learned journals and on the web,
but increasingly practical applications of the technology are appearing in
the patent literature, particularly in the area of chemical nanotechnology.
This paper will illustrate these trends, examine the challenges for the user
of tracking multiple sources of this information and discuss possible
solutions to these problems.
![]()
CINF 42:
A method for
estimating the composite solubility vs. pH profile
Michael B. Bolger, Pharmaceutical Sciences, USC School of Pharmacy
and Simulations Plus, Inc, 1985 Zonal Ave. PSC 700, Los Angeles, CA 90089,
Fax: 323-442-1390, bolger@usc.edu, Christel Bergstrom, Department of
Pharmacy, Uppsala University, Robert Fraczkiewicz, Life Sciences Department,
Simulations Plus, Inc, and Per Artursson, Division of Pharmaceutics, Uppsala
University
Abstract
Purpose: To predict the shape of the composite solubility vs. pH
profile by using purely in silico estimation. Method: The
complete solubility vs. pH profile for 25 monobasic drug molecules was
collected and molecular descriptors were generated using QMPRPlus. We then
examined relationships between intrinsic solubility and several other
molecular descriptors to predict the solubility factor (ratio of the solubility
of the ionized form to that of the un-ionized form). Results: A simple linear relationship
between intrinsic solubility and solubility factor showed that the
solubility factor is inversely proportional to the experimental value of
intrinsic solubility. We then developed a multiple linear regression
equation to predict log of solubility factor using intrinsic solubility and
number of hydrogen bond donors and acceptors as independent variables. Conclusions:
A relationship between log of intrinsic solubility and solubility factor,
when corrected for the number of hydrogen bond donors and acceptors, can
provide a good estimate of salt solubility for a small set of monoprotic
basic drugs.
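To make the quantities in this abstract concrete, here is a minimal sketch of a solubility vs. pH profile for a monoprotic base. It uses the standard Henderson-Hasselbalch relation for the pH-dependent part and, as a loudly labeled assumption that is not taken from the abstract, caps the curve at a salt solubility equal to the intrinsic solubility multiplied by the solubility factor. The numerical values are hypothetical.

```python
def composite_solubility(ph, intrinsic_solubility, pka, solubility_factor):
    """Sketch of the composite solubility of a monoprotic base at a given pH.

    intrinsic_solubility: solubility of the un-ionized form (any consistent unit)
    solubility_factor:    ratio of ionized-form to un-ionized-form solubility
    """
    # Henderson-Hasselbalch for a base: total = S0 * (1 + 10**(pKa - pH)).
    ph_limited = intrinsic_solubility * (1.0 + 10.0 ** (pka - ph))
    # Assumption: the profile plateaus at the salt solubility S0 * SF.
    salt_limit = intrinsic_solubility * solubility_factor
    return min(ph_limited, salt_limit)

# Hypothetical drug: S0 = 0.01 mg/mL, pKa = 8, solubility factor = 1000.
for ph in (2.0, 6.0, 8.0, 12.0):
    print(ph, composite_solubility(ph, 0.01, 8.0, 1000.0))
```

At high pH the curve approaches the intrinsic solubility, and at low pH it flattens at the assumed salt-solubility plateau, giving the overall shape of a composite profile.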
![]()
CINF 43:
A systematic name
generator module for Marvin
Szilveszter Juhos, Gyorgy Pirok, and Ferenc Csizmadia, ChemAxon Ltd,
Maramaros koz 3/a, 1037 Budapest, Hungary, Fax: +36 1 4532659, sjuhos@chemaxon.com
Abstract
Constructing systematic names for single molecules based on IUPAC rules can be rather time-consuming and requires chemists experienced in complex nomenclature. Naming a large number of structures manually is practically impossible so several automatic name generating software tools have been developed.
Our module is a platform-independent Java plugin linked to Marvin to facilitate generating IUPAC names for individual molecule sketches or for whole databases via batch processing. It can be easily integrated into other Java applications or applied over intranet/web pages. The throughput and accuracy of name generation will be demonstrated in the poster.
![]()
CINF 44:
Chemical
information in Medline/PubMed
Beryl M. Benjers, Index section, National Library of Medicine,
Bethesda, MD 20894, Fax: 301-402-2433, benjersb@mail.nlm.nih.gov
Abstract
MEDLINE contains more than 12 million citations from 1966 to present.
Pre-1966 citations are now being added to OldMEDLINE. More than 4,500
journals in languages from around the world are indexed. Last year over
537,000 indexed citations were added to MEDLINE. Indexers analyze the
article and index at an average rate of four articles/hour, applying 8-10
subject terms from MeSH, NLM’s controlled vocabulary. New indexers attend
a rigorous two-week training course at NLM and then work closely with a
reviser, who reviews their work. An asterisk with a MeSH subject term
indicates the main point of an article, and that the article will be cited
under that term in Index Medicus, the print counterpart of MEDLINE. MEDLINE
citations and abstracts are available as the primary component of NLM’s
PubMed database and retrieval system, which is searchable free-of-charge via
the Internet.
MeSH contains 22,568 descriptors, of which 7,355 are chemical descriptors, supplemented by 138,526 chemical concepts (Supplementary Concept Records). New MeSH descriptors are added annually while Supplementary Concept Records are added daily as they are encountered in the indexed literature. New chemicals are electronically flagged for the chemical specialists, who study, research, update, and/or create new records as needed, and add them to the indexed citation and MeSH Browser. This allows MEDLINE citations to be indexed with the existing terms as well as the new ones.
MEDLINE indexing of chemical concepts includes coordination with a Pharmacological Action (PA) when appropriate. Indexing Information (II) terms may also be added with chemicals (e.g. disease/organism associated with a chemical).
The MeSH Browser is available at http://www.nlm.nih.gov/mesh/2004/MBrowser.html and can be searched by MeSH terms, Supplementary Concepts, ID, II, PA, RN, RR and EC numbers. MEDLINE/PubMed can be searched by MeSH terms, Supplementary concepts, authors, text words, journal, etc.
The National Library of Medicine (NLM) Home pages (http://www.nlm.nih.gov) offer information and links to other databases, such as MEDLINEplus and CHEMIDPlus.
![]()
CINF 45:
Conformational
folding process of a small-peptide predicted by using CONFLEX conformation
search and GRID technology
Hitoshi Goto1, Kazuo Ohta2, Umpei Nagashima3,
Yoshihiro Nakajima4, Mitsuhisa Sato4, and Hiroshi
Chuman5. (1) Department of Knowledge-based Information and
Engineering, Toyohashi University of Technology, Toyohashi 441-8055, Japan,
Fax: 81-532-48-5588, gotoh@cochem2.tutkie.tut.ac.jp, (2) Conflex
Corporation, (3) Grid Technology Research Center, National Institute of
Advanced Industrial Science and Technology, (4) Graduate School of Systems
& Information Engineering, University of Tsukuba, (5) Faculty of
Pharmaceutical Sciences, University of Tokushima
Abstract
Among the fundamental problems in elucidating biomolecular functions with
the aid of theoretical and computational chemistry, the first difficulty to
overcome is the conformational flexibility problem, especially as it relates to
the folding problem of proteins. To address these challenging problems, we
have begun improving our original conformational space search
method "CONFLEX" using parallel computing and Grid techniques. At the
previous ACS meeting, we reported the master-and-worker parallelization and
worldwide distributed GRID computing techniques used in the CONFLEX
conformation search algorithm, along with performance data for some small
peptides. At this Anaheim meeting, we will present a folding process of a
small polypeptide predicted by conformational analyses using a clustering technique
based on the conformational distance matrix among backbone conformations.
Some interesting animations and movies will also be
shown.
![]()
CINF 46:
Combining
fingerprints and other descriptors in virtual HTS
Zsuzsanna Szabo, Miklos Vargyas, Ferenc Csizmadia, and Gyorgy Pirok,
ChemAxon Ltd, Maramaros koz 3/a, 1037 Budapest, Hungary, Fax:
+36-1-453-2659, fcsiz@chemaxon.com
Abstract
Various aspects of virtual screening using molecular descriptors of 2-dimensional chemical structures have been investigated over the last two years at ChemAxon. The work involved the implementation of various descriptors and metrics as well as the optimization of some of the parameters. The poster to be presented summarizes our results to date.
When setting up a virtual screening experiment, researchers are faced with the problem of choosing the right combination of the available descriptors. Additionally, some descriptors take several parameters, which dramatically increases the overall degrees of freedom. Finally, when comparing descriptor values one can choose from numerous dissimilarity metrics. To cope with this freedom of choice, an automated optimization tool has been implemented.
This tool has proved to be successful in helping chemists to choose suitable descriptors, metrics and parameter values for virtual screening. It will be demonstrated that optimization can increase the enrichment ratio of the screening procedure.
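As an illustration of the quantities involved, the sketch below ranks a small library by Tanimoto dissimilarity to a query fingerprint and computes an enrichment ratio for the top of the ranking. The fingerprints, molecule names and the enrichment definition used here (hit rate in the top N divided by the hit rate in the whole library) are assumptions for this example; the abstract does not specify how the ChemAxon tool defines them.

```python
def tanimoto_dissimilarity(a, b):
    """1 - |A & B| / |A | B| for fingerprints stored as sets of on-bits."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union

def enrichment_ratio(query, library, actives, top_n):
    """Hit rate among the top_n closest molecules relative to the overall hit rate."""
    ranked = sorted(library,
                    key=lambda name: tanimoto_dissimilarity(query, library[name]))
    hits_in_top = sum(1 for name in ranked[:top_n] if name in actives)
    return (hits_in_top / top_n) / (len(actives) / len(library))

# Hypothetical fingerprints (sets of bit positions) and known actives.
query = {1, 2, 3, 4}
library = {
    "m1": {1, 2, 3, 4, 5},
    "m2": {1, 2, 3},
    "m3": {7, 8, 9},
    "m4": {1, 8, 9, 10},
}
actives = {"m1", "m2"}

ratio = enrichment_ratio(query, library, actives, top_n=2)
print(ratio)
```

An optimization tool of the kind described would vary the descriptor, its parameters and the metric, and keep the combination that maximizes a score such as this ratio.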
![]()
CINF 47:
Drug discovery
using grid technologies and DrugML
Michiaki Hamada, Science and Technology Group, Fuji Research
Institute Corporation, Tokyo 101-8443, Japan, mhamada@star.fuji-ric.co.jp, Yuichiro
Inagaki, Science and Technology Group, Fuji Research Institute
Corporation, Tokyo 101-8443, Japan, yinagaki@star.fuji-ric.co.jp, Hitoshi
Goto, Toyohashi University of Technology, Umpei Nagashima, National
Institute of Advanced Industrial Science and Technology, Shigenori Tanaka,
Toshiba Research and Development Center, and Hiroshi Chuman, Tokushima
University
Abstract
A large number of computer resources, such as CPUs and storage, can be connected
over networks to construct a huge virtual computing environment using grid
technologies. Our project "g-Drug Discovery" aims at developing a
platform for drug design using grid technologies, on which various analyses
and calculations are conducted, such as molecular mechanics, replica
exchange, docking with proteins, molecular orbital methods, and
3-dimensional quantitative structure-activity relationship modeling. To store data
on compound structures, descriptors, and calculation results, we are
developing DrugML by extending CML. These grid technologies can be used with
DrugML for tasks ranging from rough screening by drug-likeness or ADMET
properties to screening by very precise calculation.
![]()
CINF 48:
Investigation of
molecular chirality in 3D chemical structure databases
Zengjian Hu1, William M. Southerland1,
and Shaomeng Wang2. (1) Department of Biochemistry and Molecular
Biology, Howard University College of Medicine and the Howard University
Drug Discovery Unit, 520 West Street, Northwest, Room 324, Washington, DC
20059, huzengjian@hotmail.com, wsoutherland@howard.edu, (2) Departments of
Internal Medicine and Medicinal Chemistry, University of Michigan
Abstract
In recent years, virtual screening of chemical databases using molecular
docking has emerged as the most important tool and a well-established method
in drug discovery for finding new leads. The first step in virtual screening
is to create a searchable database of three-dimensional structures of small molecules.
In the past few years, we have created 9 small molecule 3D searchable
databases which contain more than 1,000,000 molecular entries, and could be
used to discover interesting ligands for various pharmaceutical targets.
When producing 3D chemical databases for screening purposes, we found
that the 2D chemical database connection tables contain no information about
the absolute stereochemistry (R-S) or double bond geometry (E-Z) of most
compounds. Today more than 50% of marketed drugs are chiral. Chiral
drugs, which are safer, exhibit fewer side effects, and are more potent than
the drugs previously used, have become a major focus of most pharmaceutical
companies. As chiral molecules will certainly play a role in the
exploitation of 3D space for the development of new drugs, the creation of a
3D database that takes the chirality of molecules into account will be
beneficial for the discovery of lead compounds that bind to molecular targets.
As the first step, we analyzed the chirality of molecules in our 10
three-dimensional databases. About 29% of the compounds in these databases
were chiral, ranging from about 62% of the compounds in the CGE database to
only about 14% of those in the MCC database. Most chiral molecules in
these 3D databases have only one chiral center, but compounds with more than
10 chiral centers are not rare, and the maximum number of chiral centers in
a single molecule can exceed 60. It is well known that, in general, a
molecule with n chiral centers has 2^n possible stereoisomers.
Therefore, the number of entries in a 3D database that considers chirality
will be doubled for molecules with one chiral center, provided there are no
symmetry elements in the molecule. The creation of th
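As an illustration of the 2^n rule stated above, the following minimal Python sketch computes the upper bound on stereoisomers for a given number of chiral centers. It assumes independent centers and no symmetry elements (so no meso forms), and it ignores E/Z double-bond isomers; the function name is illustrative and does not come from the authors' software.

```python
def max_stereoisomers(n_chiral_centers: int) -> int:
    """Upper bound on stereoisomers for n independent chiral centers (2**n)."""
    if n_chiral_centers < 0:
        raise ValueError("number of chiral centers must be non-negative")
    return 2 ** n_chiral_centers


# One center doubles the entry count; 10 centers already give 1024 candidates.
for n in (1, 10, 60):
    print(f"{n} chiral centers -> up to {max_stereoisomers(n)} stereoisomers")
```

For the molecules with more than 60 chiral centers mentioned above, the bound exceeds 10^18, which suggests why exhaustive enumeration is impractical for such entries.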
![]()
CINF 49:
Molecular
modelling for organic chemists: A chemical informatics problem
Jonathan M Goodman, Unilever Centre for Molecular Science
Informatics, Cambridge University, Department of Chemistry, Lensfield Road,
Cambridge CB2 1EW, United Kingdom, Fax: +44 1223 336362, J.M.Goodman@ch.cam.ac.uk,
and María A. Silva, Unilever Centre for Molecular Science Informatics,
University of Cambridge
Abstract
Both molecular modelling and organic chemistry generate and use large
amounts of information, which should be mutually beneficial. However, it can
be difficult to persuade experimental organic chemists to use molecular
modelling, as force field methods cannot be applied to many transition
states and molecular orbital methods are too slow to calculate the behaviour
of many reactions before the experimental result makes the calculation of
less immediate interest. We use a combination of molecular mechanics and
molecular orbital methods in a ‘Chemical Information Laboratory’
(http://www.ch.cam.ac.uk/SGTL/gle/) in order to gain information of
experimental relevance quickly enough to be useful. For example, chemical
information has been generated about the molecules illustrated using this
process, thereby improving our knowledge of structure and reactivity.
![]()
CINF 50:
Chemical education
markup language: An XML namespace for educational chemistry software
Daniel C. Tofan, Department of Chemistry, State University of New
York, Stony Brook, NY 11794-3400, Fax: 631-632-7960, dtofan@mail.chem.sunysb.edu
Abstract
The Chemical Education Markup Language (ChEdML) is being developed as an XML
namespace to allow learning management systems to include chemical content.
ChEdML was initially intended to provide extensions to the current IMS
specifications for question and test item interoperability (QTI) XML
binding. Such extensions allow authors to create items containing responses
that use chemical symbolism. Examples include chemical reactions, electron
configurations, Lewis structures, and measurements with units. Tags were also
developed to format chemical information for display on web pages. A
complete XML tag set is now under development to encompass a full curriculum
of introductory chemistry. ChEdML also provides a mechanism to parameterize
items and to include equations to calculate numeric responses. This allows
the generation of item templates that can be instantiated at runtime with
appropriate parameters. A Java API is being developed to support the
generation and use of ChEdML.
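The idea of a parameterized item template that is instantiated at runtime can be sketched as follows. This is a Python analogy only: it does not use ChEdML tags or the Java API mentioned above, and the class name, field names, and the molarity question are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class ItemTemplate:
    prompt: str                             # question text with {name} placeholders
    ranges: Dict[str, Tuple[float, float]]  # allowed (low, high) range per parameter
    answer: Callable[..., float]            # equation that computes the numeric response

    def instantiate(self, rng: random.Random) -> Tuple[str, float]:
        """Draw parameter values, fill the prompt, and compute the expected answer."""
        params = {
            name: round(rng.uniform(low, high), 2)
            for name, (low, high) in self.ranges.items()
        }
        return self.prompt.format(**params), self.answer(**params)


# Hypothetical introductory-chemistry item: molarity = moles / liters.
molarity = ItemTemplate(
    prompt="What is the molarity of {moles} mol of solute in {liters} L of solution?",
    ranges={"moles": (0.1, 2.0), "liters": (0.5, 5.0)},
    answer=lambda moles, liters: moles / liters,
)

question, expected = molarity.instantiate(random.Random(0))
print(question)
print(f"expected answer: {expected:.3f} mol/L")
```

Seeding the random generator makes each generated item reproducible, which would matter when a student's response has to be graded against the same parameters later.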
![]()
CINF 51:
Oligopeptide
transporter (PepT1) homology model based on lactose permease (LacY)
Michael B. Bolger, Pharmaceutical Sciences, USC School of Pharmacy,
1985 Zonal Ave. PSC 700, Los Angeles, CA 90089, Fax: 323-442-1390, bolger@usc.edu
Abstract
Purpose. To build a homology model of the oligopeptide / proton
co-transporter PepT1 based on the crystal structure of the bacterial lactose /
proton co-transporter LacY. Methods. The centers of transmembrane spanning
domains (TMDs) in LacY plus the 22 amino acids that comprise each of the
twelve TMDs were selected. The software package “Proteotoolbox™” was
used to guide the threading of the sequence of PepT1 onto the 3D-structure
of LacY to allow for maximal overlap of the 2D and 3D hydrophobic moments.
Finally, the experimental results for site-directed mutagenesis were
examined in light of this new homology model to identify a structural basis
for those results. Results. Site-directed mutagenesis and
cysteine-scanning results for TMDs 5 and 7 were explained on the basis of the
PepT1 model. The new model helps to explain the involvement of key histidine
residues in the proton translocation process. Conclusions. The new 3D
model extends and enhances our previous results (J. Pharm. Sci. 87(11):1286,
1998) and provides additional insight into the structure and function of the
oligopeptide transporter.
![]()
CINF 52:
Multi-conformational
3D databases: Quality assessment and pharmacophore search capabilities in
MOE
Morten Langgaard, Berith Bjornholm, Anne Marie Munk Jorgensen, and
Klaus Gundertofte, Department of Computational Chemistry, H. Lundbeck A/S,
Ottiliavej 9, Dk 2500 Valby, Denmark, Fax: +45 3643 8237, mol@lundbeck.com
Abstract
In this study we report our experiences with the software solution MOE with
respect to building multi-conformational databases and performing
pharmacophore searches. Template pharmacophores derived from crystal
structures of known protein-ligand complexes as well as classically derived
pharmacophore models are used for the evaluation. Conformational coverage
and the quality of each conformation of the developed multi-conformational
3D databases are evaluated thoroughly. The analysis of the search results
focusses on hit rate, quality of hits, and the impact of pharmacophoric
element selections for the query. Practical issues like speed, storage and
management of databases are also addressed. The performance of MOE with
respect to the above-mentioned issues will be discussed and compared to the
more established method Catalyst.
![]()
CINF 53:
A combinatorial
DFT study of how cisplatin binds to purine bases
Leah Sandvoss, and Mu-Hyun Baik, Department of Chemistry, Indiana
University, 1200 Rolling Ridge Way #1311, Bloomington, IN 47403, lsandvos@indiana.edu
Abstract
Cisplatin (cis-diamminedichloroplatinum(II)) continues to attract much
attention because of its therapeutic importance as an anticancer drug. It
binds primarily to the N7 positions of adjacent guanine (G) sites in genomic
DNA, causing intrastrand cross-links, which suppress replication and lead
ultimately to cell death. Previous work showed both kinetic and
thermodynamic preference of G over adenine for the platination reaction. The
goal of this study is to obtain a chemically intuitive explanation for this
selective behavior of cisplatin by systematically comparing the electronic
structures of a diverse set of functionalized purine bases. A computational
combinatorial library of over 1500 purine derivatives was designed based on
density functional theory calculations, and the changes in the most important
molecular orbitals as a function of structural variation were examined in
detail. This electronic profile for purine bases reveals how electronic hot
spots control the reactivity at the N7 position (see figure).
![]()
CINF 54:
Study of
selectivity from a pharmacophore perspective
Klaus Gundertofte, Berith Bjørnholm, and Morten Langgård,
Department of Computational Chemistry, H. Lundbeck A/S, Ottiliavej 9, Dk
2500 Valby, Denmark, kgu@lundbeck.com
Abstract
A number of pharmacophore models covering G protein-coupled receptors and
transporters primarily from the monoaminergic families of targets have been
developed. The general methodology will be described as well as performance
of different methods, e.g. MOE and Catalyst, applied in the development. In
order to elucidate selectivity issues across the targets studied, a
comparison of the models characterised by their pharmacophoric elements was
performed. The analysis of the pharmacophore patterns revealed remarkable
resemblances or superpharmacophores. Distinct differences between the models
were also found. The impact of these findings in medicinal chemistry
projects will be discussed.
![]()
CINF 55:
Successful
shape-based virtual screening: The discovery of a potent inhibitor of the
type I TGFβ receptor kinase (TβRI)
Juswinder Singh, and Claudio Chuaqui, Structural Informatics, Biogen,
12 Cambridge St., Cambridge, MA 02142, Fax: 6176792616, Juswinder_Singh@Biogen.com
Abstract
We describe the discovery, using shape-based virtual screening, of a potent,
ATP site-directed inhibitor of the TβRI kinase, an important and novel drug
target for fibrosis and cancer. The first detailed report of a TβRI kinase
small molecule co-complex confirms the predicted binding interactions of our
small molecule inhibitor, which stabilizes the inactive kinase conformation.
Our results validate shape-based screening as a powerful tool to discover
useful leads against a new drug target.
![]()
CINF 56:
HypoRefine:
Automated identification of exclusion volumes in pharmacophore models
Allister J. Maynard, Marvin Waldman, and Jon Sutter, Accelrys, 9685
Scranton Rd., San Diego, CA 92121, Fax: 858 799 5100
Abstract
This presentation provides an overview of the HypoGen pharmacophore
generation algorithm. HypoGen is a ligand-based QSAR tool using
pharmacophoric overlap to predict activity.
A limitation of HypoGen is that activity prediction is based purely on the presence and arrangement of pharmacophoric features – steric effects are unaccounted for. A novel modification to HypoGen is described (HypoRefine). HypoRefine accounts for steric effects on activity, based on the targeted addition of excluded volume features to the pharmacophores. These excluded volumes attempt to penalize molecules occupying steric regions not occupied by active molecules.
Details of the steric detection and excluded volume addition algorithm are presented, along with some examples illustrating how excluded volumes improve the QSAR pharmacophore models.
![]()
CINF 57:
Automatic
generation of multiple pharmacophore hypotheses
Simon Cottrell1, Valerie J. Gillet1, and Robin
Taylor2. (1) University of Sheffield, Western Bank, Sheffield S10
2TN, United Kingdom, s.cottrell@sheffield.ac.uk, v.gillet@sheffield.ac.uk,
(2) Cambridge Crystallographic Data Centre
Abstract
Pharmacophore methods provide a way of establishing a structure-activity
relationship for a series of known active ligands. Often, there are several
plausible hypotheses that could explain the same set of ligands and in such
cases, it is important that the chemist is presented with alternatives that
can be tested with different synthetic compounds. Existing pharmacophore
methods involve either generating an ensemble of conformers and considering
each conformer of each ligand in turn or exploring conformational space
on-the-fly. The ensemble methods tend to produce a large number of
hypotheses and require considerable effort to analyse the results, whereas
methods that vary conformation on-the-fly typically generate a single
solution that represents one possible hypothesis even though several might
exist. We will describe a new method for generating multiple pharmacophore
hypotheses with full conformational flexibility being explored on-the-fly.
The method is based on multiobjective evolutionary algorithm techniques and
generates a manageable number of different yet plausible hypotheses.
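The abstract does not state the objectives or the details of the evolutionary algorithm. As a hedged illustration of the underlying multiobjective idea only, the sketch below keeps the non-dominated (Pareto-optimal) hypotheses from a scored set. The two objectives and their values are hypothetical, both are assumed to be maximized, and this is not the authors' implementation.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of hypotheses that no other hypothesis dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (feature-overlap score, conformational-energy score) pairs,
# both scaled so that larger is better.
scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.9, 0.1)]
print(pareto_front(scores))  # indices 0, 1 and 2 are non-dominated
```

Each surviving index represents a different trade-off between the objectives, which is one way several distinct yet plausible hypotheses can be retained instead of a single solution.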
![]()
CINF 58:
PepT1 substrate
transport pharmacophore determinants: Refinement with data from a single
consistent functional assay
Terry R Stouch1, Teresa Faria2, and Julita
Timoszyk2. (1) Computer-Assisted Drug Design, Bristol-Myers
Squibb Pharmaceutical Research Institute, MS H23-07, PO Box 4000, Princeton,
NJ 08543-4000, Fax: 609-252-6030, terry.stouch@bms.com, (2) Exploratory
Biopharmaceutics and Stability, Bristol-Myers Squibb, Pharmaceutical
Research Institute
Abstract
PepT1 is a primary intestinal transporter of di- and tripeptides. It also
transports large quantities of important pharmaceuticals, such as beta-lactams
and ACE inhibitors. The ability to function as a substrate for this transporter
can appreciably increase the absorption of drugs whose passive permeation
rates might be low or nil. Data were collected on a series of ligands using
a recently developed single fluorescent functional assay. The ligands were
specifically chosen to elucidate the important determinants of transport. A
wide range of transport rates was observed, even for
dipeptides. Coupled with conformational analysis and molecular overlays, a
fairly simple pharmacophore of five elements was developed that can be used
to retrieve known substrates.
![]()
CINF 59:
Structure and
information theory derived pharmacophores as pre- and post-filters for
docking
Kenneth E. Lind, Erik Evensen, Hans Purkey, Robert McDowell, and Erin
K. Bradley, Computational Sciences, Sunesis Pharmaceuticals Inc, 341 Oyster
Point Blvd., South San Francisco, CA 94080, klind@sunesis.com
Abstract
Screening virtual compound collections has been a valuable method for
finding starting points in the drug discovery process. This is often done
through structure-based docking or ligand-based pharmacophore searching.
These methods are more effective than random searching, but both have
inherent limitations. It would be useful to have methods that make optimal
use of both techniques to improve the selection of active molecules. In this
study we compare standard docking and pharmacophore search techniques to
methods that use different permutations to combine both methods, such as
docking as a pre-filter for a pharmacophore search, or vice versa. The
methods are evaluated against CDK-2 for their ability to select known
inhibitors and their overall enrichment rates.
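The abstract does not define how enrichment is calculated. A commonly used definition is the enrichment factor: the fraction of actives in the top-ranked selection divided by the fraction of actives in the whole collection. The sketch below assumes that definition; the function name and the example ranking are hypothetical.

```python
from typing import Sequence


def enrichment_factor(ranked_is_active: Sequence[bool], fraction: float) -> float:
    """Enrichment factor for the top `fraction` of a ranked compound list.

    ranked_is_active[i] is True if the compound at rank i is a known active.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_selected = max(1, round(fraction * n_total))
    hits = sum(ranked_is_active[:n_selected])
    # (hits / n_selected) / (n_actives / n_total), rearranged to avoid rounding error
    return (hits * n_total) / (n_selected * n_actives)


# Hypothetical ranking: 100 compounds, 10 actives, 5 of them in the top 10.
ranking = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 85
print(enrichment_factor(ranking, 0.10))  # 5.0: five times better than random
```

The same function can be applied to the output of docking alone, pharmacophore searching alone, or either combination, so that the protocols are compared on an equal footing.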
![]()
CINF 60:
A new method for
pharmacophore identification
S. Stanley Young, Jun Feng, and Ashish Sanil, National Institute of
Statistical Sciences, 19 T.W. Alexander Dr, Research Triangle Park, NC
27709, young@niss.org, feng@niss.org
Abstract
The binding of a small molecule to a protein is inherently a 3D matching problem. As crystal structures are not available for most drug targets, there is a need to be able to infer key binding features and their disposition in space, the pharmacophore, from bioassay data. We use fingerprints of 3D features and a new approach to uncover the common pharmacophore for a set of compounds. We describe the algorithm and basic benchmarking. Knowing the 3D pharmacophore for a target should allow better database searching and more efficient compound design.
![]()
CINF 61:
A 3DPL case study:
Finding new active molecules for the inhibition of calcineurin
Tad Hurst, Scientific Software, ChemNavigator, 6126 Nancy Ridge
Drive, Suite 117, San Diego, CA 92121, Fax: 858-625-2377, thurst@chemnavigator.com
Abstract
The 3DPL Database Docking system has been demonstrated to be effective at
extracting known active molecules from sets of inactive compounds in many
test cases. The 3DPL technology can dock structures into a receptor
structure at a rate of up to 30 per second, thus allowing in silico investigation
of millions of database structures. In this paper, we detail the application
of 3DPL to select from over 11 million chemical structures in the
ChemNavigator iResearch Library to find 25 screening candidates. Samples of
these 25 compounds were acquired and tested for calcineurin inhibition. Four
of the compounds were found to be micromolar inhibitors. Three of these
compounds share a common core structure, and represent a new area for
possible lead development.
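The figures quoted above allow a rough back-of-the-envelope check. The sketch below assumes exactly 11 million structures docked at the peak rate of 30 per second on a single process; the abstract says "over 11 million" and "up to 30", and does not state whether the run was parallelized, so the result is only an order-of-magnitude estimate.

```python
SECONDS_PER_DAY = 24 * 60 * 60

n_structures = 11_000_000   # "over 11 million" in the abstract; lower bound used here
rate_per_second = 30        # peak rate quoted in the abstract
n_tested = 25               # screening candidates acquired and tested
n_active = 4                # micromolar inhibitors found

days = n_structures / rate_per_second / SECONDS_PER_DAY
hit_rate = n_active / n_tested

print(f"docking time at peak rate: {days:.1f} days")   # 4.2 days
print(f"experimental hit rate: {hit_rate:.0%}")         # 16%
```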
![]()
CINF 62:
Facilitating
virtual screening workflows: The PyFlexX/E/S/-Pharm and PyFTrees modules
Sally Ann Hindle1, Frank Sonnenburg1, Marcus
Gastreich2, and Christian Lemmen1. (1)
Chemoinformatics, BioSolveIT GmbH, An der Ziegelei 75, 53757 St. Augustin,
Germany, Sally.Hindle@biosolveit.de, (2) BioSolveIt GmbH
Abstract
Virtual screening usually requires several programs. This entails file
format conversions, conceptually superfluous I/O, manual selection of data,
consideration of interim results and so on.
Python, a widespread, cross-platform, open-source and easy-to-read scripting language, allows native C applications to be wrapped in a Python layer, thus generating a modular world of applications that may easily be "plugged" together within a single Python script.
We have recently taken this step with our cheminformatics tools: FlexX/-E/C/-Pharm (docking), FlexS (small molecule alignment), and Feature Trees (similarity comparisons) may now be used within this scripting environment, sharing information instead of transferring it. An instant benefit is the availability of open-source Python packages for analysis and visualisation.
This concept greatly facilitates virtual screening experiments; moreover, it allows for rapid prototyping of virtual screening protocols and parameter studies, which will be demonstrated in an application example.
![]()
CINF 63:
Fast Lead
Identification Protocol (FLIP) for structure based data mining using 3D
fingerprints
Amit S. Kulkarni, Scientific Services, Accelrys Inc, 9685 Scranton
Road, San Diego, CA 92121
Abstract
Structure based drug design is the method used to identify and optimize
pharmaceutical leads when the crystal, NMR structure or homology model of a
specific target protein is known. Virtual screening of corporate libraries,
external compound collections and virtual compounds using various docking
methods is routine in the drug discovery process. We are proposing a new
virtual high throughput screening approach that we term “FLIP” (Fast
Lead Identification Protocol) that uses the potential protein-ligand
interaction sites in the active site of the target protein to data-mine
compound collections. This proposed approach has the advantage of being
extremely fast and can potentially be used for any target protein structure.
![]()
CINF 64:
Conformation
mining: Shrinking chemical space to find biologically-active molecules
Santosh Putta, Gregory A. Landrum, and Julie E. Penzotti, Rational
Discovery LLC, 555 Bryant St. #467, Palo Alto, CA 94301, sputta@rationaldiscovery.com
Abstract
Discovering the essential three-dimensional steric and chemical features
shared by active compounds is an important step in designing drug
candidates. However, the flexibility of actives often allows them to adopt
several low-energy conformations, some of which are not important for
biological activity. Conformational flexibility complicates the task of
finding important features by forcing a search through a conformational
space with dimensions that increase exponentially with the number of
actives. Model building approaches typically address this problem either by
using a small subset of conformations (e.g. most extended or lowest energy)
or by encoding all of a compound’s conformations in a single fingerprint.
The first approach may miss biologically-important conformations while the
second risks masking critical information available only from individual
conformations.
Here we explore techniques for efficiently mining the conformational space of multiple compounds. Our goal is to find a subset of biologically-important conformations and understand and exploit their commonalities.
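The exponential growth described above can be made concrete with a small sketch. It assumes that a candidate model picks exactly one conformation per active, so the number of combinations is the product of the per-compound conformer counts; the function name and the counts are hypothetical.

```python
import math
from typing import Sequence


def n_conformer_combinations(conformers_per_active: Sequence[int]) -> int:
    """Number of ways to choose one conformation for each active compound."""
    return math.prod(conformers_per_active)


# With 100 low-energy conformers per active, each added active multiplies
# the search space by 100.
for n_actives in (2, 5, 10):
    total = n_conformer_combinations([100] * n_actives)
    print(f"{n_actives} actives -> {total:.0e} combinations")
```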
![]()
CINF 65:
Hit-directed
nearest neighbor searching
Veerabahu Shanmugasundaram, Computer-Assisted Drug Discovery, Pfizer
Global Research & Development, 2800 Plymouth Road, Ann Arbor, MI 48105,
Fax: 734-622-2782, Veerabahu.Shanmugasundaram@pfizer.com, and Gerald M
Maggiora, Department of Pharmacology and Toxicology, University of Arizona
Abstract
Follow-up of initial hits resulting from HTS is crucial if the hits are
ultimately to give rise to useful lead compounds. Several approaches may be
employed to select compounds from the Research Compound Collection or from
commercially available collections for follow-up screening. Similarity
searching, based upon the similarity of the molecular fragments possessed by
the molecules, yields compounds that are similar in structure to the hits.
Nearest-neighbor searching of BCUT Chemistry Space identifies compounds that
have similar BCUT values and hence similar electrostatic, hydrophobic and
hydrogen bonding properties. In contrast to molecular-fingerprint-based
similarity searching, which looks for similar scaffolds in molecules,
nearest-neighbor searching identifies isobiological molecular structures with
significantly different molecular scaffolds. Several examples illustrating
the application and the success of this methodology will be presented.
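The two kinds of search contrasted above can be sketched side by side. The Tanimoto coefficient on fragment sets is a standard fingerprint similarity measure; the nearest-neighbor function uses plain Euclidean distance in a low-dimensional descriptor space as a stand-in for BCUT Chemistry Space. The compound names, fragment identifiers, and descriptor values are hypothetical, and the distance metric is an assumption rather than something stated in the abstract.

```python
import math
from typing import Dict, List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Fragment-based similarity: shared fragments / all fragments in either molecule."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbors(
    query: Sequence[float], library: Dict[str, Sequence[float]], k: int
) -> List[str]:
    """Names of the k library compounds closest to the query in descriptor space."""
    return sorted(library, key=lambda name: math.dist(query, library[name]))[:k]


# Hypothetical fragment sets: the two molecules share 2 of 4 distinct fragments.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5

# Hypothetical 3D descriptor coordinates for a hit and a small library.
hit = (1.0, 2.0, 0.45)
library = {
    "cpd_A": (1.0, 2.0, 0.5),
    "cpd_B": (1.1, 2.1, 0.4),
    "cpd_C": (5.0, 0.0, 3.0),
}
print(nearest_neighbors(hit, library, k=2))  # ['cpd_A', 'cpd_B']
```

Because the second search never looks at fragments, two compounds can be close neighbors in descriptor space while having a low Tanimoto score, which is the scaffold-hopping behavior the abstract describes.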
![]()
CINF 66:
AGENT: A program
generating tautomers for computer-aided drug design
Patrick Ballmer, Pavel Pospisil, Gerd Folkers, and Leonardo Scapozza,
Department of Chemistry and Applied Biosciences, Swiss Federal Institute of
Technology (ETH), Winterthurerstr. 190, 8057 Zurich, Switzerland, Fax:
01141-1-6356884, patrick.ballmer@ethz.ch
Abstract
Several cases documenting the impact of ligand tautomerism on protein-ligand
binding are described in the literature. AGENT has been developed to provide
a tool to study this phenomenon. AGENT can be used to create chemically
(energetically) reasonable tautomers of molecules stored in a 3D input file.
The created tautomeric forms can be directly used for molecular docking
studies. The purpose of AGENT is thus to enrich a given small-molecule
database with tautomeric forms that could plausibly exist
in a protein active site. The number of tautomers created by AGENT
is restricted either by chemical rules or by a user-defined energy threshold
limiting the tolerated, semiempirically calculated Gibbs free energy of
tautomer formation.
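The energy-threshold restriction can be illustrated with a minimal filter. This is a sketch of the general idea, not AGENT's code: the function name, the tautomer labels, the energies, and the unit (kcal/mol) are all hypothetical, and the free energies are assumed to have been computed beforehand by some semiempirical method.

```python
from typing import Dict


def filter_tautomers(formation_energies: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep tautomers whose free energy of formation does not exceed the threshold."""
    return {
        name: energy
        for name, energy in formation_energies.items()
        if energy <= threshold
    }


# Hypothetical free energies of tautomer formation relative to the input form.
energies = {"keto": 0.0, "enol": 4.2, "zwitterion": 12.5}
print(filter_tautomers(energies, threshold=5.0))  # keeps 'keto' and 'enol'
```

A lower threshold yields a smaller, more conservative enriched database, while a higher one admits less stable forms that a binding site might still stabilize.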