This is to certify that the
dissertation entitled

STRUCTURE-BASED LIGAND SCREENING AND DESIGN
FOR AMINOACYL-tRNA SYNTHETASE INHIBITORS

presented by

SAI CHETAN K. SUKURU

has been accepted towards fulﬁllment
of the requirements for the

PhD. degree in Biochemistry and Molecular
Biology

 


 

Major Professor’s Signature
Date

MSU is an affirmative-action, equal-opportunity employer

 

STRUCTURE-BASED LIGAND SCREENING AND DESIGN FOR
AMINOACYL-tRNA SYNTHETASE INHIBITORS

By

Sai Chetan K. Sukuru

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Biochemistry and Molecular Biology

2007

ABSTRACT

STRUCTURE-BASED LIGAND SCREENING AND DESIGN FOR
AMINOACYL-tRNA SYNTHETASE INHIBITORS

By

Sai Chetan K. Sukuru

Asparaginyl-tRNA synthetase (AsnRS) is a rational target for drug development against
lymphatic filariasis caused by the human parasite, Brugia malayi. This thesis describes
the application of structure-based computational techniques to identify novel Brugia
AsnRS inhibitors. A new method, developed to incorporate specificity determinants in
virtual ligand screening and design, is also presented. Large databases of organic
molecules were screened using SLIDE to identify potential inhibitors of the ATP binding
site in a 1.9 Å resolution Brugia AsnRS crystal structure. SLIDE is a structure-based
virtual screening tool that models the flexibility of protein and ligand side chains while
docking. Seven new classes of compounds, identified by SLIDE, were confirmed as
Brugia AsnRS inhibitors in experimental assays. Analogs of variolin B, one of the
inhibitors, showed 3- to 8-fold selectivity for Brugia over human AsnRS. These analogs,
unlike variolin B, cannot bind to the closed conformation of the Brugia AsnRS crystal
structure due to steric clashes. Modeling of different main-chain loop flexibility in Brugia
and human AsnRS, due to a single amino acid sequence difference at the base of the loop,
can explain the selectivity of these analogs.

Variolin B and triazinylamine, the most potent inhibitors in experimental
assays, are predicted to bind in the adenosyl pocket of Brugia AsnRS. To optimize the
binding affinity and specificity of these compounds for the enzyme, analogs
with different substituents were designed. The energetic favorability of the designed
analogs was assessed using predicted protein-ligand complementarity scores and the
difference in the ligand internal (conformational) energies between their bound and
lowest-energy free conformations. Analogs with maximal shape and chemical
complementarity with the Brugia AsnRS binding site and minimal strain were identified for
chemical synthesis by our collaborators.

Determining the significant differences between protein binding sites is key to
designing drugs that are selective for one protein relative to another. Here, an approach is
presented, based on SLIDE’s calculation of points at which ligand atoms can make
chemically favorable interactions with the protein, to automatically compare binding sites
and identify their significant similarities and differences with respect to favored ligand
interactions. Application of this method to Brugia AsnRS and a set of ATP-binding
proteins reveals novel chemical and steric differences in their binding sites which can
guide future structure-based drug design efforts against the parasite.

ACKNOWLEDGEMENTS

I would like to first thank my advisor, Dr. Leslie Kuhn, for her patient guidance
throughout my graduate school career. Her infectious energy and enthusiasm in teaching
science inspired me to work harder. I consider myself fortunate to be mentored by her
and aspire to become a competent, compassionate and responsible scientist like her.
Working in her lab offered me great opportunities to not only train myself in different
aspects of drug design but also to meet with excellent scientific collaborators. I couldn’t
have asked for a better collaborator cum teacher than Dr. Michael Kron. His passion for
infectious diseases research and generous appreciation of diverse approaches to the
problem encouraged me to contribute actively to the project.

I would also like to sincerely thank the other members of my guidance committee,
Dr. Robert Cukier, Dr. Michael Feig, Dr. Robert Hausinger and Dr. Rawle
Hollingsworth. Their timely advice and critical assessment have helped me keep on track
and improved the quality of the research. I would also like to extend an enthusiastic
thank you to the other collaborators on the AsnRS project, Dr. Jonathan Morris, Dr. Morten
Grotli, Dr. Stephen Cusack, Dr. Thibaut Crepin, Dr. Frank Danel and Dr. Malcolm Page.

The research work that I have completed over the years would not have been
possible without the help of many current and former members of the Kuhn lab, with whom I
have had the pleasure of working. I would like to wholeheartedly thank Matthew
Tonero, Sandeep Namilikonda, Jeff Van Voorst, Dr. Maria Zavodszky, Anjali Rohatgi,
Sameer Arora, Litian He, Erica Scheller and Andrew Stumpff-Kane. Although I have
never worked with them, I would like to thank Dr. Paul Sanschagrin, Dr. A. J. Rader, Dr.
Brandon Hespenheide and Dr. Ming Lei for their helpful advice. I will also cherish my
friendship with many current and former graduate students from other labs in the
department. Of particular note are Dr. Lishan Yao, Dr. Dean Shooltz, Dr. Harini
Krishnamurthy, Josh Kwekel, James Johnson, Colleen Doherty, Joonyul Kim, Sean Law
and Soledad Quiroz. Helen Geiger, Julie Oesterle, Lesley Reed, Jessica Hudson, Melinda
Kochenderfer and Dr. K Padmanabhan deserve special thanks for their timely help at
various points during the course of my graduate school career.

I would like to extend a special thanks to my friends Tejas Kadakia, Soumya
Korrapati, Rahul Datta, Sumohan Misra, Dr. RG Iyer, Dr. Nan Ding, Dr. Priya Mani and
Dr. Gauri Jawdekar for all the support and good times that made my stay in East Lansing
really memorable.

Finally, I would like to thank my family for their overwhelming affection and
encouragement. I want to thank my mom, dad, uncles, aunts, cousins, nephews and
nieces for their belief in me and prayers for my success.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

Chapter 1: Introduction
1.1 Screening the chemical space
1.2 Structure-based screening and docking
1.3 Modeling flexibility in ligand binding
1.4 Defining binding site specificity determinants
1.5 Motivation for this thesis work
References

Chapter 2: Discovering new classes of Brugia malayi asparaginyl-tRNA synthetase
inhibitors and relating specificity to conformational change
2.1 Abstract
2.2 Introduction
2.3 Materials and methods
2.3.1 Asparaginyl-tRNA synthetase structures
2.3.2 Experimental assay
2.3.3 Screening and docking with SLIDE
2.3.4 Scoring protein-ligand interactions
2.3.5 Modeling main-chain flexibility
2.4 Results
2.4.1 Scoring Brugia AsnRS-ligand interactions
2.4.2 Screening the databases
2.4.3 Modeling the conformational flexibility of Brugia AsnRS
2.4.4 Impact of main-chain conformational flexibility on ligand binding: interpreting
the observed affinities and specificities
2.5 Discussion
2.6 Conclusions
References

Chapter 3: Optimizing variolin B and triazinylamine to improve their binding
affinity and specificity for Brugia AsnRS
3.1 Introduction
3.2 Methods
3.2.1 Designing new analogs and generating their structures
3.2.2 Scoring the interactions between designed analogs and Brugia AsnRS
3.2.3 Assessing ligand internal energies
3.3 Results and discussion
3.3.1 Variolin B analogs
3.3.2 Triazinylamine analogs
3.4 Conclusions
References

Chapter 4: Automated shape and chemistry comparison for defining binding site
invariants and specificity determinants
4.1 Abstract
4.2 Introduction
4.3 Materials and methods
4.3.1 Representing the protein binding sites
4.3.2 Superposition to bring the templates in the same reference
4.3.3 Complete-linkage clustering to identify the shared interaction sites
4.3.4 Post-clustering processing to identify similar and chemical difference sites
4.3.5 Relative significance of the chemical difference sites
4.3.6 Clustering sensitivity to superpositional accuracy
4.3.7 Steric difference sites
4.3.8 Datasets
4.4 Results
4.4.1 Chemical difference sites: Explaining the observed experimental selectivity of
ligands bound to proteins of the AffinDB set
4.4.2 Similar sites identified in the ATP-set proteins
4.4.3 Chemical difference sites identified in protein pairs of the ATP set
4.4.4 Steric difference sites identified between AsnRS and phosphorylase kinase
4.5 Discussion
4.5.1 Relative significance of chemical difference sites
4.5.2 Integrating our method into virtual screening protocol
4.5.3 Conformational flexibility and chemical difference sites
4.5.4 Superpositional accuracy
4.6 Conclusions
References

Chapter 5: Summary and future directions
5.1 Virtual screening for aminoacyl-tRNA synthetase inhibitors
5.1.1 Summary and perspective
5.1.2 Future directions
5.2 Using specificity determinants in virtual screening
5.2.1 Summary and perspective
5.2.2 Future directions
References

LIST OF TABLES

2.1 Data collection and refinement statistics of Brugia AsnRS crystal structures

2.2 Predicted AsnRS-inhibitor complementarity scores and experimentally determined
affinity values of known ligands of Brugia AsnRS

2.3 Predicted AsnRS-inhibitor complementarity scores and experimentally determined
affinity values of SLIDE-predicted inhibitors of Brugia AsnRS

2.4 Predicted AsnRS-inhibitor complementarity scores and experimentally determined
affinity values of synthesized analogs of SLIDE-discovered Brugia AsnRS inhibitors
variolin B and cycloadenosine

2.5 Known flexible regions in Brugia AsnRS

3.1 Designed analogs of variolin B with sulfamoyl-asparagine (S-ASN) attached to
two different positions on the variolin scaffold

3.2 Predicted protein-ligand complementarity scores and difference in ligand internal
energies of docked orientations of designed variolin analogs compared with known
ligands

3.3 Predicted protein-ligand complementarity scores and difference in ligand internal
energies of docked orientations of designed triazinylamine analogs

4.1 Proteins of the AffinDB set used in our analysis with their bound ligands and
affinity data

4.2 Proteins of the ATP set used in our analysis

4.3 Significant chemical difference sites identified between protein pairs of the
AffinDB set that can explain the known relative binding affinities of their bound
ligands

4.4 Significant chemical difference sites identified between Brugia AsnRS and
representative structures of other ATP-binding proteins

LIST OF FIGURES

2.1 Brugia AsnRS dimer and the three class II aminoacyl-tRNA synthetase (AARS)
sequence motifs

2.2 The three residues that differ near the active sites of Brugia and human AsnRS and
diverse ROCK-generated conformations of known flexible regions of Brugia AsnRS

2.3 Enrichment plot for the three different scoring functions: SLIDE score,
DrugScore and X-Score

2.4 The predicted binding modes of seven SLIDE-discovered Brugia AsnRS inhibitors
compared with the crystallographic binding mode of known ligand ASNAMS

2.5 The surface of closed and open Brugia AsnRS conformations compared with its
ligand-free (apo) crystal structure. The adenine binding loop residue His 219 undergoes
significant motion between the closed and open conformations. The steric clashes
between the long side-chain variolin B derivative and the closed conformation of the
adenine binding loop compared with its overlap-free docked orientation in the open
conformation

3.1 The 2D structure of variolin B scaffold and its SLIDE-predicted binding mode

3.2 The 2D structure of triazinylamine, its SLIDE-predicted binding mode, and the
truncated scaffold used to design substituents

3.3 Manually assessed binding modes of designed variolin B analogs I, II and III
compared with the SLIDE-predicted binding mode of variolin B and ASNAMS bound in
the crystal structure

3.4 Two manually assessed binding modes of designed variolin B analog IV compared
with the SLIDE-predicted binding mode of variolin B and ASNAMS bound in the crystal
structure. For the attached asparagine side chain to be docked in the amino acid pocket,
the pendant ring of analog IV has to be buried further in the ribose pocket of AsnRS
binding site

3.5 The SLIDE-predicted binding mode of designed variolin B analog V is compared
with the SLIDE-predicted binding mode of variolin B and ASNAMS bound in the crystal
structure

3.6 The SLIDE-predicted binding modes of nine designed analogs of triazinylamine
are compared with the SLIDE-predicted binding mode of variolin B and ASNAMS
bound in the crystal structure

4.1 The steps in the algorithm to perform automated comparison of binding sites are
briefly explained using a flowchart

4.2 The complete-linkage clustering algorithm is briefly explained using a
hypothetical example

4.3 A plot to show the sensitivity of the clustering algorithm to superpositional
accuracy

4.4 Significant chemical difference sites identified between protein pairs of the
AffinDB set that can explain the known relative binding affinities of their bound
ligands

4.5 Chemically similar sites identified for each of the four subsets of ATP-binding
proteins

4.6 Significant chemical difference sites identified between Brugia AsnRS and
representative structures of each of the other three classes of ATP-binding proteins

4.7 Steric difference sites, showing accessible space in Brugia AsnRS relative to a
representative protein kinase structure, and vice-versa

Images in this thesis/dissertation are presented in color.

LIST OF ABBREVIATIONS

2D        2-dimensional
3D        3-dimensional
AARS      aminoacyl-tRNA synthetases
ASNAMS    asparaginyl sulfamoyl adenylate
AsnRS     asparaginyl-tRNA synthetase
ATP       adenosine triphosphate
CSD       Cambridge Structural Database
DS        DrugScore
HTS       high-throughput screening
LBHAMP    L-aspartate-β-hydroxamate adenylate
MD        molecular dynamics
MMFF      Merck Molecular Force Field
NCI       National Cancer Institute
PDB       Protein Data Bank
QSAR      quantitative structure-activity relationship
RMSD      root mean square deviation
ROCK      Rigidity Optimized Conformational Kinetics
SLIDE     Screening for Ligands by Induced-fit Docking Efficiently

Chapter 1

Introduction

1.1 Screening the chemical space

Screening of chemical compounds to discover inhibitors or agonists of proteins as drug
targets is a dominant tool in the modern drug discovery process. In 1909, Paul Ehrlich
and colleagues were the first to screen a few hundred compounds to discover a drug
that binds to a certain ‘chemoreceptor’ of therapeutic interest (1-3). Advances in assay
and instrument technologies, coupled with the use of combinatorial chemistry to produce
diverse libraries of compounds, have significantly increased the compound coverage and
throughput in screening methodologies (4). Experimental high-throughput screening
(HTS) can now screen between thousands and millions of compounds in an attempt to
discover novel drug leads. However, despite its many successes, the method has its
drawbacks. It is a complex and expensive method with declining productivity in terms of
the money spent and the number of new drugs developed (5). It also provides no insights
into the mode of interaction between the compounds and the protein target. The advent of
computers and the technology to store chemical information enabled creation of chemical
databases and associated information retrieval systems. Virtual screening, analogous to
HTS but less expensive, refers to screening of a large number of compounds stored in
virtual libraries by computer instead of by experiment. In its earliest forms during the
1980s, virtual screening was widely used to find molecules in databases that were
‘similar’ to the query molecule. The central premise of these similarity search methods
was that structurally similar molecules are more likely to exhibit similar biological
activity (6, 7). These similarity methods could be differentiated from each other based on
the way they represent the molecules and the way they calculate the similarity between
the molecules (8-10). The most common representations are binary fingerprints, which
encode molecular structures in a string of 0s and 1s (bits) that describe the presence or
absence of a certain feature, e.g. a functional group. The Tanimoto coefficient (8) is a
common similarity metric used to efficiently compare these binary strings. Molecular
shape and electrostatics, in conjunction with structural fingerprints, have also been found
to be important variables in similarity searching (11). However, all similarity methods
come with an inherent risk of missing the structurally diverse potential candidates in the
database. Virtual screening works best in an information-rich environment, i.e., its ability
to evaluate a large number of compounds automatically with a high degree of accuracy
will be greatest in those cases where the most information is available (12). For example,
in cases where high resolution structural information about the target protein is available,
success is more likely when this information is also taken into account. The advantages
of and challenges to protein structure-based virtual screening are discussed in the next
section.
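As an illustration of the fingerprint comparison described above, the following is a minimal sketch of the Tanimoto coefficient, assuming each fingerprint is stored as the set of positions of its bits that are set to 1. The function name and the example bit positions are hypothetical and are not taken from any particular screening package.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two binary fingerprints.

    Each fingerprint is given as a set of indices of the bits set to 1.
    The result is (bits shared) / (bits set in either fingerprint).
    """
    if not fp_a and not fp_b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical fingerprints: each index stands for one structural feature.
query = {0, 3, 5, 8}
candidate = {0, 3, 6, 8, 9}
print(tanimoto(query, candidate))  # 3 shared bits out of 6 distinct bits -> 0.5
```

A value of 1 means the two bit strings are identical and 0 means they share no features; a similarity search would typically keep database molecules whose coefficient with the query exceeds a chosen threshold.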

1.2 Structure-based screening and docking

In 1902 Sir Archibald Garrod was the first to attribute a disease (alkaptonuria) to an
enzyme defect, to what he identified as an “inborn error of metabolism” (13). The
biochemical evidence for his theory was provided fifty years later when the enzyme
responsible for the metabolic defect was identified (14). The elucidation of the structural
and mechanistic details of the enzyme involved in the disease had to wait even longer
(15), since the technology to determine protein structures was not developed until the 1960s.
Myoglobin and hemoglobin were the first protein structures solved at atomic resolution
by Kendrew and Perutz (16, 17). Their work showed it was possible to see how
macromolecules bind to their ligands, providing insights into how they carry out their
functions. The crystal structure of hemoglobin, a pharmacologically relevant target,
enabled Goodford and colleagues to carry out the first reported example of ‘structure-
based design’ (18). They used primitive physical models to find compounds similar to
diphosphoglycerate, the allosteric effector of hemoglobin. Their approach had a
significant impact on how chemists and molecular modelers viewed protein active sites
and the possibility for rational design. This was followed by the development of an
innovative molecular docking approach by Kuntz and colleagues (19) to dock ligands to
binding sites of medically relevant proteins. Molecular docking can be defined as the
prediction of the binding orientation of small molecule ligand candidates to protein
binding sites. Kuntz’s classical algorithm for computational docking generates a set of
spheres to describe the volume, or negative image, of the binding site and uses the centers
of these spheres as sites for matching to ligand atoms. Sets of spheres representing the
binding site are matched to sets of ligand atoms to generate a ligand orientation. Since
then, the exponential growth in the number of protein structures deposited to the Protein
Data Bank (PDB; (20)) has been accompanied by a rapid growth and evolution of
structure-based drug design methods in the last two decades (21-23).

The structure-based screening and docking tool SLIDE (24-26), developed in our
laboratory, was used for the work presented in this thesis. SLIDE represents the binding
site of the target protein by a set of points collectively called a template. These template
points identify the optimal positions for potential ligand atoms to form favorable
hydrogen bond and hydrophobic interactions with the neighboring protein atoms. The
ligand candidates in the database are similarly represented by a set of hydrogen bonding
and hydrophobic interaction points, assigned to polar atoms and centers of hydrophobic
atom clusters respectively. A ligand candidate is docked by searching all possible
triplets of its interaction points for a feasible match with a geometrically and
chemically compatible triangle of protein template points.
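The triangle-matching idea can be sketched as a brute-force search, shown below. This is an illustrative simplification and not SLIDE's actual implementation: the point labels, the rule that labels must be identical to be chemically compatible, the 0.5 Å side-length tolerance and all coordinates are assumptions made for the example.

```python
from itertools import combinations, permutations
from math import dist


def side_lengths(p, q, r):
    """Side lengths of the triangle p-q-r, in a fixed vertex order."""
    return (dist(p, q), dist(q, r), dist(r, p))


def compatible(ligand_label, template_label):
    """Hypothetical chemistry rule: the two labels must be identical."""
    return ligand_label == template_label


def match_triangles(ligand_points, template_points, tol=0.5):
    """Pair ligand triplets with template triplets of matching chemistry and shape.

    Points are (label, (x, y, z)) tuples. A match requires compatible labels at
    corresponding vertices and side lengths that agree within tol (in Angstroms).
    """
    matches = []
    for lig_idx in combinations(range(len(ligand_points)), 3):
        lig = [ligand_points[i] for i in lig_idx]
        lig_sides = side_lengths(*(xyz for _, xyz in lig))
        # Permutations, so every vertex correspondence is tried.
        for tmp_idx in permutations(range(len(template_points)), 3):
            tmp = [template_points[i] for i in tmp_idx]
            if not all(compatible(l[0], t[0]) for l, t in zip(lig, tmp)):
                continue
            tmp_sides = side_lengths(*(xyz for _, xyz in tmp))
            if all(abs(a - b) <= tol for a, b in zip(lig_sides, tmp_sides)):
                matches.append((lig_idx, tmp_idx))
    return matches


# Made-up coordinates: a 3-4-5 ligand triangle and a translated copy in the template.
ligand = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobic", (0, 4, 0))]
template = [
    ("donor", (10, 10, 10)),
    ("acceptor", (13, 10, 10)),
    ("hydrophobic", (10, 14, 10)),
    ("acceptor", (20, 20, 20)),
]
print(match_triangles(ligand, template))
```

Each match defines a correspondence of three point pairs, from which a rigid-body transformation placing the ligand in the binding site could then be computed. A production tool would replace the exhaustive loops with indexing of the template triangles, since the number of triplets grows rapidly with the number of points.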

Molecular docking is an integral part of protein structure-based virtual screening.
For docking to be useful in a high-throughput mode, accuracy and speed in assessing
orientations and chemical complementarity of the two molecules are key factors. Since
these can be contradictory requirements, some simplifications (which will be discussed in
later sections) need to be made to make docking tractable in virtual screening efforts.
Nevertheless, structure-based screening and docking methods have led to successful
discoveries, such as the HIV protease inhibitor Viracept (27) and the anti-influenza drug
Relenza (28), and significantly higher hit rates (ligands discovered per molecules tested)
than experimental HTS (29). While experimental validation of docking hits must always
be done, docking provides a clear, testable hypothesis of how the molecules interact and
intelligently filters a chemical database to focus on those molecules that are most
complementary with the target protein structure.

The fundamental challenges in structure-based screening and docking are
sampling and scoring (30-32). Sampling relates to assessing various conformations and
relative orientations of the flexible molecules (ligands as well as the target protein),
while scoring relates to calculating the binding affinities between each docked ligand
candidate and the target protein. Since the computational complexity of the problem
increases exponentially with the number of degrees of freedom (bond rotations,
molecular translations/rotations), modeling the flexibility of ligands is less difficult than
that of the target protein. Conformational sampling of the ligands is necessary during
docking because usually it is not known which low-energy conformation interacts most
favorably with the protein. This can be achieved by pre-computing a database of
conformers for each compound to be screened. Examples of virtual screening tools that
work with ligand conformer libraries are FRED (33) and SLIDE (34). Alternatively,
docking programs can explore the conformational flexibility of the ligands as the docking
proceeds, as done by a variety of docking tools, e.g. the genetic algorithm in GOLD (35),
Monte Carlo methods in QXP (36) and incremental construction in FlexX (37) and
GLIDE (38). Approaches to model protein flexibility during docking will be discussed in
the next section of this chapter.
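The exponential growth in sampling cost can be made concrete with a small worked example. The sketch below assumes, purely for illustration, that every rotatable bond is sampled at three discrete torsion states; the bond counts are hypothetical and rigid-body translations and rotations are ignored.

```python
def conformer_count(rotatable_bonds, states_per_bond=3):
    """Number of discrete conformations when each bond samples a fixed set of states."""
    return states_per_bond ** rotatable_bonds


# A drug-like ligand with 5 rotatable bonds (hypothetical).
print(conformer_count(5))   # 3**5 = 243

# Ten binding-site side chains with two torsions each, i.e. 20 torsions (hypothetical).
print(conformer_count(20))  # 3**20 = 3486784401
```

Under these assumptions the ligand alone can be enumerated exhaustively, whereas even a modest number of flexible protein side chains yields billions of combinations, which is why protein flexibility is usually handled with more selective strategies.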

The ultimate purpose of a screening and docking tool is to provide a ranked list of
compounds to be tested for biological activity. Therefore, once a binding orientation for a

ligand candidate is generated by the docking program, it needs to be scored to rank the

quality of the orientation, not only with respect to other possible orientations of the same

compound but also with respect to other compounds in the database. The protein-ligand
scoring functions can be broadly divided into three categories: a) force ﬁeld-based, b)
empirical, and c) knowledge-based. The force ﬁeld-based methods, although generally
more accurate, are computationally intensive, and hence unsuitable for high-throughput
molecular docking purposes. Structure-based screening and docking tools like QXP (36)
employ force ﬁeld-based scoring functions in a minimalist manner with no explicit
solvent term. Empirical scoring functions, derived from ﬁtting to known experimental
binding afﬁnities of different protein-ligand complexes, are widely employed by docking
algorithms, e.g. PLP (27), FlexX (3 7), and SLIDE (26, 34). These scoring functions use
an additive approximation to estimate the binding afﬁnity and are usually composed of
several terms corresponding to hydrogen bonding, hydrophobic interactions and, in some
cases, interactions with metal ions. On the other hand, knowledge-based scoring
functions use information from available structures of protein-ligand complexes to
estimate the binding afﬁnity based on the potentials of mean force (e.g. PMF (39)) or
preferred interatomic distances (e.g. DrugScore (40)). The different scoring functions
may give the individual docking programs a particular advantage in one aspect with
respect to another, but they are still far from perfect (32). No current scoring function is
able to accurately estimate the binding afﬁnities across different protein classes.
Consensus scoring by combining several scoring functions has been suggested to
overcome their individual deficiencies and enhance the hit rates (41).
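The additive approximation and the idea of consensus scoring can be illustrated with a minimal Python sketch. The term names, the weights, and the rank-sum rule used below are hypothetical placeholders chosen for illustration; they are not the parameters of SLIDE, X-Score, or DrugScore, nor necessarily the consensus scheme of (41).

```python
from typing import Dict, List


def empirical_score(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Additive approximation: a weighted sum of interaction terms.

    The term names and weights are illustrative assumptions only.
    """
    return sum(weights[name] * value for name, value in terms.items())


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank compounds from best (1) to worst, assuming higher scores are better."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


def consensus_rank(score_tables: List[Dict[str, float]]) -> List[str]:
    """Order compounds by the sum of their ranks over several scoring functions."""
    rank_sum: Dict[str, int] = {}
    for table in score_tables:
        for name, pos in rank_positions(table).items():
            rank_sum[name] = rank_sum.get(name, 0) + pos
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(rank_sum, key=lambda name: (rank_sum[name], name))
```

Summing ranks rather than raw scores avoids having to put scoring functions with different units on a common scale, which is one plausible reason to prefer a rank-based consensus in a sketch like this.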

The empirical scoring function in SLIDE is a weighted sum of hydrophobic and
hydrogen-bonding interaction terms trained to match binding affinity values in known
protein-ligand complexes (26). For the work presented here, I have used SLIDE's internal
scoring function, DrugScore (40), and X-Score, another empirical scoring function (42),
focusing on assessing which scoring function or a combination thereof performed best for
the particular protein target.

1.3 Modeling protein ﬂexibility in ligand binding

Virtual screening and docking experiments have led to the identification of many
bioactive compounds that, although very different from the natural ligands of a given
protein, bind to the active site as predicted (43). The recognition of structurally diverse
ligands by the same binding site is facilitated by protein flexibility in ligand
binding. The evidence of conformational rearrangements in the protein leading to the
binding of structurally diverse ligands is substantial (44, 45). From a structure-based drug
design perspective, incorporating protein ﬂexibility into the docking algorithm should
enhance the diversity of lead compounds with desired bioactivity (46, 47). Protein
conformational changes induced upon ligand binding can range from the local rotation of
a few side chains to whole domain rearrangements (48, 49). In general, two broad
schemes have been employed to model protein ﬂexibility in structure-based screening
and docking methods. First, an ensemble of protein conformations is obtained from
multiple sources, e.g. multiple crystal or NMR structures of the same protein, molecular
dynamics simulations or homology models. A ligand candidate is then docked to an
average (50, 51), the most conserved (52), or all of these protein conformations (53).
Second, the protein conformation is allowed to change during the docking process, either
by rotating the optimal side-chain torsional angles (34, 54) or by using a rotamer library
to represent the preferred side-chain orientations (55, 56). Deriving a high-affinity lead
compound that specifically binds to an alternate conformation of the target protein is a
highly attractive strategy in structure-based drug design. However, the existing
approaches are limited either by availability of experimentally solved structures or by the
ability of algorithms to sample large scale motions involving the protein backbone.

The screening and docking tool SLIDE (26, 34), used for the work presented here,
models ﬂexibility of protein side chains during docking. It resolves steric overlaps
between the docked ligand and the protein through minimal directed rotation, determined
by mean-ﬁeld optimization, of the rotatable bonds in protein and ligand side chains (34,
49). To model the main-chain ﬂexibility of our target protein and assess its impact on
ligand binding, we used a graph-theoretic algorithm ProFlex (57, 58) to ﬁrst predict the
ﬂexible and rigid regions in the protein structure, and then search the conformational
space available to those ﬂexible regions using a restricted random-walk sampling
algorithm ROCK (Rigidity Optimized Conformational Kinetics) (59, 60). ProFlex
predicts the protein ﬂexibility based on the analysis of the constraints posed by the
protein’s network of covalent bonds and non-covalent interactions including hydrogen
bonds, salt bridges and hydrophobic interactions. Protein conformers, generated by
sampling dihedral angles in ROCK, are either accepted or rejected, depending upon
whether they maintain the non-covalent bond network and have no van der Waals
overlaps between atoms.
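The accept-or-reject logic of such a restricted random walk can be sketched on a toy model. The Python example below is not ROCK: it assumes a two-dimensional bead chain with fixed bond lengths, treats a short distance between non-bonded beads as a van der Waals overlap, and represents the non-covalent network as a list of distance constraints. All function names, thresholds, and step sizes are hypothetical.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]
Constraint = Tuple[int, int, float]  # (bead i, bead j, required distance)


def build_chain(angles: List[float], bond: float = 1.0) -> List[Point]:
    """Place the beads of a 2D chain from successive turning angles (radians)."""
    pts: List[Point] = [(0.0, 0.0)]
    heading = 0.0
    for a in angles:
        heading += a
        x, y = pts[-1]
        pts.append((x + bond * math.cos(heading), y + bond * math.sin(heading)))
    return pts


def has_overlap(pts: List[Point], min_dist: float = 0.8) -> bool:
    """True if any two beads that are not bonded neighbors come too close."""
    return any(
        math.dist(pts[i], pts[j]) < min_dist
        for i in range(len(pts))
        for j in range(i + 2, len(pts))
    )


def constraints_kept(pts: List[Point], constraints: List[Constraint],
                     tol: float = 0.2) -> bool:
    """True if every constrained bead pair stays near its required distance."""
    return all(abs(math.dist(pts[i], pts[j]) - d) <= tol for i, j, d in constraints)


def sample_conformers(start: List[float], flexible: List[int],
                      constraints: List[Constraint], n_steps: int = 200,
                      step: float = 0.2, seed: int = 0) -> List[List[float]]:
    """Random walk over the flexible angles; keep only conformers that pass both tests.

    The starting conformer is assumed to be valid and is returned first.
    """
    rng = random.Random(seed)
    current = list(start)
    accepted = [list(current)]
    for _ in range(n_steps):
        trial = list(current)
        k = rng.choice(flexible)             # perturb one flexible angle
        trial[k] += rng.uniform(-step, step)
        pts = build_chain(trial)
        if not has_overlap(pts) and constraints_kept(pts, constraints):
            current = trial                  # accept and continue from here
            accepted.append(list(current))
        # rejected trials are discarded; the walk stays at `current`
    return accepted
```

In this sketch only the angles listed in `flexible` are ever perturbed, which loosely mirrors restricting the search to the regions predicted to be flexible while leaving the rigid regions untouched.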

1.4 Deﬁning binding site specificity determinants

Even some of the most potent bioactive compounds discovered by structure-based
screening are limited by their poor ability to discriminate between the
target protein and its homologs or other related proteins. The problem of specificity has
plagued many common drug targets like serine proteases (61), nuclear receptors (62) and
matrix-metalloproteinases (63). In the case of protein kinases, the problem is more acute
as some of the promising ATP-site inhibitors, including those that were approved or were
in clinical development, have been reported to have poor speciﬁcity proﬁles (64). Most
of the methods used to model speciﬁcity of drug candidates apply techniques to compare
protein binding site models. Kastenholz and co-workers analyzed several binding site
models, generated by the program GRID (65), using the consensus principal component
analysis to obtain contour plots identifying regions that are important for speciﬁcity in
the chosen target protein (66). GRID generates molecular interaction ﬁelds to identify the
energetically favorable sites for ligands to bind to a protein. The GRID/PCA technique
was adapted by Braiuca and co-workers to partially account for protein
flexibility, allowing the prediction of selectivity differences caused by amino-acid residue
differences not only in the active site but also in regions that do not interact directly
with the ligand (67). Sheridan and co-workers developed a mathematically simpler
method FLOGTV (68) that uses the trend vector paradigm to compare the binding site
ﬁeld maps, generated by the program FLOG (69), to visualize the differences in closely
related proteins superimposed in a reasonable way. Deng and co-workers used
hierarchical clustering to analyze the interaction fingerprints, which reduce the
three-dimensional (3D) structural binding information of protein-ligand complexes into
corresponding one-dimensional binary strings, to identify similarities and diversities
between their small-molecule binding interaction patterns (70). By modifying the virtual
screening protocol to introduce essential protein-ligand interactions, identiﬁed by prior
ﬁndings, as ﬁlters during the docking stage, Perola reported signiﬁcant reduction in the
false positive rates in kinase virtual screens (71). However, these methods are
handicapped by their reliance on known ligands of the target protein, and their results
cannot be easily integrated in any other structure-based screening and docking protocols.

A new method to identify speciﬁcity determinants in one protein relative to
another has been developed for the work presented here. This method uses complete-
linkage clustering to identify signiﬁcant similarities and differences between SLIDE-
generated templates representing protein binding sites. The results generated by this
method can be incorporated into our structure-based screening protocol to identify
prospective ligands specific to a target protein.
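Complete-linkage clustering itself can be sketched generically in a few lines of Python. The example below is not the algorithm of Chapter 4: it assumes that each template point is given simply as a 3D coordinate, uses Euclidean distance, and stops merging at a hypothetical distance cutoff.

```python
import math
from typing import List, Sequence


def complete_linkage(points: Sequence[Sequence[float]], cutoff: float) -> List[List[int]]:
    """Agglomerative clustering with complete linkage.

    Two clusters are merged only while the largest distance between any pair
    of their members (the complete-linkage distance) is within `cutoff`.
    Returns clusters as sorted lists of point indices.
    """
    clusters: List[List[int]] = [[i] for i in range(len(points))]

    def linkage(a: List[int], b: List[int]) -> float:
        # Complete linkage: the farthest pair defines the cluster distance.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        best = None  # (distance, index of first cluster, index of second)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > cutoff:
            break  # no remaining pair of clusters is compact enough to merge
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)
```

Because the farthest pair of members defines the distance between clusters, every cluster produced this way has a diameter no larger than the cutoff, which keeps the grouped points spatially compact.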

1.5 Motivation for this thesis work

Lymphatic filariasis is caused by the parasitic nematode worms Wuchereria
bancrofti and Brugia malayi. It is a debilitating human disease that afflicts more than
200 million people worldwide, and more than a billion people reside in areas where the
disease is actively transmitted, making it one of the top ten tropical diseases being
targeted by the World Health Organization (72, 73). In its most obvious manifestation,
lymphatic filariasis causes enlargement of the entire leg or arm, the genitals, vulva and
breasts. The crippling physical effects of the disease have a huge social and economic
impact. Existing drugs to combat the disease, discovered decades ago as
chemotherapeutic agents, are deficient because of their inability to kill adult worms,
severe side effects, long treatment durations and the emergence of drug resistant strains in
humans (72, 74). Brugia malayi asparaginyl tRNA synthetase (AsnRS) has been
acknowledged as a rational target for drug development against filariasis (72, 75). This
dissertation presents the discovery of new inhibitors of Brugia AsnRS using
structure-based ligand screening and design techniques.

Chapter 2 describes the discovery of seven new classes of Brugia AsnRS
inhibitors using SLIDE (25, 26, 34), our virtual screening tool capable of modeling
the flexibility of protein and ligand side chains during docking. The discovery of these
inhibitors was a result of our collaboration with a parasitologist and biochemist, Dr.
Michael Kron at Medical College of Wisconsin; crystallographer Dr. Stephen Cusack at
EMBL Grenoble in France; medicinal chemists Dr. Jonathan Morris at University of
Adelaide in Australia and Dr. Morten Grøtli at University of Göteborg in Sweden; and
biochemists Dr. Frank Danel and Dr. Malcolm Page at Basilea Pharmaceutica in
Switzerland. The sampling of the active-site loop motions, using our restricted
random-walk algorithm ROCK (59, 60), not only allows the modeling of protein
conformational flexibility in ligand binding but also provides a potent tool for exploiting
it in structure-based screening and design of species-selective inhibitors. Chapter 3
describes the optimization of two Brugia AsnRS inhibitors to design and identify analogs,
with improved affinity and specificity for the target protein, for chemical synthesis by our
medicinal chemistry collaborators.

Our efforts to optimize Brugia AsnRS inhibitors, improving their affinity and
selectivity for the target protein relative to human AsnRS or other proteins,
motivated the development of an automated binding site comparison tool. Chapter 4
describes the algorithm for the automated shape and chemistry comparison for defining
binding site invariants and specificity determinants. The application of this algorithm to
Brugia AsnRS and other ATP-binding proteins reveals novel binding site differences
which can be used as effective filters in our virtual screening protocol.

References

(1) Drews, J. (2000) Drug discovery: a historical perspective. Science 287, 1960-4.

(2) Schwartz, R. S. (2004) Paul Ehrlich's magic bullets. N Engl J Med 350, 1079-80.

(3) Riethmiller, S. (2005) From Atoxyl to Salvarsan: searching for the magic bullet.
Chemotherapy 51, 234-42.

(4) Hertzberg, R. P., and Pope, A. J. (2000) High-throughput screening: new
technology for the 21st century. Curr Opin Chem Biol 4, 445-51.

(5) Fox, S., Farr-Jones, S., Sopchak, L., Boggs, A., and Comley, J. (2004)
High-throughput screening: searching for higher productivity. J Biomol Screen 9, 354-8.

(6) Sheridan, R. P., and Kearsley, S. K. (2002) Why do we need so many chemical
similarity search methods? Drug Discov Today 7, 903-11.

(7) Martin, Y. C., Kofron, J. L., and Traphagen, L. M. (2002) Do structurally similar
molecules have similar biological activity? J Med Chem 45, 4350-8.

(8) Willett, P., Barnard, J. M., and Downs, G. M. (1998) Chemical similarity
searching. J Chem Inf Comp Sci 38, 983-996.

(9) Bajorath, J. (2001) Selected concepts and investigations in compound
classification, molecular descriptor analysis, and virtual screening. J Chem Inf
Comp Sci 41, 233-245.

(10) Schnecke, V., and Bostrom, J. (2006) Computational chemistry-driven decision
making in lead generation. Drug Discov Today 11, 43-50.

(11) Nicholls, A., MacCuish, N. E., and MacCuish, J. D. (2004) Variable selection and
model validation of 2D and 3D molecular descriptors. J Comput Aided Mol Des
18, 451-74.

(12) Walters, W. P., Stahl, M. T., and Murcko, M. A. (1998) Virtual screening - an
overview. Drug Discov Today 3, 160-178.

(13) Garrod, A. E. (1902) The incidence of alkaptonuria: a study in chemical
individuality. Lancet 2, 1616-1620.

(14) Ladu, B. N., Seegmiller, J. E., Laster, L., and Zannoni, V. G. (1958) The Nature
of the Metabolic Defect in Alcaptonuria. Arthritis Rheum 1, 271-271.

(15) Titus, G. P., Mueller, H. A., Burgner, J., Rodriguez De Cordoba, S., Penalva, M.
A., and Timm, D. E. (2000) Crystal structure of human homogentisate
dioxygenase. Nat Struct Biol 7, 542-6.

(16) Kendrew, J. C., and Perutz, M. F. (1957) X-ray studies of compounds of
biological interest. Annu Rev Biochem 26, 327-72.

(17) Perutz, M. F. (1960) Structure of hemoglobin. Brookhaven Symp Biol 13, 165-83.

(18) Beddell, C. R., Goodford, P. J., Norrington, F. E., Wilkinson, S., and Wootton, R.
(1976) Compounds designed to fit a site of known structure in human
haemoglobin. Br J Pharmacol 57, 201-9.

(19) Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R., and Ferrin, T. E. (1982) A
geometric approach to macromolecule-ligand interactions. J Mol Biol 161, 269-88.

(20) Berman, H., Henrick, K., Nakamura, H., and Markley, J. L. (2007) The
worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of
PDB data. Nucleic Acids Res 35, D301-3.

(21) Kuntz, I. D. (1992) Structure-based strategies for drug design and discovery.
Science 257, 1078-82.

(22) Klebe, G. (2000) Recent developments in structure-based drug design. J Mol Med
78, 269-81.

(23) Kroemer, R. T. (2007) Structure-based drug design: docking and scoring. Curr
Protein Pept Sci 8, 312-28.

(24) Schnecke, V., Swanson, C. A., Getzoff, E. D., Tainer, J. A., and Kuhn, L. A.
(1998) Screening a peptidyl database for potential ligands to proteins with
side-chain flexibility. Proteins 33, 74-87.

(25) Schnecke, V., and Kuhn, L. A. (1999) Database screening for HIV protease
ligands: the influence of binding-site conformation and representation on ligand
selectivity. Proc Int Conf Intell Syst Mol Biol, 242-51.

(26) Zavodszky, M. I., Sanschagrin, P. C., Korde, R. S., and Kuhn, L. A. (2002)
Distilling the essential features of a protein surface for improving protein-ligand
docking, scoring, and virtual screening. J Comput Aided Mol Des 16, 883-902.

(27) Gehlhaar, D. K., Verkhivker, G. M., Rejto, P. A., Sherman, C. J., Fogel, D. B.,
Fogel, L. J., and Freer, S. T. (1995) Molecular recognition of the inhibitor
AG-1343 by HIV-1 protease: conformationally flexible docking by evolutionary
programming. Chem Biol 2, 317-24.

(28) von Itzstein, M., Wu, W. Y., Kok, G. B., Pegg, M. S., Dyason, J. C., Jin, B., Van
Phan, T., Smythe, M. L., White, H. F., Oliver, S. W., et al. (1993) Rational
design of potent sialidase-based inhibitors of influenza virus replication. Nature
363, 418-23.

(29) Doman, T. N., McGovern, S. L., Witherbee, B. J., Kasten, T. P., Kurumbail, R.,
Stallings, W. C., Connolly, D. T., and Shoichet, B. K. (2002) Molecular docking
and high-throughput screening for novel inhibitors of protein tyrosine
phosphatase-1B. J Med Chem 45, 2213-21.

(30) Verkhivker, G. M., Bouzida, D., Gehlhaar, D. K., Rejto, P. A., Arthurs, S.,
Colson, A. B., Freer, S. T., Larson, V., Luty, B. A., Marrone, T., and Rose, P. W.
(2000) Deciphering common failures in molecular docking of ligand-protein
complexes. J Comput Aided Mol Des 14, 731-51.

(31) Brooijmans, N., and Kuntz, I. D. (2003) Molecular recognition and docking
algorithms. Annu Rev Biophys Biomol Struct 32, 335-73.

(32) Klebe, G. (2006) Virtual ligand screening: strategies, perspectives and limitations.
Drug Discov Today 11, 580-94.

(33) McGann, M. R., Almond, H. R., Nicholls, A., Grant, J. A., and Brown, F. K.
(2003) Gaussian docking functions. Biopolymers 68, 76-90.

(34) Schnecke, V., and Kuhn, L. A. (2000) Virtual screening with solvation and
ligand-induced complementarity. Perspect Drug Discov 20, 171-190.

(35) Jones, G., Willett, P., and Glen, R. C. (1995) Molecular recognition of receptor
sites using a genetic algorithm with a description of desolvation. J Mol Biol 245,
43-53.

(36) McMartin, C., and Bohacek, R. S. (1997) QXP: powerful, rapid computer
algorithms for structure-based drug design. J Comput Aided Mol Des 11, 333-44.

(37) Rarey, M., Kramer, B., Lengauer, T., and Klebe, G. (1996) A fast flexible
docking method using an incremental construction algorithm. J Mol Biol 261,
470-89.

(38) Friesner, R. A., Banks, J. L., Murphy, R. B., Halgren, T. A., Klicic, J. J., Mainz,
D. T., Repasky, M. P., Knoll, E. H., Shelley, M., Perry, J. K., Shaw, D. E.,
Francis, P., and Shenkin, P. S. (2004) Glide: a new approach for rapid, accurate
docking and scoring. 1. Method and assessment of docking accuracy. J Med
Chem 47, 1739-49.

(39) Muegge, I. (2000) A knowledge-based scoring function for protein-ligand
interactions: Probing the reference state. Perspect Drug Discov 20, 99-114.

(40) Gohlke, H., Hendlich, M., and Klebe, G. (2000) Knowledge-based scoring
function to predict protein-ligand interactions. J Mol Biol 295, 337-56.

(41) Charifson, P. S., Corkery, J. J., Murcko, M. A., and Walters, W. P. (1999)
Consensus scoring: A method for obtaining improved hit rates from docking
databases of three-dimensional structures into proteins. J Med Chem 42, 5100-9.

(42) Wang, R., Lai, L., and Wang, S. (2002) Further development and validation of
empirical scoring functions for structure-based binding affinity prediction. J
Comput Aided Mol Des 16, 11-26.

(43) Shoichet, B. K. (2004) Virtual screening of chemical libraries. Nature 432, 862-5.

(44) Tong, L., Pav, S., Mui, S., Lamarre, D., Yoakim, C., Beaulieu, P., and Anderson,
P. C. (1995) Crystal structures of HIV-2 protease in complex with inhibitors
containing the hydroxyethylamine dipeptide isostere. Structure 3, 33-40.

(45) Weichsel, A., and Montfort, W. R. (1995) Ligand-induced distortion of an active
site in thymidylate synthase upon binding anticancer drug 1843U89. Nat Struct
Biol 2, 1095-101.

(46) Teague, S. J. (2003) Implications of protein flexibility for drug discovery. Nat Rev
Drug Discov 2, 527-41.

(47) Alberts, I. L., Todorov, N. P., and Dean, P. M. (2005) Receptor flexibility in de
novo ligand design and docking. J Med Chem 48, 6585-96.

(48) Lesk, A. M., and Chothia, C. (1988) Elbow motion in the immunoglobulins
involves a molecular ball-and-socket joint. Nature 335, 188-90.

(49) Zavodszky, M. I., and Kuhn, L. A. (2005) Side-chain flexibility in protein-ligand
binding: the minimal rotation hypothesis. Protein Sci 14, 1104-14.

(50) Knegtel, R. M., Kuntz, I. D., and Oshiro, C. M. (1997) Molecular docking to
ensembles of protein structures. J Mol Biol 266, 424-40.

(51) Osterberg, F., Morris, G. M., Sanner, M. F., Olson, A. J., and Goodsell, D. S.
(2002) Automated docking to multiple target structures: incorporation of protein
mobility and structural water heterogeneity in AutoDock. Proteins 46, 34-40.

(52) Carlson, H. A., Masukawa, K. M., Rubins, K., Bushman, F. D., Jorgensen, W. L.,
Lins, R. D., Briggs, J. M., and McCammon, J. A. (2000) Developing a dynamic
pharmacophore model for HIV-1 integrase. J Med Chem 43, 2100-14.

(53) Fernandes, M. X., Kairys, V., and Gilson, M. K. (2004) Comparing ligand
interactions with multiple receptors via serial docking. J Chem Inf Comput Sci 44,
1961-70.

(54) Totrov, M., and Abagyan, R. (1997) Flexible protein-ligand docking by global
energy optimization in internal coordinates. Proteins Suppl 1, 215-20.

(55) Leach, A. R. (1994) Ligand docking to proteins with discrete side-chain
flexibility. J Mol Biol 235, 345-56.

(56) Kallblad, P., Todorov, N. P., Willems, H. M., and Alberts, I. L. (2004) Receptor
flexibility in the in silico screening of reagents in the S1' pocket of human
collagenase. J Med Chem 47, 2761-7.

(57) Rader, A. J., Hespenheide, B. M., Kuhn, L. A., and Thorpe, M. F. (2002) Protein
unfolding: rigidity lost. Proc Natl Acad Sci U S A 99, 3540-5.

(58) Hespenheide, B. M., Rader, A. J., Thorpe, M. F., and Kuhn, L. A. (2002)
Identifying protein folding cores from the evolution of flexible regions during
unfolding. J Mol Graph Model 21, 195-207.

(59) Lei, M., Zavodszky, M. I., Kuhn, L. A., and Thorpe, M. F. (2004) Sampling
protein conformations and pathways. J Comput Chem 25, 1133-48.

(60) Zavodszky, M. I., Lei, M., Thorpe, M. F., Day, A. R., and Kuhn, L. A. (2004)
Modeling correlated main-chain motions in proteins for flexible molecular
recognition. Proteins 57, 243-61.

(61) Walker, B., and Lynas, J. F. (2001) Strategies for the inhibition of serine
proteases. Cell Mol Life Sci 58, 596-624.

(62) Coghlan, M. J., Elmore, S. W., Kym, P. R., and Kort, M. E. (2003) The pursuit of
differentiated ligands for the glucocorticoid receptor. Curr Top Med Chem 3,
1617-35.

(63) Matter, H., and Schudok, M. (2004) Recent advances in the design of matrix
metalloprotease inhibitors. Curr Opin Drug Discov Devel 7, 513-35.

(64) Fabian, M. A., Biggs, W. H., 3rd, Treiber, D. K., Atteridge, C. E., Azimioara, M.
D., Benedetti, M. G., Carter, T. A., Ciceri, P., Edeen, P. T., Floyd, M., Ford, J.
M., Galvin, M., Gerlach, J. L., Grotzfeld, R. M., Herrgard, S., Insko, D. E., Insko,
M. A., Lai, A. G., Lelias, J. M., Mehta, S. A., Milanov, Z. V., Velasco, A. M.,
Wodicka, L. M., Patel, H. K., Zarrinkar, P. P., and Lockhart, D. J. (2005) A small
molecule-kinase interaction map for clinical kinase inhibitors. Nat Biotechnol 23,
329-36.

(65) Goodford, P. J. (1985) A computational procedure for determining energetically
favorable binding sites on biologically important macromolecules. J Med Chem
28, 849-57.

(66) Kastenholz, M. A., Pastor, M., Cruciani, G., Haaksma, E. E., and Fox, T. (2000)
GRID/CPCA: a new computational tool to design selective ligands. J Med Chem
43, 3033-44.

(67) Braiuca, P., Cruciani, G., Ebert, C., Gardossi, L., and Linda, P. (2004) An
innovative application of the "flexible" GRID/PCA computational method: study
of differences in selectivity between PGAs from Escherichia coli and a
Providentia rettgeri mutant. Biotechnol Prog 20, 1025-31.

(68) Sheridan, R. P., Holloway, M. K., McGaughey, G., Mosley, R. T., and Singh, S.
B. (2002) A simple method for visualizing the differences between related
receptor sites. J Mol Graph Model 21, 217-25.

(69) Miller, M. D., Kearsley, S. K., Underwood, D. J., and Sheridan, R. P. (1994)
FLOG: a system to select 'quasi-flexible' ligands complementary to a receptor of
known three-dimensional structure. J Comput Aided Mol Des 8, 153-74.

(70) Deng, Z., Chuaqui, C., and Singh, J. (2004) Structural interaction fingerprint
(SIFt): a novel method for analyzing three-dimensional protein-ligand binding
interactions. J Med Chem 47, 337-44.

(71) Perola, E. (2006) Minimizing false positives in kinase virtual screens. Proteins
64, 422-35.

(72) Lazdins, J., and Kron, M. (1999) New molecular targets for filariasis drug
discovery. Parasitol Today 15, 305-6.

(73) Melrose, W. D. (2002) Lymphatic filariasis: new insights into an old disease. Int J
Parasitol 32, 947-60.

(74) Brown, K. R., Ricci, F. M., and Ottesen, E. A. (2000) Ivermectin: effectiveness in
lymphatic filariasis. Parasitology 121 Suppl, S133-46.

(75) Kron, M. A., Kuhn, L. A., Sanschagrin, P. C., Hartlein, M., Grotli, M., and
Cusack, S. (2003) Strategies for antifilarial drug development. J Parasitol 89
(Suppl), S226-S235.

Chapter 2

Discovering new classes of Brugia malayi
asparaginyl-tRNA synthetase inhibitors and
relating specificity to conformational change

This research has been previously published as:

Sukuru, S. C. K., Crepin, T., Milev, Y., Marsh, L. G., Hill, J. B., Anderson, R. J., Morris,
J. C., Rohatgi, A., O'Mahony, G., Grøtli, M., Danel, F., Page, M. G. P., Hartlein, M.,
Cusack, S., Kron, M., and Kuhn, L. A. (2006) Discovering new classes of Brugia malayi
asparaginyl-tRNA synthetase inhibitors and relating specificity to conformational change.
J Comput Aided Mol Des 20, 159-78.

2.1 Abstract

SLIDE, which models the ﬂexibility of protein and ligand side chains while docking, was
used to screen several large databases to identify inhibitors of Brugia malayi asparaginyl-
tRNA synthetase (AsnRS), a target for anti-parasitic drug design. Seven classes of
compounds identiﬁed by SLIDE were conﬁrmed as having micromolar inhibition
constants against the enzyme. Analogs of one of these classes of inhibitors, the long
side-chain variolins, cannot bind to the adenosyl pocket of the closed conformation of AsnRS
due to steric clashes, though the short side-chain variolins identified by SLIDE
apparently bind isosterically with adenosine. We hypothesized that an open conformation
of the motif 2 loop also permits the long side-chain variolins to bind in the adenosine
pocket and that their selectivity for Brugia relative to human AsnRS can be explained by
differences in the sequence and conformation of this loop. Loop ﬂexibility sampling
using ROCK conﬁrms this possibility, while scoring of the relative afﬁnities of the
different ligands by SLIDE correlates well with the compounds’ ranks in inhibition
assays. Combining ROCK and SLIDE provides a promising approach for exploiting
conformational flexibility in structure-based screening and design of species-selective
inhibitors.

2.2 Introduction

Lymphatic ﬁlariasis, also known as elephantiasis, is caused by the nematode worms
Wuchereria bancrofti and Brugia malayi. It is a debilitating human disease that afflicts
more than 200 million people worldwide. More than 1.2 billion people in 80 countries
reside in areas where the disease is actively transmitted and are at a great risk of
contracting the disease (1-3). The crippling physical effects of the disease have a huge
economic and social impact, which is why lymphatic ﬁlariasis is one of the top 10
tropical diseases being targeted by the World Health Organization. Strategies to control
the disease include administration of drugs like ivermectin, diethylcarbamazine and
albendazole, which reduce the level of infection and prevent transmission. Most of these
drugs were discovered decades ago as chemotherapeutic agents to combat human
filariasis. However, the use of these drugs has been plagued with concerns about their
inability to kill the adult worms even after long treatment durations, severe side effects,
and the emergence of drug resistance in humans (4, 5).

Aminoacyl-tRNA synthetases (AARS) have been acknowledged as rational
targets for anti-infective drug development (6) because these enzymes are essential for
viability. AARS are one of several new drug targets in human ﬁlarial parasites that have
been proposed in recent years (1). They differ signiﬁcantly in sequence and structure
between the parasite and host organism, although sharing a common catalytic site
topology. AARS are responsible for the speciﬁc aminoacylation of transfer RNAs
(tRNAs). The two-step catalytic reaction involves the ATP-based activation of the amino
acid and the transfer of activated amino acid to the 3’-end of the cognate tRNA. The 20
AARS (one for each amino acid) are divided into two classes on the basis of their active
site architecture and conserved sequence identity. The class I AARS possess two
signature amino-acid sequences (HIGH and KMSKS) located in the active site with its
characteristic nucleotide-binding Rossmann fold, which consists of an alternating pattern of
β-strands and α-helices (7). The active site of class II AARS is comprised of a
six-stranded antiparallel β-sheet flanked by an additional parallel strand and three α-helices (8).
Three signature sequence motifs - motifs 1, 2 and 3 - characterize this class (Figure
2.1.B). Motif 1 consists of: +G(F/Y)XX(V/L/I)PΦΦ, where + is a positively charged
residue, Φ is a hydrophobic residue and X is any residue. Motif 2 consists of:
+ΦΦXΦXXXFRxE. Motif 3 consists of: ΦGΦGΦGΦΦERΦΦΦΦ. Several exceptions to
these classes have been reported (9). It is possible to further divide class II AARS into
subclasses IIa, IIb, and IIc based on the presence of specific domains that play a role in
anticodon recognition. Given the sequence and structural differences between the
prokaryotic and eukaryotic AARS, and that they are essential for the viability of all
organisms, AARS make attractive targets for developing selective inhibitors (6, 10). For
example, pseudomonic acid (mupirocin), a natural product synthesized by Pseudomonas
fluorescens, inhibits isoleucyl-tRNA synthetase (IleRS) from Gram-positive infectious
bacteria, including antibiotic-resistant S. aureus (11). This molecule has been shown to
have almost 8000-fold selectivity for pathogen IleRS over mammalian IleRS (12).

Figure 2.1: (A) Brugia AsnRS dimer, with one chain of the dimer colored yellow while
the other is colored cyan. ASNAMS is rendered as a space filling model, colored by atom
type (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow; and magnesium, white).
The associated magnesium ions are shown as white spheres. (B) The three class II AARS
sequence motifs are mapped onto the structure. Motif 1 is shown in purple, motif 2 in
yellow (including the adenine binding loop at bottom center), and motif 3 in magenta.
ASNAMS is shown in atom colored tubes and interacts primarily with the motif 2
residues.
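Signature motifs written in the notation used above for class II AARS can be translated mechanically into regular expressions for scanning sequences. The Python sketch below is only an illustration: writing the hydrophobic placeholder as Φ, and the residue sets chosen for "+" and Φ, are assumptions made here rather than definitions taken from the cited literature, and the test sequence is invented.

```python
import re

# Assumed residue classes (one-letter codes); adjust to the definition in use.
POSITIVE = "KR"             # '+' : positively charged residue
HYDROPHOBIC = "AVLIMFWYC"   # 'Φ' : hydrophobic residue


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif such as '+G(F/Y)XX(V/L/I)PΦΦ' into a compiled regex."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "+":
            parts.append(f"[{POSITIVE}]")
        elif ch == "Φ":
            parts.append(f"[{HYDROPHOBIC}]")
        elif ch in "Xx":
            parts.append(".")                   # any residue
        elif ch == "(":
            end = motif.index(")", i)           # alternatives such as (F/Y)
            options = motif[i + 1:end].split("/")
            parts.append("[" + "".join(options) + "]")
            i = end
        else:
            parts.append(ch)                    # literal, conserved residue
        i += 1
    return re.compile("".join(parts))


MOTIF_1 = "+G(F/Y)XX(V/L/I)PΦΦ"
motif1_regex = motif_to_regex(MOTIF_1)

# Hypothetical sequence fragment, used only to exercise the pattern.
hit = motif1_regex.search("MKGFAALPVLS")
```

A strict pattern like this reports only exact matches, so the exceptions to the class definitions noted in the text would be missed without relaxing some positions.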

We are targeting the Brugia malayi asparaginyl tRNA synthetase (AsnRS) for
drug development against ﬁlariasis because it is an essential enzyme in protein synthesis
that is expressed in both sexes of the nematode and in several stages of the life cycle -
adults, bloodborne microfilariae, and infective larvae (13). Recent studies have shown
that the expression levels of AsnRS in Brugia females are signiﬁcantly higher than those
of other AARS (14). AsnRS is also speciﬁcally associated with chemokine activity
towards human cells that may play a role in the massive inﬂammatory response
associated with lymphatic ﬁlariasis (15). AsnRS has been well characterized
biochemically and structurally and can be recombinantly expressed to facilitate in vitro
studies (16, 17). AsnRS is a class IIb aminoacyl tRNA synthetase, together with AspRS
and LysRS. Class IIb AARS have a distinct N-terminal β-barrel domain (OB fold) that
binds the tRNA anticodon stem loop.

Here, we present results of applying structure-based computational ligand
screening and design to develop inhibitors against Brugia AsnRS, using a 1.9 Å
resolution structure of the enzyme in complex with a non-hydrolyzable analog of
asparaginyl adenylate, which likely mimics the enzymatic product. The homodimeric
structure (Figure 2.1.A) is N-terminally truncated, with each monomer lacking the first
111 amino-acid residues to enable crystallization of the complex; this truncation,
however, retains catalytic activity. Two more crystal structures of the enzyme, one a
dimer with the monomers bound to two different ligands and another a ligand-free dimer,
further aided our analysis. Structure-based drug design (18) has led to the development of
potent and speciﬁc new drugs, such as the widely used HIV protease inhibitor, Viracept.
Computational screening is useful for identifying new leads for drug design as well as
narrowing down the search domain and focusing in vitro screening toward appropriate,
often novel molecular scaffolds. Shoichet and co-workers have shown that computational
screening using molecular docking for identifying new scaffolds typically has a
significantly higher hit rate than in vitro high-throughput screening alone (19).

The efﬁciency of structure-based screening methods depends on the speed with
which they eliminate infeasible ligand candidates and how accurately they predict the
binding modes and afﬁnities of the docked ligands. Both protein and ligand ﬂexibility
need to be incorporated in computational drug design to improve the accuracy of the
results (20, 21). Modeling protein ﬂexibility using representative protein conformations
to screen against each conformer of each ligand candidate is computationally very
expensive, requiring some compromise between speed and accuracy. A number of
methods have been developed in recent years to include such ﬂexibility in drug design.
Kuntz and co-workers averaged the information from multiple crystallographic and NMR
structures of the same protein to describe its conformational variability (22), and the
developers of AutoDock have used a similar strategy (23). FlexE software (24)
incorporates protein flexibility by retaining the varied orientations of flexible side chains
of multiple crystal structures of the same protein and using compatible combinations of
these conformations to dock ligands. Carlson and co-workers used several snapshots from
molecular dynamics (MD) simulations to map the conserved, relatively immobile
interaction sites in the dynamic protein binding site (25). However, these methods
either rely on experimentally solved structures to explore the conformational space
available to proteins or focus on the rigid regions within the protein.

SLIDE (Screening for Ligands by Induced-ﬁt Docking, Efﬁciently)
accommodates protein and ligand side-chain ﬂexibility when screening databases of
hundreds of thousands of small organic molecules to identify potential ligand candidates
for a target protein (20, 26). Knowledge-based representation of the protein binding site
in SLIDE gives good sampling and identiﬁes the correct binding modes of ligands (27,
28). Steric misﬁt between the docked ligand and the protein is resolved in SLIDE through
the minimal directed rotation of single bonds in the ligand and in the protein side chains
(29). Ligand candidates are assumed to be in a conformation close to the bioactive,
bound conformation, or are input as libraries of low-energy conformers. Here we employ
ﬂexibility modeling to identify and model the interactions of new ligands for Brugia
AsnRS using SLIDE screening of databases of small organic molecules with drug-like
molecular weights and atomic compositions. The observed afﬁnities and speciﬁcities of
known and predicted ligands for Brugia AsnRS are compared with the structure-based
predictions. We also present a case in which large-scale conformational change of an
active-site loop must (and can) be modeled in order to predict the mode of ligand binding
and explain its reasonable selectivity for Brugia relative to human AsnRS.

The graph theoretic algorithm ProFlex, successor to the FIRST software (30),
was used to identify the coupled networks of covalent and non-covalent bonds within the
target protein to predict the ﬂexible regions in Brugia AsnRS. Diverse conformers of
these ﬂexible regions were generated using ROCK, a random-walk sampling algorithm
(31). In order to assess the contributions of the residues of the ﬂexible loop regions to
speciﬁcity and relative binding afﬁnities of ligands, detailed docking studies were
performed. Flexible low-energy conformers generated by Omega (OpenEye Software) for
the top-scoring ligand candidates were docked into the most open ROCK conformer of
the protein using SLIDE, and SLIDE modeled additional side-chain ﬂexibility in the
ligand and protein upon binding. This approach has provided insights into the role of
main-chain ﬂexibility in ligand binding for other systems (cyclophilin A, estrogen
receptor, dihydrofolate reductase, and HIV protease) (31, 32), and, in the case of Brugia
AsnRS, helps identify the structural elements that determine ligand speciﬁcity and

binding for long side-chain variolins.

2.3 Materials and methods

2.3.1 Asparaginyl-tRNA synthetase structures

A 1.9 Å resolution closed structure of Brugia AsnRS in complex with a non-hydrolyzable
analog of asparaginyl adenylate (ASNAMS) was used for structure-based ligand
screening and design (Figure 2.1.A and Table 2.1). Additionally, two crystal structures
(Table 2.1), one providing the ligand-free (apo) conformation and the other having one
monomer bound to ATP and the other bound to L-aspartate-β-hydroxamate adenylate
(LBHAMP), were used for analyzing the conformational flexibility of the protein. Data
collection and refinement statistics are provided in Table 2.1. Details of crystallization
and structure determination will be presented elsewhere, and X-ray coordinates will be
deposited in the Protein Data Bank (33). In the meantime, X-ray coordinates can be
obtained by contacting Stephen Cusack (cusack@embl-grenoble.fr).

Table 2.1: Data collection and refinement statistics of Brugia AsnRS crystal structures (a)

                           Complex I           Complex II          Apo AsnRS
                           AsnRS:ASNAMS        AsnRS:LBHAMP:ATP
Data collection
Space group                P212121             P212121             P212121
Cell dimensions (Å)        55.5 125.7 144.3    57.9 106.4 161.4    58.8 108.8 162.4
Resolution range (Å)       30-1.9 (2.0-1.9)    50-2.4 (2.49-2.4)   49-2.3 (2.42-2.3)
Rsym(I) (%) (b, c)         8.7 (43.7)          9.5 (42.3)          7.1 (33.3)
Completeness (%) (b)       89.5 (69.6)         97.6 (95.4)         97.4 (91.9)
Refinement statistics (d)
R-factor (%)               22.6                21.1                26.9
Rfree (%)                  26.3                28.6                32.2
Ramachandran plot (e)
Favoured (%)               86.9                79.1                86.5
Additional (%)             12                  19.2                12.4

a Determined by Carmen Berthet-Colominas, Michael Hartlein, Thibaut Crepin and Stephen Cusack,
EMBL Outstation, Grenoble, France (33). X-ray coordinates will be provided upon request to
cusack@embl-grenoble.fr.
b Values in parentheses are for the highest-resolution shell.
c Rsym(I) = [Σhkl Σi |<Ihkl> - Ihkl,i|] / [Σhkl Σi |Ihkl|], where i indexes the observations of reflection hkl.
d Refinement with CNS (34).
e Ramachandran diagram has been calculated with PROCHECK (35).

2.3.2 Experimental assay

We standardized the malachite green assay for phosphate release (36-39) for use in
monitoring inhibition of aminoacylation using colorimetric measurement of
pyrophosphate generation from the first step in the aminoacylation reaction:

E + AA + ATP ⇌ E(AA-AMP) + PPi (2.1)
E(AA-AMP) + tRNA ⇌ E + AA-tRNA + AMP (2.2)

Here E is the enzyme, AA the amino acid, and PPi pyrophosphate. This assay was used to
measure inhibition constants for inhibitors predicted by SLIDE and their analogs.

2.3.3 Screening and docking with SLIDE

SLIDE (Screening for Ligands by Induced-Fit Docking Efﬁciently) was used to screen
databases of small organic molecules to ﬁnd potential inhibitors of Brugia AsnRS.
SLIDE (20, 26, 40) is a screening and docking tool that uses distance geometry to screen
and dock ligand candidates into the binding site of the target protein. SLIDE represents
the binding site of the protein by a template consisting of points identiﬁed as the most
favorable positions for ligand atoms to form hydrogen bonds or make hydrophobic
interactions with the neighboring protein atoms (27). The ligand candidates in the
database are similarly represented by a set of interaction points, assigned to polar atoms
or centers of hydrophobic atom clusters. For each ligand candidate, all possible triplets of
its interaction points are mapped onto all geometrically and chemically compatible
template triangles. The anchor fragment of the ligand is deﬁned by the triplet of
interaction points that match with a template triangle. Any portion of the ligand outside
this anchor fragment is considered ﬂexible by SLIDE. After ﬁnding a feasible match
between the ligand candidate and protein template, SLIDE models induced-ﬁt by
resolving steric overlaps between the ﬂexible portion of the ligand and the protein side
chains using minimal rotations determined by mean-ﬁeld optimization (26, 29).
Collision-free docked ligand orientations are scored based on the number of hydrogen
bonds and degree of hydrophobic complementarity with the protein. The SLIDE
software is available to academic and commercial researchers; see Software at

http://www.bch.msu.edu/labs/kuhn
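
The triplet-matching step described above can be illustrated with a minimal sketch. It is not SLIDE's implementation, which the text describes only at a high level; the point format, the interaction-type labels, the compatibility rule, and the 0.5 Å edge-length tolerance are all assumptions made for illustration.

```python
from itertools import combinations, permutations
from math import dist

# Hypothetical point format: (interaction_type, (x, y, z)).
# The type labels and the tolerance are illustrative assumptions, not SLIDE parameters.


def compatible(ligand_type, template_type):
    """Return True if a ligand interaction point may be placed on a template point."""
    if template_type == "donor_or_acceptor":
        return ligand_type in ("donor", "acceptor")
    return ligand_type == template_type


def match_triangles(ligand_points, template_points, tol=0.5):
    """List (ligand_triplet, template_triplet) index pairs whose point types are
    compatible and whose three edge lengths agree within `tol` angstroms."""
    matches = []
    for lig in combinations(range(len(ligand_points)), 3):
        # Permutations let each ligand point map onto any template vertex.
        for tmpl in permutations(range(len(template_points)), 3):
            if not all(compatible(ligand_points[l][0], template_points[t][0])
                       for l, t in zip(lig, tmpl)):
                continue
            edges_ok = all(
                abs(dist(ligand_points[lig[a]][1], ligand_points[lig[b]][1])
                    - dist(template_points[tmpl[a]][1], template_points[tmpl[b]][1])) <= tol
                for a, b in ((0, 1), (1, 2), (0, 2))
            )
            if edges_ok:
                matches.append((lig, tmpl))
    return matches


# Toy data: a 3-4-5 ligand triangle and a template containing a translated copy of it.
ligand = [("donor", (0.0, 0.0, 0.0)),
          ("acceptor", (3.0, 0.0, 0.0)),
          ("hydrophobic", (0.0, 4.0, 0.0))]
template = [("donor_or_acceptor", (10.0, 0.0, 0.0)),
            ("acceptor", (13.0, 0.0, 0.0)),
            ("hydrophobic", (10.0, 4.0, 0.0)),
            ("hydrophobic", (20.0, 20.0, 20.0))]
print(match_triangles(ligand, template))
```

The brute-force enumeration is chosen here only for readability. Each match would then define an anchor fragment, with the rest of the ligand treated as flexible.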

2.3.4 Scoring protein-ligand interactions

Several comparative studies of docking and scoring methods (28, 41-45) have shown that
no one scoring function for predicting ligand binding and affinity performs consistently
well across diverse protein families. Hence, to develop a scoring protocol that can
distinguish ligands from non-ligands, reliably detect the correct conformation and
binding mode for known ligands, and score them in the order of their relative affinity for
Brugia AsnRS, a panel of three different scoring functions was tested: SLIDE score (27),
DrugScore (46), and X-Score (47). SLIDE score is a weighted sum of hydrophobic and
hydrogen-bond interaction terms, trained to match affinity values in known complexes.
DrugScore is a knowledge-based scoring function that uses structural information from
the Protein Data Bank (PDB) to score protein-ligand complexes based on the preferred
distances observed between different ligand and protein atom pairs. X-Score is an
empirical scoring function that calculates the binding afﬁnity of a protein-ligand complex
by using terms that account for van der Waals interactions, hydrogen bonding,
deformation, and the hydrophobic effect. Scoring accuracy was determined by how well
the scoring functions assessed the binding modes and relative afﬁnities of known Brugia
AsnRS ligands, whereas the enrichment accuracy was determined by their ability to
select true ligands from a large number of decoys (1000 diverse, random drug-like
molecules obtained from the website of Dr. Didier Rognan, CNRS:
http://bioinfo-pharma.u-strasbg.fr/bioinformatics-cheminformatics-group.html). Low-energy conformers of
these decoy molecules were generated using Omega (OpenEye Software) as input to
SLIDE screening. All conformers within 7.5 kcal/mol of the minimum-energy conformer
sampled (using the MMFF force field) were included. For each docked compound
(known ligands as well as non-ligand decoys), only the top-scoring binding orientation
was ranked. Any two docked compounds scoring identically were given identical ranks.
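
The conformer energy window and the ranking rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the protocol's actual code: the function names are hypothetical, and the use of dense ranking (the next distinct score receives the next integer rank) is an assumption, since the text states only that identical scores receive identical ranks.

```python
def filter_conformers(energies, window=7.5):
    """Return indices of conformers within `window` kcal/mol of the lowest energy sampled."""
    e_min = min(energies)
    return [i for i, e in enumerate(energies) if e - e_min <= window]


def rank_best_poses(scores_by_compound, higher_is_better=True):
    """Rank compounds by their best-scoring pose; identical scores share a rank.

    `higher_is_better` should be True for a score where larger is more favorable
    and False for a score where more negative is more favorable.
    Dense ranking is an assumption made for this sketch.
    """
    pick = max if higher_is_better else min
    best = {name: pick(scores) for name, scores in scores_by_compound.items()}
    ordered = sorted(set(best.values()), reverse=higher_is_better)
    rank_of = {score: rank for rank, score in enumerate(ordered, start=1)}
    return {name: rank_of[score] for name, score in best.items()}


# Hypothetical energies (kcal/mol) and pose scores, for illustration only.
print(filter_conformers([0.0, 7.5, 7.6, 3.0]))
print(rank_best_poses({"A": [5.0, 9.0], "B": [9.0], "C": [3.0]}))
```

Keeping only the best pose per compound before ranking mirrors the statement that only the top-scoring binding orientation of each compound was ranked.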

2.3.5 Modeling main-chain ﬂexibility

Figure 2.2: (A) Known flexible regions of Brugia AsnRS, based on comparison of the
three crystal structures, are shown in green ribbons, while the rest of the monomer is
shown in grey ribbons (see Table 2.5 for details; residue ranges are given for each
flexible loop). The three residues that differ near the active site in Brugia and human
AsnRS are rendered in space-filling models colored by atom type, with carbon atoms
colored grey. ASNAMS is rendered in atom-colored tubes with carbon atoms colored
orange. The template generated by SLIDE to represent the binding site is rendered in
small stars. The template points, representing ligand chemistry that would be favored at
that site, are colored red for hydrogen-bond acceptor; blue for hydrogen-bond donor;
white for hydrogen-bond donor and/or acceptor; and green for hydrophobic interactions.
(B) ROCK flexible loop conformations generated from the ASNAMS-bound Brugia
AsnRS crystal structure. The diverse conformations shown were chosen from 500 ROCK
conformers by selecting those conformations with the largest pairwise RMSDs in
main-chain dihedral angles, as described in (31). The monomer is rendered in ribbons colored
by the flexibility index, with blue being the most rigid and red being the most flexible
regions of the structure, according to ProFlex. The ligand is shown in atom-colored tubes
with carbon atoms in orange. The three active-site-neighboring residues that differ
between Brugia and human AsnRS are rendered as space-filling models colored by
flexibility index. Different conformers for the flexible regions of Brugia AsnRS,
generated by ROCK, are shown in green ribbons, with the adenine-binding loop shown at
bottom.

The active sites of human and Brugia AsnRS are very similar, with only 3 amino acid
differences in the first shell of amino acids surrounding the active site, including all
residues within 9 Å of any atom of ASNAMS (Figure 2.2.A). Because these three side
chains point away from the binding site, it is most likely that they influence the
conformations of residues that interact with asparagine or adenosine or the conformations
of the motif 2 adenine-binding loop, which has residue 224 (Ala in Brugia and Thr in
human) near its hinge. To examine alternate conformations accessible to the known
ﬂexible active-site loops in Brugia AsnRS, we used ProFlex software (30) to identify the
ﬂexible regions in the protein, and ROCK (31, 32) to sample them. ProFlex predicts the
ﬂexible and rigid regions in a given structure (which bonds are constrained and which
bonds remain free to rotate) based on analysis of constraints posed by the protein’s
network of covalent bonds, hydrogen bonds, salt bridges, and hydrophobic interactions.
ProFlex calculations are fast and have been shown to predict the conformational
ﬂexibility of a protein reliably from a single 3D structure (30, 48, 49). ROCK (Rigidity
Optimized Conformational Kinetics) uses a restricted random-walk sampling to search
the conformational space available to proteins given the ﬂexible regions deﬁned by
ProFlex as input. A conformer generated by ROCK is either accepted or rejected,
depending upon whether it maintains the non-covalent bond network and results in no
van der Waals overlaps between atoms. The most distinct main-chain conformers
generated by ROCK are selected based on the RMSD values relative to the initial
structure. Brugia AsnRS active-site loop conformers representing favorable open
conformations of the protein were used to interpret the observed afﬁnities and
speciﬁcities for Brugia AsnRS relative to human AsnRS for inhibitors that could not bind
(based on steric clashes) with the closed loop conformation of Brugia AsnRS. ProFlex
and ROCK software are available to academic and commercial researchers; see Software

under http://www.bch.msu.edu/labs/kuhn.
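
The selection of the most distinct conformers by RMSD relative to the initial structure can be sketched as below. This is a minimal illustration, not ROCK's code: the function names are hypothetical, coordinates are plain (x, y, z) tuples for matched atoms, and the conformers are assumed to be already expressed in the same reference frame as the initial structure, so no superposition is performed.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Assumes both structures are already in the same reference frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return sqrt(total / len(coords_a))


def most_distinct(initial, conformers, k):
    """Return indices of the k conformers with the largest RMSD to `initial`."""
    ranked = sorted(range(len(conformers)),
                    key=lambda i: rmsd(initial, conformers[i]),
                    reverse=True)
    return ranked[:k]


# Toy two-atom "structures", for illustration only.
initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conformers = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the initial structure
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],  # both atoms shifted by 1 along y
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],  # one atom shifted by 2 along y
]
print(most_distinct(initial, conformers, k=2))
```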


2.4 Results

2.4.1 Scoring Brugia AsnRS-ligand interactions

The results of the scoring analysis by SLIDE score, DrugScore and X-Score (Figure 2.3)
show that both SLIDE score and DrugScore do a reliable job of assessing the right
conformation, binding mode, and relative afﬁnity of known AsnRS ligands, and also

distinguish three known ligands from 1000 diverse drug-like molecules.

 

 

 

 

 

 

 

 

 

[Figure 2.3 plot: enrichment curves for SLIDE score, DrugScore, and X-Score, plotted against rank.]

Figure 2.3: Enrichment plot for the three different scoring functions - SLIDE score,
DrugScore and X-Score. The scoring functions were assessed for their ability to
distinguish as top-scoring compounds the three known Brugia AsnRS ligands
(ASNAMS, LBHAMP, ATP) when mixed into a set of 1000 drug-like small molecules
(from http://bioinfo-pharma.u-strasbg.fr/bioinformatics-cheminformatics-group.html). The top-
scoring binding orientation and conformer for each ligand candidate was ranked relative
to the other candidates.

 


The first of these known ligands discovered for Brugia malayi independently of
SLIDE, with an experimentally characterized binding mode, is ASNAMS, which was
designed as a product analog for step 1 in the aminoacylation reaction and had previously
been shown to bind in the crystal structure of Thermus thermophilus AsnRS (8). The
structure of Complex I (Figure 2.1) shows that this compound binds in the same
orientation in Brugia malayi AsnRS, with an IC50 value of 4.5 µM, as measured by the
malachite green assay (Table 2.2). L-aspartate-β-hydroxamate was used as a reagent
during development of the malachite green assay (39). When the third known ligand, the
substrate ATP, was added during the assay, L-aspartate-β-hydroxamate adenylate
(LBHAMP) was formed and remained bound to AsnRS, as confirmed by the structure of
Complex II (Table 2.1). This complex contains LBHAMP and pyrophosphate bound to
one of the monomers in the dimer, whereas ATP alone was observed in the other
monomer. The IC50 value of LBHAMP is 4 µM (Table 2.2), very similar to that of the
product analog ASNAMS. The three known ligands, ASNAMS, LBHAMP, and ATP,
were all docked by SLIDE to within 1 Å RMSD of their crystallographically observed
positions, and the scoring functions ranked them correctly according to their
experimentally determined IC50 values against Brugia AsnRS; adenosine is not observed
to inhibit AsnRS at a concentration of 500 µM. The enrichment plot in Figure 2.3 shows
that SLIDE score performs particularly well in ranking the three Brugia AsnRS known
ligands within the top 5 scoring compounds and clearly distinguishes them from the vast
majority of the molecules used as non-ligand decoys.

[Table 2.2: computational scores (SLIDE score and DrugScore) and experimentally determined IC50 values for the known Brugia AsnRS ligands. The table body is not recoverable from the scanned source. Legible footnotes: a higher SLIDE score is more favorable; a more negative DrugScore is more favorable; ND, not determined.]

2.4.2 Screening the databases

A template was generated by SLIDE to represent the active site of Brugia AsnRS and its
interactions with ASNAMS and used to screen databases of small organic molecules. The
Cambridge Structural Database (CSD) (50) and the National Cancer Institute (NCI)
Plated Compounds Database (51) were filtered to retain compounds with appropriate
atom types (no bound metals or inorganic atoms other than halogens), molecular weights
(≤ 500 Da), lipophilicity values (logP ≤ 5), flexibility (≤ 5 rotatable bonds), and polar
character (≤ 5 hydrogen-bond donor atoms and ≤ 10 hydrogen-bond acceptor atoms), to
focus screening on the most drug-like compounds (52). The final set of compounds for
screening included about 110,000 compounds from the CSD and about 78,000
compounds from the NCI Plated Compounds Database. The CSD contains at least one
low-energy crystallographic conformation for each of its compounds, and these
conformations were used for screening. For the NCI compounds, low-energy
three-dimensional conformers were generated using Omega (OpenEye Software). All
conformers within 7.5 kcal/mol of the minimum-energy conformer sampled (using the
MMFF force field) were included. Both SLIDE score and DrugScore were used to score
the docked orientations of the potential ligand candidates screened by SLIDE.
Compounds were selected for experimental assays based on these scores and molecular
graphics inspection of their interactions with Brugia AsnRS. Out of the high-scoring
candidates, we selected for assays those having several of the following desirable
features: matching the hydrogen-bond and pi-cation interactions of known AsnRS
ligands, filling the same volume as the product analog ASNAMS in the adenine and
ribose pockets, having amine or halogen groups that would lend themselves to ready
substitution, and having no known tendency to self-assemble at typical inhibitor
concentrations. Upon assaying, the following compounds were found to inhibit Brugia
AsnRS significantly at concentrations in the low to mid-micromolar range.
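
The drug-likeness filter applied to the databases before screening can be sketched as a simple predicate. This is an illustration only: the `Compound` record and its field names are hypothetical, the property values are assumed to be precomputed by some cheminformatics tool, and the set of allowed elements is an assumption, since the text says only that bound metals and inorganic atoms other than halogens were excluded.

```python
from dataclasses import dataclass

# Assumed set of permitted elements; the text specifies only "no bound metals or
# inorganic atoms other than halogens".
ALLOWED_ELEMENTS = frozenset({"C", "H", "N", "O", "S", "P", "F", "Cl", "Br", "I"})


@dataclass(frozen=True)
class Compound:
    """Hypothetical record holding precomputed properties of one database entry."""
    name: str
    elements: frozenset
    mol_weight: float      # Da
    logp: float
    rotatable_bonds: int
    hbond_donors: int
    hbond_acceptors: int


def is_drug_like(c):
    """Apply the thresholds quoted in the text to a single compound."""
    return (c.elements <= ALLOWED_ELEMENTS
            and c.mol_weight <= 500.0
            and c.logp <= 5.0
            and c.rotatable_bonds <= 5
            and c.hbond_donors <= 5
            and c.hbond_acceptors <= 10)


# Made-up entries, for illustration only.
library = [
    Compound("small_polar", frozenset({"C", "H", "N", "O"}), 291.3, 1.2, 2, 3, 6),
    Compound("too_flexible", frozenset({"C", "H", "O"}), 340.5, 3.9, 9, 1, 4),
    Compound("metal_complex", frozenset({"C", "H", "N", "Cu"}), 410.0, 2.0, 1, 2, 4),
]
print([c.name for c in library if is_drug_like(c)])
```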

2.4.2.1 Results from screening the CSD

2.4.2.1.1 Variolin B

Variolin B (CSD code LEPWIM), a pyrrolopyrimidine, was originally isolated from an
Antarctic sea sponge and has been shown to have antitumor and antiviral activity (53). In
earlier work, SLIDE identified variolin B as a potential inhibitor (3), and it was
confirmed to inhibit ~50% (47% ± 16%) of Brugia AsnRS activity at a concentration of
50 µM. This marine natural product contains five- and six-membered rings that are
isosteric and share chemistry with adenine, suggesting that it could bind to AsnRS in a
manner similar to adenosine. Indeed, SLIDE indicates (Figure 2.4.A) that it binds in
the same pocket as the adenosyl portion of ASNAMS in the Brugia AsnRS
crystallographic complex (3). Pyrrolopyrimidines and their analogs have been shown to
compete with ATP to inhibit cyclin-dependent kinases (54, 55). Here, as a follow-up
study, three available, synthesized derivatives of variolin B (56-58) were tested for
Brugia and human AsnRS inhibition; their IC50 values appear in Table 2.4. One of these
derivatives, SMEVAR, shows significant inhibition of AsnRS at 50 µM concentration,
whereas LCM01 and LCM02 (59-61) showed somewhat weaker (125-175 µM) IC50
values, with the advantage of 3- to 8-fold selectivity for Brugia over human AsnRS.

 

Figure 2.4: The SLIDE-predicted binding mode of (A) variolin B, shown in thick atom
colored tubes (carbon, green; oxygen, red; and nitrogen, blue), is compared with
ASNAMS bound in the crystal structure, shown in atom colored tubes but with carbon
atoms colored orange. Side chains of binding-site residues rotated by SLIDE, to model
induced-ﬁt while docking, are shown in purple tubes, while their original positions in the
crystal structure are shown in grey. Predicted binding modes are also shown for the
SLIDE-discovered inhibitors, in atom-colored tubes, compared with the crystallographic
binding modes of ASNAMS. (B) Rishirilide B, (C) cycloadenosine, (D) NSC114691
(phenanthridinol), (E) NSC363624 (triazinylamine), (F) NSC35467
(phenanthrylethanone), and (G) NSC12156 (dimethylmalonamide).

 


 


 

[Table 2.3: computational scores (SLIDE score and DrugScore) and experimental inhibition data for the SLIDE-discovered inhibitors of Brugia AsnRS. The table body is not recoverable from the scanned source. Legible footnotes: a higher SLIDE score is more favorable; a more negative DrugScore is more favorable; ND, not determined.]

[Table 2.4: computational scores (SLIDE score and DrugScore) and experimentally determined IC50 values against Brugia and human AsnRS for the variolin B derivatives and the cycloadenosine derivative CYADOT-S-Asn. The table body is not recoverable from the scanned source. Legible footnotes: a higher SLIDE score is more favorable; a more negative DrugScore is more favorable; ND, not determined.]

Although variolin B and its derivatives show promising inhibition of Brugia
AsnRS enzymatic activity, they are highly cytotoxic in human Namalwa cell lines (F.
Danel, data not shown). The binding mode predicted by SLIDE indicates that the ligand
binds in the adenosyl pocket of the binding site. Because variolins contain five- and
six-membered rings that are isosteric and share chemistry with adenine, these compounds
may be toxic because they bind to ATP sites in general. This interpretation is supported
by research showing that pyrrolopyrimidines and their variants inhibit human protein
kinases (62-65). The key to addressing cytotoxicity would be to design in selectivity for
Brugia AsnRS by conjugating an asparagine side chain to the variolin scaffold via an
appropriate linker, and this work is in progress. Since no crystallographic information is
available on the variolin B derivatives, but the Cambridge Structural Database provides a
crystal structure for the variolin B scaffold (CSD code: LEPWIM), the 3D structures of
the variolin B derivatives were built on this scaffold using CORINA (66), followed by
generating all low-energy conformers using Omega (OpenEye Software). All conformers
within 7.5 kcal/mol of the minimum-energy conformer sampled (using the MMFF force
field) were included. Results of docking these conformers will be presented in the section
entitled “Impact of Main-chain Conformational Flexibility on Ligand Binding”.

2.4.2.1.2 Rishirilide B

Rishirilide B (CSD code CUQZUJ), isolated from Streptomyces rishiriensis, has been
shown to have antithrombotic activity through selective α2-macroglobulin inhibition,
leading to the activation of plasmin (67). It has a tricyclic scaffold that is relatively rigid
and is isosteric with the adenosine moiety of ASNAMS, according to the SLIDE docking.
This orientation (Figure 2.4.B) was scored highly by both SLIDE and DrugScore. An
alkyl side-chain protrudes from the binding site and is in a polar environment surrounded
by residues Arg411, Glu310 and Hile9. The scaffold of this compound, generated by
pruning the alkyl chain and a carboxymethyl group (-COOMe) from the ligand, was also
docked using SLIDE. The scaffold docked with improved isostericity with the adenine
portion of ASNAMS but with a poor interaction score, apparently due to loss of the side
chains. The malachite green assay was carried out on a sample of rishirilide B, provided
generously by Dr. Samuel Danishefsky (Sloan-Kettering Institute, New York), to test for
inhibition of Brugia AsnRS. The compound has (+) and (-) enantiomers, and only the
enantiomeric mixture showed weak inhibitory activity (Table 2.3), while the (-)
enantiomer showed no inhibition. This suggests that the (+) enantiomer is responsible for
inhibition. Given the unavailability of purified (+) enantiomer for assays and the apparently
weak binding of the enantiomeric mix, we chose to focus on tighter-binding compounds,
as described in the following sections.

2.4.2.1.3 Cycloadenosine

Also identified by SLIDE as a top-scoring potential inhibitor was 8,2'-cycloadenosine
(CSD code for the crystal structure of its trihydrate: CYADOT). Cycloadenosine is a
modified nucleoside cyclized at the C(8) and O(2') atoms and is known to be active
against leukemic and other tumor cells (68). Derivatives of cycloadenosine are also
known to inhibit other tRNA synthetases: PheRS, SerRS, LysRS, ValRS, IleRS and
ArgRS (69). This compound was docked by SLIDE in a position that is isosteric with
the adenosine of ASNAMS and was considered to have a potential advantage over
adenosine because the bridged ring system in cycloadenosine could potentially reduce the
entropic cost of binding to AsnRS. The binding mode predicted by SLIDE is shown in
Figure 2.4.C. This compound can be made more stable by replacing the oxygen in the
bridge between the adenine and ribose moieties by a methylene group. While this
cycloadenosine did not show inhibitory activity even at a concentration of 500 µM, this is
also true of the native substrate, adenosine. The corresponding sulfamoyl asparagine
derivative (CYADOT-S-Asn) was then synthesized and assayed for inhibition of Brugia
AsnRS. This derivative showed moderate inhibition, with IC50 values of 70 µM and
90 µM against Brugia and human AsnRS, respectively (Table 2.4). This represents weaker
binding than indicated by the IC50 values of the un-cyclized analog, ASNAMS (4.5 µM and
1.7 µM, respectively). Strain caused by cyclization of the ribose moiety to the adenine,
resulting in the 5' sulfamoyl asparagine group being directed somewhat out of the
asparagine pocket of the binding site, may have weakened the binding relative to
non-cyclized ASNAMS. This can be addressed by redesigning the linker.

2.4.2.2 Results from screening the NCI plated compounds database

2.4.2.2.1 Phenanthridinol

8-Chloro-3-(hydroxy(oxido)amino)-6-phenanthridinol (NCI code: NSC114691) has a
rigid tricyclic scaffold (Figure 2.4.D) that fills the adenine pocket of AsnRS. Although
this compound showed a favorable 65 µM inhibition of Brugia AsnRS in the
experimental assay (Table 2.3), the planarity and aromaticity of the scaffold are a cause
for concern. Planar, tricyclic scaffolds are potentially toxic because of their ability to
intercalate DNA. Compounds sharing similar scaffolds have been shown to possess
inhibitory activity against Brugia AsnRS (70) and phosphodiesterase 4 (PDE4) (71).

2.4.2.2.2 Triazinylamine

4-(3-(4-amino-6-isopropenyl-1,3,5-triazin-2-yl)phenyl)-6-isopropenyl-1,3,5-triazin-2-ylamine
(NCI code: NSC363624) has a symmetric structure with two substituted triazine
rings connected by a phenyl group. The SLIDE-predicted orientation of the compound
(Figure 2.4.E) places one of the triazine rings isosteric with the 6-membered ring of
ASNAMS and mimics its interactions with the surrounding binding site residues as well.
The bridging phenyl ring in the center is docked in the ribose pocket of the binding site,
but is unable to mimic the polar interactions of the ribose ring owing to its hydrophobic
character. Results of the malachite green assay (Table 2.3) show that this compound
inhibits 50% of Brugia AsnRS activity and 80% of human AsnRS activity at 25 µM
concentration. Further studies will assess if structure-based substitutions can make it an
even more potent and selective inhibitor of Brugia AsnRS. 1,3,5-Triazine-substituted
polyamines have been shown to be active against the malarial parasite, Plasmodium
falciparum (72).

2.4.2.2.3 Phenanthrylethanone

2-(3-methyl-1λ5-pyridin-1-yl)-1-(2-phenanthryl)ethanone (NCI code: NSC35467) is a
charged pyridine derivative. A keto group bridges the tricyclic moiety and the
pyridine ring. The SLIDE-predicted orientation (Figure 2.4.F) shows the tricyclic
scaffold in the adenine pocket which, though isosteric with adenine, does not form any
specific hydrogen bonds with the surrounding binding site residues. Planarity of the
tricyclic scaffold in this compound increases concerns about potential toxicity associated
with DNA intercalation. Results from the malachite green assay (Table 2.3) show that at
200 µM, this compound inhibits 53% of Brugia AsnRS activity but does not inhibit human
AsnRS. Although it is a weak inhibitor of Brugia AsnRS, its selectivity for Brugia relative
to human AsnRS is attractive. We are focusing on identifying the source of specificity
within this compound to guide the optimization of more potent inhibitor scaffolds.

2.4.2.2.4 Dimethylmalonamide

N1,N3-bis(4-amino-2-methyl-6-quinolinyl)-2,2-dimethylmalonamide (NCI code:
NSC12156) is a symmetric compound with two bicyclic ring systems. Disubstituted
malonamides are known to have weak trypanocidal activity against Trypanosoma brucei
(73). SLIDE docked one of the bicyclic groups, which shares chemistry and shape with
the 6-membered ring of adenine, into the adenine pocket of the binding site (Figure
2.4.G), mimicking the hydrogen bonds formed by N3 and N6 of adenine. However, in
this predicted binding mode, the other bicyclic ring system could not be docked
favorably. Results from the inhibition assay (Table 2.3) show that this compound weakly
inhibits both Brugia and human AsnRS (~50% inhibition at 200 µM ligand
concentration). However, a single malonamide group could be substituted to allow
binding in the ribose and asparagine pockets as well as the adenine pocket.

2.4.2.3 Success rate of screening

From SLIDE screens on the CSD and NCI drug-like compounds, 45 compounds
altogether were tested for Brugia and human AsnRS inhibition. Out of the compounds
tested, seven classes of compounds predicted by SLIDE were confirmed as low- to
mid-micromolar inhibitors: rishirilide and cycloadenosine from the CSD, four NCI plated
compounds, and variolin B and its analogs (Tables 2.3 and 2.4 and Figure 2.4). Some of
these compounds and their analogs (particularly the long-chain variolins, Table 2.4)
selectively inhibit Brugia relative to human AsnRS. The success rate in screening by
SLIDE for AsnRS inhibitors is thus 7 out of 45 compounds (~15%). The best published
hit rate for structure-based screening is 34% (19), a study that involved visual
screening by medicinal chemists in addition to using docking scores as a guide.
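
The success-rate arithmetic above can be checked with a few lines of Python. This is
only an illustrative sketch: the function name hit_rate is hypothetical and is not part
of SLIDE or of the screening protocol; the counts are those reported in this section.

```python
def hit_rate(n_hits: int, n_tested: int) -> float:
    """Fraction of experimentally tested compounds confirmed as inhibitors."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_hits / n_tested


# Counts reported in this section: 7 confirmed inhibitors out of 45 compounds tested.
slide_rate = hit_rate(7, 45)
print(f"SLIDE screening hit rate: {slide_rate:.1%}")
```

The computed value is 15.6%, which the text reports as ~15%.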

2.4.3 Modeling the conformational ﬂexibility of Brugia AsnRS

Modeling protein flexibility in ligand binding is important for improving the accuracy
of results in computational ligand screening and design, as even a small change in the
protein binding-site conformation can introduce large changes in ligand interactions and
computed binding affinities. Understanding conformational differences can also enable
the design of substituents that improve binding and specificity. The results from the
enzyme inhibition assays performed on ligand candidates identified by SLIDE indicate
that they bind to Brugia AsnRS with a range of binding affinities, from low (≥ 200 µM)
to moderate (≤ 25 µM). The variolin B derivatives are of particular interest because they
show some selectivity towards Brugia AsnRS. Given the absence of active-site sequence
differences between Brugia and human AsnRS, we sought to understand how the
flexibility of active-site loops coupled with neighboring sequence differences (Figure
2.2.A) could influence ligand binding through conformational differences.

Table 2.5: Known flexible regions in Brugia AsnRS. See structures in Figure 2.2.

Flexible loop           Residue range   Remarks
--------------------    -------------   ---------------------------------------------
Adenine binding loop    K213 - L220     Coordinates could not be determined due to
                                        mobility of these residues in both chains A
                                        and B of the apo crystal structure.
Distal loop             E297 - F302     Residues were mobile (no coordinates
                                        determined) in chain A of the apo crystal
                                        structure.
Amino acid substrate    Q163 - L172     Significant conformational differences were
recognition loop                        observed for these residues between the
                                        LBHAMP-bound monomer and the ATP-bound
                                        monomer of the same crystal structure.

To model flexibility that might contribute to inhibitor specificity, the known
flexible regions in Brugia AsnRS were mapped by comparing the ASNAMS-bound and
ligand-free (apo) crystal structures (Table 2.5). Analysis of the apo crystal structure
indicates high mobility of the adenine-binding loop, and comparison of Brugia AsnRS
structures bound to LBHAMP and ATP indicates that the amino acid recognition loop adopts
significantly different conformations depending on the type of ligand bound. To assess
alternative conformations for the adenine-binding loop, ProFlex flexibility analysis (30)
was performed on the ASNAMS-bound conformation of Brugia AsnRS to identify the
coupled networks of covalent and non-covalent bonds within the protein. The ligand was
removed from the protein before running ProFlex, and only those hydrogen bonds and
salt bridges with energies of ≤ -1.0 kcal/mol were included, to exclude hydrogen
bonds that are too weak to influence protein flexibility. The results of ProFlex analysis
for Brugia AsnRS included the relative flexibility of each bond (from rigid/non-rotatable
through entirely flexible) and lists of which bond rotations were coupled through rings of
covalent and non-covalent interactions. This information and the structure of Complex I
with ASNAMS removed were used as the input to ROCK. ROCK then generated
alternative low-energy conformations that preserved the non-covalent bond network, by
sampling (31) favored main-chain dihedral angles in the flexible regions of Brugia
AsnRS. Five hundred main-chain conformers were generated, spanning from closed to
very open conformations. Of the 500 conformers generated by ROCK, 14 had a
significant main-chain deviation (more than 4.5 Å) in the adenine-binding loop. To
select the most open overall conformer of the protein from among these, we computed the
minimum and maximum distances of the three known flexible loops (Table 2.5 and Figure
2.2.A) from the centroid of the co-crystallized ligand, ASNAMS, and chose the AsnRS
conformation with the greatest sum of these loop distances. Thus, the open conformation
analyzed not only had a significant main-chain deviation in the adenine-binding loop,
but also reflected a feasible, highly open conformation of the protein overall, when
compared to the closed conformation.
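
The selection of the most open conformer described above can be sketched in Python.
This is a minimal illustration under stated assumptions, not the script used in this
work: each conformer is assumed to be a dictionary mapping a loop name to a list of
(x, y, z) atom coordinates in Å, and the "sum of loop distances" is assumed to be the
sum of the minimum and maximum loop-to-centroid distances over the three loops. The
function names and the toy coordinates are hypothetical.

```python
from math import dist


def loop_distance_range(loop_atoms, ligand_centroid):
    """Minimum and maximum distance (Å) from a loop's atoms to the ligand centroid."""
    distances = [dist(atom, ligand_centroid) for atom in loop_atoms]
    return min(distances), max(distances)


def openness_score(conformer, ligand_centroid):
    """Sum of min and max loop-to-centroid distances over all flexible loops.

    Assumption: the text does not state exactly which distances were summed,
    so both the minimum and the maximum are added for each loop.
    """
    total = 0.0
    for loop_atoms in conformer.values():
        d_min, d_max = loop_distance_range(loop_atoms, ligand_centroid)
        total += d_min + d_max
    return total


def most_open_conformer(conformers, ligand_centroid):
    """Return the index of the conformer with the largest openness score, and all scores."""
    scores = [openness_score(c, ligand_centroid) for c in conformers]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Toy data: two conformers with three flexible loops each (coordinates are made up).
centroid = (0.0, 0.0, 0.0)
closed_like = {
    "adenine_binding_loop": [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    "distal_loop": [(0.0, 5.0, 0.0)],
    "amino_acid_recognition_loop": [(0.0, 0.0, 6.0)],
}
open_like = {
    "adenine_binding_loop": [(8.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    "distal_loop": [(0.0, 5.0, 0.0)],
    "amino_acid_recognition_loop": [(0.0, 0.0, 6.0)],
}
best, scores = most_open_conformer([closed_like, open_like], centroid)
print(best, scores)
```

In practice the candidate list would first be restricted to conformers whose
adenine-binding loop deviates by more than 4.5 Å from the closed structure, as described
above; that pre-filter is omitted here for brevity.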

The most open conformation generated by ROCK for Brugia AsnRS shows a
significant opening of the adenine-binding loop (residues K213 - L220; Figure 2.5.B)
connecting the two anti-parallel β-strands near the binding site (Figure 2.1.B). This loop
is involved in the binding of ATP and the acceptor end of the cognate tRNA and has been
reported to play a significant role in conformational changes associated with other class II
AARS, such as AspRS (74), LysRS (75), SerRS (76), ProRS (77) and HisRS (78).
Mutations in this loop have also been shown to affect the tRNA-dependent amino acid
recognition by SerRS (79). Motion of this loop in Brugia AsnRS, as simulated by ROCK,
exposes a new cavity near the adenosine pocket of the binding site, leading to an open
conformation that emulates the apo crystal structure of the protein (Figure 2.5.B). A shift
in the position of His 219 facilitates the significant change in backbone conformation of
this loop. In the ASNAMS-bound crystal structure, representing the closed conformation
of the protein, His 219 is docked between Glu 310 and Arg 411 and blocks access to the
cavity that is exposed in the open conformation (Figure 2.5.C, D).


 

Figure 2.5: (A) The ASNAMS-bound closed crystal structure conformation of Brugia
AsnRS is compared with the ligand-free (apo) crystal structure. ASNAMS is shown as
atom colored tubes, with carbon atoms colored orange. The Connolly solvent-accessible
molecular surface of the apo conformation, lacking atomic coordinates of the mobile
adenine binding loop, is rendered as a solid surface (off-white), while that of the closed
conformation is rendered as mesh in cyan, showing the closed adenine binding loop as a
cyan-colored ribbon. (B) The ROCK-generated most open conformation of Brugia
AsnRS is compared with the apo crystal structure. The Connolly surface of the apo
crystal structure conformation is again rendered as solid, while that of the open
conformation is rendered as green mesh. The most open conformation of the adenine
binding loop is rendered as a green ribbon. (C) The adenine binding loop residue His 219
(Connolly surface colored by atom type, with carbon atoms colored cyan) is docked
between Glu 310 and Arg 411 (atom-colored Connolly surface, with carbons in green) in
the closed crystal structure conformation of Brugia AsnRS. ASNAMS is shown in atom
colored tubes with carbons colored orange. (D) His 219 (Connolly surface colored by
atom type) undergoes a signiﬁcant motion away from adenosine, due to reorientation of
the main chain in the ROCK-generated open conformation. (E) LCM02, a long side-chain
variolin B derivative shown in atom colored tubes, manually docked in the closed crystal
structure conformation of Brugia AsnRS (with Connolly surface colored purple)
according to the predicted binding mode of variolin B (Figure 2.4.A). ASNAMS is shown
in atom colored tubes, with carbons colored orange. The steric clashes between the side-
chain of LCM02 at the lower left and the closed conformation of the adenine binding
loop (purple ribbon) could not be resolved by any single bond rotations in the ligand or
protein. (F) LCM02, a long side-chain variolin B derivative shown in atom colored tubes,
docked in the ROCK-generated open conformation of Brugia AsnRS (Connolly surface
colored cyan) to match the variolin B binding mode. For reference, ASNAMS is shown
in atom colored tubes with carbons colored orange. There were no steric clashes between
LCM02 and the open conformation of the adenine binding loop (cyan ribbon), and the
long side chain with dimethyl amino group (lower-left) ﬁts well into the channel
uncovered by opening the adenine binding loop and the proposed His 219 gate.

 


2.4.4 Impact of main-chain conformational flexibility on ligand
binding: interpreting the observed affinities and specificities

The binding modes predicted by SLIDE, using the closed, crystal structure Brugia AsnRS
conformation, could explain the observed binding affinities and specificities of all the
compounds except the two long side-chain derivatives of variolin B (LCM01 and LCM02
in Table 2.4). All low-energy conformations of the variolin B derivatives, generated
using Omega, were tested for docking into the closed conformation with SLIDE. LCM01
and LCM02 could not be docked into the binding site of the closed conformation, due to
unresolvable steric collisions between the long side chains of the variolins and the
backbone atoms of the adenine-binding loop (Figure 2.5.E). However, with the same
docking protocol, SLIDE was able to dock these compounds into the binding site of the
open conformation generated by ROCK (Figure 2.5.F). (The apo crystal structure was
not used for docking these compounds because the adenine-binding loop is so flexible in
the apo structure that its atomic coordinates could not be determined, and the interactions
between the ligand and the loop therefore could not be assessed.) The binding mode of
LCM01 and LCM02 docked in the ROCK-generated open conformation of Brugia
AsnRS was in good agreement with the SLIDE-predicted binding mode of the
unsubstituted variolin B in the closed conformation of the protein.
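
The idea that a ligand pose can collide with the loop in one protein conformation yet
fit in another can be illustrated with a simple distance check. The Python sketch below
is not SLIDE's algorithm: SLIDE also tries to resolve overlaps through side-chain
rotations, which is not modeled here. The 2.5 Å cutoff, the function names, and the toy
coordinates are all assumptions made for illustration.

```python
from math import dist

CLASH_CUTOFF = 2.5  # Å; illustrative threshold, not SLIDE's actual overlap tolerance


def count_clashes(ligand_atoms, protein_atoms, cutoff=CLASH_CUTOFF):
    """Count ligand-protein atom pairs that are closer than the cutoff distance."""
    return sum(
        1
        for lig in ligand_atoms
        for prot in protein_atoms
        if dist(lig, prot) < cutoff
    )


def clash_free_conformers(ligand_atoms, conformers, cutoff=CLASH_CUTOFF):
    """Names of protein conformers in which the fixed ligand pose has no clashes."""
    return [
        name
        for name, protein_atoms in conformers.items()
        if count_clashes(ligand_atoms, protein_atoms, cutoff) == 0
    ]


# Toy coordinates (Å): a ligand side chain extending toward a loop backbone atom.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
conformers = {
    "closed": [(3.5, 0.0, 0.0)],  # loop atom 0.5 Å from the side-chain tip
    "open": [(9.0, 0.0, 0.0)],    # loop atom moved away from the binding site
}
print(clash_free_conformers(ligand, conformers))
```

With these toy coordinates, only the open conformer accepts the pose, mirroring the
qualitative behavior reported for LCM01 and LCM02.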

Long side-chain variolins LCM01 and LCM02 have IC50 values of 173 ± 90 µM
and 123 ± 54 µM, respectively, indicating they are moderate to weak binders of Brugia
AsnRS, whereas variolin B and its short side-chain derivative SMEVAR have IC50
values of ~50 µM against the enzyme. However, the assay data on the long side-chain
variants of variolin B indicate that they bind to Brugia AsnRS 3- to 8-fold more tightly
than to human AsnRS. While the structure of human AsnRS has not yet been determined,
there are only three sequence differences (A224T, A335S, and L353V; Figure 2.2.A)
near the active site. One of these residues, Ala 224 in Brugia AsnRS, is near the hinge of
the adenine-binding loop, and may favor a different, more open conformation in Brugia
than in human AsnRS, since the less bulky and non-hydrogen bonding alanine side chain
is likely to restrict the motion of this loop less than threonine. The Brugia selectivity of
long side-chain variolins may therefore be explained by their fitting only into the open
conformation of the binding site, with this conformation being more readily accessible in
Brugia than in human AsnRS due to the sequence substitution at the base of the loop.
Similarly, the greater potency of the short side-chain variolins relative to the long
side-chain analogs in Brugia could also be accounted for by this conformational model,
since the closed conformation of the adenine-binding loop allows more favorable contacts
with inhibitors.

2.5 Discussion

A realistic expectation of structure-based drug screening is to find low-affinity binders
with novel scaffolds that can be further optimized by adding substituents to develop tight
and selective inhibitors. Low micromolar affinity is typical of such lead compounds,
especially in the case of Brugia AsnRS, where even the product mimic has a low
micromolar (4.5 µM) IC50 (Table 2.2). While aminoacyl-tRNA synthetases are
acknowledged as rational drug targets (6), this is the first published account of
discovering new classes of AARS inhibitors by structure-based screening. Screening and
docking algorithms previously had been used to model the binding and relative affinity
values of known inhibitors of synthetases and their analogs. Goddard and co-workers
used the HierDock virtual screening protocol to dock and predict the relative binding
energies of phenylalanine analogs to the T. thermophilus PheRS crystal structure (80).
Lee and Kim used comparative molecular field analysis (81) to dock four known
inhibitors of S. aureus MetRS and develop a predictive quantitative structure-activity
relationship (82). Most of the highly potent and selective AARS inhibitors discovered in
recent years have come from in vitro screening and optimization studies (83-86). Here we
show that structure-based screening against an AARS target can identify several new
classes of inhibitors.

SLIDE has identified seven classes of inhibitors showing 50% inhibition of
Brugia AsnRS at 25-240 µM concentrations. Analogs of variolin B showed 3- to 8-fold
selectivity for Brugia relative to human AsnRS. This success rate for identifying new
ligands based on SLIDE virtual screening (~15%) supports the benefits of including
3-dimensional structural information in high-throughput screening, since structure-blind
in vitro screening typically has a success rate of <0.1% (19). In the process of screening,
SLIDE predicts the binding mode of the docked ligand in the binding site of the protein,
which aids in optimizing the new ligands for higher affinity and selectivity for the target
protein.

Incorporating complete protein flexibility during the screening of large molecular
databases is computationally intensive enough that it is not yet feasible. Various
methods developed in recent years have shown that selecting a small ensemble of protein
structures can satisfactorily represent the conformational space available to the flexible
regions of a protein binding site. In particular, crystallographic snapshots, representing
structures of the same protein in different conformational states, have been used to
represent protein flexibility in the screening and design of ligand candidates (22, 24, 25,
87). However, this approach is limited to experimentally observed states rather than fully
representing the low-energy conformations of a protein. Molecular dynamics simulations
can provide a sample of low-energy states, but remain limited to sampling motions on the
sub-millisecond timescale, typically reflecting small-scale motions. However, Gorfe and
Caflisch have used explicit-water MD simulations in an application similar to ours, to
assess the flexibility of the substrate binding site between apo and inhibitor-bound
structures of β-secretase (88). Their results indicate that the open- and closed-flap
conformations of the protein are accessible at room temperature; hence, the open
conformation could also be used for β-secretase inhibitor design. ROCK is designed to
sample flexible regions in a protein using a non-forcefield approach, in which the
motions maintain the non-covalent bond network and avoid steric overlaps. Unlike MD,
ROCK does not ascribe timescales to modeled motions or assess the relative
likelihood/energy of the generated conformers, although this can be done by coupling
ROCK with MD. By preserving non-covalent interactions, ROCK tends to sample
low-energy states and follow low-barrier paths between conformations. Furthermore, the
ProFlex software used with ROCK can automatically define interactions that are coupled
within the protein, without the need for expensive normal modes or essential dynamics
calculations (32). This approach can also assess how flexibility in a protein changes or
redistributes upon complex formation, as has been analyzed for HIV protease and the
Ras-Raf complex (89).

The active sites of Brugia and human AsnRS have high sequence identity, with
only three amino acid differences adjacent to the substrate binding sites. One of these
substitutions occurs at the base of the adenine-binding loop (residue 224 is Ala in Brugia
and Thr in human AsnRS), which likely alters its conformational flexibility in Brugia
relative to human AsnRS. Designing inhibitor substituents that optimally fill the pocket
created when this loop opens could improve inhibitor binding affinity and selectivity for
Brugia AsnRS. This approach is supported by the work of others. For instance, Bursavich
and Rich have proposed that stabilizing the conformational ensemble of an enzyme,
including less-populated open conformations, can explain a range of ligand binding
events that cannot be explained by lock-and-key or induced fit to a single target structure
(90). Stroud and co-workers also suggest, based on their crystallographic analysis of C.
neoformans and E. coli thymidylate synthase (91), that differences in flexibility or
dynamics can be employed for species-specific inhibition. Thus, we envision that
considering conformational differences of active-site loops between species, rather than
only considering residue differences in the static parts of binding pockets, will open a
range of new possibilities for gaining specificity between closely homologous enzymes.

2.6 Conclusions

Using a template designed to represent the active site of Brugia AsnRS and its
interactions with known ligands, SLIDE has successfully identified seven diverse
compounds that mimic the interactions between adenosine and the protein and bind with
micromolar affinity. All the CSD and NCI compounds docked into the adenosyl pocket
of the binding site. This protein is highly specific for binding asparagine in its aminoacyl
pocket, as is generally true for AARS and their cognate amino acids. As a consequence, a
productive strategy for AARS inhibitor design is to find promising scaffolds that bind
strongly in the adenosyl pocket and can be linked appropriately to the cognate aminoacyl
group. SLIDE identification of variolin B as an inhibitor led to the testing of variolin
derivatives, which proved to be similarly potent and show selectivity for the Brugia
enzyme. The impact of main-chain conformational flexibility on ligand binding in Brugia
AsnRS has been modeled, providing insights into the binding of long side-chain
variolins and their selectivity for the parasite AsnRS. The motions of active-site loops
sampled by ROCK enable us to assess the contributions of protein conformational
flexibility to ligand binding and specificity and provide a potent tool to develop even
more selective inhibitors of the protein.


References

(1) Lazdins, J., and Kron, M. (1999) New molecular targets for filariasis drug
discovery. Parasitol Today 15, 305-6.

(2) Melrose, W. D. (2002) Lymphatic filariasis: new insights into an old disease. Int J
Parasitol 32, 947-60.

(3) Kron, M. A., Kuhn, L. A., Sanschagrin, P. C., Hartlein, M., Grotli, M., and
Cusack, S. (2003) Strategies for antifilarial drug development. J Parasitol 89
(Suppl.), S226-S235.

(4) Brown, K. R., Ricci, F. M., and Ottesen, E. A. (2000) Ivermectin: effectiveness in
lymphatic filariasis. Parasitology 121 Suppl, S133-46.

(5) Horton, J., Witt, C., Ottesen, E. A., Lazdins, J. K., Addiss, D. G., Awadzi, K.,
Beach, M. J., Belizario, V. Y., Dunyo, S. K., Espinel, M., Gyapong, J. O.,
Hossain, M., Ismail, M. M., Jayakody, R. L., Lammie, P. J., Makunde, W.,
Richard-Lenoble, D., Selve, B., Shenoy, R. K., Simonsen, P. E., Wamae, C. N.,
and Weerasooriya, M. V. (2000) An analysis of the safety of the single dose, two
drug regimens used in programmes to eliminate lymphatic filariasis. Parasitology
121 Suppl, S147-60.

(6) Schimmel, P., Tao, J., and Hill, J. (1998) Aminoacyl tRNA synthetases as targets
for new anti-infectives. Faseb J 12, 1599-609.

(7) Eriani, G., Delarue, M., Poch, O., Gangloff, J., and Moras, D. (1990) Partition of
tRNA synthetases into two classes based on mutually exclusive sets of sequence
motifs. Nature 347, 203-6.

(8) Cusack, S., Berthet-Colominas, C., Hartlein, M., Nassar, N., and Leberman, R.
(1990) A second class of synthetase structure revealed by X-ray analysis of
Escherichia coli seryl-tRNA synthetase at 2.5 Å. Nature 347, 249-55.

(9) Woese, C. R., Olsen, G. J., Ibba, M., and Soll, D. (2000) Aminoacyl-tRNA
synthetases, the genetic code, and the evolutionary process. Microbiol Mol Biol
Rev 64, 202-36.

(10) Brown, M. J., Mensah, L. M., Doyle, M. L., Broom, N. J., Osbourne, N., Forrest,
A. K., Richardson, C. M., O'Hanlon, P. J., and Pope, A. J. (2000) Rational design
of femtomolar inhibitors of isoleucyl tRNA synthetase from a binding model for
pseudomonic acid-A. Biochemistry 39, 6003-11.

(11) Casewell, M. W., and Hill, R. L. (1985) In-vitro activity of mupirocin
('pseudomonic acid') against clinical isolates of Staphylococcus aureus. J
Antimicrob Chemother 15, 523-31.

(12) Hughes, J., and Mellows, G. (1980) Interaction of pseudomonic acid A with
Escherichia coli B isoleucyl-tRNA synthetase. Biochem J 191, 209-19.

(13) Nilsen, T. W., Maroney, P. A., Goodwin, R. G., Perrine, K. G., Denker, J. A.,
Nanduri, J., and Kazura, J. W. (1988) Cloning and characterization of a
potentially protective antigen in lymphatic filariasis. Proc Natl Acad Sci U S A 85,
3604-7.

(14) Kron, M., Petridis, M., Milev, Y., Leykam, J., and Hartlein, M. (2003)
Expression, localization and alternative function of cytoplasmic asparaginyl-
tRNA synthetase in Brugia malayi. Mol Biochem Parasitol 129, 33-9.

(15) Ramirez, B. L., Howard, O. M., Dong, H. F., Edamatsu, T., Gao, P., Hartlein, M.,
and Kron, M. (2006) Brugia malayi asparaginyl-transfer RNA synthetase induces
chemotaxis of human leukocytes and activates G-protein-coupled receptors
CXCR1 and CXCR2. J Infect Dis 193, 1164-71.

(16) Kron, M., Marquard, K., Hartlein, M., Price, S., and Leberman, R. (1995) An
immunodominant antigen of Brugia malayi is an asparaginyl-tRNA synthetase.
FEBS Lett 374, 122-4.

(17) Beaulande, M., Tarbouriech, N., and Hartlein, M. (1998) Human cytosolic
asparaginyl-tRNA synthetase: cDNA sequence, functional expression in
Escherichia coli and characterization as human autoantigen. Nucleic Acids Res 26,
521-4.

(18) Doucet, J. P., and Weber, J. (1996) Molecular Similarity, in Computer-Aided
Molecule Design: Theory and Applications pp 328-362, Springer.

(19) Doman, T. N., McGovern, S. L., Witherbee, B. J., Kasten, T. P., Kurumbail, R.,
Stallings, W. C., Connolly, D. T., and Shoichet, B. K. (2002) Molecular docking
and high-throughput screening for novel inhibitors of protein tyrosine
phosphatase-1B. J Med Chem 45, 2213-21.

(20) Schnecke, V., Swanson, C. A., Getzoff, E. D., Tainer, J. A., and Kuhn, L. A.
(1998) Screening a peptidyl database for potential ligands to proteins with side-
chain flexibility. Proteins 33, 74-87.

(21) Carlson, H. A., and McCammon, J. A. (2000) Accommodating protein flexibility
in computational drug design. Mol Pharmacol 57, 213-8.

(22) Knegtel, R. M., Kuntz, I. D., and Oshiro, C. M. (1997) Molecular docking to
ensembles of protein structures. J Mol Biol 266, 424-40.

(23) Osterberg, F., Morris, G. M., Sanner, M. F., Olson, A. J., and Goodsell, D. S.
(2002) Automated docking to multiple target structures: incorporation of protein
mobility and structural water heterogeneity in AutoDock. Proteins 46, 34-40.

(24) Claussen, H., Buning, C., Rarey, M., and Lengauer, T. (2001) FlexE: efficient
molecular docking considering protein structure variations. J Mol Biol 308, 377-
95.

(25) Carlson, H. A., Masukawa, K. M., Rubins, K., Bushman, F. D., Jorgensen, W. L.,
Lins, R. D., Briggs, J. M., and McCammon, J. A. (2000) Developing a dynamic
pharmacophore model for HIV-1 integrase. J Med Chem 43, 2100-14.

(26) Schnecke, V., and Kuhn, L. A. (1999) Database screening for HIV protease
ligands: the influence of binding-site conformation and representation on ligand
selectivity. Proc Int Conf Intell Syst Mol Biol, 242-51.

(27) Zavodszky, M. I., Sanschagrin, P. C., Korde, R. S., and Kuhn, L. A. (2002)
Distilling the essential features of a protein surface for improving protein-ligand
docking, scoring, and virtual screening. J Comput Aided Mol Des 16, 883-902.

(28) Zavodszky, M. I., and Kuhn, L. A. (2006) Improving Docking Validation.
Proteins in review.

(29) Zavodszky, M. I., and Kuhn, L. A. (2005) Side-chain flexibility in protein-ligand
binding: the minimal rotation hypothesis. Protein Sci 14, 1104-14.

(30) Jacobs, D. J., Rader, A. J., Kuhn, L. A., and Thorpe, M. F. (2001) Protein
flexibility predictions using graph theory. Proteins 44, 150-65.

(31) Lei, M., Zavodszky, M. I., Kuhn, L. A., and Thorpe, M. F. (2004) Sampling
protein conformations and pathways. J Comput Chem 25, 1133-48.

(32) Zavodszky, M. I., Lei, M., Thorpe, M. F., Day, A. R., and Kuhn, L. A. (2004)
Modeling correlated main-chain motions in proteins for flexible molecular
recognition. Proteins 57, 243-61.

(33) Berthet-Colominas, C., Crepin, T., Haertlein, M., Kron, M., and Cusack, S. to be
submitted.

(34) Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-
Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R.
J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Crystallography & NMR
system: A new software suite for macromolecular structure determination. Acta
Crystallogr D Biol Crystallogr 54, 905-21.

(35) Laskowski, R. A., Moss, D. S., and Thornton, J. M. (1993) Main-chain bond
lengths and bond angles in protein structures. J Mol Biol 231, 1049-67.

(36) Hess, H. H., and Derr, J. E. (1975) Assay of inorganic and organic phosphorus in
the 0.1-5 nanomole range. Anal Biochem 63, 607-13.

(37) Baykov, A. A., Evtushenko, O. A., and Avaeva, S. M. (1988) A malachite green
procedure for orthophosphate determination and its use in alkaline phosphatase-
based enzyme immunoassay. Anal Biochem 171, 266-70.

(38) Cogan, E. B., Birrell, G. B., and Griffith, O. H. (1999) A robotics-based
automated assay for inorganic and organic phosphates. Anal Biochem 271, 29-35.

(39) Danel, F., Walle, C., Kron, M., Haertlein, M., Cusack, S., and Page, M. G. P.
(2004) in International Conference on Aminoacyl tRNA Synthetases pp 115,
Seoul, Korea.

(40) Schnecke, V., and Kuhn, L. A. (2000) Virtual screening with solvation and
ligand-induced complementarity. Perspect Drug Discov 20, 171-190.

(41) Bissantz, C., Folkers, G., and Rognan, D. (2000) Protein-based virtual screening
of chemical databases. 1. Evaluation of different docking/scoring combinations. J
Med Chem 43, 4759-67.

(42) Stahl, M., and Rarey, M. (2001) Detailed analysis of scoring functions for virtual
screening. J Med Chem 44, 1035-42.

(43) Halperin, I., Ma, B., Wolfson, H., and Nussinov, R. (2002) Principles of docking:
An overview of search algorithms and a guide to scoring functions. Proteins 47,
409-43.

(44) Ferrara, P., Gohlke, H., Price, D. J., Klebe, G., and Brooks, C. L., 3rd. (2004)
Assessing scoring functions for protein-ligand interactions. J Med Chem 47,
3032-47.

(45) Perola, E., Walters, W. P., and Charifson, P. S. (2004) A detailed comparison of
current docking and scoring methods on systems of pharmaceutical relevance.
Proteins 56, 235-49.

(46) Gohlke, H., Hendlich, M., and Klebe, G. (2000) Knowledge-based scoring
function to predict protein-ligand interactions. J Mol Biol 295, 337-56.

(47) Wang, R., Lai, L., and Wang, S. (2002) Further development and validation of
empirical scoring functions for structure-based binding affinity prediction. J
Comput Aided Mol Des 16, 11-26.

(48) Hespenheide, B. M., Rader, A. J., Thorpe, M. F., and Kuhn, L. A. (2002)
Identifying protein folding cores from the evolution of flexible regions during
unfolding. J Mol Graph Model 21, 195-207.

(49) Rader, A. J., Hespenheide, B. M., Kuhn, L. A., and Thorpe, M. F. (2002) Protein
unfolding: rigidity lost. Proc Natl Acad Sci U S A 99, 3540-5.

(50) Taylor, R. (2002) Life-science applications of the Cambridge Structural Database.
Acta Crystallogr D Biol Crystallogr 58, 879-88.

(51) Ihlenfeldt, W. D., Voigt, J. H., Bienfait, B., Oellien, F., and Nicklaus, M. C.
(2002) Enhanced CACTVS browser of the Open NCI Database. J Chem Inf
Comput Sci 42, 46-57.

(52) Lipinski, C. A., Lombardo, F., Dominy, B. W., and Feeney, P. J. (2001)
Experimental and computational approaches to estimate solubility and
permeability in drug discovery and development settings. Adv Drug Deliv Rev 46,
3-26.

(53) Perry, N. B., Ettouati, L., Litaudon, M., Blunt, J. W., Munro, M. H. G., Parkin, S.,
and Hope, H. (1994) Alkaloids from the antarctic sponge Kirkpatrickia varialosa.:
Part 1: Variolin b, a new antitumour and antiviral compound. Tetrahedron 50,
3987.

(54) Evers, D. L., Breitenbach, J. M., Borysko, K. Z., Townsend, L. B., and Drach, J.
C. (2002) Inhibition of cyclin-dependent kinase 1 by purines and pyrrolo[2,3-
d]pyrimidines does not correlate with antiviral activity. Antimicrob Agents
Chemother 46, 2470-6.

(55) Gompel, M., Leost, M., De Kier Joffe, E. B., Puricelli, L., Franco, L. H., Palermo,
J., and Meijer, L. (2004) Meridianins, a new family of protein kinase inhibitors
isolated from the ascidian Aplidium meridianum. Bioorg Med Chem Lett 14,
1703-7.

(56) Anderson, R. J., and Morris, J. C. (2001) Total synthesis of variolin B.
Tetrahedron Lett 42, 8697-8699.

(57) Anderson, R. J., and Morris, J. C. (2001) Studies toward the total synthesis of the
variolins: rapid entry to the core structure. Tetrahedron Lett 42, 311-313.

(58) Anderson, R. J., Hill, J. B., and Morris, J. C. (2005) Concise total syntheses of
variolin B and deoxyvariolin B. J Org Chem 70, 6204-12.

(59) Anderson, R. J. (2002) in Department of Chemistry, University of Canterbury,
Christchurch, NZ.

(60) Marsh, C. L. (2005) in Department of Chemistry, University of Canterbury,
Christchurch, NZ.

(61) Hill, J. B. (2005) in Department of Chemistry, University of Canterbury,
Christchurch, NZ.

(62) Saffer, J. D., and Glazer, R. I. (1981) Inhibition of histone H1 phosphorylation by
sangivamycin and other pyrrolopyrimidine analogues. Mol Pharmacol 20, 211-7.

(63) Davies, L. P., Jamieson, D. D., Baird-Lambert, J. A., and Kazlauskas, R. (1984)
Halogenated pyrrolopyrimidine analogues of adenosine from marine organisms:
pharmacological activities and potent inhibition of adenosine kinase. Biochem
Pharmacol 33, 347-55.

(64) Recchia, I., Rucci, N., Festuccia, C., Bologna, M., MacKay, A. R., Migliaccio, S.,
Longo, M., Susa, M., Fabbro, D., and Teti, A. (2003) Pyrrolopyrimidine c-Src
inhibitors reduce growth, adhesion, motility and invasion of prostate cancer cells
in vitro. Eur J Cancer 39, 1927-35.

(65) Recchia, I., Rucci, N., Funari, A., Migliaccio, S., Taranta, A., Longo, M.,
Kneissel, M., Susa, M., Fabbro, D., and Teti, A. (2004) Reduction of c-Src
activity by substituted 5,7-diphenyl-pyrrolo[2,3-d]-pyrimidines induces osteoclast
apoptosis in vivo and in vitro. Involvement of ERK1/2 pathway. Bone 34, 65-79.

(66) Sadowski, J., and Gasteiger, J. (1993) From Atoms And Bonds To 3-Dimensional
Atomic Coordinates - Automatic Model Builders. Chem Rev 93, 2567-2581.

(67)

(68)

(69)

(70)

(71)

(72)

(73)

(74)

(75)

(76)

(77)

Allen, J. G., and Danishefsky, S. J. (2001) The total synthesis of (+/-)-rishirilide
B. JAm Chem Soc 123, 351-2.

Neidle, S., Taylor, G. L., and Cowling, P. C. (1979) Crystal And Molecular-
Structure Of 8,2'-Cycloadenosine Trihydrate. Acta Crystallogr B 35, 708-712.

Freist, W., and Cramer, F. (1980) Phenylalanyl-tRNA, lysyl-tRNA, isoleucyl-
tRNA and arginyl-tRNA synthetases. Substrate speciﬁcity in the ATP/PPi
exchange with regard to ATP analogs. Eur J Biochem 107, 47-50.

Dhananjeyan, M. R., Milev, Y. P., Kron, M. A., and Nair, M. G. (2005) Synthesis
and activity of substituted anthraquinones against a human ﬁlarial parasite, Brugia
malayi. J Med Chem 48, 2822-30.

Burnouf, C., and Pruniaux, M. P. (2002) Recent advances in PDE4 inhibitors as
immunoregulators and anti-inﬂammatory drugs. Curr Pharm Des 8, 1255-96.

Klenke, B., Barrett, M. P., Brun, R., and Gilbert, 1. H. (2003) Antiplasmodial
activity of a series of 1,3,5-triazine-substituted polyamines. J Antimicrob
Chemother 52, 290-3.

Goble, F. C. (1950) Chemotherapy of experimental trypanosomiasis; trypanocidal
activity of certain bis (2-methyl-4-amino—6-quinolyl) amides and ethers. J
Pharmacol Exp Ther 98, 49-61.

Ruff, M., Krishnaswamy, S., Boeglin, M., Poterszman, A., Mitschler, A.,
Podjarny, A., Rees, B., Thierry, J. C., and Moras, D. (1991) Class II aminoacyl
transfer RNA synthetases: crystal structure of yeast aspartyl-tRNA synthetase
complexed with tRNA(Asp). Science 252, 1682-9.

Shiba, K., Stello, T., Motegi, H., Noda, T., Musier-Forsyth, K., and Schimmel, P.
(1997) Human lysyl-tRN A synthetase accepts nucleotide 73 variants and rescues
Escherichia coli double-defective mutant. J Biol Chem 272, 22809-16.

Cusack, S., Yaremchuk, A., and Tukalo, M. (1996) The crystal structure of the
ternary complex of T.thermophilus seryl-tRNA synthetase with tRNA(Ser) and a

seryl-adenylate analogue reveals a conformational switch in the active site. Embo
J15, 2834-42.

Burke, B., Yang, F., Chen, F., Stehlin, C., Chan, B., and Musier-Forsyth, K.
(2000) Evolutionary coadaptation of the motif 2--acceptor stem interaction in the
class II prolyl-tRNA synthetase system. Biochemistry 39, 15540-7.

(78) Yaremchuk, A., Tukalo, M., Grotli, M., and Cusack, S. (2001) A succession of
substrate induced conformational changes ensures the amino acid specificity of
Thermus thermophilus prolyl-tRNA synthetase: comparison with histidyl-tRNA
synthetase. J Mol Biol 309, 989-1002.

(79) Lenhard, B., Filipic, S., Landeka, I., Skrtic, I., Soll, D., and Weygand-Durasevic,
I. (1997) Defining the active site of yeast seryl-tRNA synthetase. Mutations in
motif 2 loop residues affect tRNA-dependent amino acid recognition. J Biol Chem
272, 1136-41.

(80) Wang, P., Vaidehi, N., Tirrell, D. A., and Goddard, W. A., 3rd. (2002) Virtual
screening for binding of phenylalanine analogues to phenylalanyl-tRNA
synthetase. J Am Chem Soc 124, 14442-9.

(81) Cramer, R. D., Patterson, D. E., and Bunce, J. D. (1988) Comparative Molecular-
Field Analysis (Comfa). 1. Effect Of Shape On Binding Of Steroids To Carrier
Proteins. J Am Chem Soc 110, 5959-5967.

(82) Kim, S. Y., and Lee, J. (2003) 3-D-QSAR study and molecular docking of
methionyl-tRNA synthetase inhibitors. Bioorg Med Chem 11, 5325-31.

(83) Finn, J., Mattia, K., Morytko, M., Ram, S., Yang, Y., Wu, X., Mak, E., Gallant,
P., and Keith, D. (2003) Discovery of a potent and selective series of pyrazole
bacterial methionyl-tRNA synthetase inhibitors. Bioorg Med Chem Lett 13, 2231-4.

(84) Lee, J., Kim, S. E., Lee, J. Y., Kim, S. Y., Kang, S. U., Seo, S. H., Chun, M. W.,
Kang, T., Choi, S. Y., and Kim, H. O. (2003) N-Alkoxysulfamide, N-
hydroxysulfamide, and sulfamate analogues of methionyl and isoleucyl
adenylates as inhibitors of methionyl-tRNA and isoleucyl-tRNA synthetases.
Bioorg Med Chem Lett 13, 1087-92.

(85) Jarvest, R. L., Erskine, S. G., Forrest, A. K., Fosberry, A. P., Hibbs, M. J., Jones,
J. J., O'Hanlon, P. J., Sheppard, R. J., and Worby, A. (2005) Discovery and
optimisation of potent, selective, ethanolamine inhibitors of bacterial phenylalanyl
tRNA synthetase. Bioorg Med Chem Lett 15, 2305-9.

(86) Bernier, S., Akochy, P. M., Lapointe, J., and Chenevert, R. (2005) Synthesis and
aminoacyl-tRNA synthetase inhibitory activity of aspartyl adenylate analogs.
Bioorg Med Chem 13, 69-75.

(87) Cavasotto, C. N., and Abagyan, R. A. (2004) Protein flexibility in ligand docking
and virtual screening to protein kinases. J Mol Biol 337, 209-25.

(88) Gorfe, A. A., and Caflisch, A. (2005) Functional plasticity in the substrate binding
site of beta-secretase. Structure 13, 1487-98.

(89) Gohlke, H., Kuhn, L. A., and Case, D. A. (2004) Change in protein flexibility
upon complex formation: analysis of Ras-Raf using molecular dynamics and a
molecular framework approach. Proteins 56, 322-37.

(90) Bursavich, M. G., and Rich, D. H. (2002) Designing non-peptide peptidomimetics
in the 21st century: inhibitors targeting conformational ensembles. J Med Chem
45, 541-58.

(91) Finer-Moore, J. S., Anderson, A. C., O'Neil, R. H., Costi, M. P., Ferrari, S.,
Krucinski, J., and Stroud, R. M. (2005) The structure of Cryptococcus
neoformans thymidylate synthase suggests strategies for using target dynamics for
species-specific inhibition. Acta Crystallogr D Biol Crystallogr 61, 1320-34.


Chapter 3

Optimizing variolin B and triazinylamine to improve their
binding affinity and specificity for Brugia AsnRS

3.1 Introduction

The preclinical drug discovery process can be broadly divided into two phases: lead
discovery and lead optimization (1, 2). Lead discovery typically involves different
flavors of high-throughput screening (HTS), in which large sections of chemical space
are sampled for biological activity. During lead optimization, the chemically feasible
hits obtained from HTS are subjected to synthetic modification to optimize activity.
The underlying rationale is that changing the molecular structure of a lead compound
can maximize its biological activity (3, 4). Lead optimization can be multipronged,
aiming to improve one or more of the following: biological properties (e.g., in vitro
and in vivo potency), physicochemical properties (e.g., logP, pKa), pharmaceutic
properties (e.g., solubility, crystallinity), or pharmacokinetic properties (absorption,
distribution, metabolism and elimination) (3, 5). The objective of the work presented
here is to optimize, using computational techniques, two of the top hits discovered
using SLIDE (6) to improve their affinity and specificity for Brugia AsnRS over human
AsnRS or other human proteins. The best analogs of these compounds, assessed using
the metrics described in later sections of this chapter, were recommended to our
medicinal chemistry collaborators for chemical synthesis.

Variolin B, a pyrrolopyrimidine, was identified by SLIDE as a potential inhibitor
and was confirmed to inhibit ~50% of Brugia AsnRS activity at a concentration of 50 µM
(6, 7). It has been reported to bind to a cyclin-dependent kinase (8), possibly explaining
its cytotoxicity. The SLIDE-predicted binding mode (Figure 3.1), supported by
quantitative structure-activity relationship (QSAR) analysis of inhibition assay results
for synthesized analogs, indicates that variolin B binds in the adenosyl pocket. We
hypothesize that attaching an asparagine side chain to the variolin scaffold will improve
its affinity for Brugia AsnRS relative to other ATP-binding proteins. New analogs of
variolin B were designed with the asparagine side chain attached at two different growth
points on its scaffold and investigated to see whether they could bind well in both the
adenosine and asparagine pockets of the binding site. Protein-ligand complementarity
scores and the difference in ligand internal energies (between the bound and free
conformations) were used to compare the docked orientations, if any, of these analogs.

Figure 3.1: (A) The 2D structure of variolin B. (B) The SLIDE-predicted binding mode
of variolin B, shown in atom-colored tubes (carbon, green; oxygen, red; nitrogen, blue),
is compared to ASNAMS bound in the crystal structure, shown in atom-colored tubes,
but with carbon atoms colored orange. The pendant ring of variolin B, containing the N6
atom, sits in the ribose pocket, with N6 oriented towards the asparagine pocket. Five
analogs with sulfamoyl-asparagine grown from two different positions (occupied by the N6
and O1 atoms) on the variolin scaffold were assessed.

Triazinylamine has a symmetric structure with two substituted triazine rings
connected by a phenyl linker. It was identified by SLIDE as a potential inhibitor and was
confirmed to inhibit 50% of Brugia AsnRS activity and 80% of human AsnRS activity at
a concentration of 25 µM. The SLIDE-predicted binding mode (Figure 3.2) indicates that
one of the triazine rings is isosteric with the adenine ring of ASNAMS and mimics its
interactions with the surrounding binding site residues as well. The bridging phenyl ring
is docked in the ribose pocket and is unable to make any polar interactions owing to its
hydrophobic character. The program BOMB (Biochemical and Organic Molecule
Builder, (9)) was used to grow user-defined substituents off of the truncated
triazinylamine scaffold, which retains only one of the two triazine rings (Figure 3.2.B).
The aims were to replace the phenyl linker with a polar ring, isosteric with the ribose in
ASNAMS, and to fill the small cavity in the AsnRS binding site where the C2 atom of
adenine binds. The analogs designed by BOMB were assessed using protein-ligand
complementarity scores and the difference in ligand internal energies.

Figure 3.2: (A) The 2D structure of triazinylamine is shown. (B) The truncated scaffold
used to design substituents (R1 and R2) is shown. (C) The SLIDE-predicted binding
mode of triazinylamine, shown in atom-colored tubes (carbon, green; oxygen, red;
nitrogen, blue), is compared to ASNAMS bound in the crystal structure, shown in atom-
colored tubes, but with carbon atoms colored orange. The phenyl ring in the middle,
linking the two triazine rings, sits in the ribose pocket. Combinations of R1 and R2
groups were assessed to see whether they can replace the isopropenyl and phenyl groups
in triazinylamine, respectively.

3.2 Methods

3.2.1 Designing new analogs and generating their structures

The analogs of variolin B were designed manually, while those of triazinylamine were
designed using an automated method. The most probable binding mode of variolin B
(Figure 3.1.B) indicates that it binds in the adenosyl pocket of Brugia AsnRS. Given that
the amino acid pocket of AsnRS is highly specific for asparagine, attaching a sulfamoyl-
asparagine side chain to the variolin scaffold is expected to improve its affinity and
specificity for Brugia AsnRS relative to other ATP-binding proteins. Starting from the
Cambridge Structural Database structure of the variolin B scaffold (CSD code: LEPWIM),
the 3D structures of the new analogs were built using CORINA (10), and multiple
low-energy conformers were then generated using Omega (OpenEye Software). Five analogs
of variolin with the sulfamoyl-asparagine attached to two different positions on the
variolin B scaffold (Figure 3.1.A) were designed.

The analogs of triazinylamine were designed using BOMB, a molecular
mechanics-based method that builds new structures by growing user-defined substituents
off of a given core structure. The truncated triazinylamine scaffold (Figure 3.2.B) was
used as the input structure to BOMB. BOMB generated one structure for each analog,
the one assessed as the best by its internal scoring function. However, to be consistent
with the optimization protocol used for variolin B, the conformational space available to
these analogs was explored by generating multiple low-energy conformers using Omega.

3.2.2 Scoring the interactions between designed analogs and Brugia AsnRS

The multiple low-energy conformers generated for analogs of both variolin B and
triazinylamine were screened and docked into the AsnRS binding site using SLIDE. For
scoring the interactions between the docked orientations of designed analogs and Brugia
AsnRS, we used a new version of SLIDE score (11) that assesses protein-ligand
complementarity using two scoring functions: SLIDE OrientScore to pick the correct
orientation and SLIDE AffiScore to predict the binding affinity. All three scoring
functions, SLIDE OrientScore, SLIDE AffiScore and DrugScore (12), reliably assess the
correct conformation, binding mode, and relative order of affinity of known AsnRS
ligands.

3.2.3 Assessing ligand internal energies

It has been observed that the bioactive, protein-bound conformation of a ligand often
differs from its minimum-energy conformation in a protein-free environment or in
solution. The energetic cost of the protein-induced ligand strain must be offset by the
free energy of binding. Hence, it is important to assess the difference in ligand internal
energies between the docked (bound) and minimum-energy conformations of the designed
analogs. An analog whose docked conformation shows a small difference in ligand internal
energy, and hence minimal strain, is desirable. We used the MMFF94s force field,
available with Omega, to assess the ligand internal energies of conformers of the
designed analogs (13).
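
As a minimal sketch of this strain measure, assuming only the MMFF94s energies reported in
Table 3.2 of this chapter, the difference can be computed as the bound-conformer energy minus
the minimum-energy-conformer energy. The class and function names below are illustrative and
are not part of Omega or SLIDE.

```python
from typing import NamedTuple


class ConformerEnergies(NamedTuple):
    """MMFF94s ligand internal energies in kcal/mol (values from Table 3.2)."""
    e_min: float    # minimum-energy (free) conformer
    e_bound: float  # docked (bound) conformer


def ligand_strain(e: ConformerEnergies) -> float:
    """Internal energy difference, bound minus minimum; lower is more favorable."""
    return round(e.e_bound - e.e_min, 1)


# Energies as reported in Table 3.2 (illustrative container, not a program output).
table_3_2 = {
    "ASNAMS": ConformerEnergies(e_min=-106.4, e_bound=-69.5),
    "Variolin B": ConformerEnergies(e_min=-117.9, e_bound=-95.0),
    "Analog V": ConformerEnergies(e_min=-89.6, e_bound=-11.7),
}

strains = {name: ligand_strain(e) for name, e in table_3_2.items()}

# List ligands from least to most strained.
for name, value in sorted(strains.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.1f} kcal/mol")
```

The computed differences reproduce the Diff. column of Table 3.2, with analog V showing the
largest predicted strain of the three ligands.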

3.3 Results and discussion

3.3.1 Variolin B analogs

Five analogs of variolin B were designed and investigated to see whether they could bind
well in both the adenosine and asparagine pockets of the Brugia AsnRS binding site. Three
of them had the sulfamoyl-asparagine attached to the N6 atom of the pendant ring of the
variolin B scaffold (Figure 3.1.A), while two of them had it attached at the O1 atom of
the scaffold. In analogs where the sulfamoyl-asparagine was attached at the O1 position,
the oxygen atom at that position was replaced by a nitrogen atom to form a more stable
bond. The 2D structures of all five analogs are shown in Table 3.1. Each of the five
analogs and its relative favorability is briefly discussed in the following sections.

Table 3.1: Designed analogs of variolin B with the sulfamoyl-asparagine (S-ASN)
attached to two different positions on the variolin scaffold (Figure 3.1.A).

[Table body: 2D structure drawings of analogs I, II, III, IV and V (truncated scaffold).]

Analogs I, II and III are similar overall and differ only in the chemical group present at
the O1 position on the variolin B scaffold (Figure 3.1.A). Analog I has an OMe group at
this position, while analog II has an OH group. Analog III does not have any group there
and hence is labeled as deoxy in Table 3.1.

SLIDE could not dock any of the conformers of analogs I, II or III in the Brugia
AsnRS binding site because the steric overlaps between the attached side chain and
residues in the adenosyl pocket could not be resolved. When the minimum-energy
conformers of each of these three analogs were docked manually in the AsnRS binding
site by superimposing them onto the binding mode of the variolin B scaffold shown in
Figure 3.1.B, the pendant ring was rotated by varying degrees compared to its position in
the unsubstituted scaffold, as shown in Figure 3.3. Because of the rotated positions of
the pendant ring, the attached sulfamoyl-asparagine was oriented in directions entirely
different from those intended. Ideally, the asparagine side chain in the designed
analogs should be docked as close as possible to that of ASNAMS. The rotated positions
of the pendant ring in these analogs could be due to electrostatic interactions between
the oxygen atoms of the attached sulfamoyl group and the planar tricyclic ring system,
including the chemical groups present at the O1 position (OMe in analog I and OH in
analog II). The conformers of analog IV, with the sulfamoyl-asparagine attached at the
O1 position, also could not be docked by SLIDE in the AsnRS binding site. When the
minimum-energy conformer of analog IV was docked manually in the AsnRS binding
site by superimposing it onto the binding mode of the variolin B scaffold, the asparagine
side chain, although oriented in the right direction, was still far from its counterpart in
ASNAMS, as shown in Figure 3.4.A. After detailed docking analysis performed both
manually and using SLIDE, we inferred that for the asparagine side chain to be docked in
the right orientation, the pendant ring has to be buried further in the ribose pocket, as
shown in Figure 3.4.B. However, when this ring is docked deep in the ribose pocket, there
are many steric overlaps with the neighboring residues that form a beta sheet, and these
cannot be resolved without altering the main-chain torsion angles. In view of the steric
problems posed by the pendant ring, the possibility of attaching the sulfamoyl-asparagine
side chain to a truncated variolin scaffold was explored, leading to the design of
variolin analog V (Table 3.1). SLIDE was able to dock one of the low-energy conformers
of analog V with a favorable orientation of the asparagine side chain, as shown in
Figure 3.5. Analog V, in its docked orientation, forms multiple hydrogen bonds with
several binding site residues. In addition to these favorable interactions, analog V is
also capable of displacing the buried crystallographic water molecule 28 (Figure 3.5).
Releasing a bound water molecule and mimicking its interactions reduces the entropic
cost and contributes favorably to the binding free energy. Protein-ligand complementarity
scores and the differences in ligand internal energies of known AsnRS ligands and
variolin analogs are reported in Table 3.2.

Figure 3.3: The SLIDE-predicted binding mode of variolin B, rendered as atom-colored
tubes (carbon, green; oxygen, red; and nitrogen, blue), along with ASNAMS bound in the
crystal structure, rendered as atom-colored tubes but with carbon atoms colored orange, is
shown in all the panels for reference. Binding modes of minimum-energy conformers of
designed variolin analogs, rendered as atom-colored tubes but with carbon atoms colored
yellow, assessed manually by superimposing onto the binding mode of the variolin B
scaffold, are also shown. (A) analog I, (B) analog II, and (C) analog III.

Figure 3.4: The SLIDE-predicted binding mode of variolin B, rendered as atom-colored
tubes (carbon, green; oxygen, red; and nitrogen, blue), along with ASNAMS bound in the
crystal structure, rendered as atom-colored tubes but with carbon atoms colored orange, is
shown in panel A for reference. The binding mode of the minimum-energy conformer of
variolin analog IV, rendered as atom-colored tubes but with carbon atoms colored yellow,
assessed manually by superimposing onto the binding mode of the variolin B scaffold, is
shown in panel A. The pendant ring of analog IV has to be buried further in the ribose
pocket in order to match the orientation of its attached asparagine side chain with that of
ASNAMS, as shown in panel B.

Figure 3.5: The SLIDE-predicted binding mode of variolin analog V, rendered as atom-
colored tubes (carbon, green; oxygen, red; and nitrogen, blue), is compared with
ASNAMS bound in the crystal structure, rendered as atom-colored tubes but with carbon
atoms colored orange. The hydrogen bonds between analog V and Brugia AsnRS
residues, rendered as atom-colored tubes, are shown as white dashed lines. Bound
crystallographic water molecules that mediate interactions between ASNAMS and
AsnRS are shown as cyan spheres. Analog V is capable of displacing water molecule
28 and mimics its interactions as well.

 

Table 3.2: Predicted protein-ligand complementarity scores and difference in ligand
internal energies of docked orientations of designed variolin analogs compared with
known ligands. The scores were computed for dockings into chain A of the
ASNAMS-bound Brugia AsnRS structure.

                                            Protein-ligand                MMFF94s ligand internal
                            Could be fit    complementarity scores (a)    energy (kcal/mol)
                            in the
Compound                    binding site?   OS (b)   AS (c)   DS (d)      Min. (e)  Bound (f)  Diff. (g)

ASNAMS (known ligand)       Yes             -10.88   -9.00    -7.25       -106.4    -69.5      36.9
Variolin B (known ligand)   Yes              -7.90   -9.24    -3.22       -117.9    -95.0      22.9
Analog I                    No                 -       -        -            -         -         -
Analog II                   No                 -       -        -            -         -         -
Analog III                  No                 -       -        -            -         -         -
Analog IV                   No                 -       -        -            -         -         -
Analog V                    Yes              -8.90   -8.10    -3.71        -89.6    -11.7      77.9

(a) More negative is more favorable.
(b) SLIDE OrientScore (OS) in kcal/mol.
(c) SLIDE AffiScore (AS) in kcal/mol.
(d) DrugScore (x10^5) (DS) in arbitrary units.
(e) Ligand internal energy of the minimum-energy conformer.
(f) Ligand internal energy of the bound conformer.
(g) Ligand internal energy difference between the bound and minimum-energy conformer. A lower
value is more favorable.

 

Analog V lacks the pendant ring of the variolin B scaffold (Figure 3.1.A), which results
in a loss of interactions in the ribose pocket. Given the large difference predicted in
ligand internal energy between the bound conformation and the minimum-energy
conformer of this analog (Table 3.2), replacing the truncated pendant ring with a
smaller, more flexible, polar ring may help offset the ligand strain by increasing the
analog's interactions with the protein.

3.3.2 Triazinylamine analogs

Combinations of R1 and R2 groups were tested by the BOMB program as substitutes for the
isopropenyl group and the bridging phenyl ring, respectively (Figure 3.2). Nine different
analogs designed by the program were investigated further to see if they could be docked
into the Brugia AsnRS binding site. Multiple low-energy conformers of these analogs
were screened using SLIDE, and the scores and ligand internal energies for the top-
scoring docked orientation of each are tabulated in Table 3.3. The substituents that
replace the phenyl ring of triazinylamine docked in the ribose pocket (R2 groups in Table
3.3) are all variants of pyrimidyl and furanyl linkers, as shown in Figure 3.6. The
difference in ligand internal energies between the bound and free conformations for all
the triazinylamine analogs varies from 2.2 to 4.6 kcal/mol. Favorable interactions
between the docked orientations of these compounds and AsnRS binding site residues
can readily overcome the energetic cost of ligand strain in this range. Both pyrimidyl
and furanyl linkers are rigid ring structures, unlike the puckered ribose ring in ASNAMS,
which binds to AsnRS in a C3'-endo conformation. More flexible linkers for the
ribose pocket would be preferable to the limited number of heterocyclic
compounds in the R-group library used by BOMB.
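
The quoted strain range can be checked directly against the Diff. column of Table 3.3. The
following is a small sketch that assumes only those nine reported differences; the dictionary
name and layout are illustrative.

```python
# Ligand internal energy differences (bound minus minimum, kcal/mol) for the nine
# triazinylamine analogs, keyed by compound number, as reported in Table 3.3.
diff_kcal_per_mol = {
    1: 3.0, 2: 3.2, 3: 2.2, 4: 3.7, 5: 4.6,
    6: 2.6, 7: 3.1, 8: 4.0, 9: 4.4,
}

least_strained = min(diff_kcal_per_mol, key=diff_kcal_per_mol.get)
most_strained = max(diff_kcal_per_mol, key=diff_kcal_per_mol.get)
strain_range = (diff_kcal_per_mol[least_strained], diff_kcal_per_mol[most_strained])
mean_strain = sum(diff_kcal_per_mol.values()) / len(diff_kcal_per_mol)

print(f"range: {strain_range[0]} to {strain_range[1]} kcal/mol")
print(f"least strained: compound {least_strained}; most strained: compound {most_strained}")
print(f"mean: {mean_strain:.1f} kcal/mol")
```

All nine values fall well below the 22.9 and 36.9 kcal/mol differences reported for variolin B
and ASNAMS in Table 3.2.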


Table 3.3: Predicted protein-ligand complementarity scores and difference in ligand
internal energies of docked orientations of designed triazinylamine analogs. These
analogs were designed using the BOMB program by testing combinations of R1 and R2
groups on the truncated triazinylamine scaffold (Figure 3.2.B). The scores were
computed for dockings into chain A of the ASNAMS-bound Brugia AsnRS structure.

                                     Protein-ligand                MMFF94s ligand internal
                                     complementarity scores (a)    energy (kcal/mol)

Compound   R1       R2 ring (h)      OS (b)   AS (c)   DS (d)      Min. (e)  Bound (f)  Diff. (g)

1          CONH2    pyrimidyl        -8.80    -8.10    -4.03       -132.6    -129.5     3.0
2          COOH     pyrimidyl        -8.3     -8.4     -3.65       -121.9    -118.7     3.2
3          OH       pyrimidyl        -7.7     -8.5     -3.86       -291.2    -289.0     2.2
4          CONH2    pyrimidyl        -8.0     -7.4     -3.79        -54.1     -50.4     3.7
5          OH       pyrimidyl        -7.7     -7.5     -3.32       -141.9    -137.3     4.6
6          CH2OH    pyrimidyl        -7.6     -6.7     -3.81       -127.4    -124.8     2.6
7          OH       furanyl          -8.1     -8.1     -3.43       -254.6    -251.5     3.1
8          CH2OH    furanyl          -8.1     -7.7     -2.94       -162.1    -158.1     4.0
9          OH       furanyl          -8.3     -7.4     -2.66       -220.3    -215.9     4.4

(a) More negative is more favorable.
(b) SLIDE OrientScore (OS) in kcal/mol.
(c) SLIDE AffiScore (AS) in kcal/mol.
(d) DrugScore (x10^5) (DS) in arbitrary units.
(e) Ligand internal energy of the minimum-energy conformer.
(f) Ligand internal energy of the bound conformer.
(g) Ligand internal energy difference between the bound and minimum-energy conformer. A lower
value is more favorable.
(h) Each R2 group is a ring bearing an -SO2NH2 substituent; the ring type is taken from the
caption of Figure 3.6. [The 2D structure drawings of the R2 groups are not reproduced here.]

Figure 3.6: The SLIDE-predicted binding modes of designed analogs of triazinylamine,
rendered as atom-colored tubes (carbon, green; oxygen, red; and nitrogen, blue), are
compared with ASNAMS bound in the crystal structure, rendered as atom-colored tubes
but with carbon atoms colored orange. The compound names and 2D structures of the R2
(Figure 3.2.B) groups, docked in the ribose pocket, are tabulated in Table 3.3. (A)
compound 1, (B) compound 2, (C) compound 3, (D) compound 4, (E) compound 5, (F)
compound 6, (G) compound 7, (H) compound 8, and (I) compound 9. The first six
compounds have six-membered pyrimidyl rings, while the last three have five-membered
furanyl rings docked in the ribose pocket.

The R2 groups in compounds 1, 6 and 7 (Table 3.3 and Figure 3.6) occupy roughly
the same volume as ribose in ASNAMS and also mimic some of its hydrogen
bonds with AsnRS. These compounds have good protein-ligand complementarity scores
and relatively low ligand strain, as indicated by their difference in ligand internal energies.
In addition, the R1 groups of compounds 1 and 6 are capable of displacing the buried
crystallographic water molecule 20 (Figure 3.5), forming direct hydrogen bonds with
most of the surrounding AsnRS residues. Meta-substituted R2 rings seem to allow better
fits in the adenine and ribose regions than ortho-substituted R2 rings.

All the substituents in the ribose pocket carry an -SO2NH2 (sulfonamide) linker,
which could be used as a growth point for an asparagine side chain.
Compounds 4 and 8 are attractive because their sulfonamide linkers are docked
very close to that in ASNAMS. In addition, their R1 groups also displace the buried
crystallographic water molecule 20 (Figure 3.5) and mimic all its interactions with the
binding site residues. However, their R2 groups only partially occupy the ribose pocket,
making them less effective as ribose mimics.

Overall, compounds 1 and 9 look best, as they could displace a buried water
molecule as well as fit the ribose pocket with greater similarity to ribose in ASNAMS,
in terms of both interactions and volume occupied. The R1 group in
compound 9 (-CH2OH) is less bulky and more flexible than that of compound 1
(-CONH2). A more flexible R1 substituent is desirable, as it could provide more options
for making optimal interactions with the binding site residues. However, compound 1 has
better protein-ligand complementarity scores than compound 9, implying better
interactions between its docked orientation and the binding site residues.
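
One way to make the comparison of the two compounds' scores explicit is a simple dominance
check: one compound is taken to score better if it is at least as favorable (more negative) on
every scoring function and strictly better on at least one. The sketch below assumes the
compound 1 and compound 9 scores as read from Table 3.3 (with the DrugScore of compound 1
taken as -4.03); the dominance criterion and the function name are illustrative and are not
part of SLIDE or DrugScore.

```python
# Scores as read from Table 3.3; more negative is more favorable.
# OS = SLIDE OrientScore, AS = SLIDE AffiScore, DS = DrugScore (x10^5).
scores = {
    "compound 1": {"OS": -8.80, "AS": -8.10, "DS": -4.03},
    "compound 9": {"OS": -8.3, "AS": -7.4, "DS": -2.66},
}


def dominates(a: dict, b: dict) -> bool:
    """True if a is no worse than b on every score and strictly better on at least one."""
    no_worse = all(a[key] <= b[key] for key in a)
    strictly_better = any(a[key] < b[key] for key in a)
    return no_worse and strictly_better


if dominates(scores["compound 1"], scores["compound 9"]):
    print("compound 1 scores better than compound 9 on all three functions")
```

A check of this kind compares scores only; it does not capture the flexibility argument made
for the R1 group of compound 9.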


3.4 Conclusions

A series of analogs was designed for two of the top Brugia AsnRS inhibitors discovered
using SLIDE. For variolin analogs, the pendant ring in the scaffold made it difficult to
fit the attached sulfamoyl-asparagine, with minimal strain, in the highly specific
asparagine pocket of the AsnRS binding site. Variolin analog V was assessed as the best
analog, as its SLIDE-predicted binding mode indicates that the sulfamoyl-asparagine,
attached at the O1 position of the truncated variolin scaffold, could not only fit in the
asparagine pocket, making favorable interactions, but is also capable of displacing a
buried crystallographic water molecule in that pocket. The chemical synthesis of variolin
analog V, along with analogs II and IV, is being pursued in the laboratory of
Dr. Jonathan Morris (University of Adelaide, Adelaide, Australia), our medicinal
chemistry collaborator. For compounds designed from the truncated triazinylamine
scaffold, compounds 1 and 9 were assessed as the best candidates to be pursued further
for chemical synthesis. The SLIDE-predicted binding modes for these compounds indicate
better isostericity in the ribose pocket and the ability to displace a buried
crystallographic water molecule in the adenine pocket of the AsnRS binding site. The
chemical synthesis of the starting compound, the truncated triazinylamine scaffold, is
being pursued in the laboratory of Dr. Morten Grotli (University of Goteborg, Goteborg,
Sweden), our medicinal chemistry collaborator. Upon successful synthesis and
purification of the starting compound, chemical synthesis of compounds 1 and 9 will be
pursued.

References

(1) Kenakin, T. (2003) Predicting therapeutic value in the lead optimization phase of
drug discovery. Nat Rev Drug Discov 2, 429-38.

(2) Hajduk, P. J., and Greer, J. (2007) A decade of fragment-based drug design:
strategic advances and lessons learned. Nat Rev Drug Discov 6, 211-9.

(3) Pickett, S. D., McLay, I. M., and Clark, D. E. (2000) Enhancing the hit-to-lead
properties of lead optimization libraries. J Chem Inf Comput Sci 40, 263-72.

(4) Joseph-McCarthy, D., Baber, J. C., Feyfant, E., Thompson, D. C., and Humblet,
C. (2007) Lead optimization via high-throughput molecular docking. Curr Opin
Drug Discov Devel 10, 264-74.

(5) Stahl, M., Guba, W., and Kansy, M. (2006) Integrating molecular design
resources within modern drug discovery research: the Roche experience. Drug
Discov Today 11, 326-33.

(6) Sukuru, S. C., Crepin, T., Milev, Y., Marsh, L. G., Hill, J. B., Anderson, R. J.,
Morris, J. C., Rohatgi, A., O'Mahony, G., Grotli, M., Danel, F., Page, M. G.,
Hartlein, M., Cusack, S., Kron, M. A., and Kuhn, L. A. (2006) Discovering new
classes of Brugia malayi asparaginyl-tRNA synthetase inhibitors and relating
specificity to conformational change. J Comput Aided Mol Des 20, 159-78.

(7) Anderson, R. J., Hill, J. B., and Morris, J. C. (2005) Concise total syntheses of
variolin B and deoxyvariolin B. J Org Chem 70, 6204-12.

(8) Bettayeb, K., Tirado, O. M., Marionneau-Lambot, S., Ferandin, Y., Lozach, O.,
Morris, J. C., Mateo-Lozano, S., Drueckes, P., Schachtele, C., Kubbutat, M. H.,
Liger, F., Marquet, B., Joseph, B., Echalier, A., Endicott, J. A., Notario, V., and
Meijer, L. (2007) Meriolins, a new class of cell death inducing kinase inhibitors
with enhanced selectivity for cyclin-dependent kinases. Cancer Res 67, 8325-34.

(9) Jorgensen, W. L., Ruiz-Caro, J., Tirado-Rives, J., Basavapathruni, A., Anderson,
K. S., and Hamilton, A. D. (2006) Computer-aided design of non-nucleoside
inhibitors of HIV-1 reverse transcriptase. Bioorg Med Chem Lett 16, 663-7.

(10) Sadowski, J., and Gasteiger, J. (1993) From Atoms And Bonds To 3-Dimensional
Atomic Coordinates - Automatic Model Builders. Chem Rev 93, 2567-2581.

(11) Zavodszky, M. I., Sanschagrin, P. C., Korde, R. S., and Kuhn, L. A. (2002)
Distilling the essential features of a protein surface for improving protein-ligand
docking, scoring, and virtual screening. J Comput Aided Mol Des 16, 883-902.

(12) Gohlke, H., Hendlich, M., and Klebe, G. (2000) Knowledge-based scoring
function to predict protein-ligand interactions. J Mol Biol 295, 337-56.

(13) Bostrom, J., Greenwood, J. R., and Gottfries, J. (2003) Assessing the performance
of OMEGA with respect to retrieving bioactive conformations. J Mol Graph
Model 21, 449-62.

Chapter 4

Automated shape and chemistry comparison for defining
binding site invariants and specificity determinants

4.1 Abstract

Structure-based virtual screening methods have been successful in identifying new
ligands for a protein target. However, it is challenging to use such methods to identify
new ligands that will bind strongly to one protein relative to another. This is an
important factor in designing selective inhibitors (or agonists) as reagents for probing
biochemistry, or as drug lead compounds. We have developed an automated approach,
CompSite, to identify specificity determinants in one protein relative to another protein
to aid in structure-based virtual screening and design. SLIDE, our screening and docking
software, generates a binding site template to represent sites where a ligand can make
favorable hydrogen-bond and hydrophobic interactions with atoms in the protein binding
site. Using complete-linkage clustering of superimposed templates, shared interaction
sites that are chemically similar or different between the templates are identified.
Steric difference sites between the templates are also identified by checking for van der
Waals overlaps, to indicate pockets for ligand binding available in one protein relative
to another. We applied this method to different pairs of proteins that bind to the same
compound, to see if the identified chemical difference sites could explain the
experimentally measured relative binding affinities. In addition, we applied this
method to identify the binding site invariants and differences between Brugia malayi
asparaginyl-tRNA synthetase (AsnRS), an anti-parasitic drug target, and a set of diverse
ATP-binding proteins. The ATP-binding proteins were broadly divided into two
categories based on the way they bind to the adenine moiety, with the exocyclic amine
N6 buried in the binding site (N6-in) or exposed to the solvent (N6-out). Combining our
CompSite method with virtual screening protocols provides a promising approach for
screening prospective ligands to identify those most likely to be specific to a protein
target. The results obtained from our method can also be used to suggest modifications to
known ligands that improve their specificity for the target protein relative to other
protein(s). Examples are shown for serine proteases, protein kinases and our drug
discovery target, Brugia AsnRS.

4.2 Introduction

Successful applications of protein structure-based virtual screening have been well
documented in several recent reviews (1-5). The universally acknowledged
major bottlenecks in virtual high-throughput screening methods are identifying and
ranking true hits from a large pool of small organic molecules (6-8). Recent algorithmic
advances aimed at addressing those problems have improved hit rates and
have identified potent drug leads that were predicted and experimentally validated to bind
strongly to the target protein (9, 10). However, even some of the potent lead compounds
discovered by structure-based screening are beset by drawbacks in the early stages of
drug discovery, one of the biggest being the limited ability of the compound to discriminate
between the targeted protein structure and its structural homologs. Protein kinases are
an obvious example because more than 500 homologous kinases have been identified in
the human genome (11), and many of them are rational drug targets for various
diseases including cancer, diabetes and inflammation (12). Some of the promising
ATP-site kinase inhibitors, including those that were approved or were in clinical
development, have been reported to have poor specificity profiles (13). The problem of
selectivity is not limited to protein kinases but has also been reported for other common
drug targets such as serine proteases (14), nuclear receptors (15) and matrix
metalloproteinases (MMPs) (16). One of the new classes of inhibitors of Brugia malayi
asparaginyl-tRNA synthetase (AsnRS), discovered in our lab (17), has recently been
reported to bind to a cyclin-dependent kinase (18). This lack of selectivity may explain
its cytotoxicity observed in human cell lines.

The problem of ligand selectivity can be addressed computationally without
affecting the quality of docking or the speed of virtual screening in current structure-based
methods, by filtering docking results to select those molecules that match
pre-identified specificity determinants using an algorithm like CompSite. A few other
published methods follow this paradigm of treating computational selectivity modeling
as a separate module that can be integrated into the overall virtual screening and
docking protocol. Kastenholz and co-workers developed a
method in which the molecular interaction fields generated by the program GRID (19,
20) for the binding sites of several target protein structures are analyzed with consensus
principal component analysis to obtain contour plots identifying regions that are
important for selectivity in the chosen target protein (21). The GRID/PCA technique was
adapted by Braiuca and co-workers to account partially for protein flexibility, so that
it could predict selectivity differences caused by amino-acid residue
differences not only in the active site but also in regions that do not interact directly
with the ligand (22). Sheridan and co-workers developed a mathematically simpler
method, FLOGTV (23), that uses the trend vector paradigm (24) to compare the binding
site field maps generated by the program FLOG (25) and to visualize the differences in
closely related proteins that have been superimposed in a reasonable way. Deng and
co-workers developed the structural interaction fingerprint method, which reduces the
three-dimensional structural binding information of protein-ligand complexes to
corresponding one-dimensional (1D) binary strings (26). A hierarchical clustering
analysis of these binary strings helps identify similarities and diversities between their
small-molecule binding interaction patterns, and facilitates post-docking analysis to
organize and filter screening results. The same authors modified their method to develop
a strategy for designing protein target-focused chemical libraries by using the desirable
protein-ligand interactions as filtering constraints (27). By modifying the virtual
screening protocol to introduce essential protein-ligand interactions, identified by prior
findings, as constraints during the docking stage, Perola reported a significant reduction
in the false positive rates in kinase virtual screens (28). Beyond the structure-based
methods briefly covered here, different sequence-based and ligand-based approaches to
modeling ligand selectivity in drug design have been reviewed recently by Ortiz and
co-workers (29).

My goal was to develop a method that can automatically identify differences in
preferred ligand chemistry, or differences in pockets available for ligand binding, to aid
structure-based drug design and incorporate specificity determinants in virtual screening.
Instead of focusing on an atomistic protein comparison approach, we focus on the
implications of protein differences for the ligand binding space. For instance, if an amino
group were interchanged with a hydroxyl group on the other side of the binding site, this
would appear to be a significant difference between the two proteins being compared.
However, the interchange of the two groups could create a substantially similar
environment in the ligand binding space: the presence of hydrogen-bond donors and
acceptors at similar distances and angles. Here we implement a ligand-space chemistry
and shape comparison tool, CompSite, and test its ability to explain experimentally
observed specificity differences in protein homologs.

Our structure-based screening and docking tool SLIDE (30-34), which efficiently
screens large databases of small organic molecules while accommodating
protein and ligand side-chain flexibility, generates knowledge-based
templates to represent the protein binding sites. Templates are essentially detailed
pharmacophores representing the favored spatial distribution of ligand chemistry from the
protein’s point of view. The key to our method is the underlying mechanism for finding the
shared interaction sites in the proteins, achieved by complete-linkage clustering of their
SLIDE-generated templates. Complete-linkage clustering provides an objective technique
to resolve superimposed interaction template points into the most highly occupied set of
sites of fixed radius. This approach has been used earlier to locate consensus water sites
in serine proteases (35). The results of the clustering process depend on the clustering
threshold chosen and the accuracy of the superposition method used to bring the
templates into the same reference frame. I will elaborate on those
aspects in the “Materials and methods” section of this chapter. Each of the shared
interaction sites identified in the clustering process consists of the individual template
points, representing one or more of the proteins included as input to the algorithm, that
make up the cluster at that site. Post-clustering processing is performed to classify the
shared interaction sites as either chemically similar or chemically different, based on the
preferred ligand chemistry represented by the clustered template points. To
assess the quantitative importance of any differences identified, we scored these sites
using DrugScore, a knowledge-based scoring function developed to score the binding
geometry of ligands in proteins (36). DrugScore converts structural information obtained
from the Protein Data Bank (PDB) into distance-dependent preferences for pairs of
interacting protein and ligand atoms. Since our method finds the shared interaction sites
where potential ligand atoms can interact with the respective proteins, DrugScore is
suited to score those sites and confirm any chemical differences. The chemical difference
sites were further validated using LigPlot (37) to compute the interactions at each shared
interaction site with each protein. In addition to the chemical difference sites, steric
difference sites between two proteins are also identified, independent of the clustering
process, by checking for van der Waals overlaps between the template points of one
protein and the atoms of the other protein. Steric difference sites are useful in indicating
pockets available for ligand binding in one protein relative to another and are
automatically validated by checking for van der Waals overlaps. The two distinct
advantages of our fast, modular method are the ease with which it can be integrated into
structure-based virtual screening and docking protocols, independent of the
availability of known ligands, and its usefulness in guiding lead optimization efforts by
suggesting synthetic modifications that could improve the specificity of the lead
compounds.

We applied this method to two sets of proteins, assembled to address these
questions: i) can we explain the known relative binding affinities of a ligand that binds to
two different proteins? and ii) can we identify specificity determinants between Brugia
AsnRS and a set of other ATP-binding proteins? To address the first question, we
assembled four protein pairs taken from the AffinDB database (38). Each homologous
protein pair was bound to the same ligand, and both the crystal structure of each complex
and the binding affinities were available. Henceforth, we will call this set the AffinDB
set. We addressed the second question in the context of Brugia AsnRS and other
ATP-binding proteins. To obtain ligands specific to AsnRS, it is important to find the
chemical and steric difference sites that can be exploited to identify or design ligands that
discriminate between AsnRS and other ATP-binding sites. For this, we assembled a set of
ATP-binding proteins (the “ATP set”) that was further subdivided based on the way they
bind the adenine moiety. Results obtained by applying our method to this set elucidate the
significant similarities and differences.


4.3 Materials and methods

4.3.1 Representing the protein binding sites

The first step in our algorithm (Figure 4.1) is to generate the templates that represent the
protein binding sites using our SLIDE software (30-32). SLIDE uses distance geometry
to screen and dock ligand candidates into the template of the target protein. A template
consists of points identified as the most favorable positions for ligand atoms to form
hydrogen bonds or make hydrophobic interactions with the neighboring protein atoms
(33). The template points are given chemistry labels as donors, acceptors, doneptors
(donor and/or acceptor) or hydrophobic, depending upon the type of interaction that a
potential ligand atom at that site would make with the protein. An acceptor template
point, for example, is located near a donor protein atom, such as the backbone amide
nitrogen, and represents a favorable placement for a ligand acceptor atom. A
doneptor (donor/acceptor) point is defined in two cases: when a ligand atom at that site
could make favorable hydrogen bonds with separate hydrogen-bond donor and acceptor
atoms in the protein, or when it could interact with a group that both donates and accepts
hydrogen bonds (e.g., -OH in the side chains of Ser, Thr, or Tyr). Geometrically favored
interaction sites for ligand hydrogen-bonding atoms are assigned based on the distance
and angle to protein hydrogen-bonding partners. The parameters for optimal
hydrogen-bonding geometry were taken from the literature (39, 40). Hydrophobic
template points are placed near significantly hydrophobic protein surface patches that
complement the hydrophobic groups in ligands for a number of 3D protein-ligand
complexes (41).

[Figure 4.1 flowchart. Steps:
(1) Generate the templates for the binding sites of the proteins.
(2) Bring the proteins into the same reference frame by ligand-based or protein
backbone-based superposition.
(3) Cluster the template points using complete-linkage clustering.
(4a) Post-clustering processing to identify (chemically) similar sites: prune sites that
have more than one point from a given template to keep the one closest to the cluster
centroid; is the dominant chemical type the same in at least 2/3 of the templates
clustered? If the cluster passes all the previous conditions, then it is a chemically
similar site.
(4b) Post-clustering processing to identify (chemical) difference sites between a pair of
proteins: if there are any two points in the cluster, one from each template, that are
chemically similar, this is not a chemical difference site. If the cluster passes all the
previous conditions, then it is a chemical difference site.]

Figure 4.1: A flowchart describing steps for defining chemically similar or different sites

A dense template generated by SLIDE finely samples the protein binding site surface and
hence gives a better representation of the chemistry of the protein, which not only
improves the identification, docking and scoring of ligands (33), but also allows us to
compare different binding sites of similar proteins. The binding site limits (and hence the
template volume) are user-defined by creating a 3D box. A known ligand, if available,
can serve as a starting point to generate the box; in this case the extreme (x, y, z)
coordinates of the ligand atoms become the two opposite corners of the box. For the
proteins selected for our analysis, the co-crystallized ligands were used to define the
volume of the binding site.
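The box construction from a known ligand can be illustrated with a minimal sketch. It assumes that the two opposite corners are the per-axis minimum and maximum of the ligand atom coordinates; the function name `ligand_bounding_box` and the sample coordinates are hypothetical and are not taken from SLIDE.

```python
def ligand_bounding_box(ligand_coords):
    """Return the two opposite corners of an axis-aligned box around a ligand.

    ligand_coords: iterable of (x, y, z) tuples, one per ligand atom.
    Assumption: the corners are the per-axis minima and maxima.
    """
    xs, ys, zs = zip(*ligand_coords)
    lower = (min(xs), min(ys), min(zs))
    upper = (max(xs), max(ys), max(zs))
    return lower, upper


# Made-up coordinates for three ligand atoms (in Angstroms).
atoms = [(1.0, 2.0, 3.0), (4.0, 0.0, 5.0), (2.0, 6.0, 1.0)]
lower, upper = ligand_bounding_box(atoms)
print(lower, upper)
```

In practice a margin around the ligand may be desirable so that template points just outside the ligand envelope are retained; the text does not specify one, so none is applied here.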

4.3.2 Superposition to bring the templates in the same reference frame

Once the templates representing the binding sites are generated using SLIDE, they have
to be transformed into the same reference frame before we cluster them to identify the
shared interaction sites. This can be achieved by superimposing the proteins (from which
the templates are derived) using either superposition of the bound (co-crystallized)
ligands or least-squares superposition of the protein main chain. The choice of
superposition method depends on two things: first, whether the structure of the bound
ligand is available at all and, second, the purpose of clustering the binding site
templates and the questions being addressed. When no structural information is available
for ligands that are known to bind to the proteins being compared, we have to bring
them into the same reference frame using main-chain least-squares superposition. On the
other hand, when structures of the proteins being compared are available with their bound
ligands, either of the two superposition methods can be chosen. If
the objective of clustering the templates is to identify specificity determinants in order to
optimize a ligand (or a scaffold) that is known to hit two or more proteins, such that it
becomes more specific to one protein relative to the other protein(s), then ligand-based
superposition should be used. If the objective is to find out what makes two or more
proteins that have a degree of sequence and structural similarity bind to different ligands,
and whether there is any as yet unknown ligand (or scaffold) that can bind to both or all
of them, then protein main-chain least-squares superposition should be used. In our case,
protein structures were either bound to the same ligand (AffinDB set) or to ligands that
shared a common scaffold (adenine in the ATP set), and hence we used ligand-based
superposition before clustering the templates. To reiterate, the AffinDB set was
assembled to address whether our method can explain the known relative
binding affinities of a ligand that binds to two different proteins, while the ATP set was
assembled to address whether our method can identify specificity
determinants between Brugia AsnRS and a set of other ATP-binding proteins. The
sensitivity of clustering to superpositional accuracy is discussed in the subsection
entitled “Clustering sensitivity to superpositional accuracy”.

4.3.3 Complete-linkage clustering to identify the shared interaction sites

Complete-linkage clustering was chosen to identify the shared interaction sites of the
superimposed templates. It provides an objective technique not only to detect the
template points (representing different proteins) that occupy the same site but also to
resolve overlapped sites into dense microclusters representing shared interaction sites.
Complete-linkage clustering allows us to define the maximum diameter (or threshold)
for any cluster (helping control the separation between the cluster centroids), and it
produces the most compact, densely occupied set of clusters for that diameter. The
complete-linkage clustering algorithm has been used in our lab earlier to identify
consensus water sites in different serine protease structures (35). The algorithm is
explained using a hypothetical case in Figure 4.2, in which 5 template points are
clustered. Three of these template points (labeled 1, 2 and 3 in Figure 4.2) belong to one
protein while the other two (labeled 4 and 5 in Figure 4.2) belong to the other protein.
The first step is to compute the distances between all pairs of template points, as shown
in Figure 4.2.A. The complete-linkage clustering algorithm begins by placing the two
closest points together in a cluster, provided they are separated by a distance lower than
the clustering threshold (chosen to be 1.3 Å for our analysis). In our example, points 1
and 5 form the first cluster, as shown in Figure 4.2.B. Each subsequent iteration of the
algorithm finds the next closest pair. The distance between a point and a cluster is
defined as the maximum distance between that point and all the elements in the cluster.
In our example, points 2 and 4, which form cluster II (Figure 4.2.B), are at distances of
1.8 Å and 1.9 Å, respectively, from point 3. Therefore, the distance between cluster II and
point 3 is 1.9 Å (the greater of the two distances), as shown in Figure 4.2.B. This feature
of complete-linkage clustering ensures that all the pairwise distances within a given
cluster meet the distance threshold and hence results in the most compact clusters for
that threshold.

Figure 4.2: The complete-linkage clustering algorithm is briefly explained using a
hypothetical example of 5 interaction sites (template points), of which 1, 2 and 3
(rendered as solid spheres) belong to one protein and 4 and 5 (rendered as spheres with a
mesh surface) belong to another protein. They are colored by the type of interaction that a
ligand atom can make at each site: red for hydrogen bond acceptor, blue for hydrogen
bond donor, and green for hydrophobic. The clustering threshold for this example is 1.3
Å, the same as used in our analysis. The clustering process begins by computing all
pairwise distances, as shown in A. The two closest sites are clustered together provided
their separation is below the clustering threshold. Hence, sites 1 and 5, separated by
1.1 Å, form the first cluster (enclosed in a white box), as shown in B. Each subsequent
iteration of the algorithm finds the next closest pair. Sites 2 and 4, separated by 1.2
Å (below the threshold), are the next closest pair and hence form the second
cluster in this example. The distance between a site and a cluster is defined as the
maximum distance between that site and all the elements in the cluster. For example, the
distance between cluster II (formed by sites 2 and 4) and site 3 is 1.9 Å,
defined by the distance (shown as a dotted line in B) between site 4 and site 3. The
iterative process is repeated until no further elements can be clustered without exceeding
the threshold distance. Sites that are not yet clustered because of the clustering threshold
are considered to define single-point or singlet clusters, e.g. site 3 forms singlet cluster III
in B. Once these clusters are identified, the post-clustering processing identifies
chemically similar sites (e.g. cluster I shown in B is a similar (acceptor) site) and
chemical difference sites (e.g. cluster II shown in B is a difference (polar-phobic) site).

This iterative process is repeated until no further elements can be clustered without
exceeding the threshold distance. Any points that are not yet clustered because of the
clustering threshold are considered to define single-point or singlet clusters, e.g. point 3
shown in Figure 4.2.B.
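The procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the CompSite implementation: the function name, the treatment of a separation exactly equal to the threshold (merged here), and the coordinates are assumptions. The five hypothetical points are placed on a line so that they reproduce the grouping of Figure 4.2 (points 1 and 5 separated by 1.1 Å, points 2 and 4 by 1.2 Å), but not all of its distances.

```python
import math
from itertools import combinations


def complete_linkage(points, threshold):
    """Greedy complete-linkage clustering with a maximum cluster diameter.

    points: dict mapping a label to an (x, y, z) tuple.
    Returns a sorted list of clusters, each a sorted list of labels.
    """
    clusters = [[label] for label in points]

    def linkage(a, b):
        # Complete linkage: the largest distance between any member of a
        # and any member of b.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters under the complete-linkage distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        if linkage(a, b) > threshold:
            break  # no further merge is possible without exceeding the threshold
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical coordinates (Angstroms); labels follow Figure 4.2.
points = {
    1: (0.0, 0.0, 0.0),
    2: (5.0, 0.0, 0.0),
    3: (3.2, 0.0, 0.0),
    4: (6.2, 0.0, 0.0),
    5: (1.1, 0.0, 0.0),
}
print(complete_linkage(points, threshold=1.3))  # [[1, 5], [2, 4], [3]]
```

Because the linkage distance is the maximum pairwise distance, every pair of points inside a merged cluster is guaranteed to lie within the threshold, which is the property the text relies on. Point 3 remains a singlet cluster.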

A clustering threshold of 1.3 Å was chosen for our analysis. The rationale behind
defining a particular clustering threshold is to choose a distance that ensures that
only spatially overlapping template points (from two or more proteins) are clustered. The
minimum non-overlapping separation between the centers of two atoms with a typical
van der Waals radius of 1.4 Å is 2.8 Å. When the separation between their centers is 1.3
Å, however, the two atoms overlap by 1.5 Å. Of course, for atoms with a van der
Waals radius greater than 1.4 Å, the overlap will be even greater. Choosing a clustering
threshold of 1.3 Å ensures that the clustered template points, representing potential
ligand atoms, will be spatially overlapping and, in most cases, prevents two template
points from the same protein from being clustered in the same interaction site. Rarely,
the fine sampling of hydrogen-bonding points in the templates results in two H-bonding
points from the same protein being placed in one cluster. This is explicitly considered in
defining chemically similar and different sites.
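As a worked check of the overlap quoted above, write r for the van der Waals radius and d for the separation between the two centers (symbols introduced here for illustration only):

```latex
\[
\text{overlap} = 2r - d = 2\,(1.4\ \text{\AA}) - 1.3\ \text{\AA} = 1.5\ \text{\AA}
\]
```

Any separation below 2r = 2.8 Å gives a positive overlap, so at the 1.3 Å threshold slightly more than half of the contact distance is overlapped.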


4.3.4 Post-clustering processing to identify similar and chemical difference sites

After the shared interaction sites are identified by clustering, each one is
processed computationally to check whether it can be classified as either a similar site
or a chemical difference site, based on the template points that occupy the site.

4.3.4.1 Similar sites

The rationale behind identifying the similar sites in the templates representing different
proteins is to find the binding site invariants. Each of the individual template points that
occupy a shared interaction site denotes a chemical type based on the preferred ligand
chemistry that it represents at that site. For each of the shared interaction sites, the
method queries whether the dominant chemical type is the same in at least 2/3 of the
templates clustered. If so, the interaction site is classified as a chemically similar site.
The idea is to impose a strict filter so that only those interaction sites that have at least
2/3 occupancy, i.e. sites that are occupied by template points representing at least 2/3 of
the proteins clustered, and that represent a dominant chemical type are identified as
similar sites. Hence, by definition, the method assigns a chemical type to each similar
site that it identifies. Any interaction site that has more than one point from a given
template is pruned to include only the point closest to the cluster centroid. Pruning is
done to avoid any bias while determining the chemical type of the similar site. Our
method has been designed to identify similar sites between two or more templates that
could belong either to conformers of the same protein or to a set of homologs that can be
reasonably superposed. Templates of proteins that are diverse in both sequence and
structure space, but bind to the same ligand (or scaffold), could also be clustered and
processed to identify the similar sites that enable them to bind the same ligand.
Identifying similar sites in templates representing conformers of the same protein is
similar in concept to MUSIC, developed by Carlson and co-workers (42) to identify the
binding regions that are conserved in molecular dynamics (MD) simulations of the target
protein and to use the information for drug design. Recently, Ramensky and co-workers
developed a similar method that identifies local binding site similarity based on the
analysis of protein environments of ligand fragments, the results of which could be used
for ligand optimization (43).
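The pruning step and the 2/3 rule for similar sites can be sketched as follows. This is a hedged illustration rather than the CompSite code: the function name, the tuple layout, computing the centroid before pruning, counting the dominant type against the total number of templates clustered, and treating each chemistry label as a distinct type are all assumptions.

```python
import math
from collections import Counter


def classify_similar_site(cluster, n_templates):
    """Return the dominant chemical type if the cluster is a similar site, else None.

    cluster: list of (template_id, chem_type, (x, y, z)) tuples for one shared site.
    n_templates: total number of templates that were clustered.
    """
    centroid = tuple(sum(p[2][k] for p in cluster) / len(cluster) for k in range(3))

    # Prune: keep only the point closest to the centroid for each template.
    closest = {}
    for template_id, chem_type, xyz in cluster:
        d = math.dist(xyz, centroid)
        if template_id not in closest or d < closest[template_id][0]:
            closest[template_id] = (d, chem_type)

    counts = Counter(chem_type for _, chem_type in closest.values())
    dominant, n = counts.most_common(1)[0]
    # Integer form of n / n_templates >= 2/3, avoiding floating-point rounding.
    return dominant if 3 * n >= 2 * n_templates else None


# Hypothetical site shared by three templates: two acceptors and one donor.
site = [
    ("A", "acceptor", (0.0, 0.0, 0.0)),
    ("B", "acceptor", (0.5, 0.0, 0.0)),
    ("C", "donor", (0.2, 0.3, 0.0)),
]
print(classify_similar_site(site, n_templates=3))  # acceptor
```

Dividing by the total number of templates folds the occupancy requirement and the dominant-type requirement into a single test: a site occupied by too few templates cannot reach the 2/3 count.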

4.3.4.2 Chemical difference sites

Our method has been designed to identify chemical difference sites between two
templates only. As the name suggests, the rationale behind identifying chemical
difference sites is to find the shared interaction sites that are occupied by template points,
representing the two proteins, of opposite chemical type, or in other words to find the
specificity determinants for the two proteins. For each of the shared interaction sites, the
method queries whether any two points clustered in the site, one from each template,
are chemically similar. If so, the interaction site is not a chemical difference site.
Hence, algorithmically, unlike the similar sites, the chemical difference sites are
identified by elimination.
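Identification by elimination can be sketched as below. The sketch is illustrative only: the function names are hypothetical, and the rule for when two labels count as chemically similar (identical labels, or a doneptor paired with a donor or an acceptor) is an assumption, since the text does not spell out how doneptor points are compared.

```python
from itertools import product


def chemically_similar(a, b):
    # Assumption: identical labels match, and a doneptor matches a donor
    # or an acceptor because it can play either role.
    if a == b:
        return True
    polar = {"donor", "acceptor"}
    return (a == "doneptor" and b in polar) or (b == "doneptor" and a in polar)


def is_chemical_difference_site(types_1, types_2):
    """types_1, types_2: chemistry labels of the clustered points from each template."""
    if not types_1 or not types_2:
        return False  # the site is not shared by both templates
    # Eliminate the site if any cross-template pair is chemically similar.
    return not any(chemically_similar(a, b) for a, b in product(types_1, types_2))


print(is_chemical_difference_site(["acceptor"], ["hydrophobic"]))      # True
print(is_chemical_difference_site(["acceptor", "donor"], ["donor"]))   # False
```

The second call shows the rare case mentioned earlier, in which two hydrogen-bonding points from the same protein fall in one cluster: a single matching cross-template pair is enough to eliminate the site.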


4.3.5 Relative signiﬁcance of the chemical difference sites

Our method identifies the chemical difference sites in a binary, qualitative way, i.e., a
shared interaction site is either a chemical difference site or not. However, quantifying
the degree of the chemical difference is required to assess the relative significance of
these sites and their potential contribution to binding specificity. Since the template
points represent the geometrically optimal positions in a protein binding site for a ligand
atom to make favorable hydrogen-bond and hydrophobic interactions, we chose
DrugScore (36), a knowledge-based scoring function, to score the chemical difference
sites. DrugScore uses structural information from the PDB to score protein-ligand
complexes based on the preferred distances observed between different ligand and
protein atom pairs and hence is suited to score our chemical difference sites. Scoring
these sites also enables us to filter out any noise that may have been introduced by one or
more of the following: the quality of the protein structures, the quality of the
superposition or the chosen clustering threshold. Each chemical difference site is defined
by two template points of opposite chemical type, one from each protein. To score a
chemical difference site, for example one that has a hydrogen-bond acceptor template
point from protein 1 and a hydrogen-bond donor template point from protein 2, the
following scheme is implemented using DrugScore (DS):
$$\Delta DS_{\text{protein 1}} = DS(A) - DS(D) \quad \text{in protein 1} \tag{4.1}$$
$$\Delta DS_{\text{protein 2}} = DS(D) - DS(A) \quad \text{in protein 2} \tag{4.2}$$

The difference in DrugScore values, denoted by ΔDS, is computed for both
proteins in order to determine the degree of preference. Hence, ΔDS for protein 1
quantitatively indicates protein 1’s degree of preference for its own acceptor point at that
site over protein 2’s donor point that also shares the site. Similarly, ΔDS for protein 2
quantitatively indicates protein 2’s degree of preference for its own donor point at that
site over protein 1’s acceptor point that also shares the site. Since a more negative
DrugScore value is more favorable, ΔDS must also be negative for a chemical difference
site to qualify as a favorable difference. For a chemical difference site to be considered
significant, we chose a threshold of -10,000 (arbitrary DrugScore units) for the ΔDS
values. Apart from being the average of all the ΔDS values computed in our
analysis, this threshold was successful not only in eliminating more than half of the false
positives (see Figure 4.3 and the subsection entitled “Clustering sensitivity to
superpositional accuracy”) but also in detecting the most significant chemical difference
sites that explain the observed differences in binding affinities for the proteins of the
AffinDB set. The degree of preference of the chemical difference sites was further
validated using LigPlot (37). LigPlot computes whether the template points at any
chemical difference site can make hydrogen-bonding or hydrophobic interactions with
the neighboring atoms in either of the proteins. At any chemical difference site, a
template point would be expected to make interactions with the neighboring atoms of its
own protein and none with those of the other protein.
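The scoring scheme of Equations 4.1 and 4.2 amounts to simple arithmetic once the four DrugScore values are available. The sketch below is illustrative: the function names are hypothetical, the scores are made-up numbers rather than DrugScore output, and counting a ΔDS exactly equal to the threshold as significant is an assumption.

```python
THRESHOLD = -10_000  # arbitrary DrugScore units, as stated in the text


def delta_ds(own_score, other_score):
    """Score of a protein's own template point minus the score of the other
    protein's point, both evaluated in the same protein (Equations 4.1, 4.2)."""
    return own_score - other_score


def is_significant(delta):
    # More negative is more favorable; assumption: equality counts as passing.
    return delta <= THRESHOLD


# Made-up scores for one acceptor/donor difference site.
ds_a_in_p1, ds_d_in_p1 = -25_000, -8_000   # acceptor and donor scored in protein 1
ds_d_in_p2, ds_a_in_p2 = -12_000, -9_000   # donor and acceptor scored in protein 2

delta_p1 = delta_ds(ds_a_in_p1, ds_d_in_p1)  # DS(A) - DS(D) in protein 1
delta_p2 = delta_ds(ds_d_in_p2, ds_a_in_p2)  # DS(D) - DS(A) in protein 2
print(delta_p1, is_significant(delta_p1))  # -17000 True
print(delta_p2, is_significant(delta_p2))  # -3000 False
```

With these invented numbers, protein 1 shows a strong preference for its own acceptor point while protein 2 shows only a weak preference for its donor point. The two values are reported separately here because the text does not state how they are combined into a single decision for the site.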

4.3.6 Clustering sensitivity to superpositional accuracy

The results of clustering and the subsequent processing to identify the similar and
chemical difference sites depend on the superposition performed to bring the templates
into the same reference frame. It is, of course, desirable to have a robust clustering
method in which the results do not vary much with small, reasonable shifts in the
template positions introduced by alternative superposition methods. To assess the
sensitivity of our clustering method to superpositional accuracy, the Brugia AsnRS
template and its copy were used as sample templates to be clustered, at a clustering
threshold of 1.3 Å. The template copy was shifted from 0 to 2.0 Å in one direction, in
steps of 0.25 Å, and clustered with the original template after every shift. The same steps
were repeated in an orthogonal direction. The rationale behind shifting the
template copy was to mimic the shifts that are introduced by using alternative
superposition methods. Since we are clustering two templates that have the same spatial
distribution of chemistry, albeit with a shift in one of them, we would expect that the
greater the shift, the lower the number of similar sites and the higher the number of
chemical difference sites. Indeed, as the plot in Figure 4.3 indicates, 90% of true chemical
similarities or differences are preserved when the template copy is shifted by 1.25 Å or
less. When the chemical difference sites are scored and only those that pass the
threshold difference in DrugScore values (-1.0 × 10^4 in arbitrary units) are included,
there is an almost two-fold reduction in their number.

4.3.7 Steric difference sites

Like the chemical difference sites, the steric difference sites are also pairwise in nature

and are identiﬁed for two templates. However, unlike the chemical difference sites, the

steric difference sites are identiﬁed by checking for van der Waals overlaps between the

115

 

Similar sites identiﬁed by shift in:
X direction —I—

100 4 I I l—l—I
- x z direction -e—

901 \.

80- \\
70-

 

30; ,,.../
20.: /

‘ g/OA;
10- e/
0 - a a a—a—eé

r I U 1 U I I I V T

I l I I I l I l I
0.00 0.25 0.50 0.75 1 .00 1.25 1.50 1 .75 2.00
RMSD of shifted templates in Angstroms

2 i \l
3 60 -

3 50 ‘ Chemical difference sites identiﬁed by shift in:

.3 . X direction —a— all. —0— Drugscore selected

: 40 - 2 direction —e— all, —0— DrugScore selected 0
§

0

0.

DO

\.

 

 

 

Figure 4.3: Sensitivity of clustering to superpositional accuracy. A copy of the AsnRS
template was shifted (in steps of 0.25 Å) in two orthogonal directions (X and Z), and each
shifted copy was clustered (clustering threshold of 1.3 Å) with the original and processed
to identify chemically similar sites and chemical difference sites between the two. The
RMSD shift shown here mimics the RMSD shift obtained as a result of the superposition
(ligand-based or protein backbone-based) that is required to bring the templates into the
same reference frame before clustering them to identify the shared sites (step 2 in
Figure 4.1). Since we are clustering the original AsnRS template with its shifted copy, the
number of similar sites is expected to decrease gradually with increasing RMSD, while the
number of chemical difference sites is expected to increase gradually. The chemical
difference sites become less influenced by superpositional error when the chemical
differences are quantified using DrugScore. Only the significant differences that pass the
threshold difference in DrugScore values (-1.0 × 10^4 in arbitrary units) were selected. At
the clustering threshold of 1.3 Å, 80-90% of the shared interaction sites are identified as
similar sites while only 0-10% of them are identified as significant chemical difference
sites, when the superpositional error has an RMSD of 1.25 Å or less.

 


template and the other protein structure, without performing clustering. These sites
identify volumes that are sterically accessible in one protein relative to the other. Let us
consider two templates, template 1 and template 2, that represent the binding sites of
protein 1 and protein 2. For each interaction site in template 1, the closest atom in protein
2 is identified. A point in template 1 is identified as a steric difference site if it has a
van der Waals overlap with this nearest neighbor atom in protein 2.
The van der Waals radius of each template point was defined to be 1.4 Å, which tends to
be on the conservative (small) side; the effective van der Waals radii of organic atoms,
including a correction for the bound hydrogens, range from 1.4 to 2.0 Å. This ensures
that the van der Waals overlaps detected with this radius do not result in
overprediction of steric differences.
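
The steric check described above can be illustrated with a minimal sketch. It assumes that an overlap means the point-to-atom distance is smaller than the sum of the two radii, which the text does not state explicitly. The per-element radii are placeholder values chosen within the 1.4 to 2.0 Å range quoted above, and the function name is hypothetical.

```python
import math
from typing import List, Tuple

TEMPLATE_POINT_RADIUS = 1.4  # Å, as defined in the text

# Placeholder effective radii (Å); illustrative assumptions only, picked
# from the 1.4 to 2.0 Å range mentioned in the text.
ATOM_RADII = {"C": 1.9, "N": 1.7, "O": 1.5, "S": 2.0}

Coord = Tuple[float, float, float]
Atom = Tuple[float, float, float, str]  # x, y, z, element symbol


def steric_difference_sites(template1: List[Coord], protein2: List[Atom]) -> List[Coord]:
    """Return the template 1 points that overlap their nearest protein 2 atom.

    Assumed overlap criterion: distance < point radius + atom radius.
    """
    sites = []
    for point in template1:
        nearest = min(protein2, key=lambda atom: math.dist(point, atom[:3]))
        cutoff = TEMPLATE_POINT_RADIUS + ATOM_RADII[nearest[3]]
        if math.dist(point, nearest[:3]) < cutoff:
            sites.append(point)
    return sites


# Toy data: the first point sits 2.0 Å from a carbon atom (overlap),
# the second is far from every atom (no overlap).
toy_points = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_atoms = [(2.0, 0.0, 0.0, "C"), (20.0, 0.0, 0.0, "O")]
toy_sites = steric_difference_sites(toy_points, toy_atoms)
```

Because only the single nearest atom is tested and the template point radius is small, the sketch errs on the side of reporting fewer steric differences, in line with the conservative choice described in the text.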

4.3.8 Datasets

We assembled two sets of proteins on which to apply our method and check whether
the results can address our questions. As introduced earlier, the two sets are referred to as
the AffinDB set and the ATP set in our analysis, and they are detailed in the
following subsections.

4.3.8.1 AffinDB set

This set was assembled to address the question of whether the results from our method can
explain the known relative binding affinities of a ligand that binds to two different
proteins. The AffinDB database (38) is a freely accessible repository of affinity data for
structurally resolved protein-ligand complexes from the PDB. We selected 4 protein pairs
(Table 4.1) from this database that met our selection criteria: i) two different proteins (or
variants) are bound to the same ligand, ii) the difference in binding affinity is at least
2-fold, iii) the resolution of each crystal structure complex is 2.5 Å or better, and iv) there
is no major change in the ligand conformation between the two structures. The difference
in binding affinity in 3 of the 4 pairs turns out to be almost 2 orders of magnitude. All the
protein pairs happen to belong to the serine protease family. The individual protein pairs
were brought into the same reference frame using (bound) ligand-based superposition.
The database is regularly updated, and we acknowledge that more protein pairs that meet
our criteria could have been added at the time of preparing this manuscript.
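
As a rough illustration of how selection criteria i) to iii) might be applied programmatically, consider the sketch below. The `Complex` record, its field names, the use of Ki in nM as the affinity measure and the placeholder entries are all assumptions made for this example. Criterion iv) (ligand conformation) requires structural comparison and is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Complex:
    """One protein-ligand complex; all field names here are hypothetical."""
    pdb_id: str
    ligand: str        # PDB 3-letter ligand code
    ki_nm: float       # inhibition constant in nM (assumed affinity measure)
    resolution: float  # crystallographic resolution in Å


def meets_criteria(a: Complex, b: Complex,
                   min_fold: float = 2.0, max_resolution: float = 2.5) -> bool:
    """Check criteria i) to iii) for a candidate pair of complexes."""
    same_ligand = a.ligand == b.ligand and a.pdb_id != b.pdb_id
    fold = max(a.ki_nm, b.ki_nm) / min(a.ki_nm, b.ki_nm)
    good_resolution = max(a.resolution, b.resolution) <= max_resolution
    return same_ligand and fold >= min_fold and good_resolution


# Placeholder entries with made-up values, used only to exercise the filter.
pair_ok = (Complex("XXX1", "LIG", 10.0, 1.8), Complex("XXX2", "LIG", 1000.0, 2.1))
pair_bad = (Complex("XXX3", "LIG", 10.0, 1.8), Complex("XXX4", "LIG", 15.0, 2.1))
```

Here `pair_ok` differs 100-fold in affinity and passes, while `pair_bad` differs only 1.5-fold and is rejected.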

4.3.8.2 ATP set

AsnRS, a class IIb aminoacyl-tRNA synthetase (AARS), is responsible for the specific
aminoacylation of tRNA^Asn. The two-step catalytic reaction involves the ATP-based
activation of the cognate amino acid asparagine and the transfer of the activated
asparagine to the 3'-end of tRNA^Asn. We earlier applied SLIDE to discover new
inhibitors of Brugia AsnRS by targeting a 1.9 Å resolution closed form of the enzyme in
complex with ASNAMS (17). The catalytic binding site of AsnRS can be divided into two
pockets: adenosine and asparagine. The asparagine pocket in AsnRS is very specific for


 

[Table 4.1: Protein pairs of the AffinDB set used in our analysis, with their PDB codes, bound ligands and known binding affinities.]

the amino acid, as expected for an AARS. In fact, collaborator Michael Kron's screen of
11,000 amine compounds in the collection of Discovery Technologies indicated that none
would bind to the asparagine or other sites of AsnRS. However, SLIDE-predicted binding
modes of the new Brugia AsnRS inhibitors, supported by experimental assays and
flexibility modeling studies, indicate that the adenosine pocket can bind inhibitor
scaffolds that mimic the interactions of adenine, in full or in part. In fact, variolin B, one
of the SLIDE-discovered Brugia AsnRS inhibitors, has recently been reported to also bind
a human cyclin-dependent kinase (18), which could possibly explain the
compound's cytotoxicity in human cell lines. Hence, motivated by the desire to
optimize the AsnRS specificity of variolin B and other inhibitors binding to the ATP site,
CompSite was tested for its ability to identify the binding site specificity determinants
that set AsnRS apart from other ATP-binding proteins. The ATP-binding proteins were
broadly divided into two categories based on the way they bind the adenine moiety: with
the exocyclic amine N6 buried in the binding site (N6-in) or exposed to the solvent (N6-out).
Chene reported, in the context of ATPases, that the N6-in orientation, which allows for
maximum hydrogen-bonding contacts between the purine ring and the active site, is
found in the majority of the ATPases reviewed (44). AsnRS falls in the N6-out category
while protein kinases belong to the N6-in category. To expand our dataset beyond AsnRS
and protein kinases, we also included diverse non-AsnRS N6-out and non-protein-kinase
N6-in proteins in our study. Dividing the ATP-binding proteins in this manner provides us
with a convenient tool to validate the results of CompSite, especially in terms of
identifying steric differences, and also allows identification of chemical specificity
determinants in AsnRS relative to a variety of ATP-binding proteins. The 4 subsets of the
ATP set are detailed below and tabulated in Table 4.2. All
the selected proteins are bound to ATP or its analogs and have a resolution of 2.5 Å or
better. The proteins were brought into the same reference frame by ligand-based
superposition, using the adenine moiety.

A set of 9 PDB protein kinase structures, all of them mammalian, was assembled for the
analysis. The selected kinase structures cover most of the different kinase families
defined (11) and analyzed (45) in the literature. A set of 5 PDB structures of
non-protein-kinase N6-in proteins, three from mammalian sources and two from bacterial
sources, was assembled. It is not feasible to scan all the available structures of
ATP-binding proteins and pick the N6-in ones. Hence, the structures of the ATP-binding
protein families identified and analyzed earlier by Kuttner and co-workers (46) were
surveyed to assemble our N6-in subset. A set of 5 PDB crystal structure complexes of
non-AsnRS N6-out proteins was assembled, two from mammalian sources, two from
bacterial sources and one from yeast. For the N6-out subset too, the structures of the
ATP-binding protein families identified and analyzed earlier by Kuttner and co-workers
(46) were surveyed.

Apart from the crystal structure bound to ASNAMS, another crystal structure with
one monomer bound to ATP and the other bound to L-aspartate-β-hydroxamate adenylate
(LBHAMP) was also used. In summary, 3 different crystallographic conformations of
Brugia AsnRS, provided by collaborator Stephen Cusack (EMBL Grenoble), were used in
our analysis. This allowed CompSite to identify the ligand interaction sites shared
between them.


Table 4.2: Proteins of the ATP set used in our analysis

Protein kinase subset

PDB   Protein                            Source                      Resolution (Å)  Ligand^a
1ATP  cAMP-dependent Protein Kinase A    Mus musculus                2.20            ATP
1CM8  MAP Kinase p38-gamma               Homo sapiens                2.40            ANP^b
1HCK  Cyclin-dependent Kinase (CDK) 2    Homo sapiens                1.90            ATP
1IR3  Insulin Receptor (Tyr) Kinase      Homo sapiens                1.90            ANP
1JNK  c-Jun N-terminal Kinase            Homo sapiens                2.30            ANP
1JPA  EPHB2 (Ephrin B2 receptor) Kinase  Mus musculus                1.91            ANP
1PHK  Phosphorylase Kinase               Oryctolagus cuniculus       2.20            ATP
1QPC  Lymphocyte-Specific Kinase         Homo sapiens                1.60            ANP
2SRC  Protein Kinase c-Src               Homo sapiens                1.50            ANP

N6-in subset

PDB   Protein                            Source                      Resolution (Å)  Ligand
1A82  Dethiobiotin Synthetase            Escherichia coli            1.80            ATP
1AUX  Synapsin IA                        Bos taurus                  2.30            SAP^c
1BG2  Ubiquitous Kinesin Motor Domain    Homo sapiens                1.80            ADP
1BYQ  Heat Shock Protein (HSP) 90        Homo sapiens                1.50            ATP
1L4U  Shikimate Kinase                   Mycobacterium tuberculosis  1.80            ADP

N6-out subset

PDB   Protein                            Source                      Resolution (Å)  Ligand
1BX4  Adenosine Kinase                   Homo sapiens                1.50            ADN^d
1HP1  5'-Nucleotidase (open form)        Escherichia coli            1.70            ATP
1QSY  DNA Polymerase I                   Thermus aquaticus           2.30            DAD^e
1S3X  Heat Shock Protein (HSP) 70        Homo sapiens                1.84            ADP
1YAG  Actin                              Saccharomyces cerevisiae    1.90            ATP

a The 3-letter code taken from the PDB file.
b Phosphoaminophosphonic acid-adenylate ester.
c ADP-monothiophosphate.
d Adenosine.
e 2',3'-dideoxy ATP.

4.4 Results

4.4.1 Chemical difference sites: Explaining the observed
experimental selectivity of ligands bound to proteins of
the AffinDB set

Significant chemical difference sites identified by CompSite in the overlapping volumes
of binding sites can explain the selectivity of the same ligand bound to two different
protein structures of the AffinDB set (Table 4.1). The degree of preference for one
template point (or interaction) over the other in each chemical difference site is quantified
by the difference in DrugScore (ΔDS) value, with a more negative value being more
favorable. To further validate that the template points are well placed for making
favorable interactions with the neighboring protein atoms, we used LigPlot to detect the
interactions. The sites detailed in Table 4.3 can be located, with reference to the bound
ligand, in the labeled panels of Figure 4.4. For each of the 4 protein pairs of the AffinDB
set, the chemical difference sites identified by our method and their agreement with
experimental results are explained in the following paragraphs.

Thrombin (PDB: 1C5N) and urokinase type plasminogen activator (UPA, PDB:
1C5X) are both bound to the ligand 4-iodobenzo[b]thiophene-2-carboxamidine (PDB
3-letter code ESI, Figure 4.4.A). The ligand ESI inhibits UPA better than thrombin, with
almost a 100-fold difference in the inhibition constants (Ki) (Table 4.1).

Table 4.3: Significant chemical difference sites identified between protein pairs of the
AffinDB set that can explain the known relative binding affinities of their bound ligands
(Table 4.1).

Site^b  Protein  Interaction   Degree of preference^d  Interactions detected
                 preferred^c   ΔDS (× 10^4)

Protein pair^a: 1C5N and 1C5X
26      1C5X     C over N      -1.05                   C interacts with Val 213: CG1 and Trp 215: CA.
                                                       N has no interactions.

Protein pair: 1C5O and 1C5Z
16      1C5Z     C over N      4.75                    C interacts with Cys 191: C.
                                                       N has no interactions.
21      1C5O     C over N      -1.22                   C interacts with Trp 215: C.
                                                       N has no interactions.

Protein pair: 1F0U and 1EZQ
116     1EZQ     C over D      -0.74                   C interacts with ring atoms of Phe 174.
                                                       D has no interactions.
127     1EZQ     C over N      -1.04                   C interacts with Val 213: CG1 and Trp 215: C, CA.
                                                       N has no interactions.

Protein pair: 1V2J and 1V2L
21      1V2L     C over N      -1.50                   C interacts with Val 213: CG1.
                                                       N has no interactions.

a Only the PDB codes of the protein pairs are mentioned here. For the protein names and other
details, please refer to Table 4.1.

b The shared interaction sites are numbered by the clustering algorithm. To locate these sites
please see Figure 4.4.

c The template points are labeled by their interaction type: A for hydrogen-bond acceptor, D for
hydrogen-bond donor, N for hydrogen-bond acceptor and/or donor, and C for hydrophobic
points.

d Degree of preference is quantitatively indicated by the difference in DrugScore (ΔDS) value (in
arbitrary units).

 

 

 

Figure 4.4: The significant chemical difference sites, tabulated in Table 4.3, for the 4
pairs of proteins in the AffinDB set (Table 4.1). For each pair,
these sites can explain the observed difference in binding affinities of the ligand
that both proteins bind. The superimposed bound conformations of each ligand, rendered
as atom-colored tubes (carbon, green or orange; oxygen, red; nitrogen, blue; and sulfur,
yellow), are shown in all panels for reference. The template points in these chemical
difference sites, rendered as spheres (solid or mesh surface), are colored by interaction
type: red for hydrogen bond acceptor, blue for hydrogen bond donor, white for hydrogen
bond acceptor and/or donor, and green for hydrophobic. The significant chemical
difference site between thrombin (PDB: 1C5N) and urokinase type plasminogen activator
(UPA, PDB: 1C5X) is shown in A. The 1C5N template point is rendered as a sphere with a
mesh surface while the 1C5X template point is rendered as a solid sphere. The bound
ligand inhibits 1C5X with a better Ki than 1C5N (Table 4.1). The significant chemical
difference sites between thrombin (PDB: 1C5O) and urokinase type plasminogen
activator (UPA, PDB: 1C5Z) are shown in B. The 1C5O template points are rendered as
spheres with a mesh surface while the 1C5Z template points are rendered as solid spheres.
The bound ligand inhibits 1C5Z with a better Ki than 1C5O (Table 4.1). The significant
chemical difference sites between trypsin (PDB: 1F0U) and factor Xa (PDB: 1EZQ) are
shown in C. The 1F0U template points are rendered as spheres with a mesh surface while
the 1EZQ template points are rendered as solid spheres. The bound ligand inhibits 1EZQ
with a better Ki than 1F0U (Table 4.1). The significant chemical difference site between
trypsin variant 1 (PDB: 1V2J) and trypsin variant 2 (PDB: 1V2L) is shown in D. The
1V2J template point is rendered as a sphere with a mesh surface while the 1V2L template
point is rendered as a solid sphere. The bound ligand inhibits 1V2L with a better Ki than
1V2J (Table 4.1).

 


The chemical difference site 26 (Table 4.3, Figure 4.4.A) indicates that UPA prefers a
hydrophobic ligand atom over a hydrogen bond acceptor and/or donor. A hydrophobic
ligand atom at this site can interact with carbon atoms in UPA residues Val 213 and Trp
215, whereas no hydrogen bond interactions could be made. This is well corroborated by
the interactions made by the nearby (~1.5 Å) sulfur atom in the ligand with the same
residues, Val 213 and Trp 215, in UPA. Thrombin has the same residues (Val 213 and Trp
215), and they do interact with the sulfur atom in its bound ligand. However, a
hydrogen bond acceptor and/or donor ligand atom at this site in thrombin could improve
the binding affinity, as it can both donate to the main-chain oxygen atom in Ser 214 and
accept from the hydroxyl group in Ser 195. In the crystal structure, these two
serine residues make hydrogen bonds with water molecules. The same serine
residues, 195 and 214, are also present in UPA but are located at an unfavorable distance
to interact with a ligand atom at this site. In the UPA crystal structure, Ser 195 donates to
the citrate ion while Ser 214 accepts from a water molecule. Katz and co-workers
suggested that the selectivity of the ligand ESI is due to a hydrogen bond between the
ligand's amidine group and the hydroxyl group in Ser 190, located in the S1 pocket of
UPA (47). In the thrombin structure, this serine residue in the S1 pocket is replaced
by Ala 190, and hence the hydrogen bond with the amidine group of the ligand is lacking.
The templates representing the two protein structures did place a hydrophobic point in
UPA and a hydrogen bond acceptor and/or donor in thrombin, but the clustering algorithm
could not identify them as a shared interaction site as they were separated by 1.53 Å,
greater than our clustering threshold of 1.3 Å. In summary, our method identified an
additional specificity determinant between the two proteins which could further explain
the ligand's selectivity.

Thrombin (PDB: 1C5O) and urokinase type plasminogen activator (UPA, PDB:
1C5Z) are both bound to the ligand benzamidine (PDB 3-letter code BAM, Figure
4.4.B). The ligand BAM inhibits UPA better than thrombin, with almost a 3-fold
difference in Ki (Table 4.1). Two significant chemical difference sites were
identified by our method between these two protein structures (Table 4.3, Figure 4.4.B).
The chemical difference site 16 indicates that UPA prefers a hydrophobic ligand atom
over a hydrogen bond acceptor and/or donor. A hydrophobic ligand atom at this site can
interact with UPA residue Cys 191, whereas no hydrogen bond interactions could be
made. This is well corroborated by the interactions made by the nearby (~0.6 Å) carbon
atom in the ligand with other hydrophobic residues in its proximity. Similarly, the
chemical difference site 21 indicates that thrombin prefers a hydrophobic ligand atom
over a hydrogen bond acceptor and/or donor. A hydrophobic ligand atom at this site can
interact with thrombin residue Trp 215, whereas no hydrogen bond interactions could be
made. This site actually detects a known binding site chemistry difference in the S1
pockets of the two structures, where a serine residue in UPA is substituted by an alanine
residue in thrombin at the 190 position (47).

Trypsin (PDB: 1F0U) and factor Xa (PDB: 1EZQ) are both bound to the ligand
3-[(3'-aminomethyl-biphenyl-4-carbonyl)-amino]-2-(3-carbamimidoyl-benzyl)-butyric acid
methyl ester (PDB 3-letter code RPR, Figure 4.4.C). The ligand RPR inhibits factor Xa
better than trypsin, with almost an 80-fold difference in Ki (Table 4.1). Maignan and
co-workers reported that the only binding site differences in these two serine protease
structures are found in their S1 (left side of Figure 4.4.C) and S4 (right side of Figure
4.4.C) pockets (48). In both structures, the S1 pocket is occupied by the benzamidine
group of the ligand and the S4 pocket is occupied by the aminomethylbiphenyl group of
the ligand. The two chemical difference sites identified by our method are in good
agreement with the known differences, although one of them did not pass the significance
threshold for the difference in DrugScore (ΔDS) of -1.0 × 10^4 (arbitrary units). The
significant chemical difference site 127 in factor Xa, located in the S1 pocket, indicates
that a hydrophobic ligand atom is preferred over a hydrogen bond acceptor and/or donor.
A hydrophobic ligand atom at this site can interact with carbon atoms of factor Xa
residues Val 213 and Trp 215, whereas no hydrogen bond interactions could be made.
Apart from the interactions made by nearby (~0.9 Å) carbon ring atoms in the ligand
with the same residues, the site also detects the known binding site chemistry difference
due to the presence of Ala 190 in the S1 pocket of factor Xa as opposed to Ser 190 in the
S1 pocket of trypsin. In fact, a hydrogen bond acceptor and/or donor ligand atom at this
site in trypsin can accept from the hydroxyl group of Ser 190 in its S1 pocket. The
chemical difference site 116 in factor Xa, located in the S4 pocket, indicates that a
hydrophobic ligand atom is preferred over a hydrogen bond donor. A hydrophobic ligand
atom at this site can interact with ring atoms of factor Xa residue Phe 174, whereas no
hydrogen bond interactions could be made. This is well corroborated by the ring atoms in
the aminomethylbiphenyl group of the ligand RPR that interact with residues Phe 174
and Tyr 99. A hydrogen bond donor ligand atom at this site in trypsin can interact with
the side chain oxygen atom (OE1) of Gln 175, which occupies the same space as Phe 174
in factor Xa.
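
The significance test applied to sites 116 and 127 can be written out as a short worked example. The sketch below assumes that a site passes when its ΔDS is less than or equal to the threshold; the text does not say whether the comparison is inclusive. The function name is hypothetical.

```python
SIGNIFICANCE_THRESHOLD = -1.0e4  # ΔDS threshold from the text, arbitrary units


def is_significant(delta_ds: float, threshold: float = SIGNIFICANCE_THRESHOLD) -> bool:
    """A more negative ΔDS is more favorable, so a site passes when ΔDS <= threshold.

    Assumption: the comparison is inclusive.
    """
    return delta_ds <= threshold


# ΔDS values for the trypsin / factor Xa pair, as listed in Table 4.3 (× 10^4).
sites = {116: -0.74e4, 127: -1.04e4}
significant = {site: is_significant(value) for site, value in sites.items()}
```

With these values, site 127 passes the threshold and site 116 does not, matching the statement above that one of the two sites fell short of the significance cutoff.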


The structures of several recombinant bovine trypsin variants were solved and
analyzed by Rauh and co-workers to study the effect of conformational variability on
binding affinity (49). For our analysis, we selected two structures that were bound to the
same ligand and had the maximum difference in their experimental Ki values. The first
trypsin variant (PDB: 1V2J) contained Ser 172, Ser 173, Arg 174 and Ile 175 insertion,
while the second one (PDB: 1V2L) contained Glu 97, Tyr 99, Ser 172, Ser 173, Phe 174
and Ile 175 insertion, and Ala 190 and Glu 217 of factor Xa. Both these structures were
bound to the ligand benzamidine (PDB 3-letter code BEN, Figure 4.4.D). The ligand BEN
inhibits variant 2 better than variant 1, with a 100-fold difference in Ki (Table 4.1).
Except for Ser 190 in variant 1 and Ala 190 in variant 2, the differences between the two
structures are peripheral to the ligand-binding site. The chemical difference site 21
indicates that variant 2 prefers a hydrophobic ligand atom over a hydrogen bond acceptor
and/or donor. A hydrophobic ligand atom at this site can interact with variant 2 residue
Val 213, whereas no hydrogen bond interactions could be made. Apart from the
interactions made by the closest (~0.2 Å) carbon atom in the ligand with the same residue,
this site also detects the actual binding site chemistry difference: the substitution of
Ser 190 in variant 1 by Ala 190 in the S1 pocket of variant 2. In fact, a hydrogen bond
acceptor and/or donor ligand atom at this site in variant 1 can accept from the hydroxyl
group of Ser 190 in its S1 pocket. Beyond this chemical difference site, it was
algorithmically challenging to assess the known contribution of the conformational
flexibility of the different insertions, which are considerably distant from the
ligand-binding site that our method focused on.


4.4.2 Similar sites identiﬁed in the ATP-set proteins

Chemically similar sites identified by our method for proteins of each of the 4 subsets of
the ATP set are shown in Figure 4.5. These sites represent the binding site invariants in
each of the 4 subsets, as they are identified after passing a stringent filter described in the
“Materials & Methods” section of this chapter. The similar sites identified
by our method for each of the 4 subsets of the ATP set are described in the
following subsections.

4.4.2.1 AsnRS similar sites

Based on the comparison of available crystal structures, we know the flexible regions of
Brugia AsnRS that adopt different conformations when bound to different ligands (17).
The similar sites identified for the AsnRS subset, representing the invariant features of
the 3 conformers, are shown in Figure 4.5.A. The individual templates representing the 3
AsnRS conformers bound to ASNAMS, LBHAMP and ATP had 148, 142 and 123 points,
respectively. The adenine moieties of the three bound ligands were superimposed, as
shown in Figure 4.5.A, to bring the templates into the same reference frame. Clustering
and further processing of these individual templates led to the identification of 98
chemically similar sites, of which 77 were hydrogen-bonding sites and 21 were
hydrophobic sites. Since the three structures were essentially conformers of the same
protein, the large number of similar sites identified, covering more than 50% of the points
in each individual template, was not unexpected. However, our method was able to sense
the conformational variability by not identifying the template points that were found in only


 

Figure 4.5: The chemically similar sites identified for each of the 4 subsets of
ATP-binding proteins in the ATP set (Table 4.2). The bound ligands of
the clustered structures (templates) in each subset are rendered as atom-colored tubes in
all panels. The similar sites are rendered as solid spheres, colored by interaction type (red
for hydrogen bond acceptor, blue for hydrogen bond donor, white for hydrogen bond
acceptor and/or donor, and green for hydrophobic) in all the panels. The similar sites for
the 3 conformers of AsnRS are shown in A, the 9 protein kinase structures in B, the 5
non-kinase N6-in structures in C, and the 5 N6-out structures in D.

 


one of the three structures clustered, thereby accounting for the change in binding site
architecture. The density of hydrogen-bonding sites, as expected, was high near
the polar atoms of the adenine moiety, the ribose moiety and the asparagine side chain.
ATP binds to AsnRS in a bent conformation, and our method identified many
similar hydrogen-bond acceptor sites near the phosphate tail. The ASNAMS-bound
AsnRS conformation, with its template points occupying 80 of the 98 similar sites, was
chosen as the representative structure of this subset for further analysis.

4.4.2.2 Protein Kinase similar sites

Similar sites identified by our method for the protein kinase subset (Table 4.2),
representing the most invariant features of the 9 structures selected for our analysis, are
shown in Figure 4.5.B. The number of interaction sites in the individual templates
representing the 9 protein kinase structures bound to ATP or its analogs ranged from 59
to 183. We used the co-crystallized ligand to define the volume of the binding site while
generating the template for each of the selected proteins. In the ephrin B2 receptor
(EPHB2) kinase (PDB: 1JPA) crystal structure complex, only the adenine ring of the
bound phosphoaminophosphonic acid-adenylate ester (ANP) was seen in the density. As a
result, the template representing the binding site of this kinase structure had the fewest
points. The template representing protein kinase c-Src (PDB: 2SRC) had the
highest number of interaction sites. The adenine moieties of all the bound ligands were
superimposed, as shown in Figure 4.5.B, to bring the templates into the same reference
frame. Clustering and further processing of these individual templates led to the
identification of 17 chemically similar sites, of which 10 were hydrogen-bonding sites
and 7 were hydrophobic sites. The selected protein kinases are structurally quite
diverse and also differ in the way they bind the ATP analogs, as is evident from the
different puckered conformations of the ribose ring, which send the phosphate tail in
different directions (Figure 4.5.B). However, the interactions made by the polar nitrogen
atoms of the adenine moiety in the bound ligands, especially N1 (a hydrogen bond
acceptor) and N6 (a hydrogen bond donor), are well represented by the similar sites
identified by our method, in agreement with previous surveys of ATP-binding proteins
(50, 51). There were hardly any hydrogen-bonding similar sites identified near the
phosphate tails, reflecting the differences in their bound conformations across the 9
structures included in our analysis. For further analysis, we chose the phosphorylase
kinase structure (PDB: 1PHK), with its template points occupying 16 of the 17 similar
sites, as the representative structure for the protein kinase subset.

4.4.2.3 N6-in similar sites

Similar sites identified by our method for proteins of the N6-in subset (Table 4.2),
representing the most invariant features of the 5 structures selected for our analysis, are
shown in Figure 4.5.C. The number of points in the individual templates representing the
5 N6-in structures bound to ATP or its analogs ranged from 125 to 159. The adenine
moieties of all the bound ligands were superimposed, as shown in Figure 4.5.C, to bring
the templates into the same reference frame. Clustering and further processing of these
individual templates led to the identification of 11 chemically similar sites, of which
only 3 were hydrogen-bonding sites while 8 were hydrophobic sites. The polar similar
sites are located near the α-phosphate moiety, the 5'-OH of the ribose moiety and the
N1 and N6 atoms of the adenine moiety of the superimposed ligands (Figure 4.5.C).
Although the 5 selected structures are structurally quite diverse, we still expected
to see more similar sites. The low number of hydrogen-bonding interaction sites is
surprising and calls for further investigation. There are three possible explanations for
this low number. The first pertains to the post-clustering processing of the shared
interaction sites to identify chemically similar sites. For a shared interaction site to be
identified as a similar site, a strict filter of 2/3 occupancy (in this case, points from at
least 4 of the 5 templates) is imposed. There are sites that have an occupancy of
0.6, that is, points from 3 out of 5 templates, but they are left out because of the strict
filter. The second pertains to the clustering process used to identify the shared
interaction sites. The lack of structural and sequence similarity between these proteins
could result in the geometrically favored positions for ligand atoms to make hydrogen
bonds being farther apart than the clustering threshold of 1.3 Å, and hence not being
clustered. The third is that the hydrogen-bonding potential of the polar atoms in the
adenine moiety of the bound ligands may actually be satisfied by water molecules, and
hence we do not see many hydrogen-bonding points from the individual templates being
clustered. For further analysis, we chose the dethiobiotin synthetase structure (PDB:
1A82), with its template points occupying 10 of the 11 similar sites, as the representative
structure for the N6-in subset.
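
The arithmetic behind the 2/3 occupancy filter can be made explicit with a small sketch. The function names are hypothetical, and exact fractions are used to avoid floating-point rounding at the 2/3 boundary. The text states the result only for 5 templates; the values for other template counts simply follow from the same arithmetic.

```python
import math
from fractions import Fraction

OCCUPANCY_FILTER = Fraction(2, 3)  # minimum fraction of templates contributing a point


def min_templates_required(n_templates: int) -> int:
    """Smallest number of contributing templates that satisfies the 2/3 filter."""
    return math.ceil(OCCUPANCY_FILTER * n_templates)


def passes_occupancy(n_contributing: int, n_templates: int) -> bool:
    """True when a shared site has points from at least 2/3 of the templates."""
    return Fraction(n_contributing, n_templates) >= OCCUPANCY_FILTER
```

For 5 templates, `min_templates_required(5)` gives 4, so a site with points from 3 of 5 templates (occupancy 0.6) is rejected, as described above.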


4.4.2.4 N6-out similar sites

Similar sites identified by our method for proteins of the N6-out subset (Table 4.2),
representing the most invariant features of the 5 structures selected for our analysis, are
shown in Figure 4.5.D. The number of points in the individual templates representing the
5 N6-out structures bound to ATP or its analogs ranged from 85 to 162. The template
representing the adenosine kinase structure (PDB code 1BX4) had the fewest
interaction sites, since the protein was bound to adenosine and hence the volume defined
by it during template generation was relatively smaller. The adenine moieties of all the
bound ligands were superimposed, as shown in Figure 4.5.D, to bring the templates into the
same reference frame. Clustering and further processing of these individual templates led
to the identification of 9 chemically similar sites, of which only 2 were hydrogen-bonding
sites while 7 were hydrophobic sites. The two polar similar sites are located near
the C2 and N3 atoms of the adenine moiety and the 3’-OH of the ribose moiety of the
superimposed ligands (Figure 4.5.D). The low number of similar hydrogen-bond sites
could be due to the same reasons discussed earlier in the “N6-in similar sites”
section. A stronger reason could be that the hydrogen-bonding potential of most of the
polar atoms in the adenine moiety of the bound ligands may actually be satisfied by water
molecules or other hetero groups (e.g., DNA in the DNA polymerase I structure (PDB:
1QSY)), and hence we do not see many polar interaction sites in the individual templates
being clustered. For further analysis, we chose the actin structure (PDB: 1YAG), with its
template points occupying 8 of the 9 similar sites, as the representative structure for the
N6-out subset.
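The clustering and representative-selection steps described in these subsections can be illustrated with a small sketch. It is not the thesis implementation: the function names (`complete_linkage`, `similar_sites`, `representative`), the `min_templates` filter value, and the toy coordinates are assumptions made for illustration. Only the complete-linkage criterion, the 1.3 Å threshold, and the idea of choosing the structure whose template points occupy the most similar sites come from the text.

```python
from itertools import combinations
from math import dist


def complete_linkage(points, threshold=1.3):
    """Agglomerative complete-linkage clustering of 3D points.

    Clusters are merged only while the largest pairwise distance between
    their members stays at or below `threshold` (in angstroms).
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, ia, ib = min(
            ((linkage(a, b), ia, ib)
             for (ia, a), (ib, b) in combinations(enumerate(clusters), 2)),
            key=lambda t: t[0],
        )
        if d > threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]  # ib > ia, so index ia is unaffected
    return clusters


def similar_sites(template_points, threshold=1.3, min_templates=2):
    """template_points: list of (template_id, (x, y, z)) tuples.

    A cluster counts as a similar site if it holds points from at least
    `min_templates` different templates (assumed filter, for illustration).
    """
    coords = [xyz for _, xyz in template_points]
    sites = []
    for cluster in complete_linkage(coords, threshold):
        owners = {template_points[i][0] for i in cluster}
        if len(owners) >= min_templates:
            sites.append(sorted(owners))
    return sites


def representative(sites):
    """Pick the template whose points occupy the most similar sites."""
    counts = {}
    for owners in sites:
        for template_id in owners:
            counts[template_id] = counts.get(template_id, 0) + 1
    return max(sorted(counts), key=counts.get)


# Toy coordinates (hypothetical), labeled with PDB codes only as identifiers.
points = [
    ("1YAG", (0.0, 0.0, 0.0)), ("1QSY", (0.5, 0.0, 0.0)), ("1BX4", (0.0, 0.6, 0.0)),
    ("1YAG", (5.0, 0.0, 0.0)), ("1QSY", (5.4, 0.3, 0.0)),
    ("1YAG", (0.0, 8.0, 0.0)), ("1BX4", (0.2, 8.5, 0.0)),
    ("1QSY", (10.0, 10.0, 10.0)),
]
sites = similar_sites(points)
best = representative(sites)
```

In this toy input the isolated point forms a singleton cluster and is dropped by the filter, which mirrors how template points that lie farther apart than the threshold fail to contribute to a similar site.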


4.4.3 Chemical difference sites identified in protein pairs of the ATP set

We used only the representative structures of the ATP subsets, chosen based on the
analysis of their similar sites, to identify the chemical difference sites between each of
them and AsnRS. Significant chemical difference sites identified by our method between
AsnRS and protein structures representing each of the other 3 subsets of the ATP set are
tabulated in Table 4.4 and shown in Figure 4.6. Since our objective is to identify
specificity determinants in Brugia AsnRS relative to other ATP-binding proteins, only
the sites with AsnRS-preferred interactions are discussed here. As for the AffinDB set,
the degree of preference is quantified by the difference in DrugScore (ΔDS) value, and
the template points are validated with LigPlot to detect the interactions.
The sites detailed in Table 4.4 can be located, with reference to the bound ligands, in
the labeled panels of Figure 4.6. Based on their location in the AsnRS binding site, we
have divided the chemical difference sites into three categories, described in the following
subsections.

4.4.3.1 Chemical difference sites in adenine pocket

The adenine pocket appears to be the most conserved region across the binding sites of the
ATP-set proteins in our analysis. The hydrogen-bond interactions of the polar nitrogen atoms and
the hydrophobic interactions of the aromatic carbon atoms of adenine have been
identified as specificity determinants to distinguish adenine from other nucleotides


Table 4.4: Significant chemical difference sites identified between Brugia AsnRS and
representative structures of other ATP-binding proteins (Table 4.2).

Protein pair(a): AsnRS and 1PHK (a representative structure of the protein kinase subset)

Site(b)  Protein  Interaction    Degree of preference(d)  Interactions detected
                  preferred(c)   ΔDS (x 10^4)
11       AsnRS    A over D       -1.03                    A accepts from Gly 408: N.
                                                          D donates to Ile 361: O.
71       AsnRS    C over D       -1.63                    C interacts with atoms in Ile 361 and Val 362.
                                                          D has no interactions.

Protein pair: AsnRS and 1A82 (a representative structure of the N6-in subset)

Site     Protein  Interaction    Degree of preference     Interactions detected
                  preferred      ΔDS (x 10^4)
3        AsnRS    A over D       -1.72                    A accepts from Arg 411: NH1.
                                                          D donates to Glu 360: O.
29       AsnRS    A over C       -1.58                    A accepts from Tyr 223: OH and/or Arg 210: NH2.
                                                          C interacts with His 225: CE1 and Arg 210: CZ.
53       AsnRS    C over A       -1.67                    C interacts with Tyr 223: CE2.
                                                          A has no interactions.
64       AsnRS    C over A       -1.39                    C interacts with Gly 363: CA.
                                                          A has no interactions.


Table 4.4 continued

Protein pair: AsnRS and 1YAG (a representative structure of the N6-out subset)

Site     Protein  Interaction    Degree of preference     Interactions detected
                  preferred      ΔDS (x 10^4)
7        AsnRS    A over C       -1.58                    A accepts from Tyr 223: OH and/or Arg 210: NH2.
                                                          C interacts with His 225: CE1 and Arg 210: CZ.
18       AsnRS    A over D       -1.72                    A accepts from Arg 411: NH1.
                                                          D donates to Glu 360: O.
54       AsnRS    C over A       -1.44                    C interacts with His 219: CD2.
                                                          A has no interactions.

(a) Only the PDB codes of the ATP-set proteins are mentioned here. For the protein names and
other details, please refer to Table 4.2.

(b) The shared interaction sites are numbered by the clustering algorithm. To locate these sites
please see Figure 4.5.

(c) The template points are labeled by their interaction type: A for hydrogen-bond acceptor, D for
hydrogen-bond donor, N for hydrogen-bond acceptor and/or donor, and C for hydrophobic
points.

(d) Degree of preference is quantitatively indicated by the difference in DrugScore (ΔDS) value (in
arbitrary units).


Figure 4.6: The significant chemical difference sites (Table 4.4) between Brugia AsnRS
and representative structures of each of the other three classes of ATP-binding proteins in
the ATP set (Table 4.2): (A) phosphorylase kinase (PDB: 1PHK), a
representative protein kinase structure bound to ATP, (B) dethiobiotin synthetase (PDB:
1A82), a representative non-kinase, N6-in structure bound to ATP, and (C) actin (PDB:
1YAG), a representative N6-out structure bound to ATP. Most of the difference sites are
located in the ribose pocket of the AsnRS binding site. The bound ligands (superimposed by
their adenine moieties), rendered as atom-colored tubes, are shown for reference in all the
panels. ASNAMS bound to AsnRS can be distinguished from the other ligands by its orange
carbon atoms. The template points in the chemical difference sites are rendered as
spheres (solid surface for AsnRS and mesh surface for the other protein) and colored by
interaction type: red for hydrogen-bond acceptor, blue for hydrogen-bond donor, white
for hydrogen-bond acceptor and/or donor, and green for hydrophobic.

 


(50, 51). Even the similar sites identified by our method for these proteins (Figure 4.5)
indicate high similarity in the adenine pocket. Therefore, it was not surprising that our
method identified only 2 significant chemical difference sites with AsnRS-preferred
interactions in this pocket: site 53 between AsnRS and dethiobiotin synthetase (Figure
4.6.B) and site 54 between AsnRS and actin (Figure 4.6.C). The chemical difference site
53 (Table 4.4, Figure 4.6.B) indicates that AsnRS prefers a hydrophobic ligand atom
over a hydrogen-bond acceptor. A hydrophobic ligand atom at this site can interact with
the aromatic side chain of AsnRS residue Tyr 223, whereas no hydrogen-bond interactions
could be made. On the other hand, the hydrogen-bond acceptor at this site in dethiobiotin
synthetase can accept from the side-chain amide nitrogen atom of 1A82 residue Asn 175,
mimicking the interactions of the nearby ligand nitrogen atom N7 of the adenine moiety.
The N7 atom in the bound ligand (ASNAMS) of AsnRS does not interact with either the
neighboring protein atoms or any crystallographic water molecules. Therefore, a
hydrophobic ligand atom at this site in the AsnRS binding site should improve the
affinity and selectivity for AsnRS over dethiobiotin synthetase, a non-protein-kinase
N6-in representative structure. The other significant chemical difference site in the adenine
pocket, site 54 between AsnRS and actin (Table 4.4, Figure 4.6.C), indicates that AsnRS
prefers a hydrophobic ligand atom over a hydrogen-bond acceptor. A hydrophobic ligand
atom at this site can interact with a side-chain carbon atom of AsnRS residue His 219
(CD2), whereas no hydrogen-bond interactions could be made. On the other hand, the
hydrogen-bond acceptor at this site in actin can accept from the side-chain terminal amine of
1YAG residue Lys 336.


4.4.3.2 Chemical difference sites in ribose pocket

The different puckers of the ribose ring observed in structures of bound ligands have been
used earlier in designing selective ligands (52, 53). Our method identified 5 unique
significant chemical difference sites in the ribose pocket between AsnRS and the 3 other
ATP-binding proteins. Chemical difference sites 3 and 29 (Figure 4.6.B), between
AsnRS and dethiobiotin synthetase, are almost identical to sites 18 and 7 (Figure 4.6.C),
respectively, between AsnRS and actin.

Two of the chemical difference sites near the 2’-OH of the ribose moiety in
ASNAMS, site 11 (Figure 4.6.A) and the almost identical sites 3 (Figure 4.6.B) and 18
(Figure 4.6.C), prefer a hydrogen-bond acceptor ligand atom in AsnRS over the
hydrogen-bond donors preferred in the other three proteins. An acceptor atom at site 11
can accept from the main-chain nitrogen atom of AsnRS residue Gly 408, whereas a donor
atom can donate to the main-chain oxygen atom of Ile 361 (Table 4.4). On the other hand,
a hydrogen-bond donor at site 11 (Figure 4.6.A) can donate to the main-chain oxygen atom
of Leu 25 and/or the side-chain carboxyl oxygen atom of Glu 110 in phosphorylase kinase.
An acceptor atom at site 3 in Figure 4.6.B, identical to site 18 in Figure 4.6.C, can accept
from the side-chain terminal amine of AsnRS residue Arg 411, whereas a donor atom can
donate to the main-chain oxygen atom of Glu 360 (Table 4.4). On the other hand, a
hydrogen-bond donor at site 3 in dethiobiotin synthetase can donate to the side-chain
carboxyl oxygen atom of Glu 211, and at site 18 in actin can donate to the side-chain
carboxyl oxygen atom of Glu 214. While hydrogen-bond donors can interact with AsnRS
residues, the interactions made by an acceptor atom seem to be much stronger, as indicated
by the difference in DrugScore values (Table 4.4). It is also significant that these sites,
located near the 2’-OH of the ribose moiety in ASNAMS, are exposed to the solvent in both
phosphorylase kinase and dethiobiotin synthetase while they are buried in the binding site
in AsnRS. In fact, the 2’-OH groups in the bound ligands of phosphorylase kinase and
dethiobiotin synthetase interact with crystallographic water molecules, whereas in ASNAMS
the 2’-OH donates to the main-chain oxygen atom of AsnRS residue Ile 361 and accepts from
the main-chain nitrogen atom of Gly 408. The significant chemical difference site 29 in
Figure 4.6.B, almost identical to site 7 in Figure 4.6.C, near the 5’-OH of the ribose
moiety, indicates that AsnRS prefers a hydrogen-bond acceptor over a hydrophobic atom.
An acceptor atom at this site can accept from the side-chain hydroxyl group of AsnRS
residue Tyr 223 and/or the side-chain terminal amine of Arg 210, whereas a hydrophobic
atom can interact with side-chain carbon atoms of AsnRS residues His 225 and Arg 210
(Table 4.4). A hydrophobic ligand atom is preferred in AsnRS for the other two sites,
site 71 (Figure 4.6.A) and site 64 (Figure 4.6.B), over a hydrogen-bond donor atom in
phosphorylase kinase and an acceptor atom in dethiobiotin synthetase, respectively. The
hydrophobic atoms at sites 71 and 64, located near the C3’ atom of the ribose moiety and
the sulfamoyl group in ASNAMS, respectively, can interact with AsnRS residues Ile 361,
Val 362 and Gly 363, whereas no hydrogen-bond interactions could be made (Table 4.4).

4.4.3.3 Chemical difference sites in the amino-acid (asparagine) pocket

As expected for an AARS, the asparagine pocket in AsnRS is very specific for the
cognate amino acid. However, our method identified no significant differences in and
around this pocket. When the proteins are brought into the same reference
frame, the binding-site pockets of the other proteins in the vicinity of the AsnRS
asparagine pocket are more polar, as they bind the phosphate tail. However, it must be
noted that AsnRS does bind ATP, but in a bent conformation (Figure 4.5.A) such that the
β and γ phosphates of ATP bind and interact with residues located away from the asparagine
pocket in the AsnRS binding site. In terms of specificity determinants for AsnRS in this
pocket, no new information is revealed beyond the obvious: it is sterically highly
specific for asparagine. In fact, the steric difference sites (presented in the next
subsection) identified by our method reveal that the asparagine pocket may not even be
sterically accessible in the phosphorylase kinase structure.

4.4.4 Steric difference sites identified between AsnRS and phosphorylase kinase

We used our method to identify steric difference sites between AsnRS and protein
structures representing each of the other 3 subsets of the ATP set. However, we present
here the results for only one pair, AsnRS and phosphorylase kinase (representative
structure of the kinase subset), as it sufficiently demonstrates the ability of our method to
identify the true steric difference sites between a pair of protein structures. As mentioned
earlier, AsnRS and phosphorylase kinase bind ATP in completely opposite ways with
reference to the exocyclic amine N6, which is exposed to the solvent in the former while
it is buried in the binding site in the latter. Hence, it is easier to verify the steric difference
sites by looking at the surfaces of the two proteins around them. Also, the fact that we
check for van der Waals overlaps between the template points of one protein and the atoms
of the other protein structure self-validates these steric difference sites. The Connolly
solvent-accessible molecular surfaces of both AsnRS and phosphorylase kinase are shown in
the panels of Figure 4.7. Steric difference sites indicating the accessible space in AsnRS

 

 

 

 

Figure 4.7: The steric difference sites, showing accessible space in Brugia AsnRS
relative to phosphorylase kinase (PDB: 1PHK), a representative protein kinase, and vice
versa. The ligands (ASNAMS in A and B, ATP in C and D),
rendered as atom-colored tubes, bound in each of the structures are also shown for
reference. The steric difference sites are rendered as solid cyan spheres in all the panels.
The Connolly solvent-accessible molecular surface of AsnRS is colored grey while that
of phosphorylase kinase is colored yellow. The back of both surfaces appears black in
B and D. The accessible space in AsnRS (which binds adenine with the exocyclic N6 amine
exposed to the solvent) is depicted by steric difference sites in A. This same space is
inaccessible in phosphorylase kinase (which binds adenine with the exocyclic N6 amine buried
in the binding site) as shown in B, where the steric difference sites, accessible in AsnRS,
are occluded behind the kinase surface. Similarly, the accessible space in phosphorylase
kinase is depicted by steric difference sites in C. This same space is inaccessible in
AsnRS as shown in D, where the steric difference sites, accessible in the kinase, are
occluded behind the AsnRS surface.

 


(surface colored grey) are shown in Figure 4.7.A, while the same space is inaccessible in
phosphorylase kinase (surface colored yellow) as shown in Figure 4.7.B. Similarly, steric
difference sites indicating the accessible space in phosphorylase kinase are shown in
Figure 4.7.C, while the same space is inaccessible in AsnRS as shown in Figure 4.7.D.
It turns out that the asparagine pocket is sterically inaccessible in phosphorylase kinase,
explaining the lack of significant chemical difference sites in the same pocket between
the two proteins. The steric difference sites identified by our method reveal two important
things. First, the adenine moiety is efficiently packed in both binding sites, leaving
hardly any room for binding of additional ligand atoms. Of course, this observation
discounts any conformational changes that might alter the shape and accessible volume of
the two binding sites, allowing larger ligands to bind. Second, the ribose moiety in
AsnRS (and in other N6-out proteins as well) is buried and tightly packed against the
binding site, leaving no room for any larger ligand to bind in that pocket. In contrast, the
ribose moiety in phosphorylase kinase (and in other N6-in proteins as well) is exposed to the
solvent and hence offers more options for ligand modification to improve the selectivity
for the kinase. The asparagine pocket provides the best sterically accessible option for
ligand modification to improve selectivity for AsnRS.
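The van der Waals overlap check behind the steric difference sites can be sketched as below. This is an illustrative simplification rather than the thesis code: the function name `steric_difference_sites`, the single centre-to-centre `cutoff` of 2.5 Å, and the coordinates are all assumptions. The only element taken from the text is the idea that a template point of one protein which overlaps atoms of the other, superimposed protein marks space accessible in the first protein but not in the second.

```python
from math import dist


def steric_difference_sites(template_points, other_atoms, cutoff=2.5):
    """Return template points of one protein that clash with atoms of the other.

    `cutoff` is an assumed centre-to-centre distance (angstroms) below which
    a template point and a protein atom are treated as overlapping. Both
    coordinate sets are assumed to be in the same reference frame already.
    """
    return [
        point for point in template_points
        if any(dist(point, atom) < cutoff for atom in other_atoms)
    ]


# Hypothetical coordinates, for illustration only.
asnrs_template = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
kinase_atoms = [(0.5, 0.5, 0.0), (20.0, 0.0, 0.0)]

accessible_only_in_asnrs = steric_difference_sites(asnrs_template, kinase_atoms)
```

Swapping the two arguments (kinase template points against AsnRS atoms) gives the sites accessible only in the kinase, which corresponds to the two directions shown in panels A/B and C/D of Figure 4.7.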

4.5 Discussion

4.5.1 Relative signiﬁcance of chemical difference sites

We used DrugScore to score the chemical difference sites identified by our method in
order to assess their relative significance. An interaction site that is quantified as a
significant chemical difference in one protein need not be a significant difference in the
other protein. Although counterintuitive at first, this can be explained by considering the
contribution of specific interactions to the binding affinities of the same ligand bound to two
different proteins. For example, ligand RPR binds to both trypsin and factor Xa (Table
4.1, Figure 4.4.C) with the benzamidine group occupying the S1 pocket in both
binding sites and interacting with the residue Asp 189, present in both
proteins. However, the residue Ser 190 in trypsin is substituted by Ala 190 in factor Xa in
the S1 pocket, which is sensed by the chemical difference site 127 (Table 4.3, Figure
4.4.C) identified by our method. Similarly, the residue Phe 174 in trypsin occupies the
same volume as Gln 175 in factor Xa in the S4 pocket, which is sensed by the chemical
difference site 116 (Table 4.3, Figure 4.4.C) identified by our method. The interactions
made by the trypsin residue Ser 190 with the benzamidine group may contribute
significantly to its binding affinity but are not enough to confer selectivity relative to
factor Xa, and hence site 127 is not characterized as significant in trypsin. The 80-fold
difference in Ki for ligand RPR between trypsin and factor Xa has been experimentally
accounted for by the differences in the S1 and S4 pockets of the two binding sites (48),
and is in good agreement with the chemical difference sites identified by our method.

Similarly, the chemical difference sites identified in the ribose pocket are
quantified as significant in AsnRS but not in other ATP-binding proteins (Table 4.4,
Figure 4.6). In particular, site 71 (near C3’, Figure 4.6.A) and site 3 (near 2’-OH, Figure
4.6.B), with ΔDS values of -16,300 and -17,200, respectively (Table 4.4), show a very high
degree of preference for AsnRS relative to phosphorylase kinase and dethiobiotin
synthetase, respectively. The fact that the co-crystallized ligands bind to these proteins with
their ribose moieties either buried (AsnRS) or exposed to the solvent (phosphorylase kinase
and dethiobiotin synthetase) does have a bearing on whether a chemical difference site
located in these pockets is significant or not. Based on our results, we observe that a
chemical difference site identified in a pocket that is solvent exposed in one protein
binding site but buried in another is usually a significant specificity determinant for the
protein where it is buried.
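As a small worked example of how the tabulated values relate to the figures quoted above, the sketch below scales a subset of the Table 4.4 entries by 10^4 and ranks them. The data layout and the helper name `delta_ds` are assumptions for illustration, and the ranking reads a more negative ΔDS as a stronger AsnRS preference, which is how the values are discussed in this section rather than a formal definition.

```python
# (protein paired with AsnRS, site, preferred interaction, tabulated value in units of 10^4)
rows = [
    ("1PHK", 11, "A over D", -1.03),
    ("1PHK", 71, "C over D", -1.63),
    ("1A82", 3, "A over D", -1.72),
    ("1A82", 53, "C over A", -1.67),
    ("1A82", 64, "C over A", -1.39),
    ("1YAG", 7, "A over C", -1.58),
    ("1YAG", 18, "A over D", -1.72),
]


def delta_ds(tabulated, scale=10**4):
    """Convert a tabulated entry (x 10^4) into the full DrugScore difference."""
    return round(tabulated * scale)


# Most negative (strongest preference, under the stated reading) first.
ranked = sorted(rows, key=lambda row: row[3])

for pdb, site, preferred, value in ranked:
    print(f"site {site:>2} (vs {pdb}): {preferred}, ΔDS = {delta_ds(value):,}")
```

With this scaling, the entries for site 71 and site 3 give the -16,300 and -17,200 quoted in the paragraph above.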

4.5.2 Integrating our method into virtual screening protocol

The chemical as well as steric difference sites identified by our method could be directly
integrated into our structure-based virtual screening tool SLIDE. For docking a ligand,
represented by a set of interaction points, into the binding site of the target protein, all
possible triplets of its interaction points are mapped onto geometrically and chemically
compatible template triangles. Template points that are identified as significant chemical
difference sites by our method can be marked as key points, and any docking must then
include a match to at least one of these points. Labeling selected template points as key
points in SLIDE has yielded improved results in the past in docking and identifying both
known and new ligands of thrombin, glutathione S-transferase (GST), HIV-1
protease and AsnRS (17, 33). Results from our method could also be used as
pharmacophore constraints and/or filters, enabling a bias to be applied in other
structure-based virtual screening protocols like DOCK (54, 55), FlexX (56, 57) and
FRED (58). Other approaches for target-biased structure-based virtual screening were
reviewed by Jansen and co-workers (59).
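The key-point requirement described above can be illustrated with a toy triangle-matching routine. This is a sketch under stated assumptions, not SLIDE's actual algorithm or API: the function names (`compatible`, `matches`), the 0.5 Å side-length tolerance, the brute-force enumeration, and the coordinates are hypothetical. What it takes from the text is that ligand point triplets are mapped onto chemically and geometrically compatible template triangles, and that a mapping is kept only if it includes at least one key point.

```python
from itertools import combinations, permutations
from math import dist


def compatible(ligand_type, template_type):
    """A/D/C must match exactly; an N template point accepts A or D."""
    return ligand_type == template_type or (
        template_type == "N" and ligand_type in ("A", "D")
    )


def matches(ligand, template, key_ids, tol=0.5):
    """Yield (ligand triplet, template triplet) index tuples.

    ligand, template: lists of (type, (x, y, z)).
    key_ids: set of template indices marked as key points; every reported
    match must use at least one of them.
    tol: assumed tolerance (angstroms) on triangle side lengths.
    """
    sides = ((0, 1), (0, 2), (1, 2))
    for lig in combinations(range(len(ligand)), 3):
        for tmpl in permutations(range(len(template)), 3):
            if not key_ids.intersection(tmpl):
                continue  # no key point in this template triangle
            if not all(compatible(ligand[l][0], template[t][0])
                       for l, t in zip(lig, tmpl)):
                continue  # chemically incompatible
            if all(abs(dist(ligand[lig[a]][1], ligand[lig[b]][1])
                       - dist(template[tmpl[a]][1], template[tmpl[b]][1])) <= tol
                   for a, b in sides):
                yield lig, tmpl


# Hypothetical ligand and template points (a 3-4-5 triangle), for illustration.
ligand = [("A", (0.0, 0.0, 0.0)), ("C", (3.0, 0.0, 0.0)), ("D", (0.0, 4.0, 0.0))]
template = [
    ("A", (10.0, 0.0, 0.0)),
    ("C", (13.0, 0.0, 0.0)),
    ("D", (10.0, 4.0, 0.0)),
    ("C", (10.0, 0.0, 5.0)),
]

with_key_0 = list(matches(ligand, template, key_ids={0}))
with_key_3 = list(matches(ligand, template, key_ids={3}))
```

Marking template point 0 as the key point keeps the one compatible triangle, whereas marking point 3 rejects every mapping, showing how key points bias docking toward orientations that occupy a chosen difference site.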


4.5.3 Conformational ﬂexibility and chemical difference sites

For identifying chemical difference sites, the method is currently designed to work on a
pair of protein structures fed into the algorithm, without accounting for their
conformational flexibility. To circumvent this issue, one could use our method first to
identify the similar sites among the available structures of conformers of the same protein
(see AsnRS similar sites, Figure 4.5.A) and then compare the similar sites, representing
the binding site invariants, to identify chemical difference sites. When structures of
different conformers of the same protein are not available, sampling algorithms
could be used to generate low-energy conformers of the input protein structures. We have
earlier employed our graph-theoretic algorithm for identifying the flexible regions in a given
protein structure, ProFlex (60, 61), combined with our random-walk sampling algorithm,
ROCK (62, 63), to generate several low-energy conformers of various proteins including
cyclophilin A, estrogen receptor, dihydrofolate reductase, HIV protease and AsnRS (17,
62, 63).

4.5.4 Superpositional accuracy

All methods of this type need a reasonable superposition between the protein structures to
start with. We used ligand-based superposition to apply our method to the questions we
asked, and showed that the method performs robustly as long as the
superpositional shift between the two structures is under 1.5 Å (Figure 4.3). Protein
structure-based superposition methods like DALI (64) and MSDfold (65) could also be used to
bring the two structures into the same reference frame. However, the alignment of the
structures may be challenging when the proteins are not related and have low sequence
and/or structural homology.
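One simple way to check whether a superposition falls within the tolerance mentioned above is to measure the deviation between corresponding ligand atoms after superposition. The sketch below uses the root-mean-square deviation as a stand-in for the "superpositional shift"; that choice, the function names, and the coordinates are assumptions, since the exact measure used in Figure 4.3 is not restated in this section.

```python
from math import dist, sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return sqrt(total / len(coords_a))


def within_tolerance(coords_a, coords_b, max_shift=1.5):
    """True if the superposed atoms deviate by no more than `max_shift` angstroms."""
    return rmsd(coords_a, coords_b) <= max_shift


# Hypothetical adenine atom positions in two superimposed structures.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
small_shift = [(x + 0.3, y, z) for x, y, z in reference]
large_shift = [(x + 2.0, y, z) for x, y, z in reference]

ok = within_tolerance(reference, small_shift)
too_far = within_tolerance(reference, large_shift)
```

A check of this kind could be run on the superimposed adenine moieties before clustering, so that structure pairs with a poor superposition are flagged rather than silently compared.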

4.6 Conclusions

Using complete-linkage clustering of superimposed templates, generated by SLIDE to
represent the protein binding sites, we developed a method to identify binding site
invariants and specificity determinants between proteins. We applied this method to two
sets of proteins, assembled to address different questions. The significant chemical
difference sites identified by our method were able to explain the experimentally
observed selectivity of ligands bound to the proteins of the AffinDB set. For proteins of
the ATP set, we used our method to identify chemically similar sites, chemical difference
sites and steric difference sites. Given the high density of similar sites identified in the
adenine pocket, a productive strategy for AsnRS inhibitor design would be to exploit the
significant chemical difference sites and steric difference sites identified in the ribose and
asparagine pockets of the AsnRS binding site, respectively. The results from this method
could easily be integrated into our structure-based virtual screening protocol, which could
then be used to screen for selective ligands that occupy the significant chemical
difference sites in AsnRS.


References

(1) Carr, R., and Jhoti, H. (2002) Structure-based screening of low-affinity
compounds. Drug Discov Today 7, 522-7.

(2) Lyne, P. D. (2002) Structure-based virtual screening: an overview. Drug Discov
Today 7, 1047-55.

(3) Stahura, F. L., and Bajorath, J. (2004) Virtual screening methods that complement
HTS. Comb Chem High Throughput Screen 7, 259-69.

(4) Ghosh, S., Nie, A., An, J., and Huang, Z. (2006) Structure-based virtual screening
of chemical libraries for drug discovery. Curr Opin Chem Biol 10, 194-202.

(5) Seifert, M. H., Kraus, J., and Kramer, B. (2007) Virtual high-throughput
screening of molecular databases. Curr Opin Drug Discov Devel 10, 298-307.

(6) Verkhivker, G. M., Bouzida, D., Gehlhaar, D. K., Rejto, P. A., Arthurs, S.,
Colson, A. B., Freer, S. T., Larson, V., Luty, B. A., Marrone, T., and Rose, P. W.
(2000) Deciphering common failures in molecular docking of ligand-protein
complexes. J Comput Aided Mol Des 14, 731-51.

(7) Gohlke, H., and Klebe, G. (2002) Approaches to the description and prediction of
the binding affinity of small-molecule ligands to macromolecular receptors.
Angew Chem Int Ed Engl 41, 2644-76.

(8) Coupez, B., and Lewis, R. A. (2006) Docking and scoring-theoretically easy,
practically impossible? Curr Med Chem 13, 2995-3003.

(9) Alvarez, J. C. (2004) High-throughput docking as a source of novel drug leads.
Curr Opin Chem Biol 8, 365-70.

(10) Cavasotto, C. N., and Orry, A. J. (2007) Ligand docking and structure-based
virtual screening in drug discovery. Curr Top Med Chem 7, 1006-14.

(11) Manning, G., Whyte, D. B., Martinez, R., Hunter, T., and Sudarsanam, S. (2002)
The protein kinase complement of the human genome. Science 298, 1912-34.


(12) Cohen, P. (2002) Protein kinases--the major drug targets of the twenty-first
century? Nat Rev Drug Discov 1, 309-15.

(13) Fabian, M. A., Biggs, W. H., 3rd, Treiber, D. K., Atteridge, C. E., Azimioara, M.
D., Benedetti, M. G., Carter, T. A., Ciceri, P., Edeen, P. T., Floyd, M., Ford, J.
M., Galvin, M., Gerlach, J. L., Grotzfeld, R. M., Herrgard, S., Insko, D. E., Insko,
M. A., Lai, A. G., Lelias, J. M., Mehta, S. A., Milanov, Z. V., Velasco, A. M.,
Wodicka, L. M., Patel, H. K., Zarrinkar, P. P., and Lockhart, D. J. (2005) A small
molecule-kinase interaction map for clinical kinase inhibitors. Nat Biotechnol 23,
329-36.

(14) Walker, B., and Lynas, J. F. (2001) Strategies for the inhibition of serine
proteases. Cell Mol Life Sci 58, 596-624.

(15) Coghlan, M. J., Elmore, S. W., Kym, P. R., and Kort, M. E. (2003) The pursuit of
differentiated ligands for the glucocorticoid receptor. Curr Top Med Chem 3,
1617-35.

(16) Matter, H., and Schudok, M. (2004) Recent advances in the design of matrix
metalloprotease inhibitors. Curr Opin Drug Discov Devel 7, 513-35.

(17) Sukuru, S. C., Crepin, T., Milev, Y., Marsh, L. G., Hill, J. B., Anderson, R. J.,
Morris, J. C., Rohatgi, A., O'Mahony, G., Grotli, M., Danel, F., Page, M. G.,
Hartlein, M., Cusack, S., Kron, M. A., and Kuhn, L. A. (2006) Discovering new
classes of Brugia malayi asparaginyl-tRNA synthetase inhibitors and relating
specificity to conformational change. J Comput Aided Mol Des 20, 159-78.

(18) Bettayeb, K., Tirado, O. M., Marionneau-Lambot, S., Ferandin, Y., Lozach, O.,
Morris, J. C., Mateo-Lozano, S., Drueckes, P., Schachtele, C., Kubbutat, M. H.,
Liger, F., Marquet, B., Joseph, B., Echalier, A., Endicott, J. A., Notario, V., and
Meijer, L. (2007) Meriolins, a new class of cell death inducing kinase inhibitors
with enhanced selectivity for cyclin-dependent kinases. Cancer Res 67, 8325-34.

(19) Goodford, P. J. (1985) A computational procedure for determining energetically
favorable binding sites on biologically important macromolecules. J Med Chem
28, 849-57.

(20) Wade, R. C., Clark, K. J., and Goodford, P. J. (1993) Further development of
hydrogen bond functions for use in determining energetically favorable binding
sites on molecules of known structure. 1. Ligand probe groups with the ability to
form two hydrogen bonds. J Med Chem 36, 140-7.


(21) Kastenholz, M. A., Pastor, M., Cruciani, G., Haaksma, E. E., and Fox, T. (2000)
GRID/CPCA: a new computational tool to design selective ligands. J Med Chem
43, 3033-44.

(22) Braiuca, P., Cruciani, G., Ebert, C., Gardossi, L., and Linda, P. (2004) An
innovative application of the "flexible" GRID/PCA computational method: study
of differences in selectivity between PGAs from Escherichia coli and a
Providentia rettgeri mutant. Biotechnol Prog 20, 1025-31.

(23) Sheridan, R. P., Holloway, M. K., McGaughey, G., Mosley, R. T., and Singh, S.
B. (2002) A simple method for visualizing the differences between related
receptor sites. J Mol Graph Model 21, 217-25.

(24) Sheridan, R. P., Nachbar, R. B., and Bush, B. L. (1994) Extending the trend
vector: the trend matrix and sample-based partial least squares. J Comput Aided
Mol Des 8, 323-40.

(25) Miller, M. D., Kearsley, S. K., Underwood, D. J., and Sheridan, R. P. (1994)
FLOG: a system to select 'quasi-flexible' ligands complementary to a receptor of
known three-dimensional structure. J Comput Aided Mol Des 8, 153-74.

(26) Deng, Z., Chuaqui, C., and Singh, J. (2004) Structural interaction fingerprint
(SIFt): a novel method for analyzing three-dimensional protein-ligand binding
interactions. J Med Chem 47, 337-44.

(27) Deng, Z., Chuaqui, C., and Singh, J. (2006) Knowledge-based design of target-
focused libraries using protein-ligand interaction constraints. J Med Chem 49,
490-500.

(28) Perola, E. (2006) Minimizing false positives in kinase virtual screens. Proteins
64, 422-35.

(29) Ortiz, A. R., Gomez-Puertas, P., Leo-Macias, A., Lopez-Romero, P.,
Lopez-Vinas, E., Morreale, A., Murcia, M., and Wang, K. (2006) Computational
approaches to model ligand selectivity in drug design. Curr Top Med Chem 6,
41-55.

(30) Schnecke, V., Swanson, C. A., Getzoff, E. D., Tainer, J. A., and Kuhn, L. A.
(1998) Screening a peptidyl database for potential ligands to proteins with
side-chain flexibility. Proteins 33, 74-87.


(31) Schnecke, V., and Kuhn, L. A. (1999) Database screening for HIV protease
ligands: the influence of binding-site conformation and representation on ligand
selectivity. Proc Int Conf Intell Syst Mol Biol, 242-51.

(32) Schnecke, V., and Kuhn, L. A. (2000) Virtual screening with solvation and
ligand-induced complementarity. Perspect Drug Discov 20, 171-190.

(33) Zavodszky, M. I., Sanschagrin, P. C., Korde, R. S., and Kuhn, L. A. (2002)
Distilling the essential features of a protein surface for improving protein-ligand
docking, scoring, and virtual screening. J Comput Aided Mol Des 16, 883-902.

(34) Zavodszky, M. I., and Kuhn, L. A. (2005) Side-chain flexibility in protein-ligand
binding: the minimal rotation hypothesis. Protein Sci 14, 1104-14.

(35) Sanschagrin, P. C., and Kuhn, L. A. (1998) Cluster analysis of consensus water
sites in thrombin and trypsin shows conservation between serine proteases and
contributions to ligand specificity. Protein Sci 7, 2054-64.

(36) Gohlke, H., Hendlich, M., and Klebe, G. (2000) Knowledge-based scoring
function to predict protein-ligand interactions. J Mol Biol 295, 337-56.

(37) Wallace, A. C., Laskowski, R. A., and Thornton, J. M. (1995) LIGPLOT: a
program to generate schematic diagrams of protein-ligand interactions. Protein
Eng 8, 127-34.

(38) Block, P., Sotriffer, C. A., Dramburg, I., and Klebe, G. (2006) AffinDB: a freely
accessible database of affinities for protein-ligand complexes from the PDB.
Nucleic Acids Res 34, D522-6.

(39) McDonald, I., and Thornton, J. M.

(40) Ippolito, J. A., Alexander, R. S., and Christianson, D. W. (1990) Hydrogen-Bond
Stereochemistry In Protein-Structure And Function. J Mol Biol 215, 457-471.

(41) Kuhn, L. A., Swanson, C. A., Pique, M. E., Tainer, J. A., and Getzoff, E. D.
(1995) Atomic and residue hydrophilicity in the context of folded protein
structures. Proteins 23, 536-47.


(42) Carlson, H. A., Masukawa, K. M., Rubins, K., Bushman, F. D., Jorgensen, W. L.,
Lins, R. D., Briggs, J. M., and McCammon, J. A. (2000) Developing a dynamic
pharmacophore model for HIV-1 integrase. J Med Chem 43, 2100-2114.

(43) Ramensky, V., Sobol, A., Zaitseva, N., Rubinov, A., and Zosimov, V. (2007) A
novel approach to local similarity of protein binding sites substantially improves
computational drug design results. Proteins 69, 349-357.

(44) Chene, P. (2002) ATPases as drug targets: learning from their structure. Nat Rev
Drug Discov 1, 665-73.

(45) Vieth, M., Higgs, R. E., Robertson, D. H., Shapiro, M., Gragg, E. A., and
Hemmerle, H. (2004) Kinomics-structural biology and chemogenomics of kinase
inhibitors and targets. Biochim Biophys Acta 1697, 243-57.

(46) Kuttner, Y. Y., Sobolev, V., Raskind, A., and Edelman, M. (2003) A
consensus-binding structure for adenine at the atomic level permits searching for the
ligand site in a wide spectrum of adenine-containing complexes. Proteins 52, 400-11.

(47) Katz, B. A., Mackman, R., Luong, C., Radika, K., Martelli, A., Sprengeler, P. A.,
Wang, J., Chan, H., and Wong, L. (2000) Structural basis for selectivity of a small
molecule, S1-binding, submicromolar inhibitor of urokinase-type plasminogen
activator. Chem Biol 7, 299-312.

(48) Maignan, S., Guilloteau, J. P., Pouzieux, S., Choi-Sledeski, Y. M., Becker, M. R.,
Klein, S. I., Ewing, W. R., Pauls, H. W., Spada, A. P., and Mikol, V. (2000)
Crystal structures of human factor Xa complexed with potent inhibitors. J Med
Chem 43, 3226-32.

(49) Rauh, D., Klebe, G., and Stubbs, M. T. (2004) Understanding protein-ligand
interactions: the price of protein flexibility. J Mol Biol 335, 1325-41.

(50) Moodie, S. L., Mitchell, J. B., and Thornton, J. M. (1996) Protein recognition of
adenylate: an example of a fuzzy recognition template. J Mol Biol 263, 486-500.

(51) Mao, L., Wang, Y., Liu, Y., and Hu, X. (2004) Molecular determinants for ATP-
binding in proteins: a data mining and quantum chemical analysis. J Mol Biol
336, 787-807.

(52) Pankiewicz, K. W., Zatorski, A., and Watanabe, K. A. (1996) NAD-analogues as
potential anticancer agents: conformational restrictions as basis for selectivity.
Acta Biochim Pol 43, 183-93.

(53) Jacobson, K. A. (2001) Probing adenosine and P2 receptors: Design of novel
purines and nonpurines as selective ligands. Drug Develop Res 52, 178-186.

(54) Shoichet, B. K., and Kuntz, I. D. (1993) Matching chemistry and shape in
molecular docking. Protein Eng 6, 723-32.

(55) Good, A. C., Cheney, D. L., Sitkoff, D. F., Tokarski, J. S., Stouch, T. R.,
Bassolino, D. A., Krystek, S. R., Li, Y., Mason, J. S., and Perkins, T. D. (2003)
Analysis and optimization of structure-based virtual screening protocols. 2.
Examination of docked ligand orientation sampling methodology: mapping a
pharmacophore for success. J Mol Graph Model 22, 31-40.

(56) Gruneberg, S., Stubbs, M. T., and Klebe, G. (2002) Successful virtual screening
for novel inhibitors of human carbonic anhydrase: strategy and experimental
confirmation. J Med Chem 45, 3588-602.

(57) Hindle, S. A., Rarey, M., Buning, C., and Lengauer, T. (2002) Flexible docking
under pharmacophore type constraints. J Comput Aided Mol Des 16, 129-49.

(58) Schulz-Gasch, T., and Stahl, M. (2003) Binding site characteristics in structure-
based virtual screening: evaluation of current docking tools. J Mol Model 9, 47-
57.

(59) Jansen, J. M., and Martin, E. J. (2004) Target-biased scoring approaches and
expert systems in structure-based virtual screening. Curr Opin Chem Biol 8, 359-
64.

(60) Jacobs, D. J., Rader, A. J., Kuhn, L. A., and Thorpe, M. F. (2001) Protein
flexibility predictions using graph theory. Proteins 44, 150-65.

(61) Rader, A. J., Hespenheide, B. M., Kuhn, L. A., and Thorpe, M. F. (2002) Protein
unfolding: rigidity lost. Proc Natl Acad Sci U S A 99, 3540-5.

(62) Lei, M., Zavodszky, M. I., Kuhn, L. A., and Thorpe, M. F. (2004) Sampling
protein conformations and pathways. J Comput Chem 25, 1133-48.

(63) Zavodszky, M. I., Lei, M., Thorpe, M. F., Day, A. R., and Kuhn, L. A. (2004)
Modeling correlated main-chain motions in proteins for flexible molecular
recognition. Proteins 57, 243-61.

(64) Holm, L., and Sander, C. (1993) Protein-Structure Comparison by Alignment of
Distance Matrices. J Mol Biol 233, 123-138.

(65) Krissinel, E., and Henrick, K. (2004) Secondary-structure matching (SSM), a new
tool for fast protein structure alignment in three dimensions. Acta Crystallogr D
60, 2256-2268.


Chapter 5

Summary and future directions

5.1 Virtual screening for aminoacyl-tRNA synthetase inhibitors

5.1.1 Summary and perspective

In chapter 2, the discovery of seven new classes of Brugia AsnRS inhibitors using
structure-based virtual screening is presented. This is the first reported example of tRNA
synthetase inhibitors being discovered by protein structure-based screening methods. A
survey of the recently published database of known aminoacyl-tRNA synthetase (AARS)
inhibitors (1) reveals that the majority of them are bacterial class I AARS (e.g., IleRS and
MetRS) inhibitors and very few are class II AARS (e.g., AsnRS and ProRS) inhibitors.
Most of the potent class I AARS inhibitor scaffolds are natural products discovered in
experimental screening (2-4).

A combination of an empirical scoring function (SLIDE score) and a knowledge-based
scoring function (DrugScore) reliably assesses the correct conformation, binding mode,
and relative affinity of known AsnRS ligands, and distinguishes them from a pool of
1000 decoy molecules. This scoring protocol also guided us in selecting 45 compounds
for experimental assays, of which 7 were confirmed as inhibitors. In our experience,
proper representation of the binding site and ligand conformers close to the bioactive,
bound conformation are important factors for scoring functions to perform well on a
given system.
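
This summary does not spell out how the two scores are combined, so the following is only a minimal sketch of one common option, a rank-sum consensus, and not necessarily the protocol used in chapter 2. The function names, the toy scores, and the convention that a lower score is better for both functions are all assumptions made for illustration. The hit rate uses the figures quoted above (7 confirmed inhibitors out of 45 compounds assayed).

```python
def consensus_rank(slide_scores, drug_scores):
    """Rank compounds by the sum of their ranks under two scoring functions.

    Both inputs map compound name -> score. ASSUMPTION: for both functions a
    lower score means a better predicted complex.
    """
    def ranks(scores):
        ordered = sorted(scores, key=scores.get)  # best (lowest) score first
        return {name: pos for pos, name in enumerate(ordered, start=1)}

    slide_rank = ranks(slide_scores)
    drug_rank = ranks(drug_scores)
    shared = set(slide_rank) & set(drug_rank)
    # Sort by rank sum; break ties by name so the output is deterministic.
    return sorted(shared, key=lambda n: (slide_rank[n] + drug_rank[n], n))


def hit_rate(confirmed, tested):
    """Fraction of experimentally tested compounds confirmed as inhibitors."""
    return confirmed / tested


# Toy scores, invented for illustration only.
slide = {"ligand_A": -42.0, "decoy_1": -30.5, "decoy_2": -35.0}
drug = {"ligand_A": -310.0, "decoy_1": -250.0, "decoy_2": -200.0}

print(consensus_rank(slide, drug))   # known ligand should rank first
print(f"{hit_rate(7, 45):.1%}")      # hit rate of the experimental assays
```

A rank-based combination avoids having to put the two scores on a common scale, which is why it is used in this sketch.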

The selectivity of long side chain variolins for Brugia relative to human AsnRS
was explained using a ROCK-generated open conformation of a flexible active-site loop
of Brugia AsnRS, coupled with a sequence substitution at the base of the loop. These
results open new possibilities for gaining specificity between close homologs by
considering conformational differences between active-site loops, rather than only
residue differences in the static parts of binding pockets.

5.1.2 Future directions

Brugia AsnRS inhibitors discovered by SLIDE were all docked in the adenosyl pocket of
the binding site. AsnRS is highly specific for binding asparagine in its aminoacyl pocket,
as is generally true for all AARS and their cognate amino acids. A productive strategy for
AsnRS inhibitor design is to link the sulfamoyl-asparagine group to the promising
inhibitor scaffolds that bind in the adenosyl pocket. Chapter 3 describes the design of
analogs of two of the most promising Brugia AsnRS inhibitors by employing this
strategy. However, given the high active-site sequence homology between Brugia and
human AsnRS, there is a need to identify alternative binding pockets in Brugia
AsnRS that could be used for screening and design of new inhibitors.


Assisted by the binding site comparison tool described in chapter 4, our
laboratory identified a pocket in Brugia AsnRS adjoining the binding site occupied by the
co-crystallized ligand ASNAMS. The ZINC database (5), containing more than a million
commercially available compounds for virtual screening, was screened by my colleague
Anjali Rohatgi to find inhibitors that may dock in the new binding pocket. It will be
interesting to find compounds, predicted to bind away from the known binding site,
that inhibit the enzyme.

Our collaborator Prof. Michael Kron and co-workers have shown that Brugia
AsnRS could contribute to the acute host inflammatory response against the
filarial parasite by activating human chemokine receptors CXCR1 and CXCR2 (6). To
elucidate the structural and/or sequence determinants that confer chemokine activity on
Brugia AsnRS, it was compared with the interleukin IL8, a representative chemokine that
binds to both CXCR1 and CXCR2 with high affinity. Preliminary analysis of the results
showed two short (three residues long) sequence stretches in Brugia AsnRS, far from its
binding site, that were most similar to human IL8. Further experiments are required to
test whether these sequences play a crucial role in conferring the chemokine activity of
Brugia AsnRS. The Brugia AsnRS pocket containing residues interacting with
chemokine receptors, once confirmed, could be an additional target for virtual screening
to identify potential ligands that block the enzyme's interactions with chemokine
receptors.

A structurally distinct editing site for proofreading has been reported in many
AARS (7). However, for class II AARS, of which AsnRS is a member, the editing site
has been identified for only AlaRS (8), ThrRS (9) and ProRS (10). Elucidation of an
editing site in Brugia AsnRS will provide an additional binding site for virtual screening
to identify potential ligands that can inhibit the proofreading activity of the enzyme.

5.2 Using specificity determinants in virtual screening

5.2.1 Summary and perspective

Structure-based virtual screening has been successful in identifying ligands with good
shape and chemical complementarity to a protein target. However, it is challenging to
find ligands that are specific to one protein relative to another in a fast, automated way. In
chapter 4, a new method is described that performs automated shape and chemistry
comparison to identify binding site invariants and specificity determinants. The
results from this method could be integrated into our structure-based virtual screening
protocol to selectively screen for ligands that match the specificity determinants of a
target protein.

To identify the specificity determinants between Brugia AsnRS and other ATP-
binding proteins, their binding sites were compared using the new method. The results
are not only useful in aiding structure-based drug design efforts but also
elucidate novel differences between the binding sites of ATP-binding proteins. The
adenine pockets of these proteins are very similar, as shown by the high density of similar
sites identified by the method. However, there are key chemical and steric difference sites
in their ribose and phosphate pockets that can be exploited in ligand design.


5.2.2 Future directions

The results obtained from comparing the binding sites will be most useful when they are
integrated with a structure-based screening protocol. Our screening and docking tool
SLIDE can use the specificity determinants identified by the method by labeling them as
key points in its protein binding site model, called a template. The docking of ligand
candidates must then include a match to at least one of these key template points. The
results obtained from our method could also be used as pharmacophore constraints and/or
filters to apply a bias in other virtual screening protocols.
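
The key-point requirement described above can be illustrated with a minimal sketch. This is not SLIDE's code or file format: the `TemplatePoint` and `Docking` classes, their fields, and the toy data are hypothetical names introduced only to show the filtering rule, namely that a docking is kept only if it matches at least one template point labeled as key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplatePoint:
    """One interaction point in a binding-site template (hypothetical model)."""
    point_id: int
    kind: str            # e.g. "donor", "acceptor", "hydrophobic"
    is_key: bool = False  # True for points labeled as specificity determinants


@dataclass(frozen=True)
class Docking:
    """A docked ligand pose and the ids of the template points it matched."""
    ligand: str
    matched_point_ids: frozenset


def passes_key_point_filter(docking, template):
    """Return True if the docking matches at least one key template point."""
    key_ids = {p.point_id for p in template if p.is_key}
    return bool(key_ids & docking.matched_point_ids)


# Toy template: points 2 and 3 stand in for specificity determinants.
template = [
    TemplatePoint(1, "acceptor"),
    TemplatePoint(2, "donor", is_key=True),
    TemplatePoint(3, "hydrophobic", is_key=True),
]
dockings = [
    Docking("cmpd_1", frozenset({1})),      # matches no key point
    Docking("cmpd_2", frozenset({1, 3})),   # matches key point 3
]

kept = [d.ligand for d in dockings if passes_key_point_filter(d, template)]
print(kept)
```

Note that with this rule a template containing no key points rejects every docking, so a real protocol would need to decide how to treat that case.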

The degree of preference of chemical difference sites, identified by our method
between two protein binding sites, was quantified using DrugScore, a knowledge-based
scoring function. Two interesting questions can be addressed by performing cross-docking
experiments of promising compounds to the binding sites of the two proteins:

1. Can a compound, docked by matching the specificity determinants in one protein
relative to the other, be docked at all to the binding site of the other?

2. If a compound can be docked to both proteins, do the predicted relative
binding affinities correlate with the quantified degree of preference?

Answers to these questions can contribute to our understanding of the essential
features in molecular recognition that are similar and/or different between two proteins.
They can also aid in further development of scoring functions that predict protein-ligand
complementarity, by deciphering the features that contribute most to the binding affinity.
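
As a rough illustration of how the two cross-docking questions could be tabulated, the sketch below is offered under stated assumptions and is not an analysis performed in this work. It assumes each compound has a docking score in the target protein, a docking score in the other protein (`None` if no docking was found), and a single per-compound preference value (for example, summed over the difference sites it matches). All names and numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def cross_docking_summary(results):
    """Summarize cross-docking results.

    `results` maps compound -> (score_target, score_other, preference).
    Question 1: which compounds dock only to the target protein?
    Question 2: for compounds docked to both, does the score difference
    (target minus other) correlate with the preference value?
    """
    target_only = [c for c, (t, o, _) in results.items()
                   if t is not None and o is None]
    both = [v for v in results.values() if v[0] is not None and v[1] is not None]
    deltas = [t - o for t, o, _ in both]
    prefs = [p for _, _, p in both]
    r = pearson(deltas, prefs) if len(both) >= 2 else None
    return target_only, r


# Toy data, invented for illustration (lower score assumed to be better).
results = {
    "cmpd_1": (-40.0, None, 3.0),
    "cmpd_2": (-38.0, -30.0, 2.0),
    "cmpd_3": (-35.0, -33.0, 1.0),
    "cmpd_4": (-36.0, -31.0, 1.5),
}
target_only, r = cross_docking_summary(results)
print(target_only, r)
```

With real data a rank correlation may be preferable to Pearson's coefficient, since docking scores are rarely linear in binding affinity.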

References

(1) Torchala, M., and Hoffmann, M. (2007) IA, database of known ligands of
aminoacyl-tRNA synthetases. J Comput Aided Mol Des 21, 523-5.

(2) Kim, S., Lee, S. W., Choi, E. C., and Choi, S. Y. (2003) Aminoacyl-tRNA
synthetases and their inhibitors as a novel family of antibiotics. Appl Microbiol
Biotechnol 61, 278-88.

(3) Pohlmann, J., and Brotz-Oesterhelt, H. (2004) New aminoacyl-tRNA synthetase
inhibitors as antibacterial agents. Curr Drug Targets Infect Disord 4, 261-72.

(4) Ochsner, U. A., Sun, X., Jarvis, T., Critchley, I., and Janjic, N. (2007) Aminoacyl-
tRNA synthetases: essential and still promising targets for new anti-infective
agents. Expert Opin Investig Drugs 16, 573-93.

(5) Irwin, J. J., and Shoichet, B. K. (2005) ZINC--a free database of commercially
available compounds for virtual screening. J Chem Inf Model 45, 177-82.

(6) Ramirez, B. L., Howard, O. M., Dong, H. F., Edamatsu, T., Gao, P., Hartlein, M.,
and Kron, M. (2006) Brugia malayi asparaginyl-transfer RNA synthetase induces
chemotaxis of human leukocytes and activates G-protein-coupled receptors
CXCR1 and CXCR2. J Infect Dis 193, 1164-71.

(7) Ibba, M., and Soll, D. (2000) Aminoacyl-tRNA synthesis. Annu Rev Biochem 69,
617-50.

(8) Beebe, K., Ribas De Pouplana, L., and Schimmel, P. (2003) Elucidation of tRNA-
dependent editing by a class II tRNA synthetase and significance for cell viability.
Embo J 22, 668-75.

(9) Beebe, K., Merriman, E., Ribas De Pouplana, L., and Schimmel, P. (2004) A
domain for editing by an archaebacterial tRNA synthetase. Proc Natl Acad Sci U
S A 101, 5958-63.

(10) Crepin, T., Yaremchuk, A., Tukalo, M., and Cusack, S. (2006) Structures of two
bacterial prolyl-tRNA synthetases with and without a cis-editing domain.
Structure 14, 1511-25.
