COMPUTATIONAL CHEMISTRY: INVESTIGATIONS OF PROTEIN-PROTEIN
INTERACTIONS AND POST-TRANSLATIONAL MODIFICATIONS TO PEPTIDES
By
Michael R. Jones

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
ChemistryâDoctor of Philosophy
2017

ABSTRACT
COMPUTATIONAL CHEMISTRY: INVESTIGATIONS OF PROTEIN-PROTEIN
INTERACTIONS AND POST-TRANSLATIONAL MODIFICATIONS TO PEPTIDES
By
Michael R. Jones
Computational chemistry plays a vital role in understanding chemical and physical
processes and has been useful in advancing the understanding of reactions in biology. Improper
signaling of the nuclear factor-ÎşB (NF-ÎşB) pathway plays a critical role in many inflammatory
disease states, including cancer, stroke, and viral infections. Aberrant regulation of this
pathway happens upon the signal-induced degradation of the inhibitor of ÎşB (IÎşB) proteins.
The activation of IÎşB kinase (IKK) subunit Î˛ (IKKÎ˛) or NF-ÎşB Inducing Kinase (NIK),
initiates this cascade of events. Understanding the structure-property relationships associated
with IKKÎ˛ and NIK is essential for the development of prevention strategies. Although the
signaling pathways are known, how the molecular mechanisms respond to changes in the
intracellular microenvironment (i.e., pH, ionic strength, temperature) remains elusive.
In this dissertation, computer simulation and modeling techniques were used investigate
two protein kinases complexed with either small molecule activators or inhibitors in the active,
inactive, and mutant states to correlate structure-property and structure-function relationships
as a function of intracellular ionic strength. Additionally, radical-induced protein
fragmentation pathways, as a result of reactions with reactive oxygen species, were
investigated to yield insight into the thermodynamic preference of the fragmentation
mechanisms. Analyses of the relationship between structure-activity and conformationalactivity indicate that the protein-protein interactions and the binding of small molecules are
sensitive to changes in the ionic strength and that there are several factors that influence the

selectivity of peptide backbone cleavage. As there are many computational approaches for
predicting physical and chemical properties, several methods were considered for the
predictions of protein-protein dissociation, protein backbone fragmentation, and partition
coefficients of drug-like molecules.

This dissertation is dedicated to family.

iv

ACKNOWLEDGEMENTS

I am truly grateful for all of those who have been supportive during my academic career.
Firstly, I would like to thank my research advisor and mentor, Professor Angela K. Wilson for her
unremitting support and guidance. As an advisor, Professor Wilson consistently pushed me to be
my best (propel through challenges), encouraged me to attend conferences and scientific meetings,
apply for awards and seize opportunities for research enhancement and professional development.
As a mentor, she fosters creativity in problem solving and I am very thankful for her patience
through this journey.
I acknowledge the training and support from:
ď§

Professor Maria C. Nagan for teaching me to be fearless of caveats in scientific research;
Professor Thomas R. Cundari for promoting out-of-the-box thinking and his support on
several projects;

ď§

Professor H. David Wohlers for inspiring me on innovative teaching;

ď§

Dr. Bernard R. Brooks, for training in software development and the opportunity to
experience the National Institute of Health;

ď§

Dr. Frank Pickard, III and Dr. Andy Simmonett for teaching me better practices in
programming; the National Institutes of Health Graduate Partnership Program for
providing financial support throughout my dissertation research;

ď§

Reata Pharmaceuticals for their support and Dr. Mike Visnick for continued collaboration;

ď§

the Ronald E. McNair Scholars Program at Truman State University, for the academic
counseling and continuing support.

v

I thank my committee members at Michigan State University Professor Kenneth M. Merz,
Jr., Professor Piotr Piecuch, and Professor Gary J. Blanchard for their time and guidance.
Additionally, I would like to acknowledge my former committee members at the University of
North Texas for their mentoring and encouragement: Professor Thomas R. Cundari, Professor
Martin Schwartz, and Professor Weston T. Borden.
The Wilson Research Group, collective for both current and past members, is characterized
by a team of diverse thinkers from different backgrounds with a variety of talents that act as a
family, sharing knowledge and promoting an environment of unconditional positive regard. I am
very grateful for the Wilson Research Group for limitless discussion and feedback, advice, and
exposure to amazing foods. Thank you Dr. Zainab H. A. Alsunaidi and Dr. Jiaqi Wang for sharing
workspace and sharing your lunches with me; Dr. Kameron Jorgensen, Dr. Marie Laury, Dr.
Andrew Mahler Dr. Cong Liu, Dr. Michael Drummond, Dr. George Schoendorff, Dr. Deborah
Penchoff, and Dr. Inga Ulusoy for helping me troubleshoot through problems. Many thanks to
Lucas Aebersold, Thomas Diaz, and Yigitcan Eken for great research discussions and for sharing
pizza with me during the late evenings in lab.
I extend my gratitude to my friends who filled in the roles of family, for their support and
encouragement through my academic and professional pursuits: Michael D., Edward A., Erica S.,
Stephanie M., Kelsey W., Jessi B., Thiky V., Alexis M., Darius T., Shawn G., Chelsea C. Gabino
M., Gail W., Jesseca S., Sherard L. Darrell M., Jose D., and Matthew C. I am grateful for my
fraternity, Sigma Lambda Beta International Fraternity Incorporated, for providing an extended
network of support.

vi

Special thanks to Anna Penchenina for unconditional love, motivation during the long
nights of studying at the coffee shops, and making sure I stayed healthy and focused during my
doctoral candidacy exams.
Little Brother, Grandmother, and Intermediate Relatives, thank you for being there.
Lastly, Mom and Dad, I thank you for everything. Although you were not able to see me
through the completion, I carried your teachings and trainings throughout. Because of you, I am.

vii

TABLE OF CONTENTS

LIST OF TABLESâŚ.âŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚx
LIST OF FIGURESâŚ.âŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚxi
CHAPTER 1

Introduction ........................................................................................................... 1

CHAPTER 2 Theory and Methods in Molecular Modeling ....................................................... 5
2.1 Equations of Motion ......................................................................................................... 5
2.2 Molecular Dynamics ........................................................................................................ 7
2.2.1
Considerations......................................................................................................... 10
2.2.2
Limitations .............................................................................................................. 13
2.3 Quantum Mechanics ....................................................................................................... 14
2.3.1
Ab initio Methods.................................................................................................... 16
2.3.2
Density Functional Theory ..................................................................................... 18
2.3.3
Basis Sets ................................................................................................................ 19
REFERENCESâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ.âŚâŚâŚ.21
CHAPTER 3 Molecular Dynamics Studies of the ProteinâProtein Interactions in Inhibitor of
ÎşB Kinase-Î˛âŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ.29
3.1 Introduction .................................................................................................................... 29
3.2 Computational Methods ................................................................................................. 32
3.3 Results ............................................................................................................................ 37
3.3.1
Docking is Aesthetic ............................................................................................... 37
3.3.2
Dimer Stability of the Crystal System .................................................................... 40
3.3.3
Coarse-Grained Simulations ................................................................................... 41
3.3.4
Atomistic Simulation of Dimers and Monomer...................................................... 42
3.3.5
Hydrogen Bonding Between Dimer Interfaces ....................................................... 45
3.3.6
Binding Free Energies............................................................................................. 49
3.4 Discussion ...................................................................................................................... 52
3.5 Conclusion...................................................................................................................... 54
REFERENCESâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ.âŚâŚâŚâŚâŚâŚâŚâŚ.56
CHAPTER 4 Impact of Intracellular Ionic Strength on Dimer Binding in the NF-kB Inducing
KinaseâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ...64
4.1 Introduction .................................................................................................................... 64
4.2 Material and Methods..................................................................................................... 68
4.3 Results and Discussion ................................................................................................... 70
4.3.1
Structural Fluctuation.............................................................................................. 70
4.3.2
Changes in Solvent Accessibility............................................................................ 73
4.3.3
Hydrogen-Bonding Analysis/Dimer vs Monomer .................................................. 75
4.3.4
Aggregation Propensity Sensitive to Ion Buffer ..................................................... 77
4.3.5
Comparisons Between the Dimer Binding Energies .............................................. 78
4.4 Conclusions .................................................................................................................... 79
REFERENCESâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ.âŚâŚâŚ.81
viii

CHAPTER 5 Selectivity in ROS-Induced Peptide Backbone Bond Cleavage ......................... 87
5.1 Introduction .................................................................................................................... 87
5.2 Methodology .................................................................................................................. 90
5.3 Results and Discussion ................................................................................................... 91
5.3.1
Pathway Favorability .............................................................................................. 91
5.3.2
Role of Structure on Bond Strength and Site Reactivity ........................................ 95
5.3.3
Insights into Pathway Preferences .......................................................................... 96
5.3.4
Evaluation of Methods ............................................................................................ 97
5.4 Conclusion...................................................................................................................... 99
REFERENCESâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ..âŚâŚ.101
CHAPTER 6

Partition Coefficients for the SAMPL5 Challenge using Transfer Free Energies
âŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ..âŚâŚâŚâŚâŚâŚ108
6.1 Introduction .................................................................................................................. 108
6.2 Methods ........................................................................................................................ 111
6.3 Results and Discussion ................................................................................................. 113
6.4 Conclusion.................................................................................................................... 121
REFERENCESâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ..âŚâŚ.123
CHAPTER 7 CONCLUDING REMARKS ............................................................................ 129
REFERENCESâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚâŚ..âŚâŚâŚâŚâŚâŚâŚ.133

ix

LIST OF TABLES

Table 3.1 Summary of modeled systems. ..................................................................................... 33
Table 3.2 Binding cavities near the activation segment. .............................................................. 38
Table 3.3 Summary of dimer assembly energetics, determined with PISA. ................................ 41
Table 3.4 Averaged thermodynamic data for the 10 ns atomistic simulation. ............................. 43
Table 3.5 Hydrogen bonding occupancies for the AB dimer interface. ....................................... 46
Table 3.6 Hydrogen bonding occupancies for the AD dimer interface. ....................................... 48
Table 3.7 Binding free energies of the IKKÎ˛ dimer.a ................................................................... 50
Table 4.1 Average thermodynamic parameters from simulations. ............................................... 70
Table 4.2 Summary of the dimer interface of NIK. ...................................................................... 74
Table 4.3 Hydrogen bonding analysis of the NIK dimer interface.a............................................. 76
Table 4.4 Binding free energies of the NIK dimer.a ..................................................................... 79
Table 5.1 Calculated reaction enthalpies for Pathway [A] and Pathway [B] at the CCSD(T) level
of theory. ............................................................................................................................... 92
Table 5.2 Calculated reaction enthalpies for Pathway [A] and Pathway [B]. .............................. 98
Table 5.3 Calculated reaction barriers for reactions [A1] and [B1]. ............................................ 99
Table 6.1 Comparison of predicted logP with experimental logD. ............................................ 114
Table 6.2 Overview of the results submitted to the SAMPL5 challenge.................................... 121

x

LIST OF FIGURES

Figure 3.1 Representation of the models. The crystal structure was solved with 8 identical
molecules (a) forming two âdimer of dimersâ tetrameric complex (b). Although a dimer is
defined as AB form (d), the alternative dimer AD (c) has significance for structure stability.
............................................................................................................................................... 34
Figure 3.2 Characterization of ATP/Mg2+ docking cavity. The ATP ligand is positioned in the
mouth of the activation loop (Top). The adenine head of ATP is pointed towards Asp166
and the phosphate tail is outside of the pocket. The binding pocket is outlined and described
by colored circles (purple and green) representing surrounding amino acids (Bottom); the
different outlines on the purple circles contrast the different types of polar side chains. Blue
and green arrows indicate sidechain and backbone acceptor-donors, respectively. The blue
spheres surrounding the phosphate tail represent the pocket exposure, whereas the lighterblue spheres highlight the receptorâs exposure. .................................................................... 35
Figure 3.3 Visualization of the predicted cavities. All alpha-spheres are shown for the four sites
(Top). The red and white spheres indicated hydrophilic and hydrophobic cavities,
respectively. Wireframe representations of the four sites are shown individually: Site 2
(middle left); Site 3 (middle right); Site 6 (bottom left); and Site 10 (bottom right). .......... 39
Figure 3.4 RMSD plots for the different assemblies using coarse-graining MD. Frames were
plotted for every 1 ps for a total of 50,000 frames (50 ns). .................................................. 42
Figure 3.5 The RMSD from the starting structure for the 10 ns simulation. The different RMSDs
of the monomer (a-b), AB dimer (c-d) and AD dimer (e-f) are colored accordingly (blue,
WT; red, S177/181E; green, WT+ATP; purple;S177/181pS). Plots A, C, and E represent
the RMSD of the entire structure whereas the plots B, D, and F represent only the activation
segment of the principle monomeric chain. Frames were plotted for every 2 ps for a total of
5,000 frames (10 ns). ............................................................................................................ 44
Figure 3.6 Comparison of the phosphorylated AB Dimer. Shown are the starting structure (top)
and the final structure (bottom) after 10 ns. .......................................................................... 51
Figure 3.7 Comparison of binding free energy trends with the HCT, OBC1, and OBC2 models.
............................................................................................................................................... 52
Figure 4.1 Root mean square deviations (RMSDs) obtained from the 215 ns simulation. (A) The
RMSDs were calculated from the starting structure of the inactive (WT) and
phosphomimetic mutant (S549D) states in a neutral solution (color coded in blue and green,
respectively) and a 100 mM NaCl solution ([Na+]; color coded in red and purple,
respectively). For comparison, plots B and C highlight the trends between the inactive and
the active states, whereas the impact of the ionic environment is highlighted in plots D and
E. The darker colored lines represent the moving average per 100 frames. ......................... 72

xi

Figure 4.2 Root mean square fluctuations (RMSF) of each protomer of NIK. The RMSFs over
the 215 ns were calculated per residue for each protomer (332-675) for the modeled inactive
states, WT and WT [Na+] (represented in blue and red, respectively), and active states,
S549D and S549D [Na+] (represented in green and purple, respectively)........................... 73
Figure 4.3 Solvent residing between the dimer interface. ............................................................ 77
Figure 4.4 Changes in the compactness of the crystal structure. .................................................. 78
Figure 4.5 Binding free energy trends with the HCT and OBC Generalized Born models. ........ 79
Figure 5.1 Proposed reaction pathways. The diamide (pathway [A]) and Îą-amidation (pathway
[B]) pathways were modeled in this study. The different sites of radicalization C1, C2, and
C3 represent CÎą1, CÎą2 and CÎą3, respectively. ..................................................................... 89
Figure 5.2 Modeled backbone conformations. (1) Represents the Î˛-strand or âsheet-likeâ
component of a Î˛-pleated-sheet. (2) The Î˛-turn structure. In both conformations, the first Îącarbon has an alkoxyl radical in place of a hydrogen at CÎą1. .............................................. 92
Figure 6.1 Structure of the 53 compounds investigated in the SAMPL5 challenge. The 13
molecules represent (left) correspond to the Batch 0 subset............................................... 112
Figure 6.2 Subset of molecules (Batch0) LogP: effect of increasing basis sets.The predicted logP
with respect to increasing basis set size (DZ, TZ, QZ) are represented by the red box, green
triangle, and purple âXâ, respectively. ................................................................................ 116

xii

CHAPTER 1 Introduction

Theoretical and computational chemistry play a vital role in understanding chemical and
physical processes at both qualitative and quantitative levels and has proven useful towards
advancements in research areas within chemistry, physics, materials science, biology, and
medicine. In terms of chemical physics and physical chemistry, advances in computing technology
have enabled theoretical predictions to yield insight about molecular properties and reactions,
connecting microscopic and macroscopic phenomena that take place across various time scales.
The development of next-generation simulation and modeling techniques have had a noteworthy
influence on advancing the understanding of biological phenomena; applications of these
theoretical approaches are often essential for resolving or uncovering phenomena not readily
observable or explorable by traditional experimental techniques.
A strength in computational modeling stems from the ability to commence investigations
from either ab initio (from the beginning) or a posteriori (from the latter)âa bottom-up or topdown approachâallowing theoretical predictions to provide useful insight, especially in an
absence of experimental knowledge. The foundations of theoretical chemistry lie within the
motions of atoms and molecules and are described using quantum, classical, and statistical
mechanics. Methods in quantum chemistry are used to obtain comprehensive detail about
structure, whereas methods in classical and statistical mechanics are employed to examine and
relate structural dynamics to function or the macroscopic observation. Despite the advancements
in high performance computing, computational chemistry methods face limitations as the
complexity of the system of study increases as this requires more computing resources, hence there

1

are many methods that have advantages over the other at the expense of adding additional
approximations.
For the study of biological species, where complex processes happen at the cellular level,
computation can play an important role. Mechanisms of the cell are a result of a series of molecular
recognition events of proteins with other moleculesâa many body problem. From a mesoscopic
perspective, the structure, function, and activity of proteins rely on the local environment and must
respond to perturbations to maintain equilibrium. Cellular dysfunctions occur when there is a fault
within intracellular mechanisms, which may happen as a result of homeostatic imbalances,
external signals, or errors within the genetic code. It is essential to understand the faults from not
only a macroscopic perspective, but also a microscopic perspective, and, it is here where
computation can provide important insight.
Critical to the onset and progression of disease states, it is essential to understand not only
the origin of the fault, but also the intermediates involved within the progression, like a chemical
reaction, as this allows the potential to discover or design strategies to control the course of the
disease. As electrostatic interactions mediate protein recognition of other molecules, including
RNA, DNA, ligands and other proteins, it is essential to discern how the structure and function of
key mechanisms, components, and proteins within the inflammatory pathway respond to
perturbations.
In this dissertation, theoretical methods, discussed in Chapter 2, were employed to examine
post-translational modifications to proteins and how these alter protein structure and function. NFÎşB (nuclear factor kappa-light-chain-enhancer of activated B cells) proteins are involved in many
cellular and organismal processes, including immune and inflammatory responses, developmental
processes, cellular growth, and apoptosis. In most cells, NF-ÎşB is a latent, inactive complex in the

2

cytoplasm, but can rapidly enter the nucleus and activate gene expression via an abnormal signal
cascade once it receives any number of extracellular signals. Many cancers are induced by NF-ÎşB
signaling, including breast cancer, leukemia, lung cancer, pancreatic cancer, lymphoma, colon
cancer, prostate cancer, and ovarian cancer. Thus, control of NF-ÎşB activity is critical, and NF-B
activating protein kinases are of great interest as a target for cancer therapeutics.
In this work, two key enzymes associated with the canonical and non-canonical pathway
of the NF-ÎşB pathway are investigated. Activation of the inhibitor of ÎşB kinase subunit Î˛ (IKKÎ˛)
oligomer initiates a cascade that results in the translocation of NF-ÎşB transcription factors
activating the canonical pathway. Dimerization of IKKÎ˛ is required for its activation. In Chapter
4, coarse-grained and atomistic molecular dynamics simulations were employed to investigate the
conformation-activity and structure-activity relationships within the oligomer assembly of IKKÎ˛
that are impacted upon activation, mutation, and binding of ATP. The non-canonical pathway is
dependent upon the activation of NF-ÎşB Inducing Kinase (NIK). While the activation of the
canonical pathway is straightforward as a series of post-translational modifications, the activation
of the NIK differs as it is dependent on the enzyme stability. In Chapter 5, molecular dynamics
simulations were employed to determine how the ionic environment alters the structure-property
relationships and aid in the stabilization of NIK.
In cancer cells, high levels of reactive oxygen species (ROS) are prevalent and can result
in post-translational mechanisms of protein oxidation that yield selective side-chain and backbone
modifications including abstractions, donations, additions, substitutions, and fragmentation. From
a molecular perspective, prediction of the area of a protein most likely to be oxidized is of interest
as this may shed insight into mechanisms that follow. In Chapter 6, to characterize the selectivity
of radical-mediated fragmentation, quantum mechanical investigations using ab initio and density

3

functional methods were employed to evaluate site, conformation, and fragmentation pathway
trends of peptide models.
In Chapter 7, physiochemical properties of small drug-like molecules were predicted using
a series of ab initio approaches to assess the reliability of different methods that may serve in lieu
of when experimental data is unavailable. All of the research is summarized in Chapter 8 and future
directions and interests are highlighted.

4

CHAPTER 2 Theory and Methods in Molecular Modeling

2.1

Equations of Motion
Chemistry is the study of structure, composition, and properties of molecules and the

transformations they undergo. Understanding the underlying mechanisms that drive these
transformations lies within the concept of mechanics, specifically the motions of electrons and
nuclei. The properties of a many-body system can be expressed as a function of the position and
the rate of change of displacement.
In classical mechanics, the motion of a system of N particles can be described by Newtonâs
laws of motion (2.1). 1â3
đđ 2 đŤđŤđđ
đđđđ 2 = đđđđ (đŤđŤ1 , âŚ , đŤđŤN , đŤđŤĚ1 , âŚ , đŤđŤĚN , đĄđĄ)
đđđĄđĄ

(2.1)

For a general particle i of mass m, the force F depends on the position đŤđŤđđ = (đĽđĽđđ , đŚđŚđđ , đ§đ§đđ ), velocity
(đŤđŤĚ ), and time (t). * This relationship of Newtonâs second law is also expressed as the change in
momentum (p) with respect to time (t).
đđđđ =

đđ
đđ
đđđđ đŤđŤĚđđ = đŠđŠđđ
đđđđ
đđđđ

đđđđ =

đđđđext

(2.2)

đđ

+ ďż˝ đđđđđđ
đđâ đđ,đđ=1

For a system of N particles, the motion of particle i is influenced by external forces outside the
system and internal forces that arise from the particle interacting with every other particle in the

*

Vector quantities are represented in the boldface notation. Total time derivatives are expressed using Newtonâs

notation, where a dot is placed over the dependent variable ( xĚ =

5

đđx
đđđđ

, xĚ =

đđ 2 x
đđđĄđĄ 2

).

system. Forces in action may do work and transfer energy. For isolated systems in which there are
no external forces (đđđđext = 0), the momentum and the total energy (E) are conserved. For
conservative forces, the change in energy of motion, or kinetic energy (T), is defined as
đđ

đđ
đđ = ďż˝ đđđđ â đŤđŤđ¤đ¤Ě
đđđđ
đđ=1

đđđđ = âđđđđ đđ

đđ = đđ(đŤđŤ1 , âŚ , đŤđŤđđ )

(2.3)

where the forces are derived from the gradient of a potential energy function V.
In an alternative formulation, Newtonâs laws can be expressed in the Lagrangian formula,

đżđż = đđ(đŞđŞ, đŞđŞĚ , đĄđĄ) â đđ(đŞđŞ, đŞđŞĚ , đĄđĄ)

(2.4)

where the Lagrangian (L) is defined as the difference between the kinetic energy and potential
energy and is expressed in generalized coordinates q and generalized velocity (đŞđŞĚ ). Lagrangian
mechanics differ from Newtonian mechanics because it uses generalized coordinates
Hamiltonâs equations (2.5) replaces the generalized velocity with generalized momenta
(p), in which the Hamiltonian (H) represents the total energy.
đđ

đťđť(đŞđŞ, đŠđŠ, đĄđĄ) âĄ ďż˝ đŠđŠđđ đŻđŻđđ (đŞđŞ, đŠđŠ, đĄđĄ) â đżđż(đŞđŞ, đŻđŻ(đŞđŞ, đŠđŠ, đĄđĄ), đĄđĄ)
đđ=1

(2.5)

These equations of motions are useful as they can be used in quantum mechanics, thermodynamics,
and statistical mechanics.

6

2.2

Molecular Dynamics
Molecular dynamics (MD) simulations and the atomic level detail that they provide have

enhanced our understanding of natural phenomena, allowing for enabling predictions about a
system of interacting particles as time evolves. While serving as a connection between microscopic
and macroscopic observations, MD simulations also act as a bridge between theory and experiment
by establishing a connection between structure and dynamics.4â6 Using theory to help interpret and
guide experiment, MD simulations are widely used in the study of hard and soft condensed matter
physics, chemical reactions, and biological systems. For biological systems, MD simulations not
only can provide insight into intermolecular interactions and structure, but can additionally
examine properties not readily discernable by traditional experimental or other structural biology
methods.7â9
Classical MD simulations are deterministic and employ Newtonâs equations of motion
(2.6) to propagate the positions and velocities of particles under the influence of a given, predefined potential (V) where F is the force on a particle i of mass m at configuration r with the
acceleration (đŤđŤĚ ) and is the gradient of the potential.4
đđđđ = đđđđ đŤđŤĚđđ = đđđđ

đđ2 đŤđŤđđ
= ââV(đŤđŤđđ )
đđđĄđĄ 2

(2.6)

Dynamics of the system are determined by integrating Newtonâs motion equation as a
function of time, t, and is done so numerically (as there is no analytical solution) with a time
stepping algorithm, such as the Verlet,10 Leap-Frog,11

Velocity-Verlet,12

or Beemanâs13

algorithms that differ by how the velocities are assigned. Trajectories of particle i at time t can be
determined using a Taylor series expansion for a finite time step (Î´t) of the forward and backward
steps (2.7).
7

1
đŤđŤ(đĄđĄ + đżđżđżđż) = đŤđŤ(đĄđĄ) + đŤđŤĚ (đĄđĄ)đżđżđżđż + đŤđŤĚ (đĄđĄ)đżđżđĄđĄ 2 + âŻ
2
1
đŤđŤ(đĄđĄ â đżđżđżđż) = đŤđŤ(đĄđĄ) â đŤđŤĚ (đĄđĄ)đżđżđżđż + đŤđŤĚ (đĄđĄ)đżđżđĄđĄ 2 â âŻ
2

(2.7)

Adding the two expansions together yields the Verlet algorithm (2.8), which only uses the
position and acceleration to predict the next step. Since it requires a previous stepâs position, the
velocity is not directly involved within the expression, which is needed for the kinetic energy. The
velocity can be estimated by the difference between the forward and backward steps.
đŤđŤ(đĄđĄ + đżđżđżđż) = 2đŤđŤ(đĄđĄ) â đŤđŤ(đĄđĄ â đżđżđżđż) + đŤđŤĚ (đĄđĄ)đżđżđĄđĄ 2 + âŻ
đŤđŤĚ (đĄđĄ) =

đŤđŤ(đĄđĄ + đżđżđżđż) â đŤđŤ(đĄđĄ â đżđżđżđż)
2đżđżđżđż

(2.8)

The Leap-Frog algorithm (2.9) is a modification of the Verlet algorithm, where the velocities are
predicted at each half step independently of the positions.
1
đŤđŤ(đĄđĄ + đżđżđĄđĄ) = đŤđŤ(đĄđĄ) + đŤđŤĚ ďż˝đĄđĄ + đżđżđżđżďż˝ đżđżđżđż
2

1
1
đŤđŤĚ ďż˝đĄđĄ + đżđżđżđżďż˝ = đŤđŤĚ ďż˝đĄđĄ â đżđżđżđżďż˝ + đŤđŤĚ (đĄđĄ)đżđżđżđż
2
2

(2.9)

The Velocity-Verlet algorithm (2.10) is a more precise numerical algorithm than the Leap-Frog
algorithm, as within its formulation the acceleration, velocities, and positions are predicted
simultaneously.
1
1
đŤđŤĚ ďż˝đĄđĄ + đżđżđżđżďż˝ = đŤđŤĚ (đĄđĄ) + ďż˝đŤđŤĚ (đĄđĄ) + đŤđŤĚ (đĄđĄ + đżđżđżđż)ďż˝đżđżđżđż
2
2

8

(2.10)

The Beemanâs algorithm is similar to the Velocity-Verlet algorithm, but has a more
complex expression for the Taylor expansion that is advantageous for infinitesimal time steps (see
Ref 13).
The motions and interactions between particles are modeled using an empirical potential
energy function (2.11), better known as a force field, which is represented as the sum of bonding
and nonbonding energies.14â16
2

2

đđ(đŤđŤđđ , âŚ , đŤđŤđđ ) = ďż˝ đžđžđđđđ ďż˝đđđđ â đđđđđđ ďż˝ + ďż˝ đžđžđđÎ¸ ďż˝đđđđ â đđđđđđ ďż˝ +
đđđđđđđđđđ

+

đđđđđđđđđđđđ

đđ
ďż˝ đžđžđđ (1 + cos(đđđđđđ â đżđżđđ ))

đĄđĄđĄđĄđĄđĄđĄđĄđĄđĄđĄđĄđĄđĄđĄđĄ

đ´đ´đđđđ đľđľđđđđ
đđđđ đđđđ
2
đđ
đžđžđđ ďż˝đđđđ â đđđđđđ ďż˝ + ďż˝ ďż˝ ďż˝ 12 â 6 ďż˝ + ďż˝ ďż˝
đđđđđđđđ
đđđđđđ
đđđđđđ
đđđđđđđđđđđđđđđđđđ
đđ đđâ đđ
đđ đđâ đđ
ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝
ďż˝

đđđđđđđđđđđđđđ

(2.11)
The first four terms of Eq. 2.11 represent the energy of bond stretching (b), angle bending
(Î), proper (Ď) and improper torsion (Ď). The potential utilizes the force constants, Kb, KÎ, KĎ, and
KĎ, for the bonds, angles, proper and improper torsions between atoms, respectively, exhibited at
a configuration and reference-equilibrium position to characterize the potential. Van der Waals
interactions between particles i and j are described by the Lennard-Jones potential separated by a
distance r, in which A and B are constants that control the interatomic distances (well-depth) for
the repulsive and attractive terms, respectively. The electrostatic interactions are defined by partial
charges, q, of the two particles in a medium, Îľ, described by the effective dielectric constant, Îľ.

Despite not being able to account for excited electronic states, quantum effects, or chemical

bond breakage breaking and formation, classical MD simulations give insight into many biological
problems of both chemical and physical interests. For example, biological macromolecules such

9

as proteins, nucleic acids, and lipids are naturally dynamic systems in which structure, function,
and activity are modulated by intrinsic forces.
While internal motions can be observed using spectroscopic methods like infrared
spectroscopy and nuclear magnetic resonance spectroscopy, such investigations can become
expensive and time intensive for complex systems. MD simulations clarify time-dependent
attributes and dynamic processes, allowing for an in-depth understanding and explanation of
biophysical behaviors and chemical properties, such as the free energy contribution to structural
rearrangements.17â19
MD studies can aid in defining the role of the motifs in protein-protein interactions and
protein-ligand recognition, which are capable of many different types of interactions including van
der Waals interactions through aliphatic chains, cation-Ď interactions with aromatics, hydrogen
bonding and purely electrostatic interactions with solvent.20 For instance, solvent effects, structure,
stacking interactions between aromatic residues and the strength of ion binding can all be
correlated to biological activity to elucidate mechanisms in protein recognition of other molecules.
2.2.1

Considerations
There are many parameters needed to carry out a simulation. As improper set up of a

simulation can generate trajectories that have no basis in reality, hence much detail should be taken
into account when employing MD simulations.21
For biomolecular systems, many force fields have been developed.16,22 Popular force fields,
such as OPLS,23 GROMOS,24 CHARMM,25 and AMBER,26 are parameterized based on a mixture
of empirical data and quantum mechanical calculations. Each force field differs by how a
parameter is defined per atom (atom-type). While harmonic potentials for bond stretching and
angle bending are typically fitted to experimental vibrational data (normal mode frequencies), the

10

force fields differ by how the in-plane and out-of-plane torsion and non-bonded terms have been
parameterized and the data sets to which the parameters have been fitted.
Partial atomic charges, although not a physically measurable property, have a critical
impact on interaction energies, hence effective partial charges impact the bonding terms within the
force field.27 For example, partial atomic charges in the CHARMM force field are parameterized
based upon Mulliken population analysis while the AMBER force field uses the Restrained
Electrostatic Potential (RESP)28 approach.
The environment of the solute is critical in the determination of the properties and reactivity
of biomolecular systems.29,30 For MD simulations, the solvent may be treated with an implicit
water model, such as the Generalized Born model,31,32 or explicitly using rigid or flexible water
molecules.33 Using an explicit water solvent is advantageous because the solute can relax and
interact with the solvent. Many empirical water models have been developed over the years to
reproduce bulk water properties,34,35 including the TIP3P36 (transferable intermolecular potential
3 point), several variants of TIP4P37â43 and TIP5P,36,44 the SPC45 (simple point charge) and the
SPC/E46 (extended simple point charge) water models.
Periodic boundary conditions (PBCs) are used to represent the simulation as a continuous
system.21 Under PBCs, if an atom were to leave the unit cell in one direction, its image would
reenter the cell on the other side. Imposing PBCs avoids surface artifacts that would arise from the
atomsâ unrealistic interactions with the unit cell surface.
Statistical mechanics is fundamental to MD simulations as it allows for macroscopic
properties (i.e. free energy) to be determined from properties of microstates.47,48 For a given
microstate (N) characterized at position r, momentum p, and time t, the instantaneous value of

11

property A will fluctuate over time as a result of the interacting particles (2.12). In experiment, the
value reflects an averaged value of the property over the time of the measurement.
With the assumption that all accessible microstates are attainable and equally probable, the
calculated value of A approaches the true average value of the property as time evolves. The
average value đ´đ´Ě (2.13) is expressed as a time average over infinite time. For a finite time period,
the average value is expressed as the ensemble average (2.14), in which the probability Ď of the
state is proportional to the total energy (E), where Boltzmannâs constant (kB) and temperature (T).

đ´đ´ďż˝đŠđŠđđ (đĄđĄ), đŤđŤ đđ (đĄđĄ)ďż˝
(2.12)
1 đđ
đ´đ´Ě = lim ďż˝ đ´đ´ďż˝đŠđŠđđ (đĄđĄ), đŤđŤ đđ (đĄđĄ)ďż˝ đđđđ
đđââ đĄđĄ đĄđĄ=0
(2.13)
â¨đ´đ´âŠ = ďż˝ đđđŠđŠđđ đđđŤđŤ đđ đ´đ´(đŠđŠ,đđ , đŤđŤ N )đđ(đŠđŠđđ , đŤđŤ đđ )

đđ

đđ(đŠđŠ , đŤđŤ

đđ )

â đđ

â

Eďż˝đŠđŠđđ ,đŤđŤ đđ ďż˝
kB T

(2.14)

Fundamental internal motions, such as molecular vibrations, rotations, and torsions occur
in the femtosecond, picosecond, and nanosecond timescales, respectively, while larger motions
12

such as allosteric transitions and large-scale conformational changes, are observed on much larger
time scales.49 In order to observe the molecular vibrational modes, the time step used in the
numerical integration scheme must be smaller than the period of the fastest oscillation. Constraint
algorithms, such as SETTLE,50 SHAKE,51 EEM,52 SHAPE,53 and LINCS,54 are used to freeze high
frequency motions that have minimal effects on the overall dynamics of the system by placing
algebraic constraints on bonds or angles to an equilibrium distance, allowing for larger time steps
to be used.
For example, the SHAKE algorithm51 can be computationally beneficial when a simulation
of a large system where a molecule is surrounded by bulk solvent (e.g. explicit water) because
bond stretching motions (e.g. O-H stretching) can be constrained. SHAKE is applied to the Verlet
algorithm by defining pairwise bond distance constraints between atoms and is used to calculate
the force constants iteratively, allowing the use of a larger time step.
2.2.2

Limitations

A major limitation to the computational modeling of the chemistry of large biological systems
using atomistic MD simulations is the inadequate sampling of conformational states that are
computationally feasible. In order to circumvent the time-scale limitation of atomistic MD
simulations, alternative potentials and protocols, such as coarse-grained and continuum modeling,
have been developed to offer a reduced representation that allows for more conformational state
sampling than atomistic simulations.55â57 For example, in a coarse-grained model for a protein,
the amino acid lysine can be reduced from 24 atoms to three pseudo atoms that represent the polar
backbone, the apolar hydrocarbons, and the charged amino group. Together, utilizing both
atomistic and coarse-grained simulations techniques, many rare events can be uncovered.

13

2.3

Quantum Mechanics
While classical mechanics is used to study large systems, containing many microscopic

particles, quantum mechanics explains the motion of electrons.58 In quantum mechanics, the state
of a system is described by a continuous wave function (Î¨) and can be used with an operator (đđďż˝)
to determine the expectation value (EO) which can correspond to physical observables, such as
position, momentum, or energy.
â¨EOâŠ =

âŤ Î¨ â đđďż˝Î¨ dĎ
âŤ Î¨ â Î¨ dĎ

(2.15)

The quantum mechanical Hamiltonian operator (2.16) is similar to the classical
Hamiltonian, where đđďż˝ is the kinetic energy operator that is dependent on the momentum and đđďż˝ is
the potential energy operator that is dependent on a position. Within the kinetic energy operator,

the momentum operator đđĚ is adapted to wave mechanics, where i is an imaginary unit equal to

ââ1, Ä§ is the reduced Planck constant, and m is mass.
đđďż˝ =

ďż˝ = đđďż˝ + đđďż˝
đťđť

(đđĚ )2 âÄ§2 2
=
â
2đđ
2đđ

đđĚ = đđÄ§

â
âđĄđĄ
(2.16)

The non-relativistic time-dependent SchrĂśdinger equation (2.17a) describes the dynamics of a
system given by the wave function, however for many applications in chemistry, the nonrelativistic time-independent SchrĂśdinger equation is used for obtaining the total energy of
chemical systems (2.17b).

14

â

â đđ
ďż˝ Î¨(đđ, đđ, đĄđĄ)
Î¨(đđ, đđ, đĄđĄ) = đťđť
đđ đđđđ

(2.17a)

ďż˝ Î¨(đđ, đđ) = đ¸đ¸Î¨(đđ, đđ)
đťđť

(2.17b)

The non-relativistic molecular Hamiltonian is expressed as the sum of kinetic energy and
potential energy operators for the electrons (N) and the nuclei (M), shown in eq. (2.18) where mA
is the mass of nucleus A, ZA is the atomic number of nucleus A, and R and r represent the position
of the nuclei (A and B) and electrons (i and j), respectively. This expression includes the kinetic
energy of the electrons (đđďż˝đđ ) and nuclei (đđďż˝đđ ), the columbic attraction between the nuclei and

electrons (đđďż˝đđđđ ), and the repulsion between nuclei (đđďż˝đđđđ ) and electrons (đđďż˝đđđđ ).
đđ

đđ

đđ

đđ

đđ

đđ

đđ

đđ

1
1
â2
đđđ´đ´
1
đđđ´đ´ đđđľđľ
ďż˝ = â ďż˝ â2i â ďż˝ đ´đ´ â ďż˝ ďż˝
đťđť
+ ďż˝ďż˝
+ďż˝ďż˝
|đđđ´đ´ â đđđđ |
|đđđ´đ´ â đđđľđľ |
2ďż˝ďż˝ďż˝ďż˝ďż˝ 2
đđđ´đ´ ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝
ďż˝đđđđ â đđđđ ďż˝
ďż˝ďż˝ďż˝ďż˝ďż˝
ďż˝ďż˝
ďż˝ďż˝ďż˝ďż˝ ďż˝ďż˝
ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝
đ´đ´=1
đ´đ´=1 đđ=1
đđ=1ďż˝đđ>đđ
đ´đ´=1
đľđľ>đ´đ´
đđ=1
ďż˝ďż˝ďż˝ďż˝
ďż˝ďż˝ďż˝ďż˝
đđďż˝đđ

đđďż˝đđ

ďż˝đđđđ
đđ

ďż˝đđđđ
đđ

ďż˝đđđđ
đđ

(2.18)

As the masses of the nuclei are much larger than the electrons and therefore the nuclei
move much slower than the electrons, the motion of the nuclei and the electrons can be separated.
This is the Born-Oppenheimer approximation and allows for the separation of the nuclear and
electronic components for the wave function, Î¨, as shown in eq. (2.19).
Î¨ = đđđđ â đđđđ

(2.19)

Within the Born-Oppenheimer approximation, the molecular Hamiltonian can be simplified to the
ďż˝đđđđ ) where đđďż˝đđ is zero and đđďż˝đđđđ is treated as a constant (2.20).59 The
electronic Hamiltonian (đťđť
electronic wave function includes all possible states for electrons for a nucleus.

15

đđ

đđ

đđ

đđ

đđ

1
đđđ´đ´
1
ďż˝đđđđ = â ďż˝ â2i â ďż˝ ďż˝
đťđť
+ ďż˝ďż˝
|đđđ´đ´ â đđđđ |
2ďż˝ďż˝ďż˝ďż˝ďż˝ ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝
ďż˝đđ â đđđđ ďż˝
ďż˝ďż˝
ďż˝ďż˝ďż˝ďż˝ ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝ďż˝
đ´đ´=1 đđ=1
đđ=1 đđ>đđ đđ
đđ=1
đđďż˝đđ

ďż˝đđđđ
đđ

ďż˝đđđđ
đđ

(2.20)
The Pauli-Exclusion Principle expresses that no two electrons can occupy the same
quantum state, meaning that the total wave function must be antisymmetric with respect to
exchange of the two electrons. This is accounted for by a linear combination of the possibilities
for N electrons with respect to the spin (xN) and spatial relations (ĎN) and is expressed with a Slater
determinant as shown in eq. (2.21).
Î¨ = đđ1 (đĽđĽ1 )đđ2 (đĽđĽ2 ) â đđ1 (đĽđĽ2 )đđ2 (đĽđĽ1 )

đđ1 (đĽđĽ1 ) đđ2 (đĽđĽ1 ) âŻ
đđ (đĽđĽ ) đđ2 (đĽđĽ2 ) âŻ
Î¨=
ďż˝ 1 2
âŽ
âŽ
âą
âđđ!
đđ1 (đĽđĽđđ ) đđ2 (đĽđĽđđ ) âŻ
1

2.3.1

đđđđ (đĽđĽ1 )
đđđđ (đĽđĽ2 )
ďż˝
âŽ
đđđđ (đĽđĽđđ )

(2.21)

Ab initio Methods
The Hartree-Fock (HF) approximation60 is one of the most commonly utilized electronic

structure methods for solving the electronic SchrĂśdinger equation by using a mean field
approximation for the two electron term đđďż˝đđđđ in which each electron is treated as interacting with

an effective average potential of the other electrons rather than an individual electron. The Fock
operator (đđĚ) is composed of the sum of the one electron terms (2.22), in which the first term

represents the kinetic energy of the electron, the second term represents the electron-nuclear
attraction, and the third term represents the HF potential, which approximates the two electron
term of the SchrĂśdinger equation.

16

đđ

đđđ´đ´
1
đťđťđťđť
đđĚđđ = â â2đđ â ďż˝ |đđ â đđ | + đŁđŁ (đđ)
2
đ´đ´
đđ
đđ

(2.22)

Although this method accounts for most of the electronic energy of a system, the HF
approximation does not fully describe the wave function because it neglects electron-electron
correlation effects for electrons of the opposite spin, thus cannot describe dynamic correlation (the
instantaneous repulsion of moving electrons) or static correlation (nearly degenerate electron
configurations)âwhich is significant for understanding chemical phenomena.6 Post-Hartree-Fock
methods have been developed to recover electron correlation using a single slater determinant,
including configuration interaction (CI),6,59 MĂ¸ller-Plesset (MPn) perturbation theory (e.g. MP2
and MP4),61â65 and coupled cluster (CC)66â69 methods.
Full CI includes all possible configurations or excitations of electrons (2.23), where ai is
the coefficient corresponding to unique single (S), double (D), triple (T), quadruple (Q), and i-th
level excitations, respectively, that are added to the HF reference (Î¨0 ).

Î¨CI = đđ0 ÎŚHF + ďż˝ đđS ÎŚS + ďż˝ đđD ÎŚD + ďż˝ đđT ÎŚT + ďż˝ đđQ ÎŚQ + âŻ = ďż˝ đđđđ ÎŚđđ
S

D

T

Q

đđ=0

(2.23)

When truncated, CI is still variational but is no longer size extensive, or linearly scaled with the
electron count and the theory is no longer exact. In CC methods, the wave function uses an
exponential form ansatz that contains connected excitation operators (đđďż˝) to approximate the true

wave function, in which each Nth excitation includes previous excitations (2.24).70 By
formulation, CC methods are size extensive, but no longer variational. Similar to CI methods,
increasing the number of excitations in CC yields (e.g. CCS, CCSD, CCSDT, etc.) approach the

17

exact energy. While CI and CC are rigorous methods for recovering electron-electron correlation,
they are both computationally expensive.
Î¨đśđśđśđś = đđ đđ ÎŚ0

đđďż˝ = đđďż˝1 + đđďż˝2 + đđďż˝3 + đđďż˝4 + âŻ + đđďż˝đđ

(2.24)

A popular CC method is CCSD(T), which includes single (S) and double (D) excitations and
perturbative triples (T).
ďż˝0 ), which is the sum of the oneMPn methods use a reference, unperturbed Hamiltonian (đťđť

electron Fock operators, and a perturbation (đđďż˝ ) on the reference Hamiltonian to recover electron
correlation, where Îť is the expansion parameter that determines the strength of the perturbation
(2.25).
ďż˝ = đťđť
ďż˝0 + đđđđďż˝
đťđť
2.3.2

(2.25)

Density Functional Theory
Density functional theory (DFT) is an alternative to wave function-based methods that

relates the electron density (Ď) to the ground state energy of a system through the Hohenberg-Kohn
and Kohn-Sham theorems.71â73
đ¸đ¸đˇđˇđˇđˇđˇđˇ [đđ] = đđđ đ  [đđ] + đ¸đ¸đđđđ [đđ] + đ˝đ˝[đđ] + đ¸đ¸đĽđĽđĽđĽ [đđ]
đđ

đđ(đđ) = ďż˝|đđđđ (đđ)|2
đđ=1

(2.26)

In Kohn-Sham theory, (2.26), Ts[Ď] is the kinetic energy for the non-interacting reference
system, Ene[Ď] is the columbic nuclear-electron attraction energy, J[Ď] is the electron-electron

18

repulsion energy, and Exc[Ď] is the remaining energy termed the exchange-correlation functional.
The exact functional form of the exchange-correlation term is unknown. DFT functionals have
been developed that attempt to approximate this term, such as local density approximation (LDA),
generalized gradient approximation (GGA), meta-GGA, and hybrid-GGA.74,75 The LDA
functionals are only dependent on the stationary density whereas the GGA functionals are based
on the change in electron density. Meta-GGA functionals include corrections for kinetic energy
density. Hybrid functionals incorporate a mixture of exact exchange from HF with other DFT
exchange-correlation approximations. A popular DFT functional used in this work is B3LYP,
which is a hybrid-GGA functional optimized to contain 20% of HF exchange.76â78 Although
popular, B3LYP is not the best functional. There are many DFT functionals which can outperform
one another, however the superiority is often dependent upon the problem of interest.79,80 For
example, the B3LYP functional is good for predicting the structure of organic molecules.
2.3.3

Basis Sets
Electronic structure methods use basis sets, which are mathematical representations used

to describe the wave function. The Slater-type orbital (STO) is one of the earlier basis functions to
model atomic orbitals similar to the hydrogen atom (2.27), where N is a normalization constant,
Yl,m represents the angular function, n is the principle quantum number, and Îś is exponential
parameter that determines radial distance r. The spherical harmonic is defined by the orbital
angular momentum (l) and magnetic quantum number (m).
đđ(đđ, đđ, đđ) = đđđđđđ,đđ (đđ, đđ)đđ (đđâ1) đđ âđđđđ

(2.27)

However, STOs are difficult to evaluate numerically. Gaussian-type orbitals (GTO), on the other
hand, have a more advantageous mathematical form (2.28), however GTOs poorly represent the

19

atomic orbital. Rather, linear combinations of multiple GTOs are taken to better approximate
STOs. A solution was to combine GTOs to better approximate STOs, which in general, improve
by the addition of basis functions.
đđ(đđ, đđ, đđ) = đđđđđđ,đđ (đđ, đđ)đđ (2đđâ2âđđ) đđ âđđđđ

2

(2.28)

There are many different families of atomic orbital basis sets. Two of the more popular families
are the correlation consistent sets of Dunning and co-workers81â85 and the Pople-style sets.86â89 For
the Pople-style basis sets, there are many different levels of complexity, beginning with a minimal
basis such as STO-3G86 and increasing to split valence sets such as 3-21G87 and 6-31G.88
Additional basis functions, such as diffuse and polarization functions can be added to the basis set
to give a more complete representation of the orbital space.
The correlation consistent basis sets were designed to systematically recover correlation
energy with the addition of basis functions. These basis sets are noted correlation consistent,
polarized, valence, n-Îś level, or cc-pVnZ, where n is equal to double-Îś (DZ), triple-Îś (TZ),
quadruple-Îś (QZ), etc. An advantage of the correlation consistent basis sets is that they can be used
to extrapolate to the complete basis limit.

20

REFERENCES

21

REFERENCES
(1)

Goldstein, H.; Poole, C. P.; Safko, J. L. Classical Mechanics, 3. ed.; Addison Wesley:
San Francisco, NJ, 2002.

(2)

Chaichian, M.; Merches, I.; Tureanu, A. Mechanics an Intensive Course; Springer: Berlin;
New York, 2012.

(3)

Fowles, G. R.; Cassiday, G. L. Analytical Mechanics, 7th ed.; Thomson Brooks/Cole:
Belmont, CA, 2005.

(4)

Haile, J. Molecular Dynamics Simulation: Elementary Methods. 1992, 512.

(5)

Cramer, C. J. Essentials of Computational Chemistry: Theories and Models, 2nd Ed., 2nd
ed.; John Wiley and Sons: West Sussex, England, 2004.

(6)

Jensen, F. Introduction to Computational Chemistry, 2nd Ed.; John Wiley & Sons, Ltd:
West Sussex, England, 2007.

(7)

van Gunsteren, W. F.; Bakowies, D.; Baron, R.; Chandrasekhar, I.; Christen, M.; Daura,
X.; Gee, P.; Geerke, D. P.; GlĂ¤ttli, A.; HĂźnenberger, P. H.; Kastenholz, M. a; Oostenbrink,
C.; Schenk, M.; Trzesniak, D.; van der Vegt, N. F. a; Yu, H. B. Biomolecular Modeling:
Goals, Problems, Perspectives. Angew. Chem. Int. Ed. Engl. 2006, 45 (25), 4064â4092.

(8)

van Gunsteren, W. F.; Luque, F. J.; Timms, D.; Torda, A. E. Molecular Mechanics in
Biology: From Structure to Function, Taking Account of Solvation. Annu. Rev. Biophys.
Biomol. Struct. 1994, 23, 847â863.

(9)

Adcock, S. A.; McCammon, J. A. Molecular Dynamics: Survey of Methods for
Simulating the Activity of Proteins. Chem. Rev. 2006, 106 (5), 1589â1615.

(10)

Verlet, L. Computer âExperimentsâ on Classical Fluids. I. Thermodynamical Properties of
Lennard-Jones Molecules. Phys. Rev. 1967, 159 (1), 98â103.

(11)

Cuendet, M. A.; van Gunsteren, W. F. On the Calculation of Velocity-Dependent
Properties in Molecular Dynamics Simulations Using the Leapfrog Integration Algorithm.
J. Chem. Phys. 2007, 127 (18), 184102.

(12)

Swope, W. C. A Computer Simulation Method for the Calculation of Equilibrium
Constants for the Formation of Physical Clusters of Molecules: Application to Small
Water Clusters. J. Chem. Phys. 1982, 76 (1), 637.

(13)

Beeman, D. Some Multistep Methods for Use in Molecular Dynamics Calculations. J.
Comput. Phys. 1976, 20 (2), 130â139.

(14)

Wang, J.; Hou, T.; Xu, X. Recent Advances in Free Energy Calculations with a
Combination of Molecular Mechanics and Continuum Models. Curr. Comput. Aided.
Drug Des. 2006, 2 (3), 287â306.
22

(15)

Ponder, J. W.; Case, D. a. Force Fields for Protein Simulations. Adv. Protein Chem. 2003,
66 (Protein Simulations), 27â85.

(16)

GonzĂĄlez, M. A. Force Fields and Molecular Dynamics Simulations. Ăcole thĂŠmatique la
SociĂŠtĂŠ FranĂ§aise la Neutron. 2011, 12, 169â200.

(17)

Karplus, M.; McCammon, J. A. Molecular Dynamics Simulations of Biomolecules. Nat.
Struct. Biol. 2002, 9 (9), 646â652.

(18)

Karplus, M.; Petsko, G. A. Molecular Dynamics Simulations in Biology. Nature 1990,
347 (6294), 631â639.

(19)

Karplus, M.; Kuriyan, J. Molecular Dynamics and Protein Function. Proc. Natl. Acad. Sci.
U. S. A. 2005, 102 (19), 6679â6685.

(20)

3rd, T. E. C.; Young, M. A. Molecular Dynamics Simulation of Nucleic Acids: Successes,
Limitations, and Promise. Biopolymers 2000, 56 (4), 232â256.

(21)

Hansson, T.; Oostenbrink, C.; van Gunsteren, W. Molecular Dynamics Simulations. Curr.
Opin. Struct. Biol. 2002, 12 (2), 190â196.

(22)

Mackerell, A. D. Empirical Force Fields for Biological Macromolecules: Overview and
Issues. J. Comput. Chem. 2004, 25 (13), 1584â1604.

(23)

Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J. Development and Testing of the OPLS
All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids. J.
Am. Chem. Soc. 1996, 118 (45), 11225â11236.

(24)

Scott, W. R. P.; HĂźnenberger, P. H.; Tironi, I. G.; Mark, A. E.; Billeter, S. R.; Fennen, J.;
Torda, A. E.; Huber, T.; KrĂźger, P.; van Gunsteren, W. F. The GROMOS Biomolecular
Simulation Program Package. J. Phys. Chem. A 1999, 103 (19), 3596â3607.

(25)

MacKerell, a D.; Banavali, N.; Foloppe, N. Development and Current Status of the
CHARMM Force Field for Nucleic Acids. Biopolymers 2001, 56 (4), 257â265.

(26)

Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.;
Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. a. A Second Generation Force
Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem.
Soc. 1995, 117 (19), 5179â5197.

(27)

Meister, J.; Schwarz, W. H. E. Principal Components of Ionicity. J. Phys. Chem. 1994, 98
(33), 8245â8252.

(28)

Wang, J.; Cieplak, P.; Kollman, P. a. How Well Does a Restrained Electrostatic Potential
(RESP) Model Perform in Calculating Conformational Energies of Organic and Biological
Molecules? J. Comput. Chem. 2000, 21 (12), 1049â1074.

(29)

Davis, M. E.; McCammon, J. A. Electrostatics in Biomolecular Structure and Dynamics.
23

Chem. Rev. 1990, 90 (3), 509â521.
(30)

Ball, P. Water as an Active Constituent in Cell Biology. Chem. Rev. 2008, 108 (1), 74â
108.

(31)

Schaefer, M.; Karplus, M. A Comprehensive Analytical Treatment of Continuum
Electrostatics. J. Phys. Chem. 1996, 100 (5), 1578â1599.

(32)

Calimet, N.; Schaefer, M.; Simonson, T. Protein Molecular Dynamics with the
Generalized born/ACE Solvent Model. Proteins Struct. Funct. Genet. 2001, 45 (2), 144â
158.

(33)

Chen, J.; Brooks, C. L.; Khandogin, J. Recent Advances in Implicit Solvent-Based
Methods for Biomolecular Simulations. Curr. Opin. Struct. Biol. 2008, 18 (2), 140â148.

(34)

Wallqvist, A.; Mountain, R. D. Molecular Models of Water: Derivation and Description;
Lipkowitz, K. B., Boyd, D. B., Eds.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1999;
Vol. 13, pp 183â247.

(35)

Ouyang, J. F.; Bettens, R. P. A. Modelling Water: A Lifetime Enigma. Chim. Int. J. Chem.
2015, 69 (3), 104â111.

(36)

Mahoney, M. W.; Jorgensen, W. L. A Five-Site Model for Liquid Water and the
Reproduction of the Density Anomaly by Rigid, Nonpolarizable Potential Functions. J.
Chem. Phys. 2000, 112 (20), 8910.

(37)

Jorgensen, W. L.; Madura, J. D. Temperature and Size Dependence for Monte Carlo
Simulations of TIP4P Water. Mol. Phys. 1985, 56 (6), 1381â1392.

(38)

Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L.
Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys.
1983, 79 (2), 926.

(39)

Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L.; HeadGordon, T. Development of an Improved Four-Site Water Model for Biomolecular
Simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120 (20), 9665â9678.

(40)

Rick, S. W. Simulations of Ice and Liquid Water over a Range of Temperatures Using the
Fluctuating Charge Model. J. Chem. Phys. 2001, 114 (5), 2276â2283.

(41)

Abascal, J. L. F.; Sanz, E.; GarcĂ­a FernĂĄndez, R.; Vega, C. A Potential Model for the
Study of Ices and Amorphous Water: TIP4P/Ice. J. Chem. Phys. 2005, 122 (23), 234511.

(42)

GonzĂĄlez, M. A.; Abascal, J. L. F. A Flexible Model for Water Based on TIP4P/2005. J.
Chem. Phys. 2011, 135 (22), 224516.

(43)

Fuentes-Azcatl, R.; Barbosa, M. C. Thermodynamic and Dynamic Anomalous Behavior
in the TIP4P/Îľ Water Model. Phys. A Stat. Mech. its Appl. 2016, 444, 86â94.
24

(44)

Rick, S. W. A Reoptimization of the Five-Site Water Potential (TIP5P) for Use with
Ewald Sums. J. Chem. Phys. 2004, 120 (13), 6085â6093.

(45)

Berendsen, H. J. .; Postma, J. P. M.; van Gunsteren, W. F.; Hermans, J. Interaction Models
for Water in Relation to Protein Hydration. In Intermolecular Forces; Springer
Netherlands, 1981; Vol. 14, pp 331â342.

(46)

Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. The Missing Term in Effective Pair
Potentials. J. Phys. Chem. 1987, 91 (24), 6269â6271.

(47)

Leach, A. R. Molecular Modelling: Principles and Applications. 2001.

(48)

Allen, M.; Tildesley, D. Computer Simulation of Liquids. 1987.

(49)

McCammon, J. A. Protein Dynamics. Reports Prog. Phys. 1984, 47 (1), 1.

(50)

Miyamoto, S.; Kollman, P. A. Settle: An Analytical Version of the SHAKE and RATTLE
Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13 (8), 952â962.

(51)

Ryckaert, J.-P. J.; Ciccotti, G.; Berendsen, H. H. J. . Numerical Integration of the
Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of NAlkanes. J. Comput. Phys. 1977, 23 (3), 327â341.

(52)

Edberg, R.; Evans, D. J.; Morriss, G. P. Constrained Molecular Dynamics: Simulations of
Liquid Alkanes with a New Algorithm. J. Chem. Phys. 1986, 84 (12), 6933â6939.

(53)

Tao, P.; Wu, X.; Brooks, B. R. Maintain Rigid Structures in Verlet Based Cartesian
Molecular Dynamics Simulations. J. Chem. Phys. 2012, 137 (13), 134110.

(54)

Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. LINCS: A Linear
Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18 (12), 1463â
1472.

(55)

Tozzini, V. Coarse-Grained Models for Proteins. Curr. Opin. Struct. Biol. 2005, 15 (2),
144â150.

(56)

Riniker, S.; Allison, J. R.; van Gunsteren, W. F. On Developing Coarse-Grained Models
for Biomolecular Simulation: A Review. Phys. Chem. Chem. Phys. 2012, 14 (36), 12423â
12430.

(57)

Kamerlin, S. C. L.; Vicatos, S.; Dryga, A.; Warshel, A. Coarse-Grained (Multiscale)
Simulations in Studies of Biophysical and Chemical Systems. Annu. Rev. Phys. Chem.
2011, 62, 41â64.

(58)

Levine, I. N. Quantum Chemistry, 5th ed.; Prentice Hall: Upper Saddle River, N.J, 2000.

(59)

Szabo, A.; Ostlund, N. S. Modern Quantum Chemistry: Introduction to Advanced
Electronic Structure Theory; Dover Publications, Inc.: Mineola, New York, 1996.
25

(60)

Fock, V. NĂ¤herungsmethode zur LĂśsung des quantenmechanischen MehrkĂśrperproblems.
Zeitschrift fĂźr Phys. 1930, 61 (1â2), 126â148.

(61)

Cremer, D. MĂ¸ller-Plesset Perturbation Theory: From Small Molecule Methods to
Methods for Thousands of Atoms. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2011, 1 (4),
509â530.

(62)

MĂ¸ller, C.; Plesset, M. S. Note on an Approximation Treatment for Many-Electron
Systems. Phys. Rev. 1934, 46 (7), 618â622.

(63)

Head-Gordon, M.; Pople, J. A.; Frisch, M. J. MP2 Energy Evaluation by Direct Methods.
Chem. Phys. Lett. 1988, 153 (6), 503â506.

(64)

Frisch, M. J.; Head-Gordon, M.; Pople, J. A. Direct MP2 Gradient Method. Chem. Phys.
Lett. 1990, 166 (Journal Article), 275â280.

(65)

Raghavachari, K.; Pople, J. A.; Replogie, E. S.; Head-Gordon, M. Fifth Order MollerPlesset Perturbation TheoryâŻ: Comparison of Exististing Correiatlon Methods and
Implementation of New Methods Correct to Fifth Order. J. Phys. Chem. 1990, 94 (14),
5579.

(66)

Bishop, R. F. An Overview of Coupled Cluster Theory and Its Applications in Physics.
Theor. Chim. Acta 1991, 80 (2â3), 95â148.

(67)

Scuseria, G. E.; Schaefer, H. F. Is Coupled Cluster Singles and Doubles (CCSD) More
Computationally Intensive than Quadratic Configuration Interaction (QCISD)? J. Chem.
Phys. 1989, 90 (7), 3700â3703.

(68)

Bartlett, R. Many-Body Perturbation Theory and Coupled Cluster Theory for Electron
Correlation in Molecules. Annu. Rev. Phys. Chem. 1981, 359â401.

(69)

Bartlett, R. J.; Purvis, G. D. Many-Body Perturbation Theory, Coupled-Pair ManyElectron Theory, and the Importance of Quadruple Excitations for the Correlation
Problem. Int. J. Quantum Chem. 1978, 14 (5), 561â581.

(70)

Bartlett, R.; MusiaĹ, M. Coupled-Cluster Theory in Quantum Chemistry. Rev. Mod. Phys.
2007, 79 (1), 291â352.

(71)

Parr, R. G. Density Functional Theory. Annu. Rev. Phys. Chem. 1983, 34 (1), 631â656.

(72)

Hohenberg, P.; Kohn, W. Inhomogeneous Electron Gas. Phys. Rev. 1964, 136, B864.

(73)

Kohn, W.; Sham, L. J. Self-Consistent Equations Including Exchange and Correlation
Effects. Physical Review. 1965, pp A1133âA1138.

(74)

Perdew, J. P.; Schmidt, K. Jacobâs Ladder of Density Functional Approximations for the
Exchange-Correlation Energy. AIP Conf. Proc. 2001, 577 (May), 1â20.

26

(75)

Peverati, R.; Truhlar, D. G. Quest for a Universal Density Functional: The Accuracy of
Density Functionals across a Broad Spectrum of Databases in Chemistry and Physics.
Philos. Trans. A. Math. Phys. Eng. Sci. 2014, 372 (2011), 20120476.

(76)

Gill, P.; Johnson, B.; Pople, J.; Frisch, M. The Performance of the BeckeâLeeâYangâ
Parr (BâLYP) Density Functional Theory with Various Basis Sets. Chem. Phys. Lett.
1992, 197 (4â5), 499â505.

(77)

Becke, A. D. Density-Functional Thermochemistry. III. The Role of Exact Exchange. J.
Chem. Phys. 1993, 98, 5648â5652.

(78)

Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti Correlation-Energy
Formula into a Functional of the Electron Density. Phys. Rev. B 1988, 37 (2), 785â789.

(79)

Goerigk, L.; Grimme, S. A Thorough Benchmark of Density Functional Methods for
General Main Group Thermochemistry, Kinetics, and Noncovalent Interactions. Phys.
Chem. Chem. Phys. 2011, 13 (14), 6670â6688.

(80)

Tirado-Rives, J.; Jorgensen, W. L. Performance of B3LYP Density Functional Methods
for a Large Set of Organic Molecules. J. Chem. Theory Comput. 2008, 4 (2), 297â306.

(81)

Jr, T. H. D.; Jr., T. H. D.; Dunning, T. H. J. Gaussian Basis Sets for Use in Correlated
Molecular Calculations. I. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys.
1989, 90, 1007â1023.

(82)

Dunning, T. H. J.; Peterson, K. A.; Wilson, A. K.; Jr, T. H. D.; Jr., T. H. D. Gaussian
Basis Sets for Use in Correlated Molecular Calculations. X. The Atoms Aluminum
through Argon Revisted. J. Chem. Phys. 2001, 114 (21), 9244â9253.

(83)

Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular
Calculations. III. The Atoms Aluminum through Argon. J. Chem. Phys. 1993, 98 (2),
1358.

(84)

Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular
Calculations. V. Core-Valence Basis Sets for Boron through Neon. J. Chem. Phys. 1995,
103 (11), 4572.

(85)

Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular
Calculations. IV. Calculation of Static Electrical Response Properties. J. Chem. Phys.
1994, 100 (4), 2975.

(86)

Hehre, W. J.; Stewart, R. F.; Pople, J. A. Self-Consistent Molecular-Orbital Methods. I.
Use of Gaussian Expansion of Slater-Type Orbitals. J. Chem. Phys. 1969, 51, 2657.

(87)

Hehre, W. J.; Ditchfield, R.; Pople, J. A. Self-Consistent Molecular Orbitals Methods. XII.
Further Extensions of Gaussian-Type Basis Sets for Use in Moelcular Orbital Studies of
Organic Molecules. J. Chem. Phys. 1972, 56, 2257.

27

(88)

Binkley, J. S.; Pople, J. A.; Hehre, W. J. Self-Consistent Molecular Orbital Methods. 21.
Small Split-Valence Basis Sets for First-Row Elements. J. Am. Chem. Soc. 1980, 102 (3),
939â947.

(89)

Krishnan, R.; Binkley, J. S.; Seeger, R.; Pople, J. A. Self-Consistent Molecular Orbital
Methods. XX. A Basis Set for Correlated Wave Functions. J. Chem. Phys. 1980, 72 (1),
650.

28

CHAPTER 3 Molecular Dynamics Studies of the ProteinâProtein Interactions in Inhibitor
of ÎşB Kinase-Î˛ â 

3.1

Introduction
Impeding activation of the nuclear factor (NF)-ÎşB pathway has been a popular aim for

designing new anti-inflammatory therapeutics for many disease states, including cancer, stroke,
and viral infections.1 Directly involved in regulating immune responses to a variety of stimuli, the
NF-ÎşB complex is comprised of an assembly of five transcription activators2 related by the
presence of a distinctive Rel homology domain (RHD):3,4 RelA (p65), RelB , c-Rel, p50/p105 (NFÎşB1), and p52/p100 (NF-ÎşB2).5-9 Separated into two classes by C-terminal domains, the NF-ÎşB
subfamily proteins, p50/p105 and p52/p100, are not always transcription activators unless they
form dimers with the Rel subfamily proteins, RelA, RelB, and c-Rel.10-17 The RHD, responsible
for dimerization and binding to DNA, contains the nuclear localization sequence (NLS)18-20 and
the binding sites for inhibitors.21-23
Although there are various ways to signal activation of the NF-ÎşB pathway, the inhibitory
subunits IÎşB (IÎşBÎą, IÎşBÎ˛, and IÎşBÎľ)5 are responsible for sequestering the inactive state of NF-ÎşB
dimers in the cytoplasm by acting as a steric block against NLS functionality.24 Liberation of NFÎşB complex happens upon signaled-induced degradation of the IÎşB complex thus mediating gene
transcription henceforth inhibition of IÎşB activation is a promising target for anti-inflammatory
drug development.24-29 Enzymes responsible for this cascade of events have been linked to the

â 

This chapter is presented in its entirety from: Jones, M. R; Liu, C.; Wilson, A. K. âMolecular
Dynamics Studies of the ProteinâProtein Interactions in Inhibitor of ÎşB Kinase-Î˛â J. Chem. Inf.
Mod. 2014 54 (2), 562-572.
29

catalytic subunits of (IÎşB kinase) IKK,30-33 specifically activation of the IKKÎą and IKKÎ˛ isoforms.
Dual phosphorylation of the IKKÎą (at Ser176 and Ser180) and IKKÎ˛ (at Ser177 and Ser181)
catalytic subunits regulate aberrant signaling pathways by which the activated subunits, both
capable of undergoing autophosphorylation, phosphorylate IÎşBÎą at Ser 32 and Ser26 and IÎşBÎ˛ at
Ser19 and Ser23.
Both IKKÎą and IKKÎ˛ subunits, having a 52% sequence identity, contain a kinase domain
(KD), leucine zippers (LZ), and helix-loop-helix (HLH) motifs that are capable of forming both
homo and heterodimers.34 Studies in vitro have shown that dimerization mediated by the LZ motifs
regulate the activity and activation of KD by mutation of LZ and HLH motifs, although the HLH
motif does not affect dimerization, it has been shown to contribute to KD activity.35 Serine to
alanine mutations in IKKÎ˛ prevent IKK activation, whereas in IKKÎą, autophosphorylation is
deactivated but has no effect on IKK activity highlighting IKKÎ˛ as an attractive target.
Additionally, in the wild-type conformation of IKKÎ˛, the position of the activation loop containing
the Ser177 and Ser181 hampers the docking of substrates into the pocket whereas the activated
conformation is more accessible as compared to the cyclic AMP-dependent kinase (cAPK).36 This
is mainly because the phosphorylation of the activation loop residues causes conformational
change of the loop and even surrounding residues. This is an important phenomenon that suggests
conformational change induced by these interactions play an important role in bioactivity of the
protein and medicinal chemistry. The activation loop also contains Cys179 within the motif that is
known to be important for the phosphorylation of both IKKÎ˛ and IKKÎą. Studies have shown that
modifications to the cysteine suppress activity of the enzyme and activation of the NF-ÎşB
pathway.37-40

30

With IKKÎ˛ being a key enzyme in activating the NF-ÎşB pathway, the mechanism for
inhibiting activation of IKKÎ˛ has remained elusive due to a lack of a tertiary structural insight.
The newly determined crystal structure of Xenopus laevis IKKÎ˛ (xIKKÎ˛), having a gapless 74%
sequence identity of human gene (hIKKÎ˛), was reported as a âdimer of dimersâ in which
dimerization was reasoned critical for activation, rather than the activity for each homogenous
protomer.41 Each protomer consists of 647 amino acids composed to form a kinase domain (KD)
(residues 16-307), an ubiquitin-like domain (ULD) (residues 310-394), and scaffold/dimerization
domain (SDD) (410-666).
The presented structure of the kinase domain in xIKKÎ˛ does not resemble the active
conformation upon comparison of the activation segments in available protein kinase crystal
structures.42 The activation segments were defined by a primary sequence beginning adjacent to a
Î˛sheet at a conserved DFG (Asp-Phe-Gly) triad and extends to a conserved APE (Ala-Pro-Glu)
motif near an alpha-helix. Although the random coils share a consistent fold between the two
motifs, the activation loop is diverse in both sequence and conformation unique to its proteinprotein interactions and regulation. The activation segment in the xIKKÎ˛ structure begins at
Asp166 and ends at Glu192, where Ser177-Ser181 (SXXXS) contain residues shown critical for
activation. While the KD contains the mitogen-activated protein (MAP) kinase43-46 for serine
phosphorylation, the SDD regulates activation via maintaining dimerization. Mutagenesis studies
showed that the wild-type of IKKÎ˛ was a dimer whereas the mutant was a monomer with a weaker
affinity for dimerization and upon mutation of Leu654 and Trp655 in the SDD, dimerization was
lost.41
Previous computational and experimental work with IKKÎ˛ and similar Ser/Thr/Tyr kinases
has revealed that the allosteric adenosine triphosphate (ATP) binding cavity is near the hinge

31

connecting the N and C lobes of the kinase domain.41,42,47-51 Comparative homology modeling of
the monomer of IKKÎ˛ suggested that the KD was flexible and allosteric ATP binding
conformational changes were not easily observed using 80 ns simulations without the crystal
structure.47,52
In this study, molecular dynamics (MD) simulations of IKKÎ˛ are presented offering a
greater insight on how conformational changes and protein-protein interactions within a
multimeric assembly are influenced by distal modifications in the interacting subunits of IKKÎ˛.
Using a multiscale approach, coarse-grained modeling offers a reduced representation useful for
predicting large-scale conformational changes whereas atomistic simulations provide a more
descriptive insight on activity; together this approach helps discern dimerization disruption.
3.2

Computational Methods
System Preparation. The initial coordinates for eight phosphomimetic mutant

(S177E/S181E) molecules of xIKKÎ˛ were obtained from the Protein Data Bank Database53
(PDBID: 3QA8).41 Combinations of the eight molecules were constructed to model multiple sets
of tetramers, trimers, and dimers while using a monomer as a control (Table 3.1; Figure 3.1).
Missing atoms on the side chains of Val79, Asp90, and Asp291 were added using the Molecular
Operating Environment v.2011.10 (MOE).54 Coarse-grained models (all models shown in Table
3.1) were protonated using MOE and minimized using the AMBER99 force field.55,56 Models for
the atomistic simulations (AB and AD dimers and monomer) were prepared using the Leap module
of AmberTools12.57,58 The Site Finder application in MOE was used to investigate potential sites
for ligand docking. Contact sites and Propensity for Ligand Binding (PLB) indices59 were used to
compare both allosteric and non-allosteric ATP binding sites. ATP coordinated with Mg2+ ions
were extracted from a crystal structure of the Vgrg protein (PDBID:4DTH)60 and docked using

32

MOE with the induced fit method positioning the adenine head towards the hinge (DFG motif)
and tail near the phosphoacceptor serines, using the same starting position for all models (Figure
3.2). Ligand parameters were retrieved from the AMBER parameter database61 and were set up
using the GAFF.62 Parameters for phosphoserine63 mutants (S177/181pS) were retrieved from
Leap.

Table 3.1 Summary of modeled systems.
Model
Tetramer
Trimer
Dimer
Monomer

Chain
ABCD, DEFG, BCEH
ABC, BCE
AB, AD, BE
A

33

Figure 3.1 Representation of the models. The crystal structure was solved with 8 identical
molecules (a) forming two âdimer of dimersâ tetrameric complex (b). Although a dimer is
defined as AB form (d), the alternative dimer AD (c) has significance for structure stability.

34

Figure 3.2 Characterization of ATP/Mg2+ docking cavity. The ATP ligand is positioned in the
mouth of the activation loop (Top). The adenine head of ATP is pointed towards Asp166 and the
phosphate tail is outside of the pocket. The binding pocket is outlined and described by colored
circles (purple and green) representing surrounding amino acids (Bottom); the different outlines
on the purple circles contrast the different types of polar side chains. Blue and green arrows
indicate sidechain and backbone acceptor-donors, respectively. The blue spheres surrounding the
phosphate tail represent the pocket exposure, whereas the lighter-blue spheres highlight the
receptorâs exposure.

35

Simulation Protocol. In order to observe large, dynamical conformational changes,
coarse-grained simulations were carried out for 50 ns at 300 K using CafeMol 2.064 with the
AICG2 model accounting for nonlocal intra-chain interaction (electrostatic, repulsion, and
hydrophobic interactions). Atomistic simulations were performed for 10 ns at 300 K and 1 atm
using AMBER 11 in the presence of 100 mM of sodium chloride65 and 10 Ă of SPC/E66 water
beyond the solute in a cuboid box under the Amber99SB67 force field. Prior to simulation, each
solvated system was gradually relaxed under the NVT ensemble by six minimization procedures
(500 cycles of steepest descent minimization followed by 500 cycles of conjugate gradient
minimization) with decreasing restraints on the protein of 500.0, 200.0, 20.0, 10.0, and 5.0
kcal/mol-Ă2 per procedure to no restrains and then heated to 300 K over 100 ps. The baths were
maintained using the isotropic position scaling protocol and the Langevin thermostat with the
SHAKE algorithm.
Trajectory Analysis. Trajectory analysis was performed using Cafemol and AMBER 11
for coarse-grained and atomistic simulations, respectively, and visualized in Chimera68 and
VMD.69
Simulation Stability. Simulation parameters were extracted from the trajectories and
plotted to access atomistic simulation stability. RMSDs from the starting structure were performed
using the PTRAJ module of Amber suite to investigate flexibility and conformational changes.
Hydrogen Bonding Analysis. Hydrogen bonding analysis was performed with the
CPPTRAJ module of AmberTools, where a hydrogen bond was defined by a cutoff distance of 3.5
Ă and a donor-hydrogen-acceptor angle of 180Â°Âą60Â°.
Energetic Analysis. Free energy calculations were carried out using the MMPBSA.py70
module with Amber Tools and Amber 11. The electrostatic solvation energy was calculated for

36

every ps over a span of 10 ns using three GB models: the pairwise GB models of Hawkins and coworkers (GB-HCT)71-73 two modified models developed by Onufriev and coworkers (GB-OBC1
and GB-OBC2)74 , setting dielectric constants to 1 and 80 for the interior and exterior of the
molecule, respectively. Because relative free energy trends were of interest, solute entropy was
neglected.75 Additionally, the PISA (protein interfaces, surfaces, and assemblies) procedure76 was
used to predict the free energy of formation and dissociation of the two dimeric assemblies studied.
3.3
3.3.1

Results
Docking is Aesthetic
Potential binding sites for ligands were investigated using the Site Finder application in

MOE to observe the accessible binding sites from the static tetrameric assembly (A, B, C and D
stand for the four subunits of IKKÎ˛, respectively). Approximately 46 accessible cavities were
found in a single protomer; only four of the predicted sites have contacts with the activation
segment and are listed in Table 3.2. Sites 2-3, 6 and 10 have pockets containing 103-270 contact
atoms and a wide range of PLB scores. PLB indices are ranked on concavity in which the highest
PLB index would be considered to be the most probable ligand binding site.59 The largest pocket
is shown in Site 2 (Figure 3.3). It is stationed at the bottom of the kinase domain in a cavity that
includes the APE motif as contact atoms. Deeper into the pocket of the activation loop, Sites 3 and
6 both reach towards the hinge of the N and C terminus and include the DFG triad. Both sites are
within a reasonable proximity (â¤5.0 Ă) of the SXXXS sequence. Site 10 is also in the hinge region
and corresponds to an allosteric binding region. The only adjacent contact atom pertinent to the
activation segment is Asp166 of the DFG triad. These accessible pockets within the structure led
rise to the proposal of a non-allosteric docking pose for ATP.

37

Table 3.2 Binding cavities near the activation segment.
Site
2

ÎąSpheres
118

Contact
Atoms
270

Amino
Acids
171

PLB
Index
2.37

Residues
LEU186, LEU189, ALA190, PRO191,
GLU192, LEU193, TRP206, PRO221,
PHE222, ASN225, GLN227, GLN230,
TRP231, HIS232, GLY233, LYS234, VAL235,
ILE243, VAL244, VAL245, TYR246, ASP247,
ASP248, LEU249, THR250, VAL253,
PHE255, SER256, SER257, LEU277,
GLN278, LEU281, MET282, TRP283
3
87
200
122
1.53
GLY22, THR23, GLY24, GLY25, ARG144,
ASP145, LEU146, LYS147, GLU149,
ASP166, LEU167, TYR169, THR180,
PHE182, VAL183, GLY184, THR185,
LEU186, GLN187, TYR188, LEU189,
GLU192, LEU194, GLU195, TYR199,
SER207
6
35
115
76
0.64
GLU61, ILE64, MET65, LYS66, LEU68,
ASN69, VAL73, VAL74, SER75, MET96,
TYR135, LEU136, ILE141, HIS143, ILE164,
ILE165, LEU167, GLY168, ALA170
10
40
103
68
0.32
LEU21, GLY22, VAL29, TYR98, CYS99,
GLU100, GLY101, GLY102, ASP103,
LYS106, GLU149, ASN150, VAL152,
ILE165, ASP166
Column 2 characterizes the number of alpha spheres identified within the cavity. Quantities in
columns 3 and 4 denote the number of contact atoms and amino acids surrounding the cavity.
The PLB index ranks the concavity in an increasing order.

38

Figure 3.3 Visualization of the predicted cavities. All alpha-spheres are shown for the four sites
(Top). The red and white spheres indicated hydrophilic and hydrophobic cavities, respectively.
Wireframe representations of the four sites are shown individually: Site 2 (middle left); Site 3
(middle right); Site 6 (bottom left); and Site 10 (bottom right).
39

3.3.2

Dimer Stability of the Crystal System
Using the static conformations of the initial ligand-free dimer models, theoretical

predictions on the dimer assemblies were calculated using PISA. The free energy of binding, âGint,
is used to describe the free energy of interface formation between subunits, whereas the free energy
of change upon dissociation, âGdiss, corresponds to the difference between the associated and
dissociated structures.76,

77

In these approximations, a negative âGdiss value indicates that the

structure is thermodynamically unstable and would readily dissociate and a positive value suggests
a more stable assembly in which an external component would need to be added to induce
dissociation. The results shown in Table 3.3 Summary of dimer assembly energetics,
determined with PISA.indicate that the AB dimer is more tightly bound than the AD dimer for
the WT and S177/188E models. In contrast, the S177/181pS model has a weaker binding free
energy in the AB dimer and a stronger binding affinity in the AD dimer. Additionally, both the
WT and S177/188E models have larger âGdiss values than the S177/181pS model in the AB dimer
but smaller values in the AD dimer. This indicates that the S177/181pS model is less
thermodynamically stable in the AB dimer and more thermodynamically stable in the AD dimer
than the WT and S177/188E models. Although predictions for the WT agree with experimental
observations, identical values for the WT and S177/181E models were predicted for both the AB
and AD dimers highlighting an inherent weakness in the parameterization of the PISA procedure.76
This method may not always be useful for predicting the conformational stability for proteinprotein interactions using static structures.

40

Table 3.3 Summary of dimer assembly energetics, determined with PISA.
AB dimer
Model
ABSA âG
âGdiss TâSdiss
Pv
WT
1614.2 -21.9
9.5
15.7 0.061
S177/188E 1614.2 -21.9
9.5
15.7 0.061
S177/188pS 1783.7 -14.9
1.5
17.2 0.207
int

AD dimer
ABSA âG
âGdiss TâSdiss
Pv
440.0
1.3 -14.9
15.6 0.801
440.0
1.3 -14.9
15.6 0.801
615.8 -3.0 -12.1
17.1 0.491
int

The buried surface between the engaged interfaces (ABSA) is reported in Ă2.The P-value (Pv)
indicates the probability of finding a more hydrophobic interface, eg. Lower Pv values indicate
fewer hydrophobic sites. The dissociation free energies (âGdiss ) and binding free energies (âGint )
are reported in kcal/mol.
3.3.3

Coarse-Grained Simulations
All coarse-grained models were simulated for 50 ns to gain qualitative insight about the

significance of simulating multiple protomers that may contribute to the conformational stability
of the xIKKÎ˛ assembly. The root-mean-square deviation (RMSD) calculated from the starting
structure was compared against the multiple models shown in Table 3.1 and scaled to fit a
monomer (Figure 3.4). Although calculating RMSD from the starting structure is one way to
determine the stability of the structure, the dynamics in flexible regions can yield misleading
RMSDs. Each model exhibits continuous increases in the deviation indicating sustained
conformational changes. In these simulations, the continuous structural changes occurred in the
kinase domain. Qualitatively, throughout the simulation, the activation loop of the kinase domain
periodically opened and closed suggesting that its structural flexibility may play a role in
regulating molecular traffic in the activation loop. In terms of comparing conformational stability
among the different models, it is apparent within the first 25 ns that there is conformational
favorability, specifically fewer fluctuations in the assembly, for models containing multiple
protomers. The monomer was the least stable system in these simulations because it began to
denature within the first 25 ns of the simulation, thus RMSD after 25 ns were ignored. Although
coarse-grained timescales cannot be directly compared with atomistic time scales, this
41

demonstrates the influence that the assembly of a complex has on the mobility of individual
protomers on short time scales.

Figure 3.4 RMSD plots for the different assemblies using coarse-graining MD. Frames were
plotted for every 1 ps for a total of 50,000 frames (50 ns).

3.3.4

Atomistic Simulation of Dimers and Monomer
To characterize the significance and relationship between kinase domain activity and

dimerization, atomistic simulations were carried out for 10 ns for two sets of dimers while the
monomer was observed as a control. Simulation stability was characterized by monitoring
thermodynamic parameters (Table 3.4) from the starting and average structures. Average total

42

energies, densities, and temperatures were compared amongst the different morphs of the
monomer and dimers. Properties of each monomer, AB dimer and AD dimer were comparable
among the WT, S177/181E, WT+ATP and S177/181pS models, indicating that each monomer and
dimers structures, respectively, maintained the defined simulation parameters. The RMSD values
for the twelve starting structures shown in Figure 3.5 reveal variations of conformational flexibility
throughout the 10 ns simulations. For the monomer, the WT has much more structural fluctuations
than the mutant and ATP bound models. For the dimers, the WT, WT+ATP and S177/181pS
models have more comparable RMSD values than the S177/181E model. In the WT+ATP model,
the flexibility of the activation loop is minimized due to electrostatic interactions with the nonallosteric bound ATP. From this data, it is indicative that the flexibility of the activation segment
in the kinase domain contributes little towards the larger fluctuations observed in the full structure
of the dimers indicating that the observed structural changes occur in other parts of the complexes.
Table 3.4 Averaged thermodynamic data for the 10 ns atomistic simulation.
Model
Etot
Ď
rho
Ď
T
Ď
Monomer -371875.0 398.6 1.0362 0.0008 300.0 0.9
WT
Dimer:AB -448514.1 528.9 1.0530 0.0009 300.0 0.8
Dimer:AD -880796.6 651.6 1.0222 0.0005 300.0 0.6
Monomer -371885.3 400.0 1.0362 0.0008 300.0 0.9
S177/188E Dimer:AB -460262.3 531.5 1.0513 0.0009 300.0 0.8
Dimer:AD -880880.4 652.1 1.0221 0.0005 300.0 0.6
Monomer -373339.9 394.6 1.0364 0.0007 300.0 0.9
WT-ATP
Dimer:AB -448774.7 524.0 1.0528 0.0010 300.0 0.8
Dimer:AD -881975.2 659.6 1.0226 0.0005 300.0 0.6
Monomer -372369.3 396.6 1.0363 0.0007 300.0 0.9
S177/181pS Dimer:AB -447724.1 519.1 1.0521 0.0009 300.0 0.8
Dimer:AD -881267.0 657.1 1.0221 0.0005 300.0 0.6
The total energy (Etot), density (rho) and temperature (T) per system are reported in units of
kcal/mol, (g/mL) and Kelvin, respectively next to the standard deviation (Ď).

43

Figure 3.5 The RMSD from the starting structure for the 10 ns simulation. The different RMSDs
of the monomer (a-b), AB dimer (c-d) and AD dimer (e-f) are colored accordingly (blue, WT;
red, S177/181E; green, WT+ATP; purple;S177/181pS). Plots A, C, and E represent the RMSD
of the entire structure whereas the plots B, D, and F represent only the activation segment of the
principle monomeric chain. Frames were plotted for every 2 ps for a total of 5,000 frames (10
ns).
44

3.3.5

Hydrogen Bonding Between Dimer Interfaces
Structure-based mutagenesis and ultracentrifugation studies investigating SDD mediated

dimerization established that the wild-type is a dimer, whereas upon mutation of Leu654, Trp655,
and Leu658 dimerization failed. To quantify the loss in dimerization upon activation of the
complex, a hydrogen bond analysis was performed. Hydrogen bonding patterns in the AB dimer
includes series of both charged and uncharged amino acids including Glu, Asp, Gln, Ser, and Leu.
Activation by both phosphomimetic mutation and phosphorylation lead to a decrease in hydrogen
bond occupancy throughout the simulation. Specifically, Glu576, Ser489, Asp493, Gln478,
Glu565 and Leu658 of one protomer maintain hydrogen bonds with the opposing protomer 50%
to 100% of the simulation for the wild-type; however these hydrogen bond occupancies decrease
in the activated species (Table 3.5). Trends for the AD dimer (Table 3.6), however, do not follow
a similar trend, which is due to the distances between the dimer interface. Occupancies for the
WT+ATP models failed to follow a trend readily comparable with the other models; however, it
is clear that the docking of ATP has the ability to disrupt SDD mediated dimerization.

45

Table 3.5 Hydrogen bonding occupancies for the AB dimer interface.
(AB) WT
(AB) S177/181E
Y
X-H
O
Y
X-H
O
Glu 576 Arg 573 99
Glu 576 Arg 573 71
Ser 489 Gln 651 96
Asp 569 Arg 572 68
Glu 576 Arg 572 89
Ser 489 Gln 651 57
Tyr 497 Gln 651 81
Asp 493 Gln 651 53
Asp 546 Arg 460 75
Asp 493 Gln 652 50
Asp 493 Gln 652 63
Ser 489 Gln 652 45
Gln 478 Gln 478 55
Glu 19 Arg 592 39
Glu 565 Arg 438 52
Phe 485 Phe 485 29
Leu 658 Leu 658 49
Gln 110 Gln 566 28
Trp 655 Gln 500 39
Glu 502 Arg 666 25
Arg 573 Arg 572 36
Lys 482 Phe 485 22
Glu 464 Thr 463 36
Glu 19 Asn 588 21
Trp 655 Tyr 497 34
Gln 548 Gln 548 21
Lys 659 Gly 504 29
Gly 504 Trp 655 21
Arg 573 Arg 573 28
Ala 481 Lys 482 19
Glu 648 Ser 489 27
Phe 503 Arg 666 18
Pro 578 Glu 576 26
Glu 19 Arg 419 17
Lys 467 Lys 467 26
Trp 655 Lys 496 17
Asp 569 Arg 572 26
Gln 548 Arg 452 16
Cys 662 Phe 503 25
Glu 648 Phe 485 16
Glu 502 Ser 653 24
Ser 619 Phe 503 15
Gln 500 Trp 655 23
Lys 482 Lys 482 15
Lys 482 Phe 485 18
Gln 110 Leu 570 15
Ala 481 Lys 482 18
Gln 478 Gln 478 14
Lys 482 Ala 481 17
Gly 504 Lys 659 13
In each model, an acceptor (Y) and a donor (X-H) were defined and analyzed throughout the 10
ns simulation. The percent occupancy (O) represents the fraction of frames in which a bond was
observed.

46

Table 3.5 (contâd)
(AB) WT-ATP
Y
X-H
Arg 650 Gln 651
Asp 546 Arg 460
Trp 655 Tyr 497
Gln 478 Gln 478
Ser 489 Glu 648
Phe 485 Phe 485
Phe 485 Gln 647
Gln 500 Lys 659
Phe 486 Gln 651
Ala 481 Lys 482
Trp 268 Gln 651
Asp 484 Lys 482
Ala 481 Gln 478
Lys 482 Phe 485
Lys 482 Lys 482
Leu 654 Trp 655
Phe 485 Lys 482
Phe 486 Phe 485
Phe 485 Phe 486
Gln 652 Lys 496
Gln 500 Trp 655
Ile 490
Gln 651
Ser 489 Gln 651
Phe 485 Glu 648

(AB) S177/181pS
Y
X-H
O
Glu 565 Arg 568 70
Lys 482 Lys 482 59
Lys 482 Phe 485 53
Asp 493 Gln 652 45
Asp 493 Trp 655 36
Phe 485 Phe 485 33
Gln 651 Gln 651 31
Glu 648 Ser 488 30
Phe 485 Lys 482 29
Trp 655 Tyr 497 28
Phe 503 Cys 652 28
Ala 481 Lys 482 27
Phe 485 Phe 486 25
Gln 478 Gln 478 23
Ser 357 Gln 548 23
Trp 655 Asp 493 21
Asp 493 Gln 651 20
Ser 489 Gln 651 20
Asp 569 Arg 568 19
Phe 485 Glu 648 18
Ala 481 Gln 478 17
Lys 467 Lys 467 16
Arg 549 Ser 357 16
Glu 16 Arg 419 16
Leu 658 Leu 658 15
Lys 659 Tyr 497 14
Ser 489 Glu 648 13
Phe 503 Arg 666 13
Gln 500 Lys 659 13
Arg 573 Arg 572 13
In each model, an acceptor (Y) and a donor (X-H) were defined and analyzed throughout the 10
ns simulation. The percent occupancy (O) represents the fraction of frames in which a bond was
observed.
O
84
71
50
36
31
29
26
26
22
22
22
22
19
19
16
16
16
14
14
14
13
12
11
11

47

Table 3.6 Hydrogen bonding occupancies for the AD dimer interface.
(AD) WT
Y
X-H
Asn 225 Leu 223
Leu 193 Thr 250
Thr 250 Trp 226
His 232 Val 229
Asn 225 Phe 222
Pro 228 Asp 248
Gly 251 Trp 226
His 232 Trp 231
Pro 228 Asp 247
Trp 226 Tyr 423
Trp 226 Phe 255
Asn 225 His 425
Pro 224 Pro 224
Asn 225 Pro 224
Leu 384 Trp 226
Val 183 Leu 421
Phe 255 Trp 226
Thr 250 Gln 227

(AD) S177/181E
O
Y
X-H
O
97
Thr 250 Trp 226 78
84
Asp 536 Arg 579 55
82
His 232 Trp 231 39
38
His 232 His 232 39
29
Pro 224 Asn 225 35
29
Arg 577 Arg 582 35
27
Glu 49 Arg 592 23
22
Pro 224 Pro 228 20
19
Thr 250 Gln 227 19
18
Arg 236 Pro 228 19
17
Gln 230 Val 229 18
16
Thr 424 Asn 225 18
15
Arg 579 Arg 582 17
14
Val 183 Thr 422 17
13
Asp 248 Pro 228 14
12
Glu 49 Arg 419 13
11
Arg 582 Arg 582 13
10
Asn 225 Val 229 12
Leu 186 Thr 250 10
In each model, an acceptor (Y) and a donor (X-H) were defined and analyzed throughout the 10
ns simulation. The percent occupancy (O) represents the fraction of frames in which a bond was
observed.

48

Table 3.6 (contâd)
(AD) WT-ATP
Y
X-H
Asp 536 Arg 582
Asp 536 Arg 579
Asp 248 Trp 226
Pro 417 Gln 48
Asn 225 Trp 226
His 232 Trp 226
Trp 226 Tyr 423
Leu 384 Ser 181
Trp 231 Trp 231
Gln 227 Thr 250
Leu 193 Leu 249
Trp 231 His 232
Pro 578 Arg 579

(AD) S177/181pS
O
Y
X-H
O
59
Asp 536 Arg 579 88
49
Pro 228 Thr 250 53
25
Pro 224 Asn 225 45
21
Leu 249 Leu 194 40
20
Asp248 Leu 193 34
16
Pro 578 Arg 579 29
15
Thr 250 Trp 226 28
14
Pro 228 His 232 28
12
Trp 226 His 425 27
10
Pro 228 Asp 248 25
10
Ser 257 Pro 228 17
9
Thr 250 Gln 227 15
7
Leu 249 Glu 195 13
Trp 226 Tyr 423 13
Thr 250 Leu 194 12
Sep 181 Lys 418 9
His 232 Val 229 9
In each model, an acceptor (Y) and a donor (X-H) were defined and analyzed throughout the 10
ns simulation. The percent occupancy (O) represents the fraction of frames in which a bond was
observed.
3.3.6

Binding Free Energies
Calculating the average binding free energy using GB approximations can provide relative

trends of free energies of association and dissociation. All binding free energies are shown in Table
3.7. Using the pairwise de-screening approach (HCT), the dimerization of the AB model is less
favorable upon mutation of the S177/181. Although the phosphomimetic mutations show a weaker
binding affinity trend than the phosphoserine model, the phosphoserine dimer does begin
dissociating (Figure 3.6). In contrast, the AD model has a stronger binding affinity upon active
rendering with similar exaggerations from the phosphomimetic mutation. This relative trend
visualized in Figure 3.7 supports the favorability for dimerization in the wild type but also proposes
an affinity shift towards the AD dimer upon activation. Observing the OBC models, the trend is
consistent, excluding the phosphomimetic mutant with the OBC2 model. For the AB dimer
49

containing ATP, only the HCT model was used to calculate âGbind between the dimer interfaces
to investigate how ATP binding affects dimerization. The results show that ATP, as expected,
prompts dissociation.
Table 3.7 Binding free energies of the IKKÎ˛ dimer.a
Model

HCT

AB -180.7
AD -74.6
AB -153.7
E177/181
AD -93.9
AB -168.5
pS177/181
AD -69.5
Wild-type

WT-ATP
a

AB

OBC1 OBC2
-88.6
-41.6
-65.5
-51.0
-82.4
-28.2

-50.3
-25.8
-30.4
-35.0
-39.5
-13.4

-122.0

All energies are in kcal/mol.

50

Figure 3.6 Comparison of the phosphorylated AB Dimer. Shown are the starting structure (top)
and the final structure (bottom) after 10 ns.

51

E177/181

âGbind (kcal/mol)

WT
0.0
-20.0
-40.0
-60.0
-80.0
-100.0
-120.0
-140.0
-160.0
-180.0
-200.0

AB

AD

AB

AD

pS177/181
AB

AD

HCT
OBC1
OBC2

Model

Figure 3.7 Comparison of binding free energy trends with the HCT, OBC1, and OBC2 models.

3.4

Discussion
Different structure-guided approaches to drug discovery have been developed over the

years to better guide rational design.78-81 MD simulations are widely used to elucidate structureactivity relationships, however this approach has its own weaknesses due to inadequate
conformational sampling which can be due to high computational costs and inherent weaknesses
within the force-field parameterizations.79,82,83 Focusing on the limitation of conformational
sampling, coarse-graining techniques have been designed to provide reduced representations that
permit investigations of various conformational states of large biomolecular systems and
assemblies with longer time scales84,85 that provide a foundation for multiscale modeling
approaches. Typically, molecular dynamics simulations of protein-ligand recognition are studied
using non-covalently bound ligands to a single subunit or domain of a protein, often neglecting the
proteinâs contribution to the quaternary structure. Although these free, unbound ligands are unable
to induce a change in effective force constants or break bonds with point-charged based force
52

fields, they do induce sterics which can make changes in the local tertiary structure. Despite the
limits to conformational sampling, it is important to study the multiple protomers or subunits of
quaternary structures when investigating the impact of ligand binding or point mutations to avoid
misleading details describing the recognition events, especially those of systems containing homomultimeric subunits.
As shown in this work, it is important to consider the full structure of IKKÎ˛ for the study
of the effects of distal modifications. The assembly of the inactive xIKKÎ˛ is described as a
symmetric stationary assembly whereas upon activation, the quaternary structure dissociates from
the central configuration. From the coarse-grained simulation, the large-scale motions of the kinase
domain are pertinent to the kinase activity, in that the activation loop symbolizes a pair of lips that
opens and closes repeatedly during the MD simulation. This âopenâ and âclosedâ conformation
resembles the backbone flexibility experimentally shown in the cAMP-dependent protein kinase
(cAPK or PKA) catalytic subunit.86,87 A central location for potential conformational dynamics is
at the hinge, thus opening of the hinge makes the possibility for ligand entrance in the lips of the
KD possible in both conformations. Due to the orientation of the dimer-dimer interface, the
activation loop was unhindered and accessible. The accessibility of the hinge through the
activation loop in the crystal structure is tight, but accessible for flat structures, such as ATP. The
polar functional groups of the residues in the pocket of the activation loop cause great polarity of
the pocket, which favors the formation of hydrogen bonds. Non-covalent docking of ATP in the
binding pocket made an impact on the dimerization. From this docked ATP position, the role of
CYS179 modulating ATP binding and activation can be described as a side chain that aids in
holding ATP in the pocket.

53

Since the publication of the xIKKÎ˛ crystal structure, two structures have been solved for
the active conformation of hIKKÎ˛.88,89 Both studies confirm that the overall modular arrangement
of the different domains between hIKKÎ˛ and xIKKÎ˛ are consistent. Additionally, both of the
crystal structures confirm that the active assembly of IKKÎ˛ resembles a pair of opening shears;
this is in agreement with our simulation data. Despite the subtle differences of the opened and
closed conformations, the activation segment of the KD regulates activity and dynamics.
Modifications to the distal activation loop made a significant impact to the dimerization within a
short simulation time. Although the atomistic simulation time was not long enough to observe the
opened and the closed conformations, we saw that the pocket from the KD began to open
throughout the 10 ns simulation across all models. The phosphomimetic and phosphoserine
mutations formed hydrogen bonds and salt interactions with the solvent and nearby side-chains,
however, the occupancies of these interactions were not maintained which suggests an entropic
flexibility of the KD. Previous investigations of IÎşB kinase activity showed that IKKÎ˛ activity can
be expressed with phosphomimetic mutations.32 In agreement with experiment, the effects of
imploring phosphomimetic mutations rather than actual phosphorylated amino acids were
observed in these simulations. Although the opened conformation resembles the active kinase, the
inactive KD shows similar activity.

3.5

Conclusion
In this study, molecular dynamics simulations were employed to investigate

conformational stability for modeling multimeric species and to decipher whether dimerization is
interrupted upon activation of IKKÎ˛. Coarse-grained MD simulations showed that simulating more
that one protomer of the multimeric assembly provided more stability for the assembly of IKKÎ˛.

54

Interactions at the dimer-dimer interface between the SDD and KD promote a conformational
stability for the inactive state of IKKÎ˛. Atomistic simulations of the two dimer models confirmed
that each protomer mediates electrostatic interactions that are responsible for activity of IKKÎ˛.
Dimerization of IKKÎ˛ is disrupted upon activation of Ser177 and Ser181. Rather than inducing
local conformational changes within the KD observable within the 10 ns simulation, binding of
ATP induces conformational changes that disrupt dimerization. Additionally, phosphomimetic
mutations can adequately express the active state of IKKÎ˛.
Regarding the molecular modeling of IKKÎ˛ activity, we highlight the importance of
considering both conformation-activity and structure-activity relationships. Considering how
protein-protein interactions within assemblies are affected when docking ligands or modifying
protein side-chains is critical for understanding structure-activity relationships, even in sites distal
from the protein-protein interface. Significant structure-activity relationships can be
underestimated by neglecting protein-protein interactions.

55

REFERENCES

56

REFERENCES
(1)

Baldwin, A. S., Jr. Series Introduction: The transcription factor NF-ÎşB and human
disease. J. Clin. Invest 2001, 107, 3-6.

(2)

de Martin, R.; Schmid, J. A.; Hofer-Warbinek, R. The NF-kappaB/Rel family of
transcription factors in oncogenic transformation and apoptosis. Mutat. Res. 1999, 437,
231-243.

(3)

Ghosh, S.; May, M. J.; Kopp, E. B. NF-kappa B and Rel proteins: evolutionarily
conserved mediators of immune responses. Annu. Rev. Immunol. 1998, 16, 225-260.

(4)

Sullivan, J. C.; Kalaitzidis, D.; Gilmore, T. D.; Finnerty, J. R. Rel homology domaincontaining transcription factors in the cnidarian Nematostella vectensis. Dev. Genes
Evol. 2007, 217, 63-72.

(5)

Baldwin, A. S. Control of oncogenesis and cancer therapy resistance by the
transcription factor NF-kappaB. J. Clin. Invest. 2001, 107, 241-246.

(6)

Ghosh, S.; Karin, M. Missing pieces in the NF-kappaB puzzle. Cell 2002, 109 Suppl,
S81-96.

(7)

Gilmore, T. D.; Garbati, M. R. Inhibition of NF-kappaB signaling as a strategy in
disease therapy. Curr. Top. Microbiol. Immunol. 2011, 349, 245-263.

(8)

Gilmore, T. D.; Wolenski, F. S. NF-kappaB: where did it come from and why?
Immunol. Rev. 2012, 246, 14-35.

(9)

Rayet, B.; Gelinas, C. Aberrant rel/nfkb genes and activity in human cancer. Oncogene
1999, 18, 6938-6947.

(10)

Gilmore, T. D. Introduction to NF-kappaB: players, pathways, perspectives. Oncogene
2006, 25, 6680-6684.

(11)

Perkins, N. D.; Gilmore, T. D. Good cop, bad cop: the different faces of NF-kappaB.
Cell Death Differ. 2006, 13, 759-772.

(12)

Gilmore, T. D.; Gerondakis, S. The c-Rel Transcription Factor in Development and
Disease. Genes Cancer. 2011, 2, 695-711.

(13)

Gilmore, T. D.; Kalaitzidis, D.; Liang, M. C.; Starczynowski, D. T. The c-Rel
transcription factor and B-cell proliferation: a deal with the devil. Oncogene 2004, 23,
2275-2286.

(14)

Gilmore, T. D. Introduction: The Rel/NF-kappaB signal transduction pathway. Semin.
Cancer Biol. 1997, 8, 61-62.

57

(15)

Gilmore, T. D.; Koedood, M.; Piffat, K. A.; White, D. W. Rel/NF-kappaB/IkappaB
proteins and cancer. Oncogene 1996, 13, 1367-1378.

(16)

Gilmore, T. D. NF-kappa B, KBF1, dorsal, and related matters. Cell 1990, 62, 841-843.

(17)

Sun, S. C. Non-canonical NF-kappaB signaling pathway. Cell Res. 2011, 21, 71-85.

(18)

Beg, A. A.; Ruben, S. M.; Scheinman, R. I.; Haskill, S.; Rosen, C. A.; Baldwin, A. S.,Jr
I kappa B interacts with the nuclear localization sequences of the subunits of NF-kappa
B: a mechanism for cytoplasmic retention. Genes Dev. 1992, 6, 1899-1913.

(19)

Beg, A. A.; Baldwin, A. S.,Jr The I kappa B proteins: multifunctional regulators of
Rel/NF-kappa B transcription factors. Genes Dev. 1993, 7, 2064-2070.

(20)

Zabel, U.; Henkel, T.; Silva, M. S.; Baeuerle, P. A. Nuclear uptake control of NF-kappa
B by MAD-3, an I kappa B protein present in the nucleus. EMBO J. 1993, 12, 201-211.

(21)

Huguet, C.; Crepieux, P.; Laudet, V. Rel/NF-kappa B transcription factors and I kappa
B inhibitors: evolution from a unique common ancestor. Oncogene 1997, 15, 29652974.

(22)

Baeuerle, P. A.; Henkel, T. Function and activation of NF-kappa B in the immune
system. Annu. Rev. Immunol. 1994, 12, 141-179.

(23)

Siebenlist, U.; Franzoso, G.; Brown, K. Structure, regulation and function of NF-kappa
B. Annu. Rev. Cell Biol. 1994, 10, 405-455.

(24)

Karin, M. How NF-kappaB is activated: the role of the IkappaB kinase (IKK) complex.
Oncogene 1999, 18, 6867-6874.

(25)

Israel, A. The IKK Complex, a Central Regulator of NF-ÎşB Activation. Cold Spring
Harb. Perspect. Biol. 2010, 2, a000158-a000158.

(26)

Nishikori, M. Classical and Alternative NF-kB Activation Pathways and Their Roles in
Lymphoid Malignancies. J. Clin. Exp. Hematop. 2005, 45, 15-24.

(27)

Gamble, C.; McIntosh, K.; Scott, R.; Ho, K. H.; Plevin, R.; Paul, A. Inhibitory kappa B
Kinases as targets for pharmacological regulation. Br. J. Pharmacol. 2012, 165, 802819.

(28)

Karin, M.; Delhase, M. The I kappa B kinase (IKK) and NF-kappa B: key elements of
proinflammatory signalling. Semin. Immunol. 2000, 12, 85-98.

(29)

May, M. J.; Ghosh, S. Rel/NF-kB and IkB proteins: an overview. Semin. Cancer Biol.
1997, 8, 63-73.

58

(30)

Zandi, E.; Rothwarf, D. M.; Delhase, M.; Hayakawa, M.; Karin, M. The IkappaB kinase
complex (IKK) contains two kinase subunits, IKKalpha and IKKbeta, necessary for
IkappaB phosphorylation and NF-kappaB activation. Cell 1997, 91, 243-252.

(31)

Hacker, H.; Karin, M. Regulation and function of IKK and IKK-related kinases. Sci.
STKE 2006, 2006, re13.

(32)

Mercurio, F.; Zhu, H.; Murray, B. W.; Shevchenko, A.; Bennett, B. L.; Li, J.; Young, D.
B.; Barbosa, M.; Mann, M.; Manning, A.; Rao, A. IKK-1 and IKK-2: cytokineactivated IkappaB kinases essential for NF-kappaB activation. Science 1997, 278, 860866.

(33)

Hayden, M. S.; Ghosh, S. Signaling to NF-kappaB. Genes Dev. 2004, 18, 2195-2224.

(34)

Zandi, E.; Chen, Y.; Karin, M. Direct phosphorylation of IkappaB by IKKalpha and
IKKbeta: discrimination between free and NF-kappaB-bound substrate. Science 1998,
281, 1360-1363.

(35)

Delhase, M.; Karin, M. The I kappa B kinase: a master regulator of NF-kappa B, innate
immunity, and epidermal differentiation. Cold Spring Harb. Symp. Quant. Biol. 1999,
64, 491-503.

(36)

Johnson, L. N.; Noble, M. E.; Owen, D. J. Active and inactive protein kinases:
structural basis for regulation. Cell 1996, 85, 149-158.

(37)

Pandey, M. K.; Sung, B.; Kunnumakkara, A. B.; Sethi, G.; Chaturvedi, M. M.;
Aggarwal, B. B. Berberine modifies cysteine 179 of IkappaBalpha kinase, suppresses
nuclear factor-kappaB-regulated antiapoptotic gene products, and potentiates apoptosis.
Cancer Res. 2008, 68, 5370-5379.

(38)

Gupta, S. C.; Prasad, S.; Reuter, S.; Kannappan, R.; Yadav, V. R.; Ravindran, J.; Hema,
P. S.; Chaturvedi, M. M.; Nair, M.; Aggarwal, B. B. Modification of cysteine 179 of
IkappaBalpha kinase by nimbolide leads to down-regulation of NF-kappaB-regulated
cell survival and proliferative proteins and sensitization of tumor cells to
chemotherapeutic agents. J. Biol. Chem. 2010, 285, 35406-35417.

(39)

Byun, M. S.; Choi, J.; Jue, D. M. Cysteine-179 of IkappaB kinase beta plays a critical
role in enzyme activation by promoting phosphorylation of activation loop serines. Exp.
Mol. Med. 2006, 38, 546-552.

(40)

Harikumar, K. B.; Kunnumakkara, A. B.; Ahn, K. S.; Anand, P.; Krishnan, S.; Guha, S.;
Aggarwal, B. B. Modification of the cysteine residues in IkappaBalpha kinase and NFkappaB (p65) by xanthohumol leads to suppression of NF-kappaB-regulated gene
products and potentiation of apoptosis in leukemia cells. Blood 2009, 113, 2003-2013.

(41)

Xu, G.; Lo, Y. C.; Li, Q.; Napolitano, G.; Wu, X.; Jiang, X.; Dreano, M.; Karin, M.;
Wu, H. Crystal structure of inhibitor of kappaB kinase beta. Nature 2011, 472, 325-330.

59

(42)

Nolen, B.; Taylor, S.; Ghosh, G. Regulation of protein kinases; controlling activity
through activation segment conformation. Mol. Cell 2004, 15, 661-675.

(43)

Boutros, T.; Chevet, E.; Metrakos, P. Mitogen-Activated Protein (MAP) Kinase/MAP
Kinase Phosphatase Regulation: Roles in Cell Growth, Death, and Cancer. Pharmacol.
Rev. 2008, 60, 261-310.

(44)

Pearson, G.; Robinson, F.; Beers Gibson, T.; Xu, B. E.; Karandikar, M.; Berman, K.;
Cobb, M. H. Mitogen-activated protein (MAP) kinase pathways: regulation and
physiological functions. Endocr. Rev. 2001, 22, 153-183.

(45)

Dhillon, A. S.; Hagan, S.; Rath, O.; Kolch, W. MAP kinase signalling pathways in
cancer. Oncogene 2007, 26, 3279-3290.

(46)

Ferrell, J. E.,Jr; Bhatt, R. R. Mechanistic studies of the dual phosphorylation of
mitogen-activated protein kinase. J. Biol. Chem. 1997, 272, 19008-19016.

(47)

Kalia, M.; Kukol, A. Structure and dynamics of the kinase IKK-beta--A key regulator
of the NF-kappa B transcription factor. J. Struct. Biol. 2011, 176, 133-142.

(48)

Kornev, A. P.; Taylor, S. S.; Ten Eyck, L. F. A helix scaffold for the assembly of active
protein kinases. Proc. Natl. Acad. Sci. USA 2008, 105, 14377-14382.

(49)

Steichen, J. M.; Iyer, G. H.; Li, S.; Saldanha, S. A.; Deal, M. S.; Woods, V. L., Jr;
Taylor, S. S. Global consequences of activation loop phosphorylation on protein kinase
A. J. Biol. Chem. 2010, 285, 3825-3832.

(50)

Scheeff, E. D.; Bourne, P. E. Structural evolution of the protein kinase-like superfamily.
PLoS Comput. Biol. 2005, 1, e49.

(51)

Zuccotto, F.; Ardini, E.; Casale, E.; Angiolini, M. Through the "gatekeeper door":
exploiting the active kinase conformation. J. Med. Chem. 2010, 53, 2681-2694.

(52)

Palermo, N. Y.; Natarajan, A. Beyond the frog: the evolution of homology models of
human IKKbeta. Bioorg. Med. Chem. Lett. 2011, 21, 6081-6084.

(53)

Berman, H. M. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235-242.

(54)

Chemical Computing Group Inc. Molecular Operating Environment (MOE), 2011.10.
2011, .

(55)

Wang, J.; Cieplak, P.; Kollman, P. A. How well does a restrained electrostatic potential
(RESP) model perform in calculating conformational energies of organic and biological
molecules? J. Comput. Chem. 2000, 21, 1049-1074.

(56)

Ponder, J. W.; Case, D. A. Force fields for protein simulations. Adv. Protein Chem.
2003, 66, 27-85.

60

(57)

Salomon-Ferrer, R.; Case, D. A.; Walker, R. C. An overview of the Amber
biomolecular simulation package. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2013, 3,
198-210.

(58)

Case, D. A.; Cheatham, T. E.; Darden, T.; Gohlke, H.; Luo, R.; Merz, K. M.; Onufriev,
A.; Simmerling, C.; Wang, B.; Woods, R. J. The Amber biomolecular simulation
programs. J. Comput. Chem. 2005, 26, 1668-1688.

(59)

Soga, S.; Shirai, H.; Kobori, M.; Hirayama, N. Use of amino acid composition to
predict ligand-binding sites. J. Chem. Inf. Model. 2007, 47, 400-406.

(60)

Durand, E.; Derrez, E.; Audoly, G.; Spinelli, S.; Ortiz-Lombardia, M.; Raoult, D.;
Cascales, E.; Cambillau, C. Crystal structure of the VgrG1 actin cross-linking domain
of the Vibrio cholerae type VI secretion system. J. Biol. Chem. 2012, 287, 3819038199.

(61)

Meagher, K. L.; Redman, L. T.; Carlson, H. A. Development of polyphosphate
parameters for use with the AMBER force field. J. Comput. Chem. 2003, 24, 10161025.

(62)

Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and
testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157-1174.

(63)

Homeyer, N.; Horn, A. H.; Lanig, H.; Sticht, H. AMBER force-field parameters for
phosphorylated amino acids in different protonation states: phosphoserine,
phosphothreonine, phosphotyrosine, and phosphohistidine. J. Mol. Model. 2006, 12,
281-289.

(64)

Kenzaki, H.; Koga, N.; Hori, N.; Kanada, R.; Li, W.; Okazaki, K.; Yao, X.; Takada, S.
CafeMol: A Coarse-Grained Biomolecular Simulator for Simulating Proteins at Work.
J. Chem. Theory Comput. 2011, 7, 1979-1989.

(65)

Joung, I. S.; Cheatham, T. E. Determination of Alkali and Halide Monovalent Ion
Parameters for Use in Explicitly Solvated Biomolecular Simulations. J. Phys. Chem. B
2008, 112, 9020-9041.

(66)

Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. The missing term in effective pair
potentials. J. Phys. Chem. 1987, 91, 6269-6271.

(67)

Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C.
Comparison of multiple Amber force fields and development of improved protein
backbone parameters. Proteins 2006, 65, 712-725.

(68)

Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng,
E. C.; Ferrin, T. E. UCSF Chimera--a visualization system for exploratory research and
analysis. J. Comput. Chem. 2004, 25, 1605-1612.

61

(69)

Prall, M. VMD: a graphical tool for the modern chemists. J. Comput. Chem. 2001, 22,
132-134.

(70)

Miller, B. R.; McGee, T. D.; Swails, J. M.; Homeyer, N.; Gohlke, H.; Roitberg, A. E.
MMPBSA.py: An Efficient Program for End-State Free Energy Calculations. J. Chem.
Theory Comput. 2012, 8, 3314-3321.

(71)

Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Pairwise solute descreening of solute
charges from a dielectric medium. Chem. Phys. Lett. 1995, 246, 122-129.

(72)

Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Parametrized Models of Aqueous Free
Energies of Solvation Based on Pairwise Descreening of Solute Atomic Charges from a
Dielectric Medium. J. Phys. Chem. 1996, 100, 19824-19839.

(73)

Tsui, V.; Case, D. A. Theory and applications of the generalized born solvation model
in macromolecular simulations. Biopolymers 2000, 56, 275-291.

(74)

Onufriev, A.; Bashford, D.; Case, D. A. Exploring protein native states and large-scale
conformational changes with a modified Generalized Born model. Proteins: Struct.,
Funct., Bioinf. 2004, 55, 383-394.

(75)

Massova, I.; Kollman, P. A. Combined molecular mechanical and continuum solvent
approach (MM-PBSA/GBSA) to predict ligand binding. Perspect. Drug Discovery Des.
2000, 18, 113-135.

(76)

Krissinel, E.; Henrick, K. Inference of macromolecular assemblies from crystalline
state. J. Mol. Biol. 2007, 372, 774-797.

(77)

Krissinel, E. Crystal contacts as nature's docking solutions. J. Comput. Chem. 2010, 31,
133-143.

(78)

Mandal, S.; Moudgil, M.; Mandal, S. K. Rational drug design. Eur. J. Pharmacol. 2009,
625, 90-100.

(79)

Durrant, J. D.; McCammon, J. A. Molecular dynamics simulations and drug discovery.
BMC Biol. 2011, 9, 71-7007-9-71.

(80)

Borhani, D. W.; Shaw, D. E. The future of molecular dynamics simulations in drug
discovery. J. Comput.-Aided Mol. Des. 2012, 26, 15-26.

(81)

Hao, G. F.; Yang, G. F.; Zhan, C. G. Structure-based methods for predicting target
mutation-induced drug resistance and rational drug design to overcome the problem.
Drug Discovery Today 2012, 17, 1121-1126.

(82)

Chodera, J. D.; Mobley, D. L.; Shirts, M. R.; Dixon, R. W.; Branson, K.; Pande, V. S.
Alchemical free energy methods for drug discovery: progress and challenges. Curr.
Opin. Struct. Biol. 2011, 21, 150-160.

62

(83)

Mobley, D. L.; Chodera, J. D.; Dill, K. A. The Confine-and-Release Method: Obtaining
Correct Binding Free Energies in the Presence of Protein Conformational Change.
J. Chem. Theory Comput. 2007, 3, 1231-1235.

(84)

Tozzini, V. Minimalist models for proteins: a comparative analysis. Q. Rev. Biophys.
2010, 43, 333-371.

(85)

Kamerlin, S. C. L.; Vicatos, S.; Dryga, A.; Warshel, A. Coarse-Grained (Multiscale)
Simulations in Studies of Biophysical and Chemical Systems. Annu. Rev. Phys. Chem.
2011, 62, 41-64.

(86)

Li, F.; Gangal, M.; Juliano, C.; Gorfain, E.; Taylor, S. S.; Johnson, D. A. Evidence for
an internal entropy contribution to phosphoryl transfer: a study of domain closure,
backbone flexibility, and the catalytic cycle of cAMP-dependent protein kinase. J. Mol.
Biol. 2002, 315, 459-469.

(87)

Li, F.; Gangal, M.; Jones, J. M.; Deich, J.; Lovett, K. E.; Taylor, S. S.; Johnson, D. A.
Consequences of cAMP and catalytic-subunit binding on the flexibility of the A-kinase
regulatory subunit. Biochemistry 2000, 39, 15626-15632.

(88)

Polley, S.; Huang, D.; Hauenstein, A. V.; Fusco, A. J.; Zhong, X.; Vu, D.;
Schroefelbauer, B.; Kim, Y.; Hoffmann, A.; Verma, I. M.; Ghosh, G.; Huxford, T. A
structural basis for IkB Kinase 2 Activation via oligomerization dependent trans autophosphorylation. PLoS Biol. 2013, 11, 1-13.

(89)

Liu, S.; Misquitta, Y. R.; Olland, A.; Johnson, M. A.; Kelleher, K. S.; Kriz, R.; Lin, L.
L.; Stahl, M.; Mosyak, L. Crystal Structure of a Human IkappaB Kinase beta
Asymmetric Dimer. J. Biol. Chem. 2013, 288, 22758-22767.

63

CHAPTER 4 Impact of Intracellular Ionic Strength on Dimer Binding in the NF-kB
Inducing Kinase

4.1

Introduction
Understanding the signaling pathways associated with the onset and progression of

oncogenesis has been a target of many diverse disciplines. Much emphasis has been placed on
elucidating the nuclear factor (NF)-ÎşB signaling pathway since aberrations within this pathway
are associated with many different kinds of immune responses.1â3 Sequestered in an inactive state
in the cytoplasm by its inhibitory subunits (IÎşBÎą, IÎşBÎ˛, and IÎşBÎľ), the NF-ÎşB pathway includes a
family of five transcription factors, NF-ÎşB1 (p50 and p105), NF-ÎşB2 (p52 and p100), RelA (p65),
RelB, and c-Rel.4â7 These transcription factors are assembled into operational-specific and
functional-specific dimers. The dimers include both heteroligomers and homoligomers that,
depending upon the conformational and oligomeric architecture, can either induce or repress gene
regulation and expression.8â12 These transcription factors differ on how and where they bind to
DNA and different combinations lead to different specificities of gene expression.9 As a central
activator of genes involved in inflammation and immune function, the NF-ÎşB family represents a
vital signaling cascade that must be further analyzed to prevent dysregulations in gene expression.
Although there are many ways in which NF-ÎşB activation can be triggered, the release of
the NF-ÎşB dimers are regulated by two main pathways.13 In the classical, or canonical, pathway,
the rate-limiting step of the signaling mechanism is the signal-induced degradation of the IÎşBs,
which happens upon the activation of the catalytic subunits of the IÎşB kinase (IKK) complex.
Comprised of two homologous catalytic subunits, IKKÎą and IKKÎ˛, and a regulatory subunit IKKÎł,
activated enzymes of the IKK complex phosphorylate IÎşB, leading to the selective release of the

64

NF-ÎşB dimers. These NF-ÎşB dimers translocate to the nucleus to regulate gene expression. The
signaling cascades that result from the activation of IKKÎ˛ regulate the classical pathway by
targeting NF-ÎşB1. In contrast, the alternative, or non-canonical, pathway is dependent upon the
activation of the NF-ÎşB Inducing Kinase (NIK), an upstream kinase that selectively
phosphorylates IKKÎą, which activates NF-ÎşB2.14â17 Although some studies have suggested that
NIK serves specific functions in the classical pathway, as the critical component of the noncanonical pathway, NIK serves a promising target for the treatment of autoimmune disorders and
cancers. 18â22
The human sequence of NIK, composed of 947 amino acids, is comprised of an N-terminal
domain that binds to TRAF3 (30â120), a negative regulatory domain (121â318), a kinase domain
(390â660), and a non-catalytic C-terminus (661-947).23 The negative regulatory domain contains
two structural motifs, a basic region (127-147)âsimilar to that of a leucine zipperâand a prolinerich repeat sequence (250-317). The two regions of the negative regulatory domain have been
shown to control the signaling function of NIK as deletions of either motif greatly enhances the
NF-ÎB inducing activity of NIK.24 The C-terminal region is responsible for the signaling function
of NIK and is critical for its recognition of other proteins within the pathway, including IKKÎą and
NF-ÎşB2. As a member of the mitogen-activated protein (MAP) kinase kinase kinase (MAP3K)
family, NIK contains the conserved DFG-APE motif (534-566) shared among serine/threonine
kinases. Kinases in this family typically are activated via phosphorylation of serines or threonines
homologous to Ser549, Thr552, and Thr559 in NIK.
Previous structural and biochemical studies investigating the regulation of NIK have drawn
interest into understanding how NIK maintains its structural integrity for its function.25 A crystal
structure of truncated NIK, encompassing residues 330-679, was shown to retain catalytic

65

activity.23 However, unlike other kinase domains, this catalytically competent conformation is
maintained by an N-terminal extension prior to the kinase domain rather than through a
phosphorylation event. Mutations that mimic the state of being phosphorylated, termed
phosphomimetic mutations, including S549D, T552E/E, and T559D/E, were considered, however,
it was determined that the activity of the mutants was similar to that of the wild-type (WT).26 The
crystal structure contained two molecules of NIK in an asymmetric arrangement that formed a
head-to tail homodimer with a buried solvent-accessible surface area between the interface of 2900
Ă2.
Additional studies were carried out to further investigate whether the observed dimer
arrangement represented the actual dimer interface or a nonspecific crystallographic artifact.
Chromatographic analyses indicated that the protein in solution formed both monomers and
dimers. While the full NIK protein is known to be an oligomer, what the dimeric arrangement of
the crystal corresponds to remains elusive.25 It has been demonstrated that dimerization, along with
other oligomeric arrangements, is significant among many protein kinases as the oligomeric state
represents a specific functional state that couples with local conformational changes within a
protomer, as small as local sidechain or segment movements to larger domain movements, that
dictate how the protein interacts with other biomolecules.27â29
The active site of NIK has been characterized as having structural features that resemble a
catalytically active conformation similar to other Ser/Thr kinases in the active state.23,26,30â32 Small
molecule binding sites in NIK have been identified in crystallographic studies. The nucleotide
binding mode has been shown to be consistent with that of protein kinases that undergo
phosphorylation in the activation segment, including the cyclic AMP-dependent kinase.26,33
Several inhibitors have been identified to bind within the catalytic segment. Despite the diversity

66

of the compounds in regards to the structure and activity properties, the compounds have a similar
scaffold that relates to the purine base of ATP, containing a nitrogen heterocycle that binds near
the hinge region between the N and C lobes and interacts with the backbone of Leu471, Leu472,
and Glu470. The crystal structure of the truncated NIK oligomer shows that the activation segment
of one subunit interacts with the N-terminal extension of the other. In this structure, each subunit
is crystallized with ATP-Îł-S in the activation segment, however, the conformation of the oligomer
interface may hinder the binding of other substrates.
Previous

work

investigating

the

conformational-activity

and

structure-activity

relationships of IKKÎ˛ identified that protein-protein complexes crystallized with surface mutations
or phosphomimetic mutations do not always represent the true conformational state and that
dynamics are essential.34 Additionally, it was demonstrated that through the use of molecular
simulation and by introducing mutations and the binding of small molecules to the crystal
structure, allosteric responses could be observed on both a local and a global scale. As the function
of a protein is determined by its structure, which is impacted by the surrounding environment, the
activity of proteins is extremely sensitive to changes in the environment. Specifically, changes in
the intracellular temperature, pressure, pH, and ionic strength can all affect the structure and
function of proteins.35â40 The aforementioned conditions can also influence crystal growth as
oligomerizing may be significant for crystallization. Since the activity of a protein is measured by
its ability to effectively bind and communicate with other molecules, it is essential to understand
the environmental sensitivity of the protein recognition of other molecules involved in abnormal
signal transduction mechanisms. Previous studies have observed that cancerous cells, relative to
normal cells, have different acidic environments and intracellular ionic compositions

67

41â44

,

however, how this directly impacts protein interactions with other molecules within the signaling
cascade have not been investigated.
In this work, computer simulation and modeling techniques are employed to investigate
the crystallographic arrangement corresponds to a stable dimer and offer additional insight into
how the structure-function properties of NIK are dependent upon the intracellular
microenvironment and contribute to the its stabilization. By investigating both the inactive and
active states of NIK in conjunction with different intracellular ionic strengths, the assessment of
the protein-protein interactions provide further resolution of the oligomeric properties of the
truncated construct and explicitly highlight the role of the microenvironment in the maintenance
of the active conformational states of NIK. The inactive state of NIK, also referred to as the wildtype (WT) system, is compared to an active state of NIK which has a phosphomimetic mutation
on Ser549 (S549D). Furthermore, to evaluate the impact of the environment on NIKâs structure,
100 mM of sodium chloride was added to both the inactive and active states, represented as WT
[Na+] and S549D [Na+] systems, respectively, and compared to the neutralized systems with 0
mM sodium chloride.
4.2

Material and Methods
The crystal structure of NIK, determined with two phosphomimetic mutant (S549D)

molecules, was obtained from the Protein Data Bank (PDB ID: 4DN5)26 and prepared using
Molecular Operating Environment 45 (MOE). Unresolved bonds and atoms were added using MOE
and minimized under the Amber99 force field. Mutations were introduced to model both the
inactive (wild-type) state and active state, via S549D phosphomimetic mutations. Both models of
the inactive and active states were neutralized with 8 Na+ ions and solvated in 14 Ă of TIP4P-Ew46
water beyond the solute in a cuboid box, resulting in an addition of 27560 water molecules. From

68

these, two additional models were created by solvating with 100 mM of sodium chloride by adding
136 molecules of NaCl (68 of each ion).
The energy of each solvated system was minimized holding a harmonic restraint on the
protein of 500, 200, and 20 kcal/mol-A2, using 500 steepest descent steps and 500 conjugate
gradient steps per harmonic restraint and a non-bond cutoff of 10 Ă, followed by 50 ps of NPT
MD with a 20 kcal/mol-A2 restraint. Subsequently, the energy of each system was further
minimized under decreasing restraints on the protein of 10, 5, and 0 kcal/mol using 250 steepest
descent steps and 250 conjugate gradient steps. After the energy minimization, the systems were
heated gradually to 300K over 50 ps, followed by 1 ns of NPT MD simulation. Four independent
simulations were performed for 215 ns at 300K using AMBER 1447 in the presence of 0 mM and
100 mM of sodium chloride and under the Amber99SB48 force field. The temperature was
controlled using Langevin dynamics with a collision frequency of 1 ps-1. The Berendsen barostat
was used with isotropic position scaling with a pressure relaxation time of 2 ps to maintain the
pressure at 1 bar.
Trajectory analysis was performed using AmberTools and visualized with VMD49.
Thermodynamic parameters and root mean-squared deviations (RMSDs) from the starting
structure were performed using the CPPTRAJ module50 of AmberTools. Hydrogen bonding
analysis was performed with the CPPTRAJ module of AmberTools, where a hydrogen bond was
defined by a cutoff distance of 3.5 Ă and a donor-hydrogen-acceptor angle of 180ÂşÂą60Âş. Free
energy calculations were carried out using MMPBSA.py.51 The electrostatic solvation energies
were calculated for every ps using three GB models: the pairwise GB models of Hawkins and coworkers52â54 (HCT) and two modified models developed by Onufriev and coworkers55 (OBC1 and
OBC2), setting the dielectric constants to 1 and 80 for the molecule and the solvent, respectively.

69

Since only relative free energy trends were of interest, solute entropy was neglected. The PISA
(Protein Interfaces, Surfaces, and Assemblies) software was used on static structures of the initial
crystal structure and the final frames of the four independent simulations to calculate properties of
the macromolecular interface.56
4.3

Results and Discussion
To assess the stability of the simulation, thermodynamic parameters are extracted from the

trajectories and plotted as a function of time (Table 4.1). Analysis of the thermodynamic
parameters shows that the total energy, temperature, and density fluctuated around a common
value, reflecting that the simulations are stable under the isobaric-isothermal ensemble and that
the properties investigated have physical significance.

Table 4.1 Average thermodynamic parameters from simulations.

WT
WT [Na+]
S549D
S549D [Na+]

Density (g/mL)
1.032 Âą 0.001
1.038 Âą 0.001
1.033 Âą 0.001
1.038 Âą 0.001

Total Energy (kcal/mol)
-264465.5 Âą 352.6
-276116.7 Âą 351.1
-267543.1 Âą 343.2
-279172.3 Âą 354.4

Temperature (K)
300.01 Âą 0.96
300.01 Âą 0.96
300.01 Âą 0.97
300.01 Âą 0.96

4.3.1 Structural Fluctuation
Local side-chain dynamics are structurally significant as they often couple with larger
conformational dynamics and are central for understanding correlations between protein-function.
The flexibility of NIK was observed by measuring the RMSD to compare its deviations from both
the starting structure and the average structure. As shown in Figure 4.1, the RMSD of WT [Na+]
generally deviated less than the WT, but the mutant S549D [Na+] deviated more than the S549D.
This indicates that the introduction of the mutation increases the sensitivity of the protein to solvent
ions. The fluctuations of the neutral WT and the S549D converged and stabilized early in the
70

simulation in which the mutant deviated less than the WT; however, the addition of the ion buffer
perturbed this relationship. By comparing the neutral systems with the [Na+] systems, it is clear
that the presence of the [Na+] buffer stabilizes the WT more than the mutant. To further
characterize these motions, the root mean-squared fluctuation (RMSF) was measured to identify
whether the deviations were consistent across the full structure. The RMSF plot indicates that the
N-terminal segment and part of the kinase domain (Val338 to Ala438) is stabilized throughout the
simulation, whereas the activation loop and the C-terminal (Ser581 to Pro675) are more dynamic
regions (Figure 4.2). Both systems containing [Na+] resulted in significantly lower deviations,
suggesting that [Na+] maintains an accessible area for catalytic activity. The results show that the
presence of the ionic environment results in a reduced flexibility of the NIK structure, which may
potentially mediate its catalytic activity. As discussed earlier, the C-terminal plays a role in
controlling the signaling function of NIK while the N-terminal maintains its catalytically
competent conformation; here the results agree.

71

Figure 4.1 Root mean square deviations (RMSDs) obtained from the 215 ns simulation.
(A) The RMSDs were calculated from the starting structure of the inactive (WT) and
phosphomimetic mutant (S549D) states in a neutral solution (color coded in blue and green,
respectively) and a 100 mM NaCl solution ([Na+]; color coded in red and purple,
respectively). For comparison, plots B and C highlight the trends between the inactive and
the active states, whereas the impact of the ionic environment is highlighted in plots D and
E. The darker colored lines represent the moving average per 100 frames.
72

40

WT

WT [Na+]

S549D

S549D [Na+]

35

RMSF (Ă)

30
25
20
15
10
5
0
332 375 418 461 504 547 590 633 333 376 419 462 505 548 591 634

Figure 4.2 Root mean square fluctuations (RMSF) of each protomer of NIK. The RMSFs over
the 215 ns were calculated per residue for each protomer (332-675) for the modeled inactive
states, WT and WT [Na+] (represented in blue and red, respectively), and active states, S549D
and S549D [Na+] (represented in green and purple, respectively).

4.3.2

Changes in Solvent Accessibility
Driving forces that promote the functional and dysfunctional association or dissociation of

proteins can be measured by physicochemical properties.57â59 In particular, the solvent accessible
surface area is useful for detecting and analyzing changes in the protein conformation induced as
a result of point-mutations, ligand binding as well the binding of other proteins. 60 The structure
of the NIK protein was crystallized with two molecules related by a 2-fold symmetry, which is a
common point group symmetry observed in biological homodimers.61,62 Together, the two
molecules had a surface area of 29656 Ă2 in which 2932 Ă2 is buried between the two units (Table
4.2) However, this was based on a static structure that was crystalized with magnesium, g-ATP,
and crystallizing agents.

73

Table 4.2 Summary of the dimer interface of NIK.
Surface Area Buried Area
ÎGint
ÎGdiss
TÎSdiss
Modelab
(Ă2)
(Ă2)
(kcal/mol) (kcal/mol) (kcal/mol)
4DN5a
29656.2
2931.8
-4.9
3.9
13.9
WT
29509.2
3692.0
-5.2
3.5
13.8
WT [Na+]
30740.6
2894.5
-8.9
2.8
14.1
S549D
33479.4
492.8
-1.6
-11.2
13.2
S549D [Na+]
31878.6
2829.6
1.8
-4.5
13.8
a
The crystal structure of the NIK dimer.
b
Predictions of the structural and chemical properties correspond to the last trajectory (215 ns).
The free energy of formation (ÎGint) and free energy of dissociation (ÎGdiss) are reported in
kcal/mol.

Using a static image from the last frame of simulation, the changes in the total and buried
surface area of the four models are summarized in Table 4.2. Looking at the changes between the
crystal structure and WT, the change in the surface area is small (-147 Ă2), however the change in
the buried surface area increases by 760 Ă2. This is interesting because the crystal structure
corresponds to a phosphomimic state of NIK, as reflected by S549D, which the surface area
increases by 3823 Ă2 and the buried surface area decreases by 2439 Ă2 (meaning that the unit is
dissociating/associating).
For the models that include an ionic strength relative to the crystal structure, the surface area
of WT [Na+] and S549D [Na+] increases by 1084 Ă2 and 2222 Ă2, respectively, and buried surface
area decreases at by 37 Ă2 and 102 Ă2, respectively, suggesting that the presence of ions may have
an explicit role in stabilizing or destabilizing interactions between the dimer interface. Comparing
the changes between systems with and without an ion buffer, the change in the surface area of WT
to WT [Na+] increases by 1231 Ă2 and the buried surface area decreases by 798 Ă2. In contrast,
for the phosphomimic mutant model, the surface area decreases by 1601 Ă2 (from S549D to S549D
[Na+]) and the buried surface area increases by 2337 Ă2, which means that the dimer interface
becomes more exposed.
74

By analyzing the dynamic changes of a reference protein structure, a mutant or modified
state can be examined in the absence of a corresponding experimentally resolved structure to
observe how the reference system responds to changes. To characterize the effect of the ionic
composition on the dimerization, the accessible surface area (ASA) was measured to determine if
either the phosphomimetic mutation or the ionic strength induced conformational changes that
expose the dimer to more solvent (Figure 4.3). The neutral WT and S549D have an average ASA
of 28359Âą450 and 29481Âą390 Ă2, respectively, and a greater average ASA of 28840Âą511 and
29912Âą591 Ă2, respectively, for the [Na+] systems. After 8 ns, the ASA for both states fluctuate
around a common value, in which the ASA of S549D is nearly 3000 Ă2 greater than the WT. In
contrast, the ASA for the [Na+] systems increases consistently, in which the S549D [Na+] is
around 2000 Ă2 greater than the WT [Na+]. Comparing each state with and without the presence
of an ion buffer, the trends in the ASA diverge with separations from 1000-3000 Ă2 for the WT
and 1000-2000 Ă2 for the S549D model after 6 ns. These trends indicate that the S549D
phosphomimetic mutation induces conformational changes that make the NIK dimer more exposed
to solvent and that the presence of the ion buffer impacts the structure more than the mutation
alone.
4.3.3

Hydrogen-Bonding Analysis/Dimer vs Monomer
Structural-based chromatographic spectroscopy and spectrometry studies were used to

determine whether the two asymmetric molecules resolved in the crystal structure of the truncated
NIK protein are representative of the homodimer as expressed in the full-length protein or a
crystallographic artifact. 26 However, the oligomeric arrangement was unable to be resolved as it
was determined that NIK in solution is both monomeric and dimeric. To quantify the gain or loss
of interactions between the dimer interface upon mutation, a hydrogen bond analysis was

75

performed. As illustrated in Table 4.3, upon the activation of NIK, hydrogen bonding interactions
between the dimer interface decrease in comparison to the WT. Interestingly, three strong
hydrogen bond donor-acceptor pairs, (Phe436-Arg368; Glu434-Ser367; Val435-Arg368), were
identified in the neutral WT that were not observed in the other models. Furthermore, the loss of
Thre559, NIKâs phosphorylation site, as an acceptor of Lys373 in both [Na+] systems shows that
the presence of sodium alters the phosphorylation activity of NIK. The results suggest that the
presence of the sodium buffer alters hydrogen bonding networks. Regarding the dimerization, the
trends indicate that the crystallized structure of NIK may resemble the homodimer.
Table 4.3 Hydrogen bonding analysis of the NIK dimer interface.a
Y
(Unit A)

X-H
(Unit B)

Ser371
Gly558
Thr561
Ser371
Glu395
Thr561
Arg432
Glu461
Trp464
Ser410
Glu395
Asp554
Gly412
Glu396
Pro370
Pro370
Asp519
Glu413
Hie402
Pro370
Glu560
Gly558
Arg408
Asn466
Leu551
Gln403

Glu560
Ser371
Lys373
Thr559
Arg408
Pro372
Phe411
Arg408
Gly412
Arg432
Arg405
Arg368
Trp464
Arg405
Pro557
Met563
Lys373
Lys430
Gln403
Val568
Pro370
Pro370
Glu461
Arg408
Arg368
Gln403

WT

81%
77%
54%
39%
38%
34%
32%
30%
29%
27%
25%
23%
19%
19%
16%
15%
15%
14%
11%
11%
10%
10%
10%
10%
2%

WT
[Na+]

80%
66%
3%
47%
8%
23%
36%
9%
43%
14%
14%
25%
22%
3%
6%
6%
12%
2%
1%
3%

S549D

75%
67%
19%
36%
30%
7%
35%
18%
14%
11%
18%
12%
24%
20%
29%
9%
7%
12%
0%
8%
7%
18%
1%
2%
76

S549D
[Na+]

97%
72%
7%
47%
14%
3%
8%
19%
18%
42%
47%
49%
3%
14%
11%
1%
5%
9%
11%
2%
51%
40%

Table 4.3 (contâd)
a

The hydrogen bond acceptor (Y) and donor (X-H) pairs observed from the simulations are
denoted by percent occupations.
4.3.4

Aggregation Propensity Sensitive to Ion Buffer
To characterize the aggregation propensity of the two dimer states, the radius of gyration

was measured to differentiate the effects caused by the ionic strength versus the mutation (Figure
4.4). Relative to the respective neutral systems, the presence of the ionic buffer in the WT and
S549D marginally enhances the rate of the dimer dissociation. For the neutral systems, the radius
of gyration plateaus after 8 ns indicating that the dissociation of the dimer is less dependent upon
the mutation and that the oligomerization is sensitive to the ionic strength. There are several
exposed cavities between the dimer interface in which solvent molecules are occupied for the
duration of the simulation (Figure 4.5). It should be noted that the presence of sodium ions within
these hydrated cavities was observed.

Figure 4.3 Solvent residing between the dimer interface.

77

Figure 4.4 Changes in the compactness of the crystal structure.

4.3.5

Comparisons Between the Dimer Binding Energies
To estimate the effects of the ionic strength in terms of the energetic property between the

inactive and active state, the relative binding free energies were evaluated (Table 4.4; Figure 4.5).
Using the pairwise screening approach (HCT), relative to the WT, the binding free energy of the
S549D decreases. The presence of ions enhances the association and dissociation of the WT and
S549D, respectively. The electrostatic component of the binding free energy is sensitive to input
parameters involved within the calculationâincluding the internal dielectric constant, the probe
radius, and the force fieldâand can lead to unreliable absolute free energy predictions. For this
only relative binding free energetics are considered. In order to evaluate the consistency of the
trends between the active and inactive states, additional free energy GB solvation models,
including the OBCn and GBn models, were used to calculate the dimer binding free energies; in
all cases the trends held.

78

Figure 4.5 Binding free energy trends with the HCT and OBC Generalized Born models.

Table 4.4 Binding free energies of the NIK dimer.a
Model
HCT
OBC1
WT
-126.8 -80.9
WT [Na+]
-118.1 -76.1
S549D
-126.1 -76.3
S549D [Na+] -134.4 -81.0
a
All units are reported in kcal/mol.

4.4

OBC2
-75.2
-71.2
-68.5
-75.0

Conclusions
The structure of a protein determines the binding ability and specificity of small molecules

and other proteins. Disruptions that cause changes in the native structure of NIK usually serve as
the basis for mutations and incorrect gene expressions that lead to diseases and cancers. Changes
in the intracellular compositions of healthy cells are a potential factor for disrupting protein
structure. The results in this study indicate that an increase in [Na+] tends to further aid in the
79

stabilization of the inactive state of NIK and the destabilization of active state. This implies that
increases in sodium concentration in cells can actually disrupt protein structure and lead to
mutations and diseases. This claim is also supported by studies that indicate that there is a
considerable increase in the sodium in many tumor cells43, which is most likely through sustained
depolarization of their cell membranes and high mitotic activity of the cancer cells. These shifts in
ionic concentrations within the cells may change the environment in which proteins such as NIK
function. These could lead to changes in the function of the protein and the onset of cancer.
This computational study was conducted to determine structure-function properties of the
NIK protein. NIK is of interest due to its important role in the NF-kB pathway, a pathway that
regulates many inflammatory and autoimmune processes. The results show a homodimeric
structure with sensitivity towards ionic presence. Furthermore, the introduction of the
phosphomimetic mutation on the 549th residue significantly impacted the behavior and
subsequently the function of NIK. The S549D mutation induces conformational changes that make
the NIK dimer more exposed to solvent. The increase in solvent-accessibility due to the mutation
also serves to make NIK even more sensitive to an ionic presence. The addition of the ions then
causes the structural flexibility of NIK to increase. These changes brought on by the ionic
sensitivity of NIK then cause structural fluctuations, specifically in the C-terminal and N-terminal
regions. These fluctuations in structure then result in changes in kinase domain, allowing for the
surrounding residues to acclimate in order to allow NIK to maintain the active conformation. This
is further supported in that the presence of the sodium buffer alters hydrogen bonding networks
and results in changes in the bonding patterns. Together, these results provide further insight on
how NIK maintains an active conformation.

80

REFERENCES

81

REFERENCES
(1)

Bonizzi, G.; Karin, M. The Two NF-ÎşB Activation Pathways and Their Role in Innate and
Adaptive Immunity. Trends Immunol. 2004, 25 (6), 280â288.

(2)

Courtois, G.; Gilmore, T. D. Mutations in the NF-kappaB Signaling Pathway:
Implications for Human Disease. Oncogene 2006, 25 (51), 6831â6843.

(3)

Hoesel, B.; Schmid, J. a. The Complexity of NF-ÎşB Signaling in Inflammation and
Cancer. Mol. Cancer 2013, 12 (1), 86.

(4)

Rayet, B.; Gelinas, C. Aberrant Rel/nfkb Genes and Activity in Human Cancer. Oncogene
1999, 18 (49), 6938â6947.

(5)

Karin, M. NF-kappaB and Cancer: Mechanisms and Targets. Mol. Carcinog. 2006, 45 (6),
355â361.

(6)

Naugler, W. E.; Karin, M. NF-kappaB and Cancer-Identifying Targets and Mechanisms.
Curr. Opin. Genet. Dev. 2008, 18 (1), 19â26.

(7)

Karin, M.; Lin, A. NF-kappaB at the Crossroads of Life and Death. Nat. Immunol. 2002, 3
(3), 221â227.

(8)

Saccani, S.; Pantano, S.; Natoli, G. Modulation of NF-ÎşB Activity by Exchange of
Dimers. Mol. Cell 2003, 11 (6), 1563â1574.

(9)

Wang, V. Y.-F.; Huang, W.; Asagiri, M.; Spann, N.; Hoffmann, A.; Glass, C.; Ghosh, G.
The Transcriptional Specificity of NF-ÎşB Dimers Is Coded within the ÎşB DNA Response
Elements. Cell Rep. 2012, 2 (4), 824â839.

(10) Chen, F. E.; Ghosh, G. Regulation of DNA Binding by Rel/NF-kappaB Transcription
Factors: Structural Views. Oncogene 1999, 18 (49), 6845â6852.
(11) Wietek, C.; OâNeill, L. A. J. Diversity and Regulation in the NF-ÎşB System. Trends
Biochem. Sci. 2007, 32 (7), 311â319.
(12) Thanos, D.; Maniatis, T.; Il-, P. M. a. NF-KB: A Lesson in Family Values Minireview.
Cell 1995, 80, 529â532.
(13) Chariot, A. The NF-ÎşB-Independent Functions of IKK Subunits in Immunity and Cancer.
Trends Cell Biol. 2009, 19 (8), 404â413.
(14) Xiao, G.; Harhaj, E. W.; Sun, S.-C. NF-ÎşB-Inducing Kinase Regulates the Processing of
NF-ÎşB2 p100. Mol. Cell 2001, 7 (2), 401â409.
(15) Sun, S.-C. Non-Canonical NF-ÎşB Signaling Pathway. Cell Res. 2011, 21 (1), 71â85.
(16) Ling, L.; Cao, Z.; Goeddel, D. V. NF-kappaB-Inducing Kinase Activates IKK-Alpha by
Phosphorylation of Ser-176. Proc. Natl. Acad. Sci. U. S. A. 1998, 95 (7), 3792â3797.
82

(17) Malinin, N. L.; Boldin, M. P.; Kovalenko, A. V; Wallach, D. MAP3K-Related Kinase
Involved in NF-kappaB Induction by TNF, CD95 and IL-1. Nature 1997, 385 (6616),
540â544.
(18) Ramakrishnan, P.; Wang, W.; Wallach, D. Receptor-Specific Signaling for Both the
Alternative and the Canonical NF-kB Activation Pathways by NF-kB-Inducing Kinase.
Immunity 2004, 21 (4), 477â489.
(19) Smith, C.; Andreakos, E.; Crawley, J. B.; Brennan, F. M.; Feldmann, M.; Foxwell, B. M.
J. NF- B-Inducing Kinase Is Dispensable for Activation of NF- B in Inflammatory
Settings but Essential for Lymphotoxin Receptor Activation of NF- B in Primary Human
Fibroblasts. J. Immunol. 2001, 167 (10), 5895â5903.
(20) Thu, Y. M.; Richmond, A. NF-ÎşB Inducing Kinase: A Key Regulator in the Immune
System and in Cancer. Cytokine Growth Factor Rev. 2010, 21 (4), 213â226.
(21) Keats, J. J.; Fonseca, R.; Chesi, M.; Schop, R.; Baker, A.; Chng, W.-J.; Van Wier, S.;
Tiedemann, R.; Shi, C.-X.; Sebag, M.; Braggio, E.; Henry, T.; Zhu, Y.-X.; Fogle, H.;
Price-Troska, T.; Ahmann, G.; Mancini, C.; Brents, L. a.; Kumar, S.; Greipp, P.;
Dispenzieri, A.; Bryant, B.; Mulligan, G.; Bruhn, L.; Barrett, M.; Valdez, R.; Trent, J.;
Stewart, a. K.; Carpten, J.; Bergsagel, P. L. Promiscuous Mutations Activate the
Noncanonical NF-ÎşB Pathway in Multiple Myeloma. Cancer Cell 2007, 12 (2), 131â144.
(22) Yin, L.; Wu, L.; Wesche, H.; Arthur, C. D.; White, J. M.; Goeddel, D. V; Schreiber, R. D.
Defective Lymphotoxin-Beta Receptor-Induced NF-kappaB Transcriptional Activity in
NIK-Deficient Mice. Science 2001, 291 (5511), 2162â2165.
(23) de Leon-Boenig, G.; Bowman, K. K.; Feng, J. a; Crawford, T.; Everett, C.; Franke, Y.;
Oh, A.; Stanley, M.; Staben, S. T.; Starovasnik, M. a; Wallweber, H. J. a; Wu, J.; Wu, L.
C.; Johnson, A. R.; Hymowitz, S. G. The Crystal Structure of the Catalytic Domain of the
NF-ÎşB Inducing Kinase Reveals a Narrow but Flexible Active Site. Structure 2012, 20
(10), 1704â1714.
(24) Xiao, G.; Sun, S. C. Negative Regulation of the Nuclear Factor Kappa B-Inducing Kinase
by a Cis-Acting Domain. J. Biol. Chem. 2000, 275 (28), 21081â21085.
(25) Tao, Z.; Ghosh, G. Understanding NIK Regulation from Its Structure. Structure 2012, 20
(10), 1615â1617.
(26) Liu, J.; Sudom, A.; Min, X.; Cao, Z.; Gao, X.; Ayres, M.; Lee, F.; Cao, P.; Johnstone, S.;
Plotnikova, O.; Walker, N.; Chen, G.; Wang, Z. Structure of the Nuclear Factor ÎşBInducing Kinase (NIK) Kinase Domain Reveals a Constitutively Active Conformation. J.
Biol. Chem. 2012, 287 (33), 27326â27334.
(27) Byron, O.; Vestergaard, B. Proteinâprotein Interactions: A Supra-Structural Phenomenon
Demanding Trans-Disciplinary Biophysical Approaches. Curr. Opin. Struct. Biol. 2015,
35, 76â86.
83

(28) Keskin, O.; Nussinov, R. Similar Binding Sites and Different Partners: Implications to
Shared Proteins in Cellular Pathways. Structure 2007, 15 (3), 341â354.
(29) Rothweiler, U.; Ăberg, E.; Johnson, K. a.; Hansen, T. E.; JĂ¸rgensen, J. B.; Engh, R. a.
P38Îą MAP Kinase Dimers with Swapped Activation Segments and a Novel Catalytic
Loop Conformation. J. Mol. Biol. 2011, 411 (2), 474â485.
(30) Li, K.; McGee, L. R.; Fisher, B.; Sudom, A.; Liu, J.; Rubenstein, S. M.; Anwer, M. K.;
Cushing, T. D.; Shin, Y.; Ayres, M.; Lee, F.; Eksterowicz, J.; Faulder, P.; Waszkowycz,
B.; Plotnikova, O.; Farrelly, E.; Xiao, S. H.; Chen, G.; Wang, Z. Inhibiting NF-ÎşBInducing Kinase (NIK): Discovery, Structure-Based Design, Synthesis, Structure-Activity
Relationship, and Co-Crystal Structures. Bioorganic Med. Chem. Lett. 2013, 23 (5), 1238â
1244.
(31) Castanedo, G. M.; Blaquiere, N.; Beresini, M.; Bravo, B.; Brightbill, H.; Chen, J.; Cui, H.F.; Eigenbrot, C.; Everett, C.; Feng, J.; Godemann, R.; Gogol, E.; Hymowitz, S.; Johnson,
A.; Kayagaki, N.; Kohli, P. B.; KnĂźppel, K.; Kraemer, J.; KrĂźger, S.; Loke, P.; McEwan,
P.; Montalbetti, C.; Roberts, D. A.; Smith, M.; Steinbacher, S.; Sujatha-Bhaskar, S.;
Takahashi, R.; Wang, X.; Wu, L. C.; Zhang, Y.; Staben, S. T. Structure-Based Design of
Tricyclic NF-ÎşB Inducing Kinase (NIK) Inhibitors That Have High Selectivity over
Phosphoinositide-3-Kinase (PI3K). J. Med. Chem. 2017, 60 (2), 627â640.
(32) Mortier, J.; Masereel, B.; Remouchamps, C.; Ganeff, C.; Piette, J.; Frederick, R. NF-ÎşB
Inducing Kinase (NIK) Inhibitors: Identification of New Scaffolds Using Virtual
Screening. Bioorg. Med. Chem. Lett. 2010, 20 (15), 4515â4520.
(33) Johnson, L. N.; Noble, M. E. M.; Owen, D. J. Active and Inactive Protein Kinases:
Structural Basis for Regulation. Cell (Cambridge, Massachusetts) 1996, 85 (2), 149â158.
(34) Jones, M. R.; Liu, C.; Wilson, A. K. Molecular Dynamics Studies of the ProteinâProtein
Interactions in Inhibitor of ÎşB Kinase-Î˛. J. Chem. Inf. Model. 2014, 54 (2), 562â572.
(35) Zhang, J. Protein-Protein Interactions in Salt Solutions. In Protein-Protein Interactions Computational and Experimental Tools; Cai, W., Ed.; InTech, 2012.
(36) Date, M. S.; Dominy, B. N. Modeling the Influence of Salt on the Hydrophobic Effect and
Protein Fold Stability. Commun. Comput. Phys. 2013, 13 (1), 90â106.
(37) MĂśller, J.; Schroer, M. A.; Erlkamp, M.; Grobelny, S.; Paulus, M.; Tiemeyer, S.; Wirkert,
F. J.; Tolan, M.; Winter, R. The Effect of Ionic Strength, Temperature, and Pressure on the
Interaction Potential of Dense Protein Solutions: From Nonlinear Pressure Response to
Protein Crystallization. Biophys. J. 2012, 102 (11), 2641â2648.
(38) KĹĂ­Ĺž, Z.; KlusĂĄk, J.; KriĹĄtofĂ­kovĂĄ, Z.; KoÄa, J. How Ionic Strength Affects the
Conformational Behavior of Human and Rat Beta Amyloids â A Computational Study.
PLoS One 2013, 8 (5), e62914.
(39) Aykut, A. O.; Atilgan, A. R.; Atilgan, C. Designing Molecular Dynamics Simulations to
84

Shift Populations of the Conformational States of Calmodulin. PLoS Comput. Biol. 2013,
9 (12), e1003366.
(40) Kony, D. B.; HĂźnenberger, P. H.; van Gunsteren, W. F. Molecular Dynamics Simulations
of the Native and Partially Folded States of Ubiquitin: Influence of Methanol Cosolvent,
pH, and Temperature on the Protein Structure and Dynamics. Protein Sci. 2007, 16 (6),
1101â1118.
(41) Cheung, C. Y.; Ko, B. C. NFAT5 in Cellular Adaptation to Hypertonic Stress Regulations and Functional Significance. J. Mol. Signal. 2013, 8 (1), 5.
(42) Fischer, O. M.; Hart, S.; Gschwind, A.; Ullrich, A.; Prenzel, N. Oxidative and Osmotic
Stress Signaling in Tumor Cells Is Mediated by ADAM Proteases and Heparin-Binding
Epidermal Growth Factor Oxidative and Osmotic Stress Signaling in Tumor Cells Is
Mediated by ADAM Proteases and Heparin-Binding Epidermal Growth Factor. Mol. Cell.
Biol. 2004, 24 (12), 5172â5183.
(43) Nagy, I. Z.; Lustyik, G.; Nagy, V. Z.; ZarĂĄndi, B.; Bertoni-Freddari, C. Intracellular
Na+:K+ Ratios in Human Cancer Cells as Revealed by Energy Dispersive X-Ray
Microanalysis. J. Cell Biol. 1981, 90 (3), 769â777.
(44) Uhlik, M. T.; Abell, A. N.; Johnson, N. L.; Sun, W.; Cuevas, B. D.; Lobel-Rice, K. E.;
Horne, E. a; DellâAcqua, M. L.; Johnson, G. L. Rac-MEKK3-MKK3 Scaffolding for p38
MAPK Activation during Hyperosmotic Shock. Nat. Cell Biol. 2003, 5 (12), 1104â1110.
(45) Chemical Computing Group Inc. Molecular Operating Environment (MOE). Scientific
Computing & Instrumentation. Chemical Computing Group Inc: 1010 Sherbooke St.
West, Suite #910, Montreal, QC, Canada, H3A 2R7 2016.
(46) Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L.; HeadGordon, T. Development of an Improved Four-Site Water Model for Biomolecular
Simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120 (20), 9665â9678.
(47) Case, D. A.; Darden, T.; Iii, T. E. C.; Simmerling, C.; Brook, S.; Roitberg, A.; Wang, J.;
Southwestern, U. T.; Duke, R. E.; Hill, U.; Luo, R.; Irvine, U. C.; Roe, D. R.; Walker, R.
C.; Legrand, S.; Swails, J.; Cerutti, D.; Kaus, J.; Betz, R.; Wolf, R. M.; Merz, K. M.;
State, M.; Seabra, G.; Janowski, P.; Paesani, F.; Liu, J.; Wu, X.; Steinbrecher, T.; Gohlke,
H.; Homeyer, N.; Cai, Q.; Smith, W.; Mathews, D.; Salomon-ferrer, R.; Sagui, C.; State,
N. C.; Babin, V.; Luchko, T.; Gusarov, S.; Kovalenko, A.; Berryman, J.; Kollman, P. A.
AMBER 14. University of California, San Francisco.
(48) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison
of Multiple Amber Force Fields and Development of Improved Protein Backbone
Parameters. Proteins Struct. Funct. Bioinforma. 2006, 65 (3), 712â725.
(49) Prall, M. VMD: A Graphical Tool for the Modern Chemists. J. Comput. Chem. 2001, 22
(1), 132â134.
85

(50) Roe, D. R.; Cheatham, T. E. PTRAJ and CPPTRAJ: Software for Processing and Analysis
of Molecular Dynamics Trajectory Data. J. Chem. Theory Comput. 2013, 9 (7), 3084â
3095.
(51) Miller III, B. R.; McGee Jr., T. D.; Swails, J. M.; Homeyer, N.; Gohlke, H.; Roitberg, A.
E. MMPBSA.py: An Efficient Program for End-State Free Energy Calculations. J. Chem.
Theory Comput. 2012, 8 (9), 3314â3321.
(52) Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Pairwise Solute Descreening of Solute
Charges from a Dielectric Medium. Chem. Phys. Lett. 1995, 246 (1,2), 122â129.
(53) Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Parametrized Models of Aqueous Free
Energies of Solvation Based on Pairwise Descreening of Solute Atomic Charges from a
Dielectric Medium. J. Phys. Chem. 1996, 100 (51), 19824â19839.
(54) Tsui, V.; Case, D. A. Theory and Applications of the Generalized Born Solvation Model
in Macromolecular Simulations. Biopolymers 2001, 56 (4), 275â291.
(55) Onufriev, A.; Bashford, D.; Case, D. a. Exploring Protein Native States and Large-Scale
Conformational Changes with a Modified Generalized Born Model. Proteins Struct.
Funct. Bioinforma. 2004, 55 (2), 383â394.
(56) Krissinel, E.; Henrick, K. Inference of Macromolecular Assemblies from Crystalline State.
J. Mol. Biol. 2007, 372 (3), 774â797.
(57) Pechmann, S.; Levy, E. D.; Tartaglia, G. G.; Vendruscolo, M. Physicochemical Principles
That Regulate the Competition between Functional and Dysfunctional Association of
Proteins. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (25), 10159â10164.
(58) Jones, S.; Thornton, J. M. Principles of Protein-Protein Interactions. Proc. Natl. Acad. Sci.
U. S. A. 1996, 93 (1), 13â20.
(59) Marsh, J. a.; Teichmann, S. a. Relative Solvent Accessible Surface Area Predicts Protein
Conformational Changes upon Binding. Structure 2011, 19 (6), 859â867.
(60) Palma, R.; Curmi, P. M. G. Computational Studies on Mutant Protein Stability: The
Correlation between Surface Thermal Expansion and Protein Stability. Protein Sci. 2008,
8 (4), 913â920.
(61) Bahadur, R. P.; Chakrabarti, P.; Rodier, F.; Janin, J. A Dissection of Specific and NonSpecific Protein-Protein Interfaces. J. Mol. Biol. 2004, 336 (4), 943â955.
(62) Goodsell, D. S.; Olson, A. J. Structural Symmetry and Protein Function. Annu. Rev.
Biophys. Biomol. Struct. 2000, 29, 105â153.

86

CHAPTER 5 Selectivity in ROS-Induced Peptide Backbone Bond Cleavage 2

5.1

Introduction
Radical-mediated modifications to lipids, proteins, and nucleic acids have been identified

to influence the progression of diseases, including degenerative disorders, cancers, diabetes as well
as aging.1-10 Damaged nucleic acids and lipids have specific repair mechanisms; however,
damaged proteins are usually marked for degradation. The removal pathways for defective proteins
are not always efficient and can lead to accumulations that interfere with cellular functions.11
Irreversible backbone and side chain oxidations can occur as a result of exposure to different types
of radical species, including reactive oxygen species (ROS), reactive nitrogen species (RNS), and
species generated from the radiolysis of water.12,13 Production of ROS in organisms occurs
naturally resulting from cell metabolism under normal physiological conditions and can be
enriched in alternative environments induced by stress.14-19
Although many external sources of ROS exist, ROS can be produced in excess by
numerous physiological processes.2,14 Normally, the prooxidative activity is balanced with
antioxidative defense mechanisms.14,20,21 Imbalances induced from excessive ROS result in
damaged proteins and other biomolecules necessary for maintaining healthy human physiological
functions. Two hypotheses suggest why amyloid deposits come into existence after this damage
occurs: the amyloid cascade hypothesis and an alternative hypothesis.22,23 In the amyloid cascade
hypothesis, Î˛-amyloid is viewed as the cause of neurodegeneration, whereas, in the alternative
hypothesis, Î˛-amyloid is the bodyâs defense mechanism against imbalanced stress. One example

2

This chapter is presented in its entirety from: Stringfellow, H. M.; Jones, M. R.; Green, M. C.; Wilson, A. K.;
Francisco, J. S. Selectivity in ROS-Induced Peptide Backbone Bond Cleavage. J. Phys. Chem. A 2014, 118 (48),
11399â11404.

87

of excess ROS production occurs in dysfunctional or defective complex III of the mitochondrial
electron transport chain.24-29 In humans, neuronal cells secrete extracellular proteins in misfolded
agglomerations (a structure termed Î˛-amyloid) which function as an extracellular shield against
more deleterious lipid oxidation capable of damaging vital cellular membranes.18-20,30 The initial
deposits of Î˛-amyloids incite a positive feedback mechanism that increases Î˛-amyloid
production.31 Higher concentrations of Î˛-amyloid in the brain could result in the dysfunction of
neuronal cells, leading to neurodegeneration and dementia that eventually lead to an Alzheimer's
diagnosis.3,32-35 A generic Î˛-amyloid structure contains multiple Î˛-pleated sheet motifs that
contribute to its structural stability.36-39 These Î˛-pleated sheets consist of straight âÎ˛-strandâ
components connected by random coils (âÎ˛-turnâ structures) that are positioned in either parallel
or antiparallel orientations to each other.40
All protein side chains are vulnerable towards oxidation; however, each side chain
contributes to a different reactivity with ROS. For example, amino acids containing sulfurs are
defenseless towards oxidation, in which cysteines and methionines are amended to disulfides and
sulfoxide derivatives, respectively.2,3,41 Protein backbones can be oxidized directly by ROS as well
as by internal free-radical transfers42 and are known to undergo fragmentation upon oxidation.
Berlett and Stadtman suggested that proteins undergo oxidation upon attack by ROS generated by
Fenton catalysis.2,4 Superoxide radicals attack the protein deposits at the Îą-carbon sites to form
alkoxyl radicals (Figure 5.1) that further endure fragmentation via two pathways in which
complete and terminated oxidation of the protein can occur. In the diamide pathway (Pathway
[A]), the carbon-carbon bond of the peptide backbone (i) can undergo homolytic cleavage to
produce a diamide (ii) and an isocyanate radical (iii), which then undergoes termination to form
an isocyanate (iv). Alternatively in the Îą-amidation pathway (Pathway [B]), the carbon-nitrogen

88

bond encompassing the Îą-carbon site of radicalization can undergo homolytic cleavage to produce
an N-Îą-ketoacyl derivative (v) and an amide radical, which terminates via the addition of peroxyl
radical to form an amide (vii). The present study focuses on the selectivity of these two pathways
that bifurcate from the formation of the alkoxyl radical at each Îą-carbon site and also examines the
selectivity of each potential radicalization site.

Figure 5.1 Proposed reaction pathways. The diamide (pathway [A]) and Îą-amidation (pathway
[B]) pathways were modeled in this study. The different sites of radicalization C1, C2, and C3
represent CÎą1, CÎą2 and CÎą3, respectively.
Quantum mechanics and molecular mechanics have provided insight about general peptide
structures. Most investigations of small peptides focus on the folding and misfolding mechanisms
rather than on structural disruptions.43-48 A challenge for ab initio modeling of peptides is to
simulate all of the appropriate features of a generic peptide without allowing size limitations to
alter any of those intrinsic characteristics. As the molecule size directly affects the computational
expense of first principles calculations, it is important to maintain balances among the molecular
89

modeling options without affecting relevant chemical features of the peptide. Ab initio works by
Francisco and co-workers show that the Îą-carbon site is the preferred site of radical attack in a
capped tripeptide model.49,50 The 42 atom capped tripeptide model is an appropriate molecular size
for ab inito methods given the computational cost of the methods chosen. Previously, it has been
demonstrated that capping the ends of smaller peptide structures is important to avoid introducing
additional ionic influences into the energies of neutral peptide systems modeled quantum
mechanically.46,51,52 In terms of side-chains, peptide models of trialanine structures have been
shown to replicate the Î˛-turn features of the Î˛-pleated sheet;45 larger side-chain functional groups,
however, have the potential to produce steric hindrance that limits radical formation. Although
glycine residues have been observed to be less reactive than alanine, they have also been found to
undergo intramolecular Îą-hydrogen abstraction reactions, introducing additional charge onto the
peptide structure.53-55
In this study, pathways [A] and [B] were modeled to determine the effects of ROS attack
at N-terminal, internal, and C-terminal carbon sites. Two models representing a Î˛-strand and a Î˛turn (both important motifs of the Î˛-amyloid fibril) were compared for each pathway to elucidate
potential site preference for backbone bond cleavage. The results of these calculations suggest a
thermodynamic preference for pathway [A]. The impact of method and basis set choice on the
energetics of the system were also investigated.
5.2

Methodology
The model system was constructed of 42 atoms to mimic a peptide backbone composed of

three Îą-carbon sites (Figure 5.2). In each model of the parent strand, one Îą-carbon was radicalized
via the substitution of an alkoxyl radical in place of a hydrogen at each site, as per the Berlett and
Stadtman mechanism.2 The model features N-terminal, C-terminal, and internal Îą-carbon sites.

90

Methyl caps on both sides of the peptide minimize possible steric, charge, and Îą-hydrogen transfer
effects. Alanine residues eliminate the possibility of side chain interactions. Both Î˛-strand and Î˛turn structural conformations are modeled to provide insight into possible site preference for
backbone cleavage within a secondary structure. In the model system, all equilibrium geometries
were optimized using second order MĂ¸ller-Plesset perturbation correction (MP2)56-59 and density
functional theory (DFT) using the B3LYP functional60,61 with the 6-31G(d)62,63 and 6-31+G(d)6466

basis sets. To assess the effectiveness of DFT in calculating the energetics of this system, the

B3LYP functional was utilized similarly with the same basis sets. Coupled cluster with singles,
doubles, and perturbed triples (CCSD(T))67-69 with the 6-31G(d) basis set were computed from
MP2/6-31G(d) stationary points and served as the reference values for this system. Reaction
barriers (âErxn) for the N-CÎą and C-CÎą bond fission per site and conformation were determined
from optimizations of the transition states that were carried out with the respective method in
conjunction with the 6-31G(d) basis set using the Berny algorithm from the parent strand with an
elongated bond length of the cleaved bond. All reaction enthalpies included zero-point and thermal
corrections to 298 K.
5.3
5.3.1

Results and Discussion
Pathway Favorability
The CCSD(T) results are summarized in Table 5.1. The results of the method and basis set

comparisons are summarized in Table 5.2 and Table 5.3. The results generated through the use of
both B3LYP and MP2 using 6-31G(d) and 6-31+G(d) basis sets were unable to reproduce the
reference values that the CCSD(T) data generated, meaning that they were determined to be
inadequate in describing this type of radical-peptide system. Thus the CCSD(T) data will be the

91

only data referenced in this discussion hereafter. Both pathways are exothermic, as expected from
the predictions of Berlett and Stadtman.

Figure 5.2 Modeled backbone conformations. (1) Represents the Î˛-strand or âsheet-likeâ
component of a Î˛-pleated-sheet. (2) The Î˛-turn structure. In both conformations, the first Îącarbon has an alkoxyl radical in place of a hydrogen at CÎą1.
Table 5.1 Calculated reaction enthalpies for Pathway [A] and Pathway [B] at the CCSD(T) level
of theory.
Pathway [A]
Reaction
Site
Î˛-strand
CÎą1
4.8
ÎHrxn(1)
CÎą2
23.7
CÎą3
-0.4
CÎą1
-30.4
ÎHrxn(2)
CÎą2
-48.8
CÎą3
-45.6
CÎą1
-25.6
ÎHtotal
CÎą2
-25.1
CÎą3
-46.0
C-CÎą bond fission
Î˛-strand
ÎErxn
CÎą1
8.8
CÎą2
8.7
CÎą3
7.2
a
All energies are in kcal/mol.

Î˛-turn
30.1
23.6
31.3
-60.2
-44.8
-45.6
-30.2
-21.1
-14.3
Î˛-turn
5.5
6.0
5.7

92

Pathway [B]
Î˛-strand
38.8
40.5
30.1
-33.9
-35.2
-18.2
4.9
5.3
11.8
N-CÎą bond fission
Î˛-strand
22.9
23.6
24.5

Î˛-turn
43.9
33.6
36.7
-33.9
-34.8
-49.4
10.0
-1.2
-12.6
Î˛-turn
24.1
26.0
27.6

Overall, Pathway [A] is much more energetically favorable than Pathway [B] by 30.4 to
57.4 kcal/mol for the Î˛-strand conformation and by 1.7 to 40.2 kcal/mol for the Î˛-turn
conformation. Pathway [A] is to be the more energetically favorable pathway because of the nature
of the structures formed through the process of complete fragmentation. The carbon immediately
adjacent to the alkoxyl-radicalized Îą-carbon (referred to from this point on as âCÎąâ) is doublybonded to an O atom, so that the resulting carbonyl functional group withdraws electron density
from that CÎą atom, straining the CÎą-C bond to the point of cleavage. Reaction A2 and B2, the
second steps of each pathway, involve the termination of a radical on the peptide fragment structure
involved in the reaction. It is striking that in Pathway [A] a radical is propagated through the
production of â˘OOH, while in Pathway [B] both products are completely terminated.
From this study of a trialanine peptide model it has been determined that there are no
absolute trends regarding conformational preference. The Î˛-strand conformation generally favors
fragmentation via Pathway [A], while the Î˛-turn conformation generally favors fragmentation via
Pathway [B]. Intramolecular noncovalent interactions are more prevalent in the Î˛-turn structures
than in the Î˛-strand structures. In the results of our trialanine peptide study, intramolecular
noncovalent interactions stabilize the biomolecule and reduce the favorability of C-CÎą bond fission
(âErxn in Table 5.1) for the Î˛-turn conformation. These results are verified by a study conducted
by Chin et al.46 on the local secondary structure conformations of dipeptides and tripeptides, that
show that tripeptides are indeed more stable in the Î˛-turn conformation than in the Î˛-strand
conformation due to an increase in the intramolecular noncovalent interactions between the
backbone and side chains. Other studies investigating reactivity and conformational specificity in
peptide fragments also report similar advantages in side-chain stability from intramolecular
noncovalent interactions.54,55 Additionally, Îą-carbon hydrogen abstractions from the peptide

93

backbones in the Î˛-turn conformation do not induce major conformational changes in the gas phase
due to intramolecular noncovalent stabilization.53 However, in spite of this similarity in results for
the Î˛-turn conformation following C-CÎą bond fission, our results also identify a striking difference
with regard to N-CÎą bond fission. The intramolecular noncovalent stabilization that caused the Î˛turn conformation to have less energetic gain from undergoing C-CÎą bond fission does not impact
the favorability of the Î˛-turn conformation to undergo N-CÎą bond fission.
The CÎą3 site is the most favorable site of radicalization in Pathway [A] for the Î˛-strand
conformation, but the CÎą1 site is the most favorable site of radicalization for the Î˛-turn
conformation. The greatest difference between the most energetically favorable site and the next
closest value is in Pathway [A], where reacting at the CÎą3 site in the Î˛-strand structure produces
an enthalpy 20.4 kcal/mol lower in energy than the next most favorable site. In Pathway [B], the
Î˛-turn structure has the next greatest difference in enthalpy; reacting at the CÎą3 site has a 11.4
kcal/mol advantage over the next lowest value. Site selectivity is not consistent from one
conformation or reaction to the next; however, it is still an important factor in determining overall
favorability of the reactions in question.
These results are corroborated with the calculated reaction barriers for the C-CÎą and N-CÎą
bond fissions, as shown in Table 5.1. For all three sites of radicalization, the C-CÎą bond fission
has a much lower reaction barrier than that of N-CÎą bond scission, by an average difference of
15.4 kcal/mol for the Î˛-strand and by 20.2 kcal/mol for the Î˛-turn. The reaction barriers reveal that
the C-CÎą bond fission reaction barrier is consistently lower in the Î˛-turn conformation than in the
Î˛-strand conformation by an average of 2.5 kcal/mol. Conversely, the N-CÎą bond fission reaction
barrier is consistently lower in the Î˛-strand conformation than in the Î˛-turn conformation by an
average of 2.2 kcal/mol. This difference in barrier energies for the two conformations by fission

94

type indicates that intramolecular noncovalent stabilization contributes to the stability of Î˛-turn
reactants relative to products for Pathway [B] (which involves N-CÎą fission) but not for Pathway
[A]. In Pathway [A], another factor must be contributing to the relative stability of the Î˛-strand to
increase the reaction barrier of the C-CÎą bond fission.
5.3.2

Role of Structure on Bond Strength and Site Reactivity
The three radical Îą-carbon sites do not harbor identical bond strengths. This is consistent

with previous findings that not all Îą-carbon radicals are equally stable.16,70 The C-CÎą bond at site
CÎą3 within the Î˛-strand conformation is the strongest C-CÎą bond to be broken from the structures
analyzed. The differences in enthalpy between the Î˛-strand and Î˛-turn models reveal that due to
structural differences, the C-CÎą bond at CÎą3 is much stronger in the Î˛-strand conformation than
in the Î˛-turn conformation of the peptide. This is because more energy is released upon Î˛-strand
C-CÎą bond fission than through Î˛-turn C-CÎą bond fission at the same relative position. The average
energy of the C-CÎą bonds in the model is 18.9 kcal/mol. The stronger C-CÎą bond at the CÎą2 site
of the two conformations is in the Î˛-turn structure. The data further indicates that internal C-CÎą
bond fission releases more energy in the Î˛-turn conformational construct than in the Î˛-strand
counterpart. The strongest N-CÎą bond site of all the sites is located at the CÎą3 site. For this type
of bond fission, the conformation of the model does not play a significant a role in the strength of
the bond. The N-CÎą bonds have an average energy of 37.3 kcal/mol. The C-CÎą bonds have a
greater range of energies throughout the molecule than do the N-CÎą bonds. The relative location
of the backbone C-CÎą bond within the peptide structure plays a greater role in determining the
energy of that bond than does the relative location of the backbone N-CÎą bond analog.

95

5.3.3

Insights into Pathway Preferences
The importance of the second reaction (Reaction A2 or B2) in defining the energetic

favorability of the reaction pathway varies. The isocyanate radical termination step is influential
in determining the favorability of completely following reaction Pathway [A]. Similarly, in
Pathway [B], the reaction enthalpy of the pathway becomes comparably favorable because of the
amide radical termination step.
For all sites of ROS-induced bond fission (all three CÎą sites), Pathway [A] is more
energetically favorable than the Pathway [B]. In Pathway [A], the location of the site at which
bond fission occurs plays a key role in identifying the strongest bonds. In the Î˛-strand, the C-CÎą3
is the strongest bond, stronger than the other C-CÎą bonds by an average of 14.7 kcal/mol, and
stronger than any of the N-CÎą bonds by an average of 36.9 kcal/mol. In the Î˛-turn, the strengths
of all N-CÎą and C-CÎą bonds are more similar. For the Î˛-turn structure, the C-CÎą2 bond is strongest,
stronger than the other C-CÎą bonds by an average of 7.1 kcal/mol, and stronger than the N-CÎą
bonds by an average of 14.5 kcal/mol. In Pathway [B], the most important step in determining the
overall energetic favorability of the pathway is the re-association of the reactive amide radical.
The difference in reaction barriers of peptide backbone bond fission at each site of
radicalization plays an important role in overall pathway preference. The differences in the barriers
are consistent by site and conformation to determining this preference. The reaction barrier of CCÎą bond fission is notably lower than that of N-CÎą bond fission, by an average difference of 15.4
kcal/mol in the Î˛-strand conformation and by 20.2 kcal/mol in the Î˛-turn conformation. These
trends confer a definite energetic preference for Pathway [A] over Pathway [B]. Both the
individual reaction steps and the first energy barrier of Pathway [A] are lower in energy than
Pathway [B]. The present study neglects the electronic nature of neighboring amino acid side-

96

chains, which is known to influence backbone cleavage. Within the current approach, the sampled
conformations (Î˛-strand and Î˛-turn) are appropriate for describing potential configurational states
of the trialanine peptide due to the lack of flexibility and reactivity of the side-chain. The
introduction of alternative side-chains would require a more elaborate approach that assesses
conformational rearrangements and radical migrations. Assessing how the selectivity in backbone
fragmentation changes as a result of neighboring side-chains is suggested as important follow-up.
5.3.4

Evaluation of Methods
Enthalpies computed using MP2 and B3LYP (Table 5.2) were able to point to the presence

of selectivity between the Pathway [A] and Pathway [B]. However, these methods were unable
to reproduce the selectivity trends generated by the CCSD(T) calculations (summarized in Table
5.1). Although adding diffuse functions had a slight impact on the enthalpy per each individual
fragments, the additional basis functions had small significance in the prediction of the reaction
enthalpies.
The uniqueness of the CCSD(T) results in describing the trends of ROS-induced peptide
backbone fragmentation implies that the self-consistent single-, double-, and triple-electron
coupling contributes significantly. Notably, less expensive methods to observe this chemistry are
unable to replicate the reactions and thus cannot be used to correctly predict the favored sites or
conformations of ROS-induced peptide backbone bond cleavage. The results also indicate that in
spite of the attractiveness of using MP2, a more comprehensive method than B3LYP but a lessexpensive method than CCSD(T), to determine the relative site and conformational favorability,
MP2 was not observed to provide any additional insight than B3LYP towards elucidating the
trends observed with CCSD(T).

97

Table 5.2 Calculated reaction enthalpies for Pathway [A] and Pathway [B].
Pathway [A]
Site

Method/Basis
CCSD(T)/6-31G(d)
MP2/6-31G(d)
CÎą1
MP2/6-31+G(d)
B3LYP/6-31G(d)
B3LYP/6-31+G(d)
CCSD(T)/6-31G(d)
MP2/6-31G(d)
CÎą2
MP2/6-31+G(d)
B3LYP/6-31G(d)
B3LYP/6-31+G(d)
CCSD(T)/6-31G(d)
MP2/6-31G(d)
CÎą3
MP2/6-31+G(d)
B3LYP/6-31G(d)
B3LYP/6-31+G(d)
a
All energies are in kcal/mol.

Î˛-strand
ÎH[A1] ÎH[A2]
4.8 -30.4
-7.4 -42.6
-9.6 -40.0
-6.8 -53.5
-10.0 -53.3
23.7 -48.8
-5.8 -43.5
-7.5 -40.7
-5.4 -54.5
-8.7 -54.3
-0.4 -45.6
-7.7 -41.4
-8.0 -40.2
-4.1 -54.8
-7.4 -54.0

ÎHtot
-25.6
-50.1
-49.6
-60.3
-63.4
-25.1
-49.3
-48.2
-59.9
-63.0
-46.0
-49.1
-48.2
-58.8
-61.4

Î˛-turn
ÎH[A1] ÎH[A2]
30.3 -60.2
2.8 -43.1
4.5 -42.6
2.7 -56.8
-1.4 -57.2
23.6 -44.8
-5.3 -41.1
-4.0 -40.6
-6.9 -54.6
-11.4 -52.6
31.3 -45.6
1.4 -41.4
-2.8 -40.2
0.1 -54.8
-3.7 -54.0

ÎHtot
-30.2
-40.3
-38.2
-54.1
-58.6
-21.1
-46.4
-44.6
-61.5
-64.0
-14.3
-40.0
-43.0
-54.6
-57.8

Pathway [B]
Site

Method/Basis
CCSD(T)/6-31G(d)
MP2/6-31G(d)
CÎą1
MP2/6-31+G(d)
B3LYP/6-31G(d)
B3LYP/6-31+G(d)
CCSD(T)/6-31G(d)
MP2/6-31G(d)
CÎą2
MP2/6-31+G(d)
B3LYP/6-31G(d)
B3LYP/6-31+G(d)
CCSD(T)/6-31G(d)
MP2/6-31G(d)
CÎą3
MP2/6-31+G(d)
B3LYP/6-31G(d)
B3LYP/6-31+G(d)
a
All energies are in kcal/mol.

ÎH[B1]
38.8
13.5
13.8
16.2
14.2
40.5
15.3
16.5
13.7
11.3
30.1
24.7
15.9
23.4
20.7

Î˛-strand
ÎH[B2] ÎHtot
-33.9
4.9
-47.3 -33.8
-47.3 -33.6
-25.9
-9.6
-26.3 -12.1
-35.2
5.3
-48.9 -33.6
-48.8 -32.4
-23.5
-9.8
-23.3 -12.0
-18.2 11.8
-49.2 -24.5
-49.1 -33.2
-23.8
-0.4
-23.5
-2.7

98

ÎH[B1]
43.9
18.3
19.2
13.8
10.0
33.6
26.1
26.3
21.7
9.0
36.7
19.4
16.1
16.8
13.2

Î˛-turn
ÎH[B2]
-33.9
-47.3
-47.3
-21.6
-21.6
-34.8
-49.6
-48.6
-26.0
-25.5
-49.4
-46.6
-46.6
-25.4
-25.1

ÎHtot
10.0
-29.0
-28.1
-7.7
-11.6
-1.2
-23.4
-22.4
-4.3
-16.5
-12.6
-27.1
-30.5
-8.6
-12.0

Table 5.3 Calculated reaction barriers for reactions [A1] and [B1].
Î˛-strand
ÎEâĄ[A1]
8.8
11.0
3.5
8.7
10.7
3.9
7.2
9.0
2.3

Site
CÎą1

Method/Basis
CCSD(T)/6-31G(d)
MP2/6-31G(d)
B3LYP/6-31G(d)
CÎą2
CCSD(T)/6-31G(d)
MP2/6-31G(d)
B3LYP/6-31G(d)
CÎą3
CCSD(T)/6-31G(d)
MP2/6-31G(d)
B3LYP/6-31G(d)
a
All energies are in kcal/mol.
5.4

ÎEâĄ [B1]
22.9
26.9
20.9
23.6
28.0
21.7
24.5
28.5
20.2

Î˛-turn
ÎEâĄ [A1]
5.5
8.2
2.1
6.0
10.1
1.3
5.7
8.1
1.3

ÎEâĄ [B1]
24.1
29.2
21.4
26.0
35.4
19.5
27.6
35.1
19.9

Conclusion
This study utilized ab initio and density functional methods to investigate site, pathway,

and conformational trends associated with the ROS-mediated fragmentation of the trialanine
peptide. The results show that the reactions associated with the two pathways of the Berlett and
Stadtman mechanisms of ROS-induced peptide backbone bond cleavage are exothermic and
provide a quantified estimate of their relative selectivity. Pathway [A] is consistently more
energetically favorable than Pathway [B], and site preferences are not determined strictly by their
positions as N-terminal, internal, or C-terminal. Energetic analysis reveals that intramolecular
noncovalent stabilization contributes to conformational selection for fragmentation following
Pathway [A] only. Calculation of reaction barriers confirms the energetic analysis that C-CÎą bond
fission (following Pathway [A]) is much more favorable to occur than N-CÎą bond fission
(following Pathway [B]). The results identify the key backbone bonds (the C-CÎą3 bond in the Î˛strand and the C-CÎą2 bond in the Î˛-turn) that contribute to the overall stability of the Î˛-pleated
sheet motif. Furthermore, because this work identifies the most favorable types of fragmentation
within the Î˛-pleated sheet, this research can be utilized to predict how secreted amyloid precursor
99

proteins will fragment in response to ROS bombardment. The ability to predict the structural
features involved in these fragmentation mechanisms is significant towards understanding the role
of oxidative stress in neurodegenerative disease pathologies.

100

REFERENCES

101

REFERENCES
(1)

Nathan, C.; Cunningham-Bussel, A. Beyond Oxidative Stress: An Immunologist's
Guide to Reactive Oxygen Species. Nat. Rev. Immunol. 2013, 13, 349-361.

(2)

Berlett, B. S.; Stadtman, E. R. Protein Oxidation in Aging, Disease, and Oxidative
Stress. J. Biol. Chem. 1997, 272, 20313-20316.

(3)

Stadtman, E. R.; Berlett, B. S. Reactive Oxygen-Mediated Protein Oxidation in Aging
and Disease. Drug Metab. Rev. 1998, 30, 225-243.

(4)

Stadtman, E. R. Role of Oxidant Species in Aging. Curr. Med. Chem. 2004, 11, 11051112.

(5)

Cai, Z.; Yan, L. J. Protein Oxidative Modifications: Beneficial Roles in Disease and
Health. J. Biochem. Pharmacol. 2013, 1, 15-26.

(6)

Cecarini, V.; Gee, J.; Fioretti, E.; Amici, M.; Angeletti, M.; Eleuteri, A. M.; Keller, J.
N. Protein Oxidation and Cellular Homeostasis: Emphasis on Metabolism. Biochim.
Biophys. Acta 2007, 1773, 93-104.

(7)

Sung, C.; Hsu, Y.; Chen, C.; Lin, Y.; Wu, C. Oxidative Stress and Nucleic Acid
Oxidation in Patients with Chronic Kidney Disease. Oxid. Med. Cell. Longev. 2013,
2013, 1-15.

(8)

Quaye, I. K. Oxidative Stress in Human Health and Disease. In Insight and Control of
Infectious Disease in Global Scenario [Online]; Priti, R., Ed. InTech: Rijeka, Croatia,
2012; Chapter 6, pp 97-120. http://www.intechopen.com/books/insight-and-control-ofinfectious-disease-in-global-scenario/oxidative-stress-in-health-and-disease (accessed
February 18, 2014).

(9)

Aruoma, O. I. Free Radicals, Oxidative Stress, and Antioxidants in Human Health and
Disease. J. Am. Oil Chem. Soc. 1998, 75, 199-212.

(10)

Hybertson, B. M.; Gao, B.; Bose, S. K.; McCord, J. M. Oxidative Stress in Health and
Disease: The Therapeutic Potential of Nrf2 Activation. Mol. Aspects Med. 2011, 32,
234-246.

(11)

Dunlop, R. A.; Brunk, U. T.; Rodgers, K. J. Oxidized Proteins: Mechanisms of
Removal and Consequences of Accumulation. IUBMB Life 2009, 61, 522-527.

(12)

Sayre, L. M.; Moreira, P. I.; Smith, M. A.; Perry, G. Metal Ions and Oxidative Protein
Modification in Neurological Disease. Ann. Ist. Super. Sanita 2005, 41, 143-164.

(13)

Cabiscol, E.; Tamarit, J.; Ros, J. Oxidative Stress in Bacteria and Protein Damage by
Reactive Oxygen Species. Int. Microbiol. 2000, 3, 3-8.

102

(14)

Halliwell, B. Reactive Oxygen Species in Living Systems: Source, Biochemistry, and
Role in Human Disease. Am. J. Med. 1991, 91, 14S-22S.

(15)

Davies, K. J. Protein Damage and Degradation by Oxygen Radicals. I. General
Aspects. J. Biol. Chem. 1987, 262, 9895-9901.

(16)

Davies, M. J.; Dean, R. T. Radical-mediated protein oxidation: from chemistry to
medicine; Oxford science publications; Oxford University Press: Oxford; New York,
1997; pp 443.

(17)

Garrison, W. M. Reaction Mechanisms in the Radiolysis of Peptides, Polypeptides, and
Proteins. Chem. Rev. (Washington, DC, U.S.) 1987, 87, 381-398.

(18)

Imlay, J. Cellular Defenses Against Superoxide and Hydrogen Peroxide. Annu. Rev.
Biochem. 2008, 77, 755.

(19)

Davies, M. J. The Oxidative Environment and Protein Damage. Biochim. Biophys.
Acta. 2005, 1703, 93-109.

(20)

Elias, R. J.; Kellerby, S. S.; Decker, E. A. Antioxidant Activity of Proteins and
Peptides. Crit. Rev. Food Sci. Nutr. 2008, 48, 430-441.

(21)

Gracy, R. W.; Talent, J. M.; Kong, Y.; Conrad, C. C. Reactive Oxygen Species: The
Unavoidable Environmental Insult? Mutat. Res. 1999, 428, 17-22.

(22)

Jomova, K.; Vondrakova, D.; Lawson, M.; Valko, M. Metals, Oxidative Stress and
Neurodegenerative Disorders. Mol. Cell. Biochem. 2010, 345, 91-104.

(23)

Hardy, J.; Selkoe, D. J. The Amyloid Hypothesis of Alzheimer's Disease: Progress and
Problems on the Road to Therapeutics. Science (Washington, DC, U.S.) 2002, 297,
353-356.

(24)

Turrens, J. F. Mitochondrial Formation of Reactive Oxygen Species. J. Physiol.
(Oxford, U.K.) 2003, 552, 335-344.

(25)

Yamamori, T.; Yasui, H.; Yamazumi, M.; Wada, Y.; Nakamura, Y.; Nakamura, H.;
Inanami, O. Ionizing Radiation Induces Mitochondrial Reactive Oxygen Species
Production Accompanied by Upregulation of Mitochondrial Electron Transport Chain
Function and Mitochondrial Content under Control of the Cell Cycle Checkpoint. Free
Radical Biol. Med. 2012, 53, 260-270.

(26)

Chen, Q.; Moghaddas, S.; Hoppel, C. L.; Lesnefsky, E. J. Ischemic Defects in the
Electron Transport Chain Increase the Production of Reactive Oxygen Species from
Isolated Rat Heart Mitochondria. Am. J. Physiol. 2008, 294, C460-C466.

(27)

Chen, Q.; Vazquez, E. J.; Moghaddas, S.; Hoppel, C. L.; Lesnefsky, E. J. Production of
Reactive Oxygen Species by Mitochondria. J. Biol. Chem. 2003, 278, 36027-36031.

103

(28)

Sanz, A.; Stefanatos, R.; McIlroy, G. Production of Reactive Oxygen Species by the
Mitochondrial Electron Transport Chain in Drosophila Melanogaster. J. Bioenerg.
Biomembr. 2010, 42, 135-142.

(29)

Liu, Y.; Fiskum, G.; Schubert, D. Generation of Reactive Oxygen Species by the
Mitochondrial Electron Transport Chain. J. Neurochem. 2002, 80, 780-787.

(30)

Wang, L. L.; Xiong, Y. L. Inhibition of Lipid Oxidation in Cooked Beef Patties by
Hydrolyzed Potato Protein Is Related to Its Reducing and Radical Scavenging Ability.
J. Agric. Food Chem. 2005, 53, 9186-9192.

(31)

Qin, L.; Liu, Y.; Cooper, C.; Liu, B.; Wilson, B.; Hong, J. Microglia Enhance Î˛Amyloid Peptide-Induced Toxicity in Cortical and Mesencephalic Neurons By
Producing Reactive Oxygen Species. J. Neurochem. 2002, 83, 973-983.

(32)

Davies, M. J.; Fu, S.; Wang, H.; Dean, R. T. Stable Markers of Oxidant Damage to
Proteins and Their Application in the Study of Human Disease. Free Radical Biol.
Med. 1999, 27, 1151-1163.

(33)

Allan Butterfield, D. Amyloid Î˛-Peptide (1-42)-Induced Oxidative Stress and
Neurotoxicity: Implications for Neurodegeneration in Alzheimer's Disease Brain. A
Review. Free Radical Res. 2002, 36, 1307-1313.

(34)

Eisenberg, D.; Nelson, R.; Sawaya, M. R.; Balbirnie, M.; Sambashivan, S.; Ivanova, M.
I.; Madsen, A. Ă; Riekel, C. The Structural Biology of Protein Aggregation Diseases:
Fundamental Questions and Some Answers. Acc. Chem. Res. 2006, 39, 568-575.

(35)

Conrad, C. C.; Marshall, P. L.; Talent, J. M.; Malakowsky, C. A.; Choi, J.; Gracy, R.
W. Oxidized Proteins in Alzheimer's Plasma. Biochem. Biophys. Res. Commun. 2000,
275, 678-681.

(36)

Qiang, W.; Yau, W.; Luo, Y.; Mattson, M. P.; Tycko, R. Antiparallel Î˛-Sheet
Architecture in Iowa-Mutant Î˛-Amyloid Fibrils. Proc. Natl. Acad. Sci. U. S. A. 2012,
109, 4443-4448.

(37)

Paravastu, A. K.; Leapman, R. D.; Yau, W.; Tycko, R. Molecular Structural Basis for
Polymorphism in Alzheimer's Î˛-Amyloid Fibrils. Proc. Natl. Acad. Sci. U. S. A. 2008,
105, 18349-18354.

(38)

Luhrs, T.; Ritter, C.; Adrian, M.; Riek-Loher, D.; Bohrmann, B.; Dobeli, H.; Schubert,
D.; Riek, R. 3D Structure of Alzheimer's Amyloid-Î˛(1-42) Fibrils. Proc. Natl. Acad.
Sci. U. S. A. 2005, 102, 17342-17347.

(39)

Coles, M.; Bicknell, W.; Watson, A. A.; Fairlie, D. P.; Craik, D. J. Solution Structure of
Amyloid Î˛-Peptide(1-40) in a Water-Micelle Environment. Is the Membrane-Spanning
Domain Where We Think It Is? Biochemistry 1998, 37, 11064-11077.

104

(40)

Soto, C. Plaque Busters: Strategies to Inhibit Amyloid Formation in Alzheimer's
Disease. Mol. Med. Today 1999, 5, 343-350.

(41)

Levine, R. L.; Moskovitz, J.; Stadtman, E. R. Oxidation of Methionine in Proteins:
Roles in Antioxidant Defense and Cellular Regulation. IUBMB Life 2000, 50, 301-307.

(42)

Faraggi, M.; Tal, Y. The Reaction of the Hydration Electron with Amino Acid,
Peptides, and Proteins in Aqueous Solution: II. Formation of Radicals and Electron
Transfer Reactions. Radiat. Res. 1975, 62, 347.

(43)

Daura, X.; Mark, A. E.; van Gunsteren, W. F. Peptide Folding Simulations: No Solvent
Required? Comput. Phys. Commun. 1999, 123, 97-102.

(44)

Lamm, M. S.; Rajagopal, K.; Schneider, J. P.; Pochan, D. J. Laminated Morphology of
Nontwisting Î˛-Sheet Fibrils Constructed via Peptide Self-Assembly. J. Am. Chem. Soc.
2005, 127, 16692-16700.

(45)

Tashiro, S.; Kobayashi, M.; Fujita, M. Folding of an Ala-Ala-Ala Tripeptide into a Î˛Turn via Hydrophobic Encapsulation. J. Am. Chem. Soc. 2006, 128, 9280-9281.

(46)

Chin, W.; Piuzzi, F.; Dimicoli, I.; Mons, M. Probing the Competition Between
Secondary Structres and Local Prefernces in Gas Phase Isolated Peptide Backbones.
Phys. Chem. Chem. Phys. 2006, 8, 1033-1048.

(47)

Azimi, S.; Rauk, A. The Binding of Fe(II)-Heme to the Amyloid Beta Peptide of
Alzheimer's Disease: QM/MM Investigations. J. Chem. Theory Comput. 2012, 8,
5150-5158.

(48)

Bakker, J. M.; PlĂźtzer, C.; HĂźnig, I.; HĂ¤ber, T.; Compagnon, I.; von Helden, G.; Meijer,
G.; Kleinermanns, K. Folding Structures of Isolated Peptides as Revealed by Gas-Phase
Mid-Infrared Spectroscopy. ChemPhysChem 2005, 6, 120-128.

(49)

Doan, H. Q.; Davis, A. C.; Francisco, J. S. Primary Steps in the Reaction of OH
Radicals with Peptide System: Perspective from a Study of Model Amides. J. Phys.
Chem. A 2010, 114, 5342-5357.

(50)

Green, M. C.; Federov, D. G.; Kitaura, K.; Francisco, J. S.; Slipchenko, L. V. OpenShell Pair Interaction Energy Decomposition Analysis (PIEDA): Formulation and
Application to the Hydrogen Abstraction in Tripeptides. J. Chem. Phys. 2013, 138,
074111.

(51)

Green, M. C.; Stelzleni, S.; Francisco, J. S. Spectral Marker for CÎą Damage in Beta
Peptides. J. Phys. Chem. A 2013, 117, 550-565.

(52)

Turecek, F.; Syrstad, E. A.; Seymour, J. L.; Chen, X.; Yao, C. Peptide Cation-Radicals.
A Computational Study of the Competition Between Peptide N-CÎą Bond Cleavage and
Loss of the Side Chain in the [GlyPhe-NH2 + 2H]+â˘. J. Mass Spectrom. 2003, 38, 10931104.
105

(53)

Cheng, W.; Jang, S.; Wu, C.; Lin, R.; Lu, H.; Li, F. Site Specificity of the ÎąC-H Bond
Dissociation Energy for a Naturally Occurring Î˛-Hairpin Peptide - An Ab Initio Study.
J. Comput. Chem. 2009, 30, 407-414.

(54)

Bhattacharya, A.; Clawson, K. J.; Bernstein, E. R. Influence of Turn (or Fold) and Local
Charge in Fragmentation of the Peptide Analogue Molecule CH3CO-Gly-NH2
Following Single-Photon VUV (118.22 nm) Ionization. J. Phys. Chem. A 2011, 115,
10679-10688.

(55)

Bhattacharya, A.; Shin, J.; Clawson, K. J.; Bernstein, E. R. Conformation Specific and
Charge Directed Reactivity of Radical Cation Intermediates of Îą-Substituted (Amino,
Hydroxy, and Keto) Bioactive Carboxylic Acids. Phys. Chem. Chem. Phys. 2010, 12,
9700-9712.

(56)

Head-Gordon, M.; Pople, J. A.; Frisch, M. J. MP2 Energy Evaluation By Direct
Methods. Chem. Phys. Lett. 1988, 153, 503-506.

(57)

Frisch, M. J.; Head-Gordon, M.; Pople, J. A. A Direct MP2 Gradient Method. Chem.
Phys. Lett. 1990, 166, 275-280.

(58)

SaebĂ¸, S.; AlmlĂśf, J. Avoiding the Integral Storage Bottleneck in LCAO Calculations of
Electron Correlation. Chem. Phys. Lett. 1989, 154, 83-89.

(59)

Head-Gordon, M.; Head-Gordon, T. Analytic MP2 Frequencies Without Fifth-Order
Storage. Theory and Application to Bifurcated Hydrogen Bonds in the Water Hexamer.
Chem. Phys. Lett. 1994, 220, 122-128.

(60)

Becke, A. D. Density-Functional Thermochemistry. III. The Role of Exact Exchange.
J. Chem. Phys. 1993, 98, 5648.

(61)

Lee, C. T.; Yang, W. T.; Parr, R. G. Development of the Colle-Salvetti CorrelationEnergy Formula into a Functional of the Electron Density. Phys. Rev. B. 1988, 37, 785.

(62)

Hariharan, P. C.; Pople, J. A. The Influence of Polarization Functions on Molecular
Orbital Hydrogenation Energies. Theor. Chim. Acta. 1973, 28, 213-222.

(63)

Francl, M. M.; Pietro, W. J.; Hehre, W. J.; Binkley, J. S.; DeFrees, D. J.; Pople, J. A.;
Gordon, M. S. Self-Consistent Molecular Orbital Methods. XXIII. A PolarizationType Basis Set for Second-Row Elements. J. Chem. Phys. 1982, 77, 3654-3665.

(64)

Clark, T.; Chandrasekhar, J.; Spitznagel, G. W.; Schleyer, P. V. R. Efficient Diffuse
Function-Augmented Basis Sets for Anion Calculations. III. The 3-21+G Basis Set for
First-Row Elements, Li-F. J. Comput. Chem. 1983, 4, 294-301.

(65)

Krishnan, R.; Frisch, M. J.; Pople, J. A. Contribution of Triple Substitutions to the
Electron Correlation Energy in Fourth Order Perturbation Theory. J. Chem. Phys.
1980, 72, 4244.

106

(66)

Gill, P. M. W.; Johnson, B. G.; Pople, J. A.; Frisch, M. J. The Performance of the
Becke-Lee-Yang-Parr (B-LYP) Density Functional Theory with Various Basis Sets.
Chem. Phys. Lett. 1992, 197, 499-505.

(67)

CĂ­Ĺžek, J. On the Use of the Cluster Expansion and the Technique of Diagrams in
Calculations of Correlation Effects in Atoms and Molecules. In Advances in Chemical
Physics; Hariharan, P.C., Ed.; Wiley Interscience: New York, 1969; Vol. 14, p. 35.

(68)

Purvis, G. D., III; Bartlett, R. G. A Full Coupled-Cluster Singles and Doubles Model:
The Inclusion of Disconnected Triples. J. Chem. Phys. 1982, 76, 1910-1918.

(69)

Raghavachari, K.; Trucks, G. W.; Pople, J. A.; Head-Gordon, M. A Fifth-Order
Perturbation Comparison of Electron Correlation Theories. Chem. Phys. Lett. 1989, 157
(6), 479â483.

(70)

Headlam, H. A.; Mortimer, A.; Easton, C. J.; Davies, M. J. Î˛-Scission of C-3 (Î˛Carbon) Alkoxyl Radicals on Peptides and Proteins: A Novel Pathway Which Results
in the Formation of Îą-Carbon Radicals and the Loss of Amino Acid Side Chains.
Chem. Res. Toxicol. 2000, 13, 1087-1095.

107

CHAPTER 6 Partition Coefficients for the SAMPL5 Challenge using Transfer Free
Energies

6.1

Introduction
Theoretical approaches can be a useful partner for predicting physiochemical properties

important in experimental design, from drug development, to studies in toxicology and
environmental science. Such predictive approaches can provide guidance in high-throughput
screening for rational design.1,2 An important step in establishing such utility, however, is to ensure
that the approaches are suitable and well-tested.
A popular route towards rationale design is the use of mathematical models to determine
quantitative

structure-activity

relationships

(QSAR)

or

quantitative

structure-property

relationships (QSPR), based on parameters or descriptors that correlate physical and chemical
properties with experimental observations. Though useful, it is difficult for these predictive models
to determine properties coupled with changes in the electronic environment such as those that arise
from solute-solute and solute-solvent interactions.
Measurements of solvation properties of molecules in different solution phases and the
distribution equilibria between the phases are of strong interest in drug discovery as these
properties are critical for drug profiling. Promising drug candidates can be discarded as a result of
inaccurate predictions of such properties; particularly of interest are the partition (P) and
distribution (D) coefficients. The partition coefficient (P) between two phases x and y, respectively,
is the ratio of the concentration of solute C in each phase, where the subscript 0 represents the
neutral, unionized state of the solute.
[C0 ]đŚđŚ

P = [C

0 ]đĽđĽ

108

â [C ]đŚđŚ

D = âđđ[Cđđ ]
đđ

đđ đĽđĽ

In contrast, the distribution coefficient (D) is the actual partitioning or distribution of the total
analytical concentration between two solvents at a fixed pH, which includes all chemical states in
solution (i.e., unionized and ionized states). Although many QSAR and QSPR techniques exist,
because these approaches are based on defining statistical relationships via parameters trained to
fit known experimental properties, this may not be the most suitable approach for predicting
properties for molecules that undergo chemical transformations in different environments, such as
compounds with multiple protonation states or compounds that can undergo structural
rearrangements.3â5 Additionally, many QSAR and QSPR models are paramertized to predict
octanol/water partition coefficients as there is an abundance of reliable experimental data for
available to parameterize this predictive approach. For this, it is ideal to have a simple approach
that relies on less parameterization and is transferable for use with other solvents.
The partition coefficient between two immiscible solvents, such as water and cyclohexane,
is expressed as the equilibrium distribution between the concentrations of a solute in each solvent
and is related to the change in energy associated with the solute-solvent interactions, which is
expressed as the free energy difference, âG, of the solute in each solvent,
log đđ = log ďż˝

[solute]cyclohexane
log đđ
ďż˝ = ďż˝Îđşđşwater â Îđşđşcyclohexane ďż˝ 10
[solute]water
đđđđ

where e is Eulerâs number, k is Boltzmannâs constant, and T is temperature. Predicting the free
energy of solvation for a molecule using chemometric techniques can be challenging due to the
difficulties in parameterizing the solute-solvent interactions, with the many intermolecular forces
that contribute to solvation.
Another route that can be used to predict the partition coefficient is to use quantum
mechanical (QM) approaches, accounting for solvation via either an explicit or implicit route. For
explicit solvation, individual solvent molecules are included in a calculation.
109

While these

approaches can account for non-covalent solute-solvent interactions, they can become
computationally intensive due to the number of solvent molecules that may be needed within the
solvation shell as well as the amount of sampling that would be required. For implicit solvation, a
representation of the solvent is utilized which neglects explicit contributions of the solute-solvent
interactions, resulting in a lower computational cost approach. Implicit solvation approaches,
continuum solvent models, are commonly used and include the Conductor-like Polarizable
Continuum Model (CPCM)6, the COnductor-like Screening MOdel (COSMO)7, and the Solvation
Model based on Density (SMD)8, for predicting the free energies of hydration. Of these solvation
models, SMD is a more portable model for the prediction of solvation free energies, as it uses the
electron density of the solute in contrast to partial charges as used in CPCM and COSMO. In
previous studies predicting the free energies of solvation of small molecules, it has been shown
that the increasing quality of a basis set can improve the prediction, including ab initio approaches9
as well as in hybrid QM and molecular mechanics (MM) approaches such as the QM/MM-nonBoltzmann-Bennett method.10,11
For this investigation, emphasis is on the 13 molecule subset (Batch 0) of the SAMPL5
distribution coefficient molecule set. Using this subset, a variety of hybrid DFT functionals with
different basis sets were used to predict the cyclohexane-water partition coefficient. A vertical
solvation approach, using gas-phase optimized geometries to predict the free energy in solution,
was used for predicting the cyclohexane-water transfer free energy needed for the calculation of
the partition coefficient. The approach that was chosen was applied to the full molecule set of 53
compounds, which corresponds to Submission #40 in the SAMPL5 Distribution Coefficient
Challenge.

110

6.2

Methods
The initial structures issued with the SAMPL5 challenge data set were used as the reference

state for all calculations. Gas phase geometry optimizations and frequency calculations were
performed on the 53 molecules of the SAMPL5 set (shown in Error! Reference source not
found.) using B3LYP12â14 in conjunction with cc-pVTZ basis sets.15â18 B3LYP/cc-pVTZ was
selected due to its well-established success in the prediction of ground state gas-phase structures
for molecules such as those in the SAMPL5 set. Frequencies were examined to ensure that
equilibrium stationary points were reached.

For second-row species such as sulfur, the

recommended form of the correlation consistent basis sets, the augmented tight-d basis sets, ccpV(T+d)Z,19 was used to avoid the deficiencies noted in the original form of the correlation
consistent basis sets for second-row atoms. The correlation consistent basis sets were selected due
to their demonstrated behavior with a broad range of functionals20â29, known to converge with
respect to increasing basis set size (i.e., cc-pVnZ, where n=D, T, Q) to the Complete Basis Set
(CBS), or Kohn-Sham limit, for numerous properties such as thermochemical properties. Though
DFT structures are generally known to reach convergence using a triple-zeta basis set, for
molecules that include sulfur or transition metal species or molecules of increasing size, energetic
properties determined using DFT may not reach convergence unless a basis set of at least
quadruple-zeta quality basis sets are used. Thus, basis sets through quadruple-zeta quality were
considered in this work.

111

Figure 6.1 Structure of the 53 compounds investigated in the SAMPL5 challenge. The 13
molecules represent (left) correspond to the Batch 0 subset.

Many prior studies seeking to predict the solvation free energies of small organic molecules
often chose hybrid DFT methods, however, an optimal functional or optimal functional class has
not yet been agreed upon.8,30â40 In order to identify which functional would serve best for the blind
prediction, single point calculations were carried out on the optimized geometries using several
different hybrid DFT functionals, (B97-141, B9842, B3PW9112,43, M0544, M05-2X45, M0646, M06112

2X, M06-HF47, ĎB9748, and ĎB97X-D49) in combination with the cc-pVnZ basis sets (where n=D,
T, and Q) for the subset of 13 molecules (noted Batch 0 in Error! Reference source not found.).
These functionals can be classified as three different types of hybrid functionals: global hybrid
generalized gradient approximation (GH-GGA) includes B97-1, B98, and B3PW91; global hybrid
meta-GGA (GH-mGGA) includes M05, M05-2X, M06, M06-2X, and M06-HF; and range
separated hybrid (RSH)-GGA includes ĎB97 and ĎB97X-D. These functionals were chosen due
to their popularity and utility for organic species, as well as to provide a variety of classes of hybrid
functionals. All calculations were performed using the âUltrafineâ integration grid as it is known
that HM-GGA functionals are sensitive to the integration grid size50 and finer grids can improve
numerical accuracy. These single point calculations were carried out in implicit solvent using the
SMD solvation model for both water and cyclohexane. The partition coefficients were estimated
from the difference in the transfer free energy in water and in cyclohexane.

6.3

Results and Discussion
For the Batch 0 subset of molecules, several hybrid DFT functionals were tested with

double, triple, and quadruple-Îś level basis sets. In these calculations, only a single conformation,
protonation state, and tautomer were considered thus there is not a statistical uncertainty associated
with the calculations. Rather, there is model uncertainty that arises from the assumption of a single
geometry, the DFT method, the basis set, and the solvent model. From the results shown in Table
6.1, overall, each approach predicts the partition coefficient within a mean absolute deviation
(MAD) in the range of 1.5 - 2.0 logP units in reference to experiment, which corresponds to a 2.0
- 2.7 kcal mol-1 variation in the transfer free energy. While QSPR methods are able to predict

113

within 0.3 - 1.0 log units from experiment, the physical significance of the predictions are
questionable as these models have adroit tactics at modeling noise.51
Table 6.1 Comparison of predicted logP with experimental logD.
SAMPL5
ID
003
015
017
020
037
045
055
058
059
061
068
070
080

logD
Exp.
1.9 Âą 0.1
-2.2 Âą 0.3
2.5 Âą 0.3
1.6 Âą 0.3
-1.5 Âą 0.1
-2.1 Âą 0.2
-1.5 Âą 0.1
0.8 Âą 0.1
-1.3 Âą 0.3
-1.5 Âą 0.1
1.4 Âą 0.3
1.6 Âą 0.3
-2.2 Âą 0.2

MSDb
MADc

SAMPL5
ID
003
015
017
020
037
045
055
058
059
061
068
070
080
MSDb
MADc

logD
Exp.
1.9 Âą 0.1
-2.2 Âą 0.3
2.5 Âą 0.3
1.6 Âą 0.3
-1.5 Âą 0.1
-2.1 Âą 0.2
-1.5 Âą 0.1
0.8 Âą 0.1
-1.3 Âą 0.3
-1.5 Âą 0.1
1.4 Âą 0.3
1.6 Âą 0.3
-2.2 Âą 0.2

DZ
3.1
-3.1
1.3
0.9
-3.9
-1.1
-2.6
2.7
-0.4
-0.8
1.5
5.5
1.1

B97-1
TZ
2.7
-3.6
1.0
0.3
-4.2
-1.5
-3.1
2.3
-0.5
-1.2
0.9
5.1
0.5

QZ
2.6
-3.6
1.0
0.1
-4.3
-1.6
-3.2
2.2
-0.5
-1.3
0.8
5.0
0.4

0.5
1.5

0.1
1.5

0.0
1.5

a

DZ
3.0
-3.1
1.3
0.9
-4.0
-1.0
-2.7
2.8
-0.3
-0.8
1.7
5.6
1.1

M06
TZ
2.7
-3.4
1.2
0.4
-4.2
-1.4
-3.1
2.4
-0.4
-1.3
1.1
5.4
0.5

QZ
2.7
-3.4
1.2
0.3
-4.2
-1.4
-3.1
2.4
-0.4
-1.4
1.1
5.3
0.5

0.5
1.5

0.2
1.5

0.2
1.5

DZ
3.0
-3.1
1.2
0.9
-4.0
-1.1
-2.7
2.7
-0.4
-0.8
1.5
5.5
1.0

B98
TZ
2.6
-3.6
1.0
0.2
-4.3
-1.6
-3.2
2.2
-0.5
-1.3
0.8
5.1
0.4

QZ
2.5
-3.7
1.0
0.1
-4.4
-1.7
-3.3
2.2
-0.5
-1.4
0.7
5.0
0.3

0.5
1.5

0.0
1.5

-0.1
1.5

DZ
2.8
-3.5
0.6
0.5
-4.4
-1.4
-2.9
2.3
-0.6
-1.0
1.2
5.0
0.8

M06-2X
TZ QZ
2.3 2.3
-4.0 -3.9
0.3 0.5
-0.2 -0.2
-4.8 -4.7
-1.9 -1.9
-3.4 -3.4
1.8 1.8
-0.8 -0.7
-1.6 -1.7
0.5 0.5
4.6 4.6
0.1 0.1

0.1 -0.4 -0.3
1.5 1.5 1.5

114

B3PW91
DZ TZ QZ
3.0 2.7 2.6
-3.2 -3.6 -3.7
1.0 0.9 0.9
0.8 0.2 0.1
-4.0 -4.3 -4.3
-1.2 -1.6 -1.7
-2.7 -3.2 -3.2
2.6 2.2 2.2
-0.5 -0.6 -0.6
-0.9 -1.3 -1.4
1.4 0.8 0.7
5.3 4.9 4.9
1.1 0.5 0.4
0.4
1.5

0.0
1.5

-0.1
1.5

M06-HF
TZ QZ
1.4 1.3
-5.3 -5.1
-1.4 -1.3
-1.5 -1.6
-6.2 -6.0
-3.1 -3.0
-4.2 -4.1
0.5 0.4
-1.5 -1.5
-2.4 -2.5
-0.8 -1.0
2.8 2.6
-0.8 -0.7

DZ
1.8
-4.6
-1.0
-0.6
-5.4
-2.4
-3.6
1.0
-1.4
-1.7
-0.3
3.3
0.1

-0.9 -1.5 -1.6
1.6 1.9 1.9

Table 6.1 (contâd)
SAMPL5
ID
003
015
017
020
037
045
055
058
059
061
068
070
080

logD
Exp.
1.9 Âą 0.1
-2.2 Âą 0.3
2.5 Âą 0.3
1.6 Âą 0.3
-1.5 Âą 0.1
-2.1 Âą 0.2
-1.5 Âą 0.1
0.8 Âą 0.1
-1.3 Âą 0.3
-1.5 Âą 0.1
1.4 Âą 0.3
1.6 Âą 0.3
-2.2 Âą 0.2

MSDb
MADc
SAMPL5
ID
003
015
017
020
037
045
055
058
059
061
068
070
080

logD
Exp.
1.9 Âą 0.1
-2.2 Âą 0.3
2.5 Âą 0.3
1.6 Âą 0.3
-1.5 Âą 0.1
-2.1 Âą 0.2
-1.5 Âą 0.1
0.8 Âą 0.1
-1.3 Âą 0.3
-1.5 Âą 0.1
1.4 Âą 0.3
1.6 Âą 0.3
-2.2 Âą 0.2

DZ
3.0
-3.1
1.3
0.9
-3.7
-1.0
-2.6
2.8
-0.3
-0.7
1.6
5.5
1.2

DZ
2.9
-3.3
0.9
0.7
-4.1
-1.3
-2.7
2.5
-0.5
-1.0
1.3
5.2
1.0

ĎB97
TZ
2.5
-3.9
0.6
-0.1
-4.5
-1.8
-3.2
2.0
-0.7
-1.6
0.5
4.8
0.3

QZ
2.4
-4.0
0.6
-0.2
-4.5
-1.9
-3.3
2.0
-0.7
-1.7
0.4
4.7
0.2

0.3
1.5

-0.2
1.5

-0.3
1.5

M05
TZ
2.7
-3.5
1.2
0.4
-4.0
-1.4
-3.0
2.4
-0.4
-1.2
1.0
5.3
0.7

ĎB97X-D
DZ
TZ
2.8
2.5
-3.5 -3.9
0.9
0.7
0.6 -0.1
-4.2 -4.5
-1.4 -1.8
-2.8 -3.3
2.4
2.0
-0.6 -0.7
-1.0 -1.5
1.2
0.6
5.2
4.8
0.8
0.2
0.2
1.5

-0.2
1.5

QZ
2.4
-3.9
0.6
-0.2
-4.5
-1.9
-3.4
2.0
-0.7
-1.6
0.5
4.7
0.2
-0.3
1.5

M05-2X
DZ TZ QZ
2.8 1.9 2.0
-3.5 -4.6 -4.4
0.6 -0.1 0.0
0.5 -0.8 -0.8
-4.4 -5.1 -5.0
-1.4 -2.4 -2.4
-2.9 -3.8 -3.8
2.3 1.3 1.3
-0.6 -1.0 -1.0
-1.0 -1.9 -1.9
1.2 0.0 0.0
5.0 4.1 4.0
0.8 -0.3 -0.3

QZ
2.6
-3.7
1.1
0.2
-4.1
-1.5
-3.1
2.3
-0.4
-1.3
0.9
5.0
0.6

MSDb
0.6 0.2 0.1
0.1 -0.8 -0.8
MADc
1.5 1.5 1.5
1.5 1.6 1.6
a
The columns labeled DZ, TZ, and QZ correspond to the cc-pVDZ, cc-pVTZ, and cc-pVQZ
basis sets.
b
The mean signed deviation of the predicted logP from the experimental logD.
c
The mean absolute deviation of the predicted logP from the experimental logD.

115

Figure 6.2 Subset of molecules (Batch0) LogP: effect of increasing basis sets.The predicted
logP with respect to increasing basis set size (DZ, TZ, QZ) are represented by the red box,
green triangle, and purple âXâ, respectively. 116

For the Batch 0 subset of molecules and the functionals considered, there is not a systematic
underestimation or overestimation in the prediction of the logP. The predicted logP for the GHGGA and RSH-GGA functionals are similar, with a MAD of 1.5 logP units. Each of the tested
approaches are able to predict the correct sign of logP in water and cyclohexane
(hydrophobicity/hydrophilicity) for each molecule, with the exception of molecules 017, 020, 058,
068, and 080. Predicting the correct sign of logP is important as this reflects the preference of the
molecule to reside in either the organic or aqueous phase.
The lowest MAD observed in this study is 1.5 logP units, which is given by eight out of
the ten functionals tested. As shown in Table 6.1, most of the functionals result in a low mean
signed deviation (MSD) compared with the magnitude of the MAD. This indicates that there is not
a consistent deviation from the experimental values. Rather, the small magnitude of the MSD is
the result of deviations above and below experiment such that they largely cancel for the Batch 0
subset of molecules. The largest MSDs observed are for the M06-HF functional with the
magnitude of the MSDs within 50% or more of the MADs. Thus, the M06-HF functional has a
larger systematic error resulting in underestimation of the logP value, i.e. negative MSDs in
comparison to the other functionals. The calculated values for logP are plotted with respect to the
molecule number in Figure 6.2 while the numerical results are shown in Table 6.1. As shown, the
quality of the basis set used does not impact the predictions of hydrophobicity or hydrophilicity
since the sign of the deviations is mostly unchanged when changing basis. For M06-HF, although
it appears that the calculated logP is consistently underestimated with the TZ and QZ basis sets,
the functional tends to predict molecules to be more hydrophilic than those predicted with the other
functionals. This bias towards hydrophilicity would become a problem for high-throughput drug

117

screening as ideal compounds are neither too hydrophobic nor too hydrophilic. Predictions that are
too hydrophilic could result in discarding potential compounds.
The correlation consistent basis sets used for this work were constructed in a systematic
fashion to recover correlation energy for ab initio methods and overall convergence is commonly
demonstrated for properties determined with increasing size of basis sets. As illustrated in Figure
6.2, the increasing quality of the basis sets lowers the logP values. This behavior is clearly
convergent, yet convergence to the Kohn-Sham limit does not necessarily result in improved
results with respect to experiment.
In examining the Batch 0 subset, trends in the predictions of logP for each molecule are
consistent for each functional with only a few exceptions. Molecules 020, 068, and 070 have
deviations that differ in sign as a function of basis set and functional. This is simply the result of
the predictions being so close to experiment that even a small variation in the predicted logP values
can change the sign of the deviation. Increasing the basis set size typically improves the
predictions of logP for the HG-GGA functionals and the RSH-GGA functionals, while the
magnitude of the signed deviation increases slightly for the GH-mGGA functionals. Overall, the
small variation between the logP values obtained with the TZ and QZ basis sets indicates that the
TZ basis set is already near the Kohn-Sham limit. A greater percentage of exact exchange from
the functionals may help in the prediction of logP, for example, from M05 to M05-2X, a better
prediction of logP is obtained for molecules 003, 045, 058, 059, 070, and 080. However, for some
molecules too much exact exchange overestimates the solvation in water and underestimates the
logP by at least 1.0 to 2.0 logP units. For example, each functional underestimates the logP for
molecules 015, 017, 020, 037, and 055, as each overestimates the solvation in water relative to
cyclohexane. Additionally for these molecules, increasing the size of the basis set predicts the

118

molecules to be more hydrophilic. In these cases, the functionals perform best using a DZ basis
set. In the case for molecule 068, each functional predicts within experimental error using a DZ
basis set, with the exclusion of M06-HF.
Overall, similar predictions of logP are obtained with the GH-GGA functionals and the
RSH-GGA functionals. For the GH-mGGA functionals, predictions of logP obtained using M05
and M06 are similar. Using M05-2X and M06-2X yield similar predictions of logP as well. M06HF stands out the most in contrast to the other functionals as it consistently overestimates the
solvation of the molecule in water. From the results, it is evident that increasing the amount of
exact exchange, as well the quality of the basis set, overestimates the solvation energy in water
relative to the solvation energy in cyclohexane. It is believed that this bias in overestimation in
water is due to the parameter fitting for the SMD solvation model. The SMD solvation model,
which was reported to achieve an accuracy for predicting the solvation free energies (mean
unsigned error of 0.6 to 1.0 kcal/mol for neutral solutes), was more heavily parameterized against
molecules solvated in water in contrast to cyclohexane.8 This situation can be advantageous for
molecules that are slightly hydrophobic. However, this is not optimal since there is a consistent
overestimation of the energy from being solvated in water as this results in misleading predictions
of the equilibrium distribution of a molecule in different solvents. Although there are cases in
which the GH-mGGA functionals perform best, the GGA functionals were constructed with less
parameter fitting and may serve as a stronger class of functionals for initial starting guesses when
predicting the solvation properties of molecules in which experiment is unavailable.
B3PW91 was chosen as the method for predicting the transfer free energy of the remaining
SAMPL5 subsets because of its overall consistent behavior across the period table, and for a
number of energetic properties. Some of the molecules within the SAMPL5 set contain sulfur.

119

Previous studies have shown that for molecules containing sulfur, the BP3W91 functional yields
more accurate energetics than the B3LYP functional when using correlation consistent basis
sets.21,52 The results submitted for the SAMPL5 challenge are shown in Table 6.2. Using B3PW91
and a quadruple-Îś basis level basis set has an MAD of 1.9 and was able to estimate the logP within
2.0 logP units from the experimental logD for about 60% of the molecule set. Although this
approach lacks contributions from other protonation states, tautomers, and additional protomers,
it provided a good starting estimate of logP. Regarding the outliers that this approach predicted
over 2.0 logP units in reference to the experimental logD, it is believed that the prediction can be
improved by including the chemical contributions from tautomerization, protonation, as well as
with additional conformational sampling as many of the outliers that are greater than 3.0 logP units
from the experimental logD are less rigid than other molecules within the dataset. For several of
the molecules, the source of the larger deviations are known whereas other relationships between
the prediction and the structure will require further investigations. The largest outlier is molecule
083, with a deviation of 9.0 logP units, is a result of modeling of the incorrect tautomer. The
tautomeric state issued in the SAMPL5 data set was not the preferred tautomer. The results
obtained from using the triple-Îś and the quadruple-Îś level basis sets are very similar. Rather than
using a quadruple-Îś level basis set, it is recommended to use a triple-Îś level size and the correlation
consistent basis sets that include tight d functions for molecules containing sulfur.29

120

Table 6.2 Overview of the results submitted to the SAMPL5 challenge.
ID
002
003
004
005
006
007
010
011
013
015
017
019
020
021
024
026
027
033
037
042
044
045
046
047
048
049
050

LogDExp.
1.4 Âą 0.3
1.9 Âą 0.1
2.2 Âą 0.3
-0.9 Âą 0.1
-1.0 Âą 0.1
1.4 Âą 0.3
-1.7 Âą 0.4
-3.0 Âą 0.1
-1.5 Âą 0.4
-2.2 Âą 0.3
2.5 Âą 0.3
1.2 Âą 0.4
1.6 Âą 0.3
1.2 Âą 0.3
1.0 Âą 0.4
-2.6 Âą 0.1
-1.9 Âą 0.1
1.8 Âą 0.2
-1.5 Âą 0.1
-1.1 Âą 0.3
1.0 Âą 0.4
-2.1 Âą 0.2
0.2 Âą 0.3
-0.4 Âą 0.3
0.9 Âą 0.4
1.3 Âą 0.1
-3.2 Âą 0.6

LogPCalc.
-1.0
2.6
4.1
1.1
-0.9
3.5
-2.4
3.2
3.1
-3.7
0.9
5.9
0.1
0.7
1.9
-2.8
2.7
3.4
-4.3
0.7
0.9
-1.7
-1.0
0.2
0.8
1.9
-5.4

ID
055
056
058
059
060
061
063
065
067
068
069
070
071
072
074
075
080
081
082
083
084
085
086
088
090
092

LogDExp.
-1.5 Âą 0.1
-2.5 Âą 0.1
0.8 Âą 0.1
-1.3 Âą 0.3
-3.9 Âą 0.2
-1.5 Âą 0.1
-3.0 Âą 0.4
0.7 Âą 0.2
-1.3 Âą 0.3
1.4 Âą 0.3
-1.3 Âą 0.3
1.6 Âą 0.3
-0.1 Âą 0.5
0.6 Âą 0.3
-1.9 Âą 0.3
-2.8 Âą 0.3
-2.2 Âą 0.2
-2.2 Âą 0.3
2.5 Âą 0.4
-1.9 Âą 0.4
0.0 Âą 0.2
-2.2 Âą 0.4
0.7 Âą 0.2
-1.9 Âą 0.3
0.8 Âą 0.2
-0.4 Âą 0.3

LogPCalc.
-3.2
-1.6
2.2
-0.6
-1.7
-1.4
-6.0
-1.4
0.4
0.7
-0.1
4.9
-0.9
3.3
-6.3
-1.4
0.4
-5.3
6.4
-10.9
2.6
0.7
1.1
-1.0
1.0
-1.1

MSDa
0.5
MADb
1.9
a
The mean signed deviation of the predicted logP from the experimental logD.
b
The mean absolute deviation of the predicted logP from the experimental logD.

6.4

Conclusion
In this study, several DFT functionals were used to predict the partition coefficients in

cyclohexane and water of molecules in the SAMPL5 molecule set, a diverse set of molecules,
representing a variety of protonation states and tautomerization states. Using the SMD implicit
solvation model and a vertical solvation approach, the free energy of transfer was predicted,
121

resulting in a mean absolute deviation of 1.9 logP units from the experimental logD. The results
highlight that the performance of density functionals does not consistently overestimate or
underestimate the logP for some molecules in the SAMPL5 set. Functionals in the GH-GGA and
the RSH-GGA class perform similarly, whereas the performance of GH-mGGA does not provide
similar results as GH-GGA and RSH-GGA functionals. The results show that functionals that
include a larger percentages of exact exchange tend to predict logP values that overestimate the
hydrophilicity. For molecules that are similar to those studied in this work, using a GH-GGA
functional, such as B3PW91, in conjunction with cc-pVTZ can be used for predicting transfer free
energy of small organic molecules.
Moving forward, alternative strategies should be considered to try to improve the
predictions. The acid dissociation constants for many of the ionizable compounds are not known.
It is possible that these theoretical predictions could be improved by accounting for the multiple
ionization states as this would more accurately estimate the distribution.53 Although assumptions
were made regarding the protonation, tautomeric, and conformational states, the results highlight
the ability of hybrid DFT approaches and the SMD implicit solvent model for estimating logP. As
these approaches are able to predict close to experimental logD, the predicted logP underestimates
the energetic contributions related to the structural and environmental heterogeneity in solution
that is reflective of experiment. To attempt to further reduce the logP deviations from the
experimental logD, the performance of ab initio electronic correlation methods could be
considered for predicting the logP. While the use of these electronic structure methods would
increase the computational cost, these predictions may provide a stronger approach that allows for
systematic improvement.

122

REFERENCES

123

REFERENCES
(1)

Nieto-Draghi, C.; Fayet, G.; Creton, B.; Rozanska, X.; Rotureau, P.; de Hemptinne, J.-C.;
Ungerer, P.; Rousseau, B.; Adamo, C. A General Guidebook for the Theoretical
Prediction of Physicochemical Properties of Chemicals for Regulatory Purposes. Chem.
Rev. 2015, 115 (24), 13093â13164.

(2)

Le, T.; Epa, V. C.; Burden, F. R.; Winkler, D. A. Quantitative StructureâProperty
Relationship Modeling of Diverse Materials Properties. Chem. Rev. 2012, 112 (5), 2889â
2919.

(3)

Martin, Y. C. Letâs Not Forget Tautomers. J. Comput. Aided. Mol. Des. 2009, 23 (10),
693â704.

(4)

Cherkasov, A.; Muratov, E. N.; Fourches, D.; Varnek, A.; Baskin, I. I.; Cronin, M.;
Dearden, J.; Gramatica, P.; Martin, Y. C.; Todeschini, R.; et al. QSAR Modeling: Where
Have You Been? Where Are You Going To? J. Med. Chem. 2014, 57 (12), 4977â5010.

(5)

Gramatica, P.; Sangion, A. A Historical Excursus on the Statistical Validation Parameters
for QSAR Models: A Clarification Concerning Metrics and Terminology. J. Chem. Inf.
Model. 2016.

(6)

Barone, V.; Cossi, M. Quantum Calculation of Molecular Energies and Energy Gradients
in Solution by a Conductor Solvent Model. J. Phys. Chem. A 1998, 102 (11), 1995â2001.

(7)

Klamt, A.; SchĂźĂźrmann, G. COSMO: A New Approach to Dielectric Screening in
Solvents with Explicit Expressions for the Screening Energy and Its Gradient. J. Chem.
Soc., Perkin Trans. 2 1993, No. 5, 799â805.

(8)

Marenich, A. V.; Cramer, C. J.; Truhlar, D. G. Universal Solvation Model Based on
Solute Electron Density and on a Continuum Model of the Solvent Defined by the Bulk
Dielectric Constant and Atomic Surface Tensions. J. Phys. Chem. B 2009, 113 (18), 6378â
6396.

(9)

Riojas, A. G.; Wilson, A. K. Solv-ccCA: Implicit Solvation and the Correlation Consistent
Composite Approach for the Determination of pKa. J. Chem. Theory Comput. 2014, 10
(4), 1500â1510.

(10) KĂśnig, G.; Pickard IV, F. C.; Mei, Y.; Brooks, B. R. Predicting Hydration Free Energies
with a Hybrid QM/MM Approach: An Evaluation of Implicit and Explicit Solvation
Models in SAMPL4. J. Comput. Aided. Mol. Des. 2014, 28 (3), 245â257.
(11) KĂśnig, G.; Hudson, P. S.; Boresch, S.; Woodcock, H. L. Multiscale Free Energy
Simulations: An Efficient Method for Connecting Classical MD Simulations to QM or
QM/MM Free Energies Using Non-Boltzmann Bennett Reweighting Schemes. J. Chem.
Theory Comput. 2014, 10 (4), 1406â1419.
(12) Becke, A. D. Density-Functional Exchange-Energy Approximation with Correct
124

Asymptotic Behavior. Phys. Rev. A 1988, 38 (6), 3098â3100.
(13) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti Correlation-Energy
Formula into a Functional of the Electron Density. Phys. Rev. B. 1998, 37, 785â789.
(14) Stephens, P. J.; Devlin, F. J.; Chabalowski, C. F.; Frisch, M. J. Ab Initio Calculation of
Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force
Fields. J. Phys. Chem. 1994, 98 (45), 11623â11627.
(15) Jr, T. H. D. Gaussian Basis Sets for Use in Correlated Molecular Calculations. I. The
Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007â1023.
(16) Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular
Calculations. IV. Calculation of Static Electrical Response Properties. J. Chem. Phys.
1994, 100 (4), 2975.
(17) Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular
Calculations. III. The Atoms Aluminum through Argon. J. Chem. Phys. 1993, 98 (2),
1358.
(18) Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular
Calculations. V. Core-Valence Basis Sets for Boron through Neon. J. Chem. Phys. 1995,
103 (11), 4572.
(19) Dunning, T. H. J.; Peterson, K. A.; Wilson, A. K.; Jr, T. H. D.; Jr., T. H. D. Gaussian
Basis Sets for Use in Correlated Molecular Calculations. X. The Atoms Aluminum
through Argon Revisted. J. Chem. Phys. 2001, 114 (21), 9244â9253.
(20) Wang, N.; Wilson, A. Effects of Basis Set Choice upon the Atomization Energy of the
Second-Row Compounds SO2, CCl, and ClO2 for B3LYP and B3PW91. J. Phys. Chem.
A 2003, 107, 6720â6724.
(21) Wang, N. X.; Wilson, A. K. Density Functional Theory and the Correlation Consistent
Basis Sets: The Tight D Effect on HSO and HOS. J. Phys. Chem. A 2005, 109 (32), 7187â
7196.
(22) Prascher, B. P.; Wilson, A. K. The Behaviour of Density Functionals with Respect to
Basis Set. V. Recontraction of Correlation Consistent Basis Sets. Mol. Phys. 2007, 105
(19â22), 2899â2917.
(23) Wang, N. X.; Wilson, A. K. The Behavior of Density Functionals with Respect to Basis
Set. I. The Correlation Consistent Basis Sets. J. Chem. Phys. 2004, 121 (16), 7632â7646.
(24) Laury, M. L.; Boesch, S. E.; Haken, I.; Sinha, P.; Wheeler, R. A.; Wilson, A. K. Harmonic
Vibrational Frequencies: Scale Factors for Pure, Hybrid, Hybrid Meta, and Double-Hybrid
Functionals in Conjunction with Correlation Consistent Basis Sets. J. Comput. Chem.
2011, 32 (11), 2339â2347.

125

(25) Prascher, B. P.; Wilson, B. R.; Wilson, A. K. Behavior of Density Functionals with
Respect to Basis Set. VI. Truncation of the Correlation Consistent Basis Sets. J. Chem.
Phys. 2007, 127 (12), 124110.
(26) Wang, N. X.; Venkatesh, K.; Wilson, A. K. Behavior of Density Functionals with Respect
to Basis Set. 3. Basis Set Superposition Error â . J. Phys. Chem. A 2006, 110 (2), 779â784.
(27) Wang, N. X.; Wilson, A. K. Behaviour of Density Functionals with Respect to Basis Set:
II. Polarization Consistent Basis Sets. Mol. Phys. 2005, 103 (2â3), 345â358.
(28) Jiang, W.; Laury, M. L.; Powell, M.; Wilson, A. K. Comparative Study of Single and
Double Hybrid Density Functionals for the Prediction of 3d Transition Metal
Thermochemistry. J. Chem. Theory Comput. 2012, 8 (11), 4102â4111.
(29) Dunning, T. H.; Peterson, K. A.; Wilson, A. K. Gaussian Basis Sets for Use in Correlated
Molecular Calculations. X. The Atoms Aluminum through Argon Revisited. J. Chem.
Phys. 2001, 114 (21), 9244.
(30) Bryantsev, V. S.; Diallo, M. S.; van Duin, A. C. T.; Goddard, W. A. Evaluation of
B3LYP, X3LYP, and M06-Class Density Functionals for Predicting the Binding Energies
of Neutral, Protonated, and Deprotonated Water Clusters. J. Chem. Theory Comput. 2009,
5 (4), 1016â1026.
(31) Rayne, S.; Rayne, S.; Forest, K. Accuracy of Computational Solvation Free Energies for
Neutral and Ionic Compounds: Dependence on Level of Theory and Solvent Model. Nat.
Preced. 2010, 1â22.
(32) Kelly, C. P.; Cramer, C. J.; Truhlar, D. G. SM6: A Density Functional Theory Continuum
Solvation Model for Calculating Aqueous Solvation Free Energies of Neutrals, Ions, and
SoluteâWater Clusters. J. Chem. Theory Comput. 2005, 1 (6), 1133â1152.
(33) Kelly, C. P.; Cramer, C. J.; Truhlar, D. G. Aqueous Solvation Free Energies of Ions and
IonâWater Clusters Based on an Accurate Value for the Absolute Aqueous Solvation Free
Energy of the Proton. J. Phys. Chem. B 2006, 110 (32), 16066â16081.
(34) Takano, Y.; Houk, K. N. Benchmarking the Conductor-like Polarizable Continuum Model
(CPCM) for Aqueous Solvation Free Energies of Neutral and Ionic Organic Molecules. J.
Chem. Theory Comput. 2005, 1 (1), 70â77.
(35) Guthrie, J. P.; Povar, I. A Test of Various Computational Solvation Models on a Set of
âdifficultâ Organic Compounds. Can. J. Chem. 2009, 87 (8), 1154â1162.
(36) Tekarli, S. M.; Drummond, M. L.; Williams, T. G.; Cundari, T. R.; Wilson, A. K.
Performance of Density Functional Theory for 3d Transition Metal-Containing
Complexes: Utilization of the Correlation Consistent Basis Sets. J. Phys. Chem. A 2009,
113 (30), 8607â8614.
(37) Riley, K. E.; Opât Holt, B. T.; Merz, K. M. Critical Assessment of the Performance of
126

Density Functional Methods for Several Atomic and Molecular Properties. J. Chem.
Theory Comput. 2007, 3 (2), 407â433.
(38) Martell, J. M.; Goddard, J. D.; Eriksson, L. A. Assessment of Basis Set and Functional
Dependencies in Density Functional Theory: Studies of Atomization and Reaction
Energies. J. Phys. Chem. A 1997, 101 (10), 1927â1934.
(39) Cohen, A. J.; Mori-Sanchez, P.; Yang, W.; Mori-SĂĄnchez, P. Challenges for Density
Functional Theory. Chem. Rev. 2012, 112 (1), 289â320.
(40) Sousa, S. F.; Fernandes, P. A.; Ramos, M. J. General Performance of Density Functionals.
J. Phys. Chem. A 2007, 111 (42), 10439â10452.
(41) Hamprecht, F. A.; Cohen, A. J.; Tozer, D. J.; Handy, N. C. Development and Assessment
of New Exchange-Correlation Functionals. J. Chem. Phys. 1998, 109 (15), 6264â6271.
(42) Schmider, H. L.; Becke, A. D. Optimized Density Functionals from the Extended G2 Test
Set. J. Chem. Phys. 1998, 108 (23), 9624â9631.
(43) Becke, A. D. Density-Functional Thermochemistry. III. The Role of Exact Exchange. J.
Chem. Phys. 1993, 98, 5648â5652.
(44) Zhao, Y.; Schultz, N. E.; Truhlar, D. G. Exchange-Correlation Functional with Broad
Accuracy for Metallic and Nonmetallic Compounds, Kinetics, and Noncovalent
Interactions. J. Chem. Phys. 2005, 123 (16), 161103.
(45) Zhao, Y.; Schultz, N. E.; Truhlar, D. G. Design of Density Functionals by Combining the
Method of Constraint Satisfaction with Parametrization for Thermochemistry,
Thermochemical Kinetics, and Noncovalent Interactions. J. Chem. Theory Comput. 2006,
2 (2), 364â382.
(46) Zhao, Y.; Truhlar, D. G. The M06 Suite of Density Functionals for Main Group
Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States,
and Transition Elements: Two New Functionals and Systematic Testing of Four M06Class Functionals and 12 Other Function. Theor. Chem. Acc. 2008, 120 (1â3), 215â241.
(47) Zhao, Y.; Truhlar, D. G. Density Functional for Spectroscopy: No Long-Range SelfInteraction Error, Good Performance for Rydberg and Charge-Transfer States, and Better
Performance on Average than B3LYP for Ground States. J. Phys. Chem. A 2006, 110
(49), 13126â13130.
(48) Chai, J. Da; Head-Gordon, M. Systematic Optimization of Long-Range Corrected Hybrid
Density Functionals. J. Chem. Phys. 2008, 128 (8), 1â15.
(49) Chai, J.-D.; Head-Gordon, M. Long-Range Corrected Hybrid Density Functionals with
Damped Atomâatom Dispersion Corrections. Phys. Chem. Chem. Phys. 2008, 10 (44),
6615.

127

(50) Wheeler, S. E.; Houk, K. N. Integration Grid Errors for Meta-Gga-Predicted Reaction
Energies: Origin of Grid Errors for the M06 Suite of Functionals. J. Chem. Theory
Comput. 2010, 6 (2), 395â404.
(51) Palmer, D. S.; Mitchell, J. B. O. Is Experimental Data Quality the Limiting Factor in
Predicting the Aqueous Solubility of Druglike Molecules? Mol. Pharm. 2014, 11 (8),
2962â2972.
(52) Denis, P. A.; Ventura, O. N. Density Functional Investigation of Atmospheric Sulfur
Chemistry. I. Enthalpy of Formation of HSO and Related Molecules. Int. J. Quantum
Chem. 2000, 80 (3), 439â453.
(53) Kah, M.; Brown, C. D. LogD: Lipophilicity for Ionisable Compounds. Chemosphere
2008, 72 (10), 1401â1408.

128

CHAPTER 7 CONCLUDING REMARKS

Proteins are dynamic in nature and can have many conformational-dependent functional
states, including active, inactive, and intermediate states. These conformational states can
represent an array of structural features that distinguish the ability of the protein to bind other
molecules. Elucidating the transitions between the conformational states of protein complexes is
critical for effective drug discovery, design, and delivery methods, as it would allow direct
correlation between conformational-activity and structure-activity relationships. In this
dissertation, theoretical investigations were employed to examine structure-function relationships
of proteins and peptides. More specifically, investigations were carried out on oncogenic proteinkinases involved in the signaling of the NF-ÎşB pathway.
NF-ÎşB transcription factors are sequestered in the inactive state by IÎşB proteins and are
released upon activation of the canonical or the non-canonical pathway. The canonical pathway is
dependent upon the activation of IKKÎ˛ and the non-canonical pathway is dependent upon the
activation of NIK. In Chapter 3, molecular dynamics simulations of IKKÎ˛ were used to investigate
how conformational changes and protein-protein interactions within multimeric assemblies are
influenced by changes in the interacting subunits, as dimerization of IKKÎ˛ is critical for its
activation, although it is not required for its reactivity.1 An approach was designed to characterize
the changes in the inactive and active states by contrasting and quantifying the changes in proteinprotein association and dissociation upon post-translational modification.
Thermodynamic properties were monitored for simulation stability. Hydrogen-bonding
interactions, salt-bridges, and conformational changes were compared among the monomer, the
two different dimers and the tetrameric arrangement of IKKÎ˛ and suggest that the tetrameric
structure mediates a global stability for the enzyme. Additionally, the results highlight that
129

modeling a single monomer of the protein kinase is insufficient for capturing the associated
structure-activity relationships as the oligomeric assembly is significant for function.
In Chapter 4, molecular dynamics simulations and modeling techniques were employed to
differentiate the structural dynamics of the NF-kB Inducing Kinase (NIK) in its native and mutant
form, and in the absence and presence of salt concentration in efforts to probe whether changes in
the ionic environment stabilize or destabilize the NIK dimer. Analyses of structure-activity and
conformational-activity relationships indicate that the protein-protein interactions are sensitive to
changes in the ionic strength. Ligand binding pockets either compress or expand, affecting both
local and distal intermolecular interactions, yielding further insight into how changes within the
intracellular microenvironment affect molecular mechanisms and how small molecules may bind.
Ligand-binding sites in rational drug design are often proposed from static conformational
states from X-ray crystallography that neglect intrinsic dynamics induced by natural molecular
and physiological environments. As the environment of a molecular system directly impacts the
structure, it is reasonable that this impacts the activity. Although changes in the ionic concentration
are known to impact protein-ligand binding, the specific role and mechanism that assist and hinder
ligand binding remain elusive for many protein kinases. Additionally, cancer cells are unique
because they are highly expressed with features that aid the cell in maintaining its oncogenic
homeostatic conditions, including the expression of heat shock proteins, ion transporters, and other
complexes that regulate the intracellular environment (eg. pH and ionic strength). Using the
modeling schemes proposed in Chapters 3 and 4, future investigations should include investigating
related Serine/Threonine and Tyrosine protein kinases to better understand structure-activity and
conformational-activity relationships mediated by changes in the intracellular environment for

130

these protein kinases and how these changes may induce alternative mechanisms leading to
oncogenic activity.
Another aspect of this work includes the assessment of different theoretical methods for
the prediction of physical properties of peptides and drug molecules. In Chapter 5, ab initio and
density functional methods were used to characterize the selectivity of radical-mediated
fragmentation by evaluating site, conformation, and pathway trends of small trialanine peptides
resembling a Î˛-strand and a Î˛-turn.2 Comparisons of reaction enthalpies show that the diamide
pathway is more energetically favorable than the Îą-amidation pathway and that both pathways are
site and conformationally selective. An evaluation of electronic structure methods illustrates that
more approximate methods can provide insight on which fragmentation pathway is preferred but
are unable to discern site or conformational selectivity and that sophisticated methods that account
for electron correlation are needed for reactions of radicals to proteins. All results and conclusions
in Chapter 5 were based on gas-phase computations, whereas the target is at understanding larger
biomolecules. While these findings offer a fundamental insight into how proteins can respond to
reactive oxygen species, producing fragments with predictable terminal functional groups, this
study neglects solvent-induced conformational rearrangement and the impact of varying side
chains on the reaction pathways. These are of interest for further investigations because they will
allow for an enhanced understanding of protein fragmentation as well as the ability to predict the
structural features involved in such mechanisms, which is significant towards understanding
oxidative stress. As indicated in the results, additional methods should be considered in future
work including both quantum mechanics and molecular mechanics methods to account for
conformational rearrangements.

131

In Chapter 6, several density functional methods were employed to estimate the partition
coefficients of drug-like molecules from the transfer free energies in cyclohexane and water.3
Three classes of hybrid density functionals were assessed and the results highlight that functionals
with less parameterization outperformed those deemed as more sophisticated functions. The results
also highlight some of the inadequacies of the modeling schemes. Moving forward with first
principle predictions of physiochemical properties, partition coefficients are a measurement of
only the neutral forms of the solute, while distribution coefficients include all charged and neutral
tautomers of the solute distributing between two immiscible solvents. Future investigations should
include efforts at predicting the distribution coefficients by considering reactions of the of the
solute ion solution and by further assessing the performance of other ab initio and density
functional methods as well as solvation models.

132

REFERENCES

133

REFERENCES
(1)

Jones, M. R.; Liu, C.; Wilson, A. K. Molecular Dynamics Studies of the ProteinâProtein
Interactions in Inhibitor of ÎşB Kinase-Î˛. J. Chem. Inf. Model. 2014, 54 (2), 562â572.

(2)

Stringfellow, H. M.; Jones, M. R.; Green, M. C.; Wilson, A. K.; Francisco, J. S.
Selectivity in ROS-Induced Peptide Backbone Bond Cleavage. J. Phys. Chem. A 2014,
118 (48), 11399â11404.

(3)

Jones, M. R.; Brooks, B. R.; Wilson, A. K. Partition Coefficients for the SAMPL5
Challenge Using Transfer Free Energies. J. Comput. Aided. Mol. Des. 2016, 30 (11),
1129â1138.

134