water and biomolecules physical chemistry of life phenomena (biological and medical physics,...


Upload: arthur-coleman

Post on 01-Sep-2014


Category:

Science



DESCRIPTION

Biomolecules and water molecules simply represent “chemical substances” when each of them exists alone. However, we find various biological processes expressed when these substances function together. This book “Water and Biomolecules – Physical Chemistry of Life Phenomena” covers the physical chemistry of such biological processes, and deals with “folding”, “dynamics”, and “function” of biomolecules as they are expressed in close relation to water molecules. Protein misfolding and amyloidogenesis are also included, because these are closely related to protein folding and functional expression, and hence responsible for a number of human diseases.

TRANSCRIPT

Page 1: Water and biomolecules   physical chemistry of life phenomena (biological and medical physics, biomedical engineering)

Biological and Medical Physics, Biomedical Engineering

The fields of biological and medical physics and biomedical engineering are broad, multidisciplinary and dynamic. They lie at the crossroads of frontier research in physics, biology, chemistry, and medicine. The Biological and Medical Physics, Biomedical Engineering Series is intended to be comprehensive, covering a broad range of topics important to the study of the physical, chemical and biological sciences. Its goal is to provide scientists and engineers with textbooks, monographs, and reference works to address the growing need for information.

Books in the series emphasize established and emergent areas of science including molecular, membrane, and mathematical biophysics; photosynthetic energy harvesting and conversion; information processing; physical principles of genetics; sensory communications; automata networks, neural networks, and cellular automata. Equally important will be coverage of applied aspects of biological and medical physics and biomedical engineering such as molecular electronic components and devices, biosensors, medicine, imaging, physical principles of renewable energy production, advanced prostheses, and environmental control and engineering.

Editor-in-Chief:
Elias Greenbaum, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA

Editorial Board:
Masuo Aizawa, Department of Bioengineering, Tokyo Institute of Technology, Yokohama, Japan

Olaf S. Andersen, Department of Physiology, Biophysics & Molecular Medicine, Cornell University, New York, USA

Robert H. Austin, Department of Physics, Princeton University, Princeton, New Jersey, USA

James Barber, Department of Biochemistry, Imperial College of Science, Technology and Medicine, London, England

Howard C. Berg, Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA

Victor Bloomfield, Department of Biochemistry, University of Minnesota, St. Paul, Minnesota, USA

Robert Callender, Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, USA

Britton Chance, Department of Biochemistry/Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania, USA

Steven Chu, Lawrence Berkeley National Laboratory, Berkeley, California, USA

Louis J. DeFelice, Department of Pharmacology, Vanderbilt University, Nashville, Tennessee, USA

Johann Deisenhofer, Howard Hughes Medical Institute, The University of Texas, Dallas, Texas, USA

George Feher, Department of Physics, University of California, San Diego, La Jolla, California, USA

Hans Frauenfelder, Los Alamos National Laboratory, Los Alamos, New Mexico, USA

Ivar Giaever, Rensselaer Polytechnic Institute, Troy, New York, USA

Sol M. Gruner, Cornell University, Ithaca, New York, USA

Judith Herzfeld, Department of Chemistry, Brandeis University, Waltham, Massachusetts, USA

Mark S. Humayun, Doheny Eye Institute, Los Angeles, California, USA

Pierre Joliot, Institute de Biologie Physico-Chimique, Fondation Edmond de Rothschild, Paris, France

Lajos Keszthelyi, Institute of Biophysics, Hungarian Academy of Sciences, Szeged, Hungary

Robert S. Knox, Department of Physics and Astronomy, University of Rochester, Rochester, New York, USA

Aaron Lewis, Department of Applied Physics, Hebrew University, Jerusalem, Israel

Stuart M. Lindsay, Department of Physics and Astronomy, Arizona State University, Tempe, Arizona, USA

David Mauzerall, Rockefeller University, New York, New York, USA

Eugenie V. Mielczarek, Department of Physics and Astronomy, George Mason University, Fairfax, Virginia, USA

Markolf Niemz, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany

V. Adrian Parsegian, Physical Science Laboratory, National Institutes of Health, Bethesda, Maryland, USA

Linda S. Powers, University of Arizona, Tucson, Arizona, USA

Earl W. Prohofsky, Department of Physics, Purdue University, West Lafayette, Indiana, USA

Andrew Rubin, Department of Biophysics, Moscow State University, Moscow, Russia

Michael Seibert, National Renewable Energy Laboratory, Golden, Colorado, USA

David Thomas, Department of Biochemistry, University of Minnesota Medical School, Minneapolis, Minnesota, USA


Kunihiro Kuwajima
Yuji Goto
Fumio Hirata
Masahide Terazima
Mikio Kataoka
(Editors)

Water and Biomolecules
Physical Chemistry of Life Phenomena

With 125 Figures


Professor Kunihiro Kuwajima
National Institutes of Natural Sciences, Okazaki Institute for Integrative Bioscience
5-1 Higashiyama, Myodaiji, Okazaki 444-8787, Japan
E-mail: [email protected]

Professor Yuji Goto
Osaka University, Institute for Protein Research
3-2 Yamadaoka, Suita, Osaka 565-0871, Japan
E-mail: [email protected]

Professor Fumio Hirata
National Institutes of Natural Sciences, Institute for Molecular Science
Department for Theoretical and Computational Molecular Science
38 Nishigo-Naka, Myodaiji, Okazaki 444-8585, Japan
E-mail: [email protected]

Professor Masahide Terazima
Kyoto University, Graduate School of Science, Department of Chemistry
Oiwakecho, Kitashirakawa, Kyoto 606-8502, Japan
E-mail: [email protected]

Professor Mikio Kataoka
Nara Institute of Science and Technology, Graduate School of Materials Science
8916-6 Takayama, Ikoma, Nara 630-0192, Japan
E-mail: [email protected]

Biological and Medical Physics, Biomedical Engineering ISSN 1618-7210

ISBN 978-3-540-88786-7 e-ISBN 978-3-540-88787-4

Library of Congress Control Number: 2008944102

© Springer-Verlag Berlin Heidelberg 2009

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by SPI Publisher Services, Pondicherry
Cover design: eStudio Calamar Steinen

SPIN 12251513 57/3180/SPI
Printed on acid-free paper

9 8 7 6 5 4 3 2 1

springer.com


Preface

“Biomolecules”, including proteins, nucleic acids and saccharides, perform various biological activities in “water”. Biomolecules and water molecules simply represent “chemical substances” when each of them exists alone. However, we find various biological processes expressed when these substances function together. This book “Water and Biomolecules – Physical Chemistry of Life Phenomena” covers the physical chemistry of such biological processes, and deals with “folding”, “dynamics”, and “function” of biomolecules as they are expressed in close relation to water molecules. Protein misfolding and amyloidogenesis are also included, because these are closely related to protein folding and functional expression, and hence responsible for a number of human diseases.

This book is also related to our recent Research Project “Water and Biomolecules”, which was supported for five years by a Grant-in-Aid for the Scientific Research in Priority Areas from the Ministry of Education, Science, Culture, Sports and Technology (MEXT) of Japan, and concluded at the end of March of 2008. During the project period, we held an open workshop annually, at which we had several invited talks by expert researchers in our field, several oral activity reports from our project members, and poster presentations representing the activities of all the members of the project team. The last workshop was organized by Mikio Kataoka (Nara Institute of Science and Technology), and held in Nara, the oldest capital of Japan, on January 24 and 25, 2008. This book thus consists of 15 chapters, including seven chapters contributed by seven invited speakers (C.M. Dobson, H.J. Dyson, R.M. Levy, J.A. McCammon, C.A. Royer, C.M. Rao, and P.E. Wright) in the last workshop and eight chapters contributed by eight members (Y. Goto, F. Hirata, M. Kataoka, K. Kuwajima, Y. Okamoto, M. Sakurai, M. Terazima, and K. Yoshikawa) who were involved in our project.

The chapters are arranged thematically: Chaps. 1–5 describe experimental and simulation studies on the folding of biomolecules, Chaps. 6–12 are related to the dynamics and function of biomolecules, and Chaps. 13–15 deal with the amyloidogenesis of proteins.


In Chap. 1, Peter E. Wright and his colleagues describe recent advances in mapping transient long range interactions, which are directly implicated in kinetic folding pathways of apomyoglobin. They use NMR relaxation techniques to map out the apomyoglobin folding landscape. Chapter 3 by Takahiro Sakaue and Kenichi Yoshikawa gives an overview and recent developments in the higher-order structure transition between dispersed coil and condensed compact states in giant DNA molecules. The rich transition behaviors found in experiments are analyzed based on the statistical mechanical concept and are discussed in relation to biological significance. Chapters 4 and 5 deal with theoretical and computational studies of protein folding. Yuko Okamoto in Chap. 4 gives an excellent overview of generalized-ensemble algorithms for molecular simulations of protein folding, and Ron M. Levy and his colleagues in Chap. 5 describe studies using replica-exchange simulations to explore the complex binding and folding landscapes of proteins, particularly focusing on their recent work using simplified continuous and discrete representations of these landscapes. Kunihiro Kuwajima and colleagues in Chap. 2 also describe experimental and simulation studies of folding/unfolding of goat α-lactalbumin, and demonstrate the power of combination of experiments and simulations for studying the problems of protein folding.

In Chap. 6, H. Jane Dyson and her colleagues describe the structural properties and dynamics of sizable disordered proteins in solution characterized by spectroscopic methods such as NMR. The chapter thus deals with intrinsically disordered proteins, whose functional role in crucial areas such as transcriptional regulation, translation and cellular signal transduction has only recently been recognized. Chapter 7, by Mikio Kataoka and Hironari Kamikubo, describes studies on protein dynamics and the effect of hydration water on the dynamics using photoactive yellow protein as a model protein. Chapter 8 by Masahide Terazima describes studies of biological reactions using several new techniques developed by his group. The techniques can monitor spectrally silent dynamics in the time domain, using the pulsed laser induced transient grating and transient lens methods. Catherine A. Royer and Roland Winter in Chap. 9 describe pressure perturbation calorimetry, along with results from many previous densitometric and high pressure studies, to calculate quantitatively the specific volumes of a model protein, staphylococcal nuclease, in both the folded and unfolded states as a function of temperature. Minoru Sakurai in Chap. 12 describes studies on the biological functions of a non-reducing disaccharide, α,α-trehalose, as a substitute for water, and on their underlying mechanisms from the viewpoints of the thermodynamic, hydration and structural characteristics of this sugar. Chapters 10 and 11 deal with theoretical and computational studies of protein dynamics and functions. Fumio Hirata and his colleagues in Chap. 10 describe the application of the 3D-RISM theory, a statistical mechanics theory of molecular liquid, to characterization of proteins in aqueous solutions, particularly focusing on detection of water molecules and ions trapped in pores of proteins. J. Andrew McCammon in Chap. 11 gives an excellent overview of how computer simulations can be used quantitatively to interpret the behavior of proteins, including their binding of ligands.

In Chap. 13, Chris M. Dobson gives an overview and the conceptual basis of the problems of protein folding and misfolding. The misfolding can often give rise to serious cellular malfunctions that frequently lead to disease. He also describes the results of experiments designed to link the principles of misfolding and aggregation to the effects of such processes in model organisms such as Drosophila. Chapter 14 by Abhay Kumar Thakur and Ch. Mohan Rao describes the recent studies of their group on the possibility of UV exposure as a structural perturbant using mouse prion protein and other amyloidogenic proteins as model systems. Finally, Chap. 15 by Yuji Goto and his colleagues describes the results of recent studies of their group on the direct observation of nucleation and growth of amyloid fibrils using total internal reflection fluorescence microscopy combined with thioflavin and atomic force microscopy.

We thank all the contributors to this book for their time and effort in preparing the manuscripts, and particularly Chris M. Dobson (Cambridge) and Ron M. Levy (Rutgers), who were international advisors to our project, for their interest in the project and a number of very useful suggestions regarding the project. Thanks are also due to Claus E. Ascheron, Balamurugan Elumalai and Adelheid Duhm of Springer Science for their help in publishing this book.

Okazaki, January 2009

Kunihiro Kuwajima
Yuji Goto
Fumio Hirata
Mikio Kataoka
Masahide Terazima


Contents

1 Mapping Protein Folding Landscapes by NMR Relaxation
P.E. Wright, D.J. Felitsky, K. Sugase, and H.J. Dyson
1.1 NMR Techniques for Studying Protein Folding
1.2 The Apomyoglobin Folding Landscape
1.3 Structure of the Kinetic Molten Globule State
1.4 The Upper Reaches of the Folding Landscape
1.5 Paramagnetic Relaxation Probes: Spin Labeling of Apomyoglobin
1.6 Model for Transient Interactions
1.7 Information from Relaxation Dispersion Measurements
1.8 Folding of an Intrinsically Disordered Protein Upon Binding to a Target
References

2 Experimental and Simulation Studies of the Folding/Unfolding of Goat α-Lactalbumin
K. Kuwajima, T. Oroguchi, T. Nakamura, M. Ikeguchi, and A. Kidera
2.1 Introduction
2.2 Goat α-Lactalbumin
2.3 Differences Between the Unfolding Behaviors of Authentic and Recombinant Goat α-Lactalbumin
2.3.1 Experimental Studies
2.3.2 Simulation Studies
2.3.3 Conclusions
2.4 Folding/Unfolding Pathways of Goat α-Lactalbumin
2.4.1 Experimental Studies
2.4.2 Simulation Studies
2.4.3 Conclusions
2.5 Summary and Perspectives
References


3 Transition in the Higher-order Structure of DNA in Aqueous Solution
T. Sakaue and K. Yoshikawa
3.1 Introduction
3.2 Long DNA Molecules in Aqueous Solution
3.2.1 Primary, Secondary, and Higher-order Structures
3.2.2 DNA Condensation
3.2.3 Looking at Single DNA Molecules
3.3 Statistical Physics of Folding of a Long Polymer
3.3.1 Some Basis
3.3.2 Continuous Transition in Flexible Polymers: Coil-Globule Transition
3.3.3 Discontinuous Transition in Semiflexible Polymers
3.3.4 Instability Due to the Remanent Charge
3.4 Summary and Perspectives
3.4.1 Higher-order Structure and Genetic Activity
3.4.2 Toward Chromatin Structure
References

4 Generalized-Ensemble Algorithms for Studying Protein Folding
Y. Okamoto
4.1 Introduction
4.2 Generalized-Ensemble Algorithms
4.2.1 Multicanonical Algorithm
4.3 Multidimensional Extensions of Multicanonical Algorithm
4.3.1 Replica-Exchange Method
4.3.2 Multidimensional Extensions of Replica-Exchange Method
4.4 Examples of Simulation Results
4.5 Conclusions
References

5 Protein Folding and Binding: Effective Potentials, Replica Exchange Simulations, and Network Models
A.K. Felts, M. Andrec, E. Gallicchio, and R.M. Levy
5.1 Introduction
5.2 Methods
5.2.1 The OPLS-AA/AGBNP Effective Potential
5.2.2 Replica Exchange Molecular Dynamics
5.2.3 The Network Model of Protein Folding
5.2.4 Loop Prediction with Torsion Angle Sampling
5.3 Folding of Peptides
5.3.1 G-Peptide Folding
5.3.2 Folding of Other Small Peptides
5.3.3 Loop Prediction
5.4 Kinetic Model of the G-Peptide
5.4.1 The G-Peptide has Apparent Two-State Kinetics After a Small Temperature Jump Perturbation
5.4.2 The G-Peptide has an α-Helical Intermediate During Folding from Coil Conformations
5.4.3 A Molecular View of Kinetic Pathways
5.5 Ligand Conformational Equilibrium in a Cytochrome P450 Complex
5.5.1 Methodology
5.5.2 The Population of the Proximal State as a Function of Temperature
5.6 Simple Continuous and Discrete Models for Simulating Replica Exchange
5.6.1 Discrete Network Replica Exchange (NRE)
5.6.2 RE Simulations using MC on a Continuous Potential
5.7 Conclusion
References

6 Functional Unfolded Proteins: How, When, Where, and Why?
H.J. Dyson, S.-C. Sue, and P.E. Wright
6.1 What is a Functional Unfolded Protein?
6.2 Where do Functional Unfolded Proteins Occur?
6.3 How Are Functional Unfolded Proteins Studied?
6.4 NMR Spectra: Practical Considerations
6.5 Dynamic Complexes in CBP
6.6 Role of Flexibility in the Function of IκBα
References

7 Structure of the Photointermediate of Photoactive Yellow Protein and the Propagation Mechanism of Structural Change
M. Kataoka and H. Kamikubo
7.1 Solution X-ray Scattering
7.2 Photoactive Yellow Protein
7.3 Solution Structure Analysis of Photointermediate of PYP
7.3.1 High-Angle X-ray Scattering of PYP in the Dark and in the Light
7.3.2 Analysis of High Angle Scattering
7.4 Propagation Mechanism of the Structural Change
7.5 Summary
References


8 Time-Resolved Detection of Intermolecular Interaction of Photosensor Proteins
M. Terazima
8.1 Introduction
8.2 Principle
8.3 Diffusion Coefficient
8.4 Time-Resolved Detection of Interprotein Interactions
8.4.1 Protein–Protein Interaction of the Photoexcited Photoactive Yellow Protein
8.4.2 Photoinduced Dimerization of AppA
8.4.3 Photoinduced Dimerization and Dissociation of Phototropins
8.4.4 Diffusion Detection of Interprotein Interaction
References

9 Volumetric Properties of Proteins and the Role of Solvent in Conformational Dynamics
C.A. Royer and R. Winter
9.1 Introduction
9.2 Thermodynamics
9.3 Thermal Expansivity and ΔV
9.4 Conclusions
References

10 A Statistical Mechanics Theory of Molecular Recognition
T. Imai, N. Yoshida, A. Kovalenko, and F. Hirata
10.1 Introduction
10.2 Outline of the RISM and 3D-RISM Theories
10.3 Recognition of Water Molecules by Protein
10.4 Noble Gas Binding to Protein
10.5 Selective Ion-Binding by Protein
10.6 Pressure-Induced Structural Transition of Protein and Molecular Recognition
10.7 Perspective
References

11 Computational Studies of Protein Dynamics
J.A. McCammon
11.1 Introduction
11.2 Brief Survey of Protein Motions
11.3 Binding and Selectivity
11.4 Concerted Binding and Release
11.5 Molecular Clocks
References


12 Biological Functions of Trehalose as a Substitute for Water
M. Sakurai
12.1 Introduction
12.2 Hydration Property of Trehalose
12.2.1 Property of the Aqueous Solution of Trehalose
12.2.2 Atomic-Level Picture of Hydration of Trehalose
12.3 Solid-State Property of Trehalose
12.3.1 Polymorphism
12.3.2 Glassy State of Trehalose
12.4 Biological Roles of Trehalose
12.4.1 Possible Mechanisms of Anhydrobiosis
12.4.2 Strategy for Desiccation Tolerance in the Sleeping Chironomid
12.4.3 Other Biological Roles of Trehalose
12.5 Conclusion
References

13 Protein Misfolding Diseases and the Key Role Played by the Interactions of Polypeptides with Water
C.M. Dobson
13.1 Introduction
13.2 The Importance of Normal and Aberrant Protein Folding in Biology
13.3 Protein Aggregation and Amyloid Formation
13.4 Molecular Evolution and the Control of Protein Misfolding
13.5 Impaired Misfolding Control and the Onset of Disease
13.6 Probing Misfolding and Aggregation in Living Organisms
13.7 The Recent Proliferation of Misfolding Diseases and Prospects for Effective Therapies
13.8 Concluding Remarks
References

14 Effect of UV Light on Amyloidogenic Proteins: Nucleationand Fibril ExtensionA.K. Thakur and Ch. Mohan Rao . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26714.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26714.2 Amyloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268

14.2.1 Structural Perturbation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26814.2.2 Nucleation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27214.2.3 Fibril Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272

14.3 UV Light as a Potent Structural Perturbant . . . . . . . . . . . . . . . . . . . 27214.3.1 UV-Induced Aggregation of Prion Protein . . . . . . . . . . . . . . 27314.3.2 Prevention of UV-Induced Aggregation of Prion Protein . . 27414.3.3 UV Exposure Alters Conformation of Prion Protein . . . . . 27414.3.4 UV-Exposed Proteins Failed to Form Amyloid De Novo . . 277

14.3.5 Is Subcritical Concentration of UV-Exposed Protein Responsible for Failure to Form Amyloid Fibrils?
14.3.6 UV-Exposed Amyloidogenic Proteins Form Amyloid Upon Seeding
14.3.7 UV-Exposed Prion Protein Fibrils Show Altered Fibril Morphology
14.4 Discussion
References

15 Real-Time Observation of Amyloid Fibril Growth by Total Internal Reflection Fluorescence Microscopy
H. Yagi, T. Ban, and Y. Goto
15.1 Introduction
15.2 Total Internal Reflection Fluorescence Microscopy
15.3 Real-Time Observation of β2-m and Aβ Fibrils
15.4 Effects of Various Surfaces on the Growth of Aβ Fibrils
15.5 Spontaneous Formation of Aβ(1–40) Fibrils and Classification of Morphologies
15.6 Conclusion
References

Index

List of Contributors

Michael Andrec
Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, NJ 08854, USA

Tadato Ban
National Advanced Institute of Advanced Science and Technology, Midorigaoka 1-8-31, Ikeda, Osaka 563-8577, Japan

Christopher M. Dobson
Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK

H. Jane Dyson
Department of Molecular Biology MB2, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA

Daniel J. Felitsky
Department of Molecular Biology MB2, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA

Anthony K. Felts
Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, NJ 08854, USA

Emilio Gallicchio
Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, NJ 08854, USA

Yuji Goto
Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan

Fumio Hirata
Department of Theoretical and Computational Molecular Science, Institute for Molecular Science, National Institutes of Natural Sciences, Okazaki, Aichi, Japan

and

Department of Functional Molecular Science, School of Physical Sciences, Graduate University for Advanced Studies (SOKENDAI), 5-1 Higashiyama, Myodaiji, Okazaki, Aichi 444-8585, Japan

Mitsunori Ikeguchi
International Graduate School of Arts and Science, Yokohama City University, Tsurumi, Yokohama 230-0045, Japan

Takashi Imai
Computational Science Research Program, RIKEN, Wako, Saitama 351-0198, Japan

Hironari Kamikubo
Graduate School of Materials Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan

Mikio Kataoka
Graduate School of Materials Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan

Akinori Kidera
International Graduate School of Arts and Science, Yokohama City University, Tsurumi, Yokohama 230-0045, Japan

Andriy Kovalenko
National Institute for Nanotechnology, and Department of Mechanical Engineering, University of Alberta, Edmonton, Alberta T6G 2M9, Canada

Kunihiro Kuwajima
Okazaki Institute for Integrative Bioscience, National Institutes of Natural Sciences, 5-1 Higashiyama, Myodaiji, Okazaki, Aichi, Japan

and

Department of Functional Molecular Science, School of Physical Sciences, Graduate University for Advanced Studies (SOKENDAI), 5-1 Higashiyama, Myodaiji, Okazaki, Aichi 444-8787, Japan

Ronald M. Levy
Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, NJ 08854, USA

J. Andrew McCammon
Department of Chemistry and Biochemistry, Department of Pharmacology, Center for Theoretical Biological Physics, and Howard Hughes Medical Institute, University of California at San Diego, La Jolla, CA 92093-0365, USA

Takashi Nakamura
Okazaki Institute for Integrative Bioscience, National Institutes of Natural Sciences, 5-1 Higashiyama, Myodaiji, Okazaki, Aichi 444-8787, Japan

Yuko Okamoto
Department of Physics, Nagoya University, Nagoya, Aichi 464-8602, Japan

Tomotaka Oroguchi
International Graduate School of Arts and Science, Yokohama City University, Tsurumi, Yokohama 230-0045, Japan

Ch. Mohan Rao
Centre for Cellular and Molecular Biology, Council of Scientific and Industrial Research, Hyderabad 500 007, India

Catherine A. Royer
INSERM U554, CNRS UMR 5048, 29 rue de Navacelles, 34090 Montpellier Cedex, France

Takahiro Sakaue
Fukui Institute for Fundamental Chemistry, Kyoto University, Kyoto 606-8103, Japan

Minoru Sakurai
Center for Biological Resources and Informatics, Tokyo Institute of Technology, B-62 Nagatsuta-cho, Midori-ku, Yokohama, Japan

Shih-Che Sue
Department of Molecular Biology MB2, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA

Kenji Sugase
Department of Molecular Biology MB2, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA

Masahide Terazima
Department of Chemistry, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan

Abhay Kumar Thakur
Centre for Cellular and Molecular Biology, Council of Scientific and Industrial Research, Hyderabad 500 007, India

Roland Winter
Department of Chemistry, Physical Chemistry I – Biophysical Chemistry, Dortmund University of Technology, Otto-Hahn Str. 6, D-44227 Dortmund, Germany

Peter E. Wright
Department of Molecular Biology MB2, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA

Hisashi Yagi
Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan

Norio Yoshida
Department of Theoretical and Computational Molecular Science, Institute for Molecular Science, Okazaki, Aichi, Japan

Kenichi Yoshikawa
Department of Physics, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan

1

Mapping Protein Folding Landscapes by NMR Relaxation

P.E. Wright, D.J. Felitsky, K. Sugase, and H.J. Dyson

Abstract. The process of protein folding provides an excellent example of the interactions of water with biomolecules. The changes in the water–protein interactions along the protein folding pathway provide an important impetus for the formation of the final natively folded structure of the protein. NMR spectroscopy provides unique insights into the dynamic protein folding process, and during the past 20 years we have seen the development of a wide range of NMR techniques to probe the kinetic and thermodynamic aspects of protein folding. In particular, with the advent of high-field spectrometers and stable isotope labeling techniques, the structure and dynamics of a wide range of disordered and partly ordered proteins at equilibrium have been characterized by NMR. Efforts in our laboratory over a number of years have allowed the sequence-specific identification of sites of local hydrophobic collapse, as well as secondary structure formation and transient long-range interactions in several protein systems, most notably for apomyoglobin, which will be highlighted in this article.

1.1 NMR Techniques for Studying Protein Folding

Kinetic folding pathways for proteins that fold on a millisecond timescale can be probed using hydrogen exchange pulse labeling [1,2], where differential protection of amide protons at various points during folding is detected by NMR. More recently, with the advent of high-field spectrometers and ¹³C, ¹⁵N, and ²H labeling techniques, the structure and dynamics of disordered and partly ordered proteins at equilibrium have been characterized by NMR. The upper reaches of the protein folding landscape can be mapped using chemical shift, nuclear Overhauser effect (NOE), spin labeling, relaxation data, and residual dipolar coupling measurements (reviewed in [3]). Efforts in our laboratory over a number of years have allowed the sequence-specific identification of sites of local hydrophobic collapse, secondary structure formation, and transient long-range interactions in several protein systems, most notably in apomyoglobin.

1.2 The Apomyoglobin Folding Landscape

Apomyoglobin, the heme-free version of the muscle protein myoglobin, contains eight helices folded into the canonical globin fold. The kinetic folding pathway, elucidated by hydrogen exchange pulse labeling [4,5], shows the rapid formation of an intermediate species containing the A, B, G, and H helices, which is followed by the slower (∼ms) folding of the remainder of the protein. The equilibrium folding landscape for apomyoglobin is typical for a single-domain protein. In the presence of high concentrations of urea, the protein is completely unfolded [6], and populates an ensemble of structures with little detectable propensity for structure formation. In the acid-unfolded state at pH 2, the protein is largely unfolded, but samples transient secondary structure and hydrophobic clusters in certain parts of the protein but not in others [7]. Equilibrium intermediates corresponding to the ABGH kinetic intermediate are formed at intermediate pHs in the absence of urea. These species, termed molten globules, contain relatively stable helical secondary structure, but fluid tertiary structure. Resonances of the F helix of folded apomyoglobin at pH 6 are invisible because of an exchange on an intermediate timescale between two or more structures with different chemical shifts [8].

1.3 Structure of the Kinetic Molten Globule State

All globins so far studied pass through a kinetic molten globule intermediate that contains some but not all of the helices. The particular helices that are present in the kinetic intermediate vary according to the amino acid sequence; for example, the intermediate in the folding of the monomeric plant hemoglobin apoleghemoglobin contains the E, G, and H helices instead of the A, B, G, and H helices of apomyoglobin. An extensive series of kinetic and equilibrium folding studies on mutants of apomyoglobin [9–11] has identified a non-native structure that slows down folding and allows the intermediate to be detected. This is illustrated in Fig. 1.1, which shows the proton occupancies in the molten globule intermediate of apomyoglobin mapped onto the structure of the native, fully folded protein. The most highly protected areas in the intermediate, which likely correspond to the coalesced portion of the polypeptide, do not correspond to contiguous regions in the fully folded protein. Instead, the H helix appears to be translocated in the intermediate by about one helical turn. We conclude that the transition state for folding thus involves resolution of this small area of non-native structure before the final native contacts can be made.

1.4 The Upper Reaches of the Folding Landscape

One of the strengths of NMR is that it can give per-residue structural information on ensembles of molecules that may contain different local structures. An example of this is the acid-unfolded state of apomyoglobin. Chemical shift data show that there is a detectable propensity for helical backbone dihedral angles in the regions of the protein that correspond to the H and A helices in the native folded state. Relaxation data [7] and spin-labeling studies [14] show the presence of transient native-like long-range interactions between the A and G helix regions in acid-unfolded apoMb. That these transient interactions are native-like and nonrandom must be a consequence of the amino acid sequence alone, and a series of mutant studies of apomyoglobin [9–11] showed that the propensity for local and transient long-range ordering in acid-unfolded apomyoglobin could be correlated with the property “average area buried upon folding” (AABUF) [15] or the modified hydrophobic effect [16]. In addition, the proton occupancy in the kinetic intermediate also correlates with the AABUF, and changing the local AABUF by designed point mutations also changes the pattern of proton occupancy in the kinetic intermediate [10] (Fig. 1.2). These experiments showed conclusively that the local regions with high AABUF adopt stable structure early in the protein folding process. We next turned to the question of the means whereby the hydrophobic clusters, sometimes separated by long intermediate stretches of the unfolded polypeptide, can interact, and to the hierarchy of folding events. These questions are addressed by using paramagnetic relaxation enhancement (PRE) (spin labels) and ¹⁵N R2 relaxation dispersion.

Fig. 1.1. Model of the apomyoglobin kinetic folding intermediate based on hydrogen exchange pulse labeling and mutagenesis data. The proton occupancies are mapped onto the structure of the holomyoglobin [12]. The degree of amide proton exchange protection is indicated by the intensity of the gray shading and the thickness of the backbone. The most protected regions are indicated by the darkest shade and the thickest backbone. The figure was prepared using MolMol [13]

Fig. 1.2. Correlation between proton occupancies in the kinetic burst phase intermediate (black circles) and average area buried upon folding (AABUF, gray lines) for wild-type apomyoglobin and for a quadruple mutant (Leu11Gly, Trp14Gly, Ala71Leu, Gly73Trp; termed the GGLW mutant). Reproduced with permission from [10]

1.5 Paramagnetic Relaxation Probes: Spin Labeling of Apomyoglobin

The incorporation of a paramagnetic spin label results in broadening of the NMR resonances of nuclei within 15–20 Å from the site of spin labeling. This makes spin labels powerful probes of conformational ensembles. A preliminary spin-label study of apomyoglobin [14] showed that the transient contacts that occur at equilibrium in acid-unfolded apoMb are sequence specific and region specific. Resonances are broadened in the immediate vicinity of the spin label, but for some spin label sites, such as E18 (Fig. 1.3), broadening is observed at long range in the G and H helix regions, while for a spin-label site in the E helix, no such long-range broadening is observed. We have recently undertaken a comprehensive spin-label study of apomyoglobin using the data to derive a model that gives rise to a quantitative evaluation of the population of various transient collapsed states [17].

1.6 Model for Transient Interactions

Fig. 1.3. Paramagnetic relaxation enhancement profiles for apomyoglobin unfolded at pH 2.3 in the presence (left panels) and absence (right panels) of 8 M urea. Data for spin labels attached at residues 18 and 77 are shown. The plots show the ratio of HSQC cross-peak intensity with the spin label oxidized (paramagnetic) and reduced (diamagnetic) as a function of residue number. The solid lines in the left panels represent the broadening profile expected for a random coil polypeptide. The figure is adapted from data reported in [14]. The positions of the helices in holomyoglobin are shown by the bars at the top of the figure

For unfolded and partly folded states, the spin label reports on parts of the polypeptide chain that are in transient contact with the segment bearing the spin label. The extent of relaxation enhancement (line broadening) depends on both the distance to the paramagnetic spin label and the lifetime of the interaction. When the chain conformers rapidly interconvert, as is the case in unfolded apomyoglobin, the relaxation enhancement becomes a weighted average over all members of the ensemble:

$$R_{2P} = \sum_i \frac{K_i\, p_i}{r_i^{6}},$$

where $p_i$ is the fractional population of state $i$, $r_i$ is the distance between the backbone amide proton which gives rise to the NMR cross-peak and the spin label, and $K_i$ is a proportionality constant which depends on both the gyromagnetic ratio of the nucleus under investigation and the correlation time for the electron–nuclear dipole–dipole interaction. The magnitude of $K_i$ is such that even very small populations (<∼1%) can contribute measurably to the overall relaxation rate.
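The weighting by the inverse sixth power of distance is what lets a sparsely populated compact state dominate the observed rate. The following minimal sketch illustrates this with a hypothetical two-state ensemble. The populations, the distances (in Å), and the use of a single constant `k` for all states are assumptions made for illustration only; they are not values from the chapter, where $K_i$ can differ between states.

```python
def pre_rate(states, k=1.0):
    """Ensemble-averaged PRE rate: sum of k * p_i / r_i**6.

    `states` is a list of (population, distance) pairs. A single
    proportionality constant k (arbitrary units) is assumed for
    every state, which is a simplification.
    """
    return sum(k * p / r**6 for p, r in states)


# Hypothetical ensemble: 99% extended (40 A from the label),
# 1% compact (10 A from the label).
ensemble = [(0.99, 40.0), (0.01, 10.0)]

total = pre_rate(ensemble)
compact_share = pre_rate([ensemble[1]]) / total

print(f"share of PRE from the 1% compact state: {compact_share:.2f}")
```

Under these assumed numbers the 1% compact state accounts for nearly all of the relaxation enhancement, which is why PRE is sensitive to transient contacts.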

Detailed PRE measurements on a total of 14 spin-label sites distributed throughout the molecule confirm that transient long-range interactions in the acid-unfolded apomyoglobin chain are restricted to the five regions corresponding to the AABUF maxima in the primary sequence; these are designated as regions A, B, C, G, and H (corresponding to the high-AABUF sequences in the A and B helices, the CD loop, and the G and H helices, respectively). The localization of interaction sites to distinct segments of the chain suggests that the paramagnetic relaxation may be modeled via a chemical kinetics approach whereby these five sites transiently associate in various combinations. The unfolded ensemble can thereby be divided into 52 macrostates, which correspond to the 51 different possible combinatorial arrangements of the five interaction sites and the completely dissociated substate. For example, A can combine with G and H in one cluster, while B and C are interacting in a second cluster to form the macrostate AGH–BC.
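The figure of 52 macrostates is the number of ways to partition five labeled sites into clusters (the fifth Bell number). The short sketch below enumerates these partitions to make the count concrete. The function names and the hyphenated label format are illustrative choices, not part of the published model.

```python
SITES = ["A", "B", "C", "G", "H"]


def set_partitions(items):
    """Yield every partition of `items` as a list of clusters."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # `first` forms a cluster on its own ...
        yield [[first]] + partition
        # ... or joins one of the existing clusters.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def label(partition):
    """Name a macrostate, e.g. 'AGH-BC', with clusters in site order."""
    clusters = sorted(partition, key=lambda c: SITES.index(c[0]))
    return "-".join("".join(cluster) for cluster in clusters)


macrostates = list(set_partitions(SITES))
labels = [label(p) for p in macrostates]

print(len(labels))                      # total number of macrostates
print(len(labels) - 1)                  # states with at least one contact
print("AGH-BC" in labels, "A-B-C-G-H" in labels)
```

The fully dissociated state appears as `A-B-C-G-H`; the remaining 51 partitions each contain at least one cluster of two or more sites.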

The long-range intramolecular contacts in a given macrostate act as a set of topological restraints which reduce the chain’s configurational entropy. For nonspecific, transient interactions, the relative entropy loss for the formation of different contacts (loops) may well be expected to dominate the thermodynamics and thus determine the relative populations of the different macrostates. This entropy loss depends directly on the length of the intervening chain segment(s) and relates to the distance distribution function P(r) between two noninteracting sites separated by the same linker length. More specifically, the entropy loss is defined by the fraction of the distribution in which the interaction site centers are close enough for the two regions to coalesce; this distance is typically estimated from the sum of the radii of two spheres with volumes equivalent to the total van der Waals volumes of all residues within each interaction site. The required distance distributions can be extracted from the paramagnetic relaxation enhancement of spin labels attached centrally in the chain (such as at positions 57 or 77) where no long-range interactions occur, utilizing radius of gyration information to help determine the long-range tails of the distributions.
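To make the dependence of the entropy loss on linker length concrete, the sketch below substitutes an ideal Gaussian chain for the experimentally derived P(r). This is an assumption made purely for illustration: the chapter extracts P(r) from PRE and radius of gyration data, not from a Gaussian model. The segment length `b` (3.8 Å) and the contact distance (10 Å) are likewise hypothetical values.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def contact_fraction(n_links, r_contact, b=3.8):
    """Fraction of a Gaussian-chain P(r) lying within r_contact.

    For an ideal chain of n_links segments of length b, the
    end-to-end distance distribution integrates to
    erf(x) - (2/sqrt(pi)) * x * exp(-x**2),
    with x = r_contact * sqrt(3 / (2 * n_links * b**2)).
    """
    x = r_contact * math.sqrt(3.0 / (2.0 * n_links * b**2))
    return math.erf(x) - 2.0 / math.sqrt(math.pi) * x * math.exp(-x**2)


def loop_entropy(n_links, r_contact, b=3.8):
    """Entropy change on loop closure, R * ln(fraction); always negative."""
    return R_GAS * math.log(contact_fraction(n_links, r_contact, b))


for n in (20, 50, 100):
    f = contact_fraction(n, 10.0)
    s = loop_entropy(n, 10.0)
    print(f"linker {n:3d} residues: fraction {f:.3f}, dS {s:6.1f} J/(mol K)")
```

In this simplified picture the fraction of conformers within contact distance falls as the linker lengthens, so closing a longer loop costs more entropy, which is the qualitative trend the model relies on.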

A model in which the relative stabilities of various clusters (and macrostates) are determined solely by the entropic barriers to loop closure discussed above can be fitted to the experimental paramagnetic relaxation data, with a single parameter reflecting the mean favorable free energy of interaction required to overcome the entropic penalty for contact formation. The model fits the experimental data surprisingly well (pale gray lines in Fig. 1.4), but not perfectly, as it cannot explain the experimentally observed preference for the C-terminus (G/H regions) to interact with A/B over C. (This latter region should interact more strongly solely on the basis of loop entropy considerations.) When this fact is accounted for (by modeling in weaker pairwise interaction free energies of the C region with other interaction sites), the loop entropy model gives an excellent fit to the PRE data for the 12 spin-label sites that show long-range interactions, as shown via the fit in Fig. 1.4 (dark gray lines). Spin-label data obtained in the presence of 8 M urea at pH 2.3 could also be well fitted by this model; the only contacts persisting under these more destabilizing conditions are relatively short range (AB and GH).

The most highly populated macrostates in the pH 2.3 ensemble in the absence of urea include, as expected, the species with no association (A–B–C–G–H). What is perhaps more surprising is that this completely unfolded substate is a minority (30%); most ensemble members have one or more long-range interactions. The most populated (17–40% population) interactions involve some of the smallest loop closure events (e.g., AB, GH, and BC) and are independent of whether the interaction is native (GH, BC) or non-native (AB). The former of these observations rationalizes the relatively modest reduction (∼15%) in radius of gyration relative to more completely (denaturant) unfolded ensembles despite the presence of long-range interactions in a majority of ensemble members.

Fig. 1.4. Paramagnetic relaxation enhancement profiles for apomyoglobin unfolded at pH 2.3 with spin labels at the positions indicated by the arrows. The fitted curves show the initial fits (pale gray lines) and the fits obtained after correction for differences in pairwise cluster interaction free energies (dark gray lines). The locations of hydrophobic clusters, defined from regions of high AABUF, are indicated by bars at the top of the figure. Reproduced with permission from [17]

Transient interactions also occur between the N- and C-termini of the protein. The populations of these contacts are quite small (less than 4%), consistent with the greater reduction in entropy required to close loops involving the extended intervening linkers. The model predicts multiple species involving different combinations of A, B, G, and H, all of similar stability. Because of its stochastic basis, however, the model is likely to underestimate cooperativity; the results are not incompatible with a single ABGH cluster. Strikingly, spin labels that probe this interaction induce quite different extents of nonlocal line broadening. A spin label attached at position R139 (in the H region) induces much more N-terminal (A/B) line broadening than one attached at position K140. Similarly, spin labels at positions 11, 15, and 18 in the A region all induce different extents of line broadening in G and H. This differential line broadening must reflect the relative orientations/positions of the side chains within the cluster, indicating a significant degree of specificity of interaction. In contrast to this observation, spin labels that probe interactions restricted to a single chain terminus all enhance relaxation to similar degrees, suggesting that more localized interactions are significantly more heterogeneous and less specific.

1.7 Information from Relaxation Dispersion Measurements

Relaxation dispersion measurements are applied to systems that are undergoing an exchange process on the microsecond to millisecond timescale. Measurement of the R2 relaxation rate in a series of experiments where the pulsing frequency is varied results in additional intensity for resonances of nuclei involved in the exchange process [18,19]. The resulting dispersion curve, showing $R_2^{\mathrm{eff}}$ as a function of $1/\tau_{CP}$, can be fitted to functions such as [19,20]

$$R_2^{\mathrm{eff}} = R_2^{0} + \frac{1}{2}\left\{ k_{AB} + k_{BA} - \frac{1}{\tau_{CP}} \cosh^{-1}\left[ D_{+}\cosh(\eta_{+}) - D_{-}\cos(\eta_{-}) \right] \right\}$$

$$D_{\pm} = \frac{1}{2}\left[ \pm 1 + \frac{\psi + 2\Delta\omega^{2}}{\sqrt{\psi^{2} + \xi^{2}}} \right]$$

$$\eta_{\pm} = \tau_{CP}\sqrt{\frac{1}{2}\left( \pm\psi + \sqrt{\psi^{2} + \xi^{2}} \right)}$$

$$\psi = (k_{AB} + k_{BA})^{2} - \Delta\omega^{2}$$

$$\xi = 2\Delta\omega\,(k_{AB} - k_{BA})$$

The parameters derived from these fits give information on the relative population of the two states $p_A$ and $p_B$, on the rate of exchange $k_{ex}$ ($= k_{AB} + k_{BA}$) between the two states, and on the structure of the excited state, which is given by the chemical shift difference $\Delta\omega$.

1.8 Folding of an Intrinsically Disordered Protein Upon Binding to a Target

Coupled folding and binding is a frequent theme in the field of intrinsically disordered proteins (see Chap. 6). One of the earliest examples of this phenomenon was the interaction of the phosphorylated kinase-inducible domain (pKID) of the transcription factor CREB with the KIX domain of the transcriptional coactivator CBP. Free pKID is unfolded in solution [21], but folds into an orthogonal pair of helices, αA and αB, upon binding to the folded KIX domain (Fig. 1.5) [23]. We have recently posed the question, what is the mechanism by which the folding of the disordered pKID is coupled with binding to the KIX domain? Recent NMR studies including relaxation dispersion have provided intriguing insights into this question.

Fig. 1.5. Coupled folding and binding during the interaction of the phosphorylated kinase inducible activation domain of the transcription factor CREB (termed pKID) with the globular KIX domain of the CREB binding protein (CBP). The free pKID domain is intrinsically disordered (represented as the unfolded chain on the left), and in the absence of the binding partner it populates an ensemble of conformations. Upon binding to the globular KIX domain (shown as gray surface), it folds into a pair of orthogonal helices (dark gray backbone trace). Reproduced with permission from [22]

Fig. 1.6. ¹⁵N R2 relaxation dispersion profile for Arg124 of pKID recorded at 800 MHz (filled circles) and 500 MHz (open circles). Dispersion curves for 1 mM [¹⁵N]-pKID in the presence of 0.95, 1.00, 1.05, and 1.10 mM KIX are shown

Coupled folding and binding of pKID to KIX was studied by recording a series of HSQC titrations and by ¹⁵N R2 dispersion measurements performed using ¹⁵N-labeled pKID at two magnetic fields and over a range of pKID:KIX concentration ratios [24] (Fig. 1.6). The HSQC titrations show that at least two processes occur as KIX is titrated into pKID. Both fast (low-affinity) and slow (high-affinity) exchange processes are observed. The NMR data can be fitted to a pseudo four-site exchange model, which gives important new insights into the mechanism of coupled folding and binding in this system:

$$\underset{\text{Free}}{\mathrm{pKID}} + \mathrm{KIX} \;\leftrightarrow\; \underset{\text{Encounter complex}}{\mathrm{pKID}\cdots\mathrm{KIX}} \;\leftrightarrow\; \underset{\text{Intermediate}}{\mathrm{pKID{:}KIX}^{*}} \;\leftrightarrow\; \underset{\text{Bound}}{\mathrm{pKID{:}KIX}}$$

The encounter complex represents an ensemble in which nonspecific hydrophobic interactions occur at a number of sites. The primary interactions in the encounter complex involve a hydrophobic cluster (Y134, I137, L138, and L141) in the unfolded αB region of pKID contacting hydrophobic patches on KIX. The encounter complex was invoked to reconcile the behavior of the cross-peaks in the HSQC titrations with the Δω values obtained from the relaxation dispersion measurements: a better correlation is observed between the Δω values and equilibrium chemical shift differences Δδ which utilize the encounter complex (Fig. 1.7). The structure of the binding intermediate can also be inferred from the chemical shift and relaxation data. The αA helix is nearly fully folded in the intermediate, whereas the αB helix is only partially folded.

In summary, our NMR measurements show that the coupled folding and binding landscape of pKID is complex: disordered pKID first makes transient hydrophobic contacts with KIX, forming an ensemble of encounter complexes that evolve to folded states without dissociation of pKID from the KIX surface. As for the folding of apomyoglobin, the most important interaction in the initiation of coupled folding and binding of pKID is the formation of hydrophobic interactions, which can then play a key role in directing the folding process towards the final folded state.

Fig. 1.7. Correlation of ¹⁵N chemical shift differences (Δω*) determined from the R2 dispersion measurements with equilibrium shift differences. Chemical shift differences between free pKID and the fully bound state (ΔδFB) are shown as black squares, and between the encounter complex and fully bound state (ΔδEB) are shown as gray circles, with matching shades for the lines of best fit. Reproduced with permission from [24]

References

1. J.B. Udgaonkar, R.L. Baldwin, Nature 335, 694–699 (1988)
2. H. Roder, G.A. Elove, S.W. Englander, Nature 335, 700–704 (1988)
3. H.J. Dyson, P.E. Wright, Chem. Rev. 104, 3607–3622 (2004)
4. P.A. Jennings, P.E. Wright, Science 262, 892–896 (1993)
5. C. Nishimura, H.J. Dyson, P.E. Wright, J. Mol. Biol. 322, 483–489 (2002)
6. S. Schwarzinger, P.E. Wright, H.J. Dyson, Biochemistry 41, 12681–12686 (2002)
7. J. Yao, J. Chung, D. Eliezer, P.E. Wright, H.J. Dyson, Biochemistry 40, 3561–3571 (2001)
8. D. Eliezer, P.E. Wright, J. Mol. Biol. 263, 531–538 (1996)
9. C. Nishimura, P.E. Wright, H.J. Dyson, J. Mol. Biol. 334, 293–307 (2003)
10. C. Nishimura, M.A. Lietzow, H.J. Dyson, P.E. Wright, J. Mol. Biol. 351, 383–392 (2005)
11. C. Nishimura, H.J. Dyson, P.E. Wright, J. Mol. Biol. 355, 139–156 (2006)
12. J. Kuriyan, S. Wilz, M. Karplus, G.A. Petsko, J. Mol. Biol. 192, 133–154 (1986)
13. R. Koradi, M. Billeter, K. Wuthrich, J. Mol. Graphics 14, 51–55 (1996)
14. M.A. Lietzow, M. Jamin, H.J. Dyson, P.E. Wright, J. Mol. Biol. 322, 655–662 (2002)
15. G.D. Rose, A.R. Geselowitz, G.J. Lesser, R.H. Lee, M.H. Zehfus, Science 229, 834–838 (1985)
16. H.J. Dyson, P.E. Wright, H.A. Scheraga, Proc. Natl. Acad. Sci. U.S.A. 103, 13057–13061 (2006)
17. D.J. Felitsky, M.A. Lietzow, H.J. Dyson, P.E. Wright, Proc. Natl. Acad. Sci. U.S.A. 105, 6278–6283 (2008)
18. J.P. Loria, M. Rance, A.G. Palmer, J. Am. Chem. Soc. 121, 2331–2332 (1999)
19. M. Tollinger, N.R. Skrynnikov, F.A. Mulder, J.D. Forman-Kay, L.E. Kay, J. Am. Chem. Soc. 123, 11341–11352 (2001)
20. D.G. Davis, M.E. Perlman, R.E. London, J. Magn. Reson. B 104, 266–275 (1994)
21. I. Radhakrishnan, G.C. Perez-Alvarado, H.J. Dyson, P.E. Wright, FEBS Lett. 430, 317–322 (1998)
22. H.J. Dyson, P.E. Wright, Nat. Rev. Mol. Cell Biol. 6, 197–208 (2005)
23. I. Radhakrishnan, G.C. Perez-Alvarado, D. Parker, H.J. Dyson, M.R. Montminy, P.E. Wright, Cell 91, 741–752 (1997)
24. K. Sugase, H.J. Dyson, P.E. Wright, Nature 447, 1021–1025 (2007)


2 Experimental and Simulation Studies of the Folding/Unfolding of Goat α-Lactalbumin

K. Kuwajima, T. Oroguchi, T. Nakamura, M. Ikeguchi, and A. Kidera

Abstract. We studied (1) the unfolding behavior of the authentic and recombinant forms of goat α-lactalbumin and (2) the structure of the transition state of folding/unfolding of the protein, both experimentally and by molecular dynamics simulation. Experimentally, the recombinant protein exhibited remarkable destabilization and unfolding-rate acceleration as compared to the authentic protein; these differences were caused by the presence of an extra N-terminal methionine residue in the recombinant form. We also characterized the transition-state structure by mutational Φ-value analysis, based on which the structure was localized in a region containing the C-helix and the Ca²⁺-binding site of the protein. Molecular dynamics simulation of unfolding at high temperatures (398 and 498 K) yielded good reproduction of the experimental observations and gave atomically detailed descriptions of the unfolding behavior and the transition-state structure of folding/unfolding. The present series thus demonstrated the power of combining experiments and simulations for studying the problems of protein folding.

2.1 Introduction

One of the major objectives of the physical chemistry studies in water and biomolecules is to fully reproduce the experimentally observed folding/unfolding behavior of a typical model protein in water by means of molecular simulation. However, the all-atom molecular dynamics (MD) simulation of the folding of a protein from the fully unfolded state to the native structure remains computationally intractable when the size of the target protein is larger than 100 residues and when simulation is carried out with explicit water molecules (i.e., when complete, contextualized simulation is attempted) [1–3].

Nevertheless, explicit-water all-atom MD simulations of unfolding of proteins at high temperatures are expected to yield important insights into the molecular mechanisms of protein folding [4–13]. It can frequently be assumed that even at high temperatures unfolding may basically represent a reversal of the folding transition [14–16]. Thus, unfolding MD simulations will typically yield an atomically detailed picture of both the unfolding and folding behaviors of a protein. Recent experimental advances, including a hydrogen-exchange NMR technique and site-directed mutagenesis (Φ-value analysis), have enabled us to obtain the structures of intermediates and the transition state of folding with the resolution of amino acid residues [17–20]; hence, such experimental data serve as diagnostic criteria for the validity of simulation results. Therefore, the combined use of experiments and simulations is considered extremely advantageous for gaining a comprehensive understanding of the molecular mechanisms of protein folding [13,21].

This chapter describes experimental and simulation studies of the folding/unfolding of goat α-lactalbumin. Experimentally, we have accomplished two goals. First, we have shown that recombinant α-lactalbumin, which has an additional methionine residue at the N-terminus, is remarkably less stable and unfolds faster than the authentic protein prepared from goat milk [22]. Second, we have characterized the molten globule intermediate and the transition state of folding using a hydrogen-exchange 2D NMR technique and mutational Φ-value analysis, respectively [23]. Here, we carried out unfolding MD simulations of recombinant and authentic α-lactalbumin at high temperatures (398 and 498 K), and the simulation results were then compared with the experimental observations mentioned above [24, 25]. We have thus demonstrated that MD simulations reliably reproduce the experimentally observed faster unfolding of the recombinant protein and the transition-state structure of the folding/unfolding reactions; furthermore, MD simulations provide very useful, atomically detailed descriptions for elucidating protein-folding mechanisms.

2.2 Goat α-Lactalbumin

Goat α-lactalbumin is a globular milk protein of 123 amino acid residues with a molecular weight of 14,200 [26]. High-resolution X-ray crystallographic structures are available for both the authentic protein prepared from goat milk and the recombinant protein expressed in Escherichia coli (E. coli) [22, 26]. The structures of the authentic and recombinant proteins are superimposable onto each other, although the recombinant protein has an extra methionine residue (Met0) at the N-terminus (Fig. 2.1) [22]. The structure of goat α-lactalbumin is composed of two subdomains, an α-domain formed by four α-helices (A-, B-, C-, and D-helices from the N-terminus) and the C-terminal 3₁₀-helix, and a β-domain formed by a three-stranded β-sheet and a 3₁₀-helix [22, 26]. α-Lactalbumin is a Ca²⁺-binding protein, and the Ca²⁺-binding site is located at the interface between the α- and β-domains, namely, at a loop from the C-terminal side of the 3₁₀-helix involved in the β-domain to the N-terminal side of the C-helix [26–28]. Ca²⁺ binding to α-lactalbumin remarkably stabilizes the native structure of the protein [29–31].

α-Lactalbumin is a useful model protein for protein-folding studies. It shows the molten globule intermediate under mildly denaturing conditions at equilibrium, and the identity between the molten globule intermediate and a kinetic folding intermediate has been well established [32,33]. In the kinetic refolding of the protein from the fully unfolded state, the molten globule intermediate forms first within the dead time of a stopped-flow experiment (a few milliseconds), and then the intermediate folds into the native state [34, 35]. A transition state located around the state of maximum free energy exists between the molten globule intermediate and the native state.

Fig. 2.1. The backbone structures of authentic and recombinant goat α-lactalbumin in the crystal form. The backbone of Mol A of the authentic protein, represented by a wire model, was superimposed on the backbone of the recombinant protein. Gray and black wires represent the authentic and recombinant proteins, respectively. The Cα-atom RMSD value between the two proteins was 0.54 Å. The PDB codes for the authentic and recombinant proteins are 1HFY and 1HMK, respectively

Recently, the molten globule state of α-lactalbumin has been shown to possess antitumor activity when complexed with a fatty acid [36, 37], and hence the protein may possess secondary biological activity in addition to the primary activity of native α-lactalbumin, i.e., substrate specificity modifier activity in a lactose synthase system [38,39]. The molten globule of α-lactalbumin thus provides an example of the folding intermediate of a protein exhibiting a secondary biological activity.

2.3 Differences Between the Unfolding Behaviors of Authentic and Recombinant Goat α-Lactalbumin

2.3.1 Experimental Studies

On the basis of the equilibrium unfolding curves of authentic and recombinant α-lactalbumin, the recombinant protein is remarkably less stable than the authentic one (Fig. 2.2) [22]. The transition midpoints were 3.2 and 2.7 M guanidine hydrochloride (GdnHCl) for the authentic and recombinant proteins, respectively, and the difference in the stabilization free energy, ΔΔGNU, between the two proteins was as much as 1.1 kcal mol⁻¹ at 3.2 M GdnHCl and 25 °C [22]. This difference in stability between the two proteins was solely due to the presence of an extra methionine residue at the N-terminus of the recombinant protein, because the methionine-free recombinant protein produced by cyanogen bromide (CNBr) cleavage recovered its stability and produced an unfolding transition curve coincident with that of the authentic protein (Fig. 2.2) [22]; there is no methionine residue in the mature sequence of goat α-lactalbumin, and therefore, only the N-terminal methionine was removed by CNBr. Berliner and coworkers reported that the recombinant ΔE1 mutant, in which the Glu1 residue of the authentic sequence was genetically removed leaving an N-terminal methionine in its place, showed higher stability than the authentic protein [40, 41]. Therefore, the destabilization of recombinant α-lactalbumin was not due to the presence of the methionine at the N-terminus, but rather due to the extension of the N-terminus by the extra residue.

Fig. 2.2. GdnHCl-induced unfolding transition curves for authentic and recombinant goat α-lactalbumin [22]. The filled diamonds indicate the unfolding transition of the methionine-free recombinant protein produced by CNBr cleavage. The unfolding was carried out at 25 °C in the presence of 1 mM CaCl₂, 50 mM NaCl, and 50 mM sodium cacodylate (pH 7.0). The transitions were monitored by CD measurements at 222 nm (circles and diamonds) and at 270 nm (triangles), and the transition curves were normalized between the native and fully unfolded baselines. The black line with symbols represents the authentic form, and the gray line with symbols represents the recombinant form. Reproduced with permission from [22]
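As a rough consistency check on the numbers quoted above, the sketch below assumes a two-state transition in which the unfolding free energy depends linearly on denaturant concentration, ΔG(c) = m (Cm − c), with the same slope m for both proteins. Neither this model nor any m-value is stated in the text, so the function name `implied_m_value` and the resulting slope are illustrative assumptions rather than reported quantities.

```python
def implied_m_value(ddg_kcal, cm_reference, cm_other):
    """Slope m (kcal mol^-1 M^-1) implied by a stability difference.

    Assumes dG(c) = m * (Cm - c) with an identical m for both proteins
    (a modelling assumption, not a value reported in the chapter).
    At c = cm_reference the reference protein has dG = 0, so the
    stability difference equals m * (cm_reference - cm_other).
    """
    return ddg_kcal / (cm_reference - cm_other)


# Values quoted in the text: 1.1 kcal/mol at 3.2 M, midpoints 3.2 and 2.7 M.
m_value = implied_m_value(ddg_kcal=1.1, cm_reference=3.2, cm_other=2.7)
print(f"implied m-value: {m_value:.1f} kcal mol^-1 M^-1")
```

Under these assumptions the quoted midpoints and stability difference are mutually consistent with a slope of about 2.2 kcal mol⁻¹ M⁻¹; a different slope for each protein would change this estimate.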

Considering that the X-ray crystallographic structures of authentic and recombinant goat α-lactalbumin were essentially identical to each other, with the exception of the N-terminal region and a loop region between residues 105 and 110 (Fig. 2.1), the difference in stability between the two proteins was remarkable.

We therefore studied the unfolding and refolding kinetics of authentic and recombinant goat α-lactalbumin, induced by GdnHCl concentration jumps, using stopped-flow circular dichroism (CD) and fluorescence spectroscopy (Fig. 2.3) [22]. Although the refolding kinetics of the two proteins coincided with each other, unfolding was ninefold faster for the recombinant protein than for the authentic protein at 5.4 M GdnHCl and 25 °C. This indicates that the destabilization of the recombinant protein is primarily associated with the acceleration of the unfolding rate.

Fig. 2.3. GdnHCl-induced (a) unfolding and (b) refolding kinetic progress curves of authentic and recombinant goat α-lactalbumin [22]. Unfolding was initiated by a concentration jump from 1.0 to 5.4 M, and the refolding process was initiated by a concentration jump from 5.5 to 0.5 M at 25 °C in the presence of 1 mM CaCl₂, 50 mM NaCl, and 50 mM sodium cacodylate (pH 7.0); the refolding and unfolding kinetics were monitored by the measurement of CD ellipticity at 225 nm using stopped-flow CD. The continuous line denotes the authentic protein, and the filled squares denote the recombinant protein. (b) The inset shows the refolding progress curve within 2 s, and the same notations are used for the reaction curves. Theoretical kinetic progress curves are also shown in (a) and (b). Reproduced with permission from [22]
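To put the ninefold rate difference on an energy scale, a minimal sketch follows. It assumes that a ratio of unfolding rate constants maps onto an activation free-energy difference through ΔΔG‡ = RT ln(ratio), with equal pre-exponential factors for the two proteins. The function name `barrier_change` and this transition-state-theory reading are assumptions, not part of the original analysis.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def barrier_change(rate_ratio, temp_k=298.15):
    """Activation free-energy difference (kcal/mol) for a given rate ratio.

    Uses ddG = R * T * ln(ratio); assumes identical pre-exponential
    factors for the two species being compared.
    """
    return R_KCAL * temp_k * math.log(rate_ratio)


# Ninefold faster unfolding of the recombinant protein at 25 degrees C.
ddg_unfolding = barrier_change(9.0)
print(f"barrier lowered by about {ddg_unfolding:.2f} kcal/mol")
```

Under these assumptions a ninefold ratio corresponds to roughly 1.3 kcal mol⁻¹ at 25 °C, the same order as the 1.1 kcal mol⁻¹ equilibrium destabilization, although the two values refer to different GdnHCl concentrations (5.4 and 3.2 M) and are not directly comparable.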

The molecular mechanisms by which the extension of the N-terminus by the extra methionine residue destabilized recombinant α-lactalbumin remain unclear. Additional conformational entropy of the extra methionine residue in the unfolded state could account for the destabilization and unfolding-rate acceleration of the recombinant protein [22]. Ishikawa and coworkers reported the destabilization of recombinant bovine α-lactalbumin, similarly induced by the extra N-terminal methionine residue, and showed that the enthalpy change of thermal unfolding was the same for the authentic and recombinant proteins, indicating that the destabilization was caused by an entropic effect [42]. However, the destabilization by the extra methionine residue in the lysozyme homologous to α-lactalbumin was rather enthalpic and accompanied by a disruption of hydrogen-bond networks in the N-terminal region [43,44].

2.3.2 Simulation Studies

We carried out all-atom MD unfolding simulations for the authentic and recombinant forms of goat α-lactalbumin at 398 and 498 K using the MARBLE program package [24,25,45]. The simulations produced 10 trajectories of 5 ns for each form, resulting in 100 ns total at each temperature. Our simulations included explicit water molecules of the TIP3P model [46], and we used the CHARMM22 force field to calculate the potential energy [47]. We applied a periodic boundary condition to the system, in which a protein molecule was immersed in water within a rectangular box that contained the protein molecule and 8,571 water molecules for the authentic protein or 8,787 water molecules for the recombinant protein [24]. We observed the initial stages of unfolding in the 398-K simulations and global unfolding in the 498-K simulations, both at atomic-level resolution. In this section, we primarily focus on the results of the 398-K simulations to explore the early stages of the unfolding transition.

Unfolding Dynamics Observed by MD Simulations

We monitored the unfolding trajectories obtained by the simulations in terms of three structural parameters: (1) the Cα-atom root-mean-square deviation (RMSD) from the native structure, (2) the fractional number of the native contacts (Q), and (3) the radius of gyration (Rg) of the molecule, and the results thus obtained qualitatively agreed with the experimental observations shown above [24]. Although the kinetic unfolding curves monitored by the above parameters (RMSD, Q, and Rg) fluctuated greatly with the simulation time and trajectory, the averaged kinetic unfolding curves, obtained by averaging the parameters for the 10 trajectories, indicated that recombinant α-lactalbumin unfolded faster than the authentic protein [24].
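The three structural parameters named above can be sketched for a single frame of Cα coordinates as follows. This is a minimal NumPy illustration, not the authors' analysis code: the contact cutoff (6.5 Å), the tolerance factor, the minimum sequence separation, and the equal-mass radius of gyration are all assumptions chosen for the example, and the idealized helix stands in for a real native structure.

```python
import numpy as np


def kabsch_rmsd(coords, ref):
    """C-alpha RMSD after optimal rigid-body superposition (Kabsch)."""
    a = coords - coords.mean(axis=0)
    b = ref - ref.mean(axis=0)
    u, s, vt = np.linalg.svd(a.T @ b)
    # Flip the smallest singular value if the best fit would be a reflection.
    s[-1] *= np.sign(np.linalg.det(u @ vt))
    msd = ((a ** 2).sum() + (b ** 2).sum() - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))


def native_contacts(ref, cutoff=6.5, min_seq_sep=3):
    """Index pairs (i, j) closer than `cutoff` in the reference structure."""
    dist = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)
    i, j = np.triu_indices(len(ref), k=min_seq_sep)
    keep = dist[i, j] < cutoff
    return i[keep], j[keep]


def fraction_native(coords, contacts, cutoff=6.5, tolerance=1.2):
    """Q: fraction of native pairs still within tolerance * cutoff."""
    i, j = contacts
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    return float(np.mean(dist < cutoff * tolerance))


def radius_of_gyration(coords):
    """Rg with equal weights for all atoms (masses ignored)."""
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))


# Toy "native" structure: an idealized 30-residue helix of C-alpha atoms.
t = np.arange(30)
native = np.column_stack(
    [2.3 * np.cos(1.745 * t), 2.3 * np.sin(1.745 * t), 1.5 * t]
)
contacts = native_contacts(native)

# A rigidly rotated and shifted copy, and a uniformly expanded copy.
c, s = np.cos(0.7), np.sin(0.7)
rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
moved = native @ rot.T + np.array([5.0, -3.0, 2.0])
expanded = 2.0 * native

for name, frame in [("moved", moved), ("expanded", expanded)]:
    print(name, kabsch_rmsd(frame, native),
          fraction_native(frame, contacts), radius_of_gyration(frame))
```

In a trajectory analysis these functions would be applied frame by frame and the resulting curves averaged over the independent trajectories, as described in the text.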


To examine the structural changes along the unfolding trajectories in more detail, we investigated the Q values for the five core regions (CoreN-term, CoreABC, CoreAB, CoreC-term, and Coreβ) (Fig. 2.4(a)), which were uniquely defined from the native contact map of goat α-lactalbumin, as a function of simulation time [24]. As shown in Fig. 2.4(b), CoreN-term, which was composed of contacts between residues 1 (the N-terminus) to 3 and residues 36 to 38, was disrupted within the first few hundred picoseconds in the recombinant protein at 398 K, i.e., much faster than in the authentic protein. On the other hand, the other core regions, particularly CoreC-term and Coreβ, did not exhibit any clear differences between the two proteins during the first 5 ns (Fig. 2.4(c)–(f)). Simulations at higher temperatures also revealed that the disruption of CoreN-term allowed water molecules to penetrate into the hydrophobic core (CoreABC), and this penetration by water triggered global unfolding of the protein molecule (see below) [25,48].

Hydrogen-Bond Network

To further elucidate the faster disruption of CoreN-term in the recombinant protein, we examined the hydrogen-bond network around the N-terminus of the proteins, and analyzed the disruption of these hydrogen bonds during the early stage of unfolding [24]. In authentic goat α-lactalbumin, the N-terminal ammonium group of Glu1 formed three hydrogen bonds with the side-chain oxygen atoms of Asp37, Thr38, and Gln39 (Fig. 2.5(a)). The presence of the extra Met0 in the recombinant protein removed the two hydrogen bonds of Glu1 with Thr38 and Gln39, but the Gln39 side chain formed an alternative hydrogen bond with the new N-terminus of Met0 (Fig. 2.5(b)). As a result, the net difference in the number of the N-terminal hydrogen bonds was –1 between the recombinant and authentic α-lactalbumin, and the distance between the Oγ atom of Thr38 and the backbone amide N atom of Glu1 was 5 Å in the native recombinant structure.

The faster disruption of CoreN-term in the recombinant protein was thus caused by the loss of the hydrogen bond formed by Thr38 with Glu1 and the weakening of the hydrogen bonds formed by Asp37 and Gln39 with the backbone N atom(s). In particular, the time course of the increase in the atomic distance between the Oγ of Thr38 and the backbone N of Glu1 coincided with the time course of the CoreN-term disruption, indicating that both events occurred in concert (Fig. 2.5) [24].

The importance of the hydrogen bond between the Oγ of Thr38 and the backbone N of Glu1 in the thermodynamic stability of goat α-lactalbumin was confirmed experimentally using the T38A mutant, in which Thr38 was replaced by an Ala residue [24]. As expected, the T38A mutation had greater impact on the stability of the authentic form than on the stability of the recombinant form.


Fig. 2.4. (a) The five core regions, CoreN-term, CoreABC, CoreAB, CoreC-term, and Coreβ, defined in terms of native contacts. The core regions are shown on the crystal structure of recombinant goat α-lactalbumin (PDB code: 1HMK) [24]. The extra methionine residue, denoted as M0 in the figure, at the N-terminus in the recombinant protein is shown in the CPK model. (b)–(f) show the Q value calculated for each core region: (b) CoreN-term, (c) CoreABC, (d) CoreAB, (e) CoreC-term, and (f) Coreβ. The light-gray and dark-gray curves represent the averaged data for the authentic and recombinant proteins, respectively. Reproduced with permission from [24]


Fig. 2.5. Close views of interacting atoms in CoreN-term for (a) the authentic and (b) the recombinant proteins. The distance between N of the N-terminus and (c) Oδ of Asp37, (d) Oγ of Thr38, and (e) Nε of Gln39, for the authentic protein, and the distance between N of Glu1 and (c) Oδ of Asp37, and (d) Oγ of Thr38, and (e) the distance between N of Met0 and Nε of Gln39, for the recombinant protein, during simulation at 398 K, averaged for ten trajectories [24]. The light-gray and dark-gray curves represent those for authentic and recombinant goat α-lactalbumin, respectively. Reproduced with permission from [24]

Hinge-Bending Motions of Dynamic Domains

To investigate the relationship between the unfolding behavior and the structural dynamics of goat α-lactalbumin, we carried out a principal component analysis of the MD trajectories for the equilibrium dynamics at 298 K and unfolding dynamics at 398 K, and characterized the principal modes as screw motions of dynamic domains according to the method of Hayward et al. [24,49]. For both the authentic and recombinant proteins, we identified the same two dynamic domains, i.e., domain 1 formed by residues 1–35 and 101–120, and domain 2 formed by residues 36–100. The C-helix (residues 86–98) thus belonged to domain 2, and moved together with the Ca²⁺-binding site (residues 79–88) and the β-domain (residues 36–85). We found that the hinge-bending motions between the two dynamic domains were characteristic of the dynamics of α-lactalbumin; more importantly, the screw axis of the interdomain motion passes through the protein interior from the C-terminal end of the C-helix to the N-terminus of the protein (Fig. 2.6) [24].
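The principal component step of such an analysis can be sketched as below. This covers only the covariance diagonalization, not the dynamic-domain and screw-axis assignment of Hayward et al., and it assumes the frames have already been superimposed on a common reference. The synthetic two-domain trajectory, the array shapes, and the function name `principal_modes` are illustrative assumptions.

```python
import numpy as np


def principal_modes(traj):
    """PCA of a trajectory with shape (n_frames, n_atoms, 3).

    Frames are assumed to be pre-aligned. Returns eigenvalues in
    descending order and the matching eigenvectors as columns.
    """
    n_frames = traj.shape[0]
    flat = traj.reshape(n_frames, -1)
    flat = flat - flat.mean(axis=0)
    cov = flat.T @ flat / (n_frames - 1)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]


# Synthetic example: atoms 10-19 ("domain 2") swing about the z-axis
# relative to atoms 0-9 ("domain 1"), plus small random noise.
rng = np.random.default_rng(1)
base = rng.normal(scale=5.0, size=(20, 3))
angles = 0.2 * np.sin(np.linspace(0.0, 6.0 * np.pi, 200))
frames = []
for theta in angles:
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    frame = base.copy()
    frame[10:] = base[10:] @ rot.T
    frame += rng.normal(scale=0.01, size=frame.shape)
    frames.append(frame)
traj = np.array(frames)

evals, evecs = principal_modes(traj)
fraction_first = evals[0] / evals.sum()
mode1 = evecs[:, 0].reshape(20, 3)
print(f"variance captured by mode 1: {fraction_first:.3f}")
```

In this toy case the first mode is concentrated on the moving group of atoms, which is the kind of signal a dynamic-domain analysis then interprets as a rigid-body rotation about a hinge axis.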


Fig. 2.6. The dynamic domains of goat α-lactalbumin, domain 1 (dark gray) and domain 2 (light gray), and the screw axis of the interdomain motion [24]. The C-helix is involved in domain 2 and moves together with the Ca²⁺-binding site and the β-domain. Reproduced with permission from [24]

The location of the end of the interdomain screw axis at the N-terminus exerts a dramatic impact on the unfolding behavior of α-lactalbumin because the screw axis corresponds to the hinge axis of the hinge-bending motion [49]. Hinge-bending motion exerts its strongest forces on the hinge axis, and this may lead to the faster disruption of CoreN-term in the recombinant protein, which has a weaker hydrogen-bond network around the N-terminus than the authentic protein.

Finally, the conformational entropy effect of the extra methionine residue of the recombinant protein should not be excluded from the present analysis [22]. Such an effect must be present as long as the methionine assumes a fixed structure in the native state, but this effect is difficult to measure by MD simulation [50].

2.3.3 Conclusions

(1) The unfolding behaviors of the authentic and recombinant forms of goat α-lactalbumin are remarkably different, although the two forms have essentially identical three-dimensional structures. The recombinant form was found to be 1.1 kcal mol⁻¹ less stable than the authentic form, and the recombinant form unfolded at a ninefold faster rate than the authentic form. The destabilization and unfolding-rate acceleration were due to the presence of an extra methionine residue at the N-terminus in the recombinant protein.

(2) We carried out two series of unfolding MD simulations, one for the authentic form and the other for the recombinant form of goat α-lactalbumin at 398 K. The unfolding simulations reasonably reproduced the experimentally observed difference between the proteins, i.e., the faster rate of unfolding of the recombinant protein.

(3) The principal component analysis of the dynamics revealed the hinge-bending motions of the protein. One end of the screw axis of the motions was located at the N-terminus, and this location of the screw axis and the weakening of the hydrogen-bond network in the N-terminal region were responsible for the faster unfolding of the recombinant protein.

2.4 Folding/Unfolding Pathways of Goat α-Lactalbumin

2.4.1 Experimental Studies

The folding of α-lactalbumin is well represented by a framework model in which secondary structure units form first, prior to the organization of specific tertiary interactions of amino acid side chains [32–35]. An early kinetic folding intermediate of the protein has characteristics of the molten globule state, which assumes the native-like secondary structure and the compact shape of the molecule without specific side-chain packing. The rate-limiting step of folding is the process from the molten globule intermediate to the fully native state of α-lactalbumin, and hence the transition state of folding is located between the molten globule and native states.

Molten Globule Intermediate

We studied the hydrogen exchange kinetics of the authentic and recombinant forms of goat α-lactalbumin in the molten globule state at pHobs 1.7 and 25 °C using the ¹H–¹⁵N HSQC spectra (Nakamura et al., unpublished); pHobs is the pH meter reading of a D₂O solution. At pH 2, the molten globule state is stable, and its identity with the kinetic folding intermediate has been well established [32–35]. The proteins were ¹⁵N-labeled, and the ¹⁵N-labeled authentic form was obtained by the CNBr cleavage of the ¹⁵N-labeled recombinant protein. We carried out hydrogen-exchange reactions of the two proteins in the molten globule state for a variety of exchange times in 90% D₂O, and then quenched the exchange by rapid refolding of the protein at pHobs 5.9 and 25 °C in the presence of 1.0 mM CaCl₂. The cross-peaks of the HSQC spectra of the hydrogen-exchange-labeled and refolded proteins provided the exchange kinetics of the individual peptide amide protons, which were assigned in the HSQC spectra in the native state (Fig. 2.7).

The protection factor, $p_i$, for the amide proton of residue $i$ is given by the ratio of the rate constants $k_{\mathrm{int},i}$ and $k_{\mathrm{obs},i}$ for the intrinsic chemical exchange and the respective observed exchange reactions for residue $i$ as follows:

$$p_i = \frac{k_{\mathrm{int},i}}{k_{\mathrm{obs},i}}. \qquad (2.1)$$

$k_{\mathrm{int},i}$ was determined by the amino acid sequence of the protein as reported by Bai et al. [51], and $k_{\mathrm{obs},i}$ was obtained from the hydrogen-exchange experiment.
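A minimal sketch of how Eq. (2.1) might be applied to one residue is given below. It assumes that the cross-peak intensity decays as a single exponential in exchange time, so that the observed rate constant can be taken from a log-linear fit. The decay model, the function names, and all numerical values (including the intrinsic rate) are hypothetical and chosen only for illustration; they are not measured data.

```python
import numpy as np


def fit_k_obs(times, intensities):
    """Observed exchange rate from a log-linear fit of peak intensity.

    Assumes I(t) = I0 * exp(-k_obs * t), i.e. single-exponential decay.
    """
    slope, _intercept = np.polyfit(times, np.log(intensities), 1)
    return -slope


def protection_factor(k_int, k_obs):
    """Eq. (2.1): ratio of intrinsic to observed exchange rate constants."""
    return k_int / k_obs


# Hypothetical, noise-free decay for one amide proton (time in minutes).
times = np.array([0.0, 5.0, 10.0, 20.0, 40.0, 80.0])
intensities = 100.0 * np.exp(-0.05 * times)

k_obs = fit_k_obs(times, intensities)
p_i = protection_factor(k_int=0.75, k_obs=k_obs)  # k_int is a made-up value
print(f"k_obs = {k_obs:.4f} per min, protection factor = {p_i:.1f}")
```

With real spectra the fit would be repeated for every assigned cross-peak, using intrinsic rates calculated for the actual sequence and solution conditions.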


Fig. 2.7. [¹⁵N, ¹H]-HSQC spectrum of ¹⁵N-labeled recombinant goat α-lactalbumin at pH 6.3 and 25 °C in 95% H₂O/5% D₂O (Nakamura et al., unpublished). Peaks are labeled with their residue-specific assignments

The hydrogen-exchange protection profile, given by p_i values as a function of i, for the molten globule state of goat α-lactalbumin is shown in Fig. 2.8. The profile was identical, within experimental error, between the authentic and recombinant forms, and only the C-helix was weakly protected with a protection factor of 10–20. This result is thus in contrast with those previously reported for intermediates of guinea pig and human α-lactalbumin [52,53] and several species of lysozyme, in which the A- and B-helices are more strongly protected (protection factor range of 5–500) [54–58]. Indeed, the present findings are more similar to those obtained for bovine α-lactalbumin, except for the smaller protection factor (p_i = 5–7) of the C-helix in the molten globule form of the bovine than the goat protein [59].

On the basis of the findings of kinetic refolding studies of goat α-lactalbumin obtained by stopped-flow CD measurements, the molten globule folding intermediate at neutral pH was found to possess the capacity to weakly bind to Ca²⁺ with a binding constant on the order of 10³ M⁻¹ (unpublished results), i.e., a value that is 10³–10⁴-fold smaller than the binding constant of the native protein [30, 60–62]. Therefore, the C-helix and the Ca²⁺-binding site are weakly but significantly organized in the molten globule state of goat α-lactalbumin.

Transition State

Fig. 2.8. Histograms showing the distribution of protection factors from amide hydrogen exchange for (a) the authentic form and (b) the recombinant form of goat α-lactalbumin in the MG-state at pHobs 1.7 and 25 °C (Nakamura et al., unpublished)

In mutational Φ-value analyses of the transition state of protein folding, site-directed mutagenesis is used to introduce nondisruptive mutations at various amino acid residue sites of a protein studied in order to experimentally investigate the thermodynamic stability and kinetic folding and unfolding reactions of the wild-type and mutant proteins [19, 20, 63]. Thermodynamic analyses of the equilibrium unfolding transition curves of proteins provide information about changes in the stabilization Gibbs energy ΔΔG induced by mutation. Kinetic folding and unfolding experiments provide the folding and unfolding rate constants for wild-type and mutant proteins. On the basis of the obtained data, the Φ-value of the transition state is given by the following equation:

$$\Phi = 1 - \frac{RT \ln\left(k_{\mathrm{unf}}^{\mathrm{WT}} / k_{\mathrm{unf}}^{\mathrm{mutant}}\right)}{\Delta\Delta G}, \qquad (2.2)$$

where $k_{\mathrm{unf}}^{\mathrm{WT}}$ and $k_{\mathrm{unf}}^{\mathrm{mutant}}$ are unfolding rate constants for the wild-type and mutant proteins, respectively; Φ = 1 when the transition-state structure is fully native at the mutation site, whereas Φ = 0 when the transition-state structure is still fully unfolded. Because the unfolding kinetics is often simpler than the refolding kinetics, which may be affected by the heterogeneity of the unfolded state or by the presence of folding intermediates, we typically use the unfolding rate constants to determine the Φ value. To avoid errors caused by long extrapolation along increasing or decreasing denaturant concentrations, we usually employ the ΔΔG, $k_{\mathrm{unf}}^{\mathrm{WT}}$, and $k_{\mathrm{unf}}^{\mathrm{mutant}}$ values at the midpoint of the unfolding transition for the wild-type protein.
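Eq. (2.2) can be evaluated with a few lines of code, sketched below. The sign convention is an assumption read from the limiting cases in the text: ΔΔG is taken as the change in stabilization free energy on mutation, so it is negative for a destabilizing mutation. The function name `phi_value` and the numerical inputs are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def phi_value(k_unf_wt, k_unf_mut, ddg_kcal, temp_k=298.15):
    """Phi from unfolding rate constants, following Eq. (2.2).

    ddg_kcal is assumed to be dG(mutant) - dG(wild type) in kcal/mol,
    i.e. negative when the mutation destabilizes the native state.
    """
    rt = R_KCAL * temp_k
    return 1.0 - rt * math.log(k_unf_wt / k_unf_mut) / ddg_kcal


# Limiting cases for a hypothetical mutation that costs 1.0 kcal/mol.
rt_298 = R_KCAL * 298.15
phi_native_like = phi_value(1.0, 1.0, -1.0)  # unfolding rate unchanged
phi_unfolded_like = phi_value(1.0, math.exp(1.0 / rt_298), -1.0)
print(phi_native_like, round(phi_unfolded_like, 6))
```

The two cases reproduce the limits stated above: an unchanged unfolding rate gives Φ = 1, and an unfolding-rate acceleration that accounts for the whole stability loss gives Φ = 0.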


We introduced 17 mutations (V8A, L12A, V27A, T29I, L52A, I55V, W60A, D87N, I89V, V90A, K93A, I95V, L96A, Y103F, L105A, L110A, and W118F) into recombinant goat α-lactalbumin, and carried out mutational Φ-value analyses of the transition state of folding [23]. As shown by Vanhooren and coworkers, the W60A mutation was disruptive because of a large volume change; thus, for this particular mutation site we employed the Φ value of the W60F mutant reported by Vanhooren et al. in the following series [64, 65]. The results of the Φ-value analysis revealed that the mutants with mutations located in the A-helix (V8A, L12A), the B-helix (V27A, T29I), the C-helix (K93A, L96A), the C–D loop (Y103F), the D-helix (L105A, L110A), and the C-terminal 3₁₀-helix (W118F) have low Φ values of less than 0.2. On the other hand, D87N, which is located at the Ca²⁺-binding site, and W60F, which is involved in the β-domain, have relatively high Φ values of larger than 0.9, indicating that the tight packing of the side chains around these residues (Asp87 and Trp60) occurs in the transition state [23]. Another β-domain mutant (I55V) and three C-helix mutants (I89V, V90A, and I95V) were shown to have intermediate Φ values ranging from 0.4 to 0.7. The folding nucleus in the transition state of goat α-lactalbumin is thus not extensively distributed across the molecule, but instead is very localized within a region containing the Ca²⁺-binding site as well as the interface between the C-helix and the β-domain (Fig. 2.9) [23].

The above findings, when taken together with the preceding results regarding the hydrogen-exchange protection profile of the molten globule intermediate, demonstrated that the folding reaction of goat α-lactalbumin is a hierarchical and sequential process, in which folding is initiated in the region of the C-helix and the Ca²⁺-binding site, and proceeds from there by organization of the structure around this region (the folding nucleus).

2.4.2 Simulation Studies

The MD unfolding simulations at 498 K led to the global unfolding of both authentic and recombinant goat α-lactalbumin, but comparison of the trajectories with those at 398 K revealed that the early stages of unfolding at 498 K were very similar to the unfolding trajectories at 398 K [25].

Fig. 2.9. The Φ-values ((a) experimental Φ-values, and (b) ΦMD obtained from MD trajectories) mapped onto the three-dimensional structure of goat α-lactalbumin [25]. Reproduced with permission from [25]

We therefore investigated the unfolding trajectories at 498 K, and the MD simulation results were compared with the experimentally observed kinetics of folding and unfolding at room temperature [25]. In particular, we identified the transition state of unfolding on the basis of the MD unfolding trajectories, and then compared that structure with the transition-state structure of folding/unfolding experimentally observed by the mutational Φ-value analysis [23]. To identify the transition state of unfolding based on the MD trajectories, however, we have to represent the protein structures that appear along the unfolding trajectories in a proper but coarse-grained manner (see below).

We therefore developed a coarse-grained coordinate system (segmental Q-coordinates) in which the secondary-structure segments can be regarded as folding units [25]. According to the framework model of protein folding, secondary-structure segments (α-helices and β-hairpins) act as the folding units that initially form at an early stage of kinetic folding, and the consolidation and assembly of the folding units are rate-limiting steps of the folding reaction [35, 66–68]. The experimental results for the kinetic folding of goat α-lactalbumin shown above supported the framework model of folding.

Segmental Q-Coordinates

To construct the segmental Q-coordinates, we divided the structure of the goat α-lactalbumin molecule into eight segments: segment 1 (the N-terminal region and the A-helix), segment 2 (the C-terminal side of the A-helix and the B-helix), segment 3 (a β-hairpin formed by the first and second β-strands in the β-domain), segment 4 (the third β-strand and the subsequent loop), segment 5 (the Ca²⁺-binding site that includes a 3₁₀-helix and the N-terminal side of the C-helix), segment 6 (the C- and D-helices), segment 7 (the C-terminal 3₁₀-helix), and segment 8 (the C-terminal region) (Fig. 2.10) [25]. These segments were local contiguous segments along the primary sequence. Division into the eight segments was based on the contact map of the three-dimensional structure of goat α-lactalbumin, and most of the segments corresponded to secondary-structure segments, i.e., α-helices or β-hairpins [25].

Each axis of the segmental Q-coordinates is given by the fractional native contact $Q_S(i, j)$ between a pair of different segments i and j when j > i, or within segment i when i = j:

$$
Q_S(i,j) = \frac{\text{number of native contacts between } i \text{ and } j \text{ (within } i \text{ when } i = j\text{) in } X}{\text{total number of native contacts between } i \text{ and } j \text{ (within } i \text{ when } i = j\text{) in } N}, \tag{2.3}
$$

where X and N denote a protein structure produced by the MD simulation and the native structure, respectively [25]. Although the theoretically possible maximum number of dimensions of the conformational hyperspace of the segmental Q-coordinates was 36 (= 8 × 9/2), there were only 17 coordinates, as trivial coordinates containing fewer than three native contacts were neglected (blank areas in Fig. 2.10(a)).

Fig. 2.10. (a) Contact map of the native structure observed in the equilibrium MD simulation at 298 K [25]. The color scheme corresponds to the eight local segments: 1 (red, residues 0–11), 2 (blue, 12–36), 3 (green, 37–54), 4 (yellow, 55–74), 5 (cyan, 75–88), 6 (orange, 89–106), 7 (gray, 107–119), and 8 (black, 120–123). Seventeen segment pairs (gray) were considered in the segmental Q-coordinates. (b) The crystallographic structure of recombinant goat α-lactalbumin shown in the respective colors representing the eight local segments [25]. (c)–(f) The structural characteristics of Clusters 1, 4, 5, and 9 represented in the respective colors used to depict the eight local segments [25]. The panels on the left show the superimposition of the structures randomly selected from each cluster. The panels on the right show a two-dimensional lattice representation of the 17 segmental Q-coordinates averaged within each cluster. Reproduced with permission from [25]
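The definition in equation (2.3), together with the rule that segment pairs with fewer than three native contacts are dropped, can be sketched as follows. This is an illustrative reading rather than the implementation used in [25]: contacts are assumed to be given as symmetric boolean residue-by-residue matrices, segments as half-open 0-based index ranges, and the function name and contact definition are hypothetical.

```python
import numpy as np


def segmental_q(native_contacts, contacts_x, segments, min_contacts=3):
    """Return {(i, j): Q_S(i, j)} for segment pairs with enough native contacts.

    native_contacts, contacts_x: symmetric boolean (n_res x n_res) matrices
        for the native structure N and an MD structure X.
    segments: list of (start, stop) residue index ranges, half-open.
    """
    q = {}
    n_seg = len(segments)
    for i in range(n_seg):
        a0, a1 = segments[i]
        for j in range(i, n_seg):
            b0, b1 = segments[j]
            native_block = native_contacts[a0:a1, b0:b1]
            if i == j:
                # Count each intra-segment contact once (upper triangle only).
                native_block = np.triu(native_block, k=1)
            n_native = int(native_block.sum())
            if n_native < min_contacts:
                continue  # trivial coordinate, neglected
            retained = np.logical_and(native_block, contacts_x[a0:a1, b0:b1])
            q[(i + 1, j + 1)] = float(retained.sum()) / n_native
    return q
```

Applied to eight segments, the double loop visits the 36 possible (i, j) pairs with j ≥ i, and the `min_contacts` filter is what would reduce these to the 17 coordinates mentioned in the text.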

The advantage of the segmental Q-coordinate system becomes apparent when we compare the unfolding trajectories represented in the conformational hyperspace of the segmental Q-coordinates with those in the hyperspace of the Cartesian coordinates (Fig. 2.11) [25]. Because unfolded conformations are very widely distributed in the hyperspace of the Cartesian coordinates, the unfolding trajectories depict a funnel-like shape. On the other hand, the unfolded conformations are all close to each other in the hyperspace of the segmental Q-coordinates, so that pathway-like unfolding trajectories are observed (Fig. 2.11(a)), from which the pathway, intermediates, and transition state of unfolding can be explored. It is of note that whether protein folding/unfolding is described by a folding pathway or a funnel may depend on the coordinate system used to represent the protein structure.

Fig. 2.11. Two-dimensional representations of the structural ensemble observed in the unfolding trajectories, which were mapped onto the two largest principal components in the 17-dimensional segmental Q-coordinates (a) and in the hyperspace of the Cartesian coordinates (b) [25]. Reproduced with permission from [25]

Cluster Analysis, Unfolding Pathway, and Transition State

By k-means cluster analysis with Euclidean distance in the segmental Q-coordinates, we divided the structure ensemble of the MD unfolding trajectories into nine clusters [25]. The clustering was performed using all data obtained for the authentic and recombinant proteins, and the clusters were numbered in the order of the distance from the native structure. Figure 2.10(c)–(f) shows protein structures in four representative clusters (Clusters 1, 4, 5, and 9), in which Cluster 1 is almost identical to the native structure with all of the 17 Q-coordinates close to unity, whereas Cluster 9, which lost 84% of its native contacts, represents the unfolded state.
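The clustering step can be sketched with a plain Lloyd-type k-means in NumPy, followed by renumbering of the clusters by the distance of their centers from the native point (all Q-coordinates equal to 1). This is a generic sketch, not the procedure of [25]: the initialization, iteration limit, and function names are assumptions made for illustration.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's k-means with Euclidean distance; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()

    def assign(c):
        dist = np.linalg.norm(points[:, None, :] - c[None, :, :], axis=2)
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        # Move each center to the mean of its members (keep it if empty).
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(centers), centers


def renumber_from_native(labels, centers, native_point):
    """Relabel clusters 1..k in order of increasing distance from the native point."""
    order = np.argsort(np.linalg.norm(centers - native_point, axis=1))
    rank = np.empty(len(order), dtype=int)
    rank[order] = np.arange(1, len(order) + 1)
    return rank[labels]
```

For the analysis described above, `points` would hold one 17-dimensional row per MD snapshot, `k` would be 9, and `native_point` would be `np.ones(17)`.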

Twenty MD unfolding trajectories were obtained at 498 K, i.e., 10 for the authentic protein and the remaining 10 for the recombinant protein [25]. Each trajectory was characterized by flows between different clusters of the MD structure ensemble. Such trajectory flows may thus represent the unfolding pathway.

To investigate similarities and differences between the individual unfolding trajectories in terms of trajectory flow (i.e., the unfolding pathway), we carried out multiple trajectory alignments analogous to multiple sequence alignments of biological sequences [25, 69]. As a result, we found that the 20 unfolding trajectories could be classified into five groups; three of these (Clades 1, 2, and 5) each included at least five trajectories, and these major groups are shown in Fig. 2.12. Each of the five groups is referred to here as a "Clade" by analogy with a clade in a phylogenetic tree (cladogram) constructed by multiple sequence alignment. Clade 1 consists only of the trajectories of the authentic protein, and indicates a cooperative unfolding from Cluster 1 to Cluster 6 via Cluster 5. On the other hand, Clade 5 consists only of the trajectories of the recombinant protein, and indicates a noncooperative unfolding that reaches Cluster 5 via Clusters 2–4 and ultimately Cluster 6 or higher clusters. Clade 2 represents a mixture of trajectories of the authentic and recombinant proteins, and shows intermediate features between Clades 1 and 5.

Fig. 2.12. The trajectory flows of the clades: (a) Clade 1, (b) Clade 2, and (c) Clade 5. Circles represent the nine clusters, Clusters 1–9 [25]. Each arrow represents the net frequency of the transition. A thicker arrow indicates a larger flow. Reproduced with permission from [25]
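The arrows in Fig. 2.12 represent net transition frequencies between clusters. One simple way to obtain such quantities from cluster-label time series is sketched below; it assumes that "net frequency" means the number of forward transitions minus the number of reverse transitions, which is an interpretation rather than a definition taken from [25], and the function name is hypothetical.

```python
from collections import Counter


def net_flows(cluster_sequences):
    """Net transition counts between clusters over a set of trajectories.

    cluster_sequences: iterable of sequences of cluster labels, one per
        trajectory, in time order.
    Returns {(a, b): n} with n > 0 meaning a net flow of n from a to b.
    """
    counts = Counter()
    for seq in cluster_sequences:
        for a, b in zip(seq, seq[1:]):
            if a != b:  # ignore frames that stay in the same cluster
                counts[(a, b)] += 1

    net = {}
    for (a, b), n in counts.items():
        diff = n - counts.get((b, a), 0)
        if diff > 0:
            net[(a, b)] = diff
    return net
```

Summing these flows over the trajectories of one clade would give one diagram of the kind shown in the figure; a cluster that every trajectory must cross appears as a node through which all net flow passes.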

In all of the five clades, the unfolding pathway necessarily passes through Cluster 5, and hence Cluster 5 is the bottleneck of the unfolding transition (Fig. 2.12) [25]. This indicates that Cluster 5 may correspond to the transition state of unfolding. To validate the identity of Cluster 5 as the transition state, we estimated theoretical Φ values (ΦMD) that were calculated from the MD trajectories. The ΦMD value was based on the fractional native contact of amino acid residues in the structures produced by MD simulations [5, 25]. The correlation coefficient between ΦMD and the experimental Φ values given by equation (2.2) was highest around the center of Cluster 5, demonstrating that Cluster 5 represents the transition state of unfolding (Fig. 2.9).
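A rough sketch of this comparison is given below. It assumes that ΦMD for a residue is the fraction of that residue's native contacts retained, averaged over a set of MD structures, and that agreement with experiment is measured by the Pearson correlation coefficient; the exact definition used in [5, 25] may differ, and all names here are illustrative.

```python
import numpy as np


def phi_md(native_contacts, structure_contacts, residues):
    """Per-residue fraction of native contacts retained, averaged over structures.

    native_contacts: symmetric boolean (n_res x n_res) matrix for the native state.
    structure_contacts: list of boolean matrices, one per MD structure.
    residues: residue indices of interest (each must have native contacts).
    """
    values = []
    for r in residues:
        native_row = native_contacts[r]
        n_native = native_row.sum()
        retained = [np.logical_and(native_row, c[r]).sum() / n_native
                    for c in structure_contacts]
        values.append(float(np.mean(retained)))
    return np.array(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])
```

Evaluating `pearson_r(phi_md(...), phi_experimental)` separately for the structures in each cluster would then show which part of the ensemble agrees best with the experimental Φ values.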

Hydration of Protein Interior During Unfolding

To further characterize the structural changes of goat α-lactalbumin during unfolding, we examined the probability distributions of the following four structural parameters in each of the nine clusters of the structural ensemble of MD trajectories: (1) the fractional native contact (Q) of the entire molecule, (2) the RMSD of Cα atoms between a pair of structures that belong to the same cluster, (3) the solvent-accessible surface area (SASA) of hydrophobic side chains, and (4) the SASA of hydrophilic side chains [25].

Fig. 2.13. The probability distributions of four structural parameters calculated for the structures of each cluster [25]. (a) The fraction of the native tertiary contacts Q. (b) The RMSD value of Cα atoms between a pair of structures that belong to the same cluster. The SASA for (c) hydrophobic and (d) hydrophilic side chains. Reproduced with permission from [25]

As shown in Fig. 2.13, both the Q and RMSD values of the transition state (Cluster 5) are located between those of the native state (Cluster 1) and the unfolded state (Clusters 6–9). However, the RMSD distribution of the transition state remains native-like, and a sudden broadening occurs between Clusters 5 and 6, that is, after passing through the transition state. The SASA distribution of hydrophobic side chains shows a more characteristic behavior, and a large increase in the hydrophobic SASA occurs only after the protein passes through the transition state, while no significant changes are observed in the hydrophilic SASA in any of the clusters.

The above results thus suggest that extensive hydration of the hydrophobic interior of the protein occurs only after the protein passes through the transition state, and this hydration of the protein interior leads to the extensive unfolding (i.e., the increase in RMSD) of the protein molecule [25]. Provided that folding is the reverse of unfolding, an important rate-limiting step of protein folding may be the dehydration of hydrated hydrophobic groups to form a hydrophobic interior. The formation of partial native contacts (Q ≈ 0.5) accompanies this rate-limiting step of folding (Fig. 2.13(a)), and this partial structural organization occurs around the folding nucleus formed by the C-helix and Ca²⁺-binding site in goat α-lactalbumin. Molecular simulations of other proteins or even small peptides are known to exhibit similar extensive dehydration of hydrated hydrophobic groups at the rate-limiting step of folding [4, 70–73], and hence this is probably a general mechanism of protein folding.

2.4.3 Conclusions

(1) We experimentally characterized the molten globule state and the folding/unfolding transition state of goat α-lactalbumin using a hydrogen-exchange 2D NMR technique and mutational Φ-value analysis. The folding reaction occurs in a hierarchical manner, with the C-helix and Ca²⁺-binding site being weakly organized in the molten globule intermediate and the structure around the same region becoming further organized in the transition state.

(2) We carried out unfolding MD simulations of goat α-lactalbumin at 498 K. The protein structure was represented in the segmental Q-coordinates, and cluster analyses and multiple-trajectory alignments were carried out to obtain the transition-state structure solely from the MD simulation. The structure obtained by this approach was very close to that obtained experimentally, and hence the results of the kinetic unfolding experiments were well reproduced by the simulations.

(3) The analysis of the probability distributions of different structural parameters in each cluster of the MD structural ensemble revealed that the hydration of most of the hydrophobic surface of the protein occurs after passage through the transition state of unfolding, and this hydration of the protein interior leads to the extensive unfolding of the protein molecule. Thus, the dehydration of hydrated hydrophobic groups, which enables the formation of a hydrophobic interior, may be an important rate-limiting step of protein folding.

2.5 Summary and Perspectives

We studied the unfolding behavior and the folding/unfolding transition state of goat α-lactalbumin both experimentally and by MD simulation. The MD simulation results reproduced well the experimentally observed differences in the unfolding behaviors of the authentic and recombinant proteins and also reliably reproduced the experimentally observed transition-state structure, together with atomically detailed descriptions of the unfolding process [24, 25]. The present study thus demonstrates the power of the combined use of experimentation and simulation for investigating protein folding.

In future studies, it will be necessary not only to combine experimental and simulation results but also to address more critical questions regarding the underlying mechanisms of protein folding. For goat α-lactalbumin, additional questions will need to be answered, e.g., why the region containing the C-helix and the Ca²⁺-binding site acts as the folding nucleus and what determines the folding nucleus. To address issues of this sort, the combined results of experimental and simulation studies of the folding/unfolding of different proteins will be needed. Particularly intriguing in this regard would be a comparative study of goat α-lactalbumin and canine milk lysozyme. The latter protein is homologous to α-lactalbumin and has the same Ca²⁺-binding site at the interface of the α- and β-domains [74]. Nevertheless, the folding nucleus of canine milk lysozyme differs greatly from that of α-lactalbumin and is probably located at the A- and B-helices, distant from the Ca²⁺-binding site [75].

Acknowledgments

We would like to thank our former colleagues in the Department of Physics in the School of Science, University of Tokyo, including Tapan K. Chaudhuri, Kimiko Saeki, Munehito Arai, and Takao Yoda, all of whom assumed important roles in the experimental portion of this study. We are also grateful to Professor Motonori Ota (Nagoya University), who introduced the multiple trajectory alignment method in this study. This study was supported by a Grant-in-Aid for Scientific Research on Priority Areas (project numbers 15076201, 15076209, and 15076101).

References

1. Y. Duan, P.A. Kollman, Science 282, 740 (1998)
2. R. Day, V. Daggett, Adv. Protein Chem. 66, 373 (2003)
3. H.A. Scheraga, M. Khalili, A. Liwo, Annu. Rev. Phys. Chem. 58, 57 (2007)
4. A. Caflisch, M. Karplus, J. Mol. Biol. 252, 672 (1995)
5. V. Daggett, A.J. Li, L.S. Itzhaki, D.E. Otzen, A.R. Fersht, J. Mol. Biol. 257, 430 (1996)
6. J. Tsai, M. Levitt, D. Baker, J. Mol. Biol. 291(1), 215 (1999)
7. U. Mayor, C.M. Johnson, V. Daggett, A.R. Fersht, Proc. Natl. Acad. Sci. U. S. A. 97(25), 13518 (2000)
8. L.J. Smith, R.M. Jones, W.F. van Gunsteren, Proteins 58(2), 439 (2005)
9. F. Ding, W. Guo, N.V. Dokholyan, E.I. Shakhnovich, J.E. Shea, J. Mol. Biol. 350(5), 1035 (2005)
10. H. Lei, S.G. Dastidar, Y. Duan, J. Phys. Chem. B 110(43), 22001 (2006)
11. N. Smolin, R. Winter, Biochim. Biophys. Acta 1764(3), 522 (2006)
12. A. Das, C. Mukhopadhyay, J. Chem. Phys. 127(16), 165103 (2007)
13. R.D. Schaeffer, A. Fersht, V. Daggett, Curr. Opin. Struct. Biol. 18(1), 4 (2008)
14. V. Daggett, Chem. Rev. 106(5), 1898 (2006)
15. R. Day, V. Daggett, J. Mol. Biol. 366(2), 677 (2007)
16. M.E. McCully, D.A. Beck, V. Daggett, Biochemistry 47(27), 7079 (2008)
17. H.J. Dyson, P.E. Wright, Annu. Rev. Phys. Chem. 47, 369 (1996)
18. M.M. Krishna, L. Hoang, Y. Lin, S.W. Englander, Methods 34(1), 51 (2004)
19. A. Matouschek, J.T. Kellis, L. Serrano, A.R. Fersht, Nature 340(6229), 122 (1989)
20. A. Fersht, Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding (W.H. Freeman, New York, 1998)
21. C.D. Snow, E.J. Sorin, Y.M. Rhee, V.S. Pande, Annu. Rev. Biophys. Biomol. Struct. 34, 43 (2005)

22. T.K. Chaudhuri, K. Horii, T. Yoda, M. Arai, S. Nagata, T.P. Terada, H. Uchiyama, T. Ikura, K. Tsumoto, H. Kataoka, M. Matsushima, K. Kuwajima, I. Kumagai, J. Mol. Biol. 285, 1179 (1999) (Erratum in: J. Mol. Biol. 336(3), 825 (2004))
23. K. Saeki, M. Arai, T. Yoda, M. Nakao, K. Kuwajima, J. Mol. Biol. 341(2), 589 (2004)
24. T. Oroguchi, M. Ikeguchi, K. Saeki, K. Kamagata, Y. Sawano, M. Tanokura, A. Kidera, K. Kuwajima, J. Mol. Biol. 354(1), 164 (2005)
25. T. Oroguchi, M. Ikeguchi, M. Ota, K. Kuwajima, A. Kidera, J. Mol. Biol. 371(5), 1354 (2007)
26. A.C.W. Pike, K. Brew, K.R. Acharya, Structure 4, 691 (1996)
27. Y. Hiraoka, T. Segawa, K. Kuwajima, S. Sugai, N. Murai, Biochem. Biophys. Res. Commun. 95(3), 1098 (1980)
28. D.I. Stuart, K.R. Acharya, N.P. Walker, S.G. Smith, M. Lewis, D.C. Phillips, Nature 324(6092), 84 (1986)
29. M. Ikeguchi, K. Kuwajima, S. Sugai, J. Biochem. (Tokyo) 99(4), 1191 (1986)
30. T. Hendrix, Y.V. Griko, P.L. Privalov, Biophys. Chem. 84(1), 27 (2000)
31. A. Chedad, H. Van Dael, Proteins 57(2), 345 (2004)
32. K. Kuwajima, Proteins 6, 87 (1989)
33. K. Kuwajima, FASEB J. 10, 102 (1996)
34. M. Arai, K. Kuwajima, Fold. Des. 1(4), 275 (1996)
35. M. Arai, K. Kuwajima, Adv. Protein Chem. 53, 209 (2000)
36. M. Svensson, A. Hakansson, A.K. Mossberg, S. Linse, C. Svanborg, Proc. Natl. Acad. Sci. U. S. A. 97(8), 4221 (2000)
37. K.H. Mok, J. Pettersson, S. Orrenius, C. Svanborg, Biochem. Biophys. Res. Commun. 354(1), 1 (2007)
38. U. Brodbeck, W.L. Denton, N. Tanahashi, K.E. Ebner, J. Biol. Chem. 242(7), 1391 (1967)
39. B. Ramakrishnan, P.K. Qasba, J. Mol. Biol. 310(1), 205 (2001)
40. D.B. Veprintsev, M. Narayan, S.E. Permyakov, V.N. Uversky, C.L. Brooks, A.M. Cherskaya, E.A. Permyakov, L.J. Berliner, Proteins 37(1), 65 (1999)
41. S.E. Permyakov, G.I. Makhatadze, R. Owenius, V.N. Uversky, C.L. Brooks, E.A. Permyakov, L.J. Berliner, Protein Eng. Des. Sel. 18(9), 425 (2005)
42. N. Ishikawa, T. Chiba, L.T. Chen, A. Shimizu, M. Ikeguchi, S. Sugai, Protein Eng. 11, 333 (1998)
43. K. Takano, K. Tsuchimori, Y. Yamagata, K. Yutani, Eur. J. Biochem. 266(2), 675 (1999)
44. S. Goda, K. Takano, Y. Yamagata, Y. Katakura, K. Yutani, Protein Eng. 13(4), 299 (2000)
45. M. Ikeguchi, J. Comput. Chem. 25(4), 529 (2004)
46. W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, M.L. Klein, J. Chem. Phys. 79(2), 926 (1983)
47. A.D. MacKerell Jr., D. Bashford, M. Bellott, R.L. Dunbrack Jr., J.D. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher III, B. Roux, M. Schlenkrich, et al., J. Phys. Chem. B 102, 3586 (1998)
48. T. Yoda, M. Saito, M. Arai, K. Horii, K. Tsumoto, M. Matsushima, I. Kumagai, K. Kuwajima, Proteins 42, 49 (2001)

49. S. Hayward, A. Kitao, H.J. Berendsen, Proteins 27(3), 425 (1997)
50. H. Meirovitch, Curr. Opin. Struct. Biol. 17(2), 181 (2007)
51. Y. Bai, J.S. Milne, L. Mayne, S.W. Englander, Proteins 17, 75 (1993)
52. C.L. Chyan, C. Wormald, C.M. Dobson, P.A. Evans, J. Baum, Biochemistry 32, 5681 (1993)
53. B.A. Schulman, C. Redfield, Z.Y. Peng, C.M. Dobson, P.S. Kim, J. Mol. Biol. 253, 651 (1995)
54. S.E. Radford, C.M. Dobson, P.A. Evans, Nature 358, 302 (1992)
55. S.D. Hooke, S.E. Radford, C.M. Dobson, Biochemistry 33, 5867 (1994)
56. L.A. Morozova-Roche, C.C. Arico-Muendel, D.T. Haynie, V.I. Emelyanenko, H. Van Dael, C.M. Dobson, J. Mol. Biol. 268, 903 (1997)
57. L.A. Morozova-Roche, J.A. Jones, W. Noppe, C.M. Dobson, J. Mol. Biol. 289, 1055 (1999)
58. Y. Kobashigawa, M. Demura, T. Koshiba, Y. Kumaki, K. Kuwajima, K. Nitta, Proteins 40, 579 (2000)
59. V. Forge, R.T. Wijesinha, J. Balbach, K. Brew, C.V. Robinson, C. Redfield, C.M. Dobson, J. Mol. Biol. 288(4), 673 (1999)
60. K. Kuwajima, M. Mitani, S. Sugai, J. Mol. Biol. 206(3), 547 (1989)
61. K. Nitta, Methods Mol. Biol. 172, 211 (2002)
62. A. Vanhooren, K. Vanhee, K. Noyelle, Z. Majer, M. Joniau, I. Hanssens, Biophys. J. 82, 407 (2002)
63. A.R. Fersht, A. Matouschek, L. Serrano, J. Mol. Biol. 224(3), 771 (1992)
64. A. Vanhooren, A. Chedad, V. Farkas, Z. Majer, M. Joniau, H. Van Dael, I. Hanssens, Proteins 60(1), 118 (2005)
65. A. Chedad, H. Van Dael, A. Vanhooren, I. Hanssens, Biochemistry 44(46), 15129 (2005)
66. R.L. Baldwin, G.D. Rose, Trends Biochem. Sci. 24, 77 (1999)
67. B. Nolting, K. Andert, Proteins 41(3), 288 (2000)
68. S. Nishiguchi, Y. Goto, S. Takahashi, J. Mol. Biol. 373(2), 491 (2007)
69. M. Ota, M. Ikeguchi, A. Kidera, Proc. Natl. Acad. Sci. U. S. A. 101(51), 17658 (2004)
70. M.S. Cheung, A.E. Garcia, J.N. Onuchic, Proc. Natl. Acad. Sci. U. S. A. 99(2), 685 (2002)
71. W. Guo, S. Lampoudi, J.E. Shea, Biophys. J. 85(1), 61 (2003)
72. Y.M. Rhee, E.J. Sorin, G. Jayachandran, E. Lindahl, V.S. Pande, Proc. Natl. Acad. Sci. U. S. A. 101(17), 6456 (2004)
73. J. Juraszek, P.G. Bolhuis, Proc. Natl. Acad. Sci. U. S. A. 103(43), 15859 (2006)
74. T. Koshiba, M. Yao, Y. Kobashigawa, M. Demura, A. Nakagawa, I. Tanaka, K. Kuwajima, K. Nitta, Biochemistry 39(12), 3248 (2000)
75. H. Nakatani, K. Maki, K. Saeki, T. Aizawa, M. Demura, K. Kawano, S. Tomoda, K. Kuwajima, Biochemistry 46(17), 5238 (2007)

3

Transition in the Higher-order Structure of DNA in Aqueous Solution

T. Sakaue and K. Yoshikawa

Abstract. Recent progress in single-chain observation techniques is revealing the fascinating world of individual long DNA molecules in higher-order structures. Examples include a large discontinuous folding transition between disordered coil and ordered compact states, the phenomenon of intrachain segregation, in which folded and unfolded parts coexist along the chain, and the multi-stability between different "phases," which implies the importance of dynamic degrees of freedom in the system. Although these behaviors are apparently much more complex than naively expected from conventional knowledge, the essential physics can be captured by a simple polymer model with an appropriate degree of coarse-graining. The semiflexibility, that is, the local stiffness, of the chain and the electrostatic properties, together with the effects associated with finite chain length, are shown to be crucial, and they dictate the large-scale behaviors of long DNA chains at the higher-order level. In the regulation of genetic activity, living cells may utilize physico-chemical properties inherent in genomic DNA molecules, which are highly charged, locally stiff, and very long.

3.1 Introduction

Thanks to the remarkable progress in molecular biology during the past quarter century, we have accumulated a great deal of knowledge on the molecular processes taking place in living cells [1]. The underlying technique here is transgenic manipulation. For instance, one can knock out a particular gene, and the consequences of this action can be investigated by comparison with the wild type. Based on such experimental methodology, the correlation between a certain function and a specific protein is revealed. Indeed, a large number of such specific proteins have been identified, reflecting the complexity of functions in cells.

The question then arises as to how the cell organizes these specific events to create a spatio-temporal order to maintain its life. Cells have a hierarchical dynamic structure from the nanometer to the micrometer scale. Therefore, to answer this question, it is necessary to explore the mesoscopic level of description, given the molecular knowledge at the nanometer scale.

An important example is seen in the mechanism of gene expression regulation, which is one of the fundamental problems in biology. Despite the fact that all the cells are, in principle, equipped with the same DNA molecules as the genetic code, they nevertheless exhibit different phenotypes robustly, depending on the cell type. These so-called epigenetic phenomena are sustained through cell divisions. How does each cell differentiate spontaneously? In addition, how are the levels of expression in specific cells self-regulated? Various specific proteins have been identified as the regulatory factors for particular characteristics, and mathematical models of the molecular network with multiple stable attractors and feedback loops have been actively investigated to clarify the underlying mechanism. Despite extensive efforts in this direction, however, a comprehensive view remains elusive. Recent elaborate experiments appear to pose severe questions for the current framework by revealing its vulnerability to the large fluctuations inherent at the cell scale [2–4]. Here, we would like to look at the problem from a different viewpoint. Genetic information is stored in the chain-like molecules known as DNA. One of the striking properties of genomic DNA is its extremely long length L compared with the molecular thickness a = 2 nm. To better grasp this property, let us assume that the DNA is a rope of radius ∼1 cm. The total length of the human DNA inside a cell measures L ∼ 1 m, which means that the length of the rope would be ∼10⁷ m, comparable with the diameter of the earth. Why is DNA so long? It is tempting to examine the intrinsic properties of such long polymers and ask about their possible implications for a genetic material.
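The rope analogy above is a simple proportional scaling, which can be checked in a few lines. The DNA thickness, DNA length, and rope radius are the values quoted in the text; the Earth's diameter (about 1.27 × 10⁷ m) is an added reference value, and the function name is illustrative.

```python
def scaled_length(length_m, thickness_m, model_thickness_m):
    """Length of a scale model whose thickness is model_thickness_m."""
    return length_m * (model_thickness_m / thickness_m)


dna_thickness_m = 2e-9      # molecular thickness a = 2 nm
dna_length_m = 1.0          # total human DNA per cell, L ~ 1 m (value used in the text)
rope_thickness_m = 2e-2     # rope of radius ~1 cm, i.e. ~2 cm across
earth_diameter_m = 1.27e7   # reference value, not taken from the text

rope_length_m = scaled_length(dna_length_m, dna_thickness_m, rope_thickness_m)
ratio_to_earth = rope_length_m / earth_diameter_m
```

The magnification factor is 10⁷, so the rope would be about 10⁷ m long, a little less than one Earth diameter.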

In the present chapter, we review recent progress in the study of the higher-order structures of long DNA molecules. In Sect. 3.2, the physico-chemical properties of the DNA folding transition in aqueous solution are surveyed. In particular, the flexibility and rich potentiality in the higher-order structure of long DNA chains are investigated by single-chain observation. In Sect. 3.3, we analyze these observed phenomena from the viewpoint of the statistical physics of long, semiflexible polymers. Here, coarse-grained phenomenological arguments and computer simulations with simple models are demonstrated to be powerful tools for revealing the fundamental features of DNA. We also discuss the recent attempt to reconstitute the chromatin-like structure and summarize the possible biological importance and perspectives.

3.2 Long DNA Molecules in Aqueous Solution

3.2.1 Primary, Secondary, and Higher-order Structures

The genetic information of living organisms is coded in DNA in the form of base pair sequences. There are four types of nucleotides, which are linked into a polynucleotide with a sugar-phosphate backbone. The arrangement of nucleotides along the one-dimensional chain is called the primary structure of DNA, which directly encodes the primary structure of proteins by means of the genetic code. Usually, the complementary pairs of nucleotides are connected via hydrogen bonds and two polynucleotide chains are wound around each other to form a double helix. This double helical structure is called the secondary structure of DNA and is regarded as a fundamental unit for the spatial organization of the long DNA chains at larger length scales, that is, higher-order structures (Fig. 3.1).

Fig. 3.1. Hierarchy in DNA molecules (figure labels: phosphate group; sugar; base (A, G, C, T); scale markers 2 nm, 3.4 nm, and 0.1 μm)

Because of the rigid double helix structure (and the electrostatic repulsion between phosphate groups), the DNA chain is locally stiff, with a conformation that is almost a straight line with small thermal fluctuations. Quantitatively, this leads to a large characteristic decay length of the orientation correlation, known as the persistence length, l_p ≃ 50 nm in usual aqueous conditions, which is much larger than the molecular thickness of the DNA chain, a = 2 nm.
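For reference, the "decay length of the orientation correlation" can be written explicitly. The expression below is the standard wormlike-chain form, which is not stated in the text above and is given here as an assumption about the intended definition; $\mathbf{u}(s)$ denotes the unit tangent vector at contour position $s$.

```latex
\langle \mathbf{u}(s) \cdot \mathbf{u}(s') \rangle
  = \exp\!\left( -\frac{|s - s'|}{l_p} \right),
\qquad
l_p \simeq 50~\mathrm{nm} \gg a = 2~\mathrm{nm}.
```

On contour separations much shorter than $l_p$ the tangent directions remain almost parallel, which is the sense in which the chain is locally a nearly straight rod.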

It should be stressed that while the secondary structure is determined bythe local interactions, that is, affected by segments located in the proximityalong the chain, the higher-order structure is governed by the global influ-ence created by the entire chain. Therefore, the structural transition in thehigher-order level is essentially different from the helix-coil transition thatoccurs on the secondary structure level.1 DNA molecules of biological ori-gin are extremely long, having contour lengths of L. It is expected that thelarge-scale behaviors of long DNA molecules do not depend strongly on themolecular details such as the base pair sequence. The following subsectionsdescribe the phenomenology of the higher-order structural transitions in long

1 The coupling of these two transitions on different scales is possible, which maymerit future investigations. Note that helices are often adopted motifs in thesecondary structure level in biopolymers, and investigating its impact on thehigher-order structures is an important theme [5].

Page 58: Water and biomolecules   physical chemistry of life phenomena (biological and medical physics, biomedical engineering)

40 T. Sakaue and K. Yoshikawa

DNA molecules. Then, in Sect. 3.3, we demonstrate that the essential features are indeed described by a small number of material properties, such as $l_p$, $L$, and the environmental parameters.

3.2.2 DNA Condensation

When dissolved in water, a long DNA molecule takes a disperse random coil conformation. However, the DNA molecules found in living organisms look very different. They are, in general, tightly packed inside a limited space. For instance, T4 phage DNA with a contour length of 57 μm (166 kbp) is packed inside a virus capsid of linear dimension ∼100 nm. The genomic DNA of Escherichia coli has a full length of ∼1.4 mm, yet it is packed into a nucleoid region on the order of micrometers. Moreover, the random coil and the compactly packed states should be regarded as different “phases.”

When we add a sufficient quantity of polyamines to a dilute DNA solution, the DNA molecules aggregate and may even precipitate from the solution. As observed using electron microscopy, the DNA aggregates often take an ordered toroidal morphology reminiscent of intraphage DNA. This phenomenon is called DNA condensation [6–8]. Not only polyamines but also other multivalent cations, cationic surfactants, water-soluble polymers, alcohols, etc. are capable of inducing condensation. These agents are collectively referred to as condensing agents.

These observations have given rise to a number of interesting questions. In particular, the following two questions have attracted considerable attention. (1) The DNA condensation phenomenon is governed primarily by electrostatic interactions. What, then, is the origin of the attractive force between highly charged DNA segments [9]? (2) In a very dilute solution of long DNA molecules, the collapse would occur at the single chain level. Given some effective attraction, how can we describe the folding of a long DNA molecule into the compact ordered state?

We shall be mainly concerned with the second question, but it should be kept in mind that these two are not completely separable and the nature of the effective interactions may affect the manner of the transition in some cases.

Note that it is more common to observe multimolecular condensates with conventional techniques such as total intensity and dynamic laser light scattering, which are not well suited to such dilute solutions. The term “condensation” was intended to distinguish the situation in which the aggregate is of finite size and ordered morphology from the usual aggregation or precipitation [6, 7].

3.2.3 Looking at Single DNA Molecules

As stated previously, genomic DNA molecules are generally very long and exhibit a large flexibility on the micrometer scale. The behaviors of single DNA molecules are, thus, described statistically, the understanding of which is highly


3 Transition in the Higher-order Structure of DNA 41

Fig. 3.2. Different scenarios in the folding transition of long polymers. (Top) Gradual shrinkage, that is, continuous transition, (middle) all-or-none discontinuous transition at the level of single chains, and (bottom) multiple-step transition through intrachain segregation. Note that, due to the coexistence region characteristic of a finite-size system, all the cases look similar to the continuous transition in macroscopic measurements

required for various purposes in biological and material sciences. As noted earlier, conventional experiments measure ensemble-averaged quantities, so that the ambiguity associated with multimolecular events in the condensation is unavoidable. However, it is of critical importance to recognize the hierarchy involved in the system under consideration (Fig. 3.2). In the dilute solution, there are a large number of long DNA chains, each of which should be regarded as a statistical subsystem. Reflecting the finiteness of the subsystem, the unique characteristics at the single chain level may be smoothed out by ensemble averaging.

A clear picture at the single DNA level has become attainable through the use of fluorescence microscopy [10–12]. The direct observation of single DNA molecules has revealed basic characteristics inherent in the folding of long DNA molecules. Most notably, the folding is markedly discrete, that is, it is a first-order transition from the swollen coil to the compactly folded state (cf. Fig. 3.2 (middle)). In Fig. 3.3, the dependence of the long-axis length of T4 DNA on the concentration of the trivalent cation spermidine is plotted. Here, individual DNA molecules are folded in an all-or-none fashion and there is a certain range of coexistence, in which both the swollen coil and the compactly folded states are observed. The same trend has been reported for various cases with different condensing agents.


[Figure 3.3, panels (a) and (b): long-axis length of DNA (vertical axis, 0 to 5) plotted against spermidine concentration C_SPR (horizontal axis, logarithmic, 10^0 to 10^4); the micrograph panel carries a scale bar]

Fig. 3.3. Folding transition of T4 phage DNA induced by the addition of the trivalent cation spermidine. The abscissa and ordinate axes are the spermidine concentration and the long-axis length of DNA measured by fluorescence microscopic observation (see [11] for more details)

Recent progress in experiments has also revealed fascinating phenomena and rich scenarios of the folding transition. Most noteworthy, the phenomenon of intrachain segregation has been shown to be possible in long DNA molecules [13–19] (cf. Fig. 3.2 (bottom)). Careful observation of individual DNA molecules around the region of the folding transition has revealed that such partially folded states appear in long DNA molecules under various situations. Long DNA chains can take not only coil and completely folded states, but also intrachain segregated states with various morphologies as higher-order structures, which can be controlled by suitable environmental conditions. What is the underlying mechanism behind such rich behaviors? We shall proceed to the theoretical description from the viewpoint of the statistical physics of macromolecules. We start from the classical theory of the coil-globule transition, and then discuss the recent developments and attempts inspired by single chain observation.

3.3 Statistical Physics of Folding of a Long Polymer

3.3.1 Some Basics

In this section, we review the statistical mechanical approach to the problem of the folding of long polymer chains. From the standpoint of physics, considerable efforts have been made to extract simple and universal laws of biopolymers’ behavior regardless of their complexity and diversity. This has led to the development of the theory of the coil-globule transition [20, 21]. The coil-globule transition, with appropriate modifications if necessary, can be used to understand many features of real biopolymers. However, it is also obvious that it is insufficient, and some gaps remain between our understanding and the transition behavior of real biopolymers. As possible sources of these gaps, we may cite the heterogeneity of the monomer sequence, the effect of chain stiffness, and electrostatics, none of which is considered in the ideal version of the coil-globule transition. Since DNA molecules are locally stiff, strongly (negatively) charged, and can be approximately treated as homopolymers with appropriate coarse-graining, it is expected that many of the conformational behaviors of DNA are described by a relatively simple homopolymer model that includes the effects of chain stiffness and electrostatics. In particular, it is expected to be a reasonable model for studying the conformational transition of long DNA molecules at the higher-order level. Further, we shall see the importance of the proper degree of coarse-graining to capture the diversity and universality behind the phenomena.

Let us start with some basics and definitions. The basic feature of polymer molecules is connectivity. A linear polymer chain, that is, one with no branching, with the contour length $L = Nl$ can be described as a sequence of $N$ segments of size $l$. The number $N$ is proportional to the molecular weight and the length $l$ is called the Kuhn segment length.² For various phenomena, including the folding transition, it is important to distinguish $l$ from the monomer size $a$, which corresponds to the thickness of the chain. The ratio $l/a$ is a measure of the local chain stiffness. A small value, $l/a \simeq 1$, means that the directional memory along the chain is lost at the monomer scale, and such polymers are referred to as flexible polymers. On the other hand, a large value, $l/a \gg 1$, indicates that the chain is rigid and resists bending at the scale of the Kuhn length, while manifesting flexibility at larger scales due to the entropic elasticity. Polymers with such a hierarchical property are referred to as semiflexible polymers. One may also define stiff polymers, in which the Kuhn length is comparable to or exceeds the chain length, $l \geq L$. Examples of stiff polymers include actin filaments in cells ($l \simeq 35$ μm and $L \simeq 0.5$–$1$ μm) and DNA fragments of ∼100 bp. On the other hand, long DNA molecules are typical examples of semiflexible polymers.
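The three classes defined above can be summarized as a small decision rule. The sketch below is only illustrative: the function name `classify_polymer` and the numerical threshold used to stand in for "l/a much larger than 1" are assumptions, since the text gives no sharp boundary. The filament thickness used for actin (about 7 nm) is likewise an approximate value supplied here, not taken from the text.

```python
def classify_polymer(kuhn: float, thickness: float, contour: float,
                     stiff_ratio: float = 5.0) -> str:
    """Classify a chain as 'stiff', 'semiflexible', or 'flexible'.

    All lengths must be given in the same unit.
    kuhn      : Kuhn segment length l
    thickness : monomer size (chain thickness) a
    contour   : contour length L
    stiff_ratio is an arbitrary illustrative threshold for "l/a >> 1".
    """
    if kuhn >= contour:
        return "stiff"          # l >= L: the whole chain behaves like a rod
    if kuhn / thickness >= stiff_ratio:
        return "semiflexible"   # l/a >> 1 but L > l
    return "flexible"           # l/a of order 1


# Lengths in nm. For DNA, l = 2 * l_p = 100 nm and a = 2 nm.
examples = {
    "T4 DNA (57 um)": (100.0, 2.0, 57_000.0),
    "100-bp DNA fragment": (100.0, 2.0, 34.0),
    "actin filament (1 um)": (35_000.0, 7.0, 1_000.0),
    "flexible model chain": (2.0, 1.0, 512.0),
}
for name, (l, a, L) in examples.items():
    print(f"{name:24s} -> {classify_polymer(l, a, L)}")
```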

3.3.2 Continuous Transition in Flexible Polymers: Coil-Globule Transition

A basic characteristic of a single polymer is its spatial dimensions, such as the radius of gyration. The average size of the ideal chain is identical to the root mean square displacement of a random walker, $\sqrt{\langle R_{id}^2 \rangle} \simeq l N^{1/2}$ (the bracket indicates ensemble averaging). The conformation of the polymer corresponds to the trajectory of the random walker, called a random coil, which is the origin of the entropic elasticity of polymeric materials. In reality, this conformation is modified depending on the compatibility with the solvent. When the compatibility is high, the solvent is called a good solvent, and the long polymer chain is more swollen due to the repulsive interaction between segments (excluded volume effect). In the opposite case, called the poor solvent regime, the polymer collapses into a compact globule state to minimize the contact with the solvent.

2 Note that the Kuhn length is comparable to the persistence length $l_p$, which is an alternative measure of the chain stiffness (see Sect. 3.2). For DNA (more generally, chains with worm-like elasticity), $l = 2 l_p$.
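As a worked example of the ideal chain size, the sketch below applies the random-walk estimate to T4 DNA, using the contour length (57 μm) and persistence length (50 nm) quoted in Sect. 3.2 and the relation l = 2 l_p from footnote 2. The function name `ideal_coil_size` is hypothetical, and the estimate ignores excluded volume and electrostatic effects, so it should be read as an order-of-magnitude figure only.

```python
import math


def ideal_coil_size(contour: float, kuhn: float) -> float:
    """Root mean square size l * N**0.5 of an ideal chain with N = L / l."""
    n_segments = contour / kuhn
    return kuhn * math.sqrt(n_segments)


contour_t4 = 57.0        # contour length of T4 DNA in um (from the text)
persistence = 0.050      # persistence length in um (from the text)
kuhn = 2.0 * persistence # Kuhn length for a worm-like chain, l = 2 * l_p

coil = ideal_coil_size(contour_t4, kuhn)
capsid = 0.1             # linear dimension of the capsid in um (from the text)

print(f"number of Kuhn segments N = {contour_t4 / kuhn:.0f}")
print(f"ideal coil size           = {coil:.2f} um")
print(f"coil size / capsid size   = {coil / capsid:.0f}")
```

The coil size of a few micrometers is more than an order of magnitude larger than the capsid, which is the sense in which the packed state differs so strongly from the coil.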

At the simplest level, this transformation, that is, the coil-globule transition driven by the change in the solvent quality, can be analyzed by the following free energy equation [20, 21]:

$$\frac{F}{T} \sim \alpha^{2} + \alpha^{-2} + x\alpha^{-3} + y\alpha^{-6}, \tag{3.1}$$

where the swelling ratio $\alpha$ is the ratio of the polymer size $R$ to the ideal chain size, $\alpha^2 \equiv \langle R^2 \rangle / \langle R_{id}^2 \rangle$. The parameters $x = BN^{1/2}/l^3$ and $y = C/l^6$ depend on the second ($B$) and third ($C$) virial coefficients, respectively, and $T$ is the bath temperature (the Boltzmann constant is implicit throughout this chapter). The first two terms arise from the effect of the conformational entropy and the remaining two terms represent the interactions between segments, where the segment density is assumed not to be very high, so that the virial expansion (up to triple interactions) is valid. In usual systems (such as flexible chains), changes in the solvent quality are reflected in the second virial coefficient, where $B > 0$ ($B < 0$) corresponds to a good (poor) solvent and the condition $B = 0$ is called the θ point. If the solvent quality is controlled by the temperature,³ then one can write $B \simeq a l^2 \tau$ around the θ temperature, with the reduced temperature $\tau = (T - \theta)/\theta$, so that $B$ becomes negative below the θ temperature. The equilibrium swelling ratio is obtained via the minimization of (3.1):

$$\alpha^{5} - \alpha = x + y\alpha^{-3}. \tag{3.2}$$
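The minimization step can be written out explicitly. The following short derivation differentiates the right-hand side of (3.1) with respect to α and multiplies the result by α⁴/2. The numerical factors 3/2 and 3 that appear are assumed here to be absorbed into the definitions of x and y, which is consistent with the order-of-magnitude character of (3.1) but is not stated in the text.

```latex
\frac{\partial}{\partial \alpha}
\left( \alpha^{2} + \alpha^{-2} + x\alpha^{-3} + y\alpha^{-6} \right)
  = 2\alpha - 2\alpha^{-3} - 3x\alpha^{-4} - 6y\alpha^{-7} = 0
\quad\Longrightarrow\quad
\alpha^{5} - \alpha = \tfrac{3}{2}\,x + 3\,y\,\alpha^{-3}.
```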

This framework is most appropriate for the coil-globule transition in flexible polymers with $y \simeq 1$.⁴ In Fig. 3.4, we show how the coil-globule transition proceeds with the temperature change for flexible polymers of various lengths. With decreasing temperature, the chain size shrinks gradually, and at some point becomes equal to the ideal chain size ($\alpha = 1$) due to the cancellation of the attractive binary interactions and the repulsive higher-order (in this case, represented by $C$) interactions, which leads to the definition of the apparent transition temperature $T_{tr}$. It is seen that $T_{tr}$ lies slightly below the θ temperature. By substituting $\alpha = 1$ in (3.2), the width of the transition region is obtained as $(\theta - T_{tr})/\theta = N^{-1/2}$, that is, the sharpness of the transition increases with the chain length. A sophisticated mean-field theory predicts that this transition becomes a second-order transition in the limit of infinite chain length [20]. In addition, an analogy with critical phenomena suggests that the coil-globule transition point corresponds to the tricritical point [21]. These results indicate that the global features do not depend on the molecular details, which highlights the universality of the coil-globule transition.

3 This simple case suffices to demonstrate the basic feature in the more general situation, in which the solvent quality can also be controlled by changing the solution composition.

4 The calculation of $C$ for the anisotropic molecule leads to $C \sim a^3 l^3$, thus $y \sim (a/l)^3$.

[Figure 3.4: plot of the swelling ratio α (vertical axis) against the reduced temperature τ (horizontal axis)]

Fig. 3.4. Coil-globule transition in a flexible polymer ($y = 1$) with various lengths. The solid, long-dashed, short-dashed, and dotted curves correspond to the chain lengths $N = 10^4$, $10^3$, $10^2$, and $10$, respectively. The horizontal and vertical dotted lines represent $\alpha = 1$ and $\tau = 0$, respectively
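The curves described for Fig. 3.4 can be reproduced qualitatively by solving (3.2) numerically. The sketch below is a minimal bisection solver under stated assumptions: it is restricted to y ≥ 1/60, where the left-hand side minus the right-hand side of (3.2) is monotonic in α so the root is unique, and it takes x = τ N^{1/2} (a/l) with l/a = 1 by default, which follows from x = B N^{1/2}/l³ with B ≃ a l² τ. The function names and default values are hypothetical.

```python
def swelling_ratio(x: float, y: float = 1.0, tol: float = 1e-12) -> float:
    """Solve alpha**5 - alpha = x + y * alpha**-3 for alpha > 0 by bisection.

    Only valid for y >= 1/60, where the equation has a single root.
    """
    if y < 1.0 / 60.0:
        raise ValueError("multiple roots are possible for y < 1/60")

    def f(alpha: float) -> float:
        return alpha**5 - alpha - x - y / alpha**3

    # Bracket chosen so that f(lo) < 0 and f(hi) > 0 for any x.
    lo = min(0.5, (y / (2.0 * (1.0 + abs(x)))) ** (1.0 / 3.0))
    hi = 10.0 + abs(x) + y
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def alpha_vs_tau(tau: float, n_segments: float, y: float = 1.0,
                 stiffness: float = 1.0) -> float:
    """Swelling ratio at reduced temperature tau for a chain of N segments.

    Uses x = tau * N**0.5 / (l/a); stiffness = l/a is an assumed parameter.
    """
    x = tau * n_segments**0.5 / stiffness
    return swelling_ratio(x, y)


for n in (10, 100, 1000, 10000):
    tau_tr = -n**-0.5  # reduced temperature at which alpha = 1 for y = 1
    row = [alpha_vs_tau(t, n) for t in (-0.5, tau_tr, 0.0, 0.5)]
    print(n, [round(a, 3) for a in row])
```

For each chain length, the second entry of the printed row corresponds to the apparent transition point, where the chain has exactly its ideal size.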

The above analysis implies that the coil-globule transition is essentially a gas–liquid transition within a single chain. Unlike usual molecular gases, the translational entropy is absent due to the chain connectivity, and instead, the conformational entropy shows up. The collapsed state is a spherical droplet, that is, a globule, which minimizes the surface area; its size is self-adjusted to satisfy the mechanical balance between the inside and the outside of the globule.

3.3.3 Discontinuous Transition in Semiflexible Polymers

It is known that the coil-globule transition in flexible polymers is well explained by a theory of the type discussed above [22]. Note that the chain length and the solvent quality enter the theory in the combined form $x = BN^{1/2}/l^3$, which is the only dimensionless parameter governing the transition. The presence of the master curve (see Fig. 3.5 below) implies that the phase behavior in the thermodynamic limit $N \to \infty$ can readily be discussed from measurements on shorter chains via finite-size scaling.

What about semiflexible polymers? It is, in principle, possible to include the effect of the chain stiffness through the parameter $y$ in (3.1). As shown in Fig. 3.5, the dependence of the swelling ratio $\alpha$ on $x$ becomes sharper for smaller values of $y$ (stiffer chains, since $y \sim (a/l)^3$) and develops a metastable loop below the critical value $y_{cri} = 1/60$, which is reminiscent of the van der Waals theory for the gas–liquid transition. Although this feature seems to have an interesting connection with the large discontinuous transition observed in the folding of long DNA molecules, it might be applicable only to an ideal situation with asymptotically long chains. In most practical cases, nontrivial features associated with the finite chain length effect show up. Moreover, anisotropic segments are capable of exhibiting orientational ordering in dense states [20, 23], which implies that a description based on the sole order parameter $\alpha$, that is, the segment density, becomes inadequate. The fact that DNA chains over a rather wide range of lengths form a compact toroid, the size of which is comparable to the Kuhn length in many situations [6, 7], indicates that coarse-graining over the Kuhn length scale may be insufficient. These features make the folding transition in semiflexible polymers much more exotic than the simple coil-globule transition in flexible polymers.

[Figure 3.5: plot of the swelling ratio α (vertical axis) against x (horizontal axis)]

Fig. 3.5. Plots of $\alpha$ as a function of $x$ for various values of $y$ from (3.2). The solid, long-dashed, short-dashed, and dotted curves correspond to the parameters $y = 1$, $0.1$, $1/60$, and $0.005$, respectively
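The critical value 1/60 can be checked directly from (3.2). Writing x as a function of α, x(α) = α⁵ − α − yα⁻³, a loop in the α(x) curve exists exactly when the slope dx/dα = 5α⁴ − 1 + 3yα⁻⁴ becomes negative somewhere. The minimum of this slope is 2(15y)^{1/2} − 1, which changes sign at y = 1/60. The sketch below, with hypothetical function names, evaluates this criterion and cross-checks the closed form against a brute-force scan.

```python
import math


def min_slope(y: float) -> float:
    """Minimum over alpha > 0 of dx/dalpha = 5*alpha**4 - 1 + 3*y*alpha**-4.

    The minimum is reached at alpha**8 = 3*y/5 and equals 2*sqrt(15*y) - 1.
    """
    return 2.0 * math.sqrt(15.0 * y) - 1.0


def has_loop(y: float) -> bool:
    """True if alpha(x) from (3.2) is multivalued, i.e. shows a loop."""
    return min_slope(y) < 0.0


def min_slope_numeric(y: float, lo: float = 0.05, hi: float = 3.0,
                      n: int = 20000) -> float:
    """Brute-force scan of dx/dalpha on a uniform grid (cross-check only)."""
    step = (hi - lo) / n
    alphas = (lo + i * step for i in range(n + 1))
    return min(5.0 * a**4 - 1.0 + 3.0 * y / a**4 for a in alphas)


# Values of y used in Fig. 3.5, except the marginal case y = 1/60 itself.
for y in (1.0, 0.1, 0.005):
    print(f"y = {y:<6}  min slope = {min_slope(y):+.3f}  loop: {has_loop(y)}")
```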

Equilibrium Aspects

Computer simulation is a powerful method for studying the folding transition of semiflexible polymers, in which both intersegment and larger scale degrees of freedom can be treated reliably [24–27]. A suitable model is a sequence of spherical beads connected by bonds, in which the stiffness is controlled by a bending potential that is a function of the angle between adjacent bonds. The solvent quality is tuned by the strength $\varepsilon$ of the short-ranged attractive interaction between beads. An example of the result from a Monte Carlo simulation is shown in Fig. 3.6 (top), in which the gyration radius of the chain with $L/a = 500$ and $l/a \simeq 20$ is plotted as a function of the inverse temperature $\varepsilon/T$.

Fig. 3.6. Dependence of the chain size (gyration radius) on the inverse temperature calculated through Monte Carlo simulations. (Top) A semiflexible chain with contour length $L/a = 512$ and Kuhn length $l/a \simeq 20$, and (bottom) a flexible chain ($l/a \simeq 2$) with the same contour length. The error bars represent the standard deviations. The insets show snapshots of (a) coil states and (b) folded states

The chain size is almost unaffected by the solvent quality until the threshold point, at which the chain is discontinuously folded into the compact state. There is a narrow but finite region of coexistence, in which both the coil and the compact states are observed. The compact state is no longer a spherical globule, but has a toroidal morphology reminiscent of the typical folded product of DNA chains. Neighboring segments inside the toroid exhibit a high orientational ordering, which characterizes the folding of semiflexible polymers as a disorder–order transition. All these features resemble a typical trend in the folding of long DNA molecules revealed by single chain observation, and this strongly indicates that semiflexibility is one of the crucial factors. For comparison, we also show the result from the same Monte Carlo calculation for a flexible chain ($L/a = 500$ and $l/a \simeq 2$) in Fig. 3.6 (bottom). With the decrease in the solvent quality, the chain gradually shrinks into the globule state through the θ point, in accordance with the classical scenario of the coil-globule transition (Sect. 3.3.2).

There are two factors identified as controlling the torus morphology and its size in the poor solvent condition. One is the surface energy, which tends to reduce the surface area, and the other is the bending energy, which prefers straighter conformations. In a compact state, these two factors compete, leading to the torus as the optimum compromise [23, 28–32]. Let us discuss the optimum size of the torus. The torus is characterized by two radii of curvature: the average radius $R$ and the thickness $r$ of the torus (Fig. 3.7). The relevant energy consists of the surface and bending energies:

$$U \simeq \gamma S + \frac{\kappa L}{R^{2}}, \tag{3.3}$$

where $\gamma\,(\simeq \varepsilon/a^2)$ is the surface tension, $S = 4\pi^2 rR$ is the surface area of the torus, and $\kappa = Tl/2$ is the bending modulus.⁵ To discuss the optimum shape of the torus, let us assume that the torus is made up of densely packed segments with parallel alignment. Then, the volume $2\pi^2 r^2 R = \pi a^2 L/4$ does not depend on the torus shape, and one of the variables ($r$ or $R$) can be eliminated. By minimizing (3.3) with respect to the remaining variable, the optimum size of the torus is deduced as (Fig. 3.7)

$$r \sim \left( \frac{\gamma a^{6} L^{2}}{\kappa} \right)^{1/5}, \qquad R \sim \left( \frac{\kappa^{2} L}{\gamma^{2} a^{2}} \right)^{1/5}. \tag{3.4}$$

5 While it may appear that the high curvature near the center hole of the torus would lead to a much higher bending energy, the trace of the chain segment is not necessarily a circle. Rather, the chain can reduce the bending energy by distributing the curvature more uniformly [33].

[Figure 3.7: (left) torus schematic labeled with R and r; (right) plot of R/a and r/a (vertical axis) against L/a (horizontal axis) on logarithmic scales, with curves labeled R (charged), R (neutral), r (neutral), and r (charged)]

Fig. 3.7. (Left) Schematic image of a torus. (Right) Double-logarithmic plot of the torus size, average radius $R$, and thickness $r$ vs. chain length $L$ (3.4) with parameters $\gamma a^2/T = 4$ and $l/a = 15$. Also shown are the results for a charged semiflexible chain (cf. Sect. 3.3.4 and [32] for more details)

Fig. 3.8. Typical snapshots (top and side views) of folded semiflexible polymers with $l/a \simeq 20$ from Monte Carlo simulations. The chain lengths are (a) $L/a = 500$, (b) $L/a = 1{,}000$, and (c) $L/a = 2{,}000$. The dependence of the radius of gyration on the chain length obeys the scaling law $R_g \sim L^{\nu}$ with the exponent $\nu = 0.197 \pm 0.019$ (see [32] for more details)

The mean radius of the torus is rather insensitive to the chain length. Consequently, as the chain length increases, the thickness $r$ of the torus increases more rapidly than the mean radius $R$. Beyond the critical length $L^*$ (obtained from $R(L^*) = r(L^*)$), a hole is no longer formed, and a fat disk is formed instead. These predictions are in reasonable agreement with the results obtained from Monte Carlo simulations (Fig. 3.8).
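The minimization leading to (3.4) and the critical length L* can be sketched numerically. The code below takes (3.3), the surface area, and the volume constraint literally, including their numerical prefactors, which the text treats only at the scaling level; the resulting absolute numbers are therefore illustrative. Units are chosen so that a = 1 and T = 1, with γa²/T = 4 and l/a = 15 as in the caption of Fig. 3.7. All function names are hypothetical.

```python
import math


def torus_energy(R: float, L: float, gamma: float, kappa: float,
                 a: float = 1.0) -> float:
    """U = gamma * S + kappa * L / R**2 with r fixed by the volume constraint."""
    r = a * math.sqrt(L / (8.0 * math.pi * R))  # from 2 pi^2 r^2 R = pi a^2 L / 4
    surface = 4.0 * math.pi**2 * r * R
    return gamma * surface + kappa * L / R**2


def torus_size(L: float, gamma: float, kappa: float, a: float = 1.0):
    """Return (R, r) minimising torus_energy for a chain of contour length L."""
    # With r eliminated, S = c * sqrt(R); dU/dR = 0 gives
    # R**(5/2) = 4 * kappa * L / (gamma * c).
    c = 4.0 * math.pi**2 * a * math.sqrt(L / (8.0 * math.pi))
    R = (4.0 * kappa * L / (gamma * c)) ** 0.4
    r = a * math.sqrt(L / (8.0 * math.pi * R))
    return R, r


def critical_length(gamma: float, kappa: float, a: float = 1.0) -> float:
    """Length L* at which R = r, using R ~ L**(1/5) and r ~ L**(2/5)."""
    R0, r0 = torus_size(1.0, gamma, kappa, a)
    return (R0 / r0) ** 5


gamma = 4.0        # gamma * a^2 / T, from the caption of Fig. 3.7
kappa = 15.0 / 2   # kappa / (T a) = (l/a) / 2 with l/a = 15

for L in (500.0, 1000.0, 2000.0):
    R, r = torus_size(L, gamma, kappa)
    print(f"L/a = {L:6.0f}:  R/a = {R:.2f},  r/a = {r:.2f}")
print(f"L*/a = {critical_length(gamma, kappa):.0f}")
```

Doubling L changes R only by the factor 2^{1/5}, which is the weak length dependence noted in the text, whereas r grows as 2^{2/5}.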

Folding Kinetics

It is interesting to ask about the kinetic aspect of the folding. How does a long fluctuating coil fold into an ordered torus structure upon a decrease in the solvent quality? The discontinuous nature of the transition implies that the folding process would be similar to crystallization from a supersaturated solution, in which nucleation and growth are the typical kinetic processes. Figure 3.9 shows a typical example of the folding process obtained by Brownian dynamics simulations. A semiflexible chain is initially in a good solvent condition (leftmost snapshot). After the quench, the chain remains in a coil state for a while. During this metastable period, pairs of monomeric units stick to each other for a short time owing to the effective attractive interactions in the course of thermal fluctuation. However, such pairs soon break and separate. When a large enough doughnut-shaped nucleus (critical nucleus) is formed at some point, the remaining coil part is pulled into the nucleus in sequence, and finally the torus structure is formed. The critical nucleus is most likely to be created at the chain end, reflecting the large motional freedom there. The typical characteristic of torus formation is the almost constant speed of the growth process, reflecting the quasi-one-dimensional nature of the polymer chain.⁶

[Figure 3.9: image sequence along a time axis; panel labels 6 sec, 1.5 sec, 1.5 sec; scale bar 10 µm]

Fig. 3.9. Dynamical process of the folding of a semiflexible polymer with contour length $L/a = 512$ and Kuhn length $l/a \simeq 20$. (Top) Snapshots obtained through Brownian dynamics simulations, and (bottom) the fluorescence intensity profile of the T4 DNA during the folding and corresponding schematic pictures (see [26] and [34] for more details)

Note that not only the torus but also rod-shaped products are frequently formed, although these rod structures have slightly higher energies than the torus, and so are metastable under the conditions investigated here. Close inspection of the folding process indicates that the final structure is largely determined at the nucleation stage, that is, a rod-shaped nucleus is more easily formed than a doughnut-shaped nucleus, resulting in a rather high probability of metastable rod formation. These results demonstrate the crucial importance of the pathway in the free energy landscape in semiflexible chain folding.

6 For a more rigorous argument, the finite-size effect in the torus state (surface, bending energies, etc.) and the dissipation involved in the process should be correctly characterized. It is worthwhile to point out the similarity between the growth process (sucking the coil part into the nucleus) and the dynamics of polymer translocation (sucking the coil part into the localized hole). For the latter, a lucid theoretical description has recently been proposed [35].


Fig. 3.10. Typical snapshot of the core-shell structure formed from a long semiflexible chain with $L/a = 2400$ and $l/a \simeq 20$ obtained through Monte Carlo simulations

Core-Shell Structure in Long Chains

So far, we have observed unique characteristics in the folding of semiflexible polymers, which are mostly associated with torus formation in the compact state. Although the chains studied were long enough to reveal the semiflexibility, the number $N = L/l$ of statistically independent segments was very small (on the order of ∼10). The torus is indeed the product of a chain of finite length, as discussed earlier, and several distinctive features would be expected for the folding of longer semiflexible chains.

A recent study has demonstrated that a long semiflexible polymer may assume a partially folded state, in which a dense core is surrounded by a disperse fringe, under a moderately poor solvent condition [36] (Fig. 3.10). Inside the core, the segment density is rather high, and there is a weak orientational ordering. Upon further quenching, this core-shell structure will be transformed into a more ordered, completely folded state, such as a torus or a disk structure. Therefore, long semiflexible polymers exhibit multiple-step folding transitions.

3.3.4 Instability Due to the Remanent Charge

In Sect. 3.3.3, we have examined the impact of the chain stiffness on the folding transition. Several aspects of the DNA higher-order transition can be discussed in terms of the semiflexible chain model. However, experiments also present different situations, which do not seem to be explained by the stiffness effect alone. In this section, we shall deal with another important effect arising from the polyelectrolyte nature of DNA molecules. One of the central issues here is the origin of the attractive force between like-charged segments as the driving force of the folding. However, our stance here is to investigate the large-scale conformation of the polymer, given some effective interaction, as mentioned in Sect. 3.2.2. The primary difference from the neutral chain case lies in the electrostatic self-energy of the structure due to the possibly incomplete charge compensation. This may have a crucial effect on the manner of folding in both flexible and semiflexible polymers.

Rayleigh Instability

Given a constant volume $\sim R^3$, the shape with minimum surface area is a sphere of radius $\sim R$. Therefore, a liquid drop usually takes a spherical shape to minimize the surface energy to $\sim \gamma R^2$. Now imagine that an electric charge $Q$ is accumulated in the droplet, which creates an electrostatic self-energy $\sim T l_B (Q/e)^2/R$.⁷ When the charge exceeds the critical value $Q_{cr} = e\,(\gamma R^3/(T l_B))^{1/2}$, at which the two energies become comparable, the spherical drop becomes locally unstable and will spontaneously deform. This is called the Rayleigh instability, and the equilibrium state is a set of smaller droplets, each carrying a charge lower than the critical value, which are infinitely separated from each other [37]. The same instability occurs for a charged globule made from a flexible polyelectrolyte, but the final equilibrium state is now a set of smaller globules connected by narrow strings due to the connectivity of the chain. This pearl-necklace globule was first predicted based on a scaling argument [38] and was validated by subsequent extensive computer simulations [39, 40].
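The critical charge follows from balancing the surface energy ∼γR² against the electrostatic self-energy, and the sketch below evaluates the resulting scaling expression. It is an order-of-magnitude illustration only: numerical prefactors are dropped, the function name is hypothetical, and the example value of γ/T is an arbitrary assumption. The Bjerrum length of about 0.7 nm is the commonly quoted value for water at room temperature.

```python
import math


def critical_charge(gamma_over_T: float, radius: float,
                    bjerrum: float) -> float:
    """Critical charge Q_cr / e = sqrt(gamma * R**3 / (T * l_B)).

    gamma_over_T : surface tension divided by T, in 1/length**2
    radius       : droplet (globule) radius R
    bjerrum      : Bjerrum length l_B, same length unit as radius
    Scaling estimate only; prefactors of order one are omitted.
    """
    return math.sqrt(gamma_over_T * radius**3 / bjerrum)


l_B = 0.7          # Bjerrum length in nm (water, room temperature)
gamma_over_T = 1.0 # hypothetical surface tension in units of T per nm^2

for R in (5.0, 10.0, 20.0, 40.0):
    q = critical_charge(gamma_over_T, R, l_B)
    print(f"R = {R:5.1f} nm  ->  Q_cr ~ {q:8.1f} e")
```

Because Q_cr grows as R^{3/2}, more slowly than the volume R³, a globule with a fixed charge per unit volume will always exceed the critical charge once it becomes large enough.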

Rings-on-a-String Conformation in Semiflexible Polyelectrolytes

Let us start with a recent experimental observation [18, 19] summarized in Fig. 3.11. Here, T4 DNA molecules are folded by a gemini (dimeric) surfactant as a condensing agent. Fluorescence microscope (FM) observations show the appearance of the partially folded structure as a stable state in a certain range of surfactant concentration. This is an example of the stepwise folding transition through intrachain segregation, cf. Fig. 3.2 (bottom). Atomic force microscopy (AFM) has clearly revealed the fine structure in which several tori are interconnected by strings, that is, a single DNA molecule takes a rings-on-a-string structure.

How can we explain this phenomenon? The preceding sections have identified several mechanisms that control the size and the morphology of folded polymers. Surface tension is always important and is responsible for the spherical morphology of the flexible polymer globule. The size of the globule is determined by the condition of mechanical equilibrium between the inside of the globule and the outer solution. For semiflexible polymers of moderate length, the bending stress favors the torus morphology, the size of which is determined by the balance between the surface and the bending energies. If flexible polymers are charged, the globule may split due to the Rayleigh instability, and the pearl-necklace conformation appears as a result of the competition between the surface and the electrostatic energies. The natural question, then, is what is expected for the folding transition of semiflexible polyelectrolytes?

7 The length $l_B = e^2/\varepsilon T$ is called the Bjerrum length, which corresponds to the distance at which the electrostatic energy between two unit charges in a medium of (effective) dielectric constant $\varepsilon$ becomes equal to the thermal energy.

Fig. 3.11. Folding of T4 DNA by the addition of the gemini surfactant. Distributions of the long-axis length of T4 DNA at different concentrations [cs] of the surfactant. Coil, partially folded, and completely folded states are distinguished by the different colorings. Also shown are FM and AFM images with the corresponding schematic representation of the partially folded state ([cs] = 0.2 μM) and completely folded state ([cs] = 1.0 μM). The FM and AFM observations are of the same DNA molecules attached to a mica surface. A rings-on-a-string structure is clearly seen for the partially folded DNA, while the completely folded DNA assumes a network structure composed of many fused rings (see [19] for more details)

The rings-on-a-string structure is characterized by the coexistence of ordered domains (torus) and disordered domains (coil), and is thus regarded as microphase segregation within a single chain. Since the generation of an ordered folded structure from a semiflexible chain can be considered to be a kind of crystal growth (Sect. 3.3.3), the appearance of such intrachain segregated structures is somewhat counterintuitive. In the simulation of the folding of a single semiflexible chain, in which the process of torus nucleation and growth is clearly observed, a partially folded structure with a growing torus is only transient and is never stable [26].

One may naively suppose that this phenomenon is caused by Rayleigh instability, that is, a single torus may split upon the accumulation of the charge.


However, it is not immediately obvious that this mechanism is responsible for the rings-on-a-string structures observed for DNA in solution with a moderate concentration of monovalent salt. In fact, a simple energetic consideration suggests the following unique characteristic of the charged torus [32]. At a given segment density, a torus is characterized by two characteristic radii of curvature, that is, the ring radius R and the ring thickness r, and therefore possesses a greater degree of freedom than a spherical globule, which is solely characterized by the radius or, equivalently, by the number of segments inside the globule. This additional freedom provides an escape pathway, which allows the torus to grow without accumulating electrostatic self-energy; that is, unlike a spherical globule, a torus does not necessarily split upon charging. In other words, the electrostatic self-energy limits the ring thickness, but not the ring radius. Thus, the ground state of the charged torus is characterized by a thin ring, the radius of which rapidly increases with the chain length L (Fig. 3.7).

Let us briefly discuss a possible alternative scenario, which has been proposed based on the consideration of the unique characteristics of the charged torus and the crucial role of the combinatorial entropy of the segment state distribution along the chain [41]. The free energy F(N) of a folded polymer with N segments is generally written in the following form:

\[ F(N) = F_b(N) + \Delta F(N), \tag{3.5} \]

where the first term $F_b(N) \sim N$ is the bulk term and the second term represents the nonextensive part. For a globule of neutral flexible polymers, this comes from the surface energy $\Delta F(N) \sim N^{2/3}$, and for a neutral torus formed by semiflexible polymers, the minimization of (3.3) leads to $\Delta F(N) \sim N^{3/5}$. Therefore, splitting into two parts is forbidden by the high energetic penalty: $F(N) < F(N_1) + F(N_2)$ (with $N = N_1 + N_2$; see footnote 8). On the other hand, if the residual charge inside the torus limits its thickness, splitting does not alter the total volume and the surface area of the object. Thus, the only contribution to the nonextensive part of the free energy arises from the bending energy, which can be evaluated as $\Delta F(N) \sim N^{-1}$ from (3.3). The energetic cost for the splitting is then very low, in particular for a long chain; therefore, a multiple-tori structure (Fig. 3.11 (right)) may appear as an entropically stabilized state, reflecting the increase in the possible number of states (see footnote 9).
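The scaling argument above can be checked numerically. The following is a minimal sketch, assuming unit prefactors for the nonextensive term, so that only the exponent matters; the function name `splitting_penalty` is hypothetical and introduced only for illustration. Because the bulk term is linear in N, it cancels in the difference F(N1) + F(N2) − F(N), and only the nonextensive part remains.

```python
def splitting_penalty(n, exponent, n1=None):
    """Nonextensive cost of splitting N segments into N1 + N2.

    Assumes dF(N) = N**exponent with a unit prefactor (an assumption made
    for illustration). The bulk term F_b(N) ~ N is linear and cancels.
    """
    n1 = n / 2 if n1 is None else n1  # default: split into two equal parts
    n2 = n - n1

    def delta_f(m):
        return m ** exponent

    return delta_f(n1) + delta_f(n2) - delta_f(n)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(
            n,
            splitting_penalty(n, 2 / 3),  # neutral flexible globule (surface energy)
            splitting_penalty(n, 3 / 5),  # neutral torus
            splitting_penalty(n, -1),     # charged torus with limited thickness
        )
```

With exponents 2/3 and 3/5 the penalty grows with N, whereas with exponent −1 it decays as 3/N for an equal split, which is the sense in which splitting becomes cheap for long chains.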

A simple model calculation of the folding transition in line with the above analysis has demonstrated that the degree of the remanent charge inside the folded part is a crucial factor for the transition manner (Fig. 3.12). If the folded part is completely neutralized by oppositely charged low molecular solvents, then the scenario developed for neutral semiflexible polymers can be applied.

8 There is an additional penalty associated with the “boundary” between two parts, which may be regarded as a defect.

9 In “low temperature” states like this, a kinetic effect would also be important for the generation of multiple tori structures. (See [32] for more details.)


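[Fig. 3.12 plot area: two panels, each with the segment number N (0 to 1200) on the horizontal axis and the condensing-agent concentration cg on the vertical axis (in units of 10^-2 in one panel and 10^-3 in the other). Regions are labeled coil, fully folded, and rings-on-a-string.]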

Fig. 3.12. Diagrams of the folding transition of semiflexible polyelectrolytes (l/a = 20) by the addition of the condensing agent in a plane of concentration of the condensing agent cg and the segment number N = L/l. (Left) An all-or-none transition from coil to fully folded torus is observed for the case of almost complete charge neutralization (degree of the remanent charge $\alpha = 10^{-4}$). (Right) Rings-on-a-string structures emerge for the folding of long chains due to the presence of the remanent charge ($\alpha = 0.2$) (see [41] for more details)

On the other hand, the presence of the remanent charge may have a qualitative effect. At the onset of the folding, the chain may be discontinuously folded into the rings-on-a-string structure. As can be easily guessed from the earlier discussion, this structure is stabilized by the large number of ways in which tori and coils can be arranged along the chain. Reflecting the finite size of the system, structures with different numbers of rings coexist in the intrachain segregated state. As the solvent quality decreases, the probability distribution changes and finally the completely folded state composed of many fused mini rings is reached.

The essential requisite for the present scenario is the unique property of the charged torus, that is, its instability against thickening beyond a certain size. Therefore, its applicability is not limited to the case in which the torus thickness is limited by the electrostatic mechanism. For example, we expect that surfactant molecules, which are sometimes used as condensing agents, may affect such a structural property through the packing inside folded structures. Note also that finite-sized bundles are rather ubiquitous in other semiflexible polyelectrolyte systems and biopolymer solutions. Exploring the consequences of this observation remains an open problem.

3.4 Summary and Perspectives

Reliable control of the higher-order structures of DNA is in high demand for various problems, ranging from the biological sciences and nanosciences to medical applications. We have seen that, on the mesoscopic length scale, a DNA


molecule can be reasonably well described by a simple polymer model with uniform physical properties. It has become evident that this simple polymer is capable of exhibiting much richer conformational transitions than naively expected. One striking example is the phenomenon of intrachain segregation. We have discussed one of the possible scenarios in Sect. 3.3.4. However, other scenarios are also conceivable under different experimental conditions. A related topic is the appearance of the core-shell structure in long semiflexible polymers discussed in Sect. 3.3.3, which requires further investigation.

There are many open questions and various future directions, both fundamental and application-oriented. We close this chapter by adding two comments, which we regard as fundamental from the biological point of view.

3.4.1 Higher-order Structure and Genetic Activity

One of the most interesting questions is the relationship with the genetic activity. Although it is known that the part of chromatin in the genetically active state is somewhat relaxed, this phenomenon has not been discussed from the viewpoint of the material properties inherent in long DNA chains. A recent in vitro study reported that the transcriptional activity of long DNA molecules (40 kbp containing one gene) can be abruptly switched off at the critical concentration of the added condensing agents [42]. Importantly, this inhibition is shown to be directly correlated with the all-or-none discontinuous folding transition [43]. On the other hand, under the same conditions, a system composed of short fragments of DNA on the order of the persistence length does not show such an on/off switching of the transcriptional activity. Here, it should be noted that fragmented DNA molecules are used in standard biochemical and molecular biology experiments because long DNA molecules are difficult to handle; in such experiments, the correlation between the higher-order structure of DNA and its function is lost.

What can be expected for longer DNA molecules with multiple genes? The typical domain size involved in the intrachain segregation is on the order of several dozen kilo base pairs. This implies that several dozen genes can be simultaneously switched off by the formation of one segregated domain through partial folding. Such a higher-order transition can be induced by a slight change in the environmental condition and may provide global control of the accessibility of regulatory proteins [44].

3.4.2 Toward Chromatin Structure

In eucaryotic cells, high compaction of genomic DNA is achieved by the complexation with cationic structural proteins called histones [45]. DNA first wraps around the histone to form a basic unit known as the nucleosome, which is


organized into the hierarchical chromatin structure. The structure and function of chromatin have been actively studied on a molecular level. For example, chemical modifications such as acetylation and methylation of the histone tail are known to greatly affect the chromatin activity. However, the higher-order structure has not yet been clarified. Although it is widely recognized that chromatin functions by utilizing various specific mechanisms, here again, one should also acknowledge the important role of general (most importantly electrostatic) interactions. In this direction, let us introduce a recent study, which focuses on a simple model system composed of T4 DNA and cationic nanosized particles instead of histones [46] (Fig. 3.13). The constructed chromatin analogue has no specific DNA–histone interactions and is therefore governed by general interactions only, the properties of which are controllable with comparative ease. The result shows that the global structure is indeed controlled by apparent physical parameters, such as the size and/or charge and the concentration of the nanoparticles, and the ambient salt concentration. In particular, under suitable conditions with the regular wrapping mode, structures reminiscent of real chromatin are obtained. The correlation between the higher-order structures and the transcriptional

Fig. 3.13. (Top) An electron micrograph of an artificial chromatin model composed of T4 DNA and cationic nanoparticles of diameter ∼15 nm. (Bottom) Typical snapshots of a model DNA (semiflexible polyelectrolyte) complexed with cationic nanoparticles. At low salt concentration (Debye screening length rD/a = 1), a beads-on-a-string nucleosome-like structure is observed (left), while locally segregated clusters are formed at higher salt concentrations (rD/a = 0.3) (right) (see [46] for more details)


activity has also been examined, providing a useful insight for the gene delivery application as well as the function of real chromatin [47].

It is likely that cells utilize, in the course of their functioning, the physico-chemical properties inherent in genomic DNA molecules, which are highly charged, locally stiff, and very long. Unveiling the higher-order structure and its relation to the function of DNA and chromatin remains a task for future work.

References

1. B. Alberts et al., Molecular Biology of the Cell, 3rd edn. (Garland, New York, 1994)
2. A. Arkin, J. Ross, H.H. McAdams, Genetics 149, 1633 (1998)
3. C.V. Rao, D.M. Wolf, A.P. Arkin, Nature (London) 420, 231 (2002); 421, 190E (2003)
4. J.M. Raser, E.K. O’Shea, Science 309, 2010 (2005)
5. A.A. Kornyshev, D.J. Lee, S. Leikin, A. Wynveen, Rev. Mod. Phys. 79, 943 (2007)
6. V.A. Bloomfield, Biopolymers 31, 1471 (1991)
7. V.A. Bloomfield, Curr. Opin. Struct. Biol. 6, 334 (1996)
8. J. Widom, R.L. Baldwin, Biopolymers 22, 1595 (1983)
9. W.M. Gelbart, R.F. Bruinsma, P.A. Pincus, V.A. Parsegian, Phys. Today 53, 38 (2000)
10. K. Yoshikawa, M. Takahashi, V.V. Vasilevskaya, A.R. Khokhlov, Phys. Rev. Lett. 76, 3029 (1996)
11. M. Takahashi, K. Yoshikawa, V.V. Vasilevskaya, A.R. Khokhlov, J. Phys. Chem. B 101, 9396 (1997)
12. K. Yoshikawa, Y. Yoshikawa, in Pharmaceutical Perspectives of Nucleic Acid-Based Therapeutics, ed. by R.I. Mahato, S.W. Kim (Taylor & Francis, London, 2002)
13. S.G. Starodubsev, K. Yoshikawa, J. Phys. Chem. 100, 19702 (1996)
14. M. Ueda, K. Yoshikawa, Phys. Rev. Lett. 77, 2133 (1996)
15. K. Yoshikawa, Y. Yoshikawa, Y. Koyama, T. Kanbe, J. Am. Chem. Soc. 119, 6473 (1997)
16. S. Takagi, K. Tsumoto, K. Yoshikawa, J. Chem. Phys. 114, 6942 (2001)
17. Y. Yoshikawa, Yu.S. Velichko, Y. Ichiba, K. Yoshikawa, Eur. J. Biochem. 268, 2593 (2001)
18. A.A. Zinchenko, V.G. Sergeyev, S. Murata, K. Yoshikawa, J. Am. Chem. Soc. 125, 4414 (2003)
19. N. Miyazawa, T. Sakaue, K. Yoshikawa, R. Zana, J. Chem. Phys. 122, 044902 (2005)
20. A.Yu. Grosberg, A.R. Khokhlov, Statistical Physics of Macromolecules (American Institute of Physics, New York, 1994)
21. P.-G. de Gennes, Scaling Concepts in Polymer Physics (Cornell University Press, Ithaca, 1979)
22. G. Swislow, S. Sun, I. Nishio, T. Tanaka, Phys. Rev. Lett. 44, 796 (1980)
23. A.Yu. Grosberg, A.R. Khokhlov, Adv. Polym. Sci. 41, 53 (1981)
24. H. Noguchi, K. Yoshikawa, J. Chem. Phys. 109, 5070 (1998)
25. V.A. Ivanov, W. Paul, K. Binder, J. Chem. Phys. 109, 5659 (1998)
26. T. Sakaue, K. Yoshikawa, J. Chem. Phys. 117, 6323 (2002)
27. M.R. Stukan, V.A. Ivanov, A.Yu. Grosberg, W. Paul, K. Binder, J. Chem. Phys. 118, 3392 (2003)
28. A.Yu. Grosberg, Biofizika (USSR) 24, 32 (1979)
29. A.Yu. Grosberg, A.V. Zhestkov, J. Biomol. Struct. Dyn. 3, 859 (1986)
30. J. Ubbink, T. Odijk, Europhys. Lett. 33, 353 (1996)
31. V.V. Vasilevskaya, A.R. Khokhlov, S. Kidoaki, K. Yoshikawa, Biopolymers 41, 51 (1997)
32. T. Sakaue, J. Chem. Phys. 120, 6299 (2004)
33. N.V. Hud, K.H. Downing, R. Balhorn, Proc. Natl. Acad. Sci. USA 92, 3581 (1995)
34. Y. Matsuzawa, Y. Yonezawa, K. Yoshikawa, Biochem. Biophys. Res. Commun. 225, 796 (1996)
35. T. Sakaue, Phys. Rev. E 76, 021803 (2007)
36. Y. Higuchi, T. Sakaue, K. Yoshikawa, Chem. Phys. Lett. 461, 42 (2008)
37. L. Rayleigh, Philos. Mag. 14, 184 (1882)
38. A.V. Dobrynin, M. Rubinstein, S.P. Obukhov, Macromolecules 29, 2974 (1996)
39. A.V. Lyulin, B. Dünweg, O.V. Borisov, A.A. Darinskii, Macromolecules 32, 3264 (1999)
40. H.J. Limbach, C. Holm, K. Kremer, Europhys. Lett. 49, 189 (2000)
41. T. Sakaue, K. Yoshikawa, J. Chem. Phys. 125, 074904 (2006)
42. K. Tsumoto, L. Francois, K. Yoshikawa, Biophys. Chem. 106, 23 (2003)
43. A. Yamada, K. Kubo, T. Nakai, K. Tsumoto, K. Yoshikawa, Appl. Phys. Lett. 86, 223901 (2005)
44. K. Yoshikawa, J. Biol. Phys. 28, 701 (2002)
45. A.P. Wolffe, Chromatin: Structure and Function (Academic Press, New York, 1998)
46. A.A. Zinchenko, T. Sakaue, S. Araki, K. Yoshikawa, D. Baigl, J. Phys. Chem. B 111, 3019 (2007)
47. A.A. Zinchenko, L. Francois, K. Yoshikawa, Biophys. J. 92, 1318 (2007)


4 Generalized-Ensemble Algorithms for Studying Protein Folding

Y. Okamoto

Abstract. Conventional simulations of biomolecular systems tend to get trapped in states of local-minimum energy. A simulation in a generalized ensemble overcomes this difficulty by performing a random walk in potential energy space and other parameter space. From only one simulation run, one can obtain accurate canonical-ensemble averages of physical quantities as functions of temperature and other parameters of the system by the single-histogram and/or multiple-histogram reweighting techniques. In this article, we review the generalized-ensemble algorithms. Two well-known methods, namely, the multicanonical algorithm and the replica-exchange method, are described first. Both Monte Carlo and molecular dynamics versions of the algorithms are given. We then present further extensions of the above two methods.

4.1 Introduction

Canonical fixed-temperature simulations of complex systems such as biomolecules are greatly hampered by the multiple-minima problem. Because simulations at low temperatures tend to get trapped in a few of the huge number of local-minimum-energy states, which are separated by high energy barriers, it is very difficult to obtain accurate canonical distributions at low temperatures by conventional Monte Carlo (MC) and molecular dynamics (MD) methods. One way to overcome this multiple-minima problem is to perform a simulation in a generalized ensemble where each state is weighted by an artificial, non-Boltzmann probability weight factor so that a random walk in potential energy space may be realized. This class of simulation methods is referred to as the generalized-ensemble algorithms (for reviews see, e.g., [1–7]). The random walk allows the simulation to escape from any energy barrier and to sample a much wider conformational space than conventional methods. By monitoring the energy in a single simulation run, one can obtain not only the global-minimum-energy state but also canonical-ensemble averages as functions of temperature by the single-histogram [8] or multiple-histogram [9, 10] reweighting techniques (an extension of the multiple-histogram method is also referred to as the weighted histogram analysis method (WHAM) [10]).


One of the most well-known generalized-ensemble methods is perhaps the multicanonical algorithm (MUCA) [11, 12] (for reviews see, e.g., [13, 14]). (The method is also referred to as entropic sampling [15] and adaptive umbrella sampling [16] of the potential energy [17]. MUCA can also be considered as a sophisticated, ideal realization of a class of algorithms called umbrella sampling [18]. Closely related methods are the transition matrix methods reviewed in [19] and the Wang-Landau method [20, 21], which is also referred to as density of states Monte Carlo [22]. See also [23].) MUCA and its generalizations have been applied to spin systems (see, e.g., [24–29]). MUCA was also introduced to the molecular simulation field [30]. Since then, MUCA and its generalizations have been extensively used in many applications in protein and related systems [31–65]. A molecular dynamics version of MUCA has also been developed [17, 38, 42] (see also [38, 66] for a Langevin dynamics version). MUCA has been extended so that flat distributions in other parameters instead of potential energy may be obtained (see, e.g., [25, 26, 37, 43, 45, 60, 64]). This can be considered as a special case of the multidimensional (or multivariable) extensions of MUCA, where a multidimensional random walk in potential energy space and in other parameter space is realized (see, e.g., [37, 43, 44, 62, 65]). In this article, we present just one such method, namely, the multibaric-multithermal algorithm, where a two-dimensional random walk in both potential energy space and volume space is realized [62, 63].

The multicanonical algorithms are powerful, but the probability weight factors are not known a priori and have to be determined by iterations of short trial simulations. This process can be nontrivial and very tedious for complex systems with many degrees of freedom.

In the replica-exchange method (REM) [67–69], the difficulty of weight factor determination is greatly alleviated. (A closely related method was independently developed in [70]. Similar methods in which the same equations are used but emphasis is laid on optimizations have been developed [71, 72]. REM is also referred to as the multiple Markov chain method [73] and parallel tempering [74]. Details of the literature about REM and related algorithms can be found in recent reviews [2, 75].) In this method, a number of noninteracting copies (or replicas) of the original system at different temperatures are simulated independently and simultaneously by the conventional MC or MD method. Every few steps, pairs of replicas are exchanged with a specified transition probability. The weight factor is just the product of Boltzmann factors, and so it is essentially known.
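The exchange step can be illustrated with a short sketch. This introduction does not spell out the transition probability, so the code below assumes the standard Metropolis-type criterion that follows from the product of Boltzmann factors, namely min(1, exp[(β_i − β_j)(E_i − E_j)]) for swapping the configurations held at inverse temperatures β_i and β_j. The function names are hypothetical.

```python
import math
import random


def exchange_probability(beta_i, beta_j, energy_i, energy_j):
    """Probability of swapping the configurations held at beta_i and beta_j.

    Assumed standard form: the ratio of the product of Boltzmann factors
    after and before the swap, capped at one.
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchanges(betas, energies, rng=random):
    """Try to swap each neighbouring temperature pair once.

    energies[k] is the potential energy of the configuration currently
    simulated at betas[k]; a new list with the swaps applied is returned.
    """
    energies = list(energies)
    for k in range(len(betas) - 1):
        p = exchange_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies
```

In a real simulation the configurations (or, equivalently, the temperatures) would be swapped together with the energies; only the bookkeeping of energies is shown here.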

REM has already been used in many applications in protein systems [76–91]. Other molecular simulation fields have also been studied by this method in various ensembles [92–96]. Moreover, REM was applied to cluster studies in the quantum chemistry field [97]. The details of the molecular dynamics algorithm have been worked out for REM in [77]. This led to a wide application of REM in protein folding and related problems (see, e.g., [98–115]).

However, REM also has a computational difficulty: As the number of degrees of freedom of the system increases, the required number of replicas also


greatly increases, whereas only a single replica is simulated in MUCA. This demands a lot of computer power for complex systems. Our solution to this problem is to use REM for the determination of the weight factor of MUCA, which is much simpler than previous iterative methods of weight determination, and then perform a long MUCA production run. The method is referred to as the replica-exchange multicanonical algorithm (REMUCA) [82, 87, 88]. In REMUCA, a short replica-exchange simulation is performed, and the multicanonical weight factor is determined by the multiple-histogram reweighting techniques [9, 10].

Finally, one is naturally led to a multidimensional (or multivariable) extension of REM, which we refer to as the multidimensional replica-exchange method (MREM) [80]. (The method is also referred to as generalized parallel sampling [116], Hamiltonian replica-exchange method [86], and Model Hopping [117].) A special realization of MREM is replica-exchange umbrella sampling (REUS) [80], and it is particularly useful in free energy calculations (see also [81] for a similar idea). In this article, we present just one such method, namely, the replica-exchange method in the isobaric-isothermal ensemble, where not only temperature values but also pressure values are exchanged in the replica-exchange processes [3, 94, 96, 104, 105]. (The results of the first such application of the two-dimensional replica-exchange simulations in the isobaric-isothermal ensemble were presented in [3].) This approach is complementary to the multibaric-multithermal algorithm above.

In this article, we describe the generalized-ensemble algorithms mentioned earlier. Namely, we first review the two familiar methods: MUCA and REM. We then describe multidimensional extensions of these methods. Examples of the results by some of these algorithms are then presented.

4.2 Generalized-Ensemble Algorithms

4.2.1 Multicanonical Algorithm

Let us consider a system of N atoms of mass $m_k$ ($k = 1, \ldots, N$) with their coordinate vectors and momentum vectors denoted by $q \equiv \{\boldsymbol{q}_1, \ldots, \boldsymbol{q}_N\}$ and $p \equiv \{\boldsymbol{p}_1, \ldots, \boldsymbol{p}_N\}$, respectively. The Hamiltonian $H(q, p)$ of the system is the sum of the kinetic energy $K(p)$ and the potential energy $E(q)$:

\[ H(q, p) = K(p) + E(q), \tag{4.1} \]

where

\[ K(p) = \sum_{k=1}^{N} \frac{\boldsymbol{p}_k^{2}}{2 m_k}. \tag{4.2} \]

In the canonical ensemble at temperature T, each state $x \equiv (q, p)$ with the Hamiltonian $H(q, p)$ is weighted by the Boltzmann factor


\[ W_{\mathrm{B}}(x; T) = \exp\left(-\beta H(q, p)\right), \tag{4.3} \]

where the inverse temperature $\beta$ is defined by $\beta = 1/k_{\mathrm{B}}T$ ($k_{\mathrm{B}}$ is the Boltzmann constant). The average kinetic energy at temperature T is then given by

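\[ \langle K(p) \rangle_T = \left\langle \sum_{k=1}^{N} \frac{\boldsymbol{p}_k^{2}}{2 m_k} \right\rangle_T = \frac{3}{2} N k_{\mathrm{B}} T. \tag{4.4} \]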

Because the coordinates q and momenta p are decoupled in (4.1), we can suppress the kinetic energy part and can write the Boltzmann factor as

\[ W_{\mathrm{B}}(x; T) = W_{\mathrm{B}}(E; T) = \exp(-\beta E). \tag{4.5} \]

The canonical probability distribution of potential energy $P_{\mathrm{NVT}}(E; T)$ is then given by the product of the density of states $n(E)$ and the Boltzmann weight factor $W_{\mathrm{B}}(E; T)$:

\[ P_{\mathrm{NVT}}(E; T) \propto n(E)\, W_{\mathrm{B}}(E; T). \tag{4.6} \]

Because $n(E)$ is a rapidly increasing function and the Boltzmann factor decreases exponentially, the canonical ensemble yields a bell-shaped distribution, which has a maximum around the average energy at temperature T. The conventional MC or MD simulations at constant temperature are expected to yield $P_{\mathrm{NVT}}(E; T)$. An MC simulation based on the Metropolis algorithm [118] is performed with the following transition probability from a state x of potential energy E to a state x′ of potential energy E′:

\[ w(x \to x') = \min\left(1, \frac{W_{\mathrm{B}}(E'; T)}{W_{\mathrm{B}}(E; T)}\right) = \min\left(1, \exp(-\beta \Delta E)\right), \tag{4.7} \]

where

\[ \Delta E = E' - E. \tag{4.8} \]
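The acceptance rule (4.7) can be sketched in a few lines. The example below is a minimal illustration, assuming a hypothetical one-dimensional double-well potential and units in which $k_{\mathrm{B}} = 1$; the function names and the toy potential are not from the text.

```python
import math
import random


def metropolis_accept(delta_e, beta, rng=random):
    """Accept a trial move with probability min(1, exp(-beta * delta_e)), as in (4.7)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e)


def double_well(x):
    """Hypothetical toy potential with minima at x = -1 and x = +1 and a barrier of height 1."""
    return (x * x - 1.0) ** 2


def run_canonical_mc(beta, n_steps, step=0.5, x0=-1.0, seed=0):
    """Canonical Metropolis sampling of the toy potential; returns the list of visited x."""
    rng = random.Random(seed)
    x, e = x0, double_well(x0)
    trajectory = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # symmetric trial move
        e_new = double_well(x_new)
        if metropolis_accept(e_new - e, beta, rng):
            x, e = x_new, e_new
        trajectory.append(x)
    return trajectory
```

At a low temperature such as `beta=50.0` the walker started at x = −1 stays in the left well, which is a small-scale picture of the multiple-minima problem; at `beta=1.0` it crosses the barrier readily.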

An MD simulation, on the other hand, is based on the following Newton equations of motion:

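\[ \dot{\boldsymbol{q}}_k = \frac{\boldsymbol{p}_k}{m_k}, \tag{4.9} \]

\[ \dot{\boldsymbol{p}}_k = -\frac{\partial E}{\partial \boldsymbol{q}_k} = \boldsymbol{f}_k, \tag{4.10} \]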

where $\boldsymbol{f}_k$ is the force acting on the kth atom ($k = 1, \ldots, N$). This set of equations actually yields the microcanonical ensemble, and we have to add a thermostat to obtain the canonical ensemble at temperature T. Here, we just follow Nosé’s prescription [119, 120], and we have

\[ \dot{\boldsymbol{q}}_k = \frac{\boldsymbol{p}_k}{m_k}, \tag{4.11} \]

\[ \dot{\boldsymbol{p}}_k = -\frac{\partial E}{\partial \boldsymbol{q}_k} - \frac{\dot{s}}{s}\, \boldsymbol{p}_k = \boldsymbol{f}_k - \frac{\dot{s}}{s}\, \boldsymbol{p}_k, \tag{4.12} \]

\[ \dot{s} = s\, \frac{P_s}{Q}, \tag{4.13} \]

\[ \dot{P}_s = \sum_{k=1}^{N} \frac{\boldsymbol{p}_k^{2}}{m_k} - 3 N k_{\mathrm{B}} T = 3 N k_{\mathrm{B}} \left(T(t) - T\right), \tag{4.14} \]

where s is Nosé’s scaling parameter, $P_s$ is its conjugate momentum, Q is its mass, and the “instantaneous temperature” $T(t)$ is defined by

\[ T(t) = \frac{1}{3 N k_{\mathrm{B}}} \sum_{k=1}^{N} \frac{\boldsymbol{p}_k(t)^{2}}{m_k}. \tag{4.15} \]
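As a small illustration of (4.14) and (4.15), the following sketch evaluates the instantaneous temperature and the driving term of the thermostat momentum for a given set of momenta. It assumes momenta stored as 3-tuples and reduced units with $k_{\mathrm{B}} = 1$ by default; the function names are hypothetical, and no time integration is attempted.

```python
def instantaneous_temperature(momenta, masses, k_b=1.0):
    """T(t) = (1 / (3 N k_B)) * sum_k p_k^2 / m_k, as in (4.15)."""
    n_atoms = len(masses)
    twice_kinetic = sum(
        (px * px + py * py + pz * pz) / m
        for (px, py, pz), m in zip(momenta, masses)
    )
    return twice_kinetic / (3.0 * n_atoms * k_b)


def thermostat_driving_term(momenta, masses, target_temperature, k_b=1.0):
    """Right-hand side of (4.14): 3 N k_B (T(t) - T)."""
    n_atoms = len(masses)
    t_inst = instantaneous_temperature(momenta, masses, k_b)
    return 3.0 * n_atoms * k_b * (t_inst - target_temperature)
```

The driving term is positive when the system is hotter than the target temperature, so that $P_s$ and hence the friction-like coefficient $\dot{s}/s$ in (4.12) increase, and negative when it is colder.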

However, in practice, it is very difficult to obtain accurate canonical distributions of complex systems at low temperatures by conventional MC or MD simulation methods. This is because simulations at low temperatures tend to get trapped in one or a few of the local-minimum-energy states.

In the multicanonical ensemble [11, 12], on the other hand, each state is weighted by a non-Boltzmann weight factor $W_{\mathrm{mu}}(E)$ (which we refer to as the multicanonical weight factor), so that a uniform potential energy distribution $P_{\mathrm{mu}}(E)$ is obtained:

\[ P_{\mathrm{mu}}(E) \propto n(E)\, W_{\mathrm{mu}}(E) \equiv \text{constant}. \tag{4.16} \]

The flat distribution implies that a free one-dimensional random walk in the potential energy space is realized in this ensemble. This allows the simulation to escape from any local-minimum-energy state and to sample the configurational space much more widely than the conventional canonical MC or MD methods.

The definition in (4.16) implies that the multicanonical weight factor is inversely proportional to the density of states, and we can write it as follows:

\[ W_{\mathrm{mu}}(E) \equiv \exp\left[-\beta_0 E_{\mathrm{mu}}(E; T_0)\right] = \frac{1}{n(E)}, \tag{4.17} \]

where we have chosen an arbitrary reference temperature, $T_0 = 1/k_{\mathrm{B}}\beta_0$, and the “multicanonical potential energy” is defined by

\[ E_{\mathrm{mu}}(E; T_0) \equiv k_{\mathrm{B}} T_0 \ln n(E) = T_0 S(E). \tag{4.18} \]

Here, $S(E)$ is the entropy in the microcanonical ensemble. Since the density of states of the system is usually unknown, the multicanonical weight factor has to be determined numerically by iterations of short preliminary runs [11, 12].

A multicanonical MC simulation is performed, for instance, with the usual Metropolis criterion [118]: The transition probability of state x with potential energy E to state x′ with potential energy E′ is given by


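\[ w(x \to x') = \min\left(1, \frac{W_{\mathrm{mu}}(E')}{W_{\mathrm{mu}}(E)}\right) = \min\left(1, \frac{n(E)}{n(E')}\right) = \min\left(1, \exp(-\beta_0 \Delta E_{\mathrm{mu}})\right), \tag{4.19} \]

where

\[ \Delta E_{\mathrm{mu}} = E_{\mathrm{mu}}(E'; T_0) - E_{\mathrm{mu}}(E; T_0). \tag{4.20} \]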
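To see the flat distribution of (4.16) emerge from the acceptance rule (4.19), one can use a toy system whose density of states is known exactly. The sketch below assumes a hypothetical model of `n_units` independent two-state units, where the energy E is the number of excited units, so that n(E) is a binomial coefficient. In realistic applications n(E) is not known in advance and must be estimated iteratively, as the text explains.

```python
import math
import random


def density_of_states(n_units, energy):
    """Exact n(E) for the toy model: number of ways to excite `energy` of `n_units` units."""
    return math.comb(n_units, energy)


def run_multicanonical_mc(n_units, n_steps, seed=0):
    """Multicanonical MC with W_mu(E) = 1 / n(E); returns the histogram of visited energies."""
    rng = random.Random(seed)
    state = [0] * n_units
    energy = 0
    histogram = [0] * (n_units + 1)
    for _ in range(n_steps):
        k = rng.randrange(n_units)
        new_energy = energy + (1 - 2 * state[k])  # flipping 0 -> 1 raises E by one
        # Acceptance ratio W_mu(E') / W_mu(E) = n(E) / n(E'), as in (4.19).
        ratio = density_of_states(n_units, energy) / density_of_states(n_units, new_energy)
        if ratio >= 1.0 or rng.random() < ratio:
            state[k] ^= 1
            energy = new_energy
        histogram[energy] += 1
    return histogram
```

For example, `run_multicanonical_mc(10, 300000)` should give a histogram that is roughly uniform over E = 0, ..., 10 up to statistical noise, whereas a canonical run would concentrate on a narrow band of energies.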

The MD algorithm in the multicanonical ensemble also naturally follows from (4.17), in which the regular constant temperature MD simulation (with $T = T_0$) is performed by replacing E by $E_{\mathrm{mu}}$ in (4.12) [38, 42]:

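\[ \dot{\boldsymbol{p}}_k = -\frac{\partial E_{\mathrm{mu}}(E; T_0)}{\partial \boldsymbol{q}_k} - \frac{\dot{s}}{s}\, \boldsymbol{p}_k = \frac{\partial E_{\mathrm{mu}}(E; T_0)}{\partial E}\, \boldsymbol{f}_k - \frac{\dot{s}}{s}\, \boldsymbol{p}_k. \tag{4.21} \]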
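Equation (4.21) says that the multicanonical force is the ordinary force rescaled by $\partial E_{\mathrm{mu}}/\partial E = k_{\mathrm{B}} T_0\, \partial \ln n(E)/\partial E$. A minimal sketch of this rescaling follows, assuming that an estimate of ln n(E) is tabulated on an increasing energy grid and that a simple finite difference within the enclosing bin is accurate enough. The function names and the finite-difference choice are illustrative assumptions, not the procedure of [38, 42].

```python
from bisect import bisect_right


def multicanonical_force_scale(energies, log_n, energy, k_b_t0=1.0):
    """Estimate dE_mu/dE = k_B T0 * d ln n(E)/dE from a tabulated ln n(E).

    `energies` must be sorted in increasing order; the slope of the bin
    containing `energy` is used, and values outside the grid are clamped
    to the first or last bin.
    """
    i = bisect_right(energies, energy) - 1
    i = max(0, min(i, len(energies) - 2))
    slope = (log_n[i + 1] - log_n[i]) / (energies[i + 1] - energies[i])
    return k_b_t0 * slope


def multicanonical_force(force, energies, log_n, energy, k_b_t0=1.0):
    """Rescale one atom's force vector f_k by dE_mu/dE, as in (4.21)."""
    scale = multicanonical_force_scale(energies, log_n, energy, k_b_t0)
    return tuple(scale * component for component in force)
```

When ln n(E) = β E exactly, the scale factor reduces to $T_0/T$, which recovers an ordinary canonical simulation at temperature T run with the thermostat set to $T_0$.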

If the exact multicanonical weight factor $W_{\mathrm{mu}}(E)$ is known, one can calculate the ensemble averages of any physical quantity A at any temperature T ($= 1/k_{\mathrm{B}}\beta$) as follows:

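\[ \langle A \rangle_T = \frac{\sum_{E} A(E)\, P_{\mathrm{NVT}}(E; T)}{\sum_{E} P_{\mathrm{NVT}}(E; T)} = \frac{\sum_{E} A(E)\, n(E) \exp(-\beta E)}{\sum_{E} n(E) \exp(-\beta E)}, \tag{4.22} \]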

where the density of states is given by (see (4.17))

\[ n(E) = \frac{1}{W_{\mathrm{mu}}(E)}. \tag{4.23} \]

The summation instead of integration is used in (4.22), because we often discretize the potential energy E with step size $\epsilon$ ($E = E_i$; $i = 1, 2, \ldots$). Here, the explicit form of the physical quantity A should be known as a function of potential energy E. For instance, $A(E) = E$ gives the average potential energy $\langle E \rangle_T$ as a function of temperature, and $A(E) = \beta^2 (E - \langle E \rangle_T)^2$ gives the specific heat.

In general, the multicanonical weight factor $W_{\mathrm{mu}}(E)$, or the density of states $n(E)$, is not known a priori, and one needs its estimator for a numerical simulation. This estimator is usually obtained from iterations of short trial multicanonical simulations. The details of this process are described, for instance, in [24, 33]. However, the iterative process can be nontrivial and very tedious for complex systems.

In practice, it is impossible to obtain the ideal multicanonical weight factor with a completely uniform potential energy distribution. The question is when to stop the iteration for the determination of the weight factor. Our criterion for a satisfactory weight factor is that, as long as we do get a random walk in potential energy space, the probability distribution $P_{\mathrm{mu}}(E)$ does not have to be completely flat; deviations of, say, an order of magnitude are tolerated. In such a case, we usually perform with this weight factor a multicanonical simulation with high statistics (production run) to get an even better estimate


of the density of states. Let $N_{\mathrm{mu}}(E)$ be the histogram of the potential energy distribution $P_{\mathrm{mu}}(E)$ obtained by this production run. The best estimate of the density of states can then be given by the single-histogram reweighting techniques [8] as follows (see the proportionality relation in (4.16)):

\[ n(E) = \frac{N_{\mathrm{mu}}(E)}{W_{\mathrm{mu}}(E)}. \tag{4.24} \]

By substituting this quantity in (4.22), one can calculate ensemble averages of the physical quantity $A(E)$ as a function of temperature. Moreover, ensemble averages of any physical quantity A (including those that cannot be expressed as functions of potential energy) at any temperature T ($= 1/k_{\mathrm{B}}\beta$) can now be obtained as long as one stores the “trajectory” of configurations (and A) from the production run. Namely, we have

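\[ \langle A \rangle_T = \frac{\sum_{k=1}^{n_0} A(x(k))\, W_{\mathrm{mu}}^{-1}(E(x(k))) \exp\left[-\beta E(x(k))\right]}{\sum_{k=1}^{n_0} W_{\mathrm{mu}}^{-1}(E(x(k))) \exp\left[-\beta E(x(k))\right]}, \tag{4.25} \]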

where $x(k)$ is the configuration at the kth MC (or MD) step and $n_0$ is the total number of configurations stored. Note that when A is a function of E, (4.25) reduces to (4.22), where the density of states is given by (4.24).
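The trajectory form of the reweighting formula (4.25) can be sketched as follows. The function assumes that the stored trajectory is given as parallel lists of observable values and potential energies, and that the logarithm of the multicanonical weight factor is available as a callable; these names are hypothetical. The largest log-weight is subtracted before exponentiation purely for numerical stability and cancels in the ratio.

```python
import math


def reweighted_average(observable, energies, log_w_mu, beta):
    """Canonical average <A>_T from a multicanonical trajectory, as in (4.25).

    observable[k] and energies[k] are A(x(k)) and E(x(k)) for the k-th stored
    configuration; log_w_mu(E) returns ln W_mu(E).
    """
    log_weights = [-log_w_mu(e) - beta * e for e in energies]
    shift = max(log_weights)  # avoids overflow; cancels between numerator and denominator
    weights = [math.exp(lw - shift) for lw in log_weights]
    numerator = sum(a * w for a, w in zip(observable, weights))
    return numerator / sum(weights)
```

As a check under idealized assumptions, consider a toy system of 10 independent two-state units with n(E) equal to a binomial coefficient and a perfectly flat trajectory that visits each energy level once. Reweighting to β = 1 then reproduces the exact average energy 10/(1 + e).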

4.3 Multidimensional Extensions of Multicanonical Algorithm

In the multicanonical ensemble, a one-dimensional random walk is realized in the potential energy space. This algorithm can be generalized to multidimensions, where a random walk in other quantities besides potential energy is performed. There are many possibilities for this generalization. Here, we give an example of two-dimensional extensions of the multicanonical algorithm, the multibaric-multithermal algorithm [62, 63].

In the isobaric-isothermal ensemble [119–122], the probability distribution $P_{\mathrm{NPT}}(E, \mathcal{V}; T, \mathcal{P})$ for potential energy $E$ and volume $\mathcal{V}$ at temperature $T$ and pressure $\mathcal{P}$ is given by

PNPT(E,V;T,P) ∝ n(E,V)WNPT(E,V;T,P) = n(E,V) e−βH. (4.26)

Here, the density of states n(E,V) is given as a function of both E and V, and H is the “enthalpy” (without the kinetic energy contributions):

H = E + PV. (4.27)

This weight factor produces an isobaric-isothermal ensemble at constant temperature (T) and constant pressure (P), and this ensemble yields bell-shaped distributions in both E and V.

To perform the isobaric-isothermal MC simulation [122], we perform Metropolis sampling on the scaled coordinates $r_i = L^{-1} q_i$ ($q_i$ are the real coordinates) and the volume V (here, the particles are placed in a cubic box of size $L \equiv \sqrt[3]{V}$). The trial moves from state x with the scaled coordinates r and volume V to state x′ with the scaled coordinates r′ and volume V′ are generated by uniform random numbers. The enthalpy is accordingly changed from H(E(r,V),V) to H′(E(r′,V′),V′) by these trial moves. The trial moves will be accepted with the probability

\[ w(x \to x') = \min\left(1, \exp\left[-\beta\left\{H' - H - N k_B T \ln(V'/V)\right\}\right]\right), \tag{4.28} \]

where N is the total number of atoms in the system.

As for the MD method in this ensemble, we just present the Nosé–Andersen algorithm [119–121]. The equations of motion in (4.11)–(4.14) are now generalized as follows:

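\[ \dot{q}_k = \frac{p_k}{m_k} + \frac{\dot{V}}{3V}\, q_k, \tag{4.29} \]

\[ \dot{p}_k = -\frac{\partial H}{\partial q_k} - \left(\frac{\dot{s}}{s} + \frac{\dot{V}}{3V}\right) p_k = f_k - \left(\frac{\dot{s}}{s} + \frac{\dot{V}}{3V}\right) p_k, \tag{4.30} \]

\[ \dot{s} = s\,\frac{P_s}{Q}, \tag{4.31} \]

\[ \dot{P}_s = \sum_{i=1}^{N} \frac{p_i^2}{m_i} - 3N k_B T = 3N k_B \left(T(t) - T\right), \tag{4.32} \]

\[ \dot{V} = s\,\frac{P_V}{M}, \tag{4.33} \]

\[ \dot{P}_V = \frac{1}{3V}\left(\sum_{i=1}^{N} \frac{p_i^2}{m_i} - \sum_{i=1}^{N} q_i \cdot \frac{\partial H}{\partial q_i}\right) - \frac{\partial H}{\partial V} = P(t) - P, \tag{4.34} \]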

where M is the artificial mass associated with the volume, $P_V$ is the conjugate momentum for the volume, and the “instantaneous pressure” P(t) is defined by

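\[ P(t) = \frac{1}{3V}\left(\sum_{i=1}^{N} \frac{p_i(t)^2}{m_i} - \sum_{i=1}^{N} q_i(t) \cdot \frac{\partial H}{\partial q_i}(t)\right) = \frac{1}{3V}\left(\sum_{i=1}^{N} \frac{p_i(t)^2}{m_i} + \sum_{i=1}^{N} q_i(t) \cdot f_i(t)\right). \tag{4.35} \]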

We now introduce the idea of the multicanonical technique into the isobaric-isothermal ensemble method and refer to this generalized-ensemble algorithm as the multibaric-multithermal algorithm (MUBATH) [62,63,123–125]. The molecular simulations in this generalized ensemble perform random walks both in the potential energy space and in the volume space.

In the multibaric-multithermal ensemble, each state is sampled by the multibaric-multithermal weight factor Wmbt(E,V) ≡ exp{−β0Hmbt(E,V)} (Hmbt is referred to as the multibaric-multithermal enthalpy), so that a uniform distribution in both potential energy E and volume V is obtained [62]:

\[ P_{\rm mbt}(E,V) \propto n(E,V)\, W_{\rm mbt}(E,V) = n(E,V) \exp\{-\beta_0 H_{\rm mbt}(E,V)\} \equiv \text{constant}, \tag{4.36} \]

where we have chosen an arbitrary reference temperature, T0 = 1/kBβ0.

The multibaric-multithermal MC simulation can be performed by replacing H by Hmbt in (4.28):

\[ w(x \to x') = \min\left(1, \exp\left[-\beta_0\left\{H'_{\rm mbt} - H_{\rm mbt} - N k_B T_0 \ln(V'/V)\right\}\right]\right). \tag{4.37} \]
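The acceptance tests (4.28) and (4.37) have the same form and differ only in which enthalpy and which inverse temperature are inserted. The sketch below makes this explicit; the function name `accept_volume_move` and the injectable random-number source `rng` are illustrative assumptions, not part of the original algorithm description. Since β kBT = 1, the term N kBT ln(V′/V) contributes simply N ln(V′/V) to the exponent.

```python
import math
import random

def accept_volume_move(h_old, h_new, v_old, v_new, n_atoms, beta, rng=random.random):
    """Metropolis test of (4.28) or (4.37).

    h_old, h_new : enthalpy H = E + P*V for (4.28), or H_mbt(E, V) for (4.37)
    v_old, v_new : volume before and after the trial move
    beta         : 1/(k_B T) for (4.28), or 1/(k_B T0) for (4.37)
    rng          : callable returning a uniform number in [0, 1) (hypothetical hook)
    """
    # exponent = -beta * {H' - H - N k_B T ln(V'/V)}
    exponent = -beta * (h_new - h_old) + n_atoms * math.log(v_new / v_old)
    if exponent >= 0.0:
        return True                      # min(1, exp(...)) = 1
    return rng() < math.exp(exponent)
```

Passing the multibaric-multithermal enthalpy and β0 instead of H and β is the only change needed to go from the isobaric-isothermal to the MUBATH MC simulation.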

To perform the multibaric-multithermal MD simulation, we just solve the above equations of motion (4.29)–(4.34) for the regular isobaric-isothermal ensemble (with arbitrary reference temperature T = T0 and reference pressure P = P0), where the enthalpy H is replaced by the multibaric-multithermal enthalpy Hmbt in (4.30) and (4.34) [63].

After an optimal weight factor Wmbt(E,V) is obtained, a long production simulation is performed for data collection. We employ the reweighting techniques [8] for the results of the production run to calculate the isobaric-isothermal-ensemble averages. The probability distribution PNPT(E,V;T,P) of potential energy and volume in the isobaric-isothermal ensemble at the desired temperature T and pressure P is given by

\[ P_{NPT}(E,V;T,P) = \frac{N_{\rm mbt}(E,V)\, W_{\rm mbt}(E,V)^{-1}\, e^{-\beta(E+PV)}}{\displaystyle\sum_{E,V} N_{\rm mbt}(E,V)\, W_{\rm mbt}(E,V)^{-1}\, e^{-\beta(E+PV)}}, \tag{4.38} \]

where Nmbt(E,V) is the histogram of the probability distribution Pmbt(E,V) of potential energy and volume that was obtained by the multibaric-multithermal production run. The expectation value of a physical quantity A at T and P is then obtained from

\[ \langle A \rangle_{T,P} = \sum_{E,V} A(E,V)\, P_{NPT}(E,V;T,P). \tag{4.39} \]
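The two-dimensional reweighting of (4.38) and (4.39) can be sketched on a histogram grid as follows. The names `reweight_npt` and `npt_average`, and the layout of the arrays (rows indexed by energy bins, columns by volume bins), are assumptions made for this example only. At least one histogram bin is assumed to be populated.

```python
import numpy as np

def reweight_npt(hist, log_w_mbt, E_grid, V_grid, beta, pressure):
    """Return P_NPT(E, V; T, P) of (4.38) on the histogram grid.

    hist[a, b]      : N_mbt(E_a, V_b) from the production run
    log_w_mbt[a, b] : ln W_mbt(E_a, V_b)
    """
    E = np.asarray(E_grid, dtype=float)[:, None]
    V = np.asarray(V_grid, dtype=float)[None, :]
    hist = np.asarray(hist, dtype=float)
    # ln[W_mbt^{-1} exp(-beta (E + P V))] on the full grid
    log_term = -np.asarray(log_w_mbt, dtype=float) - beta * (E + pressure * V)
    log_p = np.full(hist.shape, -np.inf)     # empty bins get zero probability
    mask = hist > 0
    log_p[mask] = np.log(hist[mask]) + log_term[mask]
    log_p -= log_p.max()                     # guard against overflow
    p = np.exp(log_p)
    return p / p.sum()                       # denominator of (4.38)

def npt_average(A, p_npt):
    """Expectation value (4.39) for A tabulated on the same grid."""
    return float(np.sum(np.asarray(A, dtype=float) * p_npt))
```

Quantities that are not functions of E and V alone would instead be reweighted sample by sample from the stored trajectory, in analogy with (4.25).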

4.3.1 Replica-Exchange Method

The system for the replica-exchange method (REM) consists of M noninteracting copies (or replicas) of the original system in the canonical ensemble at M different temperatures Tm (m = 1, ..., M). We arrange the replicas so that there is always exactly one replica at each temperature. Then there exists a one-to-one correspondence between replicas and temperatures; the label i (i = 1, ..., M) for replicas is a permutation of the label m (m = 1, ..., M) for temperatures, and vice versa:

\[ \begin{cases} i = i(m) \equiv f(m), \\ m = m(i) \equiv f^{-1}(i), \end{cases} \tag{4.40} \]

where f(m) is a permutation function of m and $f^{-1}(i)$ is its inverse.

Let $X = \{x_1^{[i(1)]}, \ldots, x_M^{[i(M)]}\} = \{x_{m(1)}^{[1]}, \ldots, x_{m(M)}^{[M]}\}$ stand for a “state” in this generalized ensemble. Each “substate” $x_m^{[i]}$ is specified by the coordinates $q^{[i]}$ and momenta $p^{[i]}$ of N atoms in replica i at temperature Tm:

\[ x_m^{[i]} \equiv \left(q^{[i]}, p^{[i]}\right)_m. \tag{4.41} \]

Because the replicas are noninteracting, the weight factor for the state X in this generalized ensemble is given by the product of Boltzmann factors for each replica (or at each temperature):

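\[ \begin{aligned} W_{\rm REM}(X) &= \prod_{i=1}^{M} \exp\left\{-\beta_{m(i)} H\left(q^{[i]}, p^{[i]}\right)\right\} = \prod_{m=1}^{M} \exp\left\{-\beta_m H\left(q^{[i(m)]}, p^{[i(m)]}\right)\right\}, \\ &= \exp\left\{-\sum_{i=1}^{M} \beta_{m(i)} H\left(q^{[i]}, p^{[i]}\right)\right\} = \exp\left\{-\sum_{m=1}^{M} \beta_m H\left(q^{[i(m)]}, p^{[i(m)]}\right)\right\}, \end{aligned} \tag{4.42} \]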

where i(m) and m(i) are the permutation functions in (4.40).

We now consider exchanging a pair of replicas in this ensemble. Suppose we exchange replicas i and j, which are at temperatures Tm and Tn, respectively,

\[ X = \left\{\ldots, x_m^{[i]}, \ldots, x_n^{[j]}, \ldots\right\} \longrightarrow X' = \left\{\ldots, x_m^{[j]\prime}, \ldots, x_n^{[i]\prime}, \ldots\right\}. \tag{4.43} \]

Here, i, j, m, and n are related by the permutation functions in (4.40), and the exchange of replicas introduces a new permutation function f′:

\[ \begin{cases} i = f(m) \longrightarrow j = f'(m), \\ j = f(n) \longrightarrow i = f'(n). \end{cases} \tag{4.44} \]

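The exchange of replicas can be written in more detail as

\[ \begin{cases} x_m^{[i]} \equiv \left(q^{[i]}, p^{[i]}\right)_m \longrightarrow x_m^{[j]\prime} \equiv \left(q^{[j]}, p^{[j]\prime}\right)_m, \\ x_n^{[j]} \equiv \left(q^{[j]}, p^{[j]}\right)_n \longrightarrow x_n^{[i]\prime} \equiv \left(q^{[i]}, p^{[i]\prime}\right)_n, \end{cases} \tag{4.45} \]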

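where the definitions for $p^{[i]\prime}$ and $p^{[j]\prime}$ will be given below. We remark that this process is equivalent to exchanging a pair of temperatures Tm and Tn for the corresponding replicas i and j as follows:

\[ \begin{cases} x_m^{[i]} \equiv \left(q^{[i]}, p^{[i]}\right)_m \longrightarrow x_n^{[i]\prime} \equiv \left(q^{[i]}, p^{[i]\prime}\right)_n, \\ x_n^{[j]} \equiv \left(q^{[j]}, p^{[j]}\right)_n \longrightarrow x_m^{[j]\prime} \equiv \left(q^{[j]}, p^{[j]\prime}\right)_m. \end{cases} \tag{4.46} \]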

In the original implementation of the replica-exchange method (REM) [67–69], the Monte Carlo algorithm was used, and only the coordinates q (and the potential energy function E(q)) had to be taken into account. In the molecular dynamics algorithm, on the other hand, we also have to deal with the momenta p. We proposed the following momentum assignment in (4.45) (and in (4.46)) [77]:

\[ \begin{cases} p^{[i]\prime} \equiv \sqrt{\dfrac{T_n}{T_m}}\; p^{[i]}, \\[2mm] p^{[j]\prime} \equiv \sqrt{\dfrac{T_m}{T_n}}\; p^{[j]}, \end{cases} \tag{4.47} \]

which we believe is the simplest and the most natural. This assignment means that we just rescale uniformly the velocities of all the atoms in the replicas by the square root of the ratio of the two temperatures so that the temperature condition in (4.4) may be satisfied.

The transition probability of this replica-exchange process is given by the usual Metropolis criterion:

\[ w(X \to X') \equiv w\left(x_m^{[i]} \,\middle|\, x_n^{[j]}\right) = \min\left(1, \frac{W_{\rm REM}(X')}{W_{\rm REM}(X)}\right) = \min\left(1, \exp(-\Delta)\right), \tag{4.48} \]

where in the second expression (i.e., $w(x_m^{[i]} | x_n^{[j]})$) we explicitly wrote the pair of replicas (and temperatures) to be exchanged. From (4.1), (4.2), (4.42), and (4.47), we have

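\[ \begin{aligned} \frac{W_{\rm REM}(X')}{W_{\rm REM}(X)} &= \exp\Big\{-\beta_m\left[K\left(p^{[j]\prime}\right) + E\left(q^{[j]}\right)\right] - \beta_n\left[K\left(p^{[i]\prime}\right) + E\left(q^{[i]}\right)\right] \\ &\qquad\quad + \beta_m\left[K\left(p^{[i]}\right) + E\left(q^{[i]}\right)\right] + \beta_n\left[K\left(p^{[j]}\right) + E\left(q^{[j]}\right)\right]\Big\}, \\ &= \exp\Big\{-\beta_m \frac{T_m}{T_n} K\left(p^{[j]}\right) - \beta_n \frac{T_n}{T_m} K\left(p^{[i]}\right) + \beta_m K\left(p^{[i]}\right) + \beta_n K\left(p^{[j]}\right) \\ &\qquad\quad - \beta_m\left[E\left(q^{[j]}\right) - E\left(q^{[i]}\right)\right] - \beta_n\left[E\left(q^{[i]}\right) - E\left(q^{[j]}\right)\right]\Big\}. \end{aligned} \tag{4.49} \]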

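As the kinetic energy terms in this equation all cancel out, Δ in (4.48) becomes

\begin{align} \Delta &= \beta_m\left(E\left(q^{[j]}\right) - E\left(q^{[i]}\right)\right) - \beta_n\left(E\left(q^{[j]}\right) - E\left(q^{[i]}\right)\right), \tag{4.50} \\ &= (\beta_m - \beta_n)\left(E\left(q^{[j]}\right) - E\left(q^{[i]}\right)\right). \tag{4.51} \end{align}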

Here, i, j, m, and n are related by the permutation functions in (4.40) before the replica exchange:

\[ \begin{cases} i = f(m), \\ j = f(n). \end{cases} \tag{4.52} \]
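The cancellation of the kinetic terms between (4.49) and (4.51) can be checked numerically. The sketch below assumes the standard form K(p) = Σ p²/(2m) for the kinetic energy of (4.1) and uses units with kB = 1; all function names are hypothetical.

```python
import numpy as np

def rescale_momenta(p_i, p_j, T_m, T_n):
    """Momentum assignment (4.47): replica i moves T_m -> T_n, replica j moves T_n -> T_m."""
    return np.sqrt(T_n / T_m) * p_i, np.sqrt(T_m / T_n) * p_j

def kinetic(p, mass):
    """K(p) = sum_k p_k^2 / (2 m_k); p has shape (N, 3), mass has shape (N,). Assumed form."""
    return float(np.sum(p**2 / (2.0 * mass[:, None])))

def delta_full(beta_m, beta_n, K_i, K_j, K_i_new, K_j_new, E_i, E_j):
    """-ln[W_REM(X')/W_REM(X)] from the first line of (4.49), kinetic terms included."""
    return (beta_m * (K_j_new + E_j) + beta_n * (K_i_new + E_i)
            - beta_m * (K_i + E_i) - beta_n * (K_j + E_j))

def delta_potential(beta_m, beta_n, E_i, E_j):
    """Delta of (4.51), which involves the potential energies only."""
    return (beta_m - beta_n) * (E_j - E_i)
```

With the rescaled momenta, `delta_full` and `delta_potential` agree to rounding error, so an MD implementation only needs the potential energies to decide on an exchange.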

Without loss of generality, we can assume T1 < T2 < ··· < TM. A simulation of the replica-exchange method (REM) is then realized by alternately performing the following two steps:

1. Each replica in the canonical ensemble of the fixed temperature is simulated simultaneously and independently for a certain number of MC or MD steps.

2. A pair of replicas at neighboring temperatures, say $x_m^{[i]}$ and $x_{m+1}^{[j]}$, are exchanged with the probability $w(x_m^{[i]} | x_{m+1}^{[j]})$ in (4.48).

Note that in Step 2 we exchange only pairs of replicas corresponding to neighboring temperatures, because the acceptance ratio of the exchange process decreases exponentially with the difference in the two β’s (see (4.51) and (4.48)). Note also that whenever a replica exchange is accepted in Step 2, the permutation functions in (4.40) are updated. A random walk in “temperature space” is realized for each replica, which in turn induces a random walk in potential energy space. This alleviates the problem of getting trapped in local-minimum-energy states.

The REM simulation is particularly suitable for parallel computers. Because one can minimize the amount of information exchanged among nodes, it is best to assign each replica to each node (exchanging pairs of temperature values among nodes is much faster than exchanging coordinates and momenta). This means that we keep track of the permutation function m(i; t) = f⁻¹(i; t) in (4.40) as a function of MC or MD step t during the simulation. After parallel canonical MC or MD simulations for a certain number of steps (Step 1), M/2 pairs of replicas corresponding to neighboring temperatures are simultaneously exchanged (Step 2), and the pairing is alternated between the two possible choices, i.e., (T1, T2), (T3, T4), ... and (T2, T3), (T4, T5), ....
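Step 2 with the alternating pairing described above can be sketched as a single function that updates the permutation f(m) in place. The names `exchange_sweep` and `replica_at`, the zero-based indices, and the injectable random-number source `rng` are assumptions of this example; the canonical simulations of Step 1 are not shown.

```python
import math
import random

def exchange_sweep(betas, energies, replica_at, sweep_index, rng=random.random):
    """One Step-2 sweep over neighboring temperature pairs.

    betas[m]      : inverse temperatures, ordered so that T_1 < T_2 < ... < T_M
    energies[i]   : current potential energy E(q^[i]) of replica i
    replica_at[m] : replica label i = f(m) currently at temperature index m
    sweep_index   : even -> pairs (0,1), (2,3), ...; odd -> pairs (1,2), (3,4), ...
    Returns the number of accepted exchanges; replica_at is modified in place.
    """
    accepted = 0
    for m in range(sweep_index % 2, len(betas) - 1, 2):
        i, j = replica_at[m], replica_at[m + 1]
        delta = (betas[m] - betas[m + 1]) * (energies[j] - energies[i])   # (4.51)
        if delta <= 0.0 or rng() < math.exp(-delta):                      # (4.48)
            replica_at[m], replica_at[m + 1] = j, i   # update f(m) and f(m+1)
            accepted += 1
    return accepted
```

Only the temperature labels are swapped here, which mirrors the remark that exchanging temperature values among nodes is cheaper than exchanging coordinates and momenta. In an MD run the momenta of the two replicas would additionally be rescaled according to (4.47).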

After a long production run of a replica-exchange simulation, the canonical expectation value of a physical quantity A at temperature Tm (m = 1, ..., M) can be calculated by the usual arithmetic mean as follows:

\[ \langle A \rangle_{T_m} = \frac{1}{n_m} \sum_{k=1}^{n_m} A\left(x_m(k)\right), \tag{4.53} \]

where xm(k) (k = 1, ..., nm) are the configurations obtained at temperature Tm, and nm is the total number of measurements made at T = Tm. The expectation value at any intermediate temperature can also be obtained from (4.22), where the density of states is given by the multiple-histogram reweighting techniques [9,10] as follows. Let Nm(E) and nm be, respectively, the potential-energy histogram and the total number of samples obtained at temperature Tm = 1/kBβm (m = 1, ..., M). The best estimate of the density of states is then given by [9,10]

\[ n(E) = \frac{\displaystyle\sum_{m=1}^{M} g_m^{-1} N_m(E)}{\displaystyle\sum_{m=1}^{M} g_m^{-1} n_m \exp(f_m - \beta_m E)}, \tag{4.54} \]

where we have for each m (= 1, ..., M)

\[ \exp(-f_m) = \sum_{E} n(E) \exp(-\beta_m E). \tag{4.55} \]

Here, gm = 1 + 2τm, and τm is the integrated autocorrelation time at temperature Tm. For many systems, the quantity gm can safely be set to be a constant in the reweighting formulae [10], and hereafter we set gm = 1.

Note that (4.54) and (4.55) are solved self-consistently by iteration [9,10] to obtain the density of states n(E) and the dimensionless Helmholtz free energy fm. Namely, we can set all the fm (m = 1, ..., M) to, e.g., zero initially. We then use (4.54) to obtain n(E), which is substituted into (4.55) to obtain the next values of fm, and so on.
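The self-consistent iteration of (4.54) and (4.55) with gm = 1 can be sketched as below. The function name `wham`, the convergence tolerance, and the choice to pin f for the first temperature to zero (the fm are only defined up to a common additive constant) are assumptions of this example. Logarithms are used throughout to avoid overflow.

```python
import numpy as np

def wham(hist, n_samples, betas, E_bins, n_iter=1000, tol=1e-10):
    """Iterate (4.54) and (4.55) with g_m = 1.

    hist[m, a]   : N_m(E_a), histogram at temperature index m
    n_samples[m] : n_m, number of samples at temperature index m
    Returns (ln n(E_a), f_m); n(E) is determined up to a constant factor.
    """
    hist = np.asarray(hist, dtype=float)
    log_n_samples = np.log(np.asarray(n_samples, dtype=float))
    betas = np.asarray(betas, dtype=float)
    E = np.asarray(E_bins, dtype=float)
    with np.errstate(divide="ignore"):
        log_num = np.log(hist.sum(axis=0))      # numerator of (4.54); -inf if empty
    f = np.zeros(len(betas))                    # initial guess f_m = 0
    for _ in range(n_iter):
        # ln of the denominator of (4.54): ln sum_m n_m exp(f_m - beta_m E)
        log_den = np.logaddexp.reduce(
            log_n_samples[:, None] + f[:, None] - betas[:, None] * E[None, :], axis=0)
        log_n = log_num - log_den
        # (4.55): f_m = -ln sum_E n(E) exp(-beta_m E)
        f_new = -np.logaddexp.reduce(log_n[None, :] - betas[:, None] * E[None, :], axis=1)
        f_new -= f_new[0]                       # fix the arbitrary additive constant
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return log_n, f
```

The returned ln n(E) can be inserted into (4.22) to obtain averages at any intermediate temperature.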

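Moreover, ensemble averages of any physical quantity A (including those that cannot be expressed as functions of potential energy) at any temperature T (= 1/kBβ) can now be obtained from the “trajectory” of configurations of the production run. Namely, we first obtain fm (m = 1, ..., M) by solving (4.54) and (4.55) self-consistently, and then we have [87]

\[ \langle A \rangle_T = \frac{\displaystyle\sum_{m=1}^{M} \sum_{k=1}^{n_m} A(x_m(k))\, \frac{1}{\displaystyle\sum_{\ell=1}^{M} n_\ell \exp\left[f_\ell - \beta_\ell E(x_m(k))\right]}\, \exp\left[-\beta E(x_m(k))\right]}{\displaystyle\sum_{m=1}^{M} \sum_{k=1}^{n_m} \frac{1}{\displaystyle\sum_{\ell=1}^{M} n_\ell \exp\left[f_\ell - \beta_\ell E(x_m(k))\right]}\, \exp\left[-\beta E(x_m(k))\right]}, \tag{4.56} \]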

where xm(k) (k = 1, ..., nm) are the configurations obtained at temperature Tm.
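A sample-by-sample evaluation of (4.56) might look as follows, assuming the fm have already been obtained from (4.54) and (4.55). The function name `rem_reweight` and the convention of concatenating the samples from all temperatures into flat arrays are assumptions of this sketch.

```python
import numpy as np

def rem_reweight(A, E, n_samples, betas, f, beta):
    """Evaluate (4.56) at inverse temperature beta.

    A, E            : flat arrays over all stored configurations x_m(k),
                      with all m and k concatenated
    n_samples[l]    : n_l, number of samples at temperature index l
    betas[l], f[l]  : beta_l and the converged f_l from (4.54)-(4.55)
    """
    A = np.asarray(A, dtype=float)
    E = np.asarray(E, dtype=float)
    log_n_samples = np.log(np.asarray(n_samples, dtype=float))
    betas = np.asarray(betas, dtype=float)
    f = np.asarray(f, dtype=float)
    # ln sum_l n_l exp(f_l - beta_l E) for every stored configuration
    log_den = np.logaddexp.reduce(
        log_n_samples[:, None] + f[:, None] - betas[:, None] * E[None, :], axis=0)
    log_w = -beta * E - log_den
    log_w -= log_w.max()          # constant shift; it cancels in the ratio
    w = np.exp(log_w)
    return float(np.sum(A * w) / np.sum(w))
```

For a single temperature (M = 1) and β equal to that temperature's β1, all weights become equal and the expression reduces to the arithmetic mean (4.53).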

The major advantage of REM over other generalized-ensemble methods such as the multicanonical algorithm [11, 12] lies in the fact that the weight factor is a priori known (see (4.42)), while in the multicanonical algorithm the determination of the weight factors can be very tedious and time-consuming. In REM, however, the number of required replicas increases greatly as the system size N increases, while only one replica is used in the multicanonical algorithm. This demands a lot of computer power for complex systems. Moreover, so long as optimal weight factors can be obtained, the multicanonical algorithm is more efficient in sampling than the replica-exchange method [88].

4.3.2 Multidimensional Extensions of Replica-Exchange Method

We now present our multidimensional extension of REM, which we refer to as the multidimensional replica-exchange method (MREM) [80]. The crucial observation that led to the new algorithm is the following: as long as we have M noninteracting replicas of the original system, the Hamiltonian H(q, p) of the system does not have to be identical among the replicas, and it can depend on a parameter with different parameter values for different replicas. Namely, we can write the Hamiltonian for the ith replica at temperature Tm as

\[ H_m\left(q^{[i]}, p^{[i]}\right) = K\left(p^{[i]}\right) + E_{\lambda_m}\left(q^{[i]}\right), \tag{4.57} \]

where the potential energy $E_{\lambda_m}$ depends on a parameter λm and can be written, for instance, as

\[ E_{\lambda_m}\left(q^{[i]}\right) = E_0\left(q^{[i]}\right) + \lambda_m V\left(q^{[i]}\right). \tag{4.58} \]

This expression for the potential energy is often used in simulations. For instance, in umbrella sampling [18], E0(q) and V(q) can be, respectively, taken as the original potential energy and the “biasing” potential energy with the coupling parameter λm. In simulations of spin systems, on the other hand, E0(q) and V(q) (here, q stands for spins) can be, respectively, considered as the zero-field term and the magnetization term coupled with the external field λm.

While replica i and temperature Tm are in one-to-one correspondence in the original REM, replica i and “parameter set” Λm ≡ (Tm, λm) are in one-to-one correspondence in the new algorithm. Hence, the present algorithm can be considered as a multidimensional extension of the original replica-exchange method, in which the “parameter space” is one-dimensional (i.e., Λm = Tm). Because the replicas are noninteracting, the weight factor for the state X in this new generalized ensemble is again given by the product of Boltzmann factors for each replica (see (4.42)):

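\[ \begin{aligned} W_{\rm MREM}(X) &= \exp\left\{-\sum_{i=1}^{M} \beta_{m(i)} H_{m(i)}\left(q^{[i]}, p^{[i]}\right)\right\}, \\ &= \exp\left\{-\sum_{m=1}^{M} \beta_m H_m\left(q^{[i(m)]}, p^{[i(m)]}\right)\right\}, \end{aligned} \tag{4.59} \]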

where i(m) and m(i) are the permutation functions in (4.40). Then the same derivation that led to the original replica-exchange criterion follows, and the transition probability of replica exchange is given by (4.48), where we now have (see (4.50)) [80]

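\[ \Delta = \beta_m\left(E_{\lambda_m}\left(q^{[j]}\right) - E_{\lambda_m}\left(q^{[i]}\right)\right) - \beta_n\left(E_{\lambda_n}\left(q^{[j]}\right) - E_{\lambda_n}\left(q^{[i]}\right)\right). \tag{4.60} \]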

Here, $E_{\lambda_m}$ and $E_{\lambda_n}$ are the total potential energies (see (4.57)). Note that we need to newly evaluate the potential energy for exchanged coordinates, $E_{\lambda_m}(q^{[j]})$ and $E_{\lambda_n}(q^{[i]})$, because $E_{\lambda_m}$ and $E_{\lambda_n}$ are in general different functions.

For obtaining the canonical distributions, the multiple-histogram reweighting techniques [9,10] are particularly suitable. Suppose we have made a single run of the present replica-exchange simulation with M replicas that correspond to M different parameter sets Λm ≡ (Tm, λm) (m = 1, ..., M). Let Nm(E0, V) and nm be, respectively, the potential-energy histogram and the total number of samples obtained for the mth parameter set Λm. The WHAM equations that yield the canonical probability distribution $P_{T,\lambda}(E_0, V) = n(E_0, V) \exp(-\beta E_\lambda)$ with any potential-energy parameter value λ at any temperature T = 1/kBβ are then given by [80]

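\[ n(E_0, V) = \frac{\displaystyle\sum_{m=1}^{M} N_m(E_0, V)}{\displaystyle\sum_{m=1}^{M} n_m \exp\left(f_m - \beta_m E_{\lambda_m}\right)}, \tag{4.61} \]

and for each m (= 1, ..., M)

\[ \exp(-f_m) = \sum_{E_0, V} n(E_0, V) \exp\left(-\beta_m E_{\lambda_m}\right). \tag{4.62} \]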

Here, n(E0, V) is the generalized density of states. Note that n(E0, V) is independent of the parameter sets Λm ≡ (Tm, λm) (m = 1, ..., M). The density of states n(E0, V) and the “dimensionless” Helmholtz free energy fm in (4.61) and (4.62) are solved self-consistently by iteration.

We now present an example of MREM. We consider an isobaric-isothermal ensemble and exchange not only the temperature but also the pressure values of pairs of replicas during an MC or MD simulation [94]. Namely, suppose we have M replicas with M different values of temperature and pressure (Tm, Pm). In (4.58) we set E0 = E, take the volume of the system as the variable V, and set λm = Pm. We exchange replicas i and j, which are at (Tm, Pm) and (Tn, Pn), respectively. The transition probability of this replica-exchange process is then given by (4.48), where (4.60) now reads [3, 80, 96]

\[ \Delta = (\beta_m - \beta_n)\left(E\left(q^{[j]}\right) - E\left(q^{[i]}\right)\right) + (\beta_m P_m - \beta_n P_n)\left(V^{[j]} - V^{[i]}\right). \tag{4.63} \]

We can alternately exchange pairs of neighboring temperature values and pairs of neighboring pressure values during the replica-exchange simulation. Moreover, if we fix the temperature, we can have only the pressure-exchange process as a special case, which yields a one-dimensional random walk in the volume space.
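The relation between the general criterion (4.60) and its temperature–pressure special case (4.63) can be illustrated with two small functions. Their names, and the representation of a configuration by an (E, V) pair in the usage example, are assumptions made only for this sketch.

```python
def delta_mrem(beta_m, beta_n, energy_m, energy_n, q_i, q_j):
    """Delta of (4.60).

    energy_m, energy_n : callables evaluating E_lambda_m(q) and E_lambda_n(q)
    q_i, q_j           : configurations of replicas i and j before the exchange
    """
    return (beta_m * (energy_m(q_j) - energy_m(q_i))
            - beta_n * (energy_n(q_j) - energy_n(q_i)))

def delta_tp(beta_m, beta_n, P_m, P_n, E_i, E_j, V_i, V_j):
    """Delta of (4.63) for the exchange of (T_m, P_m) and (T_n, P_n)."""
    return ((beta_m - beta_n) * (E_j - E_i)
            + (beta_m * P_m - beta_n * P_n) * (V_j - V_i))
```

Choosing the energy functions as E + Pm V and E + Pn V in `delta_mrem` reproduces `delta_tp`, and setting beta_m equal to beta_n leaves only the pressure-exchange term.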

4.4 Examples of Simulation Results

We now present some of the simulation results by the generalized-ensemble algorithms that were described in the previous section.

The first example is the calculation of the residual entropy of ordinary ice [126,127]. This calculation shows how accurately the density of states can be obtained by multicanonical simulations from the reweighting formula of (4.24).

In the crystal structure of ordinary ice, each oxygen atom is located at the center of a tetrahedron and straight lines (bonds) through the sites of the tetrahedron point towards four nearest-neighbor oxygen atoms. Hydrogen atoms are distributed according to the ice rules [128]:

A. There is one hydrogen atom on each bond (then called hydrogen bond).

B. There are two hydrogen atoms near each oxygen atom (these three atoms constitute a water molecule).

Extrapolating low-temperature calorimetric experimental data (then available down to about 10 K) towards zero absolute temperature, it was found that ice has a residual entropy [129]:

S0 = kB ln(Ω) > 0, (4.64)

where Ω is the number of states for N molecules. Subsequently, Linus Pauling [128] derived estimates of $\Omega = (\Omega_1)^N$ by approximate methods, obtaining

\[ \Omega_1^{\rm Pauling} = 3/2. \tag{4.65} \]

Thus, $\Omega = (3/2)^N$ is the number of Pauling configurations. Assuming that the H2O molecules are essentially intact in ice, one of his arguments is that a given molecule can orient itself in six ways satisfying ice rule B. Choosing the orientations of all molecules at random, the chance that the adjacent molecules permit a given orientation is 1/4. The total number of configurations is thus $\Omega = (6/4)^N$.

Equation (4.65) converts to the residual entropy

\[ S_0^{\rm Pauling} = 0.80574\ldots\ \text{cal deg}^{-1}\,\text{mol}^{-1}, \tag{4.66} \]

where we have used R = 8.314472 (15) J deg⁻¹ mol⁻¹ for the gas constant [130]. This is in good agreement with the experimental estimate

\[ S_0^{\rm experimental} = 0.82\,(5)\ \text{cal deg}^{-1}\,\text{mol}^{-1}, \tag{4.67} \]

which was subsequently obtained by Giauque and Stout [131] using refined calorimetry (we give error bars with respect to the last digit(s) in parentheses).

Pauling’s arguments omit correlations induced by closed loops when one requires fulfillment of the ice rules for all atoms, and it was shown by Onsager and Dupuis [132] that Ω1 = 1.5 is in fact a lower bound. Onsager’s student Nagle used a series expansion method to derive the estimate [133]

\[ \Omega_1^{\rm Nagle} = 1.50685\,(15), \tag{4.68} \]

or

\[ S_0^{\rm Nagle} = 0.81480\,(20)\ \text{cal deg}^{-1}\,\text{mol}^{-1}. \tag{4.69} \]

Here, the error bar is not statistical but reflects higher order corrections of the expansion, which are not entirely under control.

Despite Nagle’s high precision estimate, there has apparently been almost no improvement in the accuracy of the experimental value (4.67). Some of the difficulties are addressed in a careful study by Haida et al. [134]. But their final estimate remains (4.67), with no reduction of the error bar. We noted that by treating the contributions in their Table 3 as statistically independent quantities and using Gaussian error propagation (instead of adding up the individual error bars), the final error bar is reduced by almost a factor of two and their value would then read S0 = 0.815 (26) cal deg⁻¹ mol⁻¹. Still, Pauling’s value is safely within one standard deviation. Modern electronic equipment should allow for a much better precision. We think that an experimental verification of the difference from Pauling’s estimate would be an outstanding confirmation of structures imposed by the ice rules.

Our calculations are based on two simple statistical models, which reflect Pauling’s arguments. In the first model, called the six-state H2O molecule model, we allow for six distinct orientations of each H2O molecule and define its energy by

\[ E = -\sum_{b} h\left(b, s_b^1, s_b^2\right). \tag{4.70} \]

Here, the sum is over all bonds b of the lattice and ($s_b^1$ and $s_b^2$ indicate the dependence on the states of the two H2O molecules, which are connected by the bond)

\[ h\left(b, s_b^1, s_b^2\right) = \begin{cases} 1 & \text{for a hydrogen bond}, \\ 0 & \text{otherwise}. \end{cases} \tag{4.71} \]

In the second model, called the two-state H-bond model, we do not consider distinct orientations of the molecule, but allow two positions for each hydrogen nucleus on the bonds. The energy is defined by

\[ E = -\sum_{s} f\left(s, b_s^1, b_s^2, b_s^3, b_s^4\right), \tag{4.72} \]

where the sum is over all sites (oxygen atoms) of the lattice. The function f is given by

\[ f\left(s, b_s^1, b_s^2, b_s^3, b_s^4\right) = \begin{cases} 2 & \text{for two hydrogen nuclei close to } s, \\ 1 & \text{for one or three hydrogen nuclei close to } s, \\ 0 & \text{for zero or four hydrogen nuclei close to } s. \end{cases} \tag{4.73} \]

The ground states of each model fulfill the ice rules. The results of a multicanonical simulation will give an accurate estimate of the density of states n(E) from (4.24), and we can write

Ω(E) = Cn(E). (4.74)

At β = 0, the number of states is $6^N$ for the six-state model and $2^{2N}$ for the two-state model. Once these normalizations at β = 0 are given, the proportionality constant C can be determined from the results of the multicanonical simulations [24]. Hence, one can obtain an accurate estimate of the number of lowest-energy states, Ω(E0), where E0 is the energy of the lowest-energy state.
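The normalization step can be sketched as follows: the constant C in (4.74) is fixed by requiring that the sum of Ω(E) over all energies equals the total number of states at β = 0, and the per-molecule quantity Ω1 then follows from Ω(E0) = (Ω1)^N. The function name and the convention that the first array element belongs to E0 are assumptions of this example; it does not reproduce the finite-size fits of the original work.

```python
import numpy as np

def ground_state_count_per_molecule(log_n, n_molecules, states_per_molecule):
    """Return Omega_1 = Omega(E_0)**(1/N) from an unnormalized ln n(E).

    log_n[0] must belong to the lowest energy E_0.
    Normalization used: sum_E Omega(E) = states_per_molecule**N, i.e.
    6**N for the six-state model and 4**N = 2**(2N) for the two-state model.
    """
    log_n = np.asarray(log_n, dtype=float)
    log_total = n_molecules * np.log(states_per_molecule)
    log_C = log_total - np.logaddexp.reduce(log_n)    # ln C in (4.74)
    log_omega0 = log_C + log_n[0]                     # ln Omega(E_0)
    return float(np.exp(log_omega0 / n_molecules))
```

Working with logarithms matters here because numbers such as 6^N overflow double precision already for a few hundred molecules.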

Using periodic boundary conditions (BCs), our simulations are based on a lattice construction set up earlier by Berg [135]. We have performed multicanonical MC simulations for the two models with the lattice sizes that correspond to the number of water molecules N = 128, 360, 576, 896, and 1,600. Combining the two fit results in the thermodynamic limit (N → ∞) leads to our final estimate

\[ \Omega_1^{\rm MUCA} = 1.50738\,(16). \tag{4.75} \]

This converts into

\[ S_0^{\rm MUCA} = 0.81550\,(21)\ \text{cal deg}^{-1}\,\text{mol}^{-1} \tag{4.76} \]

for the residual entropy [126]. This is at present the most accurate value for the residual entropy of ordinary ice.
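The conversions from Ω1 to the molar residual entropy quoted in (4.66), (4.69), and (4.76) follow from S0 = R ln Ω1 per mole. The short sketch below repeats this arithmetic; the conversion factor 1 cal = 4.184 J (thermochemical calorie) is an assumption, as the text does not state which calorie was used.

```python
import math

R_JOULE = 8.314472       # gas constant in J deg^-1 mol^-1, as quoted in the text
CAL_IN_JOULE = 4.184     # thermochemical calorie (assumed conversion factor)

def residual_entropy_cal(omega_1):
    """Molar residual entropy S_0 = R ln(Omega_1) in cal deg^-1 mol^-1."""
    return R_JOULE * math.log(omega_1) / CAL_IN_JOULE

for name, omega_1 in [("Pauling", 1.5), ("Nagle", 1.50685), ("MUCA", 1.50738)]:
    print(f"{name}: {residual_entropy_cal(omega_1):.5f} cal deg^-1 mol^-1")
```

With this calorie definition the three central values agree with (4.66), (4.69), and (4.76) to the quoted digits.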

The next example is the multicanonical MD simulation of the C-peptide of ribonuclease A in explicit water [136]. In the simulation model, the N-terminus and the C-terminus of the C-peptide analogue were blocked with the acetyl group and the N-methyl group, respectively. The number of amino acids is 13 and the amino-acid sequence is Ace-Ala-Glu−-Thr-Ala-Ala-Ala-Lys+-Phe-Leu-Arg+-Ala-His+-Ala-Nme [137,138]. The initial configuration of our simulation was first generated by a high-temperature molecular dynamics simulation (at T = 1,000 K) in the gas phase, starting from a fully extended conformation. We randomly selected one of the structures that do not have any secondary structures such as α-helix and β-sheet. The peptide was then solvated in a sphere of radius 22 Å, in which 1,387 water molecules were included (see Fig. 4.1). A harmonic restraint was applied to prevent the water molecules from going out of the sphere. The total number of atoms is 4,365. The dielectric constant was set equal to 1.0. The force-field parameters for the protein were taken from the all-atom version of AMBER parm99 [141], which was found to be suitable for studying helical peptides [142], and the TIP3P model [143] was used for water molecules. The unit time step, Δt, was set to 0.5 fs.

As a production run, we carried out a 15 ns multicanonical MD simulation and the results of this production run were analyzed in detail.

In Fig. 4.2a we show the time series of potential energy from this production run. We indeed observe a random walk covering as much as 5,000 kcal mol⁻¹ of energy range (note that 23 kcal mol⁻¹ ≈ 1 eV). We show in Fig. 4.2b the average potential energy as a function of temperature, which was obtained from the trajectory of the production run by the reweighting techniques in (4.22) and (4.24). The average potential energy monotonically increases as the temperature increases.

Fig. 4.1. The initial configuration of C-peptide in explicit water. The filled circles stand for the oxygen atoms of water molecules. The number of water molecules is 1,387, and they are placed in a sphere of radius 22 Å. As for the peptide, besides the backbone structure (in dark gray), side chains of only Glu−-2, Phe-8, Arg+-10, and His+-12 are shown (in light gray). The figure was created with Molscript [139] and Raster3D [140]

[Figure 4.2: two panels (a, b). Recovered axis labels: E [kcal/mol] (−14000 to −8000); Time [nsec] (0 to 14) in panel a; T [K] (300 to 700) in panel b.]

Fig. 4.2. Time series of potential energy of the C-peptide system from the multicanonical MD production run (a) and the average potential energy as a function of temperature (b). The latter was obtained from the trajectory of the multicanonical MD production run by the single-histogram reweighting techniques

By analyzing the free energy landscape, we identified three distinct local minima in free energy. We show representative conformations at these minima in Fig. 4.3. The structure of the global-minimum free-energy state (GM) has a partially distorted α-helix with the salt bridge between Glu−-2 and Arg+-10. The structure is in good agreement with the experimental structure obtained by both NMR and X-ray experiments. In this structure, there also exists a contact between Phe-8 and His+-12. This contact is again observed in the corresponding residues of the X-ray structure. At LM1, the structure has a contact between Phe-8 and His+-12, but the salt bridge between Glu−-2 and Arg+-10 is not formed. On the other hand, the structure at LM2 has this salt bridge, but it does not have a contact between Phe-8 and His+-12. Thus, only the structures at GM satisfy all the interactions that have been observed by the X-ray and other experimental studies.

Fig. 4.3. Representative structures at the global-minimum free-energy state ((a) GM) and the two local-minimum states ((b) LM1 and (c) LM2). As for the peptide structures, besides the backbone structure, side chains of only Glu−-2, Phe-8, Arg+-10, and His+-12 are shown in ball-and-stick model

The next example is the multibaric-multithermal MD simulation [144, 145]. This simulation was performed for a system consisting of one alanine dipeptide molecule ((S)-2-(acetylamino)-N-methylpropanamide) and 63 water molecules. We used the AMBER parm96 force field [146] for the alanine dipeptide molecule and the TIP3P [143] rigid-body model for the water molecules. The initial values of the alanine-dipeptide dihedral angles were set to be φ = ψ = 180°. We employed a cubic unit cell with periodic boundary conditions. The electrostatic potential was calculated by the Ewald method. We calculated the van der Waals interaction, which is given by the Lennard–Jones 12-6 term, of all pairs of the atoms within the minimum image convention instead of introducing the spherical potential cutoff. Here, we used the symplectic time-development formalism [147], which is based on the Nosé–Poincaré thermostat [148, 149], the Andersen barostat [121], and the symplectic quaternion scheme [150]. The time step was taken as Δt = 0.5 fs.

Figure 4.4a–c shows the time series of potential energy E in the isobaric-isothermal MD simulation at (T0, P0) = (240 K, 0.1 MPa), (298 K, 0.1 MPa), and (298 K, 300 MPa), respectively. The potential energy fluctuates in narrow ranges. On the other hand, Fig. 4.4d shows that the MUBATH MD simulation realizes a random walk in the potential-energy space and covers a wide energy range.

Figures 4.5a–c show the time series of volume V obtained by the conventional isobaric-isothermal MD simulations. The volume fluctuates in narrow ranges. The MUBATH MD simulation, on the other hand, performs a random walk that covers a range of V = 1.8–3.5 nm³, as shown in Fig. 4.5d, which is 3–5 times wider than that by the isobaric-isothermal MD simulations.

[Figure 4.4: four panels (a–d), each plotting E/(100 kcal/mol) (−8 to −2) against t/ns (0.0 to 1.0).]

Fig. 4.4. Time series of potential energy E from (a) the conventional isobaric–isothermal MD simulation at T0 = 240 K and P0 = 0.1 MPa; (b) the conventional isobaric–isothermal MD simulation at T0 = 298 K and P0 = 0.1 MPa; (c) the conventional isobaric–isothermal MD simulation at T0 = 298 K and P0 = 300 MPa; and (d) the multibaric–multithermal MD simulation

The probability distributions P(φ, ψ) of φ and ψ at wide ranges of temperature and pressure have been calculated by the reweighting techniques. The MUBATH MD simulation sampled not only the states of PII and C5 but also the states of αR, αP, and αL.

The volume under the surface P(φ, ψ) around each peak corresponds to the population W of each state. To calculate W, the whole (φ, ψ) plane was divided into six states as listed in Table 4.1. For example, the population $W_{P_{\rm II}}$ of the PII state is calculated by the integral of P(φ, ψ) in the area in which φ and ψ take the PII configuration:

\[ W_{P_{\rm II}} = \int_{(\phi,\psi) \in P_{\rm II}} d\phi\, d\psi\, P(\phi, \psi), \tag{4.77} \]

where the integration range of (φ, ψ) stands for the range for the correspondingstate in Table 4.1. The population of each state at T = 298 K and P = 0.1 MPais also shown in Table 4.1.
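The integral in (4.77) can be approximated by a weighted count of sampled (φ, ψ) pairs. The following Python sketch uses the dihedral ranges of Table 4.1. It assumes that angles are given in degrees in [−180°, 180°) and that a range whose lower bound exceeds its upper bound wraps through ±180°; the names `STATES`, `in_range`, and `state_populations` are illustrative only.

```python
import numpy as np

# (phi_lo, phi_hi), (psi_lo, psi_hi) in degrees, as listed in Table 4.1.
# A range with lo > hi is read as wrapping through +/-180 degrees.
STATES = {
    "PII":  ((-100.0, 0.0), (30.0, -120.0)),
    "C5":   ((120.0, -100.0), (30.0, -120.0)),
    "aR":   ((-100.0, 0.0), (-120.0, 30.0)),
    "aP":   ((120.0, -100.0), (-120.0, 30.0)),
    "aL":   ((0.0, 120.0), (0.0, 120.0)),
    "C7ax": ((0.0, 120.0), (120.0, 0.0)),
}


def in_range(angle, lo, hi):
    """Boolean mask for lo <= angle < hi, with wrap-around when lo > hi."""
    angle = np.asarray(angle, dtype=float)
    if lo <= hi:
        return (angle >= lo) & (angle < hi)
    return (angle >= lo) | (angle < hi)


def state_populations(phi, psi, weights=None):
    """Weighted fraction of samples in each state (discrete form of 4.77)."""
    phi = np.asarray(phi, dtype=float)
    psi = np.asarray(psi, dtype=float)
    if weights is None:
        weights = np.ones_like(phi)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    result = {}
    for name, (phi_range, psi_range) in STATES.items():
        mask = in_range(phi, *phi_range) & in_range(psi, *psi_range)
        result[name] = float(weights[mask].sum() / total)
    return result
```

With this reading of the ranges the six states tile the whole (φ, ψ) plane, so the populations sum to one. Passing reweighting factors as `weights` gives populations at a temperature and pressure other than those of the simulation.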

Fig. 4.5. Time series of volume V from (a) the conventional isobaric–isothermal MD simulation at $T_0$ = 240 K and $P_0$ = 0.1 MPa; (b) the conventional isobaric–isothermal MD simulation at $T_0$ = 298 K and $P_0$ = 0.1 MPa; (c) the conventional isobaric–isothermal MD simulation at $T_0$ = 298 K and $P_0$ = 300 MPa; and (d) the multibaric–multithermal MD simulation. [Axes in each panel: V/nm³, from 1.5 to 4.0, against t/ns, from 0.0 to 1.0.]

Table 4.1. The dihedral-angle ranges of (φ, ψ) for six states and their populations at T = 298 K and P = 0.1 MPa, which were obtained by the reweighting techniques from the MUBATH MD simulation

State                             φ                 ψ                 Population
PII                               (−100°, 0°)       (30°, −120°)      0.412(18)
C5                                (120°, −100°)     (30°, −120°)      0.496(20)
αR                                (−100°, 0°)       (−120°, 30°)      0.041(6)
αP                                (120°, −100°)     (−120°, 30°)      0.046(10)
αL                                (0°, 120°)        (0°, 120°)        0.004(4)
$\mathrm{C}_7^{\mathrm{ax}}$      (0°, 120°)        (120°, 0°)        0.0008(7)

The numbers in parentheses for the population are the estimated uncertainties.

Estimation of the partial molar enthalpy and partial molar volume is important in solution chemistry, because these values control the population of each state when temperature and pressure are changed. It is the MUBATH algorithm that enables us to calculate the partial molar enthalpy and partial molar volume accurately.


Figure 4.6 shows the population ratios $W_{\mathrm{C5}}/W_{\mathrm{PII}}$, $W_{\alpha\mathrm{R}}/W_{\mathrm{PII}}$, $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$, and $W_{\alpha\mathrm{L}}/W_{\mathrm{PII}}$ as functions of the inverse of temperature 1/T at the constant pressure of P = 0.1 MPa. The error bars were estimated by the jackknife method [151]. As temperature increases, $W_{\mathrm{C5}}/W_{\mathrm{PII}}$, $W_{\alpha\mathrm{R}}/W_{\mathrm{PII}}$, and $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$ increase, although the error bars of $W_{\alpha\mathrm{L}}/W_{\mathrm{PII}}$ are too large to discuss its temperature dependence. Thermodynamics tells us that an increase in temperature at constant pressure causes an increase in enthalpy. The increases in the population ratios $W/W_{\mathrm{PII}}$ relative to the PII state with increasing temperature therefore indicate that the enthalpy of the C5, αR, and αP states is higher than that of the PII state. The difference of partial molar enthalpy ΔH of the C5 state from that of the PII state is, for example, calculated from the derivative of $\log(W_{\mathrm{C5}}/W_{\mathrm{PII}})$ with respect to 1/T:

$$\Delta H = -R\left[\frac{\partial \log\left(W_{\mathrm{C5}}/W_{\mathrm{PII}}\right)}{\partial (1/T)}\right]_P, \tag{4.78}$$

where R is the gas constant. The derivative in (4.78) was calculated here by least-squares fitting. The error bars were again estimated by the jackknife method [151]. These enthalpy differences are listed in Table 4.2.
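A least-squares evaluation of (4.78) and a generic jackknife error estimate might look like the following Python sketch. It assumes a straight-line fit of log(W/W_PII) against 1/T with `numpy.polyfit` and a delete-one jackknife over user-supplied data blocks; how the fit range and the blocks were actually chosen is not stated in the text, so the names `enthalpy_difference` and `jackknife_error` and their arguments are hypothetical.

```python
import numpy as np

R_GAS = 8.314462618e-3   # gas constant in kJ/(mol K)


def enthalpy_difference(temperatures, ratios):
    """Delta H in kJ/mol from a linear fit of log(ratio) against 1/T (4.78)."""
    x = 1.0 / np.asarray(temperatures, dtype=float)
    y = np.log(np.asarray(ratios, dtype=float))
    slope, _intercept = np.polyfit(x, y, 1)
    return -R_GAS * slope


def jackknife_error(blocks, estimator):
    """Delete-one jackknife error of estimator(blocks).

    blocks    : array whose first axis indexes independent data blocks
    estimator : function mapping an array of blocks to a number
    """
    blocks = np.asarray(blocks, dtype=float)
    n = len(blocks)
    leave_one_out = np.array(
        [estimator(np.delete(blocks, i, axis=0)) for i in range(n)]
    )
    mean = leave_one_out.mean(axis=0)
    return np.sqrt((n - 1) / n * ((leave_one_out - mean) ** 2).sum(axis=0))
```

For the plain mean, the jackknife error reduces to the usual standard error, which gives a quick consistency check before applying it to a nonlinear estimator such as a fitted slope.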

Fig. 4.6. The population ratios as functions of the inverse of temperature 1/T at constant pressure of P = 0.1 MPa, which were obtained by the reweighting techniques from the results of the multibaric–multithermal MD simulation: (a) those of $W_{\mathrm{C5}}/W_{\mathrm{PII}}$ and $W_{\alpha\mathrm{R}}/W_{\mathrm{PII}}$, (b) that of $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$, and (c) that of $W_{\alpha\mathrm{L}}/W_{\mathrm{PII}}$. [Axes in each panel: $\log(W/W_{\mathrm{PII}})$, from −4 to 1, against (1/T)/(10⁻³ K⁻¹), from 2.5 to 4.0.]

Table 4.2. Differences of partial molar enthalpy ΔH (kJ mol⁻¹) and partial molar volume ΔV (cm³ mol⁻¹) of the C5, αR, αP, and αL states from those of the PII state

         ΔH (kJ mol⁻¹)               ΔV (cm³ mol⁻¹)
State    MUBATH MD      Raman        MUBATH MD       Raman
C5       1.1 ± 0.9      2.5          0.7 ± 0.9       0.1
αR       10.8 ± 2.8     4.4          −1.2 ± 5.4      1.1
αP       7.2 ± 4.3      −            2.8 ± 2.6       −
αL       −3 ± 56        −            −8.1 ± 11.9     −

The Raman spectroscopy data [152] are also given.


Fig. 4.7. The population ratios as functions of pressure P at constant temperature of T = 298 K, which were obtained by the reweighting techniques from the results of the multibaric–multithermal MD simulation: (a) those of $W_{\mathrm{C5}}/W_{\mathrm{PII}}$ and $W_{\alpha\mathrm{R}}/W_{\mathrm{PII}}$, (b) that of $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$, and (c) that of $W_{\alpha\mathrm{L}}/W_{\mathrm{PII}}$. [Axes in each panel: $\log(W/W_{\mathrm{PII}})$ against P/MPa, from 0 to 300.]

Table 4.2 also lists the experimental data by Raman spectroscopy for the C5 and αR states [152]. Considering the errors, the differences of the partial molar enthalpy ΔH by the MUBATH MD simulation agree well with those by the Raman spectroscopy.

Figure 4.7 shows the population ratios $W_{\mathrm{C5}}/W_{\mathrm{PII}}$, $W_{\alpha\mathrm{R}}/W_{\mathrm{PII}}$, $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$, and $W_{\alpha\mathrm{L}}/W_{\mathrm{PII}}$ as functions of pressure P at the constant temperature of T = 298 K. As pressure increases, both $W_{\mathrm{C5}}/W_{\mathrm{PII}}$ and $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$ decrease, although the error bars of the $W_{\alpha\mathrm{R}}/W_{\mathrm{PII}}$ and $W_{\alpha\mathrm{L}}/W_{\mathrm{PII}}$ data are too large to discuss their pressure dependence. An increase in pressure at constant temperature generally causes a decrease in volume. The decreases in $W_{\mathrm{C5}}/W_{\mathrm{PII}}$ and $W_{\alpha\mathrm{P}}/W_{\mathrm{PII}}$ therefore mean that the volumes of the C5 and αP states are larger than that of the PII state. The difference of partial molar volume ΔV of the C5 state from that of the PII state is, for instance, calculated from the derivative of $\log(W_{\mathrm{C5}}/W_{\mathrm{PII}})$ with respect to pressure P by

$$\Delta V = -RT\left[\frac{\partial \log\left(W_{\mathrm{C5}}/W_{\mathrm{PII}}\right)}{\partial P}\right]_T. \tag{4.79}$$

The differences between the partial molar volumes of the αR, αP, and αL states and that of the PII state were obtained in the same way. These volume differences are shown in Table 4.2. The partial molar volume difference ΔV between C5 and PII and that between αR and PII obtained by the MUBATH MD simulation agree well with those by the Raman spectroscopy.
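Equation (4.79) can be evaluated in the same way as (4.78). The short Python sketch below assumes a linear least-squares fit of log(W/W_PII) against pressure in MPa; the function name `volume_difference` is illustrative. The unit bookkeeping is the only subtle point: with R in J mol⁻¹ K⁻¹ and P in MPa, the product −RT × slope comes out in J mol⁻¹ MPa⁻¹, which is numerically equal to cm³ mol⁻¹.

```python
import numpy as np

R_GAS = 8.314462618   # gas constant in J/(mol K)


def volume_difference(pressures_mpa, ratios, temperature):
    """Delta V in cm^3/mol from a linear fit of log(ratio) against P (4.79).

    pressures_mpa : pressures in MPa
    ratios        : population ratios W/W_PII at those pressures
    temperature   : temperature in K
    """
    p = np.asarray(pressures_mpa, dtype=float)
    y = np.log(np.asarray(ratios, dtype=float))
    slope, _intercept = np.polyfit(p, y, 1)   # units: 1/MPa
    # 1 J/MPa = 1e-6 m^3 = 1 cm^3, so no further conversion is needed.
    return -R_GAS * temperature * slope
```

As an order-of-magnitude check, a ΔV of 0.7 cm³ mol⁻¹ at 298 K changes log(W/W_PII) by only about 0.08 between 0 and 300 MPa, which illustrates why small statistical errors in the ratios translate into large relative errors in ΔV.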

The MUBATH method has the merits of both the multicanonical algorithm and the isobaric-isothermal method: the simulation can escape from local-minimum free-energy states, and it is not restricted to a specific temperature and pressure. From a single MUBATH simulation run, we could obtain thermodynamic quantities at pressures ranging from 1 MPa to several hundred MPa. Hence, this generalized-ensemble algorithm is particularly suitable for studying pressure-induced denaturation of proteins.

The next example is the application of REM MC simulations to the prediction of membrane protein structures [153–156].


It is estimated that 20–30% of all genes in most genomes encode membrane proteins [157]. However, only a small number of detailed structures have been obtained for membrane proteins because of technical difficulties in experiments such as high quality crystal growth. Therefore, it is desirable to develop a method for predicting membrane protein structures by computer simulations.

Our method consists of two parts. In the first part, amino-acid sequences of the transmembrane helix regions of the target protein are identified. It is already established that the transmembrane helical segments can be predicted by analyzing mainly the hydrophobicity of amino-acid sequences, without having any information about the higher order structures. There exist many WWW servers such as TMHMM [157], MEMSAT [158], SOSUI [159], and HMMTOP [160]. Given the amino-acid sequence of a protein, these servers judge whether the protein is a membrane protein or not and, if it is, predict the regions in the amino-acid sequence that correspond to the transmembrane helices.
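The idea that hydrophobicity alone already points to transmembrane segments can be illustrated with a sliding-window hydropathy average. The Python sketch below is not the algorithm of any of the servers named above; it swaps in the classic Kyte–Doolittle hydropathy scale, and the window length of 19 residues and the threshold of 1.6 are assumptions taken from common practice with that scale. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window of the given length."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """Start indices of windows hydrophobic enough to be helix candidates."""
    profile = hydropathy_profile(sequence, window)
    return [start for start, mean in enumerate(profile) if mean >= threshold]
```

Runs of consecutive start indices mark one candidate segment. Dedicated predictors add statistical models of helix ends, loop lengths, and topology on top of this kind of signal.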

In the second part, we perform a REM simulation of the transmembrane helices that were identified in the first part. Given the amino-acid sequences of transmembrane helices, we first construct α-helices of these sequences. For our simulations, we introduce the following rather drastic approximations. (1) We treat the backbone of each α-helix as a rigid body, and only side-chain structures are made flexible. (2) We neglect the rest of the amino acids of the membrane protein (such as loop regions). (3) We neglect surrounding molecules such as lipids. In principle, we can also use the molecular dynamics method, but we employ the Monte Carlo algorithm here. We update configurations with rigid translations and rigid rotations of each α-helix and torsion rotations of side chains. We use a standard force field such as CHARMM [161, 162] for the potential energy of the system. We also add the following simple harmonic constraints to the original force-field energy:

$$
\begin{aligned}
E_{\mathrm{constr}} ={}& \sum_{i=1}^{N_{\mathrm{H}}-1} k_1\, \theta\left(r_{i,i+1} - d_{i,i+1}\right) \left[r_{i,i+1} - d_{i,i+1}\right]^2 \\
&+ \sum_{i=1}^{N_{\mathrm{H}}} \left\{ k_2\, \theta\left(\left|z_i^{\mathrm{L}} - z_0^{\mathrm{L}}\right| - d_i^{\mathrm{L}}\right) \left[\left|z_i^{\mathrm{L}} - z_0^{\mathrm{L}}\right| - d_i^{\mathrm{L}}\right]^2 + k_2\, \theta\left(\left|z_i^{\mathrm{U}} - z_0^{\mathrm{U}}\right| - d_i^{\mathrm{U}}\right) \left[\left|z_i^{\mathrm{U}} - z_0^{\mathrm{U}}\right| - d_i^{\mathrm{U}}\right]^2 \right\} \\
&+ \sum_{\mathrm{C}_\alpha} k_3\, \theta\left(r_{\mathrm{C}_\alpha} - d_{\mathrm{C}_\alpha}\right) \left[r_{\mathrm{C}_\alpha} - d_{\mathrm{C}_\alpha}\right]^2,
\end{aligned}
\tag{4.80}
$$

where $N_{\mathrm{H}}$ is the total number of transmembrane helices in the protein and θ(x) is the step function:

$$
\theta(x) =
\begin{cases}
1, & \text{for } x \ge 0, \\
0, & \text{otherwise},
\end{cases}
\tag{4.81}
$$

and $k_1$, $k_2$, and $k_3$ are the force constants of the harmonic constraints; $r_{i,i+1}$ is the distance between the C atom of the C-terminus of the ith helix and the N atom of the N-terminus of the (i + 1)th helix; $z_i^{\mathrm{L}}$ and $z_i^{\mathrm{U}}$ are the z-coordinate values of the Cα (or C) atom of the N-terminus (or C-terminus) of the ith helix near the fixed lower boundary value $z_0^{\mathrm{L}}$ and the upper boundary value $z_0^{\mathrm{U}}$ of the membrane, respectively; $r_{\mathrm{C}_\alpha}$ are the distances of the Cα atoms from the origin; and $d_{i,i+1}$, $d_i^{\mathrm{L}}$, $d_i^{\mathrm{U}}$, and $d_{\mathrm{C}_\alpha}$ are the corresponding central value constants of the harmonic constraints. The first term in (4.80) is the energy that constrains pairs of adjacent helices along the amino-acid chain not to be apart from each other too much (loop constraints). This term has a nonzero value only when the distance $r_{i,i+1}$ becomes longer than $d_{i,i+1}$.

The second term in (4.80) is the energy that constrains the helix N-terminus and C-terminus to be located near the membrane boundary planes. This term has a nonzero value only when the C atom of a helix C-terminus or the Cα atom of a helix N-terminus is more than $d_i^{\mathrm{L}}$ (or $d_i^{\mathrm{U}}$) away from the corresponding boundary plane. Based on the knowledge that most membrane proteins are placed in parallel, this constraint energy is included so that the helices do not deviate too much from the perpendicular orientation with respect to the membrane boundary planes.

The third term in (4.80) is the energy that constrains all Cα atoms within the sphere (centered at the origin) of radius $d_{\mathrm{C}_\alpha}$. This term has a nonzero value only when Cα atoms go out of this sphere. The term is introduced so that the center of mass of the molecule stays near the origin. The radius of the sphere is set to a large value to guarantee that a wide conformational space is sampled.
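The three terms of (4.80) can be written compactly as one-sided harmonic penalties. The Python sketch below is a minimal illustration, assuming that the loop distances, the terminal z-coordinates, and the Cα distances from the origin have already been computed from the coordinates; the function and argument names, and the container `BR_PARAMS` holding the parameter values that the text reports for bacteriorhodopsin, are hypothetical.

```python
import numpy as np


def theta_sq(x):
    """theta(x) * x**2, with theta(x) = 1 for x >= 0 and 0 otherwise (4.81)."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0.0, x * x, 0.0)


def constraint_energy(r_loop, z_low, z_up, r_ca, *, k1, k2, k3,
                      d_loop, z0_low, z0_up, d_low, d_up, d_ca):
    """Constraint energy of (4.80) in the units of the force constants.

    r_loop : N_H - 1 distances between consecutive helix termini
    z_low  : N_H z-coordinates of the helix ends near the lower boundary
    z_up   : N_H z-coordinates of the helix ends near the upper boundary
    r_ca   : distances of all C-alpha atoms from the origin
    """
    e_loop = k1 * theta_sq(np.asarray(r_loop, dtype=float) - d_loop).sum()
    e_low = k2 * theta_sq(
        np.abs(np.asarray(z_low, dtype=float) - z0_low) - d_low).sum()
    e_up = k2 * theta_sq(
        np.abs(np.asarray(z_up, dtype=float) - z0_up) - d_up).sum()
    e_sphere = k3 * theta_sq(np.asarray(r_ca, dtype=float) - d_ca).sum()
    return float(e_loop + e_low + e_up + e_sphere)


# Parameter values reported in the text for bacteriorhodopsin
# (force constants in kcal/mol per square angstrom, lengths in angstrom).
BR_PARAMS = dict(k1=1.0, k2=1.0, k3=0.05, d_loop=20.0,
                 z0_low=0.0, z0_up=31.5, d_low=2.0, d_up=2.0, d_ca=100.0)
```

Because every term is switched on by the step function, a configuration that satisfies all bounds contributes exactly zero, so the constraints do not bias the sampling inside the allowed region.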

In the first part of the present method, we obtain amino-acid sequences of the transmembrane helix regions from existing WWW servers such as those in [157–160]. However, the precision of these programs in the WWW servers is about 85% and needs improvement. We thus focus our attention on the effectiveness of the second part of our method, leaving this improvement to the developers of the WWW servers. Namely, we use the experimentally known amino-acid sequence of helices (without relying on the WWW servers) and try to predict their conformations, following the prescription of the second part of our method described earlier.

The results that we present here are those of bacteriorhodopsin [156]. We thus have $N_{\mathrm{H}}$ = 7. Other parameter values that we used in (4.80) are $k_1$ = 1.0 kcal mol⁻¹ Å⁻², $d_{i,i+1}$ = 20.0 Å, $k_2$ = 1.0 kcal mol⁻¹ Å⁻², $z_0^{\mathrm{L}}$ = 0.0 Å, $z_0^{\mathrm{U}}$ = 31.5 Å, $d_i^{\mathrm{U}} = d_i^{\mathrm{L}}$ = 2.0 Å, $k_3$ = 0.05 kcal mol⁻¹ Å⁻², and $d_{\mathrm{C}_\alpha}$ = 100 Å. We performed a REM MC simulation of 168,000,000 MC steps. We used the following 32 temperatures: 200, 218, 238, 260, 284, 310, 338, 369, 410, 455, 505, 561, 623, 691, 768, 853, 947, 1052, 1125, 1202, 1285, 1374, 1469, 1642, 1835, 2051, 2293, 2679, 3132, 3660, 4278, and 5000 K. This temperature distribution was chosen so that all the acceptance ratios of replica exchange are almost uniform and sufficiently large (>10%) for computational efficiency. The highest temperature was chosen sufficiently high so that no trapping in local-minimum-energy states occurs. Replica exchange was attempted once every 50 MC steps.
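The exchange step itself can be sketched in a few lines of Python. The code assumes the usual Metropolis criterion for replica exchange, w = min(1, exp(−Δ)) with Δ = (β_n − β_m)(E_i − E_j) for replica i at temperature T_m and replica j at T_n. The alternation between even and odd neighbor pairs (the `offset` argument) is a common convention assumed here and is not stated in the text. All function names are illustrative.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

# The 32 temperatures (K) listed in the text.
TEMPERATURES = [
    200, 218, 238, 260, 284, 310, 338, 369, 410, 455, 505, 561, 623, 691,
    768, 853, 947, 1052, 1125, 1202, 1285, 1374, 1469, 1642, 1835, 2051,
    2293, 2679, 3132, 3660, 4278, 5000,
]


def exchange_probability(e_i, t_m, e_j, t_n):
    """Metropolis probability of swapping replica i (at T_m) and j (at T_n)."""
    beta_m = 1.0 / (KB * t_m)
    beta_n = 1.0 / (KB * t_n)
    delta = (beta_n - beta_m) * (e_i - e_j)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchanges(energies, temp_index, offset, rng):
    """Try swaps between neighboring temperatures (m, m + 1).

    energies   : energies[r] is the current energy of replica r (kcal/mol)
    temp_index : temp_index[r] is the index of the temperature of replica r
    offset     : 0 or 1, selects the pairs (0,1),(2,3),... or (1,2),(3,4),...
    rng        : object with a random() method, e.g. random.Random
    """
    replica_at = {m: r for r, m in enumerate(temp_index)}
    for m in range(offset, len(TEMPERATURES) - 1, 2):
        i, j = replica_at[m], replica_at[m + 1]
        p = exchange_probability(energies[i], TEMPERATURES[m],
                                 energies[j], TEMPERATURES[m + 1])
        if rng.random() < p:
            temp_index[i], temp_index[j] = m + 1, m
    return temp_index
```

In a driver loop one would run 50 MC steps per replica, call `attempt_exchanges` with alternating `offset`, and repeat. Tracking `temp_index` for one replica over time gives the kind of temperature random walk described for Replica 14.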


Fig. 4.8. Typical snapshots from the REM simulation for Replica 14. The configurations were taken at the 43,146,000th MC step (a), at the 47,664,000th MC step (b), at the 48,155,000th MC step (c), at the 48,822,000th MC step (d), at the 49,500,000th MC step (e), and at the 58,398,000th MC step (f). The RMSD from the native configuration is 7.78 Å (a), 10.84 Å (b), 15.18 Å (c), 14.76 Å (d), 11.71 Å (e), and 5.72 Å (f) with respect to all Cα atoms. The corresponding temperatures are 3132 K (a), 2679 K (b), 3132 K (c), 3132 K (d), 2051 K (e), and 561 K (f). The color of the helices from the N terminus is as follows: Helix A (blue), Helix B (aqua), Helix C (green), Helix D (yellow-green), Helix E (yellow), Helix F (orange), and Helix G (red). The figures were created with RasMol [163]

In Fig. 4.8, typical snapshots of one of the 32 replicas, Replica 14, from the REM simulation are shown. In Fig. 4.8a, the helix configuration is different from the native one (see Fig. 4.9a below). In particular, Helix G is trapped in the center. As the simulation proceeds, the temperature becomes high and then drops to low values by the replica-exchange process, and the same helix configuration (“topology”) as the native one is finally obtained in Fig. 4.8f. These figures confirm that our simulations indeed sampled a wide configurational space. We see that the REM simulation performs random walks not only in energy space but also in conformational space and that the replicas do not get trapped in one of a huge number of local-minimum-energy states.

Fig. 4.9. (a) The PDB structure of bacteriorhodopsin (PDB code: 1C3W) with retinal. (b) The smallest RMSD configuration that was obtained by the REM simulation. (a1), (a2) and (b1), (b2) are the same structures viewed from different angles (from top and from side), respectively. Dark-color atoms in the center in (a) represent the retinal. (a) was drawn by eliminating the loop regions and lipids from the PDB file. The RMSD of the structure in (b) from the native structure of (a) is 4.42 Å with respect to all Cα atoms. The figures were created with RasMol [163]

In Fig. 4.9, the PDB structure and the smallest RMSD structure obtained by the REM simulation are compared. The retinal molecule is included in the native PDB structure (Fig. 4.9a), but it was not used in our simulation. Nevertheless, the structure obtained by Replica 14 (Fig. 4.9b) has the same helix topology (relative helix configuration) as the native structure. Their structures are indeed quite similar to each other. We remark that the initial conformation of Replica 14 is very different from the native one (RMSD = 16.39 Å). It is indeed remarkable that we could obtain a native-like structure from a random initial conformation, even though we neglected loop regions, retinal, lipids, and surrounding water molecules in our simulation. This suggests that the helix–helix interactions are the main driving force in the final stage of the structure formation of membrane proteins.
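RMSD values such as those quoted above are usually computed after optimal rigid-body superposition of the two Cα coordinate sets. The text does not say which superposition procedure was used; the Python sketch below assumes the Kabsch algorithm based on a singular value decomposition, and the function name `superposed_rmsd` is illustrative.

```python
import numpy as np


def superposed_rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Both arrays must list the same atoms in the same order
    (for example, all C-alpha atoms of the seven helices).
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    a = a - a.mean(axis=0)             # remove translation
    b = b - b.mean(axis=0)
    h = a.T @ b                        # 3x3 covariance matrix
    u, _s, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    correction = np.diag([1.0, 1.0, d])
    rotation = vt.T @ correction @ u.T
    a_rot = a @ rotation.T             # rotate set A onto set B
    return float(np.sqrt(((a_rot - b) ** 2).sum(axis=1).mean()))
```

The determinant check matters for helix bundles: without it, a mirror-image packing of the helices could be superposed by an improper rotation and would appear falsely native-like.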

The final example is the application of REMD simulations to the folding of a small protein, namely, the B1 domain of streptococcal protein G [164]. The simulations were performed on the Earth Simulator. Protein G consists of 56 amino acids, and the total number of atoms in the protein is 855. For the force fields, we used OPLS-AA/L [165] for the protein molecule and TIP3P [143] for water molecules. We first performed a REMD simulation of protein G in vacuum with 96 replicas. The initial conformation of the REMD simulation was a fully extended one. We then solvated one of the obtained compact conformations in a sphere of water of radius 50 Å. The total number of water molecules was 17,187 (the total number of atoms was then 52,416 including the protein atoms).

Fig. 4.10. The canonical probability distributions of the total potential energy of protein G obtained from the REMD simulation with 224 temperatures. They are all bell-shaped with sufficient overlaps with the neighboring ones

Fig. 4.11. Snapshots from the REMD simulation of protein G in explicit solvent

Using 112 nodes of the Earth Simulator, we performed a REMD simulation of this system with 224 replicas. The REMD simulation was successful in the sense that we observed a random walk in potential energy space, which suggests that a wide conformational space was sampled. In Fig. 4.10 we show the canonical probability distributions of the total potential energy at the corresponding 224 temperatures ranging from 250 to 700 K.

As is clear from Fig. 4.10, all the distributions overlap sufficiently with their neighbors, suggesting that this REMD simulation was successful. We indeed observed a random walk in the potential energy space. This random walk in potential energy space induced a random walk in the conformational space, and we indeed observed many occasions of the formation of native-like secondary structures (α-helix and β-strands) during the REMD simulation.
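One simple way to put a number on "sufficient overlap" is the shared area of two normalized energy histograms from neighboring temperatures. This measure is an assumption introduced here for illustration and is not the criterion used in the text; the function name `histogram_overlap` and the default bin count are likewise hypothetical.

```python
import numpy as np


def histogram_overlap(energies_a, energies_b, bins=50):
    """Shared area of two normalized energy histograms (0 = none, 1 = identical)."""
    a = np.asarray(energies_a, dtype=float)
    b = np.asarray(energies_b, dtype=float)
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    edges = np.linspace(lo, hi, bins + 1)   # common bins for both samples
    pa, _ = np.histogram(a, bins=edges)
    pb, _ = np.histogram(b, bins=edges)
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    return float(np.minimum(pa, pb).sum())
```

Applied to every pair of adjacent temperatures, values that are all comfortably above zero indicate that exchanges between every neighboring pair can be accepted, which is the precondition for a random walk across the whole temperature range.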

In Fig. 4.11 we show some of the snapshots from this REMD simulation. Although we did observe many native-like secondary-structure formations, the simulation has not reached the native structure yet. We have to improve the force-field parameters and need more computation time.


4.5 Conclusions

In this article, we have reviewed some powerful generalized-ensemble algorithms for both Monte Carlo simulations and molecular dynamics simulations. A simulation in a generalized ensemble realizes a random walk in potential energy space, alleviating the multiple-minima problem that is a common difficulty in simulations of complex systems with many degrees of freedom.

Detailed formulations of the two well-known generalized-ensemble algorithms, namely, multicanonical algorithm (MUCA) and replica-exchange method (REM), were given. We then introduced further extensions of the above two methods. We have shown the effectiveness of these algorithms by applying them to various biomolecular systems.

Acknowledgements

The author thanks his co-workers for useful discussions. In particular, he is grateful to Drs. B.A. Berg, M. Kawata, A. Kitao, H. Kokubo, M. Mikami, A. Mitsutake, C. Muguruma, T. Nishikawa, T. Okabe, H. Okumura, Y. Sugita, and T. Yoda for collaborations that led to the results presented in the present article. The computations were performed on the Earth Simulator, computers at the Computer Center in the Institute for Molecular Science, and those at the Nagoya University Computer Center. This work was supported, in part, by Grants-in-Aid for Scientific Research in Priority Areas (“Water and Biomolecules”), for the Next Generation Super Computing Project, Nanoscience Program from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan, and for JST-BIRD Project.

References

1. U.H.E. Hansmann, Y. Okamoto, in Annual Reviews of Computational Physics VI, ed. by D. Stauffer (World Scientific, Singapore, 1999), pp. 129–157
2. A. Mitsutake, Y. Sugita, Y. Okamoto, Biopolymers (Peptide Science) 60, 96–123 (2001)
3. Y. Sugita, Y. Okamoto, in Lecture Notes in Computational Science and Engineering, ed. by T. Schlick, H.H. Gan (Springer-Verlag, Berlin, 2002), pp. 304–332; e-print: cond-mat/0102296
4. Y. Okamoto, J. Mol. Graphics Mod. 22, 425–439 (2004); e-print: cond-mat/0308360
5. H. Kokubo, Y. Okamoto, Mol. Sim. 32, 791–801 (2006)
6. S.G. Itoh, H. Okumura, Y. Okamoto, Mol. Sim. 33, 47–56 (2007)
7. Y. Sugita, A. Mitsutake, Y. Okamoto, in Lecture Notes in Physics, ed. by W. Janke (Springer-Verlag, Berlin, 2008), pp. 369–407; e-print: arXiv:0707.3382v1 [cond-mat.stat-mech]
8. A.M. Ferrenberg, R.H. Swendsen, Phys. Rev. Lett. 61, 2635–2638 (1988); 63, 1658 (1989)


9. A.M. Ferrenberg, R.H. Swendsen, Phys. Rev. Lett. 63, 1195–1198 (1989)
10. S. Kumar, D. Bouzida, R.H. Swendsen, P.A. Kollman, J.M. Rosenberg, J. Comput. Chem. 13, 1011–1021 (1992)
11. B.A. Berg, T. Neuhaus, Phys. Lett. B 267, 249–253 (1991)
12. B.A. Berg, T. Neuhaus, Phys. Rev. Lett. 68, 9–12 (1992)
13. B.A. Berg, Fields Institute Communications 26, 1–24 (2000); also see e-print: cond-mat/9909236
14. W. Janke, Physica A 254, 164–178 (1998)
15. J. Lee, Phys. Rev. Lett. 71, 211–214 (1993); 71, 2353 (1993)
16. M. Mezei, J. Comput. Phys. 68, 237–248 (1987)
17. C. Bartels, M. Karplus, J. Phys. Chem. B 102, 865–880 (1998)
18. G.M. Torrie, J.P. Valleau, J. Comput. Phys. 23, 187–199 (1977)
19. J.S. Wang, R.H. Swendsen, J. Stat. Phys. 106, 245–285 (2002)
20. F. Wang, D.P. Landau, Phys. Rev. Lett. 86, 2050–2053 (2001)
21. F. Wang, D.P. Landau, Phys. Rev. E 64, 056101 (2001)
22. Q. Yan, R. Faller, J.J. de Pablo, J. Chem. Phys. 116, 8745–8749 (2002)
23. S. Trebst, D.A. Huse, M. Troyer, Phys. Rev. E 70, 046701 (2004)
24. B.A. Berg, T. Celik, Phys. Rev. Lett. 69, 2292–2295 (1992)
25. B.A. Berg, U.H.E. Hansmann, T. Neuhaus, Phys. Rev. B 47, 497–500 (1993)
26. W. Janke, S. Kappler, Phys. Rev. Lett. 74, 212–215 (1995)
27. B.A. Berg, W. Janke, Phys. Rev. Lett. 80, 4771–4774 (1998)
28. N. Hatano, J.E. Gubernatis, Prog. Theor. Phys. (Suppl.) 138, 442–447 (2000)
29. B.A. Berg, A. Billoire, W. Janke, Phys. Rev. B 61, 12143–12150 (2000)
30. U.H.E. Hansmann, Y. Okamoto, J. Comput. Chem. 14, 1333–1338 (1993)
31. U.H.E. Hansmann, Y. Okamoto, Physica A 212, 415–437 (1994)
32. M.H. Hao, H.A. Scheraga, J. Phys. Chem. 98, 4940–4948 (1994)
33. Y. Okamoto, U.H.E. Hansmann, J. Phys. Chem. 99, 11276–11287 (1995)
34. N.B. Wilding, Phys. Rev. E 52, 602–611 (1995)
35. A. Kolinski, W. Galazka, J. Skolnick, Proteins 26, 271–287 (1996)
36. N. Urakami, M. Takasu, J. Phys. Soc. Jpn. 65, 2694–2699 (1996)
37. S. Kumar, P. Payne, M. Vasquez, J. Comput. Chem. 17, 1269–1275 (1996)
38. U.H.E. Hansmann, Y. Okamoto, F. Eisenmenger, Chem. Phys. Lett. 259, 321–330 (1996)
39. U.H.E. Hansmann, Y. Okamoto, Phys. Rev. E 54, 5863–5865 (1996)
40. U.H.E. Hansmann, Y. Okamoto, J. Comput. Chem. 18, 920–933 (1997)
41. H. Noguchi, K. Yoshikawa, Chem. Phys. Lett. 278, 184–188 (1997)
42. N. Nakajima, H. Nakamura, A. Kidera, J. Phys. Chem. B 101, 817–824 (1997)
43. C. Bartels, M. Karplus, J. Comput. Chem. 18, 1450–1462 (1997)
44. J. Higo, N. Nakajima, H. Shirai, A. Kidera, H. Nakamura, J. Comput. Chem. 18, 2086–2092 (1997)
45. Y. Iba, G. Chikenji, M. Kikuchi, J. Phys. Soc. Jpn. 67, 3327–3330 (1998)
46. A. Mitsutake, U.H.E. Hansmann, Y. Okamoto, J. Mol. Graphics Mod. 16, 226–238; 262–263 (1998)
47. U.H.E. Hansmann, Y. Okamoto, J. Phys. Chem. B 103, 1595–1604 (1999)
48. H. Shimizu, K. Uehara, K. Yamamoto, Y. Hiwatari, Mol. Sim. 22, 285–301 (1999)
49. S. Ono, N. Nakajima, J. Higo, H. Nakamura, Chem. Phys. Lett. 312, 247–254 (1999)
50. A. Mitsutake, Y. Okamoto, J. Chem. Phys. 112, 10638–10647 (2000)


51. K. Sayano, H. Kono, M.M. Gromiha, A. Sarai, J. Comput. Chem. 21, 954–962 (2000)
52. F. Yasar, T. Celik, B.A. Berg, H. Meirovitch, J. Comput. Chem. 21, 1251–1261 (2000)
53. A. Mitsutake, M. Kinoshita, Y. Okamoto, F. Hirata, Chem. Phys. Lett. 329, 295–303 (2000)
54. M.S. Cheung, A.E. Garcia, J.N. Onuchic, Proc. Natl. Acad. Sci. U.S.A. 99, 685–690 (2002)
55. N. Kamiya, J. Higo, H. Nakamura, Protein Sci. 11, 2297–2307 (2002)
56. S.W. Jang, Y. Pak, S.M. Shin, J. Chem. Phys. 116, 4782–4786 (2002)
57. J.G. Kim, Y. Fukunishi, H. Nakamura, Phys. Rev. E 67, 011105 (2003)
58. N. Rathore, T.A. Knotts IV, J.J. de Pablo, J. Chem. Phys. 118, 4285–4290 (2003)
59. T. Terada, Y. Matsuo, A. Kidera, J. Chem. Phys. 118, 4306–4311 (2003)
60. B.A. Berg, H. Noguchi, Y. Okamoto, Phys. Rev. E 68, 036126 (2003)
61. M. Bachmann, W. Janke, Phys. Rev. Lett. 91, 208105 (2003)
62. H. Okumura, Y. Okamoto, Chem. Phys. Lett. 383, 391–396 (2004)
63. H. Okumura, Y. Okamoto, Chem. Phys. Lett. 391, 248–253 (2004)
64. S.G. Itoh, Y. Okamoto, Chem. Phys. Lett. 400, 308–313 (2004)
65. S.G. Itoh, Y. Okamoto, Phys. Rev. E 76, 026705 (2007)
66. T. Munakata, S. Oyama, Phys. Rev. E 54, 4394–4398 (1996)
67. K. Hukushima, K. Nemoto, J. Phys. Soc. Jpn. 65, 1604–1608 (1996)
68. K. Hukushima, H. Takayama, K. Nemoto, Int. J. Mod. Phys. C 7, 337–344 (1996)
69. C.J. Geyer, in Computing Science and Statistics: Proc. 23rd Symp. on the Interface, ed. by E.M. Keramidas (Interface Foundation, Fairfax Station, 1991), pp. 156–163
70. R.H. Swendsen, J.-S. Wang, Phys. Rev. Lett. 57, 2607–2609 (1986)
71. K. Kimura, K. Taki, in Proc. 13th IMACS World Cong. on Computation and Appl. Math. (IMACS ’91), ed. by R. Vichnevetsky, J.J.H. Miller, vol. 2, pp. 827–828
72. D.D. Frantz, D.L. Freeman, J.D. Doll, J. Chem. Phys. 93, 2769–2784 (1990)
73. M.C. Tesi, E.J.J. van Rensburg, E. Orlandini, S.G. Whittington, J. Stat. Phys. 82, 155–181 (1996)
74. E. Marinari, G. Parisi, J.J. Ruiz-Lorenzo, in Spin Glasses and Random Fields, ed. by A.P. Young (World Scientific, Singapore, 1998), pp. 59–98
75. Y. Iba, Int. J. Mod. Phys. C 12, 623–656 (2001)
76. U.H.E. Hansmann, Chem. Phys. Lett. 281, 140–150 (1997)
77. Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314, 141–151 (1999)
78. A. Irback, E. Sandelin, J. Chem. Phys. 110, 12256–12262 (1999)
79. M.G. Wu, M.W. Deem, Mol. Phys. 97, 559–580 (1999)
80. Y. Sugita, A. Kitao, Y. Okamoto, J. Chem. Phys. 113, 6042–6051 (2000)
81. C.J. Woods, J.W. Essex, M.A. King, J. Phys. Chem. B 107, 13703–13710 (2003)
82. Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 329, 261–270 (2000)
83. A. Mitsutake, Y. Okamoto, Chem. Phys. Lett. 332, 131–138 (2000)
84. D. Gront, A. Kolinski, J. Skolnick, J. Chem. Phys. 113, 5065–5071 (2000)
85. G.M. Verkhivker, P.A. Rejto, D. Bouzida, S. Arthurs, A.B. Colson, S.T. Freer, D.K. Gehlhaar, V. Larson, B.A. Luty, T. Marrone, P.W. Rose, Chem. Phys. Lett. 337, 181–189 (2001)


86. H. Fukunishi, O. Watanabe, S. Takada, J. Chem. Phys. 116, 9058–9067 (2002)
87. A. Mitsutake, Y. Sugita, Y. Okamoto, J. Chem. Phys. 118, 6664–6675 (2003)
88. A. Mitsutake, Y. Sugita, Y. Okamoto, J. Chem. Phys. 118, 6676–6688 (2003)
89. A. Sikorski, P. Romiszowski, Biopolymers 69, 391–398 (2003)
90. C.Y. Lin, C.K. Hu, U.H.E. Hansmann, Proteins 52, 436–445 (2003)
91. G. La Penna, A. Mitsutake, M. Masuya, Y. Okamoto, Chem. Phys. Lett. 380, 609–619 (2003)
92. M. Falcioni, M.W. Deem, J. Chem. Phys. 110, 1754–1766 (1999)
93. Q. Yan, J.J. de Pablo, J. Chem. Phys. 111, 9509–9516 (1999)
94. T. Nishikawa, H. Ohtsuka, Y. Sugita, M. Mikami, Y. Okamoto, Prog. Theor. Phys. (Suppl.) 138, 270–271 (2000)
95. D.A. Kofke, J. Chem. Phys. 117, 6911–6914 (2002)
96. T. Okabe, M. Kawata, Y. Okamoto, M. Mikami, Chem. Phys. Lett. 335, 435–439 (2001)
97. Y. Ishikawa, Y. Sugita, T. Nishikawa, Y. Okamoto, Chem. Phys. Lett. 333, 199–206 (2001)
98. A.E. Garcia, K.Y. Sanbonmatsu, Proteins 42, 345–354 (2001)
99. R.H. Zhou, B.J. Berne, R. Germain, Proc. Natl. Acad. Sci. U.S.A. 98, 14931–14936 (2001)
100. A.E. Garcia, K.Y. Sanbonmatsu, Proc. Natl. Acad. Sci. U.S.A. 99, 2782–2787 (2002)
101. R.H. Zhou, B.J. Berne, Proc. Natl. Acad. Sci. U.S.A. 99, 12777–12782 (2002)
102. M. Feig, A.D. MacKerell, C.L. Brooks III, J. Phys. Chem. B 107, 2831–2836 (2003)
103. Y.M. Rhee, V.S. Pande, Biophys. J. 84, 775–786 (2003)
104. D. Paschek, A.E. Garcia, Phys. Rev. Lett. 93, 238105 (2004)
105. D. Paschek, S. Gnanakaran, A.E. Garcia, Proc. Natl. Acad. Sci. U.S.A. 102, 6765–6770 (2005)
106. J.W. Pitera, W. Swope, Proc. Natl. Acad. Sci. U.S.A. 100, 7587–7592 (2003)
107. M.K. Fenwick, F.A. Escobedo, Biopolymers 68, 160–177 (2003)
108. A. Mitsutake, Y. Okamoto, J. Chem. Phys. 121, 2491–2504 (2004)
109. M.K. Fenwick, F.A. Escobedo, J. Chem. Phys. 119, 11998–12010 (2003)
110. K. Murata, Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 385, 1–7 (2004)
111. A.K. Felts, Y. Harano, E. Gallicchio, R.M. Levy, Proteins 56, 310 (2004)
112. A. Mitsutake, M. Kinoshita, Y. Okamoto, F. Hirata, J. Phys. Chem. B 108, 19002–19012 (2004)
113. A. Baumketner, J.E. Shea, Biophys. J. 89, 1493 (2005)
114. T. Yoda, Y. Sugita, Y. Okamoto, Proteins 66, 846–859 (2007)
115. A.E. Roitberg, A. Okur, C. Simmerling, J. Phys. Chem. B 111, 2415–2418 (2007)
116. T.W. Whitfield, L. Bu, J.E. Straub, Physica A 305, 157–171 (2002)
117. W. Kwak, U.H.E. Hansmann, Phys. Rev. Lett. 95, 138102 (2005)
118. N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, E. Teller, J. Chem. Phys. 21, 1087–1092 (1953)
119. S. Nose, Mol. Phys. 52, 255–268 (1984)
120. S. Nose, J. Chem. Phys. 81, 511–519 (1984)
121. H.C. Andersen, J. Chem. Phys. 72, 2384 (1980)
122. I.R. McDonald, Mol. Phys. 23, 41 (1972)
123. H. Okumura, Y. Okamoto, Phys. Rev. E 70, 026702 (2004)


124. H. Okumura, Y. Okamoto, J. Phys. Soc. Jpn. 73, 3304–3311 (2004)
125. H. Okumura, Y. Okamoto, J. Comput. Chem. 27, 379–395 (2006)
126. B.A. Berg, C. Muguruma, Y. Okamoto, Phys. Rev. B 75, 092202 (2007)
127. C. Muguruma, Y. Okamoto, B.A. Berg, Phys. Rev. E 78, 041113 (2008)
128. L. Pauling, J. Am. Chem. Soc. 57, 2680 (1935)
129. W.F. Giauque, M. Ashley, Phys. Rev. 43, 81 (1933)
130. National Institute of Standards and Technology (NIST) at http://physics.nist.gov/cuu/
131. W.F. Giauque, J.W. Stout, J. Am. Chem. Soc. 58, 1144 (1936)
132. L. Onsager, M. Dupuis, Re. Scu. Int. Fis. ‘Enrico Fermi’ 10, 294 (1960)
133. J.F. Nagle, J. Math. Phys. 7, 1484 (1966)
134. O. Haida, T. Matsuo, H. Suga, S. Seki, J. Chem. Thermodynamics 6, 815 (1974)
135. B.A. Berg, 2005 (unpublished)
136. Y. Sugita, Y. Okamoto, Biophys. J. 88, 3180–3190 (2005)
137. K.R. Shoemaker, P.S. Kim, E.J. York, J.M. Stewart, R.L. Baldwin, Nature 326, 563–567 (1987)
138. K.R. Shoemaker, R. Fairman, D.A. Schultz, A.D. Robertson, E.J. York, J.M. Stewart, R.L. Baldwin, Biopolymers 29, 1–11 (1990)
139. P.J. Kraulis, J. Appl. Crystallogr. 24, 946–950 (1991)
140. E.A. Merritt, D.J. Bacon, Methods Enzymol. 277, 505–524 (1997)
141. J. Wang, P. Cieplak, P.A. Kollman, J. Comput. Chem. 21, 1049–1074 (2000)
142. T. Yoda, Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 386, 460–467 (2004)
143. W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, M.L. Klein, J. Chem. Phys. 79, 926–935 (1983)
144. H. Okumura, Y. Okamoto, Bull. Chem. Soc. Jpn. 80, 1114–1123 (2007)
145. H. Okumura, Y. Okamoto, J. Phys. Chem. B 112, 12038–12049 (2008)
146. P.A. Kollman, R. Dixon, W. Cornell, T. Fox, C. Chipot, A. Pohorille, in Computer Simulation of Biomolecular Systems, Vol. 3, ed. by A. Wilkinson, P. Weiner, W.F. van Gunsteren (Kluwer, Dordrecht, 1997), pp. 83–96
147. H. Okumura, S.G. Itoh, Y. Okamoto, J. Chem. Phys. 126, 084103 (2007)
148. S.D. Bond, B.J. Leimkuhler, B.B. Laird, J. Comput. Phys. 151, 114 (1999)
149. S. Nose, J. Phys. Soc. Jpn. 70, 75 (2001)
150. T.F. Miller, M. Eleftheriou, P. Pattnaik, A. Ndirango, D. Newns, G.J. Martyna, J. Chem. Phys. 116, 8649 (2002)
151. B.A. Berg, Introduction to Monte Carlo Simulations and Their Statistical Analysis (World Scientific, Singapore, 2004)
152. T. Takekiyo, T. Imai, M. Kato, Y. Taniguchi, Biopolymers 73, 283 (2004)
153. H. Kokubo, Y. Okamoto, Chem. Phys. Lett. 383, 397–402 (2004)
154. H. Kokubo, Y. Okamoto, J. Chem. Phys. 120, 10837–10847 (2004)
155. H. Kokubo, Y. Okamoto, J. Phys. Soc. Jpn. 73, 2571–2585 (2004)
156. H. Kokubo, Y. Okamoto, Chem. Phys. Lett. 392, 168–175 (2004)
157. A. Krogh, B. Larsson, G.v. Heijne, E.L.L. Sonnhammer, J. Mol. Biol. 305, 567 (2001)
158. D.T. Jones, W.R. Taylor, J.M. Thornton, Biochemistry 33, 3038 (1994)
159. T. Hirokawa, S. Boon-Chieng, S. Mitaku, Bioinformatics 14, 378 (1998)
160. G.E. Tusnady, I. Simon, J. Mol. Biol. 283, 489 (1998)
161. W.E. Reiher III, Theoretical Studies of Hydrogen Bonding, Ph.D. Thesis, Department of Chemistry, Harvard University, Cambridge, MA, USA, 1985


162. E. Neria, S. Fischer, M. Karplus, J. Chem. Phys. 105, 1902 (1996)
163. R.A. Sayle, E.J. Milner-White, Trends. Biochem. Sci. 20, 374 (1995)
164. A. Mitsutake, Y. Sugita, T. Yoda, T. Nishikawa, Y. Okamoto, in preparation
165. G.A. Kaminski, R.A. Friesner, J. Tirado-Rives, W.L. Jorgensen, J. Phys. Chem. B 105, 474 (2001)


5

Protein Folding and Binding: Effective Potentials, Replica Exchange Simulations, and Network Models

A.K. Felts, M. Andrec, E. Gallicchio, and R.M. Levy

Abstract. Advances in computational biophysics depend on the development of accurate effective potentials and powerful sampling methods to traverse rugged energy landscapes. We have developed an approach that makes use of the combined power of replica exchange simulations and a network model for kinetics. We carry out replica exchange simulations to generate a very large set of states using an all-atom effective potential function and construct a kinetic model for the folding, using an ansatz that allows kinetic transitions between states based on structural similarity. We are also using replica exchange simulations to study the binding of ligands to proteins such as cytochrome P450. A better understanding of the relationship between the physical kinetics of the systems being studied and their “kinetics” in the replica exchange ensemble is needed to use this new technology to maximum advantage. To illustrate some of the challenges, we will discuss the results of using a network model to “simulate” replica exchange simulations of protein folding.

5.1 Introduction

Molecular simulations of protein structural changes and ligand binding are built upon two foundations: (1) the design of effective potentials, which are matched with the requirements of accuracy and speed appropriate to particular modeling problems, and (2) the design of algorithms to sample the effective potentials in highly efficient ways so as to facilitate the convergence of the simulations in a thermodynamic sense. Developing algorithms to satisfy the competing goals of accuracy and speed is at the heart of the problem.

The protein folding problem is of fundamental importance in modern structural biology. Recent advances in experimental techniques have helped to elucidate thermodynamic and kinetic mechanisms that underlie different stages of the folding process [1–6]. Computer simulations performed at various levels of molecular detail have played a central role in the interpretation of experimental studies. Molecular simulations using models based on fully atomic representations are becoming more accurate and more practical and are increasingly employed to simulate protein folding and predict protein structures [7–15]. Because of the large number of degrees of freedom, however, these simulations require extensive computer resources to obtain meaningful results, especially with explicit solvent models [16]. For this reason, many recent computational studies have been carried out with implicit solvent models [15, 17–20]. The question of how well implicit solvent effective potentials, when combined with detailed atomic protein models, can predict thermodynamic as well as kinetic aspects of protein folding is under active investigation [9, 10, 12, 13, 15, 19–27].

Numerous stringent requirements make the development of practically useful solvation free energy models for biological applications very challenging. To be applicable to ligand binding affinity prediction, the model should be accurate over a wide range of molecular sizes and over a wide range of functional groups. To study protein folding, allosteric reactions, and flexible receptor and ligand docking, the model must be able to describe hydration free energy differences between different molecules as well as between different conformations of the same molecule. Finally, the model needs to be computationally efficient and should be expressed in analytical form with analytical gradients for seamless incorporation in a molecular mechanics code to perform conformational sampling and energy optimization calculations. Although models with some of these characteristics exist [9, 14, 22, 28–33], only a few meet all the above requirements.

In modern implicit solvent models [31], the solvation free energy is typically decomposed into a nonpolar component and an electrostatic component. Dielectric continuum methods account for the electrostatic component by treating the water solvent as a uniform high-dielectric continuum [34]. Methods based on the numerical solution of the Poisson–Boltzmann (PB) equation [35, 36] provide a virtually exact representation of the response of the solvent within the dielectric continuum approximation. Their computational complexity is, however, still comparable to that of explicit solvent models, and they are not easily integrated in molecular dynamics simulation programs. Recent advances extending dielectric continuum approaches have focused on the development of Generalized Born (GB) models [22, 37], which have been shown to reproduce with good accuracy PB [33, 38, 39] and explicit solvent [40] results at a fraction of the computational expense. The development of computationally efficient analytical and differentiable GB methods with gradients based on pairwise descreening schemes [41, 42] has made possible the integration of GB models in molecular dynamics packages for biological simulations [29, 43–45].

Although nonpolar hydration forces dominate whenever hydrophobic interactions [46] are important, accurate models for the nonpolar component of the hydration free energy are not generally available. The structure and properties of proteins in water are strongly influenced by hydrophobic interactions [47–50]. Hydrophobic interactions also play a key role in the mechanism of ligand binding to proteins [30, 51–53]. Empirical surface area models [54] for the nonpolar component of the solvation free energy are widely used [28, 37, 55–62]. Surface area models are useful as a first approximation; however, deficiencies are observed [57, 63, 64] that are particularly severe in the context of high resolution modeling and force field transferability [65].

We developed the Analytical Generalized Born plus Non-Polar (AGBNP) model, an implicit solvent model based on the Generalized Born model [37–40, 44, 66] for the electrostatic component and on the decomposition of the nonpolar hydration free energy into a cavity component based on the solute surface area and a solute–solvent van der Waals interaction free energy component modeled using an estimator based on the Born radius of each atom.

Recent advances in parallel sampling techniques [67–69] and the widespread availability of large numbers of processors have now made possible the calculation of the full potential of mean force of small-to-medium sized peptides in solution [15, 19, 69–71]. One class of methods for studying equilibrium properties of quasi-ergodic systems that has received a great deal of recent attention is based on the Replica Exchange (RE) [72, 73] algorithm (also known as parallel tempering). RE methods, particularly Replica Exchange Molecular Dynamics (REMD) [67], have become very popular for the study of protein biophysics, including peptide and protein folding [15, 74, 75], aggregation [76–78], and protein–ligand interactions [79, 80]. Previous studies of protein folding appear to show a significant increase in the number of reversible folding events in REMD simulations vs. conventional MD [81, 82].

The effectiveness of RE methods is determined by the number of temperatures (replicas) that are simulated, their range and spacing, the rate at which exchanges are attempted, and the kinetics of the system at each temperature. While the determination of “optimal” Metropolis acceptance rates and temperature spacings has been the subject of various studies [73, 83–88], the role played by the intrinsic temperature-dependent conformational kinetics, which is central to understanding RE, has not received much attention. Recent work [88–91] recognizes the exploration of conformational space and the crossing of barriers between conformational states as the key limiting factor for the RE algorithm. Molecular kinetics can have a strong effect on RE beyond the entropic effects that have been discussed [89, 91], particularly if the kinetics does not have a simple temperature dependence. It is known from experimental and computational studies that the folding rates of proteins and peptides can exhibit anti-Arrhenius behavior, where the folding rate decreases with increasing temperature [92–97]. Different models have been proposed to explain the physical origin of this effect [98, 99].

We have investigated various systems to illustrate the principles of having a sound effective potential and a powerful sampling technique. Predicting the conformations of peptides that form secondary structure in solution provides a test of the effectiveness of OPLS-AA/AGBNP and REMD [75]. We demonstrate how we can determine the kinetics of folding of one of those peptides, the G-peptide, based on the conformations generated during a replica exchange simulation using network models [100]. We also successfully predict with OPLS-AA/AGBNP protein loop conformations that are themselves “peptides” tethered to a protein frame [101]. We further demonstrate our ability to explore the thermodynamics of binding using REMD with the OPLS-AA/AGBNP potential for the system of N-palmitoylglycine complexed to cytochrome P450 BM-3 [80]. Finally, the behavior of RE methods is demonstrated with simple models that capture the kinetics of RE [102, 103].

5.2 Methods

5.2.1 The OPLS-AA/AGBNP Effective Potential

The total free energy of folding for a protein in solution can be represented approximately as the sum of two terms:

$$\Delta G_{\mathrm{tot}} \simeq \Delta G_{\mathrm{int}} + \Delta G_{\mathrm{solv}}, \tag{5.1}$$

where $\Delta G_{\mathrm{int}}$ is the internal free energy of folding corresponding to the intramolecular degrees of freedom of the protein and $\Delta G_{\mathrm{solv}}$ is the difference of solvation free energy between the folded and unfolded states. The internal entropy change can be estimated from MD simulations; however, calculating the internal entropy is quite expensive [55]. It has nevertheless been found that the internal entropy changes between conformations are all roughly the same [55, 104]. Since in this work different conformations of a given molecule are compared, it is not necessary to include $\Delta S_{\mathrm{int}}$ in the total free energy change [12]; therefore, an effective free energy function

$$\Delta G_{\mathrm{eff}} = \Delta U_{\mathrm{int}} + \Delta G_{\mathrm{solv}} \tag{5.2}$$

can be used in lieu of $\Delta G_{\mathrm{tot}}$. The OPLS all-atom (OPLS-AA) force field [105, 106] is used to model $\Delta U_{\mathrm{int}}$, the internal energy for all atomic interactions and intramolecular degrees of freedom. The solvation free energy, $\Delta G_{\mathrm{solv}}$, of each structure is estimated using the analytical generalized Born model [27] with nonpolar free energy estimator (AGBNP, as described later) as implemented in the IMPACT modeling program [107].

In the original development of the OPLS-AA force field, the partial charges and van der Waals parameters were adjusted to reproduce experimental heats of vaporization and densities for a series of pure liquids [105, 108–112]. These parameters were further tested by comparison with experimental solvation energies, using explicit-solvent simulations. Additional comparisons were made in some cases to hydrogen-bond dimer-interaction energies obtained from quantum-chemical calculations. These comparisons were used to detect large discrepancies that, when present, called for a reinvestigation of the nonbonded parameters. The OPLS-AA torsional parameters were fit to reproduce gas-phase conformational energies obtained from quantum-chemical calculations [106], and stretching and bending parameters were adapted from the CHARMM22 or AMBER force fields.


The generalized Born model is given by the following equation [37],

$$\Delta G_{\mathrm{GB}} = -\frac{1}{2}\left(\frac{1}{\varepsilon_{\mathrm{in}}} - \frac{1}{\varepsilon_{w}}\right)\sum_{ij}\frac{q_i q_j}{f_{ij}(r_{ij})}, \tag{5.3}$$

where $q_i$ is the charge of atom $i$ and $r_{ij}$ is the distance between atoms $i$ and $j$. It gives the electrostatic component of the free energy of transfer of a molecule with interior dielectric $\varepsilon_{\mathrm{in}}$ from vacuum to a continuum medium of dielectric constant $\varepsilon_{w}$, by interpolating between the two extreme cases that can be solved analytically: the one in which the atoms are infinitely separated and the other in which the atoms are completely overlapped. The interpolation function $f_{ij}$ in (5.3) is defined as

$$f_{ij} = \left[r_{ij}^{2} + B_i B_j \exp\left(-r_{ij}^{2}/4B_iB_j\right)\right]^{1/2}, \tag{5.4}$$

where $B_i$ is the Born radius of atom $i$, defined as the effective radius that reproduces through the Born equation

$$\Delta G^{i}_{\mathrm{single}} = -\frac{1}{2}\left(\frac{1}{\varepsilon_{\mathrm{in}}} - \frac{1}{\varepsilon_{w}}\right)\frac{q_i^{2}}{B_i}, \tag{5.5}$$

the electrostatic free energy of the molecule when only the charge of atom $i$ is present in the molecular cavity.
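As an illustration of how (5.3)–(5.5) fit together, the following is a minimal sketch in Python that evaluates the generalized Born energy for a handful of point charges with prescribed Born radii. It is not the IMPACT/AGB implementation: the function names, the default dielectric constants (1 and 80), and the Coulomb conversion constant used to obtain kcal/mol from charges in units of e and distances in Å are assumptions made for the example. The Born radii are taken as inputs; computing them by pairwise descreening is not attempted here.

```python
import math

# Assumed unit-conversion constant (kcal mol^-1 Å e^-2); not given in the text.
COULOMB_KCAL = 332.06371


def f_gb(r_ij, b_i, b_j):
    """Interpolation function of Eq. (5.4)."""
    bb = b_i * b_j
    return math.sqrt(r_ij * r_ij + bb * math.exp(-r_ij * r_ij / (4.0 * bb)))


def gb_energy(charges, born_radii, coords, eps_in=1.0, eps_w=80.0):
    """Eq. (5.3): double sum over all pairs (i, j), including i == j.

    The i == j terms have r_ij = 0, so f_ii = B_i and each reduces to the
    single-atom Born expression of Eq. (5.5).
    """
    prefactor = -0.5 * (1.0 / eps_in - 1.0 / eps_w) * COULOMB_KCAL
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return prefactor * total


if __name__ == "__main__":
    # A hypothetical ion pair 4 Å apart, each with a 2 Å Born radius.
    q = [1.0, -1.0]
    b = [2.0, 2.0]
    xyz = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(f"GB energy: {gb_energy(q, b, xyz):.2f} kcal/mol")
```

The two limits mentioned in the text can be checked directly: `f_gb(0, B, B)` returns `B` (fully overlapped atoms), and for separations much larger than the Born radii `f_gb` approaches the interatomic distance.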

The analytical generalized Born (AGB) implicit solvent model is based on a novel pairwise descreening implementation [27] of the generalized Born model [29]. The combination of AGB with a recently proposed nonpolar hydration free energy estimator, described later, is referred to as AGBNP [27]. AGB employs a parameter-free and conformation-dependent analytical scheme to obtain the pairwise descreening scaling coefficients used in the computation of the Born radii that enter the generalized Born equation (5.3). The agreement between the AGB Born radii and exact numerical calculations was found to be excellent [27]. The AGBNP nonpolar model consists of an estimator for the solute–solvent van der Waals interaction energy in addition to an analytical surface area component corresponding to the work of cavity formation [27]. Because AGBNP is fully analytical with first derivatives, it is well suited for energy minimization as well as for MD sampling. A detailed description of the AGBNP model and its implementation is provided in [27].

The nonpolar solvation free energy is given by the sum of two terms: the free energy to form the cavity in solvent filled by the solute and the dispersion attraction between solute and solvent [65, 113]. The nonpolar free energy is written as [27]

$$\Delta G_{\mathrm{np}} = \sum_{i}\left(\gamma_i A_i + \Delta G^{(i)}_{\mathrm{vdW}}\right), \tag{5.6}$$

where the first term is the cavity term, $\gamma_i$ is the surface tension proportionality constant for atom $i$, and $A_i$ is the solvent exposed surface area of atom $i$. The second term is the dispersion interaction term, which is given by [27]

$$\Delta G^{(i)}_{\mathrm{vdW}} = \alpha_i\,\frac{-16\pi\rho_w\,\varepsilon_{i,w}\,\sigma_{i,w}^{6}}{3\,(B_i + R_w)^{3}}, \tag{5.7}$$

where $\alpha_i$ is an adjustable solute–solvent van der Waals dispersion parameter for atom $i$. The parameter $\rho_w$ is the number density of water at standard conditions (0.033428 Å⁻³). $\varepsilon_{i,w}$ and $\sigma_{i,w}$ are the pairwise Lennard-Jones (LJ) well-depth and diameter parameters for atom $i$ and the TIP4P water oxygen as given by the OPLS-AA force field [105, 106]. ($\varepsilon_{i,w} = \sqrt{\varepsilon_i \varepsilon_w}$, where $\varepsilon_i$ is the LJ well-depth for atom $i$ and $\varepsilon_w$ is the corresponding well-depth for the TIP4P water oxygen. The $\varepsilon$ for water hydrogens is set to zero. $\sigma_{i,w}$ is defined in a similar manner.) $R_w$ is the radius of a water molecule (1.4 Å). By not incorporating the Lennard-Jones parameters into the dispersion parameter $\alpha_i$, atoms with different though similar $\varepsilon_i$ and $\sigma_i$ values are assigned the same $\alpha$ so as to minimize the number of adjustable parameters. $B_i$ is the Born radius of atom $i$. The form of (5.7) for the solute–solvent van der Waals interaction energy component has been derived on the basis of simple physical arguments [27].
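The nonpolar estimator of (5.6) and (5.7) is simple enough to write down directly. The sketch below is illustrative only: the `Atom` container, the function names, and the numerical inputs in the usage example are hypothetical, while the water number density and water radius are the values quoted in the text. Surface areas and Born radii are assumed to be supplied by some other routine.

```python
import math
from dataclasses import dataclass

RHO_W = 0.033428  # number density of water, Å^-3 (value quoted in the text)
R_W = 1.4         # radius of a water molecule, Å (value quoted in the text)


@dataclass
class Atom:
    """Per-atom inputs; the field names are hypothetical."""
    gamma: float        # surface tension constant, kcal mol^-1 Å^-2
    area: float         # solvent exposed surface area, Å^2
    alpha: float        # dimensionless dispersion parameter
    eps_iw: float       # LJ well depth with water oxygen, kcal mol^-1
    sigma_iw: float     # LJ diameter with water oxygen, Å
    born_radius: float  # Born radius B_i, Å


def lj_pair(eps_i, sigma_i, eps_w, sigma_w):
    """Geometric-mean combination of atom and water-oxygen LJ parameters."""
    return math.sqrt(eps_i * eps_w), math.sqrt(sigma_i * sigma_w)


def vdw_term(alpha, eps_iw, sigma_iw, born_radius):
    """Solute-solvent dispersion estimator of Eq. (5.7)."""
    numerator = -16.0 * math.pi * RHO_W * eps_iw * sigma_iw ** 6
    return alpha * numerator / (3.0 * (born_radius + R_W) ** 3)


def nonpolar_energy(atoms):
    """Eq. (5.6): cavity term plus dispersion term, summed over atoms."""
    return sum(
        a.gamma * a.area + vdw_term(a.alpha, a.eps_iw, a.sigma_iw, a.born_radius)
        for a in atoms
    )


if __name__ == "__main__":
    # Hypothetical two-atom solute, for illustration only.
    atoms = [
        Atom(gamma=0.015, area=30.0, alpha=1.0, eps_iw=0.10, sigma_iw=3.3, born_radius=1.8),
        Atom(gamma=0.015, area=12.0, alpha=1.0, eps_iw=0.12, sigma_iw=3.2, born_radius=2.5),
    ]
    print(f"nonpolar free energy: {nonpolar_energy(atoms):.3f} kcal/mol")
```

Setting every `alpha` to zero switches off the dispersion term, leaving a pure surface-area model.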

We use two sets of parameterizations of α and γ to test the full nonpolar function described earlier relative to a simpler nonpolar function. In past implementations [14], the total nonpolar solvation free energy is given by a term proportional to the solvent-accessible surface area; in terms of (5.6) and (5.7), this corresponds to setting all values of αi to zero and setting γi for all atoms to 0.015 kcal mol⁻¹ Å⁻². This implicit solvent model with the less-detailed nonpolar function is referred to as “AGB-γ.” When we use the full nonpolar function, including the dispersion term, with the parameters set forth in the work of Gallicchio and Levy [27], the implicit solvent model is referred to as “AGBNP.”

A third parameterization aimed at implementing a correction for salt bridge interactions (which are generally overestimated by generalized Born solvent models) [75, 114] is also investigated. To correct for the overstabilization of salt bridges by the generalized Born model, we used modified radii and γi for carboxylate oxygens [101]. The implicit solvent model that has additional descreening of ion pairing is referred to as “AGBNP+.”

5.2.2 Replica Exchange Molecular Dynamics

The MD replica exchange canonical sampling method (REMD) has been implemented in the molecular simulation package IMPACT [107] following the approach proposed by Sugita and Okamoto [67]. In this method, a series of structures (the replicas) are simulated in parallel using MD at different temperatures. The temperatures, $T_m$ and $T_n$, of two replicas, $i$ and $j$, respectively, are exchanged with the following Metropolis transition probability [67]:

$$W(\{T_m, T_n\} \to \{T_n, T_m\}) = \begin{cases} 1 & \text{for } \Delta \le 0, \\ \exp(-\Delta) & \text{for } \Delta > 0, \end{cases} \tag{5.8}$$

where

$$\Delta \equiv (\beta_n - \beta_m)(E_i - E_j), \tag{5.9}$$

$\beta_m$ is $1/kT_m$ and $E_i$ is the current potential energy of replica $i$ (and similarly for $\beta_n$ and $E_j$). After the exchange, the velocities of replicas $i$ and $j$ are rescaled to the new temperatures. In our simulations, several replicas are run in parallel over a particular temperature range.
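The exchange test of (5.8) and (5.9) can be sketched in a few lines of Python. This is an illustration, not the IMPACT code: the function names are invented for the example, the Boltzmann constant is given in kcal mol⁻¹ K⁻¹ on the assumption that energies are in kcal/mol, and the velocity rescaling uses the factor sqrt(T_new/T_old) of the Sugita–Okamoto scheme.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1 (assumed energy units)


def exchange_delta(t_m, t_n, e_i, e_j):
    """Eq. (5.9): replica i is at temperature T_m, replica j at T_n."""
    beta_m = 1.0 / (K_B * t_m)
    beta_n = 1.0 / (K_B * t_n)
    return (beta_n - beta_m) * (e_i - e_j)


def acceptance_probability(delta):
    """Metropolis transition probability of Eq. (5.8)."""
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchange(t_m, t_n, e_i, e_j, rng=random):
    """Return True if the two replicas should swap temperatures."""
    delta = exchange_delta(t_m, t_n, e_i, e_j)
    return rng.random() < acceptance_probability(delta)


def rescale_velocities(velocities, t_old, t_new):
    """Scale velocity vectors by sqrt(T_new / T_old) after an accepted swap."""
    scale = math.sqrt(t_new / t_old)
    return [[scale * v for v in vec] for vec in velocities]


if __name__ == "__main__":
    # Hypothetical energies (kcal/mol) for two neighbouring replicas.
    delta = exchange_delta(300.0, 320.0, -105.0, -100.0)
    print(f"delta = {delta:.3f}, P(accept) = {acceptance_probability(delta):.3f}")
```

When the hotter replica happens to have the lower potential energy, Δ is negative and the swap is always accepted; otherwise acceptance falls off exponentially with the energy gap and with the difference in inverse temperature.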

5.2.3 The Network Model of Protein Folding

During a REMD simulation of, for instance, the G-peptide (the C-terminal β-hairpin of the B1 domain of protein G), a series of conformations (“states”) are generated at each temperature, of which there are 20. The REMD simulation of the G-peptide resulted in 40,000 conformational snapshots (“states”) at each of the 20 temperatures, for a total of 800,000 states. In our kinetic network model, these REMD snapshots can be visualized as nodes in a network. The edges connecting these nodes represent allowed conformational transitions, and the allowed conformational transitions are determined by the structural similarity of the two states involved [115–117]. This network structure can be viewed as an approximate representation of effects caused by frictional interactions with the environment [115]. For each state, 42 Cα–Cα distances were calculated, and structural similarity was defined as the Euclidean distance between points in this distance space. The structural similarities for all sequential pairs of MD snapshots along a given REMD walker having the same temperature were tabulated. Any two states with the same REMD temperature were joined by an edge if their structural similarity was less than or equal to a cutoff value. No connections were allowed between conformations not belonging to adjacent REMD temperatures. The resulting kinetic network has 800,000 nodes and 7.374 × 10⁹ edges.
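The edge rule described above can be illustrated with a toy sketch. Each state is represented by its temperature index and a vector of Cα–Cα distances; two states are joined if their Euclidean distance in that space is within a cutoff. The sketch assumes one reading of the text, namely that edges are permitted between states at the same or at adjacent temperatures and forbidden otherwise. The function name and data layout are hypothetical, and the all-pairs loop is only practical for small examples, not for 800,000 states.

```python
import math
from itertools import combinations


def build_edges(states, cutoff):
    """Return undirected edges (a, b) between structurally similar states.

    states: list of (temperature_index, distance_vector) tuples, where the
            vector holds the Cα-Cα distances of one snapshot.
    cutoff: maximum Euclidean distance in distance space for an edge.
    Assumption: states may be linked only if their temperature indices
    differ by at most one (same or adjacent REMD temperature).
    """
    edges = []
    for (a, (t_a, v_a)), (b, (t_b, v_b)) in combinations(enumerate(states), 2):
        if abs(t_a - t_b) > 1:
            continue  # non-adjacent temperatures are never connected
        if math.dist(v_a, v_b) <= cutoff:
            edges.append((a, b))
    return edges


if __name__ == "__main__":
    # Five toy states with two "distances" each instead of 42.
    toy_states = [
        (0, [0.0, 0.0]),
        (0, [0.5, 0.0]),
        (1, [0.6, 0.0]),
        (2, [0.5, 0.0]),
        (0, [5.0, 5.0]),
    ]
    print(build_edges(toy_states, cutoff=1.0))
```

In the toy input, states 0 and 3 are structurally close but sit two temperature levels apart, so they are not connected; state 4 is isolated because it is far from every other state in distance space.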

As in previous works [115, 117–121], we simulate the kinetics on our graph as a jump Markov process with discrete states using the Gillespie algorithm [122], where each (directed) edge is assigned a microscopic rate constant. Such simulations allow us to more directly characterize the sequence of events of folding. We make the equilibrium probability of being in any given state equal to that of being in any other at the same replica exchange temperature. Such an equilibrium can be arranged by making the microscopic rate constant for each transition equal to the rate constant for the reverse process. We chose the relative equilibrium populations for states from different temperatures such that the probability of being in states extracted from different replica exchange temperatures is peaked near a “reference” or “simulation” temperature, which is a parameter of the kinetic model. This model allows a given path to sample states having instantaneous temperatures above or below the reference temperature T0 in a physically realistic manner [100].
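A jump Markov process on a graph can be simulated with the Gillespie algorithm in a few lines. The sketch below is a generic illustration under stated assumptions, not the authors' code: the graph is a small hypothetical adjacency dictionary, and every edge carries the same rate in both directions, which is the condition the text uses to make all states at one temperature equally populated at equilibrium. The temperature-dependent weighting between replica exchange temperatures is not modeled.

```python
import random


def gillespie(neighbors, rate, start, t_end, rng):
    """Simulate a continuous-time jump process on a graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    rate:      function (a, b) -> microscopic rate constant for a -> b.
    Returns the trajectory as a list of (time, node) pairs.
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        nbrs = neighbors[state]
        rates = [rate(state, n) for n in nbrs]
        total = sum(rates)
        if total == 0.0:
            break  # absorbing or isolated node
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            break
        # Pick the next node with probability proportional to its rate.
        state = rng.choices(nbrs, weights=rates)[0]
        path.append((t, state))
    return path


if __name__ == "__main__":
    # Hypothetical three-node chain with symmetric unit rates.
    graph = {0: [1], 1: [0, 2], 2: [1]}
    traj = gillespie(graph, lambda a, b: 1.0, start=0, t_end=20.0,
                     rng=random.Random(0))
    print(f"{len(traj) - 1} jumps, final node {traj[-1][1]}")
```

Because `rate(a, b)` equals `rate(b, a)` here, detailed balance holds with a uniform stationary distribution over the connected nodes.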

5.2.4 Loop Prediction with Torsion Angle Sampling

The loop prediction algorithm implemented in the Protein Local Optimization Program (PLOP) is described in detail in [123]. During loop build-up, a series of filters of increasing complexity is applied to eliminate unreasonable conformations as early as possible, and clustering is performed to remove redundant conformations. For long loops (≥9 residues), we have adopted prediction schemes based on multiple executions of PLOP with different parameters [123, 124]. The initial predictions with the most favorable energy scores are subjected to a series of constrained refinement calculations with PLOP in which selected loop backbone atoms are not allowed to move or move only within a given range [123]. Further enhancements, such as allowing for more atomic overlaps and increasing the number of clusters in the K-means algorithm [125], have been incorporated into the loop sampling algorithms [101].

We have tested the loop prediction algorithms on two sets of protein loops of known structure (see [101]). The first set is composed of the 57 nine-residue loops that were originally compiled by Fiser et al. [126] and by Xiang et al. [127]. The second set, of 35 13-residue loops, is the same as the one investigated by Zhu et al. [124].

We classify a predicted loop as correct if its root mean square deviation (RMSD) with respect to the crystallographically determined native structure is within a cutoff (1.5 Å for nine-residue loops and 2.0 Å for 13-residue loops). Errors are classified as sampling errors if the predicted loop’s energy is higher than the native’s and as energy errors if the predicted energy is lower than the native’s. A minority of incorrect predictions were not classifiable as either energy or sampling errors. In the following, we label these cases as marginal errors. Marginal errors are effectively incorrect predictions due to subtle and not easily attributable energetic, entropic, and methodological causes [101].
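The classification scheme just described can be summarized as a small decision function. The RMSD cutoffs are those given in the text; everything else is an assumption made for illustration. In particular, the text does not say how marginal errors are identified, so the sketch uses a hypothetical energy tolerance `margin` within which the predicted and native energies are treated as indistinguishable, and it treats an RMSD exactly at the cutoff as correct.

```python
# RMSD cutoffs in Å, keyed by loop length (values from the text).
RMSD_CUTOFF = {9: 1.5, 13: 2.0}


def classify_prediction(rmsd, e_pred, e_native, n_residues, margin=0.0):
    """Classify one loop prediction.

    rmsd:       RMSD of the lowest-energy predicted loop from the native, Å.
    e_pred:     energy of the predicted loop.
    e_native:   energy of the native loop with the same energy function.
    margin:     hypothetical tolerance below which the two energies are
                considered indistinguishable (not specified in the text).
    """
    if rmsd <= RMSD_CUTOFF[n_residues]:
        return "correct"
    if e_pred > e_native + margin:
        # The native has lower energy but the search did not find it.
        return "sampling"
    if e_pred < e_native - margin:
        # The energy function prefers a non-native conformation.
        return "energy"
    return "marginal"


if __name__ == "__main__":
    print(classify_prediction(0.8, -10.0, -12.0, 9))
    print(classify_prediction(2.5, -10.0, -12.0, 9))
    print(classify_prediction(2.5, -15.0, -12.0, 9))
```

The distinction matters for diagnosis: a sampling error points to the search procedure, while an energy error points to the effective potential.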

5.3 Folding of Peptides

5.3.1 G-Peptide Folding

REMD simulations of the C-terminal β-hairpin (residues 41–56) of the B1 domain of protein G (G-peptide) were conducted with the OPLS-AA force field [105] and the AGBNP implicit solvent model [27]. Details of the simulations can be found in [75]. When using the surface-area-only model (AGB-γ) for the nonpolar interactions, the hydrophobic core (W43, Y45, F52, and V54) does not collapse to an appreciable extent; at 270 K, only 12.8% of the structures have a collapsed hydrophobic core (a conformation is said to have a collapsed hydrophobic core when the radius of gyration of the side chains of residues W43, Y45, F52, and V54 is less than 6 Å). When the full nonpolar function of AGBNP is used, the percentage of hydrophobic collapse increases to 37.8% with the default dielectric screening (AGBNP) and 94.1% with the increased dielectric screening of charged side chains (AGBNP+) [75].
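The collapse criterion used above, a side-chain radius of gyration below 6 Å, is easy to express in code. The following sketch assumes an unweighted (not mass-weighted) radius of gyration over the selected side-chain atoms, since the text does not specify the weighting; the function names and the toy coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a set of 3D points (Å)."""
    n = len(coords)
    centre = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_sq = sum(math.dist(c, centre) ** 2 for c in coords) / n
    return math.sqrt(mean_sq)


def is_collapsed(core_coords, cutoff=6.0):
    """True if the hydrophobic-core atoms satisfy the 6 Å criterion."""
    return radius_of_gyration(core_coords) < cutoff


def collapsed_fraction(snapshots, cutoff=6.0):
    """Percentage of snapshots with a collapsed hydrophobic core."""
    hits = sum(1 for coords in snapshots if is_collapsed(coords, cutoff))
    return 100.0 * hits / len(snapshots)


if __name__ == "__main__":
    # Two toy "snapshots" of four core atoms each: compact and extended.
    compact = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3)]
    extended = [(0, 0, 0), (12, 0, 0), (0, 12, 0), (0, 0, 12)]
    print(f"{collapsed_fraction([compact, extended]):.1f}% collapsed")
```

Applied to the snapshots at one temperature, `collapsed_fraction` would give the kind of percentage quoted in the text, subject to the weighting assumption noted above.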

The decreased degree of hydrophobic collapse with the default dielectric screening (AGBNP) as compared with additional dielectric screening (AGBNP+) is due to a salt bridge forming between the side chains of K50 and E56 that hinders the formation of the hydrophobic core. However, significantly more of the structures generated with the full AGBNP nonpolar function have a collapsed hydrophobic core as compared to those generated with AGB-γ. The full nonpolar model of the OPLS-AA/AGBNP potential favors the formation of the collapsed hydrophobic core of the peptide even in the presence of the disruptive salt bridge.

While previous replica exchange simulations of the C-terminal polypeptide from the B1 domain of protein G in explicit and implicit solvent have been carried out using the capped peptide [19, 20, 70, 71], the experiments have been performed on the uncapped form of the peptide [94, 128–130]. A salt bridge between the N- and C-termini can be formed in the uncapped polypeptide. The β-hairpin population of the uncapped peptide (26%) is significantly larger than the β-hairpin population of the capped peptide (10%), with the same solvation model (AGBNP) [75]. This is due to the stabilizing effects of the salt bridge between the N- and C-termini, which compensates for the disruptive interaction between the charged residues K50 and E56. The population of this disruptive salt bridge is reduced when increased dielectric screening of the charged side chains is applied with AGBNP+; the β-hairpin population is increased from 26% to 40%. The predicted β-hairpin population of the uncapped peptide generated with AGBNP+ agrees well with the experimental results of Blanco et al. (42% at 283 K) [128]. The degree of hydrophobic collapse (98%) agrees reasonably well with the experimental results reported by Muñoz et al., who observed around 80% hydrophobic collapse at 270 K [94].

5.3.2 Folding of Other Small Peptides

To demonstrate the accuracy of OPLS-AA/AGBNP+, we predicted the conformations of a series of small peptides that adopt either an α-helical conformation (CheY2-mu peptide [131], C-peptide [132], and the S-peptide-analog [133]), no secondary structure (the CheY2 peptide [131]), or a mix of β and α conformation (the FSD1 mini-protein [134]). We performed REMD simulations to sample the conformational space available to these peptides. The results are summarized in Table 5.1. We achieve reasonable accuracy for these peptides. It is also apparent that there is no bias towards forming α-helical conformations with OPLS-AA/AGBNP+, as is evident from the prediction of the coil conformation for the CheY2 peptide, which is similar in sequence to the α-helical CheY2-mu [131].

5.3.3 Loop Prediction

Loop prediction is a form of peptide folding: in this case, the peptide is tethered to a protein frame and feels an energy field generated by the frame. Loop prediction is a stringent test of the OPLS-AA/AGBNP energy function because during the search with PLOP to find the native conformation, many energetically competing conformations are also generated [101]. The results of the loop prediction tests are summarized in Table 5.2 for the combined standard and extended conformational sampling procedures [101]. All loop predictions summarized in Table 5.2 were performed in solution instead of in the presence of the crystallographically related molecules (crystal symmetry), as Jacobson et al. [123] and Zhu et al. [124, 135] did for their loop predictions with PLOP for the 9- and 13-residue loops, respectively. We viewed loop prediction as a step in homology modeling where the crystal environment is not known a priori; therefore, we predicted loops in solution rather than in the crystal environment. For the 57 nine-residue loops, loop prediction tests were conducted with OPLS-AA and the following implicit solvent models: distance-dependent dielectric, AGB-γ, AGBNP, and AGBNP+. Loop prediction tests for the 35 13-residue loops were conducted with AGBNP+. Table 5.2 reports the total number of errors and the number of energy, sampling, and marginal errors, and the mean and median RMSD of the predictions from the X-ray structure.

Table 5.1. Summary of the small peptides we have predicted with REMD simulations using OPLS-AA/AGBNP+

| Name | Sequence | Structure | % Content (Experimental) | % Content (RXMD) |
| G-peptide [128] | GEWTYDDATKTFTVTE | β | 42 | 40 |
| CheY2-mu [131] | EDAVEALRKLQAGGY | α | 39 | 45 |
| CheY2 [131] | EDGVDALNKLQAGGY | α | 2 | 2 |
| C-peptide [132] | KETAAAKFERQHM | α | 29 | 41 |
| S-pep-analog [133] | AETAAAKFLREHMDS | α | 45–63 | 55 |
| FSD1 [134] | QQYTAKIKGRTFRNEKELRDFIEKFKGR | ββα | >80 | 59 |

The simulations were carried out for up to 10 ns.

Table 5.2. Summary of the loop conformational predictions results with the combination of standard and enhanced sampling procedures

| | 9-Residue, ddd | 9-Residue, AGB-γ | 9-Residue, AGBNP | 9-Residue, AGBNP+ | 13-Residue, AGBNP+ |
| E | 19 | 6 | 4 | 2 | 2 |
| S | 4 | 4 | 4 | 5 | 5 |
| M | 3 | 1 | 0 | 1 | 1 |
| E+S+M | 26 | 11 | 8 | 8 | 8 |
| 〈RMSD〉 | 2.31 | 1.10 | 1.04 | 1.00 | 1.87 |
| median RMSD | 1.27 | 0.52 | 0.52 | 0.58 | 0.67 |

ddd refers to distance-dependent dielectric; E, S, and M are energy, sampling, and marginal errors, respectively; and 〈RMSD〉 is the average RMSD (in Å) of the lowest energy loops [101].


Prediction Accuracy

The loop prediction procedure based on PLOP with the AGBNP+ solvation model and the extended sampling schemes we devised is very successful in predicting the conformations of the 9- and 13-residue loops we have investigated. Fiser et al. used MD along with simulated annealing to predict loop conformations with an all-atom force field and a statistical treatment of solvation [126]. The percentage of predictions they report within 2 Å RMSD (described as good and medium predictions) is 55% [126]. Using a tighter RMSD cutoff of 1.5 Å, we obtain with PLOP and AGBNP+ an 86% success rate in our predictions for nine-residue loops. For a set of 13-residue loops, Fiser et al., using the same 2 Å RMSD cutoff, report a very low 15% success rate [126], compared to the 77% success rate we obtained using the AGBNP+ scoring function. Xiang et al. performed a search over a discrete rotamer library with scoring based on their colony energy. For nine-residue loops, they report an average RMSD of 2.68 Å [127]. In comparison, the average RMSD we have obtained with PLOP and AGBNP+ is 1.00 Å. De Bakker et al. [136] generated loop conformations with their program RAPPER [137] and scored them with a knowledge-based potential and with a physics-based potential, AMBER/GBSA. For nine-residue loops from the Fiser set [126], the average RMSD of the lowest energy loops was over 2 Å when scored with the AMBER/GBSA potential, which produced their best results [136].
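The 86% and 77% success rates quoted for AGBNP+ follow directly from the error totals in Table 5.2 and the sizes of the two loop sets (57 and 35). The short script below reproduces that arithmetic; the dictionary layout and function name are simply a convenient, hypothetical way of holding the table entries.

```python
# Error counts from Table 5.2: (set size, energy, sampling, marginal errors).
TABLE_5_2 = {
    "9-residue ddd": (57, 19, 4, 3),
    "9-residue AGB-gamma": (57, 6, 4, 1),
    "9-residue AGBNP": (57, 4, 4, 0),
    "9-residue AGBNP+": (57, 2, 5, 1),
    "13-residue AGBNP+": (35, 2, 5, 1),
}


def success_rate(n_loops, energy, sampling, marginal):
    """Percentage of loops predicted within the RMSD cutoff."""
    errors = energy + sampling + marginal
    return 100.0 * (n_loops - errors) / n_loops


if __name__ == "__main__":
    for model, counts in TABLE_5_2.items():
        print(f"{model:22s} {success_rate(*counts):5.1f}%")
```

For the nine-residue set, 49 of 57 loops are correct with AGBNP+ (about 86%); for the 13-residue set, 27 of 35 are correct (about 77%).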

Jacobson et al. [123] performed loop prediction calculations on a large set of nine-residue loops using the SGB/NP model [40, 138], with the crystal symmetry included. They obtained ten energy errors and eight sampling errors [123]. We obtained two energy errors and five sampling errors using AGBNP+ without the presence of the crystal environment [101]. A recent study based on the comparison of X-ray and NMR structures of identical proteins suggests that in most cases the impact of the crystal environment on protein structures is relatively small and not strongly correlated with crystal packing [139]. Recently, Zhu et al. [124, 135] have reported loop prediction results for the same 35 13-residue loops investigated here using the SGB/NP potential with crystal symmetry, supplemented by hydrophobic correction terms and a variable dielectric model. Zhu et al. showed that these promising models lower the average backbone RMSDs of the 13-residue predictions substantially, from 2.73 Å to 1.08 Å. In comparison, we obtain for the 13-residue loop set with AGBNP+ without crystal symmetry an average RMSD of 1.87 Å, which lies within the range of RMSD values reported by Zhu et al. [124, 135]. The best performing model reported by Zhu et al. [135] produces, according to our definition, five energy errors on the 13-residue loop set, compared with the two energy errors obtained here [101].


5.4 Kinetic Model of the G-Peptide

5.4.1 The G-Peptide has Apparent Two-State Kinetics After a Small Temperature Jump Perturbation

Previous experimental work in the Eaton laboratory [94] has shown that the time dependence of loss of hairpin structure in the G-peptide after a small temperature-jump perturbation is well fit by a single exponential. To confirm that our kinetic model is consistent with this previous experimental kinetic work, we performed a series of simulations modeling this temperature-jump experiment. We began each simulation by constructing an ensemble of starting points distributed according to an equilibrium distribution, with T0 ranging from 300 to 615 K. We then performed a Markov process simulation for 2,000–5,000 time units beginning from each starting point by using a reference temperature 60° higher than the temperature used to construct the initial starting point ensemble. For each temperature, the number of trajectories residing in a β-hairpin state was monitored as a function of time. In all cases, the loss of hairpin structure is fit well by single exponential decay, with the exception of a small initial “burst phase” [100]. Our results are qualitatively consistent with experimental observations [94].
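A single-exponential fit of the kind described above can be sketched as a straight-line fit to the logarithm of the decaying signal. This is only an illustration under stated assumptions: the fitting procedure actually used is not given in the text, the function name and the `t_min` and `baseline` parameters are hypothetical, and the signal is taken to be the hairpin count in excess of its final plateau. Points earlier than `t_min` are dropped so that an initial burst phase does not bias the time constant.

```python
import numpy as np

def fit_single_exponential(times, counts, t_min=0.0, baseline=0.0):
    """Fit counts(t) - baseline = A * exp(-t / tau) by linear regression
    on the logarithm of the excess signal.

    Returns (A, tau). Points with t < t_min, or with non-positive excess
    signal, are ignored.
    """
    times = np.asarray(times, dtype=float)
    excess = np.asarray(counts, dtype=float) - baseline
    keep = (times >= t_min) & (excess > 0)
    slope, intercept = np.polyfit(times[keep], np.log(excess[keep]), 1)
    return float(np.exp(intercept)), float(-1.0 / slope)
```

Comparing the time constant obtained with and without the early points is one simple way to see how much a burst phase contributes.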

5.4.2 The G-Peptide has an α-Helical Intermediate During Folding from Coil Conformations

Protein folding is a process by which conformations without identifiable secondary structure adopt a native conformation. To study this process in the G-peptide with our kinetic network model, we performed a temperature quench experiment similar to the temperature-jump experiment described earlier, but for which the starting ensemble was chosen from the equilibrium distribution at T0 = 700 K, and the simulation was run at a reference temperature of 300 K. The fraction of α-helix and β-hairpin states as a function of time displays a rapid rise in the amount of α-helix initially, which reaches a maximum and then decreases. Simultaneously, the amount of β-hairpin rises initially at a rapid rate, then continues to rise with a slower rate similar to the rate of decrease in the fraction of α-helix. This finding is suggestive of a mechanism in which there are a small number of fast direct paths from unfolded coil states to the β-hairpin, but that the majority quickly fold to α-helical states, which then convert into β-hairpins on a longer time scale. A similar phenomenon is not observed for the unfolding process: temperature-jump simulations from 300 to 700 K do not show appreciable α-helix formation. That the folding and unfolding kinetic paths are different reflects the quite different nonequilibrium cooling and heating conditions that are being simulated [100].

We can assign approximate absolute time scales to the processes observed here. Based on this finding, the appearance of β-hairpin has a time constant of ∼2,500 time units, which would correspond in physical units to ∼50 μs, whereas the rapid initial formation of α-helix occurs with a time constant of nine time units or ∼180 ns [100]. These rates are in qualitative agreement with experimental observations (6 μs) [94].
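The two conversions quoted above imply a single scale factor between model time units and physical time. The worked arithmetic below only restates the numbers given in the text; how the calibration itself was obtained is not described here.

```latex
\begin{align}
\Delta t &\approx \frac{50\ \mu\mathrm{s}}{2500\ \text{time units}}
          = 20\ \mathrm{ns}\ \text{per time unit}, \\
\tau_{\alpha} &\approx 9 \times 20\ \mathrm{ns} = 180\ \mathrm{ns}.
\end{align}
```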

To confirm that this mechanism is indeed the basis for our “ensemble averaged” observations, we performed an analogous single-molecule quenching experiment in which we chose ∼4,000 states at random from among the coil states at 690 K and used each as a starting point for a simulation at a reference temperature of 300 K. Only 9% of the trajectories reach the β-hairpin macrostate without passing through any α-helix-containing states. This finding confirms that in our kinetic network model the β-hairpin folding mechanism consists of two parallel pathways: the direct formation of the β-hairpin structure from coil states and the formation of α-helical conformations, which then interconvert into β-hairpins.
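The single-molecule analysis above amounts to classifying each trajectory by whether it visits a helical state before first reaching the hairpin. A minimal sketch of that bookkeeping follows. It assumes each trajectory has already been reduced to a list of macrostate labels, that the fraction is taken over trajectories that reach the hairpin at all, and that the labels `"coil"`, `"helix"`, and `"hairpin"` are hypothetical names chosen for illustration.

```python
def fraction_direct(trajectories, target="hairpin", intermediate="helix"):
    """Fraction of target-reaching trajectories that arrive at `target`
    without visiting `intermediate` beforehand.

    Each trajectory is a list of macrostate labels in time order.
    Returns NaN if no trajectory reaches the target.
    """
    reached = 0
    direct = 0
    for labels in trajectories:
        if target not in labels:
            continue  # this trajectory never folds within the run
        reached += 1
        first_hit = labels.index(target)
        if intermediate not in labels[:first_hit]:
            direct += 1
    return direct / reached if reached else float("nan")
```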

5.4.3 A Molecular View of Kinetic Pathways

One of the advantages of the kinetic network model proposed here is that we are able to explore a large number of potential pathways that join two macrostates. The number of such paths will typically be extremely large. Furthermore, each state along the path has associated with it all of the atomic coordinates from the REMD simulation. Therefore, the molecular aspects of the paths can be analyzed in detail. This ability allows us to explore the multitude of folding pathways that the system can potentially have at its disposal. One way in which this model can be used is to generate many paths by using Markovian kinetic Monte Carlo simulations. Such an approach with all-atom models has been useful for enumerating and quantifying the relative flux through parallel kinetic pathways in small systems [119,120]. Alternatively, it is possible to investigate thermodynamically favorable pathways by a detailed analysis of the structure of the kinetic network, for example, by searching for a small number of short paths connecting the two macrostates under the constraint that the instantaneous temperature remain below a predetermined maximum value. We use this approach to analyze pathways connecting the α-helix and β-hairpin macrostates in the G-peptide [100].
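The constrained path search described above can be illustrated with a breadth-first search that simply refuses to step onto states above the temperature cap. This is a simplified sketch, not the procedure used in [100]: it returns one path with the fewest hops rather than a set of short paths, and the graph representation (`edges` as an adjacency dictionary, `temperature` as a per-state lookup) is an assumption made for the example.

```python
from collections import deque

def shortest_path_below(edges, temperature, sources, targets, t_max):
    """Breadth-first search for a fewest-hop path from any source state to
    any target state, visiting only states with temperature <= t_max.

    edges:       dict mapping a state to an iterable of neighbor states
    temperature: dict mapping a state to its instantaneous temperature
    Returns the path as a list of states, or None if no path exists.
    """
    targets = set(targets)
    parent = {}
    queue = deque()
    for state in sources:
        if temperature[state] <= t_max and state not in parent:
            parent[state] = None
            queue.append(state)
    while queue:
        state = queue.popleft()
        if state in targets:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in edges.get(state, ()):
            if nxt not in parent and temperature[nxt] <= t_max:
                parent[nxt] = state
                queue.append(nxt)
    return None
```

Lowering `t_max` prunes the hotter shortcuts first, so the surviving paths are the ones that stay in thermodynamically accessible regions of the network.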

Two short pathways that link the α-helical and β-hairpin macrostates without making use of microstates with an instantaneous temperature above 488 K are shown in Fig. 5.1. The path shown in Fig. 5.1 (upper) involves the unwinding of both ends of the helix, leaving approximately one turn of helix in the middle of the molecule. This turn then serves as a nucleation point for the formation of the β-turn, which is stabilized by hydrophobic interactions between the side chains of Y45 and F52. The native hydrogen bonds nearest to the turn then form, after which the remainder of the native hairpin structure forms. This pathway is similar to previously proposed mechanisms for the folding of the G-peptide β-hairpin from a coil state, which emphasize the formation of hydrophobic contacts before hydrogen bond formation [17, 18, 140–143] and the persistence of the β-turn even in the unfolded state [143]. The novel aspect of the path shown in Fig. 5.1 (upper) is the preformation of the β-turn from a residual turn in an otherwise unfolded α-helix.

Fig. 5.1. Two possible pathways for the interconversion of an α-helix into a β-hairpin of the G-peptide. Backbone trace is shown in ribbons and cylinders, and the hydrophobic core residues (W43, Y45, F52, and V54) side chains are shown in sticks. (Upper) The path corresponds to an unraveling of the helix at both ends and formation of a β-turn from a residual turn of the α-helix. (Lower) The path corresponds to an unraveling of one end of the helix, which loops back

An alternative pathway (Fig. 5.1, lower) involves the unwinding of the C-terminal half of the α-helix, which then loops back so as to be nearly parallel to the remaining helix. This proximity allows for the possibility of side-chain interactions between the helix and the C-terminal half of the molecule, including hydrophobic interactions between F52 in the helix and either W43 or Y45. This pathway is very similar to the one previously identified by us on the basis of the analysis of the potential of mean force for the G-peptide along two principal component degrees of freedom [144]. In both pathways, it is clear that formation of native β-hairpin contacts can occur without the complete loss of helical secondary structure, making the idea of the α-helix as an on-path intermediate in the formation of the β-hairpin physically plausible [100].

5.5 Ligand Conformational Equilibrium in a Cytochrome P450 Complex

The cytochrome P450 enzymes catalyze the oxidation of a wide variety of hydrophobic substrates [145]. P450 enzymes are ubiquitous. In humans they are found in the liver and are important in cellular housekeeping processes, including the metabolism of pharmaceutical agents and detoxification [145]. P450 enzymes are thus important in the study of drug metabolism and toxicity. The mechanism of catalysis by P450 is centered on the iron of the heme group [146]. However, the crystal structures of many P450 enzyme-substrate complexes [147–150] show the substrate bound far from the iron, in a position that is evidently unproductive for chemistry. Based on UV–vis and NMR measurements and induced fit docking, Jovanovic et al. [151] have proposed that the structure of one of these complexes (P450 BM-3 bound to NPG [147]) depends on temperature, and that at biologically relevant temperatures the ligand moves from a position distant from the heme iron, as seen in the low temperature X-ray crystal structure, into a position proximal to the iron, leading to the displacement of the iron-coordinated water molecule and the initiation of the oxidation mechanism.

Fig. 5.2. Active site of the P450 BM-3/NPG complex in (a) the low temperature X-ray conformation (PDB 1jpz), representative of the distal state, where the NPG (shown in green) is distant from the heme iron, with Phe87 (shown in magenta) interposed between NPG and the heme iron (shown in blue), and (b) the alternative active site of the conformation predicted by Jovanovic et al., representative of the proximal state, where Phe87 has changed its rotameric state to allow NPG to approach the heme iron

In this study we use REMD [67, 75] to study the thermodynamic equilibrium between the conformations of the P450 BM-3/NPG complex in which the terminal carbon atoms of NPG are distant from the heme iron, as in the low temperature X-ray crystal structure [147] (the distal state, see Fig. 5.2a), and conformations with the terminal carbon atoms of NPG proximal to the heme iron, as in the conformation proposed by Jovanovic et al. [151] (the proximal state, see Fig. 5.2b). REMD is ideally suited for this problem not only because it improves conformational sampling but also because it yields the populations of conformational states over a range of temperatures.

5.5.1 Methodology

We apply REMD [67,75] to the P450 BM-3/NPG complex starting from the low temperature crystal structure [147] (PDB id 1jpz) over a temperature range from 260 to 457 K with 24 replicas. This range was chosen to study the system at biologically relevant temperatures and at the same time (1) to connect with low temperature experimental information and (2) to enhance sampling at low temperature. A receptor restraining scheme was designed to prevent unfolding of the protein at high temperatures, but to allow enough flexibility to observe the conformational change at the active site. The REMD simulation employed the OPLS-AA all atom force field [105] and the AGBNP [27] implicit solvent model to mimic the water environment. The replica exchange acceptance ratio was 25% on average. The total simulation time, including equilibration, was 3 ns for each of the 24 replicas, for a total of 72 ns.
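The acceptance ratio quoted above refers to attempted swaps of temperature between neighboring replicas. The text does not spell out the acceptance rule; the sketch below uses the standard Metropolis criterion of temperature replica exchange [67], in which a swap between replicas with potential energies E_i, E_j at temperatures T_i, T_j is accepted with probability min{1, exp[(β_i − β_j)(E_i − E_j)]}, β = 1/kT. The function name and the choice of kcal/mol energy units are assumptions made for the example.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption

def exchange_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance probability for swapping the temperatures of
    two replicas with potential energies e_i (at t_i) and e_j (at t_j)."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # delta >= 0 means the swap does not lower the joint Boltzmann weight
    return 1.0 if delta >= 0.0 else math.exp(delta)
```

A swap is always accepted when the colder replica currently holds the higher-energy configuration. Otherwise the probability falls off with both the temperature gap and the energy gap, which is why closely spaced temperatures are needed to keep the acceptance ratio reasonable.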

Population distributions were obtained by collecting the distances between the ω−1 carbon atom of NPG (the main substrate oxidation site) and the catalytic Fe atom, as well as the potential energy of conformations, from 10 replicas in the temperature range from 260 to 357 K. These quantities are binned into histograms, which are then used as the input for the temperature weighted histogram analysis method (T-WHAM) [144] to finally give population distributions. T-WHAM [144] makes it possible to resolve the population distributions corresponding to conformations of relatively high free energy, which are rarely sampled at room temperature, but are needed to determine the mechanism of interconversion between stable conformations. T-WHAM accomplishes this by exploiting information contained in the high temperature replicas, where high free energy conformations are generated. Using this tool we postulate a mechanism for the conformational interconversion between the distal and proximal states [80].

5.5.2 The Population of the Proximal State as a Function of Temperature

The ω−1-Fe distance in the low temperature X-ray crystal structure [147], which corresponds to the distal state, is 8.5 Å, and in the conformation proposed by Jovanovic et al. [151], which corresponds to the proximal state, it is 4.5 Å. By defining the proximal state to be made of all conformations with ω−1-Fe distances less than 6.5 Å, we obtain the population of the proximal state as a function of temperature shown in Fig. 5.3. The population of the proximal state is 32% at 260 K, increases with temperature, and finally plateaus at 318 K with 90% of the population in the proximal state. Both proximal and distal states exist at all temperatures: rather than a sharp conformational transition from distal to proximal state at a specific transition temperature, a gradual shift in population from distal to proximal state occurs with increase in temperature. These findings are in agreement with the thermal activation mechanism proposed by Jovanovic et al. [151]. The predicted midpoint of the transition from the distal to the proximal state is 268 K (see Fig. 5.3), ∼20° higher than the observed transition temperature [151]. The increase in population of the proximal state with increasing temperature indicates that the proximal state is stabilized by conformational entropy [80].
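The link between a rising proximal population and entropic stabilization can be made explicit with a two-state argument, which also recovers the expression quoted in the caption of Fig. 5.3 for the entropy difference. The sketch assumes (our assumption, not stated in the text) that the proximal population p(T) and the distal population 1 − p(T) are related to the free energy difference ΔG = G_prox − G_dist by a Boltzmann ratio, with k the Boltzmann constant.

```latex
\begin{align}
\Delta G(T) &= -kT \ln\frac{p}{1-p}, \\
\Delta S(T) &= -\frac{\partial \Delta G}{\partial T}
  = k \ln\frac{p}{1-p} + kT\,\frac{\partial}{\partial T}\ln\frac{p}{1-p} \\
  &= k \ln\frac{p}{1-p} + \frac{kT}{p(1-p)}\,\frac{\partial p}{\partial T}.
\end{align}
```

The last step uses d ln[p/(1−p)]/dp = 1/[p(1−p)]. The second term is positive whenever p increases with T, and the first term is positive once p > 1/2, that is, above the transition midpoint.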

Fig. 5.3. Population as a function of temperature, p(T), corresponding to the conformations in which the ligand is proximal to the heme iron. The proximal state population increases monotonically with temperature, indicating that the proximal state is stabilized by conformational entropy at temperatures greater than at least 268 K. This is borne out by the expression for the conformational entropy difference between the proximal and the distal states, $\Delta S = k \ln[p/(1-p)] + \frac{kT}{p(1-p)}\,\frac{\partial p}{\partial T}$, where the second term is positive and the first term is positive for T > 268 K (p(T) > 1/2)

5.6 Simple Continuous and Discrete Models for Simulating Replica Exchange

One cannot systematically explore the convergence properties of RE as a function of the simulation parameters and/or the underlying kinetics of the molecular system by brute force molecular simulations, since RE simulations of protein folding are very difficult to converge. As an alternative, it is useful to study simplified low dimensionality systems. While these models do not capture all of the complexities of the “real” molecular simulation, they do capture some of the essential features of RE and allow us to study these fundamental aspects of the algorithm at relatively low computational cost and in a controlled setting. We discuss here two simplified models of RE. The first is a discrete two-state network model, containing two conformational states (Folded and Unfolded) at each of the several temperatures [102]. This model reduces the atomic complexity of the system to discrete conformational states, which evolve in continuous time according to Markovian kinetics for both conformational transitions and exchange between replicas. The second makes use of a continuous two-dimensional potential, which is sufficiently simple to be amenable to accurate analytical and numerical solution, while including some characteristics of molecular systems that were absent from the discrete network model. In both cases, the efficiency of RE conformational sampling will be monitored by measuring NTE, the number of round-trip transitions in the conformational state of a replica, conditional on the low temperature of interest T0, that occur in a given observation time. A transition event is a transit of a given replica from one conformation at T0 to the other conformation at T0 and back again regardless of route. Conceptually, this measure reflects the potential of RE to achieve rapid equilibration at the temperature of interest by means of conformational transitions at temperatures other than the temperature of interest.
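The round-trip measure NTE defined in this section can be counted from a single replica's history by looking only at the frames in which that replica sits at T0. The following is a minimal sketch under that reading of the definition. The function name and the representation of the history as `(conformation, temperature)` pairs are assumptions made for illustration.

```python
def count_transition_events(history, t0):
    """Count round trips of one replica between the two conformations,
    observed only when the replica is at temperature t0.

    history: iterable of (conformation, temperature) pairs in time order.
    A round trip is: seen in conformation X at t0, later seen in the other
    conformation at t0, later seen in X at t0 again. Whatever the replica
    does at other temperatures in between is ignored.
    """
    events = 0
    origin = None   # conformation in which the current round trip started
    away = False    # True once the other conformation has been seen at t0
    for conformation, temperature in history:
        if temperature != t0:
            continue
        if origin is None:
            origin = conformation
        elif not away and conformation != origin:
            away = True
        elif away and conformation == origin:
            events += 1
            away = False
    return events
```

A replica that unfolds and refolds entirely at high temperature, and returns to T0 in the conformation it left with, contributes nothing to the count, which matches the conditioning on T0 in the definition.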

5.6.1 Discrete Network Replica Exchange (NRE)

In the NRE model, the protein is assumed to exist in one of the two macrostates F and U (for “folded” and “unfolded”), which do not possess any internal structure. Instead, it is assumed that the system evolves in time as a Poisson process, in which instantaneous transitions between F and U occur after waiting periods given by exponentially distributed random variables with means equal to the reciprocals of the folding or unfolding rates. If the transition events are Markovian, then the simultaneous behavior of two uncoupled noninteracting replicas can be represented by the four composite states {F1F2, F1U2, U1F2, U1U2}. In each symbol, the first letter represents the configuration of replica 1, the second letter the configuration of replica 2, and the subscripts denote the temperature of each replica.

The four-state composite system for two noninteracting replicas can be extended to create a network model of replica exchange by introducing temperature exchanges between replicas, that is, by allowing transitions such as F1U2 → F2U1. This leads to a system with eight states arranged in a cubic network, with “horizontal” folding and unfolding transitions and “vertical” temperature exchange transitions (Fig. 5.4). The effect of the rate of temperature exchanges is included by introducing the rate parameter α, which controls the overall scaling of the temperature exchange rate relative to the folding and unfolding rates. For canonical equilibrium probabilities to be preserved under temperature exchanges, it is sufficient that detailed balance is satisfied by scaling α by a factor w = Peq(F2U1)/Peq(F1U2) as appropriate. Kinetics in the NRE model is simulated using a standard method for continuous-time Markov processes with discrete states, known as the “Gillespie algorithm” [122].
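The eight-state network lends itself to a compact Gillespie simulation. The sketch below is an illustration under stated assumptions rather than the implementation of [102]: the exchange rate is taken as α·min(1, w) with w the ratio of equilibrium probabilities after and before the swap, which is one way to satisfy the detailed balance condition described above, and the function name, argument layout, and return value are hypothetical.

```python
import random

def simulate_nre(kf, ku, alpha, t_end, seed=0):
    """Gillespie simulation of two two-state replicas that exchange
    temperatures 1 and 2.

    kf, ku: dicts mapping temperature index (1 or 2) to the folding and
            unfolding rate at that temperature.
    Returns the fraction of time the replica at temperature 1 is folded.
    """
    rng = random.Random(seed)
    # single-replica equilibrium probability of conformation c at temperature t
    peq = {(c, t): (kf[t] if c == "F" else ku[t]) / (kf[t] + ku[t])
           for c in "FU" for t in (1, 2)}
    flip = {"F": "U", "U": "F"}
    c1, t1, c2, t2 = "F", 1, "U", 2   # composite state F1U2
    time = 0.0
    folded_time = 0.0
    while time < t_end:
        # rates for each replica to leave its current conformation
        r1 = ku[t1] if c1 == "F" else kf[t1]
        r2 = ku[t2] if c2 == "F" else kf[t2]
        # exchange rate scaled so that detailed balance holds (assumption)
        w = (peq[(c1, t2)] * peq[(c2, t1)]) / (peq[(c1, t1)] * peq[(c2, t2)])
        rx = alpha * min(1.0, w)
        total = r1 + r2 + rx
        # exponentially distributed waiting time, truncated at the end of the run
        dt = min(rng.expovariate(total), t_end - time)
        if (c1 if t1 == 1 else c2) == "F":
            folded_time += dt
        time += dt
        if time >= t_end:
            break
        # choose which event fires, with probability proportional to its rate
        u = rng.random() * total
        if u < r1:
            c1 = flip[c1]
        elif u < r1 + r2:
            c2 = flip[c2]
        else:
            t1, t2 = t2, t1
    return folded_time / t_end
```

Because the exchange moves satisfy detailed balance, the folded fraction at temperature 1 should converge to kf/(kf + ku) evaluated at that temperature regardless of α. Only the speed of convergence depends on α and on the rates at the other temperature.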

It was found that the convergence of NRE for a two replica system in the limit of very rapid temperature exchanges is fastest when the high temperature is chosen to maximize the harmonic mean of the folding and unfolding rates. Thus, if protein folding follows anti-Arrhenius kinetics, there exists an optimal maximal temperature, beyond which the efficiency of the replica exchange method is degraded. Both the convergence rate and efficiency are reduced if the temperature exchange rate is finite, and the optimal temperature of the high-temperature replica is reduced.
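The optimality criterion above is easy to evaluate once folding and unfolding rates are known on a grid of candidate temperatures. The following sketch picks the temperature that maximizes the harmonic mean 2·kf·ku/(kf + ku). The function names are illustrative, and any rates passed in would have to come from a separate estimate, since none are tabulated in the text.

```python
import numpy as np

def harmonic_mean_rate(kf, ku):
    """Harmonic mean of the folding and unfolding rates, elementwise."""
    kf = np.asarray(kf, dtype=float)
    ku = np.asarray(ku, dtype=float)
    return 2.0 * kf * ku / (kf + ku)

def optimal_high_temperature(temperatures, kf, ku):
    """Return (temperature, harmonic mean) at the grid point where the
    harmonic mean of the two rates is largest."""
    h = harmonic_mean_rate(kf, ku)
    best = int(np.argmax(h))
    return float(temperatures[best]), float(h[best])
```

With anti-Arrhenius folding, kf falls as the temperature rises while ku keeps growing. The harmonic mean is dominated by the smaller of the two rates, so it passes through a maximum, and raising the top temperature further only slows the round trips.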

[Figure 5.4: cubic network of the eight composite states F1F2, F1U2, U1F2, U1U2, F2F1, F2U1, U2F1, and U2U1]

Fig. 5.4. The kinetic network model for the discrete NRE model used by Zheng et al. [102]. The state labels represent the conformation (letter) and temperature (subscript) for each replica. For example, F2U1 represents the state in which replica 1 is folded and at temperature T2, while replica 2 is unfolded at temperature T1. Gray and black arrows correspond to folding and unfolding transitions, respectively, while the temperature at which the transition occurs is indicated by the solid and dashed lines (for T2 and T1, respectively). The bold arrows correspond to temperature exchange transitions, with the solid and dashed lines denoting transitions with rate parameters α and wα, respectively

5.6.2 RE Simulations using MC on a Continuous Potential

In contrast to the NRE model, the simplified model of RE based on the continuous potential has macrostates which, like real molecular systems, have microscopic internal structure, and it is therefore not guaranteed to have Markovian kinetics. The two-dimensional potential was constructed to mimic the anti-Arrhenius temperature dependence of the folding rate seen in proteins by having an energetic barrier when going from the “folded” to the “unfolded” region, and an entropic barrier in the reverse direction. This was achieved by imposing a hard wall constraint that limits the space accessible to the folded region, combined with a potential energy function that has an energetic well in the folded region and increases as one goes further into the unfolded region. This results in a two-well free energy profile as a function of the folding coordinate, where the activation free energy for folding increases with increasing temperature.

Metropolis kinetic Monte Carlo (MC) sampling was used to simulate the movement of a particle in this two-dimensional potential, and rate constants were obtained by calculating the mean first passage times (FPTs) between the two macrostates. The resulting FPT distributions were exponential and in agreement with the activation free energies obtained from the free energy profile along the folding coordinate. Replica exchange simulations were performed with a kinetic MC propagator, and exchanges of configurations were attempted every NX MC steps.
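A Metropolis walk on a two-dimensional potential with a hard wall can be sketched in a few lines. The functional form of the potential used in [103] is not given here, so `toy_potential` below is a purely hypothetical stand-in (a harmonic well confined to a strip), and the function names, step size, and seed are likewise assumptions. The hard wall is represented by an infinite energy, which the Metropolis test always rejects.

```python
import math
import random

def toy_potential(x, y):
    """Hypothetical stand-in potential: harmonic well with hard walls at |y| = 1."""
    if abs(y) > 1.0:
        return math.inf  # hard wall: region is inaccessible
    return 0.5 * (x * x + y * y)

def metropolis_walk(potential, start, beta, n_steps, step=0.5, seed=0):
    """Metropolis Monte Carlo walk of one particle in two dimensions.

    Trial moves are uniform displacements of at most `step` in x and y.
    Returns the list of visited positions, one per step.
    """
    rng = random.Random(seed)
    x, y = start
    energy = potential(x, y)
    positions = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        y_new = y + rng.uniform(-step, step)
        e_new = potential(x_new, y_new)
        # accept downhill moves always, uphill moves with Boltzmann probability;
        # an infinite e_new gives exp(-inf) = 0 and is never accepted
        if e_new <= energy or rng.random() < math.exp(-beta * (e_new - energy)):
            x, y, energy = x_new, y_new, e_new
        positions.append((x, y))
    return positions
```

First passage times between two regions could then be read off such a trajectory by recording the step at which the walker first enters the other region, and a replica exchange layer would run one such walker per temperature and attempt swaps every fixed number of steps.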

Behavior similar to that seen for the NRE model is also observed for the continuous potential: the efficiency is nonmonotonic and exhibits a maximum at an optimal high temperature given by the maximal harmonic mean of the folding and unfolding rates. However, the number of transitions is significantly lower than that predicted from the average of the harmonic means of the rates as seen in the NRE model. A comparison of continuous and discrete RE simulations has revealed non-Markovian effects. By simultaneously studying a discrete network model of RE and RE on a simplified two-dimensional potential, it is possible to clarify to some degree the origins and effects of anti-Arrhenius and non-Markovian kinetics on the efficiency of RE. Furthermore, these results suggest that the use of “training” simulations to explore some aspects of the temperature dependence for folding of the atomic level models prior to performing replica exchange studies could be useful in improving the overall efficiency of the calculation [102].

5.7 Conclusion

We have demonstrated that OPLS-AA/AGBNP+ and REMD can capture the thermodynamics of peptide folding (for instance, the G-peptide and C-peptide [75]) and protein–ligand binding (N-palmitoylglycine complexed to cytochrome P450 BM-3 [80]). OPLS-AA/AGBNP+ is effective in discriminating the correct fold of a loop on a protein from competing misfolded conformations [101]. This is an indication that our effective potential is suitable for protein folding when considered in conjunction with our previous work on detecting native folds from misfolded decoys [14]. While thermodynamics can be calculated directly from replica exchange, kinetics cannot. We have shown, however, that network models can be constructed from the conformations generated from REMD to calculate the kinetics of the system [100]. We have also shown that a kinetic network model with a discrete model of the RE system can provide insights into the kinetics of RE [102]. We have extended our investigation into the behavior of RE with a simple continuous potential, which captures some of the kinetics of protein folding [103]. These simple models have demonstrated some of the pitfalls of RE, which can occur under certain circumstances, such as when systems exhibit anti-Arrhenius behavior.

Acknowledgments

This project has been supported in part by National Institutes of Health Grant GM-30580.

References

1. W.A. Eaton, V. Munoz, S.J. Hagen, G.S. Jas, L.J. Lapidus, E.R. Henry, J. Hofrichter, Annu. Rev. Biophys. Biomol. Struct. 29, 327 (2000)
2. J.K. Myers, T.G. Oas, Annu. Rev. Biochem. 71, 783 (2002)
3. A.R. Dinner, A. Sali, L.J. Smith, C.M. Dobson, M. Karplus, Trends Biochem. Sci. 25, 331 (2000)
4. J. Rumbley, L. Hoang, L. Mayne, S.W. Englander, Proc. Natl. Acad. Sci. USA 98, 105 (2001)
5. A.R. Fersht, V. Daggett, Cell 108, 573 (2002)
6. M. Vendruscolo, E. Paci, Curr. Opin. Struct. Biol. 13, 82 (2003)
7. T. Lazaridis, M. Karplus, J. Mol. Biol. 288, 477 (1999)
8. D. Petrey, B. Honig, Protein Sci. 9, 2181 (2000)
9. T. Lazaridis, M. Karplus, Curr. Opin. Struct. Biol. 10, 139 (2000)
10. B.D. Bursulaya, C.L. Brooks III, J. Phys. Chem. B 104, 12378 (2000)
11. B.N. Dominy, C.L. Brooks III, J. Comput. Chem. 23, 147 (2002)
12. Y. Liu, D.L. Beveridge, Proteins: Struct. Funct. Genet. 46, 128 (2002)
13. M. Feig, C.L. Brooks III, Proteins: Struct. Funct. Genet. 49, 232 (2002)
14. A.K. Felts, E. Gallicchio, A. Wallqvist, R.M. Levy, Proteins: Struct. Funct. Genet. 48, 404 (2002)
15. Y.M. Rhee, V.S. Pande, Biophys. J. 84, 775 (2003)
16. R.M. Levy, E. Gallicchio, Annu. Rev. Phys. Chem. 49, 531 (1998)
17. A.R. Dinner, T. Lazaridis, M. Karplus, Proc. Natl. Acad. Sci. USA 96, 9068 (1999)
18. B. Zagrovic, E.J. Sorin, V. Pande, J. Mol. Biol. 313, 151 (2001)
19. R. Zhou, B.J. Berne, Proc. Natl. Acad. Sci. USA 99, 12777 (2002)
20. R. Zhou, Proteins: Struct. Funct. Genet. 53, 148 (2003)
21. B. Roux, T. Simonson, Biophys. Chem. 78, 1 (1999)
22. D. Bashford, D.A. Case, Annu. Rev. Phys. Chem. 51, 129 (2000)
23. T. Simonson, Curr. Opin. Struct. Biol. 11, 243 (2001)
24. J. Zhu, Y. Shi, H. Liu, J. Phys. Chem. B 106, 4844 (2002)
25. M. Krol, J. Comput. Chem. 24, 531 (2003)
26. A. Suenaga, J. Mol. Struct. (Theochem) 634, 235 (2003)
27. E. Gallicchio, R.M. Levy, J. Comput. Chem. 25, 479 (2004)
28. B. Marten, K. Kim, C. Cortis, R.A. Friesner, R.B. Murphy, M.N. Ringnalda, D. Sitkoff, B. Honig, J. Phys. Chem. 100, 11775 (1996)
29. D. Qiu, P.S. Shenkin, F.P. Hollinger, W.C. Still, J. Phys. Chem. A 101, 3005 (1997)
30. N. Froloff, A. Windemuth, B. Honig, Protein Sci. 6, 1293 (1997)
31. C.J. Cramer, D. Truhlar, Chem. Rev. 99, 2161 (1999)
32. E. Gallicchio, L.Y. Zhang, R.M. Levy, J. Comput. Chem. 23, 517 (2002)
33. M.S. Lee, M. Feig, F.R. Salsbury Jr., C.L. Brooks III, J. Comput. Chem. 24(11), 1348 (2003)
34. J. Tomasi, M. Persico, Chem. Rev. 94, 2027 (1994)
35. C.M. Cortis, R.A. Friesner, J. Comput. Chem. 18, 1591 (1997)
36. W. Rocchia, S. Sridharan, A. Nicholls, E. Alexov, A. Chiabrera, B. Honig, J. Comput. Chem. 23, 128 (2002)
37. W.C. Still, A. Tempczyk, R.C. Hawley, T. Hendrickson, J. Am. Chem. Soc. 112, 6127 (1990)
38. A. Onufriev, D. Bashford, D.A. Case, J. Phys. Chem. B 104, 3712 (2000)
39. A. Ghosh, C.S. Rapp, R.A. Friesner, J. Phys. Chem. B 102, 10983 (1998)
40. L. Zhang, E. Gallicchio, R. Friesner, R.M. Levy, J. Comput. Chem. 22, 591 (2001)
41. M. Schaefer, C. Froemmel, J. Mol. Biol. 216, 1045 (1990)
42. G.D. Hawkins, C.J. Cramer, D.G. Truhlar, J. Phys. Chem. 100, 19824 (1996)


43. M. Schaefer, M. Karplus, J. Phys. Chem. 100, 1578 (1996)
44. B.N. Dominy, C.L. Brooks III, J. Phys. Chem. B 103, 3765 (1999)
45. V. Tsui, D.A. Case, Biopolymers 56, 275 (2000)
46. A. Ben-Naim, Hydrophobic Interactions (Plenum Press, New York, 1980)
47. W. Kauzmann, Adv. Prot. Chem. 14, 1 (1959)
48. K.A. Dill, Biochemistry 29, 7133 (1990)
49. P.L. Privalov, G.I. Makhatadze, J. Mol. Biol. 232, 660 (1993)
50. B. Honig, A.S. Yang, Adv. Prot. Chem. 46, 27 (1995)
51. J.M. Sturtevant, Proc. Natl. Acad. Sci. USA 74, 2236 (1977)
52. D.H. Williams, M.S. Searle, J.P. Mackay, U. Gerhard, R.A. Maplestone, Proc. Natl. Acad. Sci. USA 90, 1172 (1993)
53. X. Siebert, G. Hummer, Biochemistry 41, 2965 (2002)
54. T. Ooi, M. Oobatake, G. Nemethy, A. Sheraga, Proc. Natl. Acad. Sci. USA 84, 3086 (1987)
55. M.R. Lee, Y. Duan, P.A. Kollman, Proteins 39(4), 309 (2000)
56. P.H. Hunenberger, V. Helms, N. Narayana, S.S. Taylor, J.A. McCammon, Biochemistry 38(8), 2358 (1999)
57. T. Simonson, A.T. Brunger, J. Phys. Chem. 98, 4683 (1994)
58. D. Sitkoff, K.A. Sharp, B. Honig, J. Phys. Chem. 98, 1978 (1994)
59. C.S. Rapp, R.A. Friesner, Proteins: Struct. Funct. Genet. 35, 173 (1999)
60. F. Fogolari, G. Esposito, P. Viglino, H. Molinari, J. Comput. Chem. 22, 1830 (2001)
61. E. Pellegrini, M.J. Field, J. Phys. Chem. A 106, 1316 (2002)
62. C. Curutchet, C.J. Cramer, D.G. Truhlar, M.F. Ruiz-Lopez, D. Rinaldi, M. Orozco, F.J. Luque, J. Comput. Chem. 24, 284 (2003)
63. A. Wallqvist, D.G. Covell, J. Phys. Chem. 99, 13118 (1995)
64. E. Gallicchio, M.M. Kubo, R.M. Levy, J. Phys. Chem. B 104, 6271 (2000)
65. R.M. Levy, L.Y. Zhang, E. Gallicchio, A.K. Felts, J. Am. Chem. Soc. 25(31), 9523 (2003)
66. M. Nina, D. Beglov, B. Roux, J. Phys. Chem. B 101, 5239 (1997)
67. Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314, 141 (1999)
68. A. Mitsutake, Y. Sugita, Y. Okamoto, Biopolymers 60, 96 (2001)
69. S. Gnanakaran, H. Nymeyer, J. Portman, K.Y. Sanbonmatsu, A.E. García, Curr. Opin. Struct. Biol. 13, 168 (2003)
70. A.E. García, K.Y. Sanbonmatsu, Proteins: Struct. Funct. Genet. 42, 345 (2001)
71. R. Zhou, B.J. Berne, R. Germain, Proc. Natl. Acad. Sci. USA 98, 14931 (2001)
72. R.H. Swendsen, J.S. Wang, Phys. Rev. Lett. 57, 2607 (1986)
73. K. Hukushima, K. Nemoto, J. Phys. Soc. Jpn. 65, 1604 (1996)
74. H. Nymeyer, S. Gnanakaran, A.E. García, Meth. Enzymol. 383, 119 (2004)
75. A.K. Felts, Y. Harano, E. Gallicchio, R.M. Levy, Proteins: Struct. Funct. Bioinform. 56, 310 (2004)
76. M. Cecchini, F. Rao, M. Seeber, A. Caflisch, J. Chem. Phys. 121, 10748 (2004)
77. H.H.G. Tsai, M. Reches, C.J. Tsai, K. Gunasekaran, E. Gazit, R. Nussinov, Proc. Natl. Acad. Sci. USA 102, 8174 (2005)
78. A. Baumketner, J.E. Shea, Biophys. J. 89, 1493 (2005)
79. G.M. Verkhivker, P.A. Rejto, D. Bouzida, S. Arthurs, A.B. Colson, S.T. Freer, D.K. Gehlhaar, V. Larson, B.A. Luty, T. Marrone, P.W. Rose, Chem. Phys. Lett. 337, 181 (2001)
80. K.P. Ravindranathan, E. Gallicchio, R.A. Friesner, A.E. McDermott, R.M. Levy, J. Am. Chem. Soc. 128, 5786 (2006)


81. F. Rao, A. Caflisch, J. Chem. Phys. 119, 4035 (2003)82. M.M. Seibert, A. Patriksson, B. Hess, D. van der Spoel, J. Mol. Biol. 354, 173

(2005)83. D.A. Kofke, J. Chem. Phys. 117, 6911 (2002)84. A. Kone, D.A. Kofke, J. Chem. Phys. 122, 206101 (2005)85. C. Predescu, M. Predescu, C.V. Ciobanu, J. Chem. Phys. 120, 4119 (2004)86. C. Predescu, M. Predescu, C.V. Ciobanu, J. Phys. Chem. B 109, 4189 (2005)87. N. Rathore, M. Chopra, J.J. de Pablo, J. Chem. Phys. 122, 024111 (2005)88. S. Trebst, M. Troyer, U.H.E. Hansmann, J. Chem. Phys. 124, 174903 (2006)89. D.M. Zuckerman, E. Lyman, J. Chem. Theory Comput. 2, 1200 (2006)90. D.M. Zuckerman, J. Chem. Theory Comput. 2, 1693 (2006)91. D.A.C. Beck, G.W.N. White, V. Daggett, J. Struct. Biol. 157, 514 (2007)92. S.I. Segawa, M. Sugihara, Biopolymers 23, 2473 (1984)93. M. Oliveberg, Y.J. Tan, A.R. Fersht, Proc. Natl. Acad. Sci. USA 92, 8926

(1995)94. V. Munoz, P.A. Thompson, J. Hofrichter, W.A. Eaton, Nature 390, 196 (1997)95. M. Karplus, J. Phys. Chem. B 104, 11 (2000)96. P. Ferrara, J. Apostolakis, A. Caflisch, J. Phys. Chem. B 104, 5000 (2000)97. W.Y. Yang, M. Gruebele, Biochemistry 43, 13018 (2004)98. M.L. Scalley, D. Baker, Proc. Natl. Acad. Sci. USA 94, 10636 (1997)99. J.D. Bryngelson, P.G. Wolynes, J. Phys. Chem. 93, 6902 (1989)

100. M. Andrec, A.K. Felts, E. Gallicchio, R.M. Levy, Proc. Natl. Acad. Sci. USA102, 6801

101. A.K. Felts, E. Gallicchio, D. Chekmarev, K.A. Paris, R.A. Friesner, R.M. Levy,J. Chem. Theory Comput. 4, 855 (2008)

102. W. Zheng, M. Andrec, E. Gallicchio, R.M. Levy, Proc. Natl. Acad. Sci. USA104, 15340 (2007)

103. W. Zheng, M. Andrec, E. Gallicchio, R.M. Levy, J. Phys. Chem. B 112, 6083(2008)

104. Y.N. Vorobjev, J.C. Almagro, J. Hermans, Proteins: Struc. Func. Gen. 32, 399(1998)

105. W.L. Jorgensen, D.S. Maxwell, J. Tirado-Rives, J. Am. Chem. Soc. 118, 11225(1996)

106. G.A. Kaminski, R.A. Friesner, J. Tirado-Rives, W.L. Jorgensen, J. Phys.Chem. B 105, 6474 (2001)

107. J.L. Banks, H.S. Beard, Y. Cao, A.E. Cho, W. Damm, R. Farid, A.K. Felts,T.A. Halgren, D.T. Mainz, J.R. Maple, R. Murphy, D.M. Philipp, M.P.Repasky, L.Y. Zhang, B.J. Berne, R.A. Friesner, E. Gallicchio, R.M. Levy,J. Comput. Chem. 26, 1752 (2005)

108. W.L. Jorgensen, N.A. McDonald, Theochem 424, 145 (1998)109. W.L. Jorgensen, N.A. McDonald, J. Phys. Chem. B 102, 8094 (1998)110. R.C. Rizzo, W.L. Jorgensen, J. Am. Chem. Soc. 121, 4827 (1999)111. E.K. Watkins, W.L. Jorgensen, J. Phys Chem. A 105, 4118 (2001)112. D.J. Weininger, J. Chem. Info. Comput. Sci. 28, 31 (1988)113. J.A. Wagoner, N.A. Baker, Proc. Natl. Acad. Sci. USA 103, 8331 (2006)114. R. Geney, M. Layten, R. Gomperts, V. Hornak, C. Simmerling, J. Chem. The-

ory Comput. 2, 115 (2006)115. S.B. Ozkan, K.A. Dill, I. Bahar, Protein Sci. 11, 1958 (2002)


116. F. Rao, A. Caflisch, J. Mol. Biol. 342, 299 (2004)117. N. Singhal, C.D. Snow, V.S. Pande, J. Chem. Phys. 121, 415 (2004)118. W.C. Swope, J.W. Pitera, F. Suits, J. Phys. Chem. B 108, 6571 (2004)119. W.C. Swope, J.W. Pitera, F. Suits, M. Pitman, M. Eleftheriou, B.G. Fitch,

R.S. Germain, A. Rayshubski, T.L.C. Ward, Y. Zhestkov, R. Zhou, J. Phys.Chem. B 108, 6582 (2004)

120. D.S. Chekmarev, T. Ishida, R.M. Levy, J. Phys. Chem. B 108, 19487 (2004)121. D.A. Evans, D.J. Wales, J. Chem. Phys. 121, 1080 (2004)122. D.T. Gillespie, Markov Processes: An Introduction for Physical Scientists (Aca-

demic Press, Boston, 1992)123. M.P. Jacobson, D.L. Pincus, C.S. Rapp, T.J.F. Day, B. Honig, D.E. Shaw,

R.A. Friesner, Proteins: Struct. Funct. Bioinform. 55, 351 (2004)124. K. Zhu, D.L. Pincus, S. Zhao, R.A. Friesner, Proteins: Struct. Funct. Bioinform.

65, 438 (2006)125. J.A. Hartigan, M.A. Wong, Appl. Stat. 28, 100 (1979)126. A. Fiser, R.K.G. Do, A. Sali, Protein Sci. 9, 1753 (2000)127. Z.X. Xiang, C.S. Soto, B. Honig, Proc. Natl. Acad. Sci. USA 99, 7432 (2002)128. F.J. Blanco, G. Rivas, L. Serrano, Nat. Struc. Biol. 1, 584 (1994)129. F.J. Blanco, L. Serrano, Eur. J. Biochem. 230, 634 (1995)130. V. Munoz, E.R. Henry, J. Hofrichter, W.A. Eaton, Proc. Natl. Acad. Sci. USA

95, 5872 (1998)131. V. Munoz, L. Serrano, J. Mol. Biol. 245, 275 (1995)132. A. Bierzynski, P.S. Kim, R.L. Baldwin, Proc. Natl. Acad. Sci. USA 79, 2470

(1982)133. C. Mitchinson, R.L. Baldwin, Proteins: Struct. Funct. Genet. 1, 23 (1986)134. B.I. Dahiyat, S.L. Mayo, Science 278, 82 (1997)135. K. Zhu, M.R. Shirts, R.A. Friesner, J. Chem. Theory Comput. 3, 2108 (2007)136. P.I.W. de Bakker, M.A. DePristo, D.F. Burke, T.L. Blundell, Proteins: Struct.

Funct. Bioinform. 51, 21 (2003)137. M.A. DePristo, P.I.W. de Bakker, S.C. Lovell, T.L. Blundell, Proteins: Struct.

Funct. Bioinform. 51, 44 (2003)138. A. Ghosh, C.S. Rapp, R.A. Friesner, J. Phys. Chem. B 102, 10983 (1998)139. M. Andrec, D.A. Snyder, Z. Zhou, J. Young, G.T. Montelione, R.M. Levy,

Proteins: Struct. Funct. Bioinform. 69, 449 (2007)140. D.K. Klimov, D. Thirumalai, Proc. Natl. Acad. Sci. USA 97, 2544 (2000)141. V. Pande, D.S. Rokhsar, Proc. Natl. Acad. Sci. USA 96, 9062 (1999)142. B. Ma, R. Nussinov, J. Mol. Biol. 296, 1091 (2000)143. P.G. Bolhuis, Proc. Natl. Acad. Sci. USA 100, 12129 (2003)144. E. Gallicchio, M. Andrec, A.K. Felts, R.M. Levy, J. Phys. Chem. B 109, 6722

(2005)145. P.R.O. Montellano, Cytochrome P450: Structure, Mechanism and Biochem-

istry, 2nd edn. (Plenum Press, New York, 1995)146. V. Guallar, R.A. Friesner, J. Am. Chem. Soc. 126, 8501 (2004)147. D.C. Haines, D.R. Tomchick, M. Machius, J.A. Peterson, Biochemistry 40,

13456 (2001)148. P.A. Williams, J. Cosme, A. Ward, H.C. Angova, D.M. Vinkovic, H. Jhoti,

Nature 424, 464 (2003)149. P.A. Williams, J. Cosme, D.M. Vinkovic, A. Ward, H.C. Angove, P.J. Day,

C. Vonrhein, I.J. Tickle, H. Jhoti, Science 305, 683 (2004)


150. G.A. Schoch, J.K. Yano, M.R. Wester, K.J. Griffin, C.D. Stout, E.F. Johnson,J. Biol. Chem. 279, 9497 (2004)

151. T. Jovanovic, R. Farid, R.A. Friesner, A.E. McDermott, J. Am. Chem. Soc.127, 13548 (2005)


6

Functional Unfolded Proteins: How, When, Where, and Why?

H.J. Dyson, S.-C. Sue, and P.E. Wright

Abstract. Recent advances in the sequencing of whole genomes have given fascinating insights into the overall composition of the encoded proteins. Many of the amino acid sequences that have been deduced in this way are highly biased in composition and are predicted to be unfolded. A significant number of these sequences correspond to parts of functional proteins, and in a surprising number of cases, the unstructured regions correspond to the most relevant parts of the protein for function: the actual sites for the binding of activators, repressors, and other ligands. This is particularly true for proteins involved in signaling networks, that is, signal transduction, transcriptional activation, translation, and cell cycle regulation. The intrinsically disordered regions facilitate interactions with multiple binding partners and also provide a means for efficiently dissociating the complex after the signal has been transduced. This article briefly reviews some of the recent experimental evidence from our own and other labs, upon which these conclusions are based.

6.1 What is a Functional Unfolded Protein?

As long as biochemical studies were focused on the characterization of proteins purified from cells and tissues, it was inevitable that the proteins studied were well-behaved, folded, and of a recognizable structure. Classic biochemical separations, including salting-out, column chromatography of various kinds, and gel filtration, all relied on the presence of well-folded proteins. Those proteins that were incompletely folded were generally badly-behaved under these conditions, and were frequently discarded as refractory. We therefore built up a picture of the protein world where the members were in most cases folded into distinct globular states, which could be characterized by X-ray crystal structure analysis. Any unstructured regions of such proteins had to be removed or otherwise immobilized, sometimes by the packing in the crystals themselves. Order was thus equated with intact functional proteins.

With the advent of genetic methods in the 1990s, culminating in the sequencing of whole genomes, it became possible to map the function of proteins by altering genes. Refinement of these techniques now allows us to pinpoint the areas of a given protein that are vital for its function. It was at this stage that the puzzling widespread occurrence of proteins that were clearly unstructured but nevertheless functional was observed [1–6]. Such behavior had previously been observed for peptide hormones, rationalized as a case of specific folding upon binding to a specific receptor [7, 8]. However, it was not recognized until later that this phenomenon was not only operative within cells [9], but was widespread, particularly among the most important proteins in the control mechanisms of the cell. The recognition came almost simultaneously from experimental and theoretical studies. Several examples of functional unstructured proteins from cellular signal transduction pathways, cell cycle control and transcriptional activation were noted [10–15]. At the same time, scanning of published genome sequences showed that there were frequently long stretches of the coded amino acid sequences that could not, by any of the rules of normal globular protein structure, form folded three-dimensional structures in aqueous environments [16, 17]. These sequences contained, for example, repeated units of hydrophilic amino acids, or patterns of hydrophobic and hydrophilic amino acids that did not correspond to any known secondary structure. In addition, these sequences (up to 30% of protein sequences derived from published genomes) appeared to be disproportionately present in cancer-related genes [18]. Thus it appears that intrinsically unstructured proteins are involved in the most important processes that go on in the cell.
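The kind of sequence scan described above can be illustrated with a small sketch. The Python fragment below is not the procedure used in [16, 17]; it is a minimal charge–hydropathy heuristic, assuming the Kyte–Doolittle hydropathy scale and a linear boundary of the form reported by Uversky and co-workers. The function names and the default boundary coefficients are assumptions made for illustration and should be checked against the primary literature before any real use.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean hydropathy, with each residue value rescaled to the range 0..1."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute net charge per residue (K/R counted as +1, D/E as -1)."""
    charge = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    return abs(charge) / len(seq)


def looks_disordered(seq, slope=2.785, intercept=-1.151):
    """Flag a sequence as likely disordered when its net charge lies above
    the line charge = slope * hydropathy + intercept (assumed boundary)."""
    return mean_net_charge(seq) > slope * mean_hydropathy(seq) + intercept


if __name__ == "__main__":
    # Artificial test sequences, not taken from any genome.
    print(looks_disordered("K" * 10))  # highly charged, hydrophilic
    print(looks_disordered("I" * 10))  # uncharged, hydrophobic
```

A whole-sequence average is a crude measure; in practice one would more likely apply such a test in a sliding window so that disordered segments within otherwise folded proteins are not averaged away.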

6.2 Where do Functional Unfolded Proteins Occur?

Functional unfolded proteins, and unfolded domains of otherwise folded proteins, frequently occur among the most important cellular processes, including signal transduction [19, 20], transcriptional regulation [21–24], regulation of translation [25] and cell cycle regulation [10]. The biological function of unstructured protein domains frequently involves coupled folding and binding [26], and the various components of a complex may show different degrees of structure or lack of structure (Fig. 6.1).

6.3 How Are Functional Unfolded Proteins Studied?

Because an unfolded or partly folded protein consists of a conformational ensemble containing a wide range of different structures, it is impossible to obtain meaningful results from crystal structures; even if the molecule will form crystals, the resulting structure will not be representative of the ensemble in solution. It is necessary to obtain information on unfolded proteins in solution. Spectroscopic methods are therefore employed to give information on conformational preferences within the ensemble. These include circular dichroism, fluorescence, Raman and NMR spectroscopy. NMR gives a great deal of site-specific information, and is preferred whenever usable NMR spectra can be obtained.


Fig. 6.1. Schematic representation showing various types of disorder that may occur in proteins. Adapted by permission from [5] (Macmillan Publishers Ltd., copyright 2005)

6.4 NMR Spectra: Practical Considerations

Because the chemical environments of all of the nuclei in the polypeptide chain are very similar when the chain is disordered in aqueous solution, the NMR signals, which rely for their dispersion on small local differences in the environment, will be largely overlapped, although the resonances themselves may be quite narrow (Fig. 6.2). In the past this resulted in the study of unfolded proteins being, in most cases, abandoned. The use of 3D spectra and triple resonance experiments, as well as the availability of high-field NMR instruments, means that the assignment of resonances in the NMR spectra of unfolded proteins is no longer a deterrent to the study of these systems by NMR. Partly folded systems can be more problematic, since they frequently consist of a series of conformations in intermediate exchange, which causes broadening of the resonance lines. The NMR spectra of unfolded proteins are assigned mainly using the intrinsic resonance dispersion of the backbone 15N and 13CO resonances, which are highly sequence-dependent [27].

Other problems arise if the complex that is formed is of high molecular weight. In this case the T2 relaxation time, which shortens as the molecular weight increases, causes broadening of the resonances, although this problem can be overcome by the use of relaxation-optimized (TROSY) techniques.
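The link between molecular weight and linewidth can be made explicit with two standard textbook relations, given here as a sketch rather than as part of the original analysis. The protein is approximated as a rigid sphere; the hydrodynamic radius r_h and the solvent viscosity η are symbols introduced here for illustration.

```latex
\Delta\nu_{1/2} = \frac{1}{\pi T_2},
\qquad
\tau_c = \frac{4\pi \eta\, r_h^{3}}{3 k_B T}
```

Here Δν_{1/2} is the width at half height of a Lorentzian line and τ_c is the rotational correlation time. Because r_h³ scales roughly with molecular weight, and because 1/T2 grows approximately in proportion to τ_c for slowly tumbling macromolecules, the linewidth increases roughly linearly with the size of the complex.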


Fig. 6.2. 1H–15N HSQC spectra of folded apomyoglobin at pH 6 (left) and unfolded apomyoglobin at pH 2 (right). Note the wide dispersion in the 1H dimension in the left spectrum, and the narrow dispersion on the right. Also, the cross peaks are broader in the left spectrum, due to isotropic tumbling of the folded, globular protein. The cross peaks are narrower in the right spectrum due to rapid segmental motion of the unfolded polypeptide chain

6.5 Dynamic Complexes in CBP

Our group has been particularly interested in the transcriptional activator CBP and its partners, which show a wide range of different modes of interaction of unstructured and partly folded proteins (Fig. 6.3). The first CBP system where an unstructured component was identified was the KIX domain and its partner pKID, the phosphorylated kinase-inducible domain of CREB [13, 14]. KIX is a folded domain, but pKID is unstructured in solution, becoming folded into a pair of helices when bound to the KIX domain (Fig. 6.4). The mechanism of the coupled folding and binding process for the pKID–KIX system has recently been elucidated by NMR, utilizing HSQC titrations and relaxation dispersion measurements [28]. These results are described in more detail in Chap. 1 (Wright).

A particularly intriguing example occurs in the complex of the interaction domain of ACTR and the nuclear coactivator binding domain (NCBD) of CBP. CD spectra show that although neither of the free proteins is cooperatively folded, the complex is folded and stable. The 3D structure of the complex [23] demonstrates one of the rationales for the existence of intrinsically unstructured proteins: the surface area of contact between the two proteins (Fig. 6.5) is much larger than could be expected from the interaction of folded proteins of comparable size, as has been pointed out [29].

Another functional application of intrinsically unstructured proteins is illustrated by the complex between the TAZ1 domain of CBP and the interaction domain of the hypoxia-inducible factor, HIF-1α. Like the KIX–pKID complex, the TAZ1–HIF-1α complex involves the folding of an unstructured partner (in this case HIF-1α) onto a folded domain (in this case TAZ1). The 3D structure of the complex [21] not only shows the extensive surface area of contact seen for other such complexes, but also illustrates the operation of a biological switch, another major rationale for the participation of unstructured proteins in systems such as this. The TAZ1–HIF-1α interaction is controlled by the presence or absence of a hydroxyl group on a particular asparagine residue, Asn803, in HIF-1α. The enzyme that accomplishes this hydroxylation reaction, termed FIH, binds the sequence containing Asn803 as part of a β-strand, according to the crystal structure [30] (Fig. 6.6a), but the same sequence is present in a well-formed helix in the NMR structure of the TAZ1–HIF-1α complex (Fig. 6.6b). That is, the same sequence can take up two functionally important, quite different, structures, as a consequence of its conformational freedom as an intrinsically unstructured protein in the uncomplexed state.

Fig. 6.3. Schematic representation of the domain structure of human CREB-binding protein CBP. Folded domains are shown as spheres. Adapted by permission from [5] (Macmillan Publishers Ltd., copyright 2005)

Fig. 6.4. Illustration of the unfolded nature of the phosphorylated kinase-inducible domain (pKID) of CREB (left) and its conformation after folding upon binding to the KIX domain of CBP (right). The mechanism of this process has recently been elucidated by NMR [28] and is discussed more fully in Chap. 1 (Wright). Adapted by permission from [5] (Macmillan Publishers Ltd., copyright 2005)


Fig. 6.5. Illustration of the extensive binding surface of the ACTR domain on the NCBD of CBP. The right-hand structure is obtained by rotation of the left-hand structure in the manner indicated by the arrow. The backbone and side chains of ACTR are indicated by a wire, while the CBP is represented by a van der Waals surface. Adapted by permission from [23] (Macmillan Publishers Ltd., copyright 2002)

Fig. 6.6. Conformations of the HIF-1α sequence containing the regulatory asparagine 803 that is hydroxylated under normoxic conditions. (a) Extended conformation in the X-ray crystal structure of the complex with the hydroxylating enzyme FIH [30]. (b) Helical conformation in the NMR structure of the complex with the TAZ1 domain of CBP [21]

6.6 Role of Flexibility in the Function of IκBα

One of the major roles of intrinsically unstructured proteins and domains, as well as partially folded domains and domains that undergo significant internal motion, is in cellular signaling. The dynamic nature of such systems makes them well-suited to the reception, transduction, and eventual turning-off of cellular signals. Indeed, the cessation of the response upon removal of the signal is a vital part of the process, and is frequently accomplished by integration of signaling pathways with the proteolytic destruction of the intermediary molecules, many of which are partly or completely unstructured.

The interaction of NF-κB with IκB provides a wealth of examples of several different kinds of order–disorder processes. This work was started in our lab as a collaboration with Dr. E.A. Komives at the University of California, San Diego. Nuclear factor-kappaB (NF-κB) is a dimeric transcription factor widely employed for the transcription of stress-response genes, as it binds to κB upstream enhancer DNA sequences, where it recruits the transcriptional activator CBP. In an unstressed cell, the majority of the NF-κB resides in the cytoplasm, in complex with the inhibitor of NF-κB (IκB). Response to stress involves phosphorylation and ubiquitination of IκB and its subsequent degradation by the proteasome. The free NF-κB is transported to the nucleus, where it binds to the κB enhancer sequences and mediates the transcription of genes that include that of IκB, which acts subsequently to remove NF-κB from the DNA and return it to the cytoplasm as the NF-κB–IκB complex.

A number of X-ray crystal structures of complexes of NF-κB have illustrated the interactions that occur with DNA and with IκBα [31–33] (Fig. 6.7). The most common form of NF-κB consists of a heterodimer of two proteins, p65 and p50, which each consist of two immunoglobulin-like domains, together with various linker sequences that are unstructured in solution. Figure 6.7a shows that the N-terminal domains of each of the two molecules form the major sites of DNA binding, while Fig. 6.7b shows that the interaction with IκBα occurs with the two dimerization domains. IκBα is seen from Fig. 6.7b to consist of an ankyrin-repeat (ANK) structure containing six ankyrin repeats.

Fig. 6.7. (a) X-ray crystal structure of the complex of the p50/p65 heterodimer of NF-κB with the cognate DNA sequence [32]. Adapted by permission from [32] (Macmillan Publishers Ltd., copyright 1998). (b) X-ray crystal structure of the complex of the p50/p65 heterodimer of NF-κB with IκBα [31]. Adapted by permission from [31] (Elsevier, copyright 1998)

As well as binding to the dimerization domains of p65 and p50, IκBα appears to form a cooperative interaction with the nuclear localization signal (NLS) of p65: this observation was used to form a hypothesis about the mechanism of inhibition by IκBα. By this hypothesis, NF-κB binds DNA in an open conformation (Fig. 6.7a), but when IκBα binds NF-κB, the N-terminal domain of p65 rotates into the DNA-binding site (the N-terminal domain of p50 is missing in the X-ray structure). Binding of the p65 NLS causes the complex to remain largely in the cytoplasm. Upon activation, IκBα is removed and targeted for degradation, releasing the NLS, which allows NF-κB to be transported to the nucleus for gene activation.

This picture does not give the complete story. The interaction of NF-κB and IκBα is mediated and orchestrated by changes in flexibility and motion in both molecules. Parts of IκBα are highly fluxional in the free protein, and different parts appear to be fluxional in the complex, which may be functionally relevant. Initial evidence for the fluxional nature of IκBα came from H/D exchange monitored by mass spectrometry [34, 35]. These studies demonstrated that while amide protons in the first four ankyrin repeats remained protected both in the free protein and when bound to NF-κB, repeats 5 and 6 were highly exchanged in the free protein but not in the complex. Figure 6.7b shows that all of the ankyrin repeats are equally well-structured in the complex, and ankyrin repeats are normally highly stable proteins. Further circumstantial evidence for motion or heterogeneity in IκBα comes from the inability of the protein to form crystals in the free state. Although repeat 6 has a lower similarity to the consensus ankyrin repeat sequence, neither repeat 5 nor repeat 6 appears more likely than repeats 1–4 to form a stable structure. We decided to apply NMR to the problem, to quantitate structural and dynamic differences both between individual repeats and between free and bound IκBα.
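Amide H/D exchange data of this kind are commonly summarized as protection factors. The sketch below shows the standard definition and, under the assumption that exchange follows the EX2 limit, the corresponding apparent free energy of opening. The function names and the rate constants in the usage lines are hypothetical and are not taken from the IκBα studies.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def protection_factor(k_int, k_obs):
    """Ratio of the intrinsic (unprotected) exchange rate to the observed rate."""
    return k_int / k_obs


def opening_free_energy(k_int, k_obs, temperature=298.15):
    """Apparent free energy of opening in kJ/mol, assuming EX2 exchange:
    dG = R * T * ln(P)."""
    return R_KJ * temperature * math.log(protection_factor(k_int, k_obs))


if __name__ == "__main__":
    # Hypothetical rates in s^-1, chosen only to illustrate the arithmetic.
    print(protection_factor(10.0, 0.01))
    print(round(opening_free_energy(10.0, 0.01), 2))
```

A well-protected amide, such as those in repeats 1 to 4, would have a large protection factor, whereas an amide in a fluxional region would have a value close to 1.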

The initial spectra of a construct of IκBα containing repeats 1–6 showed that only some parts of the protein give rise to observable cross peaks. We were able to show that, consistent with the mass spectrometry H/D results, the signals that were observed arose from repeats 1–4, which are well-structured in the free protein. The remaining signals are badly broadened, and some are completely missing, indicating that there is conformational exchange within repeats 5 and 6, probably on an intermediate time scale.
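What "intermediate time scale" means can be illustrated with the usual two-site exchange picture, in which the exchange rate is compared with the chemical shift difference between the two sites. The following sketch is a generic textbook classification, not an analysis of the IκBα data; the function names, the factor-of-ten margin and the numerical inputs are assumptions chosen for illustration.

```python
import math


def exchange_regime(k_ex, delta_nu_hz, margin=10.0):
    """Classify two-site exchange by comparing k_ex (s^-1) with the
    shift difference expressed as an angular frequency (rad/s)."""
    delta_omega = 2.0 * math.pi * delta_nu_hz
    ratio = k_ex / delta_omega
    if ratio >= margin:
        return "fast"          # one averaged, relatively sharp peak
    if ratio <= 1.0 / margin:
        return "slow"          # two separate peaks
    return "intermediate"      # severe broadening, peaks may vanish


def fast_limit_broadening(k_ex, delta_nu_hz, p_a):
    """Exchange contribution to R2 (s^-1) in the fast-exchange limit:
    R_ex = p_a * p_b * delta_omega**2 / k_ex."""
    delta_omega = 2.0 * math.pi * delta_nu_hz
    return p_a * (1.0 - p_a) * delta_omega ** 2 / k_ex


if __name__ == "__main__":
    # Hypothetical values: 100 Hz shift difference between the two sites.
    for k_ex in (10.0, 1000.0, 1.0e5):
        print(k_ex, exchange_regime(k_ex, 100.0))
```

Resonances that fall in the intermediate regime are the ones most likely to be broadened beyond detection, which is the interpretation given above for repeats 5 and 6.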

Fig. 6.8. Schematic diagram showing the assignment strategy for the 94 kDa complex of p50/p65 with IκBα(67–287) based on transfer of assignments from smaller proteins and complexes. The top row shows putative structures of the complexes, modeled from X-ray crystal structures [Jacobs and Harrison [31] for (a–d) and Chen et al. [32] for (d)]. The approximate position of the flexible PEST sequence is indicated by a dotted line. The bottom row shows the 600 MHz 1H–15N HSQC spectra (a, b) or 900 MHz TROSY-type HSQCs (c, d) for each IκBα fragment. (a) [2H, 15N, 13C]-labeled IκBα(67–206) (15 kDa), (b) [2H, 15N, 13C]-IκBα(67–206) in complex with p65 NLS (residues 289–321 of human p65) (19 kDa), (c) [2H, 15N, 13C]-IκBα(67–287) in complex with the heterodimer of the C-terminal dimerization domains of p50 and p65 (52 kDa), (d) [2H, 15N, 13C]-IκBα(67–287) in complex with p50/p65

This circumstance made our stated aim of comparing dynamic behavior of the free protein with the bound protein more difficult. We developed a streamlined production method that takes advantage of the differential expression levels of p50 and p65 in the E. coli expression system. Using this method, we were able to produce differentially labeled complexes of IκBα and NF-κB, and were able to complete the resonance assignments of IκBα even in very large complexes containing both the dimerization and DNA-binding domains of both p65 and p50, as well as IκBα. Since very large complexes cause difficulty in resonance assignment, we transferred assignments from smaller complexes to larger ones, as described for other large systems [36, 37]. The process is shown in Fig. 6.8.
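One simple way to think about transferring assignments between spectra is nearest-neighbor matching of cross peaks. The sketch below is not the procedure of [36, 37] or of the authors; it is a hypothetical illustration in which each assigned 1H–15N peak from a smaller complex is matched to the closest peak in the spectrum of a larger complex, with the 15N difference scaled down by an assumed factor. The function name, the scaling factor, the tolerance and the peak positions are all invented for the example.

```python
import math


def transfer_assignments(reference, target_peaks, n_scale=0.15, tol=0.05):
    """Match assigned reference peaks to unassigned target peaks.

    reference:    dict mapping a residue label to (1H ppm, 15N ppm)
    target_peaks: list of (1H ppm, 15N ppm) tuples from the larger complex
    Returns a dict of labels whose nearest target peak lies within `tol`
    (combined, scaled ppm distance) and is claimed by no other label.
    """
    claimed = {}  # target peak -> labels that picked it as nearest
    for label, (h_ref, n_ref) in reference.items():
        best, best_d = None, tol
        for peak in target_peaks:
            h, n = peak
            d = math.hypot(h - h_ref, n_scale * (n - n_ref))
            if d <= best_d:
                best, best_d = peak, d
        if best is not None:
            claimed.setdefault(best, []).append(label)
    # Keep only unambiguous matches; contested peaks need manual inspection.
    return {labels[0]: peak for peak, labels in claimed.items()
            if len(labels) == 1}


if __name__ == "__main__":
    # Synthetic peak positions, for illustration only.
    small = {"A": (8.20, 120.0), "B": (7.50, 115.0), "C": (9.00, 130.0)}
    large = [(8.22, 120.1), (7.51, 114.9)]
    matched = transfer_assignments(small, large)
    print(matched)
    print(sorted(set(small) - set(matched)))  # labels with no counterpart
```

Labels left without a counterpart are either shifted substantially by complex formation or broadened beyond detection, and in a real study they would be checked with triple resonance data rather than accepted or rejected automatically.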

Problems remain: the assignments of both free and bound IκBα are far from complete, since a significant number of resonances are missing from both sets of NMR spectra, mainly in repeats 5 and 6 of the free protein and in repeat 3 of the bound protein. However, if we infer that these resonances are missing due to a dynamic process, we can use this information to build up a picture of the dynamics of IκBα in the presence and absence of NF-κB. Figure 6.9 shows the backbone nitrogens of the missing resonances mapped onto the backbone of IκBα in the NF-κB complex (there is no structural information on the free form of IκBα). Missing resonances abound in repeats 5 and 6 of the free protein, but this appears to be the best-structured region in the bound protein. Surprisingly, the flexibility measured by missing resonances appears to be enhanced in repeat 3 of the bound protein, compared to the free protein. We can therefore classify IκBα into four regions according to their dynamic characteristics (Fig. 6.10).

Fig. 6.9. Mapping of missing resonances onto IκBα. Left: Representation of the ankyrin repeat structure of IκBα derived from the X-ray structure (no direct structural information is available for free IκBα), showing missing resonances mainly in ANK5 and 6. Right: Structure of the ankyrin repeat region of IκBα from the X-ray structure [31] showing missing resonances in ANK3

Region 1, comprising the majority of ankyrin repeats 1 and 2, appears to be well-folded and stable both in the free protein and in the complexes with NF-κB, as shown by the presence of most resonances in the NMR spectra, the high protection factors in the H/D exchange experiments and the uniform values of the 1H–15N NOE and other relaxation measurements. According to the crystal structure of the complex, Region 1 makes intimate contact with the NLS of p65. We know from a comparison of the dispersion of the NMR spectra of the NLS when free or bound to IκBα that these 20 residues of p65 are unstructured in solution in the free state, but become well-structured in the complex. Thus, Region 1 of IκBα provides a structured scaffold upon which the intrinsically unstructured NLS can bind in a specific manner.


Fig. 6.10. Regions of IκBα with different dynamic properties in the free and complexed states (see text). Region 1 consists of parts of ANK1 and 2, and appears rigid in both free and complexed states. Region 2, consisting of ANK3 and part of ANK4, is more rigid in the free state than in the complexed state. Region 3 consists of ANK5 and part of ANK6, which are more rigid in the complexed state than in the free state. Region 4 consists of the C-terminal portion of ANK6 and the PEST sequence; this region is flexible in both free and complexed states. Note that the C-terminal helix of ANK4 (marked with asterisk) is not included in any of these regions, as it appears to be rigid in both free and complexed IκBα. Adapted by permission from Sue et al. [38]

Region 2 comprises much of ankyrin repeat 3 and the N-terminal part of repeat 4. This region shows some enigmatic properties. According to the NMR H/D exchange measurements, this region is well-structured in the free state, but is destabilized to exchange in the complex, consistent with the loss or broadening of many of the resonances in repeat 3 upon complex formation (Fig. 6.9). This region of IκBα spans the interval between repeats 1 and 2, bound to the NLS in the complex, and repeats 5–6, which are bound to the bulk of the dimerization domains of p50 and p65 in the complex. Thus, we may expect that Region 2 might undergo some intermediate time-scale exchange processes concomitant with segmental motion of the two ends of the complex.

Region 3 consists largely of repeat 5 and the N-terminal part of repeat 6. This region is distinguished by segmental motion on an intermediate time scale in the free state, such that many of the resonances are completely missing and others are severely broadened. Yet upon complexation, this region becomes well structured, with high protection factors and well-dispersed resonances. Clearly in this case there is a transition from less-structured to more-structured upon complex formation. Region 3 shows the classic folding-upon-binding behavior that is frequently observed for intrinsically unstructured domains [26].

Finally, Region 4 remains unstructured in both the free and complexed IκB protein. This region, containing the C-terminal portion of ankyrin repeat 6 and the PEST sequence, might well undergo conformational transitions to a more structured form in other contexts. For example, this region is thought to be involved in the removal of NF-κB from the DNA after the signal is no longer needed [33].

Given the wide variety in the behavior of structurally similar ankyrin repeats in IκBα, it is interesting to speculate about the possible reasons. Part of the activation of NF-κB in the cytoplasm in response to signaling involves the dissociation and degradation of IκBα. The presence of a rather mobile, solvent-accessible region, such as is seen for Region 2 in the complex, might predispose the complex to dissociation, perhaps in the presence of an accessory factor associated with the phosphorylation and ubiquitylation processes that ultimately decide the fate of the IκBα molecule. From a thermodynamic standpoint, the loss of conformational entropy that accompanies the formation of the stable and rigid structure of Region 3 in the complex from a relatively flexible form in the free state would require a considerable enthalpic term for the complex to be formed. However, this complex must be readily dissociated in response to a signal, so the complex cannot be too stable. A compromise position may be to transfer some of the entropy loss from repeats 5 and 6, as the complex is formed, to repeat 3, thus lowering the requirement for a large enthalpy term.
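The thermodynamic argument can be written out schematically. The decomposition below is a sketch under the assumption that the conformational entropy changes of Region 2 and Region 3 can be treated as separate additive terms; the superscripted symbols are introduced here for illustration and do not appear in the original analysis.

```latex
\Delta G_{\mathrm{bind}} = \Delta H - T\,\Delta S = RT \ln K_d ,
\qquad
\Delta S \approx \Delta S_{\mathrm{conf}}^{(2)} + \Delta S_{\mathrm{conf}}^{(3)} + \Delta S_{\mathrm{other}}
```

Here K_d is expressed relative to a 1 M standard state. Folding of Region 3 upon binding gives a negative ΔS_conf^(3), which makes the term −TΔS_conf^(3) unfavorable. If Region 2 becomes more mobile in the complex, ΔS_conf^(2) is positive and offsets part of that penalty, so that a given ΔG_bind can be reached with a less negative ΔH.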

The NF-κB–IκBα system provides examples of many different types of unfolded protein interactions, which are unified into a delicately balanced set of interactions that enable NF-κB to be rapidly deployed in response to cellular signaling. However, the means by which nuclear IκBα dissociates NF-κB from the κB site on the DNA after its job is done is not at all clear from structural studies, and remains an intriguing challenge for future spectroscopic studies.

Acknowledgments

We thank Elizabeth Komives, Stephanie Truhlar, Carla Cervantes, Gourisankar Ghosh, Maria Yamout, and Gerard Kroon for helpful discussions. This work was supported by grant GM71862 from the National Institutes of Health.

References

1. P.E. Wright, H.J. Dyson, J. Mol. Biol. 293, 321 (1999)
2. V.N. Uversky, Protein Sci. 11, 739 (2002)
3. A.K. Dunker, C.J. Brown, Adv. Protein Chem. 62, 25 (2002)
4. P. Tompa, Trends Biochem. Sci. 27, 527 (2002)
5. H.J. Dyson, P.E. Wright, Nat. Rev. Mol. Cell Biol. 6, 197 (2005)
6. R.B. Russell, T.J. Gibson, FEBS Lett. 582, 1271 (2008)
7. C. Bosch, A. Bundi, M. Oppliger, K. Wuthrich, Eur. J. Biochem. 91, 209 (1978)
8. X. He, D. Chow, M.M. Martick, K.C. Garcia, Science 293, 1657 (2001)
9. A.J. Daniels, R.J.P. Williams, P.E. Wright, Neuroscience 3, 573 (1978)
10. R.W. Kriwacki, L. Hengst, L. Tennant, S.I. Reed, P.E. Wright, Proc. Natl. Acad. Sci. USA 93, 11504 (1996)
11. G.W. Daughdrill, M.S. Chadsey, J.E. Karlinsey, K.T. Hughes, F.W. Dahlquist, Nat. Struct. Biol. 4, 285 (1997)
12. G.W. Daughdrill, L.J. Hanely, F.W. Dahlquist, Biochemistry 37, 1076 (1998)
13. I. Radhakrishnan, G.C. Perez-Alvarado, D. Parker, H.J. Dyson, M.R. Montminy, P.E. Wright, Cell 91, 741 (1997)
14. I. Radhakrishnan, G.C. Perez-Alvarado, H.J. Dyson, P.E. Wright, FEBS Lett. 430, 317 (1998)
15. D. Liu, R. Ishima, K.I. Tong, S. Bagby, T. Kokubo, D.R. Muhandiram, L.E. Kay, Y. Nakatani, M. Ikura, Cell 94, 573 (1998)
16. P. Romero, Z. Obradovic, C.R. Kissinger, J.E. Villafranca, E. Garner, S. Guilliot, A.K. Dunker, Pac. Symp. Biocomput. 3, 437 (1998)
17. P. Romero, Z. Obradovic, C.R. Kissinger, J.E. Villafranca, A.K. Dunker, Proc. IEEE Int. Conf. Neural Networks 1997, 90 (1997)
18. L.M. Iakoucheva, C.J. Brown, J.D. Lawson, Z. Obradovic, A.K. Dunker, J. Mol. Biol. 323, 573 (2002)
19. N. Abdul-Manan, B. Aghazadeh, G.A. Liu, A. Majumdar, O. Ouerfelli, K.A. Siminovitch, M.K. Rosen, Nature 399, 379 (1999)
20. A.H. Huber, D.B. Stewart, D.V. Laurents, W.J. Nelson, W.I. Weis, J. Biol. Chem. 276, 12301 (2001)
21. S.A. Dames, M. Martinez-Yamout, R.N. De Guzman, H.J. Dyson, P.E. Wright, Proc. Natl. Acad. Sci. USA 99, 5271 (2002)
22. R.N. De Guzman, M. Martinez-Yamout, H.J. Dyson, P.E. Wright, J. Biol. Chem. 279, 3042 (2004)
23. S.J. Demarest, M. Martinez-Yamout, J. Chung, H. Chen, W. Xu, H.J. Dyson, R.M. Evans, P.E. Wright, Nature 415, 549 (2002)
24. N.K. Goto, T. Zor, M. Martinez-Yamout, H.J. Dyson, P.E. Wright, J. Biol. Chem. 277, 43168 (2002)
25. P.E. Hershey, S.M. McWhirter, J.D. Gross, G. Wagner, T. Alber, A.B. Sachs, J. Biol. Chem. 274, 21297 (1999)
26. H.J. Dyson, P.E. Wright, Curr. Opin. Struct. Biol. 12, 54 (2002)
27. J. Yao, H.J. Dyson, P.E. Wright, FEBS Lett. 419, 285 (1997)
28. K. Sugase, H.J. Dyson, P.E. Wright, Nature 447, 1021 (2007)
29. K. Gunasekaran, C.J. Tsai, S. Kumar, D. Zanuy, R. Nussinov, Trends Biochem. Sci. 28, 81 (2003)
30. J.M. Elkins, K.S. Hewitson, L.A. McNeill, J.F. Seibel, I. Schlemminger, C.W. Pugh, P.J. Ratcliffe, C.J. Schofield, J. Biol. Chem. 278, 1802 (2003)
31. M.D. Jacobs, S.C. Harrison, Cell 95, 749 (1998)
32. F.E. Chen, D.B. Huang, Y.Q. Chen, G. Ghosh, Nature 391, 410 (1998)
33. T. Huxford, D.B. Huang, S. Malek, G. Ghosh, Cell 95, 759 (1998)
34. C.H. Croy, S. Bergqvist, T. Huxford, G. Ghosh, E.A. Komives, Protein Sci. 13, 1767 (2004)
35. S.M. Truhlar, J.W. Torpey, E.A. Komives, Proc. Natl. Acad. Sci. USA 103, 18951 (2006)
36. J. Fiaux, E.B. Bertelsen, A.L. Horwich, K. Wuthrich, Nature 418, 207 (2002)
37. R. Sprangers, L.E. Kay, Nature 445, 618 (2007)
38. Sue et al., J. Mol. Biol. 380, 917 (2008)


7

Structure of the Photointermediate of Photoactive Yellow Protein and the Propagation Mechanism of Structural Change

M. Kataoka and H. Kamikubo

Abstract. To understand the molecular mechanism of a protein function, it is important to reveal the conformational change of the protein during functioning. Time-resolved X-ray crystallography has been used for this purpose and has revealed local structural changes after triggering. However, global conformational changes, which are demonstrated by solution studies with various spectroscopic measurements, are generally difficult to observe through time-resolved crystallography. Furthermore, the structural properties of folding intermediates cannot be revealed by crystallography. Solution X-ray scattering (SOXS) is a powerful technique for studying the solution structure of a protein and its changes. We describe the solution structure analysis of the photointermediate of a light-absorbing protein by high-angle solution X-ray scattering.

7.1 Solution X-ray Scattering

Solution X-ray scattering (SOXS) experiments in the small-angle region (SAXS) give the overall structural parameters of a protein, such as the radius of gyration, the maximum dimension of the particle, and the molecular shape, under various physiological conditions [1,2]. Low-resolution structural models can be constructed from the SAXS profile without any assumptions. This so-called ab initio shape prediction [3,4] is widely used to characterize protein structures under physiological conditions [5,6]. On the other hand, high-angle profiles contain information about secondary structure packing and tertiary folds [7–11]. It has also been suggested that high-angle scattering is sensitive to subtle structural changes [8]. Furthermore, high-angle scattering is quite useful for characterizing the structure of folding intermediates [12–14] as well as the protein folding process [15]. Although some theoretical treatments have been proposed to analyze high-angle scattering [7,8], no successful application to derive structural information on real proteins has been reported. This is mainly due to the difficulty of observing high-angle scattering profiles with high accuracy. When we observed high-angle scattering of a hemoglobin solution with the second-generation synchrotron, Photon Factory, it required a fairly high concentration (100 mg ml−1) and a long exposure time (10 min) [9]. The detector used was a one-dimensional position-sensitive proportional counter. However, recent improvements in two-dimensional X-ray detectors and the availability of third-generation synchrotron radiation sources have improved the quality of X-ray solution scattering profiles even in the higher-angle region, with momentum transfer (Q) values up to 6 Å−1. We can now observe high-angle scattering of photoactive yellow protein (PYP) at 5 mg ml−1 with a 1-min exposure. Quantitative analysis of high-angle scattering is now required. A promising method would be the combination of molecular dynamics simulation and high-angle solution scattering [8]. Here we describe the structural change of PYP upon light absorption as revealed by high-angle scattering combined with fluctuation analysis [16].
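As an illustration of how a small-angle parameter such as the radius of gyration is commonly extracted, the following minimal sketch fits the standard Guinier approximation, I(Q) ≈ I(0) exp(−Q²Rg²/3), to synthetic data. The function name, the Q range, and the 15 Å test particle are assumptions made for this example and are not taken from the chapter.

```python
import numpy as np

def guinier_fit(q, intensity):
    """Fit ln I(Q) = ln I(0) - (Rg^2 / 3) Q^2 and return (Rg, I0)."""
    slope, intercept = np.polyfit(q ** 2, np.log(intensity), 1)
    rg = np.sqrt(-3.0 * slope)
    return rg, np.exp(intercept)

# Synthetic, noise-free data for a hypothetical particle with Rg = 15 Å.
rg_true, i0_true = 15.0, 1.0
q = np.linspace(0.01, 0.08, 30)  # Å^-1, chosen so that Q * Rg <= 1.2
intensity = i0_true * np.exp(-(q * rg_true) ** 2 / 3.0)

rg_fit, i0_fit = guinier_fit(q, intensity)
print(f"Rg = {rg_fit:.2f} Å, I(0) = {i0_fit:.3f}")
```

With real data the fit would be restricted to the lowest-Q points, where the Guinier approximation is expected to hold, and the result would be sensitive to that choice.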

7.2 Photoactive Yellow Protein

Photoactive yellow protein (PYP) is a putative photoreceptor for negative phototaxis in the purple phototrophic bacterium Halorhodospira halophila [17,18]. PYP is a prototype of the PAS domain, which is conserved in various proteins involved in signal transduction [19]. The crystal structure revealed that PYP is composed of four segments, namely, an N-terminal cap (residues 1–28), a PAS core (residues 29–69), a helical connector (residues 70–87), and a β scaffold (residues 88–125) [19,20]. We refer to the latter three segments as the chromophore-binding region. Absorption of a photon by the chromophore, p-coumaric acid, triggers the isomerization of the chromophore [21] and the subsequent thermal reaction cycle [22–26]. The blue-shifted reaction intermediate PYPM, which has also been referred to as I2 or pB and forms over a timescale of ∼100 μs, is assumed to be the active state. Although the target molecule of PYP has not been identified, structural information about PYPM is crucial for understanding the molecular mechanism of PYP-dependent photosignal transduction. According to time-resolved crystallography, the structural changes in PYPM are confined to the area near the chromophore [27,28]. A large change is observed only for R52, which is located inside the protein in the ground state but exposed to solvent in PYPM. On the other hand, substantial conformational changes in the protein moiety of PYPM in solution have been reported [29–39].

An interesting aspect of the photoreaction of PYP is its similarity to the protein folding/unfolding reaction. Hellingwerf and his coworkers applied transition state theory to the photoreaction of PYP and estimated the thermodynamic parameters of activation: the entropy, enthalpy, and heat capacity changes [29]. They also carried out a thermodynamic analysis of the thermal denaturation of PYP. Consequently, they found that the heat capacity changes in the photoreaction are comparable to those in the unfolding reaction. We performed urea denaturation experiments on PYP and PYPM [30]. PYPM is more sensitive to urea than PYP. The free energy change upon denaturation is estimated as 11.0–11.5 kcal mol−1 for PYP and 7.6–7.8 kcal mol−1 for PYPM. Taking into account the fact that the isomeric state of the chromophore in the denatured state of PYPM is different from that of PYP, the free energy difference in the protein moiety between PYP and PYPM is estimated to be 6.5–11.5 kcal mol−1, which is comparable to the difference between the native state and the molten globule state in soluble proteins [30]. We concluded that PYPM has the properties of a partially unfolded state. We observed a significant change in the diffusion constant upon formation of PYPM by the transient grating method in collaboration with Terazima [31,32]. The diffusion constant change is well explained by the unfolding of the α-helical moiety in the N-terminal region.

Most of the HSQC peaks assigned to the N-terminal region disappear upon the formation of PYPM [33,34]. The loss of α-helical content is also observed by CD [35]. However, a conflicting conclusion was obtained from the fragmentation and H/D exchange mass spectroscopic analysis [36]. Therefore, detailed structural information about PYPM in solution is required to clarify the mechanism underlying the phototransduction.

There are two ways to study the structure of a short-lived photointermediate: kinetic measurement and static measurement. For high-angle X-ray scattering, static measurement is preferable, because the analysis of kinetic data depends on the kinetic model. Chymotrypsin cleaves PYP at the C-terminal sides of the 6th, 15th, and 23rd residues [40]; the resulting fragments are hereafter called T6 (residues 7–125), T15 (residues 16–125), and T23 (residues 24–125), respectively. The absorption spectrum of each truncated PYP is identical to that of intact PYP, indicating that the structure of the chromophore-binding region is not perturbed by the truncations [40]. The lifetimes of PYPM for T6, T15, and T23 are 30, 300, and 600 s, respectively, whereas the lifetime of PYPM of intact PYP is only 0.3 s. Therefore, these truncated forms are suitable for the structure analysis of the M intermediate. The crystal structure of PYP and the truncated parts are shown in Fig. 7.1.
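One way to see why the longer lifetimes favor static measurement is a two-state steady-state estimate. The sketch below assumes a simple dark ⇌ M scheme with a first-order light-driven excitation rate and a thermal decay rate of 1/lifetime. Both the scheme and the excitation rate `k_ex` are hypothetical choices for illustration. Only the lifetimes come from the text.

```python
def steady_state_fraction_m(lifetime_s, excitation_rate_per_s):
    """Steady-state fraction in the M state for dark <-> M with first-order rates."""
    decay_rate = 1.0 / lifetime_s
    return excitation_rate_per_s / (excitation_rate_per_s + decay_rate)

# Lifetimes quoted in the text (s); the excitation rate is an assumed value.
lifetimes = {"intact": 0.3, "T6": 30.0, "T15": 300.0, "T23": 600.0}
k_ex = 0.5  # s^-1, hypothetical, for illustration only

fractions = {name: steady_state_fraction_m(tau, k_ex) for name, tau in lifetimes.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f}")
```

Under this assumed excitation rate, the truncated forms accumulate almost entirely in the M state while intact PYP does not. The actual populations depend on the real illumination conditions.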

7.3 Solution Structure Analysis of Photointermediate of PYP

7.3.1 High-Angle X-ray Scattering of PYP in the Dark and in the Light

The N-terminal deletions of PYP may affect the scattering profile. Figure 7.2 (left) shows the experimentally observed scattering profiles of intact PYP and three truncated variants (T6, T15, and T23). The profile of intact PYP has two broad peaks at Q = 0.35 and 0.55 Å−1, with a valley around Q = 0.41 Å−1. In T6, the intensity of the peak at the lower Q value increases, while the peak position shifts toward a higher Q value. At the same time, the intensity of the peak at the higher Q value decreases and the valley shifts toward a higher Q value. On the other hand, both T15 and T23 resulted in similar scattering profiles with a single maximum around Q = 0.39 Å−1. These characteristic profiles indicate that the scattering profile in this Q region reflects intramolecular interference.

Fig. 7.1. Crystal structure of PYP and the truncated positions produced by chymotrypsin treatment

Fig. 7.2. High-angle scattering profiles of wild type PYP, T6, T15, and T23 measured in solution (left), and calculated from the respective atomic structural models (right) [16]

The crystal structure of PYP (Fig. 7.1) explains the experimentally observed profiles satisfactorily. The theoretical profile of intact PYP has two broad peaks at the same positions as those observed in the experimentally obtained curve (Fig. 7.2 right). The theoretical profiles for T6 and T23 were also similar to the respective observed profiles. The agreement between the calculated and observed profiles indicates that the structures of T6 and T23, as well as that of intact PYP, can be explained by removing the corresponding residues from the crystal structure (Fig. 7.1). The theoretical profile for T15 appeared to be intermediate between those for T6 and T23, and was different from the observed profile of T15. As shown in Fig. 7.1, the truncation position of T15 is at the center of an α-helix. After removal of 15 residues, the helix may no longer be stable, resulting in the disappearance of the interference between the N-terminal region and the rest of the protein.

The X-ray scattering profiles of T6, T15, and T23 were measured under continuous illumination. Because of the long lifetime of the M intermediate in the truncated forms, we can expect more than 90% of the protein to be in the PYPM state under continuous illumination [41]. Figure 7.3 shows the intensity profiles of the M intermediates of the truncated PYP variants compared with those obtained for their dark states. Significant differences between the two states were observed for each truncated PYP. The profiles of the PYPM intermediates of the three truncated PYP variants are similar, with two broad peaks located at the same positions (arrowheads in the figure). The characteristic profile changes in T23, which lacks most of the N-terminal cap, indicate rearrangements of the secondary structure packing in the chromophore-binding region.

Fig. 7.3. High-angle X-ray scattering profiles of T6, T15, and T23 under illumination (circles with error bars) [16]. As a reference, the profiles of the dark states are shown (dashed lines). The characteristic bimodal profiles observed under illumination are marked by the arrowheads

The profiles of PYPM of the three truncated PYP variants can be superimposed on a log–log plot. The differences among the profiles appear in the valley around Q = 0.3 Å−1, where the shape scattering and the intramolecular interference scattering overlap. To derive the contribution from the secondary structure packing, the contribution of the shape scattering was subtracted from the original profile. In general, the final slope of the shape scattering can be described as Q^(−α), where α is related to the fractal dimension [42] or the protein conformational state [12]. The final slope of the shape scattering from each truncated variant is well approximated by a straight line in a log–log plot, and the slope gives the value of α. The excess intensity due to the shape scattering thus estimated was subtracted to derive the corrected intramolecular interference profile of the PYPM intermediate for each truncated PYP. All the corrected profiles were identical within the statistical errors, indicating that the N-terminal regions of T6 and T15 did not influence the intramolecular interference scattering.

7.3.2 Analysis of High Angle Scattering

The change in the profile of T23 indicates a significant rearrangement in the secondary structure packing of the chromophore-binding region during the formation of PYPM. On the basis of the obtained profile, we attempted to construct a solution structural model of PYPM, especially for the chromophore-binding region. We attempted to generate plausible conformations from a variety of structures derived from the crystal structure of PYP, using the high-angle X-ray scattering profile as a boundary condition. Five hundred structures were constructed using the CONCOORD program [43]. The high-angle scattering profile of each generated structure was calculated by CRYSOL [44]. Most of the structures showed profiles similar to that of the dark state of T23 (a single broad peak at Q = 0.39 Å−1), but some showed profiles with the bimodal shape observed for the PYPM intermediate of T23. We selected as candidate models of the PYPM structure those structures whose calculated scattering profiles satisfied the following two properties: (1) the peak position was observed at Q < 0.39 Å−1; and (2) a clear shoulder was present around Q = 0.6 Å−1. Consequently, 51 of the 500 structures were selected. The average of the selected structures was adopted as a structural model of the chromophore-binding region of PYPM. According to the model, the loop between β4 and β5 and the α4 helix, which envelop the chromophore-binding pocket in the dark state of the protein, move away from each other, opening the chromophore-binding pocket. The root-mean-square deviation of the model structure of PYPM from the structure of intact PYP suggests that the structural changes in PYPM are localized in the N-terminal tail (residues 24–28), the α4 helix (residues 55–58), and the loop connecting β4 and β5 (residues 96–102). The structure of the N-terminal region of T6 is similar to that of the dark state of the wild-type protein. It undergoes, however, large structural changes during the formation of PYPM that abrogate the intramolecular interference between the N-terminal and chromophore-binding regions. The lack of interference strongly suggests that the N-terminal region is substantially disordered and moves stochastically in PYPM. In fact, small-angle X-ray scattering analysis indicated that the N-terminal region moves away from the chromophore-binding region. Taking all these into consideration, a schematic structural model for wild-type PYPM was built by combining the structural model of the chromophore-binding region of the PYPM intermediate of T23 with the structural fluctuation of the N-terminal region predicted by the results for T6 (Fig. 7.4). The photosignal generated by the chromophore is propagated to the N-terminal tail (residues 24–28), the α4 helix (residues 55–58), and the loop connecting β4 and β5 (residues 96–102). The propagation direction of the structural changes is consistent with the analysis by fragmentation and mass spectroscopy [36].
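The two selection rules quoted above (main peak at Q < 0.39 Å−1 and a clear shoulder around Q = 0.6 Å−1) can be expressed as a simple filter over calculated profiles. The sketch below uses toy Gaussian curves in place of CRYSOL output, and the numerical definition of a "clear shoulder" (`min_ratio`) is a hypothetical stand-in, since the chapter does not state the criterion quantitatively.

```python
import numpy as np

def is_candidate(q, intensity, peak_limit=0.39, shoulder_q=0.6, min_ratio=0.3):
    """Apply the two selection rules to one calculated profile.

    min_ratio is a hypothetical stand-in for "a clear shoulder": the
    intensity near shoulder_q must be at least this fraction of the peak.
    """
    peak_index = np.argmax(intensity)
    peak_ok = q[peak_index] < peak_limit
    shoulder_value = np.interp(shoulder_q, q, intensity)
    shoulder_ok = shoulder_value >= min_ratio * intensity[peak_index]
    return bool(peak_ok and shoulder_ok)

def gaussian(q, center, width, height):
    return height * np.exp(-((q - center) / width) ** 2)

q = np.linspace(0.25, 0.8, 111)  # Å^-1
# Toy profiles standing in for calculated scattering curves.
dark_like = gaussian(q, 0.39, 0.08, 1.0)  # single broad peak
m_like = gaussian(q, 0.35, 0.06, 1.0) + gaussian(q, 0.60, 0.06, 0.5)  # bimodal

profiles = {"dark_like": dark_like, "m_like": m_like}
selected = [name for name, profile in profiles.items() if is_candidate(q, profile)]
print(selected)
```

In an actual analysis the filter would run over all generated structures, and the retained set would then be averaged as described in the text.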

The NMR structures of PYP lacking the N-terminal 25 residues were reported under dark and illuminated conditions [34]. In the NMR structure of the M intermediate, the three regions at residues 42–58, 63–78, and 96–103 (amino-acid positions in intact PYP) are highly disordered, which exposes the hydrophobic chromophore to the solvent. Although the structural changes in the α4 helix (residues 55–58) and the loop connecting β4 and β5 (residues 96–102) revealed in the present study are conserved in the NMR structure, there are significant differences in the amplitudes of the structural displacements. In our model, the chromophore is buried inside the molecule. The scattering profiles were calculated for the 20 NMR structures of PYPM and of the dark state of the protein listed in the 1ODV and 1XFQ PDB files, respectively. The profiles of the 20 NMR structures of the dark state of PYP are similar to the observed profile of the dark state of T23, but the calculated profiles of the NMR structures of PYPM are completely different from the observed scattering profile for T23. The calculated profiles for the NMR structures are also quite different from each other. The increases in the calculated radii of gyration of the NMR structures (>2 Å) are also larger than the observed value (∼0.7 Å) [41], indicating that the NMR structures are not as compact as the native solution structure. Although the reason that NMR produced such highly disordered structures is unclear, the poor distance restraints in these regions may not yield well-converged structures, resulting in the divergent features of the obtained models. Our structural model is supported by a molecular dynamics study [45].

Fig. 7.4. A schematic model of the PYPM intermediate of intact PYP (solid ribbon model) [16]. The crystal structure of the dark state of intact PYP (1NWZ; line ribbon model) is superimposed on the model
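The radius-of-gyration comparison above relies on values calculated from atomic models. As a rough illustration only, the sketch below computes a purely geometric radius of gyration from coordinates. It ignores electron-density contrast and the hydration shell, which scattering-based calculations take into account, and the cube coordinates are an arbitrary test case.

```python
import numpy as np

def radius_of_gyration(coords, weights=None):
    """Root-mean-square distance of atoms from their (weighted) centroid."""
    coords = np.asarray(coords, dtype=float)
    if weights is None:
        weights = np.ones(len(coords))
    weights = np.asarray(weights, dtype=float)
    center = np.average(coords, axis=0, weights=weights)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=weights)))

# Toy check: eight points on the corners of a cube with edge 2 Å.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
rg_compact = radius_of_gyration(cube)
rg_expanded = radius_of_gyration(1.2 * cube)  # uniformly expanded copy
print(f"compact: {rg_compact:.3f} Å, expanded: {rg_expanded:.3f} Å")
```

A uniformly expanded model gives a proportionally larger value, which is the sense in which a larger calculated increase points to a less compact structure.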

7.4 Propagation Mechanism of the Structural Change

The first event after light absorption by PYP is a proton transfer from E46 to the chromophore [11,46]. In the dark state, the chromophore is considered to be deprotonated and E46 protonated. Therefore, E46 was postulated to be the direct proton donor for the chromophore. However, it has been suggested that the protonation of the chromophore is independent of the deprotonation of E46 [47–49]. The large conformational change of PYPM is closely related to the protonation state of E46. The key to understanding these findings is the interaction between the chromophore and E46. Recent high-resolution crystal structure analysis of PYP revealed that the hydrogen bond formed between the chromophore and E46 is an unusual short strong hydrogen bond (SSHB), in which the distance between the phenolic oxygen of the chromophore and the carboxylic oxygen of E46 is 2.58 Å, much shorter than the standard donor–acceptor distance [50,51]. When the distance between the donor and the acceptor becomes shorter, the electron orbitals overlap to form a quasi-covalent bond called a low-barrier hydrogen bond (LBHB) [52]. It has been proposed that LBHBs are responsible for the hydrolytic catalysis of serine proteases, and that they are formed at the transition states of enzymes [52,53]. It is possible that the SSHB in PYP is an LBHB, although no direct evidence has been demonstrated.

The photosignal is finally propagated to the N-terminal region. The N-terminal region interacts with the C-terminal β6 of the chromophore-binding region. Hydrogen bonds would be expected to play a major role in this interaction. We prepared site-directed substitution mutants for the putative hydrogen-bonding residues: E9A, E12A, and K110A. The lifetimes of PYPM of these mutants were 0.98, 0.39, and 1.95 s, respectively [54]. The lifetimes of wild-type PYP and T6 are 0.29 and 29 s, respectively. Therefore, the hydrogen bonds between these residues are not essential for the structural change. On the other hand, F6A substantially prolongs the lifetime of the M intermediate to 19 s, and K123A produces no pigment. We assumed that the interaction between F6 and K123 is essential for the structural change. Neither K123E nor K123L changes the photochemical properties, indicating that the charge is not essential but that the alkyl chain is important [55]. Substitution mutations of F6 dramatically change the properties of PYP, except for F6Y [55]. Therefore, the aromatic ring is essential at this position. Based on these observations, we concluded that the weak CH/π hydrogen bond is responsible for the structural change [55]. It is interesting that both the unusual SSHB and the very weak CH/π hydrogen bond play essential roles in photosignal transduction. To clarify the properties of these peculiar hydrogen bonds, identification of the hydrogen atom positions is essential. Neutron crystallography is the most promising method for identifying hydrogen atom positions [51]. For this purpose, the preparation of a large crystal is an essential step, and we succeeded in obtaining a large crystal of PYP [56].

7.5 Summary

We developed a promising method for the analysis of high-angle X-ray solution scattering combined with fluctuation analysis. The method is especially useful for understanding structural changes during functional expression. To apply the method, it is essential to record high-angle X-ray scattering data with high accuracy, which became possible with third-generation synchrotron radiation and a two-dimensional CCD-based detector. We succeeded in analyzing the solution structure of the functional photointermediate of PYP by high-angle scattering. PYP undergoes substantial conformational changes upon light absorption. The changes are propagated from the chromophore to the N-terminal tail (residues 24–28), the α4 helix (residues 55–58), and the loop connecting β4 and β5 (residues 96–102). The conformational change at the N-terminal tail is propagated through the hydrogen bond network, which includes both an unusual SSHB and a very weak CH/π hydrogen bond. The structural ensembles generated from the dark-state structure by fluctuation analysis (a simplified molecular dynamics simulation) include the ensemble of intermediate structures, indicating that the conformations of the functional intermediate are included in the ensemble of possible conformations of the resting state. The solution NMR analysis of the photointermediate is not necessarily consistent with high-angle X-ray scattering. The origin of the discrepancy should be clarified for a better understanding of the intermediate structure.


Acknowledgments

The authors are grateful to Prof. Y. Imamoto (Kyoto University) and Drs. N. Shimizu (SPring-8) and M. Harigai (Kyoto University) for their help throughout the study. This work is partly supported by the Grant-in-Aid for Scientific Research in a Priority Area, “Chemistry of Biological Processes Created by Water and Biomolecules” to MK (15076208).

References

1. O. Glatter, O. Kratky, Small Angle X-ray Scattering (Academic, New York, 1982)
2. L.A. Feigin, D.I. Svergun, Structure Analysis by Small-Angle X-Ray and Neutron Scattering (Plenum, New York, 1982)
3. D.I. Svergun, Biophys. J. 76, 2879 (1999)
4. D.I. Svergun, M.V. Petoukhov, M.H.J. Koch, Biophys. J. 80, 2946 (2001)
5. S.S. Funari, G. Rapp, M. Perbandt, K. Dierks, M. Vallazza, C. Betzel, V.A. Erdmann, D.I. Svergun, J. Biol. Chem. 275, 31283 (2000)
6. R. Kato, M. Kataoka, H. Kamikubo, S. Kuramitsu, J. Mol. Biol. 309, 227 (2001)
7. B.A. Fedorov, J. Mol. Biol. 98, 341 (1975)
8. C.A. Pickover, D.M. Engelman, Biopolymers 21, 817 (1982)
9. T. Ueki, Y. Inoko, M. Kataoka, Y. Amemiya, Y. Hiragi, J. Biochem. 99, 1127 (1986)
10. R. Zhang, P. Thiyagarajan, D.M. Tiede, J. Appl. Crystallogr. 33, 565 (2000)
11. D.M. Tiede, R. Zhang, S. Seifert, Biochemistry 41, 6605 (2002)
12. M. Kataoka, I. Nishii, T. Fujisawa, T. Ueki, F. Tokunaga, Y. Goto, J. Mol. Biol. 249, 215 (1995)
13. M. Kataoka, Y. Goto, Fold. Des. 1, 107 (1996)
14. M. Kataoka, K. Kuwajima, F. Tokunaga, Y. Goto, Protein Sci. 6, 422 (1997)
15. M. Hirai et al., Biochemistry 43, 9036 (2004)
16. H. Kamikubo, N. Shimizu, M. Harigai, Y. Yamazaki, Y. Imamoto, M. Kataoka, Biophys. J. 92, 3633 (2007)
17. T.E. Meyer, Biochim. Biophys. Acta 806, 175 (1985)
18. W.W. Sprenger, W.D. Hoff, J.P. Armitage, K.J. Hellingwerf, J. Bacteriol. 175, 3096 (1993)
19. J.L. Pellequer, K.A. Wager-Smith, S.A. Kay, E.D. Getzoff, Proc. Natl. Acad. Sci. USA 95, 5884 (1998)
20. G.E. Borgstahl, D.R. Williams, E.D. Getzoff, Biochemistry 34, 6278 (1995)
21. Y. Imamoto, Y. Shirahige, F. Tokunaga, T. Kinoshita, K. Yoshihara, M. Kataoka, Biochemistry 40, 8997 (2001)
22. T.E. Meyer, E. Yakali, M.A. Cusanovich, G. Tollin, Biochemistry 26, 418 (1987)
23. W.D. Hoff et al., Biophys. J. 67, 1691 (1994)
24. Y. Imamoto, M. Kataoka, F. Tokunaga, Biochemistry 35, 14047 (1996)
25. L. Ujj et al., Biophys. J. 75, 406 (1998)
26. Y. Imamoto, M. Kataoka, F. Tokunaga, T. Asahi, H. Masuhara, Biochemistry 40, 6047 (2001)
27. U.K. Genick et al., Science 275, 1471 (1997)
28. H. Ihee et al., Proc. Natl. Acad. Sci. USA 102, 7145 (2005)
29. M.E. van Brederode et al., Biophys. J. 71, 365 (1996)
30. S. Ohishi, N. Shimizu, K. Mihara, Y. Imamoto, M. Kataoka, Biochemistry 40, 2854 (2001)
31. J.S. Khan, Y. Imamoto, M. Harigai, M. Kataoka, M. Terazima, Biophys. J. 90, 3686 (2006)
32. Y. Hoshihara, Y. Imamoto, M. Kataoka, F. Tokunaga, M. Terazima, Biophys. J. 94, 2187 (2008)
33. G. Rubinstenn et al., Nat. Struct. Biol. 5, 568 (1998)
34. C. Bernard et al., Structure 13, 953 (2005)
35. B.C. Lee et al., J. Biol. Chem. 276, 20821 (2001)
36. R. Brudler et al., J. Mol. Biol. 363, 148 (2006)
37. R. Brudler, R. Rammelsberg, T.T. Woo, E.D. Getzoff, K. Gerwert, Nat. Struct. Biol. 8, 265 (2001)
38. A. Xie, L. Kelemen, J. Hendriks, B.J. White, K.J. Hellingwerf, W.D. Hoff, Biochemistry 40, 1510 (2001)
39. N. Shimizu, H. Kamikubo, K. Mihara, Y. Imamoto, M. Kataoka, J. Biochem. 132, 257 (2002)
40. M. Harigai, S. Yasuda, Y. Imamoto, K. Yoshihara, F. Tokunaga, M. Kataoka, J. Biochem. 130, 51 (2001)
41. Y. Imamoto, H. Kamikubo, M. Harigai, N. Shimizu, M. Kataoka, Biochemistry 41, 13595 (2002)
42. P.W. Schmidt, J. Appl. Crystallogr. 24, 414 (1991)
43. B.L. de Groot et al., Proteins: Struct. Funct. Genet. 29, 240 (1997)
44. D.I. Svergun, C. Barberato, M.H.J. Koch, J. Appl. Crystallogr. 28, 768 (1995)
45. M. Shiozawa, M. Yoda, N. Kamiya, N. Asakawa, J. Higo, Y. Inoue, M. Sakurai, J. Am. Chem. Soc. 123, 7445 (2001)
46. Y. Imamoto et al., J. Biol. Chem. 272, 12905 (1997)
47. B. Borucki et al., Biochemistry 44, 13650 (2005)
48. B. Borucki, C.P. Joshi, H. Otto, M.A. Cusanovich, M.P. Heyn, Biophys. J. 91, 2991 (2006)
49. N. Shimizu, Y. Imamoto, M. Harigai, H. Kamikubo, Y. Yamazaki, M. Kataoka, J. Biol. Chem. 281, 4318 (2006)
50. S. Anderson, S. Crosson, K. Moffat, Acta Crystallogr. D 60, 1008 (2004)
51. S.Z. Fisher et al., Acta Crystallogr. D 63, 1178 (2007)
52. W.W. Cleland, M.M. Kreevoy, Science 264, 1887 (1994)
53. P.A. Frey, S.A. Whitt, J.B. Tobin, Science 264, 1927 (1994)
54. M. Harigai, M. Kataoka, Y. Imamoto, Photochem. Photobiol. 84, 1031 (2008)
55. M. Harigai, M. Kataoka, Y. Imamoto, J. Am. Chem. Soc. 128, 10646 (2006)
56. S. Yamaguchi, H. Kamikubo, N. Shimizu, Y. Yamazaki, Y. Imamoto, M. Kataoka, Photochem. Photobiol. 83, 336 (2007)


8

Time-Resolved Detection of Intermolecular Interaction of Photosensor Proteins

M. Terazima

Abstract. A recently developed method for monitoring the reaction kinetics of intermolecular interactions is reviewed. The method is based on measuring the time-dependent diffusion coefficient with the pulsed-laser-induced transient grating technique. With this method, conformational changes, transient association, and transient dissociation during reactions have been detected successfully. The principle and some applications to studies of changes in the intermolecular interactions of photosensor proteins (e.g., photoactive yellow protein, phototropins, AppA) in the time domain are described. In particular, unique features of this time-dependent diffusion coefficient method are discussed.

8.1 Introduction

Inter- and intraprotein (domain–domain) interactions play an important role in many signal transduction processes of sensor proteins. For example, many signaling proteins consist of modulator components that regulate input, output, and also protein–protein communication. They contain characteristic transmitter and receiver domains that transfer information within and between proteins. Signaling pathways are assembled by arranging these domains. Therefore, revealing such interprotein interactions along the signaling pathway is important for understanding the molecular mechanism of sensor proteins. Furthermore, since the interprotein interaction is closely related to the oligomerization of the protein, detecting oligomer formation during the signaling process is essential. Reflecting this importance, many sensor proteins exist in an oligomeric form. For example, the oligomerized state is stable for some PAS (Per-Arnt-Sim) proteins, which are well-known regulators: e.g., a dimer of the ARNT PAS-B domain, a dimer of the heme-binding PAS domain of E. coli Dos (EcDos), and a decamer of PixD [1–3].


Nevertheless, it is generally not simple to detect the dynamics of association/dissociation changes induced by external stimulation, in particular in real time. Although time-resolved optical absorption (the flash photolysis method) has frequently been used to study the reaction dynamics of proteins, one should always keep in mind that the whole protein is very large compared with the chromophore. Since the absorption spectrum of the chromophore is sensitive only to conformational changes close to the chromophore, structural changes far from the chromophore and changes in the interprotein interaction are frequently spectrally silent processes.

Several techniques that can detect protein binding have been developed. For example, a gel chromatographic technique has been used to monitor the association state. However, it does not have any time resolution [3]. The surface plasmon resonance (SPR) method is another highly sensitive and widely used method [4–8]. The principle of this technique is based on the refractive index change caused by protein–protein binding and on the refractive index dependence of the wavelength for surface plasmon excitation. For the detection, a target protein must be fixed on a metal surface and an analyte molecule is introduced onto the surface. If protein association occurs, the refractive index near the surface changes, which in turn changes the resonance angle for exciting the surface plasmon. The SPR biosensor monitors this change in the resonance angle. However, it usually takes several tens of minutes to accumulate proteins on the surface for detection, and this time response is not fast enough to study the protein association of a chemically unstable intermediate species that could play a key role in the signal transduction process. Furthermore, since the target protein must be fixed on a metal surface, any interaction with the metal surface could change the protein conformation or reactivity. Other spectroscopic techniques such as NMR or IR are also very difficult to apply to monitoring protein–protein interactions in the time domain for short-lived species.

Another physical property that may reflect the association state of a molecule is a transport property, such as the rotational relaxation rate or the translational diffusion coefficient. In particular, the translational diffusion coefficient (D) has been shown to be a good physical property reflecting conformational changes and intermolecular interactions. Because of its importance in physical chemistry, many techniques, e.g., Taylor dispersion, the capillary method, and the NMR method, have been developed to monitor molecular diffusion in the solution phase [9–13]. However, a difficulty in using the diffusion process for detecting transient interprotein interactions is again the slow time response. For example, it takes several hours to measure D by the Taylor dispersion method. This difficulty, the poor time resolution of traditional diffusion measurements, was overcome by using the pulsed-laser-induced transient grating (TG) technique [14–19]. In this chapter, the principle and some applications to studies of changes in the intermolecular interactions of photosensor proteins in the time domain are reviewed. In particular, transient association and dissociation reactions are described.


8.2 Principle

In the TG method, two pulsed laser beams are crossed at an angle θ within the coherence time so that an interference (grating) pattern is created with a wavenumber q (Fig. 8.1) [14–24]:

$$q = \frac{2\pi}{\Lambda} = \frac{4\pi \sin(\theta/2)}{\lambda_{\mathrm{ex}}}, \tag{8.1}$$

where Λ is the fringe length and $\lambda_{\mathrm{ex}}$ is the wavelength of the excitation laser. The wavenumber q can be varied by varying θ. Photosensor proteins are photoexcited by this grating light, and a chemical reaction is initiated. When a probe beam is introduced into the interference region, a part of the light is diffracted as the TG signal. When the absorption change at the probe wavelength is negligible, the TG intensity ($I_{\mathrm{TG}}$) is proportional to the square of the refractive index difference (δn) between the peaks and nulls of the grating pattern:

$$I_{\mathrm{TG}} = \alpha(\delta n)^2, \tag{8.2}$$

where α is a constant representing the sensitivity of the experimental system. There are several origins of the phase grating [24]. One of the important contributions is the temperature change of the medium induced by the thermal energy released from the decay of excited states and from the enthalpy change of the reaction (thermal grating; $\delta n_{\mathrm{th}}$). Furthermore, a change in absorption spectrum (population grating) and a change in molecular volume (volume grating) also contribute to the signal. The sum of the population grating and volume grating terms is called the species grating ($\delta n_{\mathrm{spe}}$) [24]. The species grating signal intensity is given by the difference between δn due to the reactant ($\delta n_R$) and that due to the product ($\delta n_P$). Hence, the observed TG signal [$I_{\mathrm{TG}}(t)$] is expressed as

$$I_{\mathrm{TG}}(t) = \alpha\left[\delta n_{\mathrm{th}}(t) + \delta n_{\mathrm{spe}}(t)\right]^2 = \alpha\left[\delta n_{\mathrm{th}}(t) + \delta n_P(t) - \delta n_R(t)\right]^2. \tag{8.3}$$

The “product” in this equation does not necessarily mean the final product, but can be any molecule produced from the reactant at the time of observation.

Fig. 8.1. Schematic illustration of the TG experiment (upper) and the principle of diffusion measurement (lower). Lower: The white and black circles indicate the reactant and product molecules. The concentrations of the reactant and the product are spatially modulated by the sinusoidally modulated light intensity of the grating light. The fringe length Λ is also indicated. [Figure labels: excitation pulses, sample, probe beam, TG signal (upper); concentration profile with fringe length Λ (lower)]
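To make Eq. (8.1) concrete, the following minimal sketch converts a crossing angle and an excitation wavelength into the fringe length Λ and the squared wavenumber q². The 465 nm wavelength and the three angles are hypothetical values chosen only for illustration; they are not taken from the experiments described in this chapter.

```python
import math

def grating_wavenumber(theta_rad, wavelength_m):
    """Grating wavenumber q (rad m^-1) from Eq. (8.1): q = 4*pi*sin(theta/2)/lambda_ex."""
    return 4.0 * math.pi * math.sin(theta_rad / 2.0) / wavelength_m

def fringe_length(theta_rad, wavelength_m):
    """Fringe length Lambda (m) from Eq. (8.1): Lambda = 2*pi/q."""
    return 2.0 * math.pi / grating_wavenumber(theta_rad, wavelength_m)

# Hypothetical excitation wavelength and crossing angles (illustration only).
wavelength = 465e-9
for theta_deg in (0.5, 5.0, 30.0):
    theta = math.radians(theta_deg)
    q = grating_wavenumber(theta, wavelength)
    lam_um = fringe_length(theta, wavelength) * 1e6
    print(f"theta = {theta_deg:5.1f} deg   Lambda = {lam_um:8.3f} um   q^2 = {q * q:.3e} m^-2")
```

Under these assumed values, changing the crossing angle from a fraction of a degree to a few tens of degrees moves q² across several orders of magnitude, which is what makes the observation time window tunable.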

The temporal profile of $\delta n_{\mathrm{th}}(t)$ is determined by the convolution integral between the thermal diffusion decay and the intrinsic temporal evolution of the thermal energy [Q(t)]:

$$\delta n_{\mathrm{th}}(t) = \frac{dn}{dT}\,\frac{W\Delta N}{\rho C_p}\,\frac{\partial Q(t)}{\partial t} * \exp(-D_{\mathrm{th}} q^2 t), \tag{8.4}$$

where ∗ represents the convolution integral, dn/dT is the refractive index change with the temperature of the solution, W is the molecular weight (g mol$^{-1}$), ρ is the density (g cm$^{-3}$) of the solvent, $C_p$ is the heat capacity of the solvent, ΔN is the molar density of the excited molecule (mol cm$^{-3}$), and $D_{\mathrm{th}}$ is the thermal diffusivity.
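The convolution in Eq. (8.4) can be checked numerically for a simple case. The sketch below assumes, purely for illustration, a single-exponential heat release Q(t) = 1 − exp(−t/τ) and sets the prefactor (dn/dT)(WΔN/ρC_p) to 1; it then compares a trapezoidal evaluation of the convolution with the closed form that follows for this particular Q(t). The function names and the numerical values of τ and $D_{\mathrm{th}}q^2$ are hypothetical.

```python
import math

def thermal_grating_numeric(t, tau, k_th, n_steps=4000):
    """Trapezoidal estimate of the convolution of dQ/dt with exp(-k_th * t).

    Assumes Q(t) = 1 - exp(-t / tau), so dQ/dt = exp(-t / tau) / tau.
    k_th stands for D_th * q^2; the prefactor of Eq. (8.4) is set to 1.
    """
    if t <= 0.0:
        return 0.0
    h = t / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        s = i * h
        heat_rate = math.exp(-s / tau) / tau      # heat released per unit time at time s
        decay = math.exp(-k_th * (t - s))         # thermal diffusion between s and t
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * heat_rate * decay
    return total * h

def thermal_grating_analytic(t, tau, k_th):
    """Closed form of the same convolution (valid when 1/tau != k_th)."""
    rate = 1.0 / tau
    return rate * (math.exp(-k_th * t) - math.exp(-rate * t)) / (rate - k_th)

# Hypothetical values: tau = 1 us, D_th ~ 1.4e-7 m^2 s^-1 (roughly water), q^2 = 1e12 m^-2.
tau, k_th = 1.0e-6, 1.4e-7 * 1.0e12
for t in (0.5e-6, 2.0e-6, 5.0e-6, 2.0e-5):
    print(t, thermal_grating_numeric(t, tau, k_th), thermal_grating_analytic(t, tau, k_th))
```

With these assumptions the thermal grating rises on the timescale of the heat release and then decays with the rate $D_{\mathrm{th}}q^2$.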

The temporal evolution of the species grating component is determined by the chemical reaction and protein diffusion processes. When there is no chemical reaction in the detection time window and the molecular diffusion coefficient (D) is time-independent, the temporal profile of the species grating signal can be calculated from the molecular diffusion equation. The Fourier component of the concentration profile at wavenumber q decays with a rate constant $Dq^2$ for both the reactant and the product. Hence, the time development of the TG signal can be expressed by [15–19,23]

$$I_{\mathrm{TG}}(t) = \alpha\left[\delta n_P \exp(-D_P q^2 t) - \delta n_R \exp(-D_R q^2 t)\right]^2, \tag{8.5}$$

where $D_R$ and $D_P$ are the diffusion coefficients of the reactant and the product, respectively. Furthermore, $\delta n_R$ (>0) and $\delta n_P$ (>0) are, respectively, the initial refractive index changes due to changes in reactant and product concentrations during the reaction.
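Equation (8.5) implies that the diffusion signal vanishes when the reactant and product diffuse equally fast and grows as their diffusion coefficients separate. The following sketch evaluates (8.5) on a time grid and compares the peak height for two hypothetical pairs of diffusion coefficients; the equal amplitudes ($\delta n_R = \delta n_P = 1$), the value of q², and the function names are assumptions made for illustration.

```python
import math

def itg_two_species(t, q2, d_r, d_p, dn_r=1.0, dn_p=1.0, alpha=1.0):
    """TG intensity of Eq. (8.5) at time t (s) for squared wavenumber q2 (m^-2)."""
    field = dn_p * math.exp(-d_p * q2 * t) - dn_r * math.exp(-d_r * q2 * t)
    return alpha * field * field

def peak_intensity(q2, d_r, d_p, n=2000):
    """Largest intensity on a uniform grid up to t = 10 / (D q^2) of the slower species."""
    t_max = 10.0 / (min(d_r, d_p) * q2)
    return max(itg_two_species(i * t_max / n, q2, d_r, d_p) for i in range(n + 1))

# Hypothetical diffusion coefficients (m^2 s^-1) and q^2 (m^-2).
q2 = 1.0e11
print("large difference in D:", peak_intensity(q2, 1.3e-10, 1.0e-10))
print("small difference in D:", peak_intensity(q2, 1.3e-10, 1.2e-10))
print("no difference in D:   ", peak_intensity(q2, 1.3e-10, 1.3e-10))
```

The peak height therefore acts as a sensitive indicator of the difference between $D_R$ and $D_P$, under the stated assumption of equal amplitudes.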

When a chemical reaction, including a conformational change of a protein, takes place during the time range of the signal detection, the apparent D of the protein changes. The observed TG signal should then be calculated from the diffusion equation with a concentration-dependent term. Describing the reaction by the following model,

$$\text{Scheme 1}\qquad \mathrm{R} \xrightarrow{h\nu} \mathrm{I} \xrightarrow{k} \mathrm{P},$$

where R, I, P, and k represent, respectively, a reactant, an intermediate species, a final product, and the rate constant of the change, one may find the time dependence of the signal as [23,25]

$$I_{\mathrm{TG}} = \alpha\left\{\delta n_R \exp(-D_R q^2 t) + \left[\delta n_I + \frac{\delta n_P k}{(D_P - D_R)q^2 - k}\right]\exp\left[-(D_I q^2 + k)t\right] - \left[\frac{\delta n_P k}{(D_P - D_R)q^2 - k}\right]\exp(-D_P q^2 t)\right\}^2, \tag{8.6}$$

where $\delta n_I$ and $D_I$ are the refractive index change due to the formation of the intermediate species and the diffusion coefficient of the intermediate species, respectively. Here, it should be noted that $\delta n_P(t)$ describes the species grating signal of the product as well as the intermediate.
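The exponential terms of (8.6) can be traced back to the rate equations for the Fourier amplitudes of I and P at wavenumber q: dI/dt = −($D_I q^2$ + k)I and dP/dt = −$D_P q^2$P + kI. The sketch below is not the fitting procedure of the cited work; it only solves these two equations in closed form (with I(0) = 1 and P(0) = 0) and checks the result against a fixed-step Runge–Kutta integration. Solved this way, the denominator reads ($D_P − D_I$)q² − k, which coincides with the denominator printed in (8.6) when $D_I = D_R$. All parameter values and function names are hypothetical.

```python
import math

def amplitudes_closed_form(t, q2, d_i, d_p, k):
    """Fourier amplitudes (I, P) at wavenumber q for I -> P with rate constant k.

    Assumes I(0) = 1, P(0) = 0 and (d_p - d_i) * q2 != k.
    """
    a = d_i * q2 + k          # total decay rate of I: diffusion plus reaction
    b = d_p * q2              # diffusive decay rate of P
    i_amp = math.exp(-a * t)
    p_amp = k / (b - a) * (math.exp(-a * t) - math.exp(-b * t))
    return i_amp, p_amp

def amplitudes_rk4(t_end, q2, d_i, d_p, k, n_steps=2000):
    """Same amplitudes from fixed-step RK4 integration of the two rate equations."""
    a = d_i * q2 + k
    b = d_p * q2

    def deriv(i_amp, p_amp):
        return -a * i_amp, -b * p_amp + k * i_amp

    h = t_end / n_steps
    i_amp, p_amp = 1.0, 0.0
    for _ in range(n_steps):
        k1 = deriv(i_amp, p_amp)
        k2 = deriv(i_amp + 0.5 * h * k1[0], p_amp + 0.5 * h * k1[1])
        k3 = deriv(i_amp + 0.5 * h * k2[0], p_amp + 0.5 * h * k2[1])
        k4 = deriv(i_amp + h * k3[0], p_amp + h * k3[1])
        i_amp += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p_amp += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return i_amp, p_amp

# Hypothetical parameters: q^2 (m^-2), D_I and D_P (m^2 s^-1), k (s^-1).
Q2, D_I, D_P, K = 1.0e12, 8.8e-11, 7.2e-11, 220.0
print(amplitudes_closed_form(5.0e-3, Q2, D_I, D_P, K))
print(amplitudes_rk4(5.0e-3, Q2, D_I, D_P, K))
```

Multiplying these amplitudes by the corresponding refractive index changes and adding the reactant term gives the field whose square is the TG intensity; the sign conventions for each term should follow (8.6).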

When proteins are dimerized during the diffusion process, the apparent D is also time dependent. For analyzing the observed TG signal, we may use the following model:

$$\text{Scheme 2}\qquad \mathrm{A} \xrightarrow{h\nu} \mathrm{A}^{*} + \mathrm{A} \xrightarrow{k} (\mathrm{A}^{*}{:}\mathrm{A}),$$

where A∗ indicates an intermediate created by the photoexcitation, and the dimer is formed between this intermediate (A∗) and the ground-state protein (A) with a rate constant k. Under the condition that the concentration of A is sufficiently large that it can be treated as a constant, we may find the time dependence of the TG signal as

$$I_{\mathrm{TG}} = \alpha\left\{\delta n_R \exp(-D_R q^2 t) + \left[\delta n_I + \frac{\delta n_P k[\mathrm{A}]}{(D_P - D_R)q^2 - k[\mathrm{A}]}\right]\exp\left[-(D_I q^2 + k[\mathrm{A}])t\right] - \left[\frac{\delta n_P k[\mathrm{A}]}{(D_P - D_R)q^2 - k[\mathrm{A}]}\right]\exp(-D_P q^2 t)\right\}^2. \tag{8.7}$$

The time range over which one can observe the protein diffusion depends on the grating wavenumber q. For instance, the typical D of a globular protein with the size of myoglobin (18 kDa) is $10^{-10}$ m$^2$ s$^{-1}$ [14]. Hence, if one uses $q^2 = 10^{14}$ m$^{-2}$, the signal disappears with a rate constant of $Dq^2 = 10^{4}$ s$^{-1}$; i.e., D within about 100 μs after the photoexcitation can be detected. If one uses $q^2 = 10^{10}$ m$^{-2}$, the signal disappears with a rate constant of 1 s$^{-1}$, and D within a time window of 1 s can be detected. These are the typical time ranges we can use for detecting the protein diffusion dynamics.
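The arithmetic behind these time windows is short enough to reproduce directly. The sketch below uses the two q² values and the typical D quoted above; the function name is hypothetical.

```python
def diffusion_time_window(d_m2_per_s, q2_per_m2):
    """Return (decay rate D*q^2 in s^-1, characteristic time 1/(D*q^2) in s)."""
    rate = d_m2_per_s * q2_per_m2
    return rate, 1.0 / rate

D_TYPICAL = 1.0e-10  # m^2 s^-1, the typical value quoted for a myoglobin-sized protein
for q2 in (1.0e14, 1.0e10):
    rate, window = diffusion_time_window(D_TYPICAL, q2)
    print(f"q^2 = {q2:.0e} m^-2   D q^2 = {rate:.3g} s^-1   time window = {window:.3g} s")
```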


8.3 Diffusion Coefficient

The diffusion coefficient is a physical property that represents the speed of molecular diffusion. Recently it was shown that D changes during chemical reactions. Here, we describe the origin of the change in D. Intuitively, it may be easily understood that D decreases when the molecular size increases because of an association reaction. In some cases, the relationship between D and the molecular size is well described by the Stokes–Einstein equation [9–13]:

$$D = \frac{k_B T}{a \eta r}, \tag{8.8}$$

where $k_B$, T, η, a, and r are the Boltzmann constant, temperature, viscosity, a constant representing the boundary condition between the diffusing molecule and the solvent, and the radius of the molecule, respectively. Hence, D decreases with increasing r. When the molecular size decreases as a result of a dissociation reaction, D is expected to increase.
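As a numerical illustration of Eq. (8.8), the sketch below evaluates D for a sphere in water. The choice a = 6π (the stick boundary condition), the temperature, the viscosity of water near 25 °C, and the 2 nm radius are assumptions made here for illustration and are not values given in the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J K^-1

def stokes_einstein_d(radius_m, temperature_k=298.15, viscosity_pa_s=0.89e-3,
                      a=6.0 * math.pi):
    """Diffusion coefficient (m^2 s^-1) from Eq. (8.8), D = k_B T / (a * eta * r).

    a = 6*pi corresponds to the stick boundary condition (assumed here);
    the default viscosity is roughly that of water at 25 degrees C.
    """
    return K_B * temperature_k / (a * viscosity_pa_s * radius_m)

# Hypothetical radii for illustration.
for radius_nm in (1.0, 2.0, 4.0):
    d = stokes_einstein_d(radius_nm * 1e-9)
    print(f"r = {radius_nm:.1f} nm   D = {d:.3e} m^2 s^-1")
```

Doubling the radius halves D, so a change in D of a few tens of percent already corresponds to a substantial change in effective size under this relation.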

Although the molecular size is certainly an important factor that determines D, D also depends on the intermolecular interaction between the molecule and the solvent. A clear example has been reported for chemical reactions of aromatic molecules [15–19]. It was found that the D values of organic radicals are much smaller than those of electronically closed-shell molecules with similar sizes and shapes. This change was attributed to the enhanced intermolecular interaction between the radicals and solvent molecules. It was further reported that D of cytochrome c in its native form is much larger than that in the unfolded state [25–27]. This difference was attributed to the larger intermolecular interaction between the protein and water due to the unfolded conformation of the α-helices.

D has sometimes been expressed in terms of the hydrodynamic radius. However, we consider that “hydrodynamic radius” is not the proper term to describe the change of D, because it is not a well-defined radius such as the “radius of gyration,” which is clearly defined to show the molecular size. The hydrodynamic radius has just the same meaning as D as long as the Stokes–Einstein relation holds.

8.4 Time-Resolved Detection of Interprotein Interactions

Below, we describe some examples demonstrating that the time-resolved measurement of D is a suitable way of detecting changes in the interprotein interaction.


8.4.1 Protein–Protein Interaction of the Photoexcited Photoactive Yellow Protein

A change in D of a reaction intermediate during a chemical reaction was first reported for the photochemical reaction of Photoactive Yellow Protein (PYP) [28–32]. PYP is a 14-kDa photoreceptor protein functioning in negative phototaxis of the purple sulfur bacterium Ectothiorhodospira halophila [33]. For detecting light, it possesses a p-hydroxycinnamyl chromophore bound via a thioester bond to Cys69 [34,35]. Upon photoexcitation of PYP, the chromophore is photoisomerized from the trans form to the cis form to initiate the photocyclic reaction [36–38]. The reaction dynamics of PYP have been extensively studied by various methods [39–44]. The ground-state species (pG) is initially converted to the first intermediates (pR1 and pR2), and then transformed to the second species (pB′ and pB). The pB species returns to pG with lifetimes of 150 ms to 2 s (Fig. 8.2). One of the intermediates should interact with proteins in the bacterium to transfer the light information, and this information transfer should stimulate the biological response. However, a protein (or molecule) that interacts with the intermediate species of PYP is not known. One of the reasons is the lack of experimental techniques that can monitor the protein association reaction with a time resolution better than 1 s (the lifetime of the unstable transient species). The TG method was used to monitor the intermolecular interaction with the transient intermediate species pB.

For demonstrating the time-resolved detection of intermolecular interaction, D of pB was measured with various molecules extracted from the bacterium [45]. Before describing the effect of intermolecular interaction on D, the TG signal of PYP in the buffer solution without any additive is described to show the principle and the difference [28, 29]. The TG signal upon photoexcitation of PYP rose quickly and then showed a weak, slow-rising component corresponding to pR1 → pR2 [30]. After this, the signal decayed to a certain intensity with a rate constant $D_{\mathrm{th}}q^2$ and then showed two growth–decay curves (Fig. 8.3a). The decay component with $D_{\mathrm{th}}q^2$ should be the thermal grating component. The rising component represents the chemical reaction from pR2 to pB. It is the last growth–decay curve that was attributed to the protein diffusion processes of pG and pB. The presence of the rise–decay curve implies that there are two diffusing species having different signs of the amplitude (δn). The assignment of the diffusing species was made from the sign of the refractive index change. It was found that the rate constant of the rising component represents $D_{\mathrm{pG}}q^2$ ($D_{\mathrm{pG}}$: diffusion coefficient of the pG species of PYP) and that of the decaying component corresponds to $D_{\mathrm{pB}}q^2$ ($D_{\mathrm{pB}}$: diffusion coefficient of the pB species) ($D_{\mathrm{pG}} > D_{\mathrm{pB}}$). This rise–decay component is a clear indication that D of PYP was changed by the photoexcitation.

Fig. 8.2. A proposed photochemical reaction scheme of PYP. [Species shown: pG, pG*, short-lived intermediates, pR1, pR2, pB′, pB]

Fig. 8.3. Typical TG signals (circles) after photoexcitation of PYP (a) in a buffer solution and (b) in the buffer with an eluted fraction from the bacterium. The rise–decay components in the range of a few milliseconds to a few hundred milliseconds represent the protein diffusion signal. The enhancement of the diffusion peak of (b) is due to the intermolecular interaction between the pB species and DNA of the bacterium. The best-fitted curves by (8.9) are shown by the solid lines. The assignments of the signal components are shown. [Axes: $I_{\mathrm{TG}}$ (a.u.) versus t (s), logarithmic time axis from $10^{-5}$ to $10^{-1}$ s; labeled components: thermal grating, pR2, pB′, pB, pG diffusion, pB diffusion]

The TG signal in Fig. 8.3a was expressed by [28,30]

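$$I_{\mathrm{TG}}(t) = \alpha\left[\delta n_{\mathrm{th}} \exp(-D_{\mathrm{th}} q^2 t) + \delta n_1 \exp(-t/\tau_1) + \delta n_2 \exp(-t/\tau_2) - \delta n_{\mathrm{pG}} \exp(-D_{\mathrm{pG}} q^2 t) + \delta n_{\mathrm{pB}} \exp(-D_{\mathrm{pB}} q^2 t)\right]^2, \tag{8.9}$$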

where the lifetimes $\tau_1$ and $\tau_2$ represent the pR2 → pB kinetics. The peak in the latest time region (diffusion peak) appeared because $D_{\mathrm{pG}}$ and $D_{\mathrm{pB}}$ are different. If the difference between $D_{\mathrm{pG}}$ and $D_{\mathrm{pB}}$ becomes smaller, the two terms $\delta n_{\mathrm{pG}} \exp(-D_{\mathrm{pG}} q^2 t)$ and $\delta n_{\mathrm{pB}} \exp(-D_{\mathrm{pB}} q^2 t)$ cancel each other and the signal intensity becomes weaker, because the two contributions have opposite signs. Hence the maximum amplitude of this peak is an indicator of the difference in D between the reactant and the product. For the quantitative measurement of D, the rate constants of the rise and decay components were determined by curve fitting and plotted against $q^2$; from the slopes of the plots, $D_{\mathrm{pG}}$ and $D_{\mathrm{pB}}$ were determined to be $D_{\mathrm{pG}} = 1.3 \times 10^{-10}$ m$^2$ s$^{-1}$ and $D_{\mathrm{pB}} = 1.2 \times 10^{-10}$ m$^2$ s$^{-1}$. The observed reduction in D by the chemical reaction was rather surprising. If the Stokes–Einstein relation is applicable, the volume scales as $r^3 \propto D^{-3}$, so the ratio $D_{\mathrm{pB}}/D_{\mathrm{pG}} = 0.92$ means a volume expansion of 1.27 times. Since the partial molar volume of PYP is estimated to be ca. 1,000 cm$^3$ mol$^{-1}$, the volume increase is 270 cm$^3$ mol$^{-1}$, which is unrealistically large. This reduction in D of the intermediate species was attributed to the enhanced intermolecular interaction between PYP and water molecules due to the unfolding of the N-terminal α-helices (diffusion-sensitive conformation change) [31].
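The two numerical steps in this paragraph, obtaining D from the slope of a rate-versus-q² plot and converting a ratio of D values into a volume ratio, can be sketched as follows. The q² grid and the noise-free rates are synthetic values generated from the reported D values, and forcing the fitted line through the origin is an assumption made here (a pure diffusion rate is $Dq^2$ with no intercept); neither is taken from the original analysis.

```python
def slope_through_origin(q2_values, rates):
    """Least-squares slope D of rate = D * q^2, with the line forced through the origin."""
    numerator = sum(x * y for x, y in zip(q2_values, rates))
    denominator = sum(x * x for x in q2_values)
    return numerator / denominator

def volume_ratio(d_before, d_after):
    """Stokes-Einstein volume ratio V_after / V_before = (D_before / D_after)**3."""
    return (d_before / d_after) ** 3

# Synthetic, noise-free rates built from the reported D values (illustration only).
q2_values = [1.0e11, 5.0e11, 1.0e12]            # m^-2, hypothetical grid
rates_pg = [1.3e-10 * x for x in q2_values]     # s^-1
rates_pb = [1.2e-10 * x for x in q2_values]     # s^-1

d_pg = slope_through_origin(q2_values, rates_pg)
d_pb = slope_through_origin(q2_values, rates_pb)
print(f"D_pG = {d_pg:.2e}  D_pB = {d_pb:.2e}  volume ratio = {volume_ratio(d_pg, d_pb):.2f}")
```

With real data the rates carry noise, and a weighted fit with an intercept term may be a more cautious choice.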

Next, in order to detect protein–protein interaction between the transient species pB and molecules in the bacterium, the solution extracted from the bacterium was separated into 20 fractions by chromatography, and these were added to the PYP solution [45]. The TG signal of the PYP solution with the first eluted solution is shown in Fig. 8.3b. Most of the signal was the same as that without the protein solution. However, on addition of the protein solution the amplitude of the diffusion peak was dramatically enhanced. Since this amplitude reflects the difference in D between the reactant and the product, the larger peak amplitude should result from a larger reduction in $D_{\mathrm{pB}}$ on adding the protein solution from the bacterium. From the signal, $D_{\mathrm{pG}} = 1.3 \times 10^{-10}$ m$^2$ s$^{-1}$ and $D_{\mathrm{pB}} = 1.10 \times 10^{-10}$ m$^2$ s$^{-1}$ were determined. The decrease of $D_{\mathrm{pB}}$ indicated that the pB species of PYP interacted with molecules in the solution. A similar enhancement was observed on adding any fraction from the extracted solution. We investigated the target molecules in the solution and found that DNA of the bacterium was bound to the pB species in this case.

8.4.2 Photoinduced Dimerization of AppA

Transient Diffusion Change

Another example of time-resolved detection of transient protein–protein interaction for a photosensor protein was reported for the photochemical reaction of AppA [46]. AppA is a light- and redox-responding regulator of photosynthesis gene transcription in Rb. sphaeroides, where it can be found in two different functional forms [47–53]. Under anaerobic, low-light growth conditions, AppA is in a “dark-adapted” form which is able to bind and inactivate the repressor PpsR, thereby allowing the RNA polymerase to maximally transcribe photosynthesis genes. Under aerobic high-light conditions or under strong blue-light illumination, FAD in AppA is photoexcited and AppA is transformed into a signaling state (“light-adapted” form), which is incapable of interacting with the photosynthesis repressor PpsR. Under these conditions, there is a maximal repression of the photosynthesis gene expression [47].

The isolated N-terminal BLUF domain exhibits a photocycle identical to that observed with full-length AppA [48]. Photoexcitation of AppA involving a singlet excited state in the flavin chromophore leads to the formation of a red-shifted intermediate state (or signaling state) after 10 ns, which slowly decays to the ground state with a lifetime of 30 min (Fig. 8.4) [49]. The red shift was attributed to altered π–π stacking interactions between the isoalloxazine ring and a conserved tyrosine residue. The dark-state X-ray structure of the BLUF domain of AppA (AppABLUF) was determined at 2.3 Å resolution [52], and it indicated that AppABLUF forms a dimer in the crystal through the hydrophobic interactions of the β-sheets of two monomers. The ground state of AppABLUF exists as a dimer even in a very dilute solution [53]. The reaction dynamics of AppABLUF were monitored by the transient absorption technique. A detailed study showed that the absorption change indicated only the decay of the excited triplet state in a microsecond time range, and there were no other slow dynamics that might be expected for creating the signaling state.

Fig. 8.4. Photochemical reaction scheme of AppA. If the reaction is monitored by the flash photolysis method, the spectrally red-shifted product AppABLUF* appears to return directly to the ground state (broken line). However, the diffusion detection method shows that the dimerization reaction takes place after the AppABLUF* formation. [Scheme labels: AppABLUF, hν, excited state, AppABLUF*, AppABLUF*-AppABLUF]

The observed TG signal of AppABLUF after the photoexcitation is depicted in Fig. 8.5. Initially, a weak, slow-rising component appeared with a time constant of ∼3.4 μs [46]. After measuring the TG signal at different $q^2$, it was concluded that the rising part of the TG signal represented a reaction phase of the protein rather than diffusion, namely the decay of the triplet state of the chromophore, flavin adenine dinucleotide (FAD). After this rising component, the TG signal decayed to zero with a rate constant of $D_{\mathrm{th}}q^2$. This was the thermal grating component created by the thermal energy due to the nonradiative transition from the excited state of FAD.

After the thermal grating signal, the signal rose again and finally decayed to the baseline. This rise–decay component depended on $q^2$ (Fig. 8.6a), and this $q^2$ dependence is a clear indication that these components represent diffusion processes. On the basis of considerations similar to the previous PYP case, it was concluded that this rise–decay feature of the diffusion signal indicated different D values for the reactant and the product, and that the product diffuses more slowly than the reactant. A prominent feature of this signal was that not only the rate but also the temporal profile of the signal depended on $q^2$. If the D values of the reactant and the product were constant in time and the product was created promptly, the time dependence should be expressed by a combination of terms of $\exp(-Dq^2t)$ [e.g., (8.5)]. In this case, if the signals measured at various $q^2$ values were plotted against $q^2t$, the shapes of the signals should be identical. However, the signals were totally different depending on the $q^2$ value (Fig. 8.6b). This behavior was explained by the time-dependent diffusion.

Fig. 8.5. A typical TG signal after photoexcitation of AppABLUF. The assignments of the signal components are shown. [Axes: $I_{\mathrm{TG}}$ (a.u.) versus t (s), logarithmic time axis from $10^{-5}$ to $10^{-1}$ s; labeled components: triplet state decay, thermal diffusion, reactant diffusion, product diffusion]
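The $q^2t$ scaling argument can be illustrated numerically. In the sketch below, `field_prompt` is the bracketed term of (8.5) with equal amplitudes, and `field_delayed` is a simplified stand-in for the reactive case: it assumes an intermediate with $D_I = D_R$ that converts to the product with rate constant k, with all refractive index changes set to 1. This delayed model, the function names, and the value of k are assumptions made for illustration, not the fitting function used in the original work.

```python
import math

def field_prompt(t, q2, d_r, d_p):
    """Bracketed term of Eq. (8.5) with equal amplitudes: product formed instantly."""
    return math.exp(-d_p * q2 * t) - math.exp(-d_r * q2 * t)

def field_delayed(t, q2, d_r, d_p, k):
    """Illustrative reactive case: intermediate with D_I = D_R converts to product at rate k."""
    a = d_r * q2 + k                     # decay rate of the intermediate amplitude
    b = d_p * q2                         # decay rate of the product amplitude
    intermediate = math.exp(-a * t)
    product = k / (b - a) * (math.exp(-a * t) - math.exp(-b * t))
    return intermediate + product - math.exp(-d_r * q2 * t)

D_R, D_P, K = 8.8e-11, 7.2e-11, 220.0    # m^2 s^-1, m^2 s^-1, s^-1 (K is hypothetical)
x = 1.0e10                                # fixed value of q^2 * t (m^-2 s)
for q2 in (1.3e11, 5.6e11, 4.5e12):
    t = x / q2
    prompt = field_prompt(t, q2, D_R, D_P) ** 2
    delayed = field_delayed(t, q2, D_R, D_P, K) ** 2
    print(f"q^2 = {q2:.1e}   prompt = {prompt:.5f}   delayed = {delayed:.5f}")
```

At a fixed $q^2t$ the prompt signal is the same for every $q^2$, whereas the delayed signal is not, which is the qualitative signature of a time-dependent apparent D.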

For determining the rate constant, the observed TG signal was analyzed on the basis of the theoretical equation presented in the Principle section (8.6). In order to reduce the ambiguity of the fitting, some parameters were determined independently before the fitting, as follows. $D_R$ and $D_P$ were determined from the signal in a long time region without using (8.6). It should be mentioned that after the reaction (conformational change or association/dissociation) is complete, D should be time-independent. Therefore, the temporal profile of the TG signal after this time should be expressed by a bi-exponential function (8.5), and, from the rate constants, $D_R$ and $D_P$ were determined to be $8.8 \times 10^{-11}$ and $7.2 \times 10^{-11}$ m$^2$ s$^{-1}$, respectively. Therefore, the product diffuses 1.22 times more slowly than the reactant. The determined $D_R$ is smaller than that of other proteins of similar size; e.g., the value for myoglobin (18 kDa) measured by the TG method is $10 \times 10^{-11}$ m$^2$ s$^{-1}$ [14], whereas the molecular weight of the BLUF domain of AppA is ∼15.5 kDa. This difference in D reflects the dimeric form of AppABLUF in solution [53]. Indeed, D of a protein having a molecular weight of ∼30 kDa (about the same size as the dimer of AppABLUF) was reported to be $8.7 \times 10^{-11}$ m$^2$ s$^{-1}$ (green fluorescent protein) [54]. The similarity of this value to $D_R$ of AppABLUF supports the dimeric form of AppABLUF. Using these D values, the signals at various $q^2$ were fitted well by (8.6), and the time constant of the D change was determined to be $k^{-1} = 4.5$ ms at 0.95 mM.

Fig. 8.6. (a) Grating wavenumber dependence of the TG signals after photoexcitation of AppABLUF (0.95 mM). The signal intensity is normalized at the peak. The arrow indicates the increase of q, and the $q^2$ values are $4.5 \times 10^{12}$, $5.6 \times 10^{11}$, and $1.3 \times 10^{11}$ m$^{-2}$. (b) TG signals of the BLUF domain of AppA (0.95 mM) at various $q^2$ plotted against $q^2t$. [Axes: (a) $I_{\mathrm{TG}}$ (a.u.) versus t (s), logarithmic axis from 0.001 to 0.1 s; (b) $I_{\mathrm{TG}}$ (a.u.) versus $q^2t$ ($10^{10}$ m$^{-2}$ s), logarithmic axis from 0.1 to 10]


Origin of Diffusion Change

Why did D of the product decrease? The origin of the D change was investigated using the kinetics. There are mainly two possible origins of the observed D change, as described in Sect. 8.3. One possible explanation is a conformational change of the protein, which leads to an increase in the interaction between the solvent and the protein. As demonstrated by the PYP reaction, D of the intermediate could be smaller than that of the ground-state species. Another possible explanation for the large reduction in D is the dimerization of the BLUF domain after the photoreaction. (Since AppABLUF already exists as a dimer in the ground state even in a very dilute solution [52, 53], the formation of the dimer in this case means tetramer formation in the signaling state. However, we call this process “dimerization” because it is a bimolecular reaction.)

To examine these possibilities, the TG signals at various AppABLUF concentrations were examined. If dimerization were the main cause of the difference in D, the reaction rate should be slower at a lower concentration. On the other hand, if a conformational change were responsible for the reduction in D, the temporal profile of the TG signal should not depend on concentration, apart from the absolute intensity. Under a low-$q^2$ condition ($q^2 = 3.9 \times 10^{10}$ m$^{-2}$), the temporal profile of the diffusion signal was relatively similar at any concentration. At this low $q^2$, the diffusion peak was reproduced by a bi-exponential function with $D_R = 8.8 \times 10^{-11}$ m$^2$ s$^{-1}$ and $D_P = 7.2 \times 10^{-11}$ m$^2$ s$^{-1}$ after 80 ms at any concentration. Therefore the final product should be the same at all concentrations after a sufficiently long time.

On the other hand, on a fast timescale, the temporal profile of the TG signals changed very drastically with the concentration. The signal became an approximately single-exponential decay as the concentration decreased (Fig. 8.7). Considering that the diffusion peak arises as a result of the difference between $D_R$ and $D_P$, one may understand that the nearly single-exponential behavior indicates a small change in D in this time range. As $D_R$ and the final $D_P$ are always constant, as shown above, the small change in D should be interpreted in terms of a slower rate of change in $D_P$ with decreasing concentration. This single-exponential behavior provided us with another important piece of information; i.e., D of the initially created product was similar to $D_R$ [$D_I = D_R$ in (8.6)]. This concentration dependence of the TG profile and the 1.22-fold decrease in D (i.e., about a twofold increase in molecular volume) in the product state support the dimerization mechanism in the excited state of this protein.

For producing the dimer, there may be two possible reaction schemes: the phototransformed AppABLUF (AppABLUF*) associates with the ground-state AppABLUF to yield a dimer (Scheme 3), or two AppABLUF* form the dimer (Scheme 4).

Scheme 3: AppABLUF* + AppABLUF → (AppABLUF*–AppABLUF),
Scheme 4: AppABLUF* + AppABLUF* → (AppABLUF*)₂.


Fig. 8.7. Concentration dependence of the TG signals of AppABLUF at q² = 1.3 × 10¹² m⁻². The arrow indicates the increase of the concentration: 0.95, 0.48, 0.31, and 0.17 mM (from upper to lower curves). The gray lines are the best-fitted curves by (8.7). (Plot axes: I_TG / a.u. versus t / s, from 0 to 0.012 s.)

These possibilities were distinguished by measuring the laser power dependence of the rate constant. If the concentration of AppABLUF is high enough compared to that of AppABLUF*, the reaction of Scheme 3 can be represented as a pseudo-first-order reaction, and the rate constant of this reaction should be essentially independent of the laser power. On the other hand, the reaction of Scheme 4 should be second order in the phototransformed AppA, so that the rate depends on the laser power; that is, the profile should change with the laser power. From the laser power dependence, it was concluded that the photoexcited AppABLUF (AppABLUF*) associates with the ground-state AppABLUF to yield the dimer.
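The argument for separating the two schemes by laser power can be illustrated with the textbook half-lives of the excited species. This is only a sketch: the rate constant and all concentrations below are hypothetical, the ground-state concentration in Scheme 3 is treated as constant, and the initial concentration of excited protein is assumed to grow with laser power.

```python
import math

# All numbers are HYPOTHETICAL and serve only to illustrate the argument.
K_BI = 2.5e5          # bimolecular rate constant, M^-1 s^-1
GROUND_CONC = 5.0e-4  # ground-state protein concentration, M


def half_life_scheme3(ground_conc: float) -> float:
    """Pseudo-first-order loss of A* via A* + A -> dimer, with [A] >> [A*]."""
    return math.log(2) / (K_BI * ground_conc)


def half_life_scheme4(excited_conc0: float) -> float:
    """Second-order loss of A* via A* + A* -> dimer, d[A*]/dt = -2 k [A*]^2."""
    return 1.0 / (2.0 * K_BI * excited_conc0)


# A higher laser power is modeled as a larger initial [A*].
for excited0 in (1.0e-5, 2.0e-5, 4.0e-5):
    t3_ms = half_life_scheme3(GROUND_CONC) * 1e3
    t4_ms = half_life_scheme4(excited0) * 1e3
    print(f"[A*]0 = {excited0:.1e} M: Scheme 3 {t3_ms:.2f} ms, Scheme 4 {t4_ms:.1f} ms")
```

In this sketch the Scheme 3 half-life does not depend on the initial excited-state concentration, whereas the Scheme 4 half-life halves each time that concentration doubles, which is the behavior the power dependence is meant to reveal.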

Kinetics of Dimer Formation

The dimer formation rate k was determined by fitting the TG signal at various concentrations using (8.7). The rate constant k decreased as the concentration decreased. From the slope of the plot of k vs. concentration and the relation k = k_i[AppA], we determined the second-order rate constant k_i to be ∼2.5 × 10⁵ M⁻¹ s⁻¹. Interestingly, this value is much smaller than that of a diffusion-controlled reaction (∼10⁹ M⁻¹ s⁻¹) calculated by the Smoluchowski–Einstein equation for a bimolecular reaction in solution [55]. This difference indicated that the collision between two protein molecules is not the sole criterion for the aggregation process; i.e., their relative orientations impose additional constraints, which slow down the reaction by nearly four orders of magnitude.
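The slope analysis can be sketched as a least-squares fit of k = k_i[AppA] through the origin. The concentrations below are those listed for Fig. 8.7, but the rate values are hypothetical placeholders chosen for illustration, not measured data; the diffusion-controlled value is the order-of-magnitude figure quoted above.

```python
import math

# Concentrations follow Fig. 8.7; the rates are HYPOTHETICAL placeholders.
conc_M = [0.95e-3, 0.48e-3, 0.31e-3, 0.17e-3]  # [AppA], mol/L
k_obs = [240.0, 118.0, 79.0, 42.0]             # assumed fitted rates, s^-1


def slope_through_origin(x, y):
    """Least-squares slope for the model y = slope * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


k_i = slope_through_origin(conc_M, k_obs)  # second-order rate constant, M^-1 s^-1
k_diffusion = 1.0e9                        # order of magnitude quoted in the text

orders = math.log10(k_diffusion / k_i)
print(f"k_i = {k_i:.2e} M^-1 s^-1")
print(f"slower than the diffusion-controlled value by {orders:.1f} orders of magnitude")
```

Forcing the fit through the origin reflects the relation k = k_i[AppA]; a fit with a free intercept would be a reasonable alternative if a concentration-independent contribution to k were suspected.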


This photoinduced dimer finally dissociates to the original species, because the TG signal is reproducible when the repetition rate of the excitation is low enough. This leads to the conclusion that there is no covalent bond formation in the aggregated state. This was the first report showing the dimerization rate of photosensor proteins in the short-lived signaling process. Later, the origin of the photoinduced association was attributed to the exposure of the hydrophobic surface by the initial reaction [56].

8.4.3 Photoinduced Dimerization and Dissociation of Phototropins

Dimerization Reaction

In the previous sections, protein–protein association reactions were described. However, not only association but also dissociation has been reported for a photosensor protein; phototropins are a unique system because association and dissociation reactions upon photoexcitation are observed simultaneously [57, 58]. Phototropins (phot1 and phot2) are blue light receptors in higher plants that regulate phototropism, chloroplast relocation, and stomatal opening [59]. All of these are major regulation mechanisms of photosynthetic activity. Both proteins, phot1 and phot2, are homologous flavoproteins and contain two LOV (light–oxygen–voltage sensing) domains (LOV1 and LOV2), a typical serine/threonine kinase at the C-terminus, and one linker region connecting the LOV2 and the kinase domains; they act as light-regulated protein kinases [60]. Both LOV domains bind a flavin mononucleotide (FMN) as chromophore [61]. The mechanism and the kinetics of the reaction have been attracting much attention recently [62–69].

The reaction kinetics has mainly been studied by monitoring the absorption change of the chromophore [63–66]. Upon blue light illumination, the ground-state LOV2 possessing the absorption maximum at 447 nm (D447) is converted to a species with a broad absorption spectrum (L660) [67]. This change is attributed to the creation of the excited triplet state through intersystem crossing from the photoexcited singlet state. This broad spectrum changes to a blue-shifted absorption spectrum peaked at 390 nm (S390) with a lifetime of 4 μs (for phot1LOV2 of Avena) [67]. This species was assigned to the FMN–cysteinyl adduct, in which the sulfur covalently binds to the C(4a) carbon of the isoalloxazine ring of FMN. This adduct is stable for tens of seconds before returning to the ground state (Fig. 8.8) [68].

The assignment of this product has been confirmed by NMR and X-ray crystallography [69, 70]. It is believed that this state is the signaling state. Therefore, as long as the reaction kinetics is monitored by UV–vis spectroscopy, the signaling state is formed with a lifetime of a few microseconds and no significant change has been reported after this process. Photochemical reactions of phot1 and phot2 were studied by the TG method and a significant change in the association state was observed mainly for phot1LOV2.


(Species shown in the scheme: D447, L660, S390, S390 + D447, and S390–D447; excitation by hν.)

Fig. 8.8. Photochemical reaction scheme of the phototropin LOV domain (phot1LOV2). If the reaction is monitored by the flash photolysis method, the S390 intermediate directly returns to the ground state (broken line). However, using the diffusion detection method, the dimerization reaction takes place after the formation

Fig. 8.9. A TG signal (broken line) of phot1LOV2 at 50 μM and q² = 3.4 × 10¹⁰ m⁻². The best-fitted curve to the observed TG signal based on the two-state model (8.6) is shown by the solid line. The assignments of the signal components are shown. (Plot axes: I_TG / a.u. versus t / s, logarithmic time axis from 10⁻⁶ to 10⁰ s. Labeled components: adduct formation, thermal diffusion, reactant diffusion, product diffusion.)

A typical TG signal of a phot1LOV2 domain observed at 50 μM and at q² = 3.4 × 10¹⁰ m⁻² is shown in Fig. 8.9. The signal consisted of a rapid decay in microseconds, a subsequent rise and decay, and a peak in the time region longer than milliseconds. The TG signal in the whole time range was expressed by [57, 58, 71, 72]

$$I_{\mathrm{TG}}(t) = \alpha\left[\delta n_1 \exp(-k_1 t) + \delta n_2 \exp\left(-D_{\mathrm{th}} q^2 t\right) + \delta n_{\mathrm{spe}}(t)\right]^2, \qquad (8.10)$$


where k_1 > k_2. The time constant of the faster decay, 1/k_1, was determined to be 1.9 μs. This value did not depend on q². On the basis of a comparison with rate constants reported before, the 1.9 μs dynamics was attributed to the conversion process from D447 to S390. The second term represented the thermal grating term. The third term, δn_spe(t), represented the species grating signal appearing in the longer time region, and this δn_spe(t) signal reflected the chemical reaction kinetics as well as the molecular diffusion process. The temporal profile of this part depended on the q² value and the concentration in complex ways. At a low concentration ([LOV] = 50 μM), the signal after the thermal grating decayed monotonically to the baseline in the high-q² range (q² > 5 × 10¹² m⁻²) (Fig. 8.10). This decay was expressed by a single-exponential function:

$$\delta n_{\mathrm{spe}}(t) = \delta n_3 \exp(-k_3 t). \qquad (8.11)$$

Fig. 8.10. Grating wavenumber (q) dependence of the TG signals (broken lines) of a 50 μM phot1LOV2 solution. The arrow indicates the increase of q. The q² values are 4.5 × 10¹⁰, 7.3 × 10¹⁰, 3.4 × 10¹¹, 6.3 × 10¹¹, and 5.3 × 10¹² m⁻² in the order of the amplitude. The signals representing the molecular diffusion processes are shown, and these signals are normalized at the initial part of the diffusion signal. (Plot axes: I_TG / a.u. versus t / s, logarithmic time axis from 10⁻⁴ to 10⁰ s.)

Since this rate constant depended on the q² value (e.g., Fig. 8.10), this component certainly originated from the molecular diffusion process. If a product was formed by the photoexcitation, the molecular diffusion of the reactant and the product should be observed. This single-exponential decay at a high q² indicated that the D values of the reactant (D447) and the product (S390) were the same (D_R = D_P); i.e., D did not change upon the reaction in this observation time range. From the rate constant of the exponential fitting and the q² value, D (= D_R = D_P) was calculated to be 9.8 × 10⁻¹¹ m² s⁻¹. Since D is one of the quantities that represent the global molecular structure of proteins, this fact of D_R = D_P suggested that phot1LOV2 does not change its conformation significantly upon photoreaction within a time range of approximately 1 ms.
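The conversion between the fitted decay rate and D can be written out explicitly. The sketch below assumes that the species grating decays as exp(−Dq²t), by analogy with the thermal grating term exp(−D_th q²t) in (8.10), so that k_3 = Dq²; the function names are illustrative, while D and the q² values are those quoted in the text and in the caption of Fig. 8.10.

```python
D = 9.8e-11  # m^2 s^-1, value reported in the text for D_R = D_P

# q^2 values listed in the caption of Fig. 8.10, in m^-2
q2_values = [4.5e10, 7.3e10, 3.4e11, 6.3e11, 5.3e12]


def diffusion_rate(d_coeff: float, q2: float) -> float:
    """Decay rate k = D * q^2 (s^-1) of the refractive-index grating.

    The detected intensity is the square of the grating, so it decays at 2k.
    """
    return d_coeff * q2


def diffusion_coefficient(rate: float, q2: float) -> float:
    """Invert k = D * q^2 to recover D from a fitted decay rate."""
    return rate / q2


for q2 in q2_values:
    k = diffusion_rate(D, q2)
    print(f"q^2 = {q2:.1e} m^-2 -> k = {k:7.2f} s^-1, 1/k = {1e3 / k:7.2f} ms")
```

Under this assumption the diffusion time constant at the highest q² is about 2 ms, whereas at the lowest q² it exceeds 200 ms. This illustrates how the choice of q² selects the time window in which a change in D can be observed.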

The temporal profile changed at a relatively low q² condition (Fig. 8.10); a growth–decay signal (diffusion peak) appeared. Similar to the results described in the previous sections, the rise and decay components of the TG signal were attributed to the molecular diffusion processes of the reactant [ground-state protein (D447)] and the photoproduct, respectively; i.e., the faster rate of the rising component than the rate of decay indicated that the product diffuses more slowly than the reactant (D_R > D_P) in this time range.
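How a diffusion peak arises from two diffusing species can be sketched with a simplified bi-exponential model. The sketch assumes that the product already has its final D_P (the reaction kinetics contained in (8.6) are ignored), that the reactant and product contribute with opposite signs, and that their amplitudes are equal; the amplitudes and function names are hypothetical, while D_R, D_P, and q² are taken from the text.

```python
import math

D_R = 9.8e-11  # reactant diffusion coefficient from the text, m^2 s^-1
D_P = 8.0e-11  # final product diffusion coefficient from the text, m^2 s^-1
Q2 = 4.5e10    # lowest q^2 listed for Fig. 8.10, m^-2
DN_R = 1.0     # ASSUMED reactant amplitude (arbitrary units)
DN_P = 1.0     # ASSUMED product amplitude (arbitrary units)


def species_signal(t, d_r=D_R, d_p=D_P, q2=Q2):
    """Squared species grating: product term minus reactant term."""
    dn = DN_P * math.exp(-d_p * q2 * t) - DN_R * math.exp(-d_r * q2 * t)
    return dn * dn


def peak_time(d_r=D_R, d_p=D_P, q2=Q2):
    """Analytic peak position for equal amplitudes; requires d_r != d_p."""
    k_r, k_p = d_r * q2, d_p * q2
    return math.log(k_r / k_p) / (k_r - k_p)


t_peak = peak_time()
for t in (0.0, 0.5 * t_peak, t_peak, 2.0 * t_peak, 5.0 * t_peak):
    print(f"t = {t:6.3f} s  I = {species_signal(t):.5f}")
```

In this model the faster-diffusing reactant produces the rise and the slower product produces the decay, so the peak vanishes as D_P approaches D_R. This matches the reading of the single-exponential decay as evidence for D_R = D_P.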

The drastic change of the profile depending on q² was rationalized by the time dependence of D. The temporal profile of the TG signal was analyzed using (8.6). For analyzing the signal, some of the parameters were independently determined. First, D_R was fixed at 9.8 × 10⁻¹¹ m² s⁻¹, which was obtained from the high-q² signal (Fig. 8.10). The determined D_R of phot1LOV2 is a typical value for a protein of this size. This fact suggested that phot1LOV2 existed in a monomeric form in the solution at this concentration. Second, as noted above, the final D_P was determined to be D_P = 8.0 × 10⁻¹¹ m² s⁻¹ from the signal in a long time range. By using these parameters, the observed TG signal was reproduced very well at various q² values using a single reaction rate k. The time constant of the change determined from the fitting is 40 ms at 50 μM. The photoreaction process with the lifetime of 1.9 μs accompanying the adduct formation (S390) should be a trigger for this diffusion change.

Possible explanations for the reduction of D were a dimerization reaction of the monomeric phot1LOV2 or a conformational change upon the photoreaction. The origin of the change of D was investigated through the concentration dependence. In the q² range below 7.0 × 10¹⁰ m⁻², i.e., in a relatively long time region for the diffusion signal, the temporal profiles were rather insensitive to the concentration, and they were reproduced well by a bi-exponential function with D_R = 9.8 × 10⁻¹¹ m² s⁻¹ and D_P = 8.0 × 10⁻¹¹ m² s⁻¹ after 200 ms. Therefore, the product with the final D_P was independent of the concentration at least after 200 ms. On the other hand, in a middle q² range (q² = 6.3 × 10¹¹ m⁻²), the temporal profiles depended significantly on the concentration. In particular, the relative intensity of the diffusion peak with respect to the thermal grating intensity decreased with decreasing concentration (Fig. 8.11). Considering that the diffusion peak appeared as a result of the difference between D_P and D_R, one may find that the change in D_P is smaller in this time range for a dilute sample. This change should be due to the slower rate of the D_P change with decreasing concentration. This concentration dependence of the rate indicated that more than one molecule is involved in the D change process. The 1.8-fold increase in the molecular volume suggested that dimerization is a cause of the D change.

From the laser power dependence, the reaction scheme was written as

$$\mathrm{LOV} \xrightarrow{h\nu} \mathrm{LOV}^{*} \xrightarrow{k} (\mathrm{LOV}^{*}\text{–}\mathrm{LOV}),$$

where k is a bimolecular reaction rate and may be written as k_2[LOV], where k_2 is the intrinsic bimolecular reaction rate constant and [LOV] is the concentration of phot1LOV2. This scheme is identical to Scheme 2. The very good fit of the observed signal by (8.7) implies that the above Scheme 2 is appropriate to describe the dimerization process.

Fig. 8.11. Concentration dependence of the TG signal (broken lines) measured at q² = 6.3 × 10¹¹ m⁻² with the concentrations of 40, 60, 70, 80, 120, and 190 μM in the order of the concentration increase shown by the arrow. The signals are normalized at the initial part of the diffusion signal. The smooth solid lines are the best-fitted curves. (Plot axes: I_TG / a.u. versus t / s, from 0 to 0.12 s.)

From the slope of the plot of k against [LOV], k_2 is determined to be 6.6 × 10⁵ M⁻¹ s⁻¹. This value is much smaller than the diffusion-limited reaction rate constant calculated from D_R and the reaction distance [55]. This small k_2 suggests that the dimerization reaction occurs only at a specific relative orientation of two phot1LOV2 monomers.
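The size of the gap to the diffusion limit can be estimated with the Smoluchowski expression k_d = 4πN_A(D_A + D_B)R*. The sketch below uses D_R and k_2 from the text, but the reaction distance R* is a hypothetical value chosen only for illustration, so the resulting ratio should be read as an order-of-magnitude estimate.

```python
import math

AVOGADRO = 6.02214076e23    # mol^-1
D_R = 9.8e-11               # monomer diffusion coefficient from the text, m^2 s^-1
K2_MEASURED = 6.6e5         # second-order rate constant from the text, M^-1 s^-1
REACTION_DISTANCE = 4.0e-9  # HYPOTHETICAL encounter distance, m (not from the text)


def smoluchowski_rate(d_a: float, d_b: float, distance: float) -> float:
    """Diffusion-limited rate constant in M^-1 s^-1: 4*pi*N_A*(D_A + D_B)*R."""
    per_m3 = 4.0 * math.pi * AVOGADRO * (d_a + d_b) * distance  # m^3 mol^-1 s^-1
    return per_m3 * 1.0e3  # convert m^3 to litres


k_diff = smoluchowski_rate(D_R, D_R, REACTION_DISTANCE)
ratio = k_diff / K2_MEASURED
print(f"diffusion-limited estimate: {k_diff:.1e} M^-1 s^-1")
print(f"measured k_2 is smaller by a factor of about {ratio:.0f}")
```

With this assumed distance the estimate is several 10⁹ M⁻¹ s⁻¹, so the measured k_2 is lower by close to four orders of magnitude. This is consistent with the interpretation that only a small fraction of encounters occurs in a productive orientation.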

The light-induced dimer should eventually dissociate to return to the monomers, because no permanent change was observed. It may be reasonable to assume that the dimer dissociates when the photoadduct state of LOV2 goes back to the ground state. We should emphasize that this TG technique for the D measurement in the time domain has been the only technique that can detect such transient dimer formation.

Photodissociation Reaction

In the previous section, the protein association reaction upon photoexcitation was described; D_P was smaller than D_R. However, at a higher concentration, the opposite change was observed. Figure 8.12 depicts the concentration dependence of the signal in the concentration range 40–250 μM. When the concentration was low enough, the species grating signal decayed single-exponentially. This feature indicated that the molecular diffusion process was faster than the dimerization reaction on this timescale. When the concentration was increased, the signal showed the growth–decay feature (Fig. 8.12). The signs of δn of the rise and the decay components were, respectively, positive and negative, which was opposite to what we observed for the dilute sample. Therefore, the rising component was attributed to the diffusion of the product and the decay to that of the reactant. From the rates of the rise and decay components, one may readily find that the product diffuses faster than the reactant at high concentrations (D_R < D_P).

Fig. 8.12. Concentration dependence of the TG signals (broken lines) with the concentrations of 56, 110, 180, 200, and 300 μM (in the order of the arrow) measured at q² = 7.9 × 10¹² m⁻². The signals are normalized at the initial part of the diffusion signal. The best-fitted curves to the observed TG signals by the two-state model (8.7) are shown by the solid lines. (Plot axes: I_TG / a.u. versus t / ms, from 0 to 5 ms. Labeled components: reactant, product.)

The temporal profile was again fitted by (8.6). It was found that the signal was reproduced almost perfectly with D of the reactant at the low concentration (D_R = 8.0 × 10⁻¹¹ m² s⁻¹), D_I = D_R, D of the product (D_P = 9.8 × 10⁻¹¹ m² s⁻¹), and k⁻¹ = 300 μs. One should note that, from the results of the previous section, D of the dimer and the LOV monomer are 8.0 × 10⁻¹¹ and 9.8 × 10⁻¹¹ m² s⁻¹, respectively. Therefore, in these concentrated solutions, the reactant existed in a dimeric form and the product was a monomer. The observed TG signal indicated that the dimer dissociated to yield the monomer with a time constant of 300 μs upon photoexcitation. The reactions detected by this method are summarized in Fig. 8.13.

8.4.4 Diffusion Detection of Interprotein Interaction

In the previous sections, several examples were reviewed to show that the diffusion change is sensitive enough for the detection of protein–protein interaction. This method could be called the diffusion-detected biosensor method. Characteristic features of this method are discussed in the following.


Fig. 8.13. Schematic showing the photoreaction process of phot1LOV2 detected by TG: (a) light-induced association of two monomers and (b) light-induced dissociation of a dimeric form. (Scheme labels: LOV2 units, time constants of 1.9 μs, 300 μs, and 30 s, and the rate k_2[LOV].)

First, the most prominent characteristic of this method is the fast time response. The time response of this TG method is fast enough to detect transient protein association or protein dissociation reactions. This technique can be used for the measurement of the binding rate constant in real time. It should be noted that our TG technique sensitively monitors the refractive index change caused only by the creation of the photoexcited state, whereas gel chromatography monitors all proteins in the solution. It might be difficult to detect the dimer contribution among all the proteins by conventional gel chromatography unless the population of the dimer is dominant. Moreover, while covalently linked or stable noncovalently linked protein aggregates may be detected by size exclusion liquid chromatography, a noncovalent protein aggregate that is formed by a weak hydrophobic or hydrogen-bond interaction may not be detectable because of possible dissociation during elution through the column.

Second, not only protein–protein interaction but any intermolecular interaction that changes D can be detected. Protein association changes the radius of the diffusing species, which leads to changes in D. However, D is determined not only by these factors, but also by the conformation of the protein or the intermolecular interaction. This is a distinguishing characteristic compared with the SPR method, in which a refractive index change upon association is necessary. Since the binding of a small molecule to a protein may not change the refractive index, this process should be silent for the SPR method.

Third, compared with the SPR method, it is a big advantage that the target protein need not be fixed on a metal surface. The intermolecular interaction can be detected in the solution phase. Hence this method can be used conveniently without pretreatment of the sample, such as fixing on a metal surface. For example, the protein activity can be checked during the protein separation if this system is combined with a column chromatograph. Since we can avoid contact of the protein with a metal surface, any possible denaturation or inactivation by the surface can be avoided.

Fourth, since the diffusion coefficient is not sensitive to temperature fluctuations during the measurement, precise temperature control is not required. This merit may be very useful compared with the SPR technique, which is so sensitive to temperature that the sample temperature must be kept precisely constant during the measurement.

Fifth, solvent properties do not affect the measurement by this system at all. Hence, the solvent can be changed without any limitation. This is also an advantage over the SPR method, in which the refractive index of the solvent is an important property for the experiment.

We believe that these prominent characteristics of the diffusion-detected biosensor are important for studying intermolecular interaction of sensor proteins in the time domain and will be used in many cases to reveal their essential features.

Acknowledgments

The author is deeply indebted to the coauthors of the papers cited in this article.

References

1. B. Card, P.J.A. Erbel, K.H. Gardner, J. Mol. Biol. 353, 664 (2005)
2. H.J. Park, C. Suquet, J.D. Satterlee, C. Kang, Biochemistry 43, 2738 (2004)
3. K. Okajima, S. Yoshihara, Y. Fukushima, X. Geng, M. Katayama, S. Higashi, M. Watanabe, S. Sato, S. Tabata, Y. Shibata, S. Itoh, M. Ikeuchi, J. Biochem. 137, 741 (2005)
4. D.A. Schultz, Curr. Opin. Biotechnol. 14, 13 (2003)
5. J.M. McDonnell, Curr. Opin. Chem. Biol. 5, 572 (2001)
6. M. Fivash, E.M. Towler, R.J. Fisher, Curr. Opin. Biotechnol. 9, 97 (1998)
7. Z. Salamon, H.A. Macleod, G. Tollin, Biochim. Biophys. Acta 1331, 131 (1997)
8. I.L. Medintz, G.P. Anderson, M.E. Lassman, E.R. Goldman, L.A. Bettencourt, J.M. Mauro, Anal. Chem. 76, 5620 (2004)
9. E.L. Cussler, Diffusion (Cambridge University Press, Cambridge, 1997)
10. H.J.V. Tyrrell, K.R. Harris, Diffusion in Liquids (Butterworth, London, 1984)
11. G.I. Taylor, Proc. Roy. Soc. A 219, 186 (1953)
12. K.M. Berland, Methods Mol. Biol. 261, 383 (2004)
13. R. Pecora, Dynamic Light Scattering (Plenum, London, 1985)
14. N. Baden, M. Terazima, Chem. Phys. Lett. 393, 539 (2004)
15. M. Terazima, N. Hirota, J. Chem. Phys. 98, 6257 (1993)
16. M. Terazima, K. Okamoto, N. Hirota, J. Phys. Chem. 97, 13387 (1993)
17. M. Terazima, K. Okamoto, N. Hirota, J. Chem. Phys. 102, 2506 (1995)
18. K. Okamoto, M. Terazima, N. Hirota, J. Chem. Phys. 103, 10445 (1995)
19. M. Terazima, Acc. Chem. Res. 33, 687 (2000)
20. H.J. Eichler, P. Gunter, D.W. Pohl, Laser Induced Dynamic Gratings (Springer, Berlin, 1986)
21. M. Terazima, Adv. Photochem. 24, 255 (1998)
22. M. Terazima, J. Photochem. Photobiol. C 3, 81 (2002)
23. M. Terazima, Phys. Chem. Chem. Phys. 8, 545 (2006)
24. M. Terazima, N. Hirota, S.E. Braslavsky, A. Mandelis, S.E. Bialkowski, G.J. Diebold, R.J.D. Miller, D. Fournier, R.A. Palmer, A. Tam, Pure Appl. Chem. 76, 1083 (2004)
25. S. Nishida, T. Nada, M. Terazima, Biophys. J. 87, 2663 (2004)
26. T. Nada, M. Terazima, Biophys. J. 85, 1876 (2003)
27. S. Nishida, T. Nada, M. Terazima, Biophys. J. 89, 2004 (2005)
28. K. Takeshita, N. Hirota, Y. Imamoto, M. Kataoka, F. Tokunaga, M. Terazima, J. Am. Chem. Soc. 122, 8524 (2000)
29. K. Takeshita, Y. Imamoto, M. Kataoka, F. Tokunaga, M. Terazima, Biochemistry 41, 3037 (2002)
30. K. Takeshita, Y. Imamoto, M. Kataoka, K. Mihara, F. Tokunaga, M. Terazima, Biophys. J. 83, 1567 (2002)
31. J.S. Khan, Y. Imamoto, M. Harigai, M. Kataoka, M. Terazima, Biophys. J. 90, 3686 (2006)
32. Y. Hoshihara, Y. Imamoto, M. Kataoka, F. Tokunaga, M. Terazima, Biophys. J. 94, 2187 (2008)
33. T.E. Meyer, Biochim. Biophys. Acta 806, 175 (1985)
34. G.E.O. Borgstahl, D.R. Williams, E.D. Getzoff, Biochemistry 34, 6278 (1995)
35. W.D. Hoff, P. Dux, K. Hard, B. Devreese, I.M. Nugteren-Roodzant, W. Crielaard, R. Boelens, R. Kaptein, J. van Beeumen, K.J. Hellingwerf, Biochemistry 33, 13959 (1994)
36. R. Kort, H. Vonk, X. Xu, W.D. Hoff, W. Crielaard, K.J. Hellingwerf, FEBS Lett. 382, 73 (1996)
37. U.K. Genick, G.E.O. Borgstahl, K. Ng, Z. Ren, C. Pradervand, P.M. Burke, V. Srajer, T. Teng, W. Schildkamp, D.E. McRee, K. Moffat, E.D. Getzoff, Science 275, 1471 (1997)
38. R. Brudler, R. Rammelsberg, T.T. Woo, E.D. Getzoff, K. Gerwert, Nat. Struct. Biol. 8, 265 (2001)
39. W.D. Hoff, I.H.M. van Stokkum, H.J. van Ramesdonk, M.E. van Brederode, A.M. Brouwer, J.C. Fitch, T.E. Meyer, R. van Grondelle, K.J. Hellingwerf, Biophys. J. 67, 1691 (1994)
40. P. Dux, G. Rubinstenn, G.W. Vuister, R. Boelens, F.A.A. Mulder, K. Hard, W.D. Hoff, A.R. Kroon, W. Crielaard, K.J. Hellingwerf, R. Kaptein, Biochemistry 37, 12689 (1998)
41. Y. Imamoto, H. Koshimizu, K. Mihara, O. Hisatomi, T. Mizukami, K. Tsujimoto, M. Kataoka, F. Tokunaga, Biochemistry 40, 4679 (2001)
42. G. Rubinstenn, G.W. Vuister, F.A.A. Mulder, P. Dux, R. Boelens, K.J. Hellingwerf, R. Kaptein, Nat. Struct. Biol. 5, 568 (1998)
43. M.E. van Brederode, W.D. Hoff, I.H.M. van Stokkum, M. Groot, K.J. Hellingwerf, Biophys. J. 71, 365 (1996)
44. K.J. Hellingwerf, J. Hendriks, T. Gensch, J. Phys. Chem. A 107, 1082 (2003)
45. J.S. Khan, Y. Imamoto, Y. Yamazaki, M. Kataoka, F. Tokunaga, M. Terazima, Anal. Chem. 77, 6625 (2005)
46. P. Hazra, K. Inoue, W. Laan, K.J. Hellingwerf, M. Terazima, Biophys. J. 91, 654 (2006)
47. S. Masuda, C.E. Bauer, Cell 110, 613 (2002)
48. B.J. Kraft, S. Masuda, J. Kikuchi, V. Dragnea, G. Tollin, J.M. Zaleski, C.E. Bauer, Biochemistry 42, 6726 (2003)
49. M. Gauden, S. Yeremenko, W. Laan, I.H. van Stokkum, J.A. Ihalainen, R. van Grondelle, K.J. Hellingwerf, J.T. Kennis, Biochemistry 44, 3653 (2005)
50. S. Masuda, K. Hasegawa, T.A. Ono, Biochemistry 44, 1215 (2005)
51. W. Laan, T. Bednarz, J. Heberle, K.J. Hellingwerf, Photochem. Photobiol. Sci. 3, 1011 (2004)
52. S. Anderson, V. Dragnea, S. Masuda, J. Ybe, K. Moffat, C. Bauer, Biochemistry 44, 7998 (2005)
53. W. Laan, M. Gauden, S. Yeremenko, R. van Grondelle, J.T.M. Kennis, K.J. Hellingwerf, Biochemistry 45, 51 (2006)
54. R. Swaminathan, C.P. Hoang, A.S. Verkman, Biophys. J. 72, 1900 (1997)
55. P. Atkins, J. de Paula, Physical Chemistry (Oxford University Press, Oxford, 2004)
56. P. Hazra, K. Inoue, W. Laan, K.J. Hellingwerf, M. Terazima, J. Phys. Chem. B 112, 1494 (2008)
57. Y. Nakasone, T. Eitoku, D. Matsuoka, S. Tokutomi, M. Terazima, Biophys. J. 91, 645 (2006)
58. Y. Nakasone, T. Eitoku, D. Matsuoka, S. Tokutomi, M. Terazima, J. Mol. Biol. 367, 432 (2007)
59. E. Huala, P.W. Oeller, E. Liscum, I.S. Han, E. Larsen, W.R. Briggs, Science 278, 2120 (1997)
60. W.R. Briggs, E. Huala, Annu. Rev. Cell Dev. Biol. 15, 33 (1999)
61. J.M. Christie, P. Reymond, G.K. Powell, P. Bernasconi, A.A. Raibekas, E. Liscum, Science 282, 1698 (1998)
62. J.M. Christie, M. Salomon, K. Nozue, M. Wada, W.R. Briggs, Proc. Natl Acad. Sci. USA 96, 8779 (1999)
63. J.A. Jarillo, H. Gabrys, J. Capel, J.M. Alonso, J.R. Ecker, A.R. Cashmore, Nature 410, 952 (2001)
64. T. Kagawa, T. Sakai, N. Suetsugu, K. Oikawa, S. Ishiguro, T. Kato, Science 291, 2138 (2001)
65. T. Kinoshita, M. Doi, N. Suetsugu, T. Kagawa, M. Wada, K. Shimazaki, Nature 414, 656 (2001)
66. T.E. Swartz, S.B. Corchnoy, J.M. Christie, J.W. Lewis, I. Szundi, W.R. Briggs, R.A. Bogomolni, J. Biol. Chem. 276, 36493 (2001)
67. T. Kottke, J. Heberle, D. Hehn, B. Dick, P. Hegemann, Biophys. J. 84, 1192 (2003)
68. T.A. Schuttrigkeit, C.K. Kompa, M. Salomon, W. Rudiger, M.E. Michel-Beyerle, Chem. Phys. 294, 501 (2003)
69. J.T.M. Kennis, S. Crosson, M. Gauden, I.H.M. van Stokkum, K. Moffat, R. van Grondelle, Biochemistry 42, 3385 (2003)
70. E. Schleicher, R.M. Kowalczyk, C.W.M. Kay, P. Hegemann, A. Bacher, M. Fischer, R. Bittl, G. Richter, S. Weber, J. Am. Chem. Soc. 126, 11067 (2004)
71. T. Eitoku, Y. Nakasone, D. Matsuoka, S. Tokutomi, M. Terazima, J. Am. Chem. Soc. 127, 13238 (2005)
72. T. Eitoku, Y. Nakasone, K. Zikihara, D. Matsuoka, S. Tokutomi, M. Terazima, J. Mol. Biol. 371, 1290 (2007)


9

Volumetric Properties of Proteins and the Role of Solvent in Conformational Dynamics

C.A. Royer and R. Winter

Abstract. Walter Kauzmann stated in a review of protein thermodynamics that “volume and enthalpy changes are equally fundamental properties of the unfolding process, and no model can be considered acceptable unless it accounts for the entire thermodynamic behaviour” (Nature 325:763–764, 1987). While the thermodynamic basis for pressure effects has been known for some time, the molecular mechanisms have remained rather mysterious. We, and others in the rather small field of pressure effects on protein structure and stability, have attempted since that time to clarify the molecular and physical basis for the changes in volume that accompany protein conformational transitions, and hence to explain pressure effects on proteins. The combination of many years of work on a model system, staphylococcal nuclease and its large number of site-specific mutants, and the rather new pressure perturbation calorimetry approach has provided for the first time a fundamental qualitative understanding of ΔV of unfolding, the quantitative basis of which remains the goal of current work.

9.1 Introduction

The physical chemical properties of proteins inform their function and as such have been the object of intense investigation for over 50 years. Indeed, major progress in the understanding of protein structure, dynamics and thermodynamics, as well as their inter-relationships, has been made thanks to advances in experimental and computational approaches. Despite this gain in fundamental understanding, a complete description of the factors that control these properties has not been achieved. In particular, the characterization of the role of solvent in controlling protein conformational transitions and stability remains to be accomplished [1].

During the 1970s and 1980s the fundamental basis for the temperature dependence of protein stability and conformational changes was revealed [2]. Heat and cold denaturation were clearly attributed to the significant decrease in heat capacity upon folding, leading to entropy-driven unfolding at high temperature and enthalpy-driven unfolding at low temperature. The amount of hydrophobic surface area that is removed from interaction with water was shown to be proportional to the magnitude of the loss of heat capacity associated with disorder–order transitions involved in protein folding or function [3].

In contrast to the fairly complete understanding of the temperature dependence of protein conformation, insight into pressure effects on proteins has lagged behind. Although a few rather complete studies on the pressure dependence of protein stability appeared early on [4–6], the number of scientists working in the field of high pressure did not increase, and indeed, even diminished for a time. This pressure dependence of protein stability is based on the volume change associated with unfolding. A review in 2002 by one of the authors of the present review [7] provides a listing of a number of volume changes obtained in pressure studies reported in the literature over about 30 years. Given the long time period, the database is indeed quite small. Moreover, these volume changes were measured for several different proteins under completely different conditions of temperature and pH and some involved assumptions of nonzero compressibility changes. Hence, they are difficult if not impossible to compare. One clear observation is that at low temperature, the volume changes upon unfolding are invariably negative, with values ranging from just below zero to −185 ml mol⁻¹. Positive volume changes reported at high temperature or low pressure, however, have served to confuse the issue, and the volumetric properties of proteins have largely been considered inextricable.

In 1987 [8] and again in 1993 [9], it was pointed out that the hydrophobic liquid model could not be entirely adapted to protein folding, since it completely fails to explain the effects of pressure. Kauzmann points out that “volume and enthalpy changes are equally fundamental properties of the unfolding process, and no model can be considered acceptable unless it accounts for the entire thermodynamic behaviour.” In his “Reminiscences from a Life in Protein Physical Chemistry” [10], Kauzmann further states:

I continue to feel that the study of the volume changes in protein reactions is sorely neglected. They may be determined by dilatometry and by the effects of pressure on protein equilibrium constants. The results complement the results of the determination of enthalpy changes as measured by calorimetry and the effects of temperature on equilibrium constants. Much useful insight at the molecular level can be obtained from a knowledge of volume changes.

So, rather than follow the example of Kauzmann’s drunk [8], who searches for his keys under the light of the street lamp, despite having lost them in the dark, we have attempted over the past 15 years to shed new light on what he termed “the darkness of pressure studies.”

9.2 Thermodynamics

The early pressure unfolding studies cited above revealed all of the essential parameters for describing combined temperature–pressure effects. Hawley first demonstrated that protein unfolding p–T diagrams were elliptical in shape (Fig. 9.1). He analyzed the p–T diagrams using the following approximation, which incorporates changes upon unfolding of the basic thermodynamic parameters ΔH, ΔS, and ΔV as well as their temperature (ΔCp, Δα) and pressure (Δβ) dependences. The Gibbs energy difference between the denatured (unfolded) and native state, relative to some reference point T0, p0 (e.g., the unfolding temperature at 25 °C and ambient pressure), can be approximated, assuming a second-order Taylor series of ΔG(T, p) expanded with respect to T and p around T0, p0, as [6, 11]:

$$
\Delta G = \Delta G_0 + \Delta C_p\left[T\left(\ln\frac{T}{T_0} - 1\right) + T_0\right] + \Delta S\,(T - T_0) + \Delta V\,(p - p_0) + \Delta\alpha\,(T - T_0)(p - p_0) + \frac{\Delta\beta}{2}\,(p - p_0)^2. \tag{9.1}
$$

Fig. 9.1. Hypothetical general p–T phase diagram for two-state cooperative protein folding, according to (9.1). The stability decreases with increasing or decreasing temperature from the ΔS = 0 line and with increasing or decreasing pressure from the ΔV = 0 line. The shape of the ellipse depends very strongly on Δα and ΔCp. (Axes: p and T; labeled regions: native, denatured; labeled boundaries: pressure denaturation, heat denaturation, cold denaturation; lines ΔV = 0 and ΔS = 0.)

In particular, these early studies clearly demonstrated that the volume change upon unfolding (like the enthalpy change) is not constant with temperature and that, also like the enthalpy change, it changes sign, being rather large and negative at low temperature but becoming positive at higher temperatures. This temperature dependence of the volume change is due to Δα, the difference in thermal expansivity between the unfolded and the folded state. Despite this rather complete description, a profound understanding of the molecular contributions to the value of the volume change has remained elusive [7], and it has been our goal to describe these contributions to ΔV and its complete temperature dependence. Hence we have sought to understand Δα as well.¹
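The link between Δα and the sign change of the volume change can be made explicit by differentiating the expansion (9.1) with respect to pressure. The following is only a sketch under the assumption that ΔV, Δα, and Δβ in (9.1) are constants evaluated at the reference point (T0, p0); the symbol T* for the sign-change temperature is introduced here for illustration and does not appear in the original treatment.

$$
\Delta V(T,p) = \left(\frac{\partial \Delta G}{\partial p}\right)_T = \Delta V + \Delta\alpha\,(T - T_0) + \Delta\beta\,(p - p_0)
$$

$$
\Delta V(T^{*}, p_0) = 0 \quad\Longrightarrow\quad T^{*} = T_0 - \frac{\Delta V}{\Delta\alpha}
$$

Within this approximation, a negative ΔV at the reference point combined with a positive Δα places T* above T0, which is the pattern described in the text: a negative volume change at low temperature that shrinks in magnitude on heating and may eventually become positive.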

¹ We note here that while Δβ, the difference in compressibility between the unfolded and folded state, necessarily plays a role at high temperature and pressure, we have not undertaken a complete description to date, as these are not the conditions under which most pressure unfolding studies are carried out. Indeed we have found that over most of the temperature range, a difference in compressibility between the folded and unfolded states need not be invoked. Hence we have left this parameter for future consideration. We further note that reported positive ΔV values at low pressure [11, 12] are likely due to changes in spectroscopic observables due to simple isothermal compression of the folded state.

Fig. 9.2. Ribbon diagram of Snase, PDB 1EYO [13]. The single tryptophan is shown in dark gray and one of the residues for which a number of site-specific mutants have been studied, valine 66, is shown in black

To approach these issues, we have studied for several years a model protein system, staphylococcal nuclease (Snase, Fig. 9.2), that presents a number of advantages. First of all, Snase (as well as a very large number of site-specific mutants) has been widely studied in terms of structure using multiple techniques (NMR, crystallography and other spectroscopic approaches) and in thermal and chemical denaturation, both at equilibrium and in kinetic studies. Therefore a great deal of information is available (which will not be cited here). Secondly, Snase is a highly basic protein, evolved to hydrolyze nucleic acids, and as such presents a high positive surface charge that minimizes aggregation phenomena. This has been quite useful in high-pressure Fourier transform infrared (FTIR), small-angle X-ray scattering (SAXS), NMR and densitometry experiments as well as in pressure perturbation calorimetry (PPC), since these techniques require rather large concentrations of protein, 2–20 mg ml−1. In our hands, in contrast to Snase, many proteins fail to exhibit reversible thermodynamics under these conditions. Third, Snase at low temperature has a relatively large, negative volume change for unfolding (e.g., ∼−90 ml mol−1 at 4 °C), and the wild type presents marginal stability at ambient conditions (∼−5 to −6 kcal mol−1), rendering it rather pressure sensitive.


Fig. 9.3. Temperature dependence of Snase high pressure unfolding. (a) Temperature dependence of the absolute value of the volume change of unfolding (−ΔV/ml mol−1 versus T/°C) as measured by fluorescence (triangles) and FTIR (squares); (b) p–T phase diagram of Snase stability (p/MPa versus T/°C, with native and denatured regions) by fluorescence (triangles), FTIR (crosses) and SAXS (circles)

We determined several years ago the temperature dependence of the pressure unfolding of Snase [14] using fluorescence, FTIR and SAXS to build the p–T phase diagram (Fig. 9.3). These studies showed a clear decrease in the absolute value of the volume change for unfolding as a function of temperature, although the uncertainty in the recovered values of ΔV did not allow us to conclude unequivocally that the dependence is linear. Nonetheless, in the absence of any further information we assumed linearity and hence calculated from the slope the change in thermal expansivity between the folded and the unfolded state to be on the order of 1 ml mol−1 K−1.
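The slope calculation described above can be sketched in a few lines. This is a minimal illustration and not the analysis used in [14]: the data points are hypothetical values chosen to lie on a line with a slope of 1 ml mol−1 K−1, and the names `linear_fit`, `temps_c`, and `delta_v` are introduced here for illustration only.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (T / degC, Delta V / ml mol^-1) pairs, NOT measured data.
temps_c = [4.0, 10.0, 20.0, 30.0, 40.0]
delta_v = [-90.0, -84.0, -74.0, -64.0, -54.0]

# Under the linearity assumption, the slope of Delta V versus T is Delta alpha.
delta_alpha, dv_at_0c = linear_fit(temps_c, delta_v)

# Temperature at which the fitted line crosses Delta V = 0 (an extrapolation).
t_zero_c = -dv_at_0c / delta_alpha

print(f"Delta alpha ~ {delta_alpha:.2f} ml mol^-1 K^-1")
print(f"Extrapolated sign change near {t_zero_c:.0f} degC")
```

The zero crossing is an extrapolation well outside the range of the points used for the fit, so it is only as reliable as the linearity assumption itself.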

This value for Δα was in accord with the values reported for chymotrypsinogen [6] and metmyoglobin [5], and can clearly account for the change in sign of the volume change that may occur at high temperature. (Note that the slope of the p–T phase diagram for Snase becomes steeper at high temperature, but it never becomes positive, at least under these experimental conditions.) While these results confirmed the importance of the expansivity in defining the pressure dependence of protein stability, they did not bring much further insight into the molecular basis for such effects. Moreover, as in the earlier studies cited above, the values of ΔV and Δα were derived from analysis of spectroscopic data as a function of pressure and temperature according to a two-state unfolding model.

We thus felt it important to measure, directly, the quantities of interest, and hence undertook densitometric studies as a function of pressure and temperature using an ultra-high-sensitivity oscillating U-tube densitometer (Anton Paar, Graz, Austria) [15]. We were also able to calculate the decrease in volume upon unfolding by temperature at atmospheric pressure and by pressure at about 40 °C (arrow at 100 MPa in Fig. 9.4). The latter value (−55 ml mol−1) was in good agreement with the ΔV obtained from fitting the spectroscopic pressure-induced unfolding profiles to a two-state model, −52 ml mol−1 (assuming no significant change in isothermal compressibility between the two states).
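For readers unfamiliar with the two-state model mentioned above, the following sketch shows how a pressure unfolding profile follows from a constant ΔV when the compressibility difference is neglected. It is an illustration under stated assumptions, not the fitting procedure of [14]: the reference stability `DG0_J_MOL` is a hypothetical value, the function names are invented here, and only the ΔV of −52 ml mol−1 and the temperature of about 40 °C are taken from the text.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def delta_g_unfolding(p_mpa, dg0_j_mol, dv_ml_mol, p0_mpa=0.1):
    """Delta G(p) = Delta G0 + Delta V (p - p0), with Delta beta neglected.

    Conveniently, 1 ml mol^-1 x 1 MPa = 1 J mol^-1, so no unit factor is needed.
    """
    return dg0_j_mol + dv_ml_mol * (p_mpa - p0_mpa)


def fraction_unfolded(p_mpa, temp_k, dg0_j_mol, dv_ml_mol, p0_mpa=0.1):
    """Two-state population of the unfolded state, with K = exp(-Delta G / RT)."""
    dg = delta_g_unfolding(p_mpa, dg0_j_mol, dv_ml_mol, p0_mpa)
    k_unfold = math.exp(-dg / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)


def midpoint_pressure(dg0_j_mol, dv_ml_mol, p0_mpa=0.1):
    """Pressure at which Delta G(p) = 0, i.e. half of the protein is unfolded."""
    return p0_mpa - dg0_j_mol / dv_ml_mol


DG0_J_MOL = 3900.0  # hypothetical stability at ambient pressure (assumption)
DV_ML_MOL = -52.0   # spectroscopic two-state value quoted in the text
TEMP_K = 313.15     # about 40 degC

p_half = midpoint_pressure(DG0_J_MOL, DV_ML_MOL)
for p in (0.1, 50.0, p_half, 100.0, 150.0):
    fu = fraction_unfolded(p, TEMP_K, DG0_J_MOL, DV_ML_MOL)
    print(f"p = {p:6.1f} MPa  fraction unfolded = {fu:.3f}")
```

In an actual analysis, ΔG0 and ΔV would be adjusted to fit the measured spectroscopic signal rather than fixed in advance as they are here.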


Fig. 9.4. Specific volume of Snase (Vs/ml g−1) as a function of pressure (p/MPa) at 40 °C [15]. The protein is folded up to 50 MPa, and the slope up to that pressure is indicative of the isothermal compressibility of the folded state. The arrow (ΔV) at 100 MPa indicates the volume change of unfolding assuming constant compressibility of the folded state and nearly complete unfolding by 100 MPa. Unfortunately, the high-pressure densitometer was limited to 100 MPa, so the compressibility of the unfolded state could not be determined

Fig. 9.5. Specific molar volumes (V/ml mol−1 versus T/°C) of the folded (Vf) and unfolded (Vu) states of Snase as derived from densitometric measurements [15] (crosses, diamonds), pressure perturbation calorimetry [16] (open square), and spectroscopic high-pressure unfolding experiments [14] (filled squares). Dashed lines correspond to extrapolations

In Fig. 9.5 is shown the first, and to our knowledge, only direct experimental plot of the volume of both the folded and unfolded states of a protein. The densitometric studies yielded directly the volume V of Snase as a function of temperature for the folded state (below the transition temperature, crosses) and for the unfolded state (above the transition temperature, diamonds). It can be seen as well from Fig. 9.5 that the increase in V of the native state of Snase with temperature is not linear; indeed the folded state α decreases significantly as the temperature increases, while at high temperature the expansivity of the unfolded state appears to be a constant. Taking the values of ΔV obtained in Fig. 9.3a, we also calculated the V of the unfolded state at low temperature (filled squares), which to a first approximation appears to increase linearly over this temperature range as well, with approximately the same slope as over the high temperature range (extrapolated triangles). Thus, we have concluded that the expansivity of the unfolded state is, to a first approximation, temperature independent, while that of the folded state is not. Hence, Δα, the difference in expansivity between the two states, is most likely not constant. The crossed circle just below 50 °C corresponds to the V of the folded state calculated from the V of the unfolded state plus the volume change for folding obtained from PPC measurements of the volume change upon unfolding at the transition temperature [16]. Beyond this point, we do not know the value of the expansivity or the specific volume of the folded state. The dashed line represents the extrapolation of a polynomial fit to the curve at lower temperature.

Thus, the direct measurement of volumetric properties confirms the importance of the difference in thermal expansivity of the unfolded and folded states of Snase in determining the pressure dependence of the volume change. These studies also support the notion that the difference in compressibility is small and likely only contributes to the pressure dependence of the unfolding at high temperature. Our results also suggest that the difference in expansivity is probably not constant with temperature; and indeed we have no idea how Δα may depend upon pressure. Nonetheless, these results reinforce and expand the studies from the 1970s, and at least from a thermodynamic point of view, clear up to a significant extent the confusion that has surrounded the volumetric properties of proteins. However, they still do not provide insight into the molecular nature of volume changes and pressure effects.

9.3 Thermal Expansivity and ΔV

We can reasonably assume two major contributions to the difference in specific volume between the unfolded and folded states of a protein. The first contribution is that arising from the decrease in solvent-excluded volume when the tightly, but of course not perfectly, packed protein folded structure is disrupted. Water molecules enter this volume, thereby decreasing the overall volume of the protein–solvent system. The magnitude of this contribution is a specific property of the protein, both in its folded and unfolded state. The second contribution arises from the change in the volume of the water molecules that hydrate the newly exposed protein surface area, relative to their volume in the bulk. Much of our present understanding of the contribution of differential hydration volume has come from recent studies of model compounds and proteins based on PPC. This technique, developed by Brandts and coworkers [17] and recently reviewed by us [16, 18], is based on the measurement of the heat released or absorbed upon small (e.g., 0.5 MPa) pressure perturbations in a differential scanning calorimeter. The heat exchange is related to the entropy change (9.2). Taking the derivative with respect to pressure (9.3) and substituting the Maxwell relation (9.4) yields the expression for the heat change with pressure in terms of the thermal expansivity α (9.5). If a transition occurs, integrating the change in α over the temperature range (from T0 to Tf) of the transition yields the volume change for the transition [at that temperature (9.6)].

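$$
\mathrm{d}Q_{\mathrm{rev}} = T\,\mathrm{d}S. \tag{9.2}
$$

$$
\left(\frac{\partial Q_{\mathrm{rev}}}{\partial p}\right)_T = T\left(\frac{\partial S}{\partial p}\right)_T. \tag{9.3}
$$

$$
\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p. \tag{9.4}
$$

$$
\left(\frac{\partial Q_{\mathrm{rev}}}{\partial p}\right)_T = -T\left(\frac{\partial V}{\partial T}\right)_p = -TV\alpha, \qquad
\alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p = -\frac{\Delta Q_{\mathrm{rev}}}{TV\,\Delta p}. \tag{9.5}
$$

$$
\frac{\Delta V}{V} = \int_{T_0}^{T_f} \alpha\,\mathrm{d}T. \tag{9.6}
$$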

Thus, measurement of the heat exchange every degree or two along a differential scanning calorimetry (DSC) scan for a model compound or protein provides a direct measurement of the expansivity, and in the case of proteins, the volume change of unfolding at the folding transition temperature. Lin and coworkers [17] have measured the expansivity of individual amino acid side chains (by subtracting the value obtained for glycine) (Fig. 9.6a).
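Relations (9.5) and (9.6) translate directly into a short calculation. The sketch below is a literal transcription of those two equations and not the data treatment of a real PPC instrument: it ignores the sample-versus-reference cell subtraction and any baseline correction across the transition, all numerical values are hypothetical, and the function names are introduced here for illustration.

```python
def expansivity(delta_q_j, temp_k, volume_m3, delta_p_pa):
    """Thermal expansivity alpha (K^-1) from one pressure step, following (9.5).

    alpha = -Delta Q_rev / (T * V * Delta p), in SI units (J, K, m^3, Pa).
    """
    return -delta_q_j / (temp_k * volume_m3 * delta_p_pa)


def relative_volume_change(temps_k, alphas_per_k):
    """Trapezoidal estimate of the integral of alpha dT, i.e. Delta V / V in (9.6)."""
    if len(temps_k) != len(alphas_per_k):
        raise ValueError("temps_k and alphas_per_k must have the same length")
    total = 0.0
    for i in range(1, len(temps_k)):
        dt = temps_k[i] - temps_k[i - 1]
        total += 0.5 * (alphas_per_k[i] + alphas_per_k[i - 1]) * dt
    return total


# Hypothetical single perturbation: heat released on a 0.5 MPa pressure increase.
alpha_example = expansivity(
    delta_q_j=-0.15, temp_k=300.0, volume_m3=1.0e-6, delta_p_pa=5.0e5
)

# Hypothetical flat expansivity profile sampled every two degrees.
flat_temps_k = [320.0, 322.0, 324.0, 326.0, 328.0, 330.0]
flat_alphas = [1.0e-3] * len(flat_temps_k)
dv_over_v = relative_volume_change(flat_temps_k, flat_alphas)

print(f"alpha = {alpha_example:.2e} K^-1")
print(f"Delta V / V over the scan = {dv_over_v:.3f}")
```

Heat released on compression (negative ΔQ for positive Δp) gives a positive α, in line with the sign in (9.5).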

Lin and coworkers observed that the expansivity value for polar amino acids was large and positive at low temperature, and decreased dramatically between 5 °C and 50 °C. Quite the opposite was observed for nonpolar amino acid side chains, which exhibited a large negative expansivity at low temperature which increased dramatically between 5 °C and 50 °C. We have carried out similar studies following a host–guest scheme, in which we subtracted the expansivity measured for a glycine tripeptide from that of peptides in which the central glycine residue was substituted with the residue of interest (Fig. 9.6b). The relative magnitude of the results from these two studies is not the same, but the overall picture is similar. In our case, we have controlled very carefully for aggregation phenomena and we observe that the magnitude of the negative expansivities for the nonpolar amino acids is more or less proportional to their hydrophobicity (L > A > Q = M > F).

Note that the black line in Fig. 9.6a corresponds to the expansivity of pure water and that it exhibits a small negative value at low temperature. This observation helps to interpret the expansivity data. A negative expansivity means that the density increases upon heating. We know this is true for water at low temperature, since ice floats. We can use the same reasoning for the nonpolar amino acids. As their solutions in water are heated, hydrating waters are released to the bulk where they occupy a smaller partial molar volume, akin to ice melting. Hence we can conclude that at these low temperatures, the density of the waters hydrating the nonpolar residues is lower than in the bulk, or ice-like. While the Frank and Evans iceberg model has been highly controversial, these PPC results lend some support. Indeed, Kauzmann stated in his 1987 Nature article: “I still believe that the Frank and Evans iceberg model of 40 years ago is essentially correct . . .” [8]. In contrast, the large positive expansivity for the polar amino acids indicates a degree of “electrostriction” of the hydrating water molecules around polar moieties, leading to a higher density than that of the bulk. Upon heating, these molecules are released gradually into the less dense bulk, and hence lead to a large, positive expansivity.

Fig. 9.6. (a) PPC data taken from Lin and coworkers [17] for polar and nonpolar amino acids (calculated with respect to the signal obtained from glycine). (b) Similar studies obtained by us [18] for nonpolar amino acids using a host–guest approach

Proteins, being composed of a combination of polar and nonpolar moieties, more or less exposed to solvent depending upon their conformation, should exhibit expansivities that correspond, in part, to a weighted combination of the expansivities of these moieties. In addition to the hydration, which can contribute positively or negatively, we must consider for proteins the intrinsic expansivity of the protein structure itself. We and Brandts and coworkers [16–18] have measured the expansivity of a few model proteins, in particular Snase, under a variety of conditions. A typical protein PPC scan is shown in Fig. 9.7a.


Fig. 9.7. (a) PPC scan (α/10−3 K−1 versus T/°C) for Snase (2 mg ml−1) taken from Ravindra et al. [16] and (b) the (less accurate) expansivity calculated from the densitometry measurements of Seemann et al. [15]

It can be seen from Fig. 9.7 that the expansivity of Snase is rather large and positive at low temperature and that it decreases dramatically up to about 43 °C. At this point, the protein unfolds, and the enthalpy in the accompanying DSC scan peaks at 50 °C. Above 60 °C, the expansivity of the protein corresponds to that of the unfolded state, and between 60 °C and 70 °C it is rather constant. Moreover, the agreement between the PPC measurements and those obtained by densitometry is rather astounding. The expansivities of the folded state (populated at low temperature) and the unfolded state (populated at high temperature) are nearly identical using the two techniques. In both experiments, α for the folded state decreases dramatically with temperature, while that for the unfolded state is rather constant. The expansivity profile for the folded state of Snase resembles that obtained for polar amino acid residues, and this similarity is due to the fact that protein surfaces are rather polar. If the expansivity of the unfolded state is rather constant, as suggested by the extrapolation to low temperatures in Fig. 9.5, then one may conclude that this arises from the offset of the polar and hydrophobic surface areas that are exposed in the unfolded state.

From the PPC data one can reliably calculate the volume change of unfolding, ΔV (at the transition temperature), by integrating α over the unfolding transition as shown in (9.6). Under these conditions, we found it to be −19 ml mol−1. A linear extrapolation of the plot in Fig. 9.3 would place the value closer to −40 ml mol−1, but we do not know if the dependence is linear; indeed we suspect that Δα is not a constant. Moreover, the data in Fig. 9.3 were obtained from the analysis of high pressure data, and Δβ may play a role. In any case it is clear from this rather direct measurement of the volume change of unfolding that it is not positive at low pressure (0.5 MPa) and moderate temperature, at least in the case of Snase, in agreement with our experimental p–T diagram in Fig. 9.3. Thus the often-cited statement that the volume change for protein unfolding is negative at high pressure and positive at low pressure is not necessarily true, and likely quite often false.


As a means of understanding more clearly the determinants for α, Δα, and hence ΔV, we can ask the question as to how the expansivity profiles change as a function of solution conditions that modify protein stability. We investigated the PPC profiles of Snase as a function of the osmolyte, sorbitol, and the denaturant, urea. It has been amply demonstrated that these additives do not function through some hypothetical effect on water structure [19] only, but rather through either positive or negative interaction energies with the protein surface [20, 21], the peptide bond in the case of urea. Thus we can be reasonably sure that the differences observed in the PPC curves obtained in the presence of these additives arise from changes in the protein stability, structure or hydration.

It can be seen in Fig. 9.8 that the transition shifts, as expected, to higher temperature as a function of increasing osmolyte concentration. The ΔV decreases in absolute value from −19 to −5 ml mol−1. This is in part due to the increase in the transition temperature: because of a positive Δα (see Fig. 9.5), the volume difference between the unfolded and folded state decreases in absolute value. There may be a contribution of the effect of the osmolyte on the structure of the unfolded state as well. The value of α at low temperature increases with increasing osmolyte as a result of the preferential hydration effect. At high temperature, the differences in the expansivity of the bulk water and the hydrating water, as well as the decrease in the hydration interaction, lead to basically indistinguishable α values. The effect of urea on the PPC profiles is just the opposite. As expected, the temperature of the transition decreases, and the absolute value of ΔV increases from −19 to −56 ml mol−1. This again is due primarily to the effect of Δα, which increases the difference in volume between the two states as temperature decreases. The value of α at low temperature decreases significantly. This may be due to the decrease of hydration because of urea binding, or may involve the density differences between bound and bulk urea at low temperature.

Fig. 9.8. PPC (upper panels, α/10−3 K−1 versus T/°C) and DSC (lower panels, Cp/kJ mol−1 K−1 versus T/°C) profiles of Snase (4 mg ml−1) in phosphate buffer at pH 5.5. The effects of sorbitol (left panels) (0, 0.5, and 1.5 M, curves shifting to higher temperatures) and urea (right panels) (0, 0.5, 1.5, and 2.5 M, curves shifting to lower temperatures) were tested

9.4 Conclusions

The PPC studies carried out so far on proteins seem to suggest that their volumetric properties, and hence the effects of pressure on their structures and stabilities, can be largely explained by the differential hydration terms. For example, we have found recently (unpublished data) that partially destructured variants of Snase that expose more hydrophobic surface area to the solvent also exhibit lower values of α at low temperatures. However, other recent experiments in progress on hyperstable variants of Snase suggest that the stability and dynamics of the various states of the protein, in addition to the degree and type of hydration, may be crucial in determining the value of α as well. These studies have suggested that high stability and limited dynamics tend also to decrease the amplitude of α for the folded state at low temperature, indicating that the value of the expansivity for the folded state at low temperature results from a combination of surface hydration properties and structural flexibility. More experimental work on model compounds and on specific variants under a variety of conditions, in addition to computational approaches, will be necessary to quantify protein expansivity, which to our mind is essential to the molecular-level understanding of volume changes and pressure effects.

We have come to consider the volume change of unfolding at 4 °C as a standard value. At this low temperature, the differences in expansivity for the polar and nonpolar amino acid side chains are close to maximal. Since the expansivity (releasing water to the bulk) for polar and charged groups is large and positive, then moving water molecules from the bulk to hydrate newly exposed polar surface area leads to an increase in density or a decrease in volume. Just the opposite is true for nonpolar surface area exposed upon unfolding. Hence the ΔV at this low temperature can be considered to comprise the sum of negative values for the exposure of each polar moiety, positive values for the exposure of each nonpolar moiety, and the contribution of the disappearance of solvent-excluded volume upon disruption of the tertiary packing. Given that the protein interior contains most of the nonpolar amino acid side chains, and that disruption of the structure would expose this nonpolar surface area, one might expect that the result of the contributions from the exposed polar and nonpolar surface area could be a positive (or less negative) ΔV. This is not the case. Indeed, it is at these low temperatures that ΔV is found to be at its most negative. Hence, we propose that the difference in solvent-excluded volume is mainly responsible for the decrease in volume upon unfolding of proteins at low temperature, and that this contribution may indeed overcome a positive contribution from differential solvation. We must bear in mind that the magnitude of the difference in solvent-excluded volume depends both on the packing density of the folded state and the degree of disruption of the unfolded state, which is rather poorly characterized in most cases.

The folded state presents a relatively more polar surface area than the unfolded state, and it has a specific three-dimensional structure that imposes constraints on its expansion. Hence its expansivity decreases drastically with increasing temperature, whereas that of the unfolded state appears to be rather constant. Thus, as the temperature increases, the unfolded state expands much more efficiently than the folded state. This is why the difference in specific volume between the unfolded and folded state of proteins decreases with increasing temperatures and may even become positive. Indeed, we have observed in PPC experiments on a hyperstable variant of Snase (unpublished results) that under certain conditions the volume change for unfolding indeed becomes positive. Such an observation was possible because the unfolding temperature of the variant is considerably higher than that of the wild type. This leads us to suggest that at low temperature the defining contribution to ΔV comes mainly from excluded volume differences, and ΔV for unfolding is negative. In contrast, at high temperatures, differential solvation due to the increased exposed surface area of the unfolded state, in addition to its larger thermal volume linked to increased conformational dynamics, takes over and ΔV for unfolding eventually becomes positive.

After almost two decades of wandering around in “the darkness of the field of pressure effects on protein folding” we have come to understand, at least qualitatively, the underlying molecular contributions to the volumetric properties of the various states of proteins and how these change with temperature. We have yet to reach a quantitative understanding of these contributions. While we can calculate, for example from pressure-jump relaxation studies, the fractional change in hydration between the folded and transition state or the transition state and the unfolded state [22, 23], we cannot say how many water molecules are excluded from the protein surface in these transitions; nor can we predict volumetric properties from sequence and structure. Finally, we have yet to explore in detail the pressure effects on the volumetric properties of proteins. Despite these remaining challenges, it would appear that the light of a small candle may be making its way into the darkness. We are confident that further progress in understanding the volumetric properties of proteins will provide fundamental information in adaptation and evolution that will ultimately contribute to the multiple applications involving protein design and functional modulation.


References

1. Y. Levy, J.N. Onuchic, Annu. Rev. Biophys. Biomol. Struct. 35, 389 (2006)
2. P.L. Privalov, S.J. Gill, Adv. Protein Chem. 39, 191 (1988)
3. K.P. Murphy, E. Freire, Adv. Protein Chem. 43, 313 (1992)
4. J.F. Brandts, R.J. Oliveira, C. Westort, Biochemistry 9, 1038 (1970)
5. A. Zipp, W. Kauzmann, Biochemistry 12, 4217 (1973)
6. S.A. Hawley, Biochemistry 10, 2436 (1971)
7. C.A. Royer, Biochim. Biophys. Acta 1595, 201 (2002)
8. W. Kauzmann, Nature 325, 763 (1987)
9. K.A. Dill, Biochemistry 29, 7133 (1990)
10. W. Kauzmann, Protein Sci. 2, 671 (1993)
11. E.J. Fuentes, A.J. Wand, Biochemistry 37, 9877 (1998)
12. T.M. Li, J.W. Hook III, H.G. Drickamer, G. Weber, Biochemistry 15, 5571 (1976)
13. J. Chen, Z. Lu, J. Sakon, W.E. Stites, J. Mol. Biol. 303, 125 (2000)
14. G. Panick, G.J. Vidugiris, R. Malessa, G. Rapp, R. Winter, C.A. Royer, Biochemistry 38, 4157 (1999)
15. H. Seemann, R. Winter, C.A. Royer, J. Mol. Biol. 307, 1091 (2001)
16. R. Ravindra, C. Royer, R. Winter, Phys. Chem. Chem. Phys. 6, 1952 (2004)
17. L.N. Lin, J.F. Brandts, J.M. Brandts, V. Plotnikov, Anal. Biochem. 302, 144 (2002)
18. L. Mitra, N. Smolin, R. Ravindra, C. Royer, R. Winter, Phys. Chem. Chem. Phys. 8, 1249 (2006)
19. J.D. Batchelor, A. Olteanu, A. Tripathy, G.J. Pielak, J. Am. Chem. Soc. 126, 1958 (2004)
20. T. Arakawa, S.N. Timasheff, Biophys. J. 47, 411 (1985)
21. M. Auton, L.M. Holthauzen, D.W. Bolen, Proc. Natl. Acad. Sci. USA 104, 15317 (2007)
22. L. Brun, D.G. Isom, P. Velu, B. Garcia-Moreno, C.A. Royer, Biochemistry 45, 3473 (2006)
23. L. Mitra, K. Hata, R. Kono, A. Maeno, D. Isom, J.B. Rouget, R. Winter, K. Akasaka, B. Garcia-Moreno, C.A. Royer, J. Am. Chem. Soc. 129, 14108 (2007)


10 A Statistical Mechanics Theory of Molecular Recognition

T. Imai, N. Yoshida, A. Kovalenko, and F. Hirata

Abstract. A novel theoretical approach to the molecular recognition process in protein is presented, based on the statistical mechanics of molecular liquids, or the reference interaction site model/three-dimensional reference interaction site model (RISM/3D-RISM) theory. The method requires just the structure of protein and the potential energy parameters for the biomolecule and solutions as inputs. The calculation is carried out in two steps. The first step is to obtain the pair correlation function for solutions consisting of water and ligands based on the RISM theory. Then, given the pair correlation functions prepared in the first step, we calculate the 3D-distribution functions of water and ligands around and inside protein based on the 3D-RISM theory. The molecular recognition of a ligand by the protein is realized by the 3D-distribution functions: if one finds some conspicuous peaks in the distribution of a ligand inside the protein, then the ligand is regarded as “recognized” by the protein. Some molecular recognition processes of small ligands, including water, noble gases, and ions, by a protein are presented in this chapter. The relation of the molecular recognition process to the pressure denaturation of protein is also discussed.

10.1 Introduction

Life phenomena are a series and a network of chemical reactions, which are regulated by genetic information inherited from generation to generation. The genetic information itself is generated and transmitted by a series of chemical processes [1]. In each of those reactions, a characteristic process takes place that distinguishes biochemical reactions from ordinary chemical reactions in solution. The process is commonly referred to as “molecular recognition (MR).” For example, for an enzymatic reaction to occur, the substrate molecules must first be accommodated by the protein in its reaction pocket to form the so-called enzyme-substrate (ES) complex [2]. The MR process is extremely selective and specific at the atomic level, and that selectivity and specificity are the key for living systems to maintain their life. Imagine what happens if a calcium-binding protein erroneously binds, say, a potassium ion. In that respect, MR is an elementary process of life phenomena.

The MR process can be defined as a molecular process in which one or a few guest molecules are bound with high probability at a particular site, a cleft or a cavity, of a host molecule in a particular orientation. In this regard, MR is a molecular process determined by specific interactions between atoms in the host and guest molecules. On the other hand, the process is a thermodynamic process as well, which concerns the chemical potential or free energy of guest molecules in the recognition site and in the bulk solution. As an example, let us consider the binding of a substrate molecule at a reaction pocket of a host protein. Usually, the reaction pocket is likely to be filled with one or a few water molecules when there is no substrate. For a substrate molecule to enter the reaction pocket, one or more of the water molecules must be expelled from the pocket, while the substrate molecule itself must be partially or entirely dehydrated. The free energy changes associated with these processes are commonly called the “dehydration penalty.” When a guest molecule enters a cleft or a cavity of a host molecule, it has to overcome a high entropy barrier, because the space or the degrees of freedom allowed to the guest molecule are so small compared with those in the bulk solution. The conformation of the host molecule should fluctuate to accommodate the guest molecule dynamically. The hinge-bending motion of a protein to accommodate a ligand is an example of such induced fit. The conformational fluctuation of biomolecules is also driven by the free energy.

The reason why the MR process is so challenging for any theoretical method lies in the fact that it is a “molecular process” governed by “thermodynamic laws.” The “docking simulation” often employed in drug design essentially uses a trial-and-error scheme to find a “best-fit complex” of host and guest molecules based on geometrical and/or energetic criteria [3,4]. However, the best-fit complex in the geometrical sense need not be the most stable one in terms of thermodynamics, because such a scheme cannot account for the solvent: neither the dehydration penalty nor the entropy barrier mentioned earlier is taken into account. The so-called implicit solvent models, the generalized Born (GB) [5] and the Poisson–Boltzmann (PB) equations [6], which have been most widely used for evaluating the solvation thermodynamics of biomolecules, are much less accurate and offer little insight into the problem at hand, because by definition they do not have a molecular view of the solvent. Moreover, it is impossible to define a dielectric constant of the solvent inside a host cavity; therefore, these models cannot account for the dehydration penalty, especially that from the host cavity. At best, those quantities can be calculated by fitting empirical parameters such as the boundary conditions and the dielectric constants to experimental data, but then the approach loses credibility as a first-principles theory for predicting the phenomena.

Molecular simulation, on the other hand, can provide the most detailed molecular view of the process. However, a “let-it-do” type of simulation does not work for the problem at all, because the MR process is usually a slow and rare event. A common strategy adopted by the simulation community to overcome the difficulty is non-Boltzmann sampling, which defines a “reaction coordinate” or an “order parameter” onto which all other degrees of freedom are projected. The best example is “umbrella” sampling to obtain the potential of mean force, or the free energy, along the conduction path of an ion in an ion channel [7]. The method is quite powerful for sampling the configuration space around an order parameter if the parameter is unique. Unfortunately, the problems in biochemical processes are not so simple that they can be described by a unique order parameter. So, it is often the case that the results of the simulation depend on the choice of order parameter and on the “scheduling” of the sampling. The other methodology employed to accelerate the sampling is to apply an artificial external force to the system: for example, external pressure applied to water molecules in aquaporin [8]. Simulations of that kind should verify that the configuration of water satisfies the Boltzmann distribution; otherwise, the simulation is in danger of ending up as mere “science fiction.”

Recently, a new theoretical approach to the MR process has been launched, based on the three-dimensional reference interaction site model (3D-RISM) method, a statistical mechanics theory of liquids [9–11]. The 3D-RISM equation was derived from the molecular Ornstein–Zernike (MOZ) equation, the most fundamental equation describing the density pair correlation of molecular liquids [12, 13], for a solute–solvent system at infinite dilution by taking a statistical average over the orientation of solvent molecules. By solving the combined 3D-RISM and RISM equations, the latter providing the bulk solvent structure in terms of the site–site density pair correlation functions, one can obtain the “solvation structure,” or the solvent distributions around a solute. The solvation structure so produced retains the atomic information, because it starts from a Hamiltonian in which the information on atom–atom interactions among molecules is embedded, just as in a molecular simulation. The method naturally produces all the solvation thermodynamics as well, including energy, entropy, free energy, and their derivatives such as the partial molar volume and compressibility. Unlike molecular simulation, there is no need for concern about the size of the system or the “sampling” of the configuration space, because the method essentially treats an infinite number of molecules and integrates over the entire configuration space of the solvent.

The power of the 3D-RISM theory has been demonstrated fully in the solvation structure and thermodynamics of proteins. The partial molar volumes of proteins in aqueous solutions calculated by Imai, Kovalenko, and Hirata have exhibited quantitative agreement with the corresponding experimental results [14]. These turned out to be the first quantitative results obtained for the thermodynamics of proteins entirely from statistical mechanics theory. It was an accomplishment in itself in the sense that it gave great confidence in the 3D-RISM theory as a means to explore the stability of proteins in solution. However, it was only a prelude to a discovery with an even bigger impact. When we were analyzing the 3D-distribution of water around hen egg-white lysozyme, we found conspicuous peaks inside small cavities in the protein, which no doubt reveal the water molecules trapped inside the macromolecule [15]. In fact, the number of water molecules and their positions inside the cavity coincide with those found by X-ray crystallography. This implies that the 3D-RISM theory is capable of “detecting” the molecules “recognized” by the protein, or the host molecule. This is nothing but the realization of “molecular recognition.”

In this chapter, we review our recent studies on molecular recognition by proteins based on the RISM and 3D-RISM theories, which have been carried out as a part of the Scientific Research in Priority Areas “Water and Biomolecules” during the last 5 years.

10.2 Outline of the RISM and 3D-RISM Theories

Let us begin this section by asking the reader the following questions. “What is the structure of a liquid?” “How can the structure of a liquid be characterized?” These questions are nontrivial, because unlike individual molecules and crystals, the liquid state does not form a structure of definite shape. One can readily define the structure of a molecule by giving the bond lengths, bond angles, and dihedral angles, even for a molecule as complex as a protein. The crystalline structure of a solid can also be defined unambiguously by giving the lattice constants. However, molecules in liquids are in continuous diffusive motion, and therefore a definite geometry among the molecules cannot be defined. In such a case, we can only use statistical or probabilistic language.

The probabilistic language used to characterize the structure of liquids is that of distribution functions, which are nothing but the moments of the density field $\nu(\mathbf{r}) = \sum_i \delta(\mathbf{r} - \mathbf{r}_i)$ with respect to the Boltzmann weight. If there is no field applied to the system, the first moment, or the average density, is constant everywhere in the system, namely, $\rho(\mathbf{r}) \equiv \langle \nu(\mathbf{r}) \rangle = \rho = N/V$, where $V$ and $N$ are the volume of the container and the number of molecules in the system, respectively, and $\langle \cdots \rangle$ indicates the thermal average. So, the average density does not convey any information about the structure of the liquid. However, the second moment $\rho(\mathbf{r}, \mathbf{r}') = \langle \nu(\mathbf{r}) \nu(\mathbf{r}') \rangle$ carries the structural information of liquids. The quantity is referred to as the density pair distribution function, which has essentially the same physical meaning as the radial distribution function (RDF) obtained from X-ray diffraction measurement. The density pair distribution function $\rho(\mathbf{r}, \mathbf{r}')$ is proportional to the probability density of finding two molecules at the two positions $\mathbf{r}$ and $\mathbf{r}'$ at the same time, and becomes just a product of the average densities when the distance between the two positions becomes so large that there is no “correlation” between the densities at the two positions:

$$ \lim_{|\mathbf{r} - \mathbf{r}'| \to \infty} \rho(\mathbf{r}, \mathbf{r}') = \rho(\mathbf{r})\,\rho(\mathbf{r}') \quad (= \rho^2 \text{ in uniform liquids}). \tag{10.1} $$


The quantity $g(\mathbf{r}, \mathbf{r}') = \rho(\mathbf{r}, \mathbf{r}')/\rho^2$ represents a “correlation” between the densities at the two positions $\mathbf{r}$ and $\mathbf{r}'$. So, it is referred to as the “pair correlation function” (PCF), or the RDF when the liquid density is uniform and translational invariance is implied. We further define a function called the “total correlation function” by $h(\mathbf{r}, \mathbf{r}') = g(\mathbf{r}, \mathbf{r}') - 1$, which represents the correlation of the density “fluctuations” at the two positions $\mathbf{r}$ and $\mathbf{r}'$,

$$ h(\mathbf{r}, \mathbf{r}') = \langle \delta\nu(\mathbf{r})\,\delta\nu(\mathbf{r}') \rangle / \rho^2, \tag{10.2} $$

where $\delta\nu(\mathbf{r}) = \nu(\mathbf{r}) - \rho$ denotes the density fluctuation. The main task of liquid state theory is to find an equation that governs the function $h(\mathbf{r}, \mathbf{r}')$ or $g(\mathbf{r}, \mathbf{r}')$ based on statistical mechanics, and to solve the equation.
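The definition of the pair correlation function can be made concrete with a small numerical sketch. The code below estimates $g(r)$ for a uniform system by histogramming pair distances in a periodic cubic box and dividing by the pair count expected for uncorrelated particles at the same density. The function name `radial_distribution`, the box size, and the use of random (ideal gas) coordinates are assumptions made for illustration only; for such uncorrelated points the estimate should stay close to unity, in line with (10.1).

```python
import numpy as np

def radial_distribution(positions, box_length, r_max, n_bins):
    """Histogram estimate of g(r) for points in a cubic periodic box.

    r_max should not exceed box_length / 2 (minimum-image convention).
    """
    n = len(positions)
    # All pair separation vectors, wrapped by the minimum-image convention.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.sqrt((delta ** 2).sum(axis=-1))
    # Keep each pair once (upper triangle, no self-pairs).
    pair_dist = dist[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(pair_dist, bins=n_bins, range=(0.0, r_max))
    # Pair count expected in each spherical shell if there were no correlation.
    shell_volume = 4.0 * np.pi / 3.0 * (edges[1:] ** 3 - edges[:-1] ** 3)
    n_pairs = n * (n - 1) / 2.0
    expected = n_pairs * shell_volume / box_length ** 3
    r_mid = 0.5 * (edges[1:] + edges[:-1])
    return r_mid, counts / expected

# Illustrative input: uncorrelated random points (an ideal gas).
rng = np.random.default_rng(0)
box = 10.0
ideal_gas = rng.uniform(0.0, box, size=(500, 3))
r, g = radial_distribution(ideal_gas, box, r_max=5.0, n_bins=25)
```

For a real liquid the same estimator, applied to simulation snapshots, would show the familiar peaks of the RDF; here it only illustrates the normalization by $\rho^2$.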

As briefly described in the Introduction, an “exact” equation referred to as the Ornstein–Zernike equation, which relates $h(\mathbf{r}, \mathbf{r}')$ with another correlation function called the direct correlation function $c(\mathbf{r}, \mathbf{r}')$, can be “derived” from the grand canonical partition function by means of functional derivatives. Our theory of molecular recognition starts from the Ornstein–Zernike equation generalized to a solution of polyatomic molecules, or the molecular Ornstein–Zernike (MOZ) equation [12],

$$ \mathbf{h}(1, 2) = \mathbf{c}(1, 2) + \int \mathbf{c}(1, 3)\,\boldsymbol{\rho}\,\mathbf{h}(3, 2)\,\mathrm{d}3, \tag{10.3} $$

where $\mathbf{h}(1, 2)$ and $\mathbf{c}(1, 2)$ are the total and direct correlation functions, respectively, and the numbers in parentheses represent the coordinates of molecules in the liquid system, including both the position $\mathbf{R}$ and the orientation $\boldsymbol{\Omega}$. The volume element is $\mathrm{d}3 = \Omega^{-1}\,\mathrm{d}\mathbf{R}_3\,\mathrm{d}\boldsymbol{\Omega}_3$, where $\Omega$ is the unweighted integral over the angular coordinates. The boldface letters of the correlation functions indicate that they are matrices consisting of elements labeled by the species in the solution. In the simple case of a binary mixture, the equation can be written out, labeling the solute by “u” and the solvent by “v,” as follows. (It is straightforward to generalize the equations to multi-component mixtures.)

$$ h^{vv}(1, 2) = c^{vv}(1, 2) + \int c^{vv}(1, 3)\,\rho_v\,h^{vv}(3, 2)\,\mathrm{d}3 + \int c^{vu}(1, 3)\,\rho_u\,h^{uv}(3, 2)\,\mathrm{d}3, \tag{10.4} $$

$$ h^{uv}(1, 2) = c^{uv}(1, 2) + \int c^{uv}(1, 3)\,\rho_v\,h^{vv}(3, 2)\,\mathrm{d}3 + \int c^{uu}(1, 3)\,\rho_u\,h^{uv}(3, 2)\,\mathrm{d}3, \tag{10.5} $$

$$ h^{uu}(1, 2) = c^{uu}(1, 2) + \int c^{uv}(1, 3)\,\rho_v\,h^{vu}(3, 2)\,\mathrm{d}3 + \int c^{uu}(1, 3)\,\rho_u\,h^{uu}(3, 2)\,\mathrm{d}3. \tag{10.6} $$

By taking the limit of infinite dilution ($\rho_u \to 0$), one gets

$$ h^{vv}(1, 2) = c^{vv}(1, 2) + \int c^{vv}(1, 3)\,\rho_v\,h^{vv}(3, 2)\,\mathrm{d}3, \tag{10.7} $$

$$ h^{uv}(1, 2) = c^{uv}(1, 2) + \int c^{uv}(1, 3)\,\rho_v\,h^{vv}(3, 2)\,\mathrm{d}3. \tag{10.8} $$

The equations depend essentially on six coordinates in Cartesian space, and they include a sixfold integral. This integral is what prevents the theory from being applied to polyatomic molecules. It is the interaction-site model and the RISM approximation proposed by Chandler and Andersen [16] that made it possible to solve the equations. The idea behind the model is to project the functions onto the one-dimensional space along the distance between the interaction sites, usually placed at the centers of atoms, by taking the statistical average over the angular coordinates of the molecules while fixing the separation between a pair of interaction sites,

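$$ f_{\alpha\gamma}(r) = \int \delta(\mathbf{R}_1 + \mathbf{l}_1^{\alpha})\,\delta(\mathbf{R}_2 + \mathbf{l}_2^{\gamma} - \mathbf{r})\,f(1, 2)\,\mathrm{d}1\,\mathrm{d}2, \tag{10.9} $$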

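where $\mathbf{l}_i^{\alpha}$ is the vector displacement of site $\alpha$ in molecule $i$ from the molecular center $\mathbf{R}_i$. It follows that $\mathbf{R}_i + \mathbf{l}_i^{\alpha} = \mathbf{r}_i^{\alpha}$ denotes the position of site $\alpha$ in molecule $i$. The angular average of the second terms in (10.7) and (10.8) is formidable, but the approximation

$$ c(1, 2) \approx \sum_{\alpha,\gamma} c_{\alpha\gamma}\left(|\mathbf{r}_1^{\alpha} - \mathbf{r}_2^{\gamma}|\right) \tag{10.10} $$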

allows one to perform the angular average, leading to the RISM equation

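$$ \boldsymbol{\rho}\mathbf{h}\boldsymbol{\rho} = \boldsymbol{\omega} * \mathbf{c} * \boldsymbol{\omega} + \boldsymbol{\omega} * \mathbf{c} * \boldsymbol{\rho}\mathbf{h}\boldsymbol{\rho}, \tag{10.11} $$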

where the asterisk denotes convolution integrals

$$ f * g(\mathbf{r}) = \int f(\mathbf{r}')\,g(|\mathbf{r}' - \mathbf{r}|)\,\mathrm{d}\mathbf{r}'. \tag{10.12} $$

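The new function $\boldsymbol{\omega}$ appearing in the derivation of (10.11) is called the “intramolecular” correlation function, which is defined for a pair of atoms $\alpha$ and $\gamma$ in a molecule by

$$ \omega_{\alpha\gamma}(r) = \rho\,\delta_{\alpha\gamma}\,\delta(r) + \rho\,(1 - \delta_{\alpha\gamma})\,\delta(r - l_{\alpha\gamma}), \tag{10.13} $$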

in which $\delta_{\alpha\gamma}$ and $\delta(r)$ are the Kronecker and Dirac delta functions, respectively. Through the Dirac delta function, the term $\delta(r - l_{\alpha\gamma})$ imposes a distance constraint $l_{\alpha\gamma}$ between the pair of atoms. Thus, in the RISM theory, imposing the distance constraints on all pairs of atoms in the molecule defines the molecular geometry in terms of trigonometry, similar to the z-matrix in computational chemistry.
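As a small illustration of how a set of distance constraints $l_{\alpha\gamma}$ encodes a molecular geometry, the sketch below builds the site–site distance matrix of a bent three-site molecule from one bond length and one bond angle. The function name `site_distance_matrix` and the SPC-like water geometry (O–H length of 1.0 Å, H–O–H angle of 109.47°) are assumptions chosen for illustration; they are not parameters quoted in this chapter.

```python
import numpy as np

def site_distance_matrix(bond_length, bond_angle_deg):
    """Intramolecular site-site distances for a bent three-site molecule.

    Site order is (O, H, H); the diagonal is zero by construction.
    """
    half_angle = np.radians(bond_angle_deg) / 2.0
    l_oh = bond_length
    # The H-H distance is the chord subtended by the H-O-H angle.
    l_hh = 2.0 * bond_length * np.sin(half_angle)
    return np.array([[0.0, l_oh, l_oh],
                     [l_oh, 0.0, l_hh],
                     [l_oh, l_hh, 0.0]])

# Assumed SPC-like water geometry (illustrative values, in angstroms and degrees).
l_matrix = site_distance_matrix(bond_length=1.0, bond_angle_deg=109.47)
```

Each off-diagonal entry is the $l_{\alpha\gamma}$ that would enter the delta function in (10.13) for the corresponding pair of sites.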

The 3D-RISM equation for the solute–solvent system at infinite dilution can be derived from (10.8) by taking the statistical average over the angular coordinates of the “solvent,” but not over those of the “solute” [10,11,17]. The equation reads

$$ h_{\gamma}(\mathbf{r}) = \sum_{\gamma'} \int c_{\gamma'}(\mathbf{r}') \left( \omega^{vv}_{\gamma'\gamma}(|\mathbf{r}' - \mathbf{r}|) + \rho\,h^{vv}_{\gamma'\gamma}(|\mathbf{r}' - \mathbf{r}|) \right) \mathrm{d}\mathbf{r}', \tag{10.14} $$

where $h_{\gamma}(\mathbf{r})$ and $c_{\gamma}(\mathbf{r})$ are, respectively, the total and direct correlation functions of solvent site $\gamma$ at position $\mathbf{r}$ in a Cartesian coordinate system whose origin is placed at an arbitrary position, generally inside the protein. The functions $\omega^{vv}_{\gamma'\gamma}(r)$ and $h^{vv}_{\gamma'\gamma}(r)$ are the correlation functions for solvent molecules, which appear in (10.11). It is these equations that can be applied to the molecular recognition process. If one views the solute molecule as a “source of external force” exerted on solvent molecules, then $\rho g_{\gamma}(\mathbf{r})$ ($= \rho h_{\gamma}(\mathbf{r}) + \rho$) is identified as the density distribution of solvent molecules in the “external force.” This identification, called the “Percus trick,” is the “key” concept for realizing the molecular recognition process by means of statistical mechanics.

The equations described earlier contain two unknown functions, $h(\mathbf{r})$ and $c(\mathbf{r})$. Therefore, they are not closed without another equation that relates the two functions. Several approximations have been proposed for the closure relations: the hypernetted-chain (HNC), Percus–Yevick (PY), and mean spherical approximation (MSA) closures, among others [12]. The HNC closure can be obtained from the diagrammatic expansion of the pair correlation functions in terms of density by discarding a set of diagrams called “bridge diagrams,” which involve multifold integrals. It should be noted that the terms kept in the HNC closure relation still include those up to infinite order in the density. Alternatively, the relation has been derived from the linear response of a free energy functional to the density fluctuation created by a molecule fixed in space within the Percus trick. The HNC closure relation reads

$$ h(\mathbf{r}) = \exp\left(-u(\mathbf{r})/k_B T + h(\mathbf{r}) - c(\mathbf{r})\right) - 1, \tag{10.15} $$

where $k_B$ and $T$ are the Boltzmann constant and temperature, respectively, and $u(\mathbf{r})$ is the interaction potential between a pair of atoms in the system. Equation (10.15) is the relation that incorporates the physical and chemical characteristics of the system into the theory through $u(\mathbf{r})$. The PY approximation can be obtained from the HNC relation just by linearizing the factor $\exp(h(\mathbf{r}) - c(\mathbf{r}))$. The HNC closure has been quite successful in describing the structure and thermodynamics of liquids and solutions, including water. However, the approximation is notoriously problematic in the low-density regime. The drawback sometimes becomes fatal when one tries to apply the theory to associating liquid mixtures or solutions, especially at dilute concentration, because a solution of “dilute” concentration is equivalent to a “low-density” liquid for the minor component. To get rid of the problem, Kovalenko and Hirata proposed the following approximation, or the KH closure [18],

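$$ g(\mathbf{r}) = \begin{cases} \exp(d(\mathbf{r})) & \text{for } d(\mathbf{r}) \le 0, \\ 1 + d(\mathbf{r}) & \text{for } d(\mathbf{r}) > 0, \end{cases} \tag{10.16} $$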

where $d(\mathbf{r}) = -u(\mathbf{r})/k_B T + h(\mathbf{r}) - c(\mathbf{r})$. The approximation turns out to be quite successful even for mixtures of complex liquids.
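The three closures mentioned above differ only in how $g$ is built from the potential and the difference $h - c$, which the following sketch makes explicit. The function names `g_hnc`, `g_py`, and `g_kh` are hypothetical and are introduced here for illustration; the arguments are the dimensionless quantities $u/k_B T$ and $h - c$.

```python
import numpy as np

def g_hnc(u_over_kt, h_minus_c):
    """HNC closure: g = exp(-u/kT + h - c)."""
    return np.exp(-np.asarray(u_over_kt) + np.asarray(h_minus_c))

def g_py(u_over_kt, h_minus_c):
    """PY closure: the factor exp(h - c) of HNC is linearized to 1 + h - c."""
    return np.exp(-np.asarray(u_over_kt)) * (1.0 + np.asarray(h_minus_c))

def g_kh(u_over_kt, h_minus_c):
    """KH closure: exponential where d <= 0, linear where d > 0."""
    d = -np.asarray(u_over_kt, dtype=float) + np.asarray(h_minus_c, dtype=float)
    # np.minimum keeps the unused exponential branch from overflowing for large d.
    return np.where(d <= 0.0, np.exp(np.minimum(d, 0.0)), 1.0 + d)

# Illustrative values of d = -u/kT + (h - c), taking u = 0 for simplicity.
d_values = np.array([-1.0, 0.0, 2.0])
g_from_hnc = g_hnc(0.0, d_values)
g_from_kh = g_kh(0.0, d_values)
```

The two branches of the KH form match in value and slope at $d = 0$, and the linear branch grows more slowly than the HNC exponential in regions of strong attraction.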


The procedure for solving the equations consists of two steps. We first solve the RISM equation (10.11) for $h^{vv}_{\gamma'\gamma}(r)$ of the solvent, or of a mixture of solvents in the case of solutions. Then, we solve the 3D-RISM equation (10.14) for $h_{\gamma}(\mathbf{r})$ of a protein–solvent (solution) system, inserting into (10.14) the $h^{vv}_{\gamma'\gamma}(r)$ for the solvent calculated in the first step. Given the definition $g(\mathbf{r}) = h(\mathbf{r}) + 1$, the $g(\mathbf{r})$ thus obtained is the three-dimensional distribution of solvent molecules around a protein in terms of the interaction-site density representation of the solvent, or of a mixture of solvents in the case of solutions.
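The iterative structure of such a calculation, an integral equation coupled to a closure and solved self-consistently, can be sketched for the simplest possible case. The code below is not the RISM or 3D-RISM solver used in this work: it is a minimal sketch for a one-site Lennard-Jones fluid in reduced units, for which the integral equation reduces to the simple Ornstein–Zernike relation $\hat{h} = \hat{c} + \rho\,\hat{c}\,\hat{h}$ in Fourier space. The function name `solve_oz_kh`, the grid, the plain Picard iteration with mixing, and the state point are all assumptions chosen for illustration.

```python
import numpy as np

def solve_oz_kh(rho, temperature, n=512, dr=0.02, mix=0.5, tol=1e-8, max_iter=2000):
    """Picard iteration of the Ornstein-Zernike equation with the KH closure
    for a one-site Lennard-Jones fluid (reduced units, sigma = epsilon = 1)."""
    index = np.arange(1, n + 1)
    r = index * dr
    dk = np.pi / ((n + 1) * dr)
    k = index * dk
    sine = np.sin(np.outer(k, r))            # sine[j, i] = sin(k_j * r_i)

    def forward(f):
        # 3D Fourier transform of a radial function via a sine transform.
        return 4.0 * np.pi * dr / k * (sine @ (r * f))

    def inverse(f_hat):
        return dk / (2.0 * np.pi ** 2 * r) * (sine.T @ (k * f_hat))

    u_over_kt = 4.0 / temperature * (r ** -12 - r ** -6)
    gamma = np.zeros(n)                      # gamma = h - c
    for _ in range(max_iter):
        d = -u_over_kt + gamma
        g = np.where(d <= 0.0, np.exp(np.minimum(d, 0.0)), 1.0 + d)  # KH closure
        c = g - 1.0 - gamma
        c_hat = forward(c)
        # Ornstein-Zernike relation solved for gamma in Fourier space.
        gamma_new = inverse(rho * c_hat ** 2 / (1.0 - rho * c_hat))
        if np.max(np.abs(gamma_new - gamma)) < tol:
            return r, g, True
        gamma = (1.0 - mix) * gamma + mix * gamma_new
    return r, g, False

# Assumed dilute state point, chosen so that plain Picard iteration is well behaved.
r, g, converged = solve_oz_kh(rho=0.1, temperature=2.0)
```

In the actual two-step procedure, the analogous first step yields the site–site solvent functions, which are then held fixed while the 3D equation is iterated on a three-dimensional grid around the protein. Production solvers generally use fast Fourier transforms and accelerated convergence schemes rather than the dense sine-transform matrix and simple mixing assumed here.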

The so-called solvation free energy can be obtained from the distribution functions through the following equations [18,19], corresponding, respectively, to the two closure relations described earlier, (10.15) and (10.16):

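$$ \Delta\mu^{\mathrm{HNC}} = \rho k_B T \sum_{\gamma} \int \mathrm{d}\mathbf{r} \left[ \frac{1}{2} h_{\gamma}(\mathbf{r})^2 - c_{\gamma}(\mathbf{r}) - \frac{1}{2} h_{\gamma}(\mathbf{r})\,c_{\gamma}(\mathbf{r}) \right], \tag{10.17} $$

$$ \Delta\mu^{\mathrm{KH}} = \rho k_B T \sum_{\gamma} \int \mathrm{d}\mathbf{r} \left[ \frac{1}{2} h_{\gamma}(\mathbf{r})^2\,\Theta(-h_{\gamma}(\mathbf{r})) - c_{\gamma}(\mathbf{r}) - \frac{1}{2} h_{\gamma}(\mathbf{r})\,c_{\gamma}(\mathbf{r}) \right], \tag{10.18} $$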

where $\Theta(x)$ is the Heaviside step function. The other thermodynamic quantities concerning solvation can be readily obtained from the standard thermodynamic derivatives of the free energy, except for the partial molar volume.
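Once $h_{\gamma}$ and $c_{\gamma}$ are available on a grid, the two free energy expressions reduce to a sum over voxels. The sketch below assumes that both functions are stored as arrays of shape (number of solvent sites, nx, ny, nz) on a uniform grid and that the same number density applies to every site; the function name `excess_chemical_potential` and the random input arrays are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def excess_chemical_potential(h, c, rho, kt, voxel_volume, closure="KH"):
    """Grid quadrature of the HNC or KH solvation free energy expression.

    h, c: arrays of shape (n_sites, nx, ny, nz); rho: solvent number density;
    kt: thermal energy k_B T; voxel_volume: volume of one grid cell.
    """
    h = np.asarray(h, dtype=float)
    c = np.asarray(c, dtype=float)
    h_squared_term = 0.5 * h ** 2
    if closure == "KH":
        # Heaviside factor: keep the h^2 term only where h is negative.
        h_squared_term = h_squared_term * (h < 0.0)
    elif closure != "HNC":
        raise ValueError("closure must be 'HNC' or 'KH'")
    integrand = h_squared_term - c - 0.5 * h * c
    return rho * kt * integrand.sum() * voxel_volume

# Synthetic correlation functions (not results of any calculation).
rng = np.random.default_rng(1)
h_depleted = rng.uniform(-1.0, 0.0, size=(2, 4, 4, 4))   # h <= 0 everywhere
c_sample = rng.normal(size=h_depleted.shape)
mu_hnc = excess_chemical_potential(h_depleted, c_sample, 0.0334, 0.593, 0.125, "HNC")
mu_kh = excess_chemical_potential(h_depleted, c_sample, 0.0334, 0.593, 0.125, "KH")
```

The two expressions coincide wherever the solvent is depleted ($h \le 0$) and differ only in regions of density enhancement, where the KH form drops the $h^2$ term.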

The partial molar volume, which is a very important quantity for probing the response of the free energy (or stability) of a protein to pressure, including the so-called pressure denaturation, is not a “canonical” thermodynamic quantity for the (V, T) ensemble, since volume is an independent thermodynamic variable of the ensemble. The partial molar volume of a protein at infinite dilution can be calculated from the Kirkwood–Buff equation [20] generalized to the site–site representation of liquids and solutions [21,22],

$$ \bar{V} = k_B T \chi_T \left( 1 - \rho \sum_{\gamma} \int c_{\gamma}(\mathbf{r})\,\mathrm{d}\mathbf{r} \right), \tag{10.19} $$

where $\chi_T$ is the isothermal compressibility of the pure solvent or solution, which is obtained from the site–site correlation functions of the solution.
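The Kirkwood–Buff expression can likewise be evaluated directly from a gridded direct correlation function. The sketch below works in SI units and multiplies by the Avogadro constant to report a molar quantity; the function name `partial_molar_volume` and the numerical values (a compressibility of about 4.5 × 10⁻¹⁰ Pa⁻¹, roughly that of water near room temperature) are assumptions for illustration, not values taken from this chapter.

```python
import numpy as np

K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol

def partial_molar_volume(c, rho, temperature, chi_t, voxel_volume):
    """Partial molar volume (m^3/mol) from the site-site Kirkwood-Buff expression.

    c: direct correlation functions, shape (n_sites, nx, ny, nz);
    rho: solvent number density (1/m^3); chi_t: isothermal compressibility (1/Pa);
    voxel_volume: volume of one grid cell (m^3).
    """
    # Sum over solvent sites of the volume integral of c.
    c_integral = np.sum(c) * voxel_volume
    return N_A * K_B * temperature * chi_t * (1.0 - rho * c_integral)

# With c = 0 only the ideal term k_B T chi_T remains.
c_zero = np.zeros((2, 4, 4, 4))
v_ideal = partial_molar_volume(c_zero, rho=3.34e28, temperature=298.15,
                               chi_t=4.5e-10, voxel_volume=1.25e-31)
v_ideal_cm3 = v_ideal * 1e6   # m^3/mol to cm^3/mol
```

Under these assumed values the ideal term amounts to only about 1 cm³ mol⁻¹, so essentially all of the partial molar volume of a protein comes from the integral of the direct correlation functions.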

In the following, we present an application of the theory described earlier to demonstrate the robustness of the theory.

The example is the partial molar volume of proteins, which can be calculated using (10.19) from h(r), or equivalently from c(r), obtained from the 3D-RISM equation. The partial molar volumes of several proteins in water that appear frequently in the literature of protein research are plotted against the molecular weight in Fig. 10.1 [23]. By comparing the results with the experimental ones plotted in the same figure, one can readily see that the theory is capable of reproducing the experimental results at a quantitative level.

Fig. 10.1. Partial molar volume of proteins plotted against the molecular weight. The theoretical results (black circles) show quantitative agreement with the experimental ones (crosses)

At first glance, it may seem that the results could be reproduced by a simple consideration of protein geometry, using commercial software to calculate the excluded volume of the protein. However, that is not the case, because the partial molar volume is a “thermodynamic quantity,” not a “geometrical volume.” The partial molar volume reflects all the solvent–solvent and solute–solvent interactions as well as all the configurations of water molecules in the system, while the geometrical volume accounts for just the simplified (hard-core type) repulsive interaction between the solute and solvent. Other factors such as attractive interactions between solute and solvent and the solvent reorganization are entirely neglected in the geometrical volume. The contributions from the solvent reorganization are of particular importance in the partial molar volume of a protein, because they concern the so-called volume of “cavities” in the protein. As is well known, a protein has many internal cavities where water molecules may or may not be accommodated. Let us carry out a simple “thought experiment” with respect to the partial molar volume of a protein. The experiment is to dissolve a protein in water. Upon dissolution, some of the cavities in the protein may be filled by water molecules, but others may not. If a cavity stays empty, then the empty space will contribute to increasing the partial molar volume of the protein. On the other hand, if the space is filled by water molecules due to the reorganization of the solvent, it will contribute to reducing the entire volume of the solution, and compensate for the increase due to the cavity volume. This compensation is nontrivial: if a cavity can accommodate one water molecule, it gives rise to a reduction in the volume of 18 cm³ mol⁻¹. In this regard, unless a theory is able to describe the reorganization of water molecules induced by the protein, it is useless for predicting the partial molar volume. The nearly quantitative results shown in the figure demonstrate that the theory properly accounts for all the solute–solvent and solvent–solvent interactions as well as the solvent reorganization induced by the protein, including the accommodation of some water molecules into the internal cavities.
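The figure of 18 cm³ mol⁻¹ quoted for one water molecule is simply the molar volume of liquid water. As a brief worked check, assuming a molar mass of about 18.0 g mol⁻¹ and a density of about 1.0 g cm⁻³ near room temperature,

$$ V_{\mathrm{m}} = \frac{M}{\rho_{\mathrm{mass}}} \approx \frac{18.0\ \mathrm{g\,mol^{-1}}}{1.0\ \mathrm{g\,cm^{-3}}} \approx 18\ \mathrm{cm^{3}\,mol^{-1}}, \qquad \frac{V_{\mathrm{m}}}{N_{\mathrm{A}}} \approx \frac{18 \times 10^{24}\ \text{Å}^{3}\,\mathrm{mol^{-1}}}{6.022 \times 10^{23}\ \mathrm{mol^{-1}}} \approx 30\ \text{Å}^{3}. $$

Each water molecule transferred from the bulk into a cavity therefore removes roughly 30 Å³ per protein molecule from the solution volume, which is far from negligible when several cavity waters are involved.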

In the following sections, we will demonstrate how the 3D-RISM theory is capable of describing molecular recognition processes.

10.3 Recognition of Water Molecules by Protein

It is not necessary to emphasize how important water is for living systems to maintain their life [24–26]. No wonder that many scientists in the field of X-ray and neutron diffraction measurement have been trying to determine the positions and orientations of water molecules around and inside biomolecules such as proteins and DNA [27, 28]. However, it is not easy even for modern experimental technology to locate water molecules, partly due to the limited resolution of diffraction measurements in space as well as in time. This is because water molecules at the surface of a protein are not necessarily bound firmly to a particular site of the biomolecule, but exchange their positions quite frequently. In fact, this flexibility and fluctuation of water molecules are essential for living systems to control their life. Diffraction measurements can identify only those water molecules that have a long residence time at a particular position of the biomolecule.

In this study [15, 29], we have carried out the 3D-RISM calculation for hen egg-white lysozyme immersed in water and obtained the 3D-distribution functions of the oxygen and hydrogen of water molecules around and inside the protein. The native 3D structure of the protein is taken from the Protein Data Bank (PDB). The protein is known to have a cavity composed of the residues from Y53 to I58 and from A82 to S91, in which four water molecules have been determined by means of X-ray diffraction measurement [30]. In our calculation, those water molecules are not included explicitly.

In Fig. 10.2, g(r) of water oxygen is depicted by green surfaces or spots in an isosurface representation, which is very similar to the electron density map obtained from X-ray crystallography. We have drawn the regions where g(r) is greater than a threshold value: the left, center, and right figures correspond, respectively, to g(r) > 2, g(r) > 4, and g(r) > 8. Since g(r) is unity in the bulk, the left figure indicates that the probability of finding those water molecules at the surface is more than twice as large as in bulk water. Likewise, the water molecules depicted in the right figure have a probability of being located at those spots more than eight times higher than in the bulk. These water molecules are bound firmly to particular atoms of the protein through, say, hydrogen bonds, and they are quite rare, as one can see from the figure. In this sense, the threshold value plays the role of “temperature” in the X-ray diffraction measurement: if you lower the temperature, you can observe more water molecules that have weaker interactions with the protein. The results suggest that the X-ray and neutron diffraction communities have acquired a powerful theoretical tool for analyzing their data to locate the positions and orientations of water molecules, as our theory also provides the distribution of the hydrogen sites of water molecules.
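The meaning of the thresholds can be made quantitative with the standard relation between a distribution function and the potential of mean force, $W(\mathbf{r}) = -k_B T \ln g(\mathbf{r})$, which is not written out in this chapter but follows from the definition of $g$ as a relative probability. The sketch below converts the three thresholds into free energies and counts the fraction of grid points above each threshold; the function names and the synthetic log-normal grid are assumptions for illustration and do not represent lysozyme data.

```python
import numpy as np

GAS_CONSTANT_KCAL = 0.0019872041            # kcal/(mol K)
KT_298 = GAS_CONSTANT_KCAL * 298.15         # thermal energy at 298.15 K, kcal/mol

def potential_of_mean_force(g, kt=KT_298):
    """W = -kT ln g, the free energy of a site relative to the bulk."""
    return -kt * np.log(g)

def fraction_above(g_grid, threshold):
    """Fraction of grid points whose g exceeds the isosurface threshold."""
    return float(np.mean(g_grid > threshold))

thresholds = [2.0, 4.0, 8.0]
pmf_at_thresholds = [potential_of_mean_force(t) for t in thresholds]

# Synthetic stand-in for a 3D distribution function (not a real calculation).
rng = np.random.default_rng(2)
g_synthetic = rng.lognormal(mean=0.0, sigma=0.8, size=(32, 32, 32))
fractions = [fraction_above(g_synthetic, t) for t in thresholds]
```

At room temperature, g(r) = 8 corresponds to a stabilization of roughly 1.2 kcal mol⁻¹ relative to the bulk, and raising the threshold necessarily selects a smaller set of grid points, which is why the spots in the right panel are so sparse.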

The results depicted in Fig. 10.2 are what we expected before we actually carried out the calculation, although they were entirely new in the history of statistical mechanics. What was entirely unexpected was that we observed some peaks of the water distribution in a cavity “inside” the protein, which is surrounded by the residues from Y53 to I58 and from A82 to S91. The results are shown in Fig. 10.3. The left picture in Fig. 10.3 shows the isosurfaces of g(r) > 8 for water oxygen (green) and hydrogen (pink) in the cavity. In the figure, only the surrounding residues are displayed, except for A82 and L83, which are located on the front side. There are four distinct peaks of water oxygen and seven distinct peaks of water hydrogen in the cavity. From the isosurface plot, we have reconstructed the most probable model of the hydration structure. It is shown in the center of Fig. 10.3, where the four water molecules are numbered in order from the left. Water 1 is hydrogen-bonded to the main-chain oxygen of Y53 and the main-chain nitrogen of L56. Water 2 forms hydrogen bonds with the main-chain nitrogen of I56 and the main-chain oxygen of L83, which is not drawn in the figure. Water 3 and 4 also form hydrogen bonds with protein sites, the former to the main-chain oxygen of S85 and the latter to the main-chain oxygens of A82 (not displayed) and of D87. There is also a hydrogen bond network among Water 2, 3, and 4. The peak of the hydrogen between Water 3 and 4 does not appear in the figure because its height is slightly less than 8, which means that this hydrogen bond is weaker or looser than the other hydrogen-bonding interactions. Although the hydroxyl group of S91 is located in the center of the four water molecules, it makes only weak interactions with them.

It is interesting to compare the hydration structure obtained by the 3D-RISM theory with the crystallographic water sites of the X-ray structure [30]. The crystallographic water molecules in the cavity are depicted on the right of Fig. 10.3, showing four water sites in the cavity, just as the 3D-RISM theory has detected. Moreover, the water distributions obtained from the theory and experiment are quite similar to each other. Thus, the 3D-RISM theory can predict the water-binding sites with great success.

Fig. 10.2. Isosurface representation of the 3D distribution function g(r) of water oxygen around lysozyme calculated by the 3D-RISM theory. Green surfaces or spots show the area where the distribution function is larger than 2 (left), 4 (center), and 8 (right)

Fig. 10.3. Water molecules in a cavity of lysozyme. Only the surrounding residues are displayed. The isosurfaces of water oxygen (green) and hydrogen (pink) for the 3D distributions larger than 8 (left), the most probable model of the hydration structure reconstructed from the isosurface plots (center), and the crystallographic water sites (right)

Fig. 10.4. Xenon bound by lysozyme: protein surface, blue; xenon, yellow; water oxygen, red; water hydrogen, white. The right and left panels magnify the substrate binding site and the internal site, respectively. The X-ray xenon sites are painted as orange spheres

It should be noted that one peak of the 3D-distribution function does not necessarily correspond to one molecule. If a water molecule moves back and forth between two sites in the equilibrium state, two peaks appear correspondingly in the 3D-distribution function. In fact, the number of water molecules within the cavity calculated from the 3D-distribution function is 3.6. It is less than the number of water-binding sites and is not an integer. To explain this, we carried out a molecular dynamics (MD) simulation using the same parameters and under the same thermodynamic conditions as for the 3D-RISM calculation. The only exception was that the four crystallographic water molecules in the cavity, as well as the other crystallographic water molecules, were initially placed at their own sites in the MD simulation. The MD simulation also gives a hydration number less than 4, namely 3.5 [29]. From the MD trajectory, it is found that the two inner water molecules, Water 1 and 2, stay at their own sites throughout the simulation, making only small fluctuations around the sites. On the other hand, the two outer water molecules, Water 3 and 4, sometimes enter and leave the sites, and occasionally exchange with other water molecules from the bulk phase. As a result, the number of water molecules at the outer sites is 1.5 on average. The 3D-RISM theory thus provides a reasonable hydration number, including its fractional part, through statistical-mechanical relations, even though the theory takes no explicit account of the dynamics of molecules.
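A fractional hydration number arises naturally because the number of molecules in a region is the bulk density multiplied by the volume integral of the distribution function over that region. The sketch below evaluates this integral on a uniform grid; the function name `site_count`, the cubic region, and the uniform test distribution are assumptions for illustration, with the bulk number density of water taken as about 0.0334 Å⁻³.

```python
import numpy as np

def site_count(g_grid, region_mask, rho, voxel_volume):
    """Average number of solvent sites in a region: rho * integral of g over the region."""
    return rho * np.sum(g_grid[region_mask]) * voxel_volume

# Uniform grid of 40^3 voxels with 0.5 angstrom spacing (a 20 angstrom box).
spacing = 0.5
g_uniform = np.ones((40, 40, 40))            # bulk-like g = 1 everywhere
region = np.zeros_like(g_uniform, dtype=bool)
region[10:30, 10:30, 10:30] = True           # a 10 angstrom cube, 1000 cubic angstroms

water_density = 0.0334                       # molecules per cubic angstrom (assumed)
n_bulk = site_count(g_uniform, region, water_density, spacing ** 3)
```

For a bulk-like distribution this simply returns the density times the region volume. Applied to the water-oxygen distribution inside a cavity, the same integral yields an ensemble-averaged occupancy, which has no reason to be an integer.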

10.4 Noble Gas Binding to Protein

Molecular recognition by proteins, or ligand binding, is one of the most fundamental functions of proteins in biological processes. Beyond its scientific interest, prediction of ligand binding sites and affinities is the starting point for drug discovery [31, 32]. A large number of computational methods as well as experimental approaches have therefore been proposed [3, 4, 33]. The computational methodologies fall into two categories or stages. The first is the prediction of ligand binding sites in a target protein. In the most common case, the binding sites are located by a purely geometric analysis of the protein structure, in which cavities or clefts in the protein are detected and regarded as potential binding sites [3]. Binding sites can also be predicted by bioinformatics from multiple alignment of the amino acid sequences in the protein family [33]. The second is docking of a ligand molecule at binding sites that are already known or predicted in advance. Possible docking structures are then evaluated with a force field or a scoring function [4].

Although such docking programs are increasingly popular in bioscience and pharmacology [34], the theoretical methodologies are not fully developed. One of the least developed aspects is how to incorporate the effect of water into the binding affinity or free energy. Water participates in protein–ligand binding in two ways. First, bulk water provides the reaction field acting on the binding. This effect includes the electrostatic screening and the hydrophobic interaction between protein and ligand molecules. Second, individual water molecules can act as integral molecular components of the complex [35–37]. In fact, water molecules are often found at the binding interface of protein–ligand complexes, mediating hydrogen bonds or simply filling void spaces. Despite the evident significance of such water molecules, the effect of water is usually treated at the level of continuum solvent models [4], unless the interfacial water molecules are identified in advance.

The methodology described in the previous section can be applied to this process with a slight modification, and it provides a powerful theoretical tool for describing ligand binding by proteins. The only modification needed is to change the solvent from pure water to an aqueous solution containing ligand molecules. In this section, we present the results for binding of noble gases [38], which are the simplest model of nonpolar ligands.

Figure 10.4 shows the 3D distribution functions of xenon and water (oxygen and hydrogen) around lysozyme calculated by the 3D-RISM theory for lysozyme in a water–xenon mixture at a concentration of 0.001 M. The molecular surface of the protein is painted blue. The regions where g(r) > 8 are painted with different colors for different species: yellow, xenon; red, water oxygen; white, water hydrogen. The surface painted blue is, of course, also covered by water molecules weakly bound to the protein, which are not shown. A number of well-defined peaks, seen as yellow and red spots, are found for xenon and water oxygen at the surface of the protein, and they are separated from each other. The result demonstrates the capability of the 3D-RISM theory to predict “preferential binding” of ligands. Because the distributions of ligand and water are obtained simultaneously, the peak of either the ligand or the water is found at each site, depending on the ratio of their affinities to the site. Indeed, Fig. 10.4 indicates that there are water-preferred and xenon-preferred sites on the protein surface. Similar results are obtained for the other gases and the other concentrations.
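The notion of water-preferred and xenon-preferred sites can be illustrated with a minimal sketch. It assumes that the distribution functions of the ligand and of water oxygen have already been tabulated on a common grid and simply compares the two normalized values wherever either exceeds the isosurface level. The function name `classify_sites` and the sample values are hypothetical and are not taken from the calculation described here.

```python
# Illustrative sketch only: label grid points by the species whose
# distribution function dominates there. All values are made up.

THRESHOLD = 8.0  # isosurface level quoted for Fig. 10.4


def classify_sites(g_ligand, g_water, threshold=THRESHOLD):
    """Return one label per grid point: 'ligand', 'water', or None.

    A point is labelled only if at least one distribution function
    exceeds the threshold; the label names the larger of the two.
    """
    labels = []
    for g_l, g_w in zip(g_ligand, g_water):
        if max(g_l, g_w) <= threshold:
            labels.append(None)          # no pronounced peak here
        elif g_l > g_w:
            labels.append("ligand")      # ligand-preferred point
        else:
            labels.append("water")       # water-preferred point
    return labels


if __name__ == "__main__":
    g_xenon = [0.5, 12.0, 1.0, 9.0]      # hypothetical g(r) of xenon
    g_oxygen = [1.2, 0.3, 15.0, 10.5]    # hypothetical g(r) of water oxygen
    print(classify_sites(g_xenon, g_oxygen))
```

Comparing normalized distribution functions, rather than absolute densities, is a choice made for this sketch; it mirrors the way the figure colors regions by species above a common threshold.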

It is interesting to compare the distribution of xenon obtained by the 3D-RISM theory with the xenon sites in the X-ray structure [39], even though the conditions differ: the former is an aqueous solution under atmospheric pressure, while the latter is a crystal under a xenon gas pressure of 12 bar. There are two binding sites of xenon in lysozyme: one corresponds to the binding pocket of native ligands and is referred to as the substrate binding site; the other is located in a cavity inside the protein and is referred to as the internal site [39]. The right panel of Fig. 10.4 compares the theoretical 3D distribution of xenon with the X-ray xenon site at the substrate binding site. The location of the high, sharp peak found by the theory is in complete agreement with the X-ray xenon site. The left panel of Fig. 10.4 shows the result at the internal site. The xenon peak found there is a minor one; nevertheless, its location is again consistent with the X-ray site. It is interesting to note that the peaks of water are shifted away from the xenon binding sites.

Figure 10.5 shows the size dependence of the coordination number of noble gases at the two binding sites, calculated at a concentration of 0.001 M. At the substrate binding site, the coordination number grows exponentially as the size of the gas increases (Fig. 10.5a). At the internal site, the coordination number grows with the gas size up to σ ≈ 3.4 Å and decreases for σ > 3.4 Å (Fig. 10.5b). As a result, argon has the largest binding affinity to the internal site. These results demonstrate that the 3D-RISM theory can describe ligand-size selectivity in binding or molecular recognition. Although there are no corresponding experimental data, the present results serve as a representative test case.

Fig. 10.5. Coordination numbers of noble gases at the two binding sites, plotted against the atomic diameter of the gases. (a) substrate binding site. (b) internal site
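A coordination number of this kind is commonly obtained by integrating the number density ρ g(r) over the volume of the binding site. The following is a minimal sketch under stated assumptions: the grid is uniform, the voxels belonging to the site have already been selected, and the g(r) values are invented for illustration. The function names are hypothetical and do not come from the original work.

```python
# Sketch: coordination number N = rho * sum_i g(r_i) * dV over the
# voxels assigned to one binding site. Values are illustrative.

AVOGADRO = 6.02214076e23          # particles per mole
ANGSTROM3_PER_LITRE = 1.0e27      # 1 L = 1e-3 m^3 = 1e27 cubic angstrom


def molar_to_number_density(conc_molar):
    """Convert a concentration in mol/L to particles per cubic angstrom."""
    return conc_molar * AVOGADRO / ANGSTROM3_PER_LITRE


def coordination_number(g_values, voxel_volume, number_density):
    """Riemann-sum estimate of the coordination number at a site.

    g_values       : g(r) at each voxel inside the site
    voxel_volume   : volume of one voxel, in cubic angstrom
    number_density : bulk number density, in particles per cubic angstrom
    """
    return number_density * voxel_volume * sum(g_values)


if __name__ == "__main__":
    rho = molar_to_number_density(0.001)        # 0.001 M, as in the text
    site_g = [50.0, 200.0, 800.0, 200.0, 50.0]  # hypothetical peak values
    dv = 0.5 ** 3                               # assumed 0.5 angstrom grid spacing
    print(coordination_number(site_g, dv, rho))
```

Because the bulk density at 0.001 M is very small, even a tall peak in g(r) corresponds to a coordination number far below one, which is consistent with the weak binding discussed for the noble gases.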

It is well known that the activity of a protein plotted against the logarithm of ligand concentration generally produces a sigmoidal curve, the so-called dose-response curve. Experimentalists use the sigmoidal dose-response curve to obtain the equilibrium constant of the protein–ligand binding and the binding free energy. Following the experimental procedure, we can plot the coordination number of each noble gas against the logarithm of the gas concentration. In the present case, complete sigmoidal curves were not obtained (data not shown) because the affinities between the protein and the noble gases are considerably weak. Nevertheless, it should be emphasized that a dose-response curve can be produced only if the method employed can treat a highly dilute mixture, because the typical equilibrium constant of ligand binding is on the order of μM. Ordinary molecular simulation can hardly cover such highly dilute conditions. In the 3D-RISM theory, the calculation can be done at an arbitrary concentration simply by setting the value of the component density ρ in the equation. We can then obtain the equilibrium constants and the binding free energies from the concentration dependence without calculating the free energy directly.
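How an equilibrium constant and a binding free energy follow from a concentration dependence can be sketched as below. The sketch assumes a single-site (Langmuir) binding isotherm and a 1 M standard state; neither assumption is stated in the text, and the function names are hypothetical.

```python
import math

# Sketch: single-site binding isotherm, its inversion, and the
# corresponding standard binding free energy. Assumed model only.

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def occupancy(conc, k_d):
    """Fractional occupancy of one site: theta = c / (K_d + c)."""
    return conc / (k_d + conc)


def k_d_from_point(conc, theta):
    """Invert the isotherm: K_d = c * (1 - theta) / theta."""
    return conc * (1.0 - theta) / theta


def binding_free_energy(k_d, temperature=298.15, standard_conc=1.0):
    """Standard binding free energy, RT ln(K_d / c0), in kJ/mol."""
    return R * temperature * math.log(k_d / standard_conc)


if __name__ == "__main__":
    k_d = 1.0e-6  # a dissociation constant on the order of micromolar
    # Occupancy against log10(concentration) traces the sigmoidal curve.
    for exponent in range(-9, -2):
        conc = 10.0 ** exponent
        print(exponent, round(occupancy(conc, k_d), 4))
    print(binding_free_energy(k_d))
```

The midpoint of the sigmoid falls at c = K_d, so sampling the curve requires concentrations around and below the micromolar range, which is the dilute regime emphasized in the paragraph above.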

10.5 Selective Ion-Binding by Protein

Ion binding is essential for a variety of physiological processes. The binding of calcium ions by certain proteins triggers processes that induce muscle contraction and enzymatic reactions [40, 41]. The initial step of information transmission through an ion channel is ion binding by the channel protein [42]. Ion binding sometimes plays an essential role in the folding of a protein by inducing secondary structure [43]. Such processes are characterized by highly selective ion recognition by the proteins. It is therefore of great importance for life science to clarify the origin of the ion selectivity in molecular detail.

In this section, we present theoretical results for ion binding by human lysozyme [44, 45], obtained through basically the same procedure as that described in the preceding section, but with the solution changed from a noble-gas solution to ionic solutions. We first prepare the correlation functions for the bulk solutions by solving (10.11), and then plug those functions into the 3D-RISM equation (10.14) to obtain the 3D distribution of ions along with water molecules. Special attention, however, should be paid to the treatment of the bulk solution as the reference state, because the ion–ion interactions in the solutions are Coulomb interactions, and their contribution to the “dehydration penalty” should not be disregarded even at low concentration. To make sure that the free energy due to the ion–ion interaction is reasonably accounted for, we have calculated the excess chemical potential, or the mean activity coefficient, of ions in solution. The results are given in Fig. 10.6.

The results in general show fair agreement with the experimental results. In particular, the theory discriminates the divalent ion from the monovalent ions quite well. The concentration dependence of the two monovalent ions, however, is not well resolved, which may be due to the potential parameters for the ions. This will not seriously influence the results for ion recognition by the protein, because the process is determined primarily by the free energy difference of the same ion inside the protein and in the bulk solution.
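Why a divalent salt is easy to distinguish from monovalent salts, while two monovalent salts are hard to resolve, can be seen from a textbook estimate. The sketch below uses the Debye-Hückel limiting law, which is not the RISM calculation used in this chapter and holds only at high dilution. The constant A = 0.509 is the usual value for water at 25 °C, molality and molarity are treated as interchangeable, and the function names are hypothetical.

```python
import math

# Sketch: Debye-Hueckel limiting law, log10(gamma) = -A |z+ z-| sqrt(I).
# A swapped-in textbook approximation, not the method of the chapter.

A_DH = 0.509  # (kg/mol)^0.5, water at 25 degrees C


def ionic_strength(ions):
    """Ionic strength I = 0.5 * sum(c_i * z_i^2).

    ions: iterable of (concentration, charge) pairs.
    """
    return 0.5 * sum(c * z * z for c, z in ions)


def mean_activity_coefficient(z_plus, z_minus, strength):
    """Mean activity coefficient from the limiting law."""
    return 10.0 ** (-A_DH * abs(z_plus * z_minus) * math.sqrt(strength))


if __name__ == "__main__":
    c = 0.001  # salt concentration, mol/kg (assumed dilute)
    i_nacl = ionic_strength([(c, 1), (c, -1)])
    i_cacl2 = ionic_strength([(c, 2), (2 * c, -1)])
    print("NaCl ", mean_activity_coefficient(1, -1, i_nacl))
    print("KCl  ", mean_activity_coefficient(1, -1, i_nacl))
    print("CaCl2", mean_activity_coefficient(2, -1, i_cacl2))
```

In this limiting law NaCl and KCl are indistinguishable by construction, since only the charges enter; any difference between them must come from ion-specific terms such as the potential parameters mentioned above.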

The 3D-RISM calculation was carried out for aqueous solutions of three different electrolytes, CaCl2, NaCl, and KCl, and for four variants of the protein: the wild type and the Q86D, A92D, and Q86D/A92D mutants, which have been studied experimentally by Kuroki and Yutani [46].

Fig. 10.6. Mean activity coefficient of aqueous solutions of NaCl, KCl, and CaCl2

Fig. 10.7. Selective ion binding by human lysozyme: upper left, wild type; upper middle, Q86D; upper right, A92D; lower left, Q86D/A92D. The lower middle picture shows the calcium binding site in the Q86D/A92D mutant detected by X-ray, while the picture in lower right exhibits the binding site found by the 3D-RISM theory

Figure 10.7 shows the distributions of water molecules and cations inside and around the cleft of interest, which consists of the amino acid residues from Q86 to A92. The areas where the distribution function g(r) is greater than 5 are painted with a color for each species: oxygen of water, red; Na+ ion, yellow; Ca2+ ion, orange; K+ ion, purple. For the wild-type protein in aqueous solutions of all the electrolytes studied (CaCl2, NaCl, and KCl), no areas with g(r) > 5 are observed for the ions inside the cleft, as seen in the upper left part of Fig. 10.7. The Q86D mutant exhibits essentially the same behavior as the wild type, but with a slightly changed water distribution. (There is a trace of a yellow spot that indicates a slight possibility of finding a Na+ ion in the middle of the binding site, but it is too small to make a significant contribution to the distribution.) Instead of ions, the distribution corresponding to water oxygen is observed, as shown in red in the figure. The distribution faithfully covers the region where the crystallographic water molecules, shown as gray spheres, have been detected. There is one small difference between theory and experiment: the crystallographic water bound to the backbone of D91 is not reproduced by the theory, for reasons not yet identified. Except for this difference, the observation is consistent with the experimental findings, in particular that the protein with the wild-type sequence binds neither Na+ nor Ca2+.

The A92D mutant in the NaCl solution shows a conspicuous distribution of a Na+ ion bound at the recognition site, in accord with the experiment (upper right part of the figure). The Na+ ion is apparently bound to the carbonyl oxygen atoms of D92 and is distributed around those moieties. A water distribution is observed at the active site, but its shape is entirely changed from that in the wild type. The distribution indicates that the Na+ ion bound at the active site is not naked but is accompanied by hydrating water molecules. The mutant does not show any indication of binding a K+ ion (results not shown). This suggests that the A92D mutant discriminates a Na+ ion from a K+ ion. The finding demonstrates the capability of the 3D-RISM theory to describe ion selectivity by proteins.

The lower panels show the distributions of Ca2+ ions and of water oxygen at the ion binding site of the holo-Q86D/A92D mutant. The mutant is known experimentally to be a calcium-binding protein, and it indeed exhibits strong calcium-binding activity, as is evident from the figure. The calcium ion is recognized by the carboxyl groups of the three aspartic acid residues and is distributed around the oxygen atoms. The water distribution at the center of the triangle made by the three carbonyl oxygen atoms is reduced dramatically, which indicates that the Ca2+ ion is coordinated by the oxygen atoms directly, without water molecules in between. The Ca2+ ion, however, is not entirely naked, because a persistent water distribution is observed at least at two positions where water molecules were originally located in the wild-type protein.

10.6 Pressure-Induced Structural Transition of Protein and Molecular Recognition

“Molecular recognition,” or specific hydration, in the internal cavity of a protein is of substantial importance for the stability and integrity of the protein structure itself. In this section, we present an example of such phenomena.

Pressure denaturation of proteins has long been a focus of protein research, owing not only to its scientific significance [47–49] but also to its importance in industrial applications, including food processing [50]. The molecular mechanism of the process has long remained unclear, especially concerning the role played by water or hydration. We have applied the RISM/3D-RISM theory to this problem to clarify the molecular mechanism behind the thermodynamic process [51].

The change in the equilibrium constant K for the transition (N↔D) between the native (N) and denatured (D) states of a protein due to applied pressure can be described thermodynamically by

$$\left(\frac{\partial \ln K}{\partial p}\right)_T = -\frac{\Delta V}{RT}, \qquad (10.20)$$

where ΔV denotes the partial molar volume (PMV) change associated with the transition from N to D. This equation indicates that the conformational change induced by pressure should proceed in the direction of decreasing volume, which is nothing but “Le Chatelier’s law.” The experimental fact that a protein denatures entirely or partially under pressure indicates that ΔV for the N to D transition should be negative. However, this simple law had never been verified in terms of molecular theories, because there was neither a molecular theory to describe PMV nor data available for protein conformations at high pressure. As noted in the section outlining the theory, the RISM/3D-RISM theory is capable of describing the PMV of a protein at a quantitative level. Moreover, the structures of ubiquitin at high pressure (300 MPa) and at low pressure (3 MPa), shown in Fig. 10.8, have recently been obtained by the Akasaka group [52]. It was therefore natural to calculate the PMV for the two structures of the protein, the high-pressure structure (HPS) and the low-pressure structure (LPS), by using the 3D-RISM theory.
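The consequence of (10.20) can be made concrete by integrating it under the simplifying assumption that ΔV does not depend on pressure, which gives K(p)/K(p0) = exp[−ΔV (p − p0)/RT]. The sketch below evaluates this ratio; the value of ΔV used in the example is hypothetical and is not a result from this chapter, and the function name is an assumption for illustration.

```python
import math

# Sketch: integrated form of (10.20) for a pressure-independent Delta V.
# The Delta V in the example is a made-up number for illustration.

R = 8.314462618  # gas constant in J mol^-1 K^-1


def equilibrium_ratio(delta_v_cm3, p_mpa, p0_mpa, temperature=298.15):
    """Return K(p) / K(p0) for a constant volume change.

    delta_v_cm3 : Delta V of the N -> D transition, in cm^3/mol
    p_mpa       : final pressure, in MPa
    p0_mpa      : reference pressure, in MPa
    """
    # 1 cm^3/mol * 1 MPa = 1e-6 m^3/mol * 1e6 Pa = 1 J/mol
    work = delta_v_cm3 * (p_mpa - p0_mpa)
    return math.exp(-work / (R * temperature))


if __name__ == "__main__":
    # Hypothetical Delta V of -50 cm^3/mol between 3 MPa and 300 MPa.
    print(equilibrium_ratio(-50.0, 300.0, 3.0))
```

A negative ΔV makes the ratio larger than one, so raising the pressure shifts the equilibrium toward the state with the smaller volume, as the paragraph above describes.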

The data shown in Fig. 10.8 are the PMV change upon the structural transition and its decomposition into different contributions obtained by the 3D-RISM theory [53]. The decomposition is made by the following equation, which was first proposed by Chalikian and Breslauer [54] and later redefined theoretically by us [23, 55],

$$V = V_W + V_V + V_T + V_I + k_B T \chi_T, \qquad (10.21)$$

where $V_W$ is the van der Waals volume, $V_V$ is the volume of structural voids within the solvent-inaccessible core, $V_T$ is the thermal volume, which results from thermally induced molecular fluctuations between the solute and solvent and is regarded as the average empty space around the solute due to imperfect packing of the solvent, $V_I$ is the change in the solvent volume induced by the intermolecular interaction between the solute and solvent, and the last term, $k_B T \chi_T$, is the ideal contribution to the PMV from the translational degrees of freedom of the solute.

Fig. 10.8. Changes in the structure and in the volume components associated with the pressure-induced structural transition of ubiquitin. Solid ribbon representation of the low-pressure (3 MPa) and high-pressure (300 MPa) structures. The data shown are the total change in the partial molar volume ($V$) and the changes in the van der Waals ($V_W$), void ($V_V$), thermal ($V_T$), and interaction ($V_I$) volumes
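The size of the ideal term in (10.21), and the reason it does not appear among the changes listed for Fig. 10.8, can be checked with a short calculation. The sketch assumes that χ_T is the isothermal compressibility of the solvent, takes an approximate value of 4.5 × 10⁻¹⁰ Pa⁻¹ for water near room temperature, and expresses the term per mole as RTχ_T. The component values in the example are invented, and the function names are hypothetical.

```python
# Sketch: ideal term of (10.21) per mole, and the PMV change between
# two conformations as a sum of component changes. Numbers other than
# the physical constants are illustrative only.

R = 8.314462618  # gas constant in J mol^-1 K^-1


def ideal_volume_term(temperature, chi_t):
    """Per-mole ideal term R*T*chi_T, returned in cm^3/mol.

    chi_t : isothermal compressibility of the solvent, in 1/Pa
    """
    return R * temperature * chi_t * 1.0e6  # m^3/mol -> cm^3/mol


def pmv_change(components_a, components_b):
    """PMV change from state a to state b, summed over components.

    The ideal term is identical for both conformations in the same
    solvent at the same temperature, so it cancels and is omitted.
    """
    return sum(components_b[key] - components_a[key] for key in components_a)


if __name__ == "__main__":
    print(ideal_volume_term(298.15, 4.5e-10))  # roughly 1 cm^3/mol
    # Hypothetical component volumes (cm^3/mol) for two structures.
    lps = {"W": 100.0, "V": 20.0, "T": 30.0, "I": -5.0}
    hps = {"W": 101.0, "V": 12.0, "T": 33.0, "I": -6.0}
    print(pmv_change(lps, hps))
```

The ideal term is of the order of 1 cm³/mol and is the same for any conformation in a given solvent, which is why only the four remaining components contribute to a difference between two structures.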

The theoretical calculation indicates that the PMV of HPS is smaller than that of LPS, in accord with Le Chatelier’s law, and that most of the volume reduction comes from the void volume $V_V$. The question, then, is what molecular mechanism decreases the void volume under pressure. Is it simply a shrinkage of internal cavities that contain no water molecules? The answer is “no.” (In such a case, unlike the present result, the thermal volume $V_T$ would be almost unchanged [23].) Figure 10.9 shows the water distribution in the internal cavities of LPS (left) and HPS (right) of the protein. As indicated by the dashed circles, the water distribution inside the cavities is greatly enhanced in HPS compared with that in LPS. Part of the internal void space in LPS is thus filled with water molecules upon the structural change into HPS under pressure, which gives rise to the decrease in the void volume.

The relation between the thermodynamics and the molecular process of pressure denaturation, as clarified by the 3D-RISM theory, is as follows. At the low-pressure condition, under which all the calculations have been carried out, HPS is not the equilibrium conformation but one of the fluctuating structures. Applying pressure stabilizes this structure, which appears only as a fluctuation at low pressure, by reducing the PMV through enhanced contact with water molecules in the internal cavity. The equilibrium shifts toward HPS because of its smaller PMV.

Fig. 10.9. Isosurface representation of the 3D distribution function of water oxygen around the low-pressure (3 MPa) and high-pressure (300 MPa) structures of ubiquitin. The dark gray surfaces show the area where the distribution function is larger than 2. This is a top-view representation, in which the upper parts (the front parts in the figure) are clipped to bring the internal cavity (marked by dashed circle) into view

10.7 Perspective

In this chapter, we have presented a new method to describe molecular recognition in biomolecules based on the statistical mechanics of molecular liquids, the RISM/3D-RISM theory. For phenomena for which thermodynamic and structural data are available, the theoretical results have exhibited at least qualitative agreement with experiment. A typical example is the positions of water molecules in a cavity of hen egg-white lysozyme, for which the theoretical and experimental results agree quantitatively. In other cases, where there are no experimental data for comparison, the theory has demonstrated its predictive capability. The best example is the recognition of noble gases by lysozyme. Although no data are available for noble-gas binding by the protein, except for xenon, our theory reasonably accounts for the dependence of the binding affinity on the size of the noble-gas atoms, which shows an entirely different trend depending on the position and size of the cavities. We believe that the prediction will be proven sooner or later by X-ray and/or neutron diffraction measurements.

Although the RISM/3D-RISM theory has proven its capability of “prediction,” there are a few other summits to be conquered before it establishes itself as the “theory of molecular recognition.” The problem concerns the conformational fluctuation of proteins. For example, the present theory still requires experimental data for the structure of a protein as an “input.” In other words, we have not yet succeeded in “building” the tertiary structure of a protein from the amino acid sequence. If we become able to build the tertiary structure under different solution conditions (containing, say, electrolytes or other ligands) on the free energy surface produced by the RISM/3D-RISM method, we will be able to address at the same time the two most prominent problems in biophysics: “protein folding” and “molecular recognition.” The phrase “different solution conditions” has an even deeper implication. Experimental results clearly indicate that some folding processes are driven or enforced by “salt bridges” or “water bridges.” This implies that methodologies that do not account for water molecules and electrolytes explicitly are fatally limited for this purpose. The RISM/3D-RISM theory certainly has the ability to describe those ions and water molecules “bridging” amino acid residues inside a protein, as has been demonstrated in this chapter. If one could sample the protein conformation on the potential of mean force, or free energy surface, produced by the RISM/3D-RISM method, one would attain the two goals at the same time. We have already developed such methodologies to explore large fluctuations of proteins by combining the RISM/3D-RISM theory with the molecular dynamics [56] and Monte Carlo [57–59] methods.

Experimental analysis of protein function involves time-dependent properties such as the rate of an enzymatic reaction and the conduction rate of ions in an ion channel. These properties are related to comparatively small fluctuations of the protein around its native conformation. In an enzymatic reaction, an enzyme may have to “open” its “door” to accommodate substrate molecules in the reaction pocket. Ion channels have a device called the “gating” mechanism to control the flow of ions into the channel pore. These mechanisms are often regulated by conformational fluctuations of the protein. Analyses of those processes require evaluation of “dynamic” or time-dependent properties of both the protein and the solvent, which are sometimes closely correlated. In such a case, the “dynamics” on the free energy surface described earlier is insufficient; we have to describe the dynamics of protein and solvent on an equal footing. To the best of our knowledge, the generalized Langevin equation is the only theory that meets such a requirement. A study combining the RISM/3D-RISM theory with the generalized Langevin equation to describe the correlated dynamics of protein and solvent is in progress in our group [60].

All of the methods that we have been developing require solving the 3D-RISM equations for many conformations of a protein. Currently, it takes a few hours to solve the 3D-RISM equations for a single conformation of a protein with a few hundred residues on a modern workstation. It is therefore not feasible at present to solve the problems stated above on conventional computational resources, even if we succeed in building the methodology. However, with the National Project of building a next-generation supercomputer, which is underway in Japan, the 3D-RISM methodology, fine-tuned to and drastically accelerated by the new supercomputer, will hopefully make a crucial contribution to solving these most important problems in the life sciences.

References

1. J.D. Watson et al., Molecular Biology of the Gene (Benjamin/Cummings, Menlo Park, CA, 1987)
2. L. Michaelis, M. Menten, Biochem. Z. 49, 333 (1913)
3. C. Sotriffer, G. Klebe, Il Farmaco 57, 243 (2002)
4. H. Gohlke, G. Klebe, Angew. Chem. Int. Ed. 41, 2644 (2002)
5. W.C. Still, A. Tempczyk, R.C. Hawley, T. Hendrickson, J. Am. Chem. Soc. 112, 6127 (1990)
6. M.K. Gilson, B. Honig, Proteins: Struct. Funct. Genet. 4, 7 (1988)
7. M. Kato, A. Warshel, J. Phys. Chem. B 109, 19516 (2005)
8. F. Zhu, E. Tajkhorshid, K. Schulten, Biophys. J. 86, 50 (2004)
9. F. Hirata (ed.), Molecular Theory of Solvation (Kluwer, Dordrecht, 2003)
10. A. Kovalenko, F. Hirata, Chem. Phys. Lett. 290, 237 (1998)
11. D. Beglov, B. Roux, J. Phys. Chem. B 101, 7821 (1997)
12. J.-P. Hansen, I.R. McDonald, Theory of Simple Liquids, 3rd edn. (Academic, London, 2006)
13. L. Blum, A.J. Torruella, J. Chem. Phys. 56, 303 (1972)
14. T. Imai, A. Kovalenko, F. Hirata, Chem. Phys. Lett. 395, 1 (2004)
15. T. Imai, R. Hiraoka, A. Kovalenko, F. Hirata, J. Am. Chem. Soc. 127, 15334 (2005)
16. D. Chandler, H.C. Andersen, J. Chem. Phys. 57, 1930 (1972)
17. C.M. Cortis, P.J. Rossky, R.A. Friesner, J. Chem. Phys. 107, 6400 (1997)
18. A. Kovalenko, F. Hirata, J. Chem. Phys. 110, 10095 (1999)
19. S.J. Singer, D. Chandler, Mol. Phys. 55, 621 (1985)
20. J.G. Kirkwood, F.P. Buff, J. Chem. Phys. 19, 774 (1951)
21. T. Imai, M. Kinoshita, F. Hirata, J. Chem. Phys. 112, 9469 (2000)
22. Y. Harano, T. Imai, A. Kovalenko, M. Kinoshita, F. Hirata, J. Chem. Phys. 114, 9506 (2001)
23. T. Imai, A. Kovalenko, F. Hirata, J. Phys. Chem. B 109, 6658 (2005)
24. E. Mayer, Protein Sci. 1, 1543 (1992)
25. Y. Zhou, J.H. Morais-Cabral, A. Kaufman, R. MacKinnon, Nature 414, 43 (2001)
26. T. Tanimoto, Y. Furutani, H. Kandori, Biochemistry 42, 2300 (2003)
27. M. Nakasako, Phil. Trans. R. Soc. Lond. B Biol. Sci. 359, 1191 (2004)
28. N. Niimura, S. Arai, K. Kurihara, T. Chatake, I. Tanaka, R. Bau, Cell. Mol. Life Sci. 62, 285 (2006)
29. T. Imai, R. Hiraoka, A. Kovalenko, F. Hirata, Proteins: Struct. Funct. Bioinformat. 66, 804 (2007)
30. K.P. Wilson, B.A. Malcolm, B.W. Matthews, J. Biol. Chem. 267, 10842 (1992)
31. D.B. Kitchen, H. Decornez, J.R. Furr, J. Bajorath, Nat. Rev. Drug Discov. 3, 935 (2004)
32. G. Klebe, Drug Discov. Today 11, 580 (2006)
33. O. Lichtarge, M.E. Sowa, Curr. Opin. Struct. Biol. 12, 21 (2002)
34. S.F. Sousa, P.A. Fernandes, M.J. Ramos, Proteins: Struct. Funct. Genet. 65, 15 (2006)
35. J.E. Ladbury, Chem. Biol. 3, 973 (1996)
36. Y. Levy, J.N. Onuchic, Annu. Rev. Biophys. Biomol. Struct. 35, 389 (2006)
37. Z. Li, T. Lazaridis, Phys. Chem. Chem. Phys. 9, 573 (2007)
38. T. Imai, R. Hiraoka, T. Seto, A. Kovalenko, F. Hirata, J. Phys. Chem. B 111, 11585 (2007)
39. T. Prange, M. Schiltz, L. Pernot, N. Colloc’h, S. Longhi, W. Bourguet, R. Fourme, Proteins: Struct. Funct. Genet. 30, 61 (1998)
40. O. Herzberg, M.N. James, Nature 313, 635 (1985)
41. M. Ikura, G.M. Clore, A.M. Gronenborn, G. Zhu, C.B. Klee, A. Bax, Science 256, 632 (1992)
42. B. Hille, Ionic Channels of Excitable Membranes (Sinauer Associates, Sunderland, MA, 2001)
43. S. Tsuda, K. Ogura, Y. Hasegawa, K. Yagi, K. Hikichi, Biochemistry 29, 4951 (1990)
44. N. Yoshida, S. Phongphanphanee, Y. Maruyama, T. Imai, F. Hirata, J. Am. Chem. Soc. 128, 12042 (2006)
45. N. Yoshida, S. Phongphanphanee, F. Hirata, J. Phys. Chem. B 111, 4588 (2007)
46. R. Kuroki, K. Yutani, J. Biol. Chem. 273, 34310 (1998)
47. J.L. Silva, G. Weber, Annu. Rev. Phys. Chem. 44, 89 (1993)
48. C. Balny, P. Masson, K. Heremans, Biochim. Biophys. Acta 1595, 3 (2002)
49. F. Meersman, C.M. Dobson, K. Heremans, Chem. Soc. Rev. 35, 908 (2006)
50. M.F. San Martin, G.V. Barbosa-Canovas, B.G. Swanson, Crit. Rev. Food Sci. Nutr. 42, 627 (2002)
51. T. Imai, Condens. Matter Phys. 10, 343 (2007)
52. R. Kitahara, S. Yokoyama, K. Akasaka, J. Mol. Biol. 347, 277 (2005)
53. T. Imai, S. Ohyama, A. Kovalenko, F. Hirata, Protein Sci. 16, 1927 (2007)
54. T.V. Chalikian, K.J. Breslauer, Biopolymers 39, 619 (1996)
55. T. Imai, Y. Harano, A. Kovalenko, F. Hirata, Biopolymers 59, 512 (2001)
56. T. Miyata, F. Hirata, J. Comput. Chem. 29, 871 (2008)
57. M. Kinoshita, Y. Okamoto, F. Hirata, J. Am. Chem. Soc. 120, 1855 (1998)
58. A. Mitsutake, M. Kinoshita, Y. Okamoto, F. Hirata, Chem. Phys. Lett. 329, 295 (2000)
59. A. Mitsutake, M. Kinoshita, Y. Okamoto, F. Hirata, J. Phys. Chem. B 108, 19002 (2004)
60. B. Kim, S.-H. Chong, R. Ishizuka, F. Hirata, Condens. Matter Phys. 11, 179 (2008)

11

Computational Studies of Protein Dynamics

J.A. McCammon

Abstract. Theoretical and computational studies of protein function have reached the point at which they are making important contributions to drug discovery and other practical applications. At the same time, they are deepening our understanding of the principles of protein activity, including the dynamical features that give rise to NMR and other experimental measurements, and the time-dependent aspects of biological function.

11.1 Introduction

Proteins are well known to exhibit a wide variety of internal motions, on timescales extending from femtoseconds to hours. These motions are also known to be involved in protein function. Examples of such functional motions include the displacement of amino acid residues in enzymes to allow substrate binding and product release, and the rearrangements of enzyme and substrate atoms during catalysis. But how important are the details of the time dependence of such motion? It appears, in fact, that the functions of proteins are governed in many cases by the detailed time dependence of their internal motions. Indeed, it appears that evolution has shaped not only the structures of proteins, but also these essential dynamical characteristics. This chapter provides an overview of protein dynamics and function. Representative experimental results are outlined, and it is shown how computer simulations can be used quantitatively to interpret the dynamical behavior of proteins, including their binding of ligands.

11.2 Brief Survey of Protein Motions

Some internal motions of proteins can be described quite simply. These include the localized vibrations within covalently bonded groups and also the elastic vibrations that involve coherent small-amplitude displacements of larger portions of the molecule. But generally, motions in proteins are more complex, and more interesting. The ease with which dihedral angles can be varied in proteins, together with the relatively soft nature of their nonbonded interactions other than the short-range interatomic repulsions, and the dense packing of groups within globular proteins combine to yield the rugged energy landscape that is now familiar from much experimental and theoretical work [8, 35]. Variations in the protonation states of titratable groups and in the binding of water molecules and ions to sites in the protein also contribute to the structure of the energy landscape, as discussed below.

Motions in proteins correspond to excursions on this energy landscape, and may be correspondingly complex. Even the “simple” motions mentioned at the outset of this section will be perturbed by transitions over barriers in the protein’s energy landscape; e.g., the localized vibrations of a covalently bonded group will differ to some extent, depending on which energy well in the landscape the biopolymer resides in.

Spectroscopic studies on the protein myoglobin indicated that proteins may have hierarchical energy landscapes [8]. This study and many subsequent ones suggest that a typical globular protein may have a few conformational substates in its “taxonomic” tier with the largest barriers, that barriers between such taxonomic conformational substates may be on the order of 100 kJ mol⁻¹, and that there may typically be a few lower tiers with a small number of conformational substates in each tier [29]. A nuclear magnetic resonance study of the most slowly exchanging buried water molecule in the bovine pancreatic trypsin inhibitor indicates that its exchange can be modeled as a diffusion process on an energy landscape with the crossing of barriers on the order of 10 kJ mol⁻¹ [6]. The exchange of this water molecule occurs with a characteristic time of about 170 μs at 300 K. Examination of the structure of the protein shows that not only side-chain motions but also significant backbone motions must occur during the exchange of this particular water molecule, which is consistent with many conformational substates being involved in the exchange process. The authors of this study suggest that any local process in a protein that occurs on the nanosecond to millisecond timescale and requires substantial displacements of groups in the protein may be rate-limited by interconversion of conformational substates and display features similar to those observed in their study [6].
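To give a feeling for what barriers of 10 and 100 kJ mol⁻¹ mean at 300 K, the following minimal sketch evaluates the Boltzmann factor exp(−ΔE/RT) and a crude Arrhenius-type crossing time. The picosecond attempt time is a hypothetical choice made only for illustration; it is not a value given in this chapter, and the function names are likewise illustrative.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_factor(barrier_kj_mol, temperature_k=300.0):
    """Return exp(-barrier / RT) for a barrier given in kJ/mol."""
    return math.exp(-barrier_kj_mol / (R * temperature_k))


def crossing_time(barrier_kj_mol, attempt_time_s, temperature_k=300.0):
    """Arrhenius-type estimate: attempt time divided by the Boltzmann factor."""
    return attempt_time_s / boltzmann_factor(barrier_kj_mol, temperature_k)


# ASSUMPTION: a 1 ps attempt time, chosen only to set a scale.
ATTEMPT_TIME_S = 1.0e-12

for barrier in (10.0, 100.0):
    factor = boltzmann_factor(barrier)
    time_s = crossing_time(barrier, ATTEMPT_TIME_S)
    print(f"{barrier:6.1f} kJ/mol  exp(-E/RT) = {factor:.3e}  time ~ {time_s:.2e} s")
```

Under this assumption a single 10 kJ mol⁻¹ barrier would be crossed within tens of picoseconds, far faster than the 170 μs exchange time quoted above. That gap fits the picture of exchange as diffusion over many substates rather than as one barrier crossing. The estimate is an order-of-magnitude guide only.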

Recent single-molecule experimental studies of proteins provide more detailed views of protein motions, and confirm that a wide variety of timescales is involved in, e.g., catalytic action of enzymes [7,14,15,19,33]. Of course, molecular dynamics simulations have been used to probe motions in single proteins for many years, and advances in both theory and computational science have made simulations a powerful approach to building theoretical understanding of protein dynamics [1]. The recent introduction of “accelerated molecular dynamics” methods is helpful in this context [11]. Although detailed dynamical information is sacrificed to the enhanced sampling of conformational space in these methods, which have been shown to access conformational fluctuations that are revealed by nuclear magnetic resonance experiments on the millisecond timescale [17], it is possible to recover dynamical information with certain models and approximations [12]. Also, accelerated molecular dynamics simulations have revealed the important role of solvent water in contributing to the rough energy landscape of proteins. That is, the roughness does not emerge entirely from the protein; the making and breaking of hydrogen bonds between the protein and the solvent are estimated to increase the roughness of the protein landscape by about 4 kJ mol⁻¹, with marked effects on the overall timescale of protein motions [13].
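One simple way to see why a few kJ mol⁻¹ of roughness can matter is Zwanzig's model of diffusion on a one-dimensional landscape with Gaussian-distributed roughness of r.m.s. amplitude ε, for which the effective diffusion constant is reduced by exp[−(ε/RT)²]. That model is brought in here as an assumption for illustration; the chapter does not state which model underlies the 4 kJ mol⁻¹ estimate, and treating 4 kJ mol⁻¹ as the total roughness is a further simplification.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def roughness_slowdown(epsilon_kj_mol, temperature_k=300.0):
    """Factor D / D_eff = exp[(epsilon / RT)^2] in the assumed Zwanzig model."""
    return math.exp((epsilon_kj_mol / (R * temperature_k)) ** 2)


# Scan hypothetical roughness amplitudes at 300 K.
for eps in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"epsilon = {eps:3.1f} kJ/mol  slowdown = {roughness_slowdown(eps):.3g}")
```

Because the roughness enters as a square inside the exponential, the slowdown grows very steeply with ε. This is the qualitative point behind the "marked effects" on timescales mentioned above.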

11.3 Binding and Selectivity

In a number of cases, particularly where ligand–receptor binding is fast, it appears that certain features of the internal motion of one or both partners have evolved to be rapid enough to avoid kinetic bottlenecks. The enzyme acetylcholinesterase represents one such case. Acetylcholinesterase is found in cholinergic synapses, including neuromuscular junctions. It functions to clear the neurotransmitter acetylcholine following excitation of the postsynaptic nerve or muscle. As such, it has been under tremendous evolutionary pressure to operate at the maximum possible speed; e.g., the correspondingly fast reflexes aided our ancestors in escaping from predators. In the crystallographic structures of forms of the enzyme from two different species, a gorge or channel extending approximately 2 nm from the surface of the enzyme to the active site is apparent [3, 28]. This is the most likely route for binding the substrate acetylcholine. But in both structures, a constriction exists midway down the channel that, if static, would preclude passage of substrate. Despite this, the enzyme binds substrate at or near the diffusion-controlled limit. It has been known for some time that if such obstacles can be removed frequently enough by the fluctuations in an enzyme or other receptor, the obstacles will not slow the overall rate of binding [18]; this is termed the fast gating kinetic regime. Molecular dynamics simulations suggest that this is the situation for acetylcholinesterase [25, 34]. Fluctuations in the enzyme open the channel every few picoseconds, which is often enough to allow capture of the substrate before it can diffuse away over times on the order of a few hundred picoseconds. A recent analysis by Zhou [37] presents the most complete current theory of such “gated” diffusional binding processes, and suggests that a similar picture describes the classic example of myoglobin. Another group of gated enzymes comprises those that have a peptide loop that opens and closes over the active site. Wade et al. [32] suggest that somewhat slower but still rapid gating (times on the order of 1 ns) allows one such enzyme, triosephosphate isomerase, to operate in the diffusion-controlled regime.
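The fast and slow gating regimes can be illustrated with a two-state gate on a perfectly absorbing sphere. The sketch below assumes the standard closed-form result for stochastically gated, diffusion-controlled binding, 1/k_G = 1/k_D + (w_c/w_o)/[s k̂(s)] with s = w_o + w_c, where s k̂(s) = k_D(1 + R√(s/D)) for the Smoluchowski rate coefficient. This formula and all numerical parameters (open fraction, capture radius, diffusion constant) are assumptions introduced for illustration; they are not taken from this chapter and are not fitted to acetylcholinesterase.

```python
import math


def gated_rate_ratio(w_open, w_close, radius, diffusion):
    """Return k_gated / k_ungated for a two-state gate on an absorbing sphere.

    w_open, w_close: closed->open and open->closed rate constants (1/ps).
    radius: capture radius (nm); diffusion: relative diffusion constant (nm^2/ps).
    """
    s = w_open + w_close  # gate relaxation rate
    boost = 1.0 + radius * math.sqrt(s / diffusion)  # s * k_hat(s) / k_D
    return 1.0 / (1.0 + (w_close / w_open) / boost)


# ASSUMPTIONS: hypothetical parameters chosen only to show the two limits.
P_OPEN = 0.1        # fraction of time the gate is open
RADIUS = 0.5        # nm
DIFFUSION = 1.0e-3  # nm^2/ps, i.e. about 1e-9 m^2/s

for tau in (1e-6, 1e-3, 1.0, 1e3, 1e6, 1e9):  # gate relaxation time in ps
    s = 1.0 / tau
    ratio = gated_rate_ratio(P_OPEN * s, (1.0 - P_OPEN) * s, RADIUS, DIFFUSION)
    print(f"tau = {tau:8.0e} ps  k_gated/k_ungated = {ratio:.3f}")
```

In this model the ratio tends to 1 when the gate fluctuates much faster than the ligand diffuses over the capture radius (fast gating), and to the open fraction when the gate is slow. The intermediate values depend entirely on the assumed parameters.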

It must be noted that the molecular dynamics simulations of acetylcholinesterase mentioned above are far too short to sample transitions over barriers that separate many conformational substates of the protein. But for acetylcholinesterase, similar behavior is observed for the two subunits of the homodimeric enzyme that was simulated [34]. Because only small displacements in the wall of the channel are required to open the gate, it may be that a relatively simple “elastic” picture is sufficient here. For myoglobin, where ligand binding is thought to involve more complex motions of the protein, the gate dynamics may still be sufficiently rapid at 300 K to allow for “simple” binding.

What are the possible functional implications of such gating motions? Ensuring the maximum possible speed of binding is clearly one function. For example, it is necessary to create a special environment around a substrate for enzymatic catalysis, but evolutionary pressure has forced the creation of this environment to happen very rapidly for certain enzymes.

Another function may well be the contribution of gating to the selectivity of binding [39]. For ligands that are only slightly larger than the natural ones, the gate may not open frequently enough to allow unhindered binding, so that the larger ligands are less likely to be bound before diffusing away. The overall rate of binding can decrease very rapidly with the increasing size of the ligand, and this will be reflected in the probability of binding one ligand compared to another in the nonequilibrium regime typical of living systems [39].

As biophysical studies move above the molecular level to consider supramolecular and cellular scale processes, similar issues are certain to arise. In fact, the physiologically important form of acetylcholinesterase in many synapses comprises closely held tetramers, attached to collagen-like stalks, which are in turn attached to the postsynaptic membrane. X-ray crystallography has suggested that a number of arrangements of the monomers are possible in these tetrameric clusters, including structures in which one monomer may occlude the active site of a neighbor. Recent simulation studies by Gorfe et al. show, however, that the relative diffusional motions of the acetylcholinesterase monomers are fast enough to reduce the kinetic penalties associated with such steric hindrance; in other words, the kinetics is in the “fast gating” regime [9].

Although the above discussion focused on the binding of small molecules to biopolymers, similar issues arise in connection with the binding of biopolymers to one another. In particular, rapid motion (times of a few nanoseconds) of surface loops of proteins may facilitate the assembly of chaperonins [16] and allow the binding of multiple receptors in the case of certain fibronectin domains [4].

Recent studies have shown that conformational fluctuations of proteins can be important in structure-based drug discovery, as in the discovery of an unexpected “cryptic” binding site in the HIV integrase enzyme during the course of molecular dynamics studies (Fig. 11.1) [24]. This helped to pave the way for the discovery of the first in a new class of antiviral agents for HIV/AIDS, the compound Isentress (raltegravir), which was licensed by the U.S. Food and Drug Administration in October 2007. A recent review of work in this area has been published by Amaro et al. [2].


Fig. 11.1. Two predicted binding conformations of an HIV-1 integrase inhibitor to a molecular dynamics (MD) snapshot of the protein. The green conformation is similar to that in the crystal structure and the magenta is in a secondary predicted binding trench that opened during an MD simulation of the protein [24]

Of importance in the present context, the binding of drugs to fluctuating binding sites in target molecules can be kinetically gated by the detailed dynamics of those sites. This has been shown to be the case in the binding of a number of clinically useful inhibitors to the HIV protease enzyme [5]. Simulations of the HIV protease enzyme revealed opening and closing of peptide loops or “flaps” that lie over the active site. Analysis of these using gated binding theory [39] showed that the predicted order of rate constants for drugs of different sizes agreed with the experimental results. An emerging frontier in biophysics is the characterization of the effects of the crowded cellular environment on molecular processes. For the case of HIV protease, Brownian dynamics simulations using coarse-grained models of the polypeptide have shown that crowding can have a substantial effect on the frequency of opening and closing of the enzyme’s active site [20].

Because only small displacements are required to open the gates in some of the systems mentioned above, biopolymer motion on short timescales (picoseconds to nanoseconds) can influence function. In other cases, larger displacements and longer timescales are important, as discussed in later sections.


11.4 Concerted Binding and Release

In the case of biopolymers or assemblies of biopolymers that bind more than one ligand, it appears that the binding of one ligand sometimes drives the release or relocation of another ligand. One example is the enzyme dihydrofolate reductase from E. coli, in which the binding of the cofactor nicotinamide adenine dinucleotide phosphate (NADPH) leads to structural changes that tend to expel the cofactor tetrahydrofolate (THF) from a different site, as part of the cyclic activity of the enzyme [23]. In some ATP synthases, the proton-driven rotation of an asymmetric axle centered in an enzymatic cluster causes changes in the conformations of these enzymes, which in turn enable substrate binding, and drive catalysis and product release, all at different sites in the cluster; an excellent discussion of coordinated events in molecular biophysics has been presented recently by Zhou [38]. The actual dynamics of the transitions involved remains to be fully determined, but is undoubtedly complex. Nevertheless, remarkable videos of the concerted motions in this system have been obtained in the laboratory of Masasuke Yoshida [30]. To have useful rates of turnover (time of about 100 ms for ATP synthase), there must be upper bounds on the roughness of the energy landscape.

11.5 Molecular Clocks

The preceding discussion has considered processes that are fast, or at least closely correlated in time. Other functional processes in biopolymers may require delay times, which in some cases may imply lower bounds on the roughness of the energy landscape.

Slow kinetics is very important in signal transduction. A well-known case is that of the so-called G proteins, which typically exchange the nucleotide GDP for GTP to become activated and so able to activate downstream partners [10]. The G proteins return to their inactivated GDP-bound state by slow hydrolysis of GTP; in other words, the G proteins are intrinsically “bad” enzymes. The inactivation of the G proteins can be greatly speeded up by their interaction with “GTPase activating proteins.”

Enzymes that bind two or more substrates or cofactors that interact in the active site may in some cases require that one of these species be held for some time in a particular conformation. This has been suggested to be the case in lactate dehydrogenase, where the cofactor NADH may have to retain the conformation found in its binary complex with the enzyme during the binding and reaction of substrate [31]. The nicotinamide ring is sterically hindered from rotating during the reaction, which occurs on the millisecond timescale, and this is thought to contribute to the stereospecificity of the reaction.

Necessary delay times may also occur in biopolymer conformational changes. This appears to occur, for example, in certain proteins that effect the fusion of viral and host membranes. In the case of the influenza protein hemagglutinin HA2, post-translational cleavage is thought to leave the protein in a long-lived metastable conformation, which is induced to change only in response to a reduction in pH within the host cell [27]. In another context, it has been suggested on the basis of molecular dynamics simulations that the relaxation of the bacterial photosynthetic reaction center is slow on the timescale of the initial electron transfer steps following photoexcitation, and that this slow relaxation leads to a smaller reorganization energy and faster electron transfer than would be obtained in the case of fast relaxation [21].

The delay of product release is crucial for the time-dependent organization of the cell cytoskeleton. Actin filaments are dynamic polymers whose assembly and disassembly in the cytoplasm drives cell shape changes, cell locomotion, and chemotactic migration. The ATP hydrolysis that accompanies actin polymerization and the subsequent release of the cleaved phosphate destabilizes the filaments, and, therefore, must be slow compared to their elongation [22]. The results of molecular dynamics simulations suggest that the phosphate is stabilized by a tightly bound divalent cation and by a salt bridge formed with His73 [36]. Consistent with this model, certain His73 mutants exhibit rapid depolymerization or decreased stability [26]. Actin’s phosphate release appears to act as a clock, altering in a time-dependent manner the mechanical properties of the filament and its propensity to depolymerize.

Acknowledgments

Work in the author’s laboratory is supported in part by the National Institutes of Health, the National Science Foundation, the Howard Hughes Medical Institute, the NSF Center for Theoretical Biological Physics, the NIH National Biomedical Computation Resource, the NSF Supercomputer Centers, and Accelrys.

References

1. S.A. Adcock, J.A. McCammon, Chem. Revs. 106, 1589 (2006)
2. R.E. Amaro, R. Baron, J.A. McCammon, J. Comput. Aid. Mol. Des. 22, 693 (2008)
3. Y. Bourne, P. Taylor, P. Marchot, Cell 83, 502 (1995)
4. P.A. Carr, H.P. Erickson, A.G. Palmer, Structure 5, 949 (1997)
5. C.E. Chang, T. Shen, J. Trylska, V. Tozzini, J.A. McCammon, Biophys. J. 90, 3880 (2006)
6. V.P. Denisov, J. Peters, H.D. Horlein, B. Halle, Nat. Struct. Biol. 3, 505 (1996)
7. R.M. Dickson, A.B. Cubitt, R.Y. Tsien, W.E. Moerner, Nature 388, 355 (1997)
8. H. Frauenfelder, S.G. Sligar, P.G. Wolynes, Science 254, 1598 (1991)
9. A.A. Gorfe, C.E. Chang, I. Ivanov, J.A. McCammon, Biophys. J. 94, 1144 (2008)
10. A.A. Gorfe, B.J. Grant, J.A. McCammon, Structure 16, 885 (2008)
11. D. Hamelberg, J. Mongan, J.A. McCammon, J. Chem. Phys. 120, 11919 (2004)
12. D. Hamelberg, T. Shen, J.A. McCammon, J. Chem. Phys. 122, 241103 (2005)
13. D. Hamelberg, T. Shen, J.A. McCammon, J. Chem. Phys. 125, 094905 (2006)
14. Y.W. Jia, A. Sytnik, L.Q. Li, S. Vladimirov, B.S. Cooperman, R.M. Hochstrasser, Proc. Natl. Acad. Sci. USA 94, 7932 (1997)
15. H. Kojima, E. Muto, H. Higuchi, T. Yanagida, Biophys. J. 73, 2012 (1997)
16. S.J. Landry, N.K. Steede, K. Maskos, Biochemistry 36, 10975 (1997)
17. P.R.L. Markwick, G. Bouvignies, M. Blackledge, J. Am. Chem. Soc. 129, 4724 (2007)
18. J.A. McCammon, S.H. Northrup, Nature 293, 316 (1981)
19. W. Min, B.P. English, G. Luo, B.J. Cherayil, S.C. Kou, X.S. Xie, Acc. Chem. Res. 38, 923 (2005)
20. D.D.L. Minh, C.E. Chang, J. Trylska, V. Tozzini, J.A. McCammon, J. Am. Chem. Soc. 128, 6006 (2006)
21. W.W. Parson, Z.T. Chu, A. Warshel, Biophys. J. 74, 182 (1998)
22. T.D. Pollard, I. Goldberg, W.H. Schwarz, J. Biol. Chem. 267, 20339 (1992)
23. M.R. Sawaya, J. Kraut, Biochemistry 36, 586 (1997)
24. J. Schames, R.H. Henchman, J.S. Siegel, C.A. Sotriffer, H. Ni, J.A. McCammon, J. Med. Chem. 47, 1879 (2004)
25. T. Shen, K. Tai, R.H. Henchman, J.A. McCammon, Acc. Chem. Res. 35, 332 (2002)
26. L.R. Solomon, P.A. Rubenstein, J. Biol. Chem. 262, 11382 (1987)
27. D.A. Steinhauer, J. Martin, Y.P. Lin, S.A. Wharton, M.B.A. Oldstone, J.J. Skehel, D.C. Wiley, Proc. Natl. Acad. Sci. USA 93, 12873 (1996)
28. J.L. Sussman, M. Harel, F. Frolow, C. Oefner, A. Goldman, L. Toker, I. Silman, Science 253, 872 (1991)
29. D. Thorn Leeson, D.A. Wiersma, K. Fritsch, J. Friedrich, J. Phys. Chem. B 101, 6331 (1997)
30. H. Ueno, T. Suzuki, K. Kinosita Jr., M. Yoshida, Proc. Natl. Acad. Sci. USA 102, 1333 (2005)
31. J. van Beek, R. Callender, M.R. Gunner, Biophys. J. 72, 619 (1997)
32. R.C. Wade, B.A. Luty, E. Demchuk, J.D. Madura, M.E. Davis, J.M. Briggs, J.A. McCammon, Nat. Struct. Biol. 1, 65 (1994)
33. S. Wennmalm, L. Edman, R. Rigler, Proc. Natl. Acad. Sci. USA 94, 10641 (1997)
34. S.T. Wlodek, T.W. Clark, L.R. Scott, J.A. McCammon, J. Am. Chem. Soc. 119, 9513 (1997)
35. P.G. Wolynes, Q. Rev. Biophys. 38, 405 (2005)
36. W. Wriggers, K. Schulten, Proteins 35, 262 (1999)
37. H.X. Zhou, J. Chem. Phys. 108, 8146 (1998)
38. H.X. Zhou, Phys. Biol. 2, R1 (2005)
39. H.X. Zhou, S.T. Wlodek, J.A. McCammon, Proc. Natl. Acad. Sci. USA 95, 9280 (1998)


12

Biological Functions of Trehalose as a Substitute for Water

M. Sakurai

Abstract. A disaccharide, α,α-trehalose, acts as a substitute for water in biological systems. Such a function comes from the unique hydration and solid-state properties of this sugar, which ultimately originate in the presence of the α,α-1,1-glycosidic linkage. A recent study on the anhydrobiosis of Polypedilum vanderplanki, an insect that survives desiccation, brought about a significant advance in our understanding of the functional mechanism of trehalose in vivo.

12.1 Introduction

Water is the most abundant molecule in cells, accounting for approximately 70% of the total weight of a cell. It plays a crucial role in stabilizing the higher order structures of proteins, membranes and DNA, and is the medium of various biological reactions. However, some organisms can survive adverse environments such as drought and low temperature through various physiological and biochemical adaptations. An ultimate strategy against desiccation stress is anhydrobiosis, or “life without water” [1,2], which is the state of an organism severely dehydrated but capable of revival after rehydration. Anhydrobiosis is found across diverse biological kingdoms, including plants, animals, mushrooms, nematodes, yeasts, fungi, brine shrimp and insects [1–3]. These anhydrobiotes commonly contain high concentrations of disaccharides, particularly α,α-trehalose (hereafter trehalose, Fig. 12.1) [1]. For example, when an African chironomid, Polypedilum vanderplanki, was dehydrated slowly, it converted as much as 20% of its dry weight into this molecule [3].

Trehalose is able to stabilize biological structures in a dehydrated form, and to make them intact and functional as soon as the hydration and temperature conditions return to normal. Thus, it behaves as a chemical chaperone in desiccation stress. Additionally, trehalose acts as a protectant against other environmental stresses such as freezing [4,5], osmotic shock [6], oxidation [7–9], among others. Our goal is to understand why trehalose behaves as a chemical chaperone, and why it is superior to other saccharides as a stress protectant. To address the first problem, it is necessary to obtain detailed information about the physicochemical properties of trehalose in the dehydrated state, especially about its solid-state phase transition and vitrification behaviors. On the other hand, as is well known, the water surrounding solute molecules such as sugars is structurally and dynamically distinct from bulk water. It is inferred that the solute-induced changes of the water structure result in a significant modification of the hydration shell near biological molecules such as proteins and membranes and thereby influence their stability. For trehalose in particular, its strong protection ability against water stresses may result from its strong perturbation effect on the surrounding water. Therefore, to address the second problem, it is necessary to elucidate the features of trehalose from the viewpoint of its hydration properties.

Fig. 12.1. Molecular structure of α,α-trehalose (α-D-glucopyranosyl α-D-glucopyranoside)

In this review, we shall first focus our attention on the hydration and solid-state properties of trehalose, and then extract the characteristic features of trehalose. Based on these findings, we shall discuss the possible mechanisms by which trehalose acts as a chemical chaperone and subsequently outline our recent study reporting the mechanism by which P. vanderplanki survives an extremely dehydrated state. Furthermore, regarding the peculiar hydration property of this sugar, we shall briefly describe the antioxidant function of trehalose and its inhibitory effect on protein aggregation. Finally, we shall describe a perspective on the application of trehalose to long-term storage of biological materials.


12.2 Hydration Property of Trehalose

12.2.1 Property of the Aqueous Solution of Trehalose

Here we compare the thermodynamic parameters of trehalose, maltose and sucrose because they have the same chemical formula (C12H22O11) and mass (molecular weight 342.3), but different structures which could be responsible for their different hydration properties. The anomalous hydration of trehalose can be seen from the following observation [10]: the amount of water used for the preparation of a 1.5 M trehalose solution is smaller than the amount used for the preparation of other sugar solutions. In a 1.5 M solution, trehalose itself occupies 37.5% of the volume of the solution. However, in a 1.5 M solution, sucrose occupies 13% and maltose occupies 14%. These data suggest that trehalose has a larger hydrated volume than the other sugars. This hypothesis can be demonstrated from various thermodynamic parameters as shown in Table 12.1.

The intrinsic viscosity $[\eta]$ is attributed to an overall hydrodynamic volume of the solute. The values of $[\eta]$ for the above disaccharides are close to each other but that for trehalose is slightly larger [11]. Partial molar volume $V_2^0$ is the sum of the intrinsic volume $V_{\text{int}}$ of the solute and the volume contribution $V_{\text{solute–solvent}}$ due to solute–solvent interactions,

$$V_2^0 = V_{\text{int}} + V_{\text{solute–solvent}},$$

and is therefore informative of the character of solute–solvent interactions. The above disaccharides might have similar molecular volumes because of the same mass and formula and thus the difference of their $V_2^0$ values should reflect that of the $V_{\text{solute–solvent}}$ term. The $V_2^0$ value of trehalose is smaller than those of maltose and sucrose [12], which is indicative of a more extensive solute–solvent interaction in the aqueous solution of trehalose. The partial molar compressibility $K_2^0$, corresponding to the second derivative of free energy with respect to pressure, is a more sensitive parameter that directly reflects solute–solvent interactions, since the intrinsic volume can be regarded as incompressible, which would be true for small molecules under ordinary pressures. The value of isentropic partial molar compressibility $K_{s,2}^0$ assumes a more negative value when the water in the hydration shell becomes denser and less compressible than bulk water; in other words, when the hydration shell forms more extensive or strong hydrogen bonding. As expected, trehalose has a larger negative value of $K_{s,2}^0$ than maltose and sucrose [13]. These observations are also supported by the fact that trehalose has a larger positive value of partial molar heat capacity $C_{p,2}^0$ [12], which becomes more positive when extensive or strong hydrogen bonding interaction or hydrophobic hydration occurs between a solute and its surrounding water molecules. Table 12.2 summarizes the data of hydration number $n_h$ obtained from different experimental techniques [11, 13–15]. In accord with the picture based on the above thermodynamic parameters, the hydration number $n_h$ of trehalose is larger than those of maltose and sucrose. According to the result of recent terahertz absorption spectroscopy, the dynamical hydration shell of trehalose extends from the surface to 6.5 ± 0.9 Å [16].

Table 12.1. Comparison of various hydration properties of disaccharides

Sugar       $[\eta]$ (a)   $V_2^0$ (b)    $10^4 K_2^0$ (c)       $C_{p,2}^0$ (b)    $N_{\text{DHN}}$ (d)   $\tau_c^h/\tau_c^0$ (d)
            (cm³ g⁻¹)      (cm³ mol⁻¹)    (cm³ mol⁻¹ bar⁻¹)      (J K⁻¹ mol⁻¹)
Trehalose   2.58           207.61         −30.2                  655                48.3                   7.08
Maltose     2.55           210.07         −23.7                  622                23.8                   4.66
Sucrose     2.45           211.92         −18.5                  648                36.8                   6.81
Lactose     2.5            208.96         −31.1                  657                –                      –

(a) Cited from [11].
(b) Cited from [12].
(c) Cited from [13].
(d) Cited from [15]. In the evaluation of $\tau_c^h/\tau_c^0$, we used the value of $n_h$ obtained from DSC measurements (see Table 12.2).

Table 12.2. Comparison of hydration numbers as determined by various techniques

Sugar       Viscosity measurements (a)   Ultrasound measurements (b)   QENS (c)   DSC (d)
Trehalose   8.0                          15.3                          9.0        8.0
Maltose     7.5                          14.5                          8.4        6.5
Sucrose     6.8                          13.9                          7.5        6.3

(a) Cited from [11].
(b) Cited from [12].
(c) Quasielastic neutron scattering measurements. Cited from [14].
(d) Differential scanning calorimetry measurements. Cited from [15].
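The comparisons drawn from Tables 12.1 and 12.2 can be checked mechanically. The sketch below transcribes only the columns discussed above and ranks the sugars on each; the dictionary layout, key names and the `rank` helper are illustrative choices, not part of the original work.

```python
# Values transcribed from Tables 12.1 and 12.2 (thermodynamic columns only).
TABLE_12_1 = {
    "trehalose": {"eta": 2.58, "V": 207.61, "K": -30.2, "Cp": 655},
    "maltose": {"eta": 2.55, "V": 210.07, "K": -23.7, "Cp": 622},
    "sucrose": {"eta": 2.45, "V": 211.92, "K": -18.5, "Cp": 648},
    "lactose": {"eta": 2.5, "V": 208.96, "K": -31.1, "Cp": 657},
}
TABLE_12_2 = {
    "trehalose": {"viscosity": 8.0, "ultrasound": 15.3, "qens": 9.0, "dsc": 8.0},
    "maltose": {"viscosity": 7.5, "ultrasound": 14.5, "qens": 8.4, "dsc": 6.5},
    "sucrose": {"viscosity": 6.8, "ultrasound": 13.9, "qens": 7.5, "dsc": 6.3},
}


def rank(table, column, descending=True):
    """Return sugar names ordered by the given column."""
    return sorted(table, key=lambda name: table[name][column], reverse=descending)


print("smallest partial molar volume first:", rank(TABLE_12_1, "V", descending=False))
print("most negative compressibility first:", rank(TABLE_12_1, "K", descending=False))
print("largest heat capacity first:        ", rank(TABLE_12_1, "Cp"))
for method in ("viscosity", "ultrasound", "qens", "dsc"):
    print(f"hydration number by {method}:", rank(TABLE_12_2, method))
```

Trehalose ranks first among trehalose, maltose and sucrose on every column, and its hydration number is the largest by all four techniques. Lactose, however, has a slightly more negative compressibility and a slightly larger heat capacity than trehalose in Table 12.1.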

In order to characterize the hydration phenomena in more detail, it is worthwhile to obtain information on the dynamics of water molecules involved in the hydration shell. One of the useful techniques for such a purpose is ¹⁷O-NMR spectroscopy. In the so-called two-state model, ¹⁷O nuclei in the aqueous solution are assumed to be distributed between the following two motional states: the water in the hydration shell and the bulk water. Under this assumption, the analysis of concentration-dependent changes of the spin–lattice relaxation time of the ¹⁷O nucleus gives the following important parameter known as the dynamic hydration number [17]:

$$n_{\text{DHN}} = n_h \left[ K \left( \frac{\tau_c^h}{\tau_c^0} - 1 \right) - 1 \right],$$

where $n_h$ is the hydration number, $K$ is a constant related to the quadrupole coupling constant of the ¹⁷O nucleus, and $\tau_c^h$ and $\tau_c^0$ are the rotational correlation times of hydration water and pure, i.e., bulk water, respectively. We reported the data of $n_{\text{DHN}}$ which revealed a larger retardation of the water dynamics near trehalose relative to several gluco-oligosaccharides [15]. As shown in Table 12.1, $n_{\text{DHN}}$ lies in the order trehalose > sucrose > maltose. In addition, and more importantly, the magnitude of $\tau_c^h/\tau_c^0$, a direct measure of retardation of the water dynamics in the hydration shell, is larger for trehalose than for the other gluco-oligosaccharides studied. Taken together, trehalose has a characteristic hydration property in terms of not only its large hydration number but also the remarkably lowered dynamics of its hydration water.

Branca et al. investigated the aqueous solutions of trehalose, maltose and sucrose using Raman spectroscopy and comparatively analyzed the relative spectral contributions from the O–H vibration in tetrabonded H2O molecules and from that in distorted bonds [18]. Of particular interest is that, with increasing sugar concentration, trehalose disrupts the tetrahedral hydrogen bond network of pure water more strongly than the other sugars do. Similar results have also been obtained from inelastic light scattering measurements [19]. What emerges from these data is that the water structure formed on the sugar surface is incompatible with that of the tetrahedral hydrogen bond network of pure water. In this regard, trehalose is a good water structure breaker.

Finally, it should be noted that the thermodynamic parameters of trehalose are not anomalous compared with those of lactose (Table 12.1). This means that the peculiarity of trehalose in hydration is not necessarily deduced from the macroscopic properties of the solution alone.

12.2.2 Atomic-Level Picture of Hydration of Trehalose

The above experimental data are limited to water dynamics and structure averaged over an inhomogeneous sugar surface in various conformational states, and do not provide atomic-level detail about how hydration depends on the stereochemistry of sugars. Computer simulation is useful for addressing this issue. French’s group made much effort to elucidate the conformational properties of trehalose. Their molecular mechanics and quantum chemical calculations indicated that in vacuo trehalose has only a single energy minimum around the glycosidic bond [20, 21]: the minimum is located at the glycosidic dihedral angles of (φ, ϕ) = (−60°, −60°), corresponding to the gauche conformation. This is also true for the sugar in aqueous solution, as shown in Fig. 12.2a, which gives the population density map for the dihedral angle distributions obtained from an MD simulation. This unique conformation is similar to a clam shell (Fig. 12.1). Such conformational rigidity of trehalose comes from the α,α-1,1 type of glycosidic linkage, which is unique to this sugar among naturally occurring gluco-disaccharides. Indeed, neotrehalose, which has an α,β-1,1 configuration, has more than two stable conformations around the glycosidic linkage (Fig. 12.2b) [20]. Similarly, other types of glycosidic linkage, including (1–4) and (1–6) bonds, among others, allow for multiple conformers. Therefore, the less flexible α,α-1,1-glycosidic linkage may be an important clue to the biological functions of trehalose.

Fig. 12.2. Population density map for the dihedral angle distributions obtained from MD simulations for trehalose (a) and neotrehalose (b) in aqueous solution
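A population density map of the kind shown in Fig. 12.2 is, in essence, a two-dimensional histogram of the two glycosidic dihedral angles sampled along an MD trajectory. The following is a minimal sketch, assuming the angle pairs have already been extracted in degrees; the function name, the 45° bin width and the sample frames are illustrative assumptions and are not taken from the cited studies.

```python
from collections import Counter

def dihedral_population(angle_pairs, bin_width=45.0):
    """Return {(phi_bin_start, psi_bin_start): fraction of frames}.

    angle_pairs: iterable of (phi, psi) dihedral angles in degrees.
    bin_width:   assumed bin size in degrees (should divide 360 evenly).
    """
    counts = Counter()
    n_frames = 0
    for phi, psi in angle_pairs:
        # Wrap into [-180, 180) so that +180 and -180 share one bin.
        phi = (phi + 180.0) % 360.0 - 180.0
        psi = (psi + 180.0) % 360.0 - 180.0
        key = (
            -180.0 + bin_width * int((phi + 180.0) // bin_width),
            -180.0 + bin_width * int((psi + 180.0) // bin_width),
        )
        counts[key] += 1
        n_frames += 1
    return {key: c / n_frames for key, c in counts.items()}

# Made-up frames clustered near (-60, -60), mimicking a single minimum.
frames = [(-62.0, -58.0), (-55.0, -65.0), (-70.0, -61.0), (-59.0, -50.0)]
population = dihedral_population(frames)
```

A sugar with a single conformational minimum concentrates its population in one bin or a few adjacent bins, whereas a flexible linkage spreads it over several separated regions of the map.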

To date, MD simulations of aqueous trehalose have been reported by several groups [22–29]. Our early MD study indicated that trehalose can hydrogen-bond with the surrounding water more extensively than maltose, leading to more restrained translational diffusion of water molecules around trehalose [22]. A recent remarkable increase in computer performance allows for more rapid and accurate MD calculations for various sugars in solution. More recently, Choi et al. performed systematic computational work on a series of disaccharides to obtain atomic-level insight into the unique biochemical role of trehalose compared with other glycosidically linked sugars [29]. In that study, 13 different homodisaccharides with different glycosidic linkages were examined. Analyses of the hydration number and the radial distribution function of solvent water molecules showed that a highly anisotropic hydration shell is formed around this sugar in aqueous solution. As shown in Fig. 12.3, the concave side of the clam shell is fully hydrated, while there are pockets having no first hydration shell on the convex side. In addition, they evaluated the number of long-lived hydrogen bonds, defined as those having a lifetime longer than 20 ps. Trehalose was shown to have an average of 2.8 long-lived hydrogen bonds with water, a much larger number than the average for the other 12 sugars. The stable hydrogen-bond network was thought to derive from the formation of long-lived water bridges at the expense of the dynamics of the water molecules. This reduction of water dynamics by trehalose was also confirmed by the data for the translational diffusion coefficients. These results are consistent with our 17O NMR results described above [15]. According to Choi et al., trehalose is a “dynamic reducer” for solvent water molecules, a property that comes from its anisotropic hydration and conformational rigidity [29].

Fig. 12.3. Distribution of water molecules around trehalose. Cloud-like regions represent iso-probability surfaces of water oxygen atoms
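The counting of long-lived hydrogen bonds (lifetime longer than 20 ps) can be illustrated as follows. This is a minimal sketch, assuming that each sugar–water hydrogen bond has already been reduced to a per-frame present/absent series and that a lifetime is the length of an uninterrupted run; the frame spacing, function names and toy data are assumptions for illustration, not the procedure used by Choi et al.

```python
def bond_lifetimes(present, dt_ps):
    """Lengths (in ps) of uninterrupted runs of True in `present`."""
    lifetimes = []
    run = 0
    for flag in present:
        if flag:
            run += 1
        elif run:
            lifetimes.append(run * dt_ps)
            run = 0
    if run:  # bond still intact at the end of the trajectory
        lifetimes.append(run * dt_ps)
    return lifetimes

def count_long_lived(bond_series, dt_ps, threshold_ps=20.0):
    """Number of bonds with at least one run longer than threshold_ps."""
    return sum(
        1
        for present in bond_series
        if any(t > threshold_ps for t in bond_lifetimes(present, dt_ps))
    )

# Toy data: three bonds sampled every 5 ps (hypothetical frame spacing).
series = [
    [True] * 6 + [False] * 2,            # one 30 ps run: long-lived
    [True, False] * 4,                   # only 5 ps runs
    [False] + [True] * 4 + [False] * 3,  # one 20 ps run: not longer than 20 ps
]
n_long = count_long_lived(series, dt_ps=5.0)
```

Averaging such a count over many trajectory windows would give a number comparable in kind to the 2.8 quoted above, although the exact value depends on the hydrogen-bond criterion and on how brief interruptions are treated.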

Taken together, the findings explained in Sects. 12.2.1 and 12.2.2 show that trehalose is a water structure maker in the sense that it forms a highly anisotropic and immobilized hydration shell around itself. However, this simultaneously means that the tetrahedral hydrogen bond network in water is highly perturbed by the addition of trehalose. In this sense, trehalose is a water structure breaker as well. Such a dual character is the key to understanding the various biological roles of this sugar. Finally, it should again be stressed that this peculiar property of trehalose originates from the presence of its α,α-1,1-linkage.

12.3 Solid-State Property of Trehalose

12.3.1 Polymorphism

In order to elucidate the mechanism by which trehalose enables biological organisms to survive desiccation stress, information about the solid-state properties of this sugar is indispensable. Trehalose crystallizes from aqueous solution as a dihydrate. The two crystalline water molecules are easily activated on heating. Our FTIR study indicated that their bending vibration band undergoes a steep shift from 1,680 to 1,640 cm−1 at around 70°C [30], which implies that they convert from an ice-like structure to a liquid-like one before the crystal melts (90°C). Furthermore, this was supported thermodynamically by our study of the low-temperature heat capacity of the dihydrate [31]. Owing to this labile nature of the crystal water, solid-state trehalose exhibits intriguing polymorphism. So far three different crystal forms, including the dihydrate, have been identified. The dihydrate, usually referred to as Th (or form I), has a rhombic crystalline form [32, 33]. One anhydrous form, referred to as Tβ (or form III), is monoclinic [34–36]. Another anhydrous form, referred to as Tα (or form II), was first identified by Sussich et al. [37], and its properties have been extensively studied by differential scanning calorimetry (DSC), powder X-ray diffractometry [36, 38–40] and FTIR [35, 41, 42]. However, it was not until more recently that we succeeded in an X-ray analysis that revealed its crystalline structure [43].

It is accepted that the dehydration behavior of Th depends on the heating rate [38, 44, 45], the presence or absence of nitrogen gas flow [40] and the particle size [46, 47]. On the other hand, little is known about the effect of vapor pressure (humidity) on the interconversions among the phases, including crystalline and amorphous states, despite the fact that vapor pressure is one of the key thermodynamic parameters influencing the properties of the crystal water. This situation has led to considerable confusion and inconsistency in the interpretation of the dehydration behavior of Th. In order to address this problem, we investigated the de- and rehydration behavior of Th under humidity-controlled atmospheres through simultaneous X-ray and DSC measurements, and through simultaneous thermogravimetry and differential thermal analysis (DTA) [48]. It was revealed that the anhydrous forms resulting from Th dehydration strongly depend on the humidity of the surrounding atmosphere, and that the anhydrous forms produced under different humidity conditions require different partial vapor pressures of water for their rehydration back to Th. Figure 12.4 shows the resulting pathways that link the different solid forms of trehalose. In dry atmospheres, Tα is formed at 105°C on dehydration of Th. It is highly hygroscopic and is readily rehydrated back to Th when exposed to atmospheres of even low humidity, consistent with our previous results from FTIR measurements [35, 42]. In highly humid atmospheres, on the other hand, Th dehydrates by direct transformation (i.e. solid–solid conversion) into a stable anhydrous crystal, Tβ, at temperatures as high as 90°C, although a higher temperature (≈170°C) is needed for the formation of this anhydrous crystal under dry conditions. Tβ is less hygroscopic, and a high partial vapor pressure of water is necessary for its rehydration back to Th. In atmospheres of intermediate humidity, the dehydration of Th leads to the formation of an unidentified state, Tε, whose crystallinity is higher under more humid atmospheres. In addition to the effect of humidity, we recently investigated the effect of atmospheric pressure on the phase transition of Th using an in-house DTA apparatus and obtained valuable information for the definitive assignment of the endothermic peaks due to melting or dehydration of Th [49].

Fig. 12.4. Phase and state transitions of trehalose

Recently, using positron annihilation lifetime spectroscopy, Kilburn et al. observed that in the dihydrate Th, water is organized as a confined one-dimensional fluid in channels of fixed diameter that allow activated diffusion of water into and out of the crystallites [5]. They presented direct real-time evidence of water molecules unloading reversibly from these channels, which thereby act as both a sink and a source of water in low-moisture systems. They postulated that this behavior may provide the overall stability required to keep organisms viable under dehydration conditions. The empty and water-filled channels may correspond to the crystal structures of the anhydrous form Tα and the dihydrate Th, respectively [50]. Therefore, among the interconversion processes shown in Fig. 12.4, the formation of Tα and its reversible conversion to Th may be particularly important for a better understanding of the protective action of trehalose.

To obtain more detailed insight into the biological role of Tα, we recently determined its crystal structure [43]. The features of Tα are summarized as follows. The trehalose molecule in Tα has approximate C2 symmetry, as does that in Th and Tβ. The molecular arrangement in Tα is very similar to that in Th, and there are hydrogen bonds preserved in both. One of the most important findings is that there are two different holes, hole-1 and hole-2, along one crystal axis. Hole-1 is constructed by trehalose molecules with a screw diad at its center, while hole-2 has a smaller diameter and no symmetry operator (Fig. 12.5). Owing to the screw axis at the center of hole-1, hollows are present at the side of the hole with diameters roughly equal to that of hole-1. Hole-1 and the side pockets followed by hollows correspond to the positions of the two water molecules of the dihydrate. Therefore, hole-1 is considered to be a one-dimensional water channel with side pockets. Additionally, molecular and crystal energy calculations demonstrated that the intermolecular interactions between trehalose molecules in Tα are weaker than those in Tβ, which accounts for the more rapid water uptake into the Tα crystal.

12.3.2 Glassy State of Trehalose

Table 12.3 lists the glass transition temperatures for all of the naturally occurring gluco-disaccharides, i.e., disaccharides composed of two glucose units, and for sucrose. For trehalose, the value of 115 ± 2°C is currently accepted as the exact Tg of anhydrous trehalose [51–55], although Tg values ranging from 73°C [56] to 116.9°C [52] have been reported so far. The Tg of trehalose is highest among the gluco-disaccharides, although the value is not special, or at least not anomalous. In addition, trehalose has another noteworthy glass-forming property that favors its function as a biological protective agent for long-term storage in dry states. Kawai et al. reported the activation energy of enthalpy relaxation, ΔErelax, for trehalose, maltose and sucrose [55]. ΔErelax is thought to be the activation energy of the translational diffusion of the molecules forming the glass of interest, and is thus a direct measure of the chemical and physical stability of the vitrified matrix. According to their results, the ΔErelax of trehalose is larger than those of maltose and sucrose by >150 kJ mol−1. Recently, we extended a similar study to all the naturally occurring gluco-disaccharides and indicated that trehalose has the highest Tg and the largest ΔErelax value (Table 12.3) [57]. These results indicate that trehalose is more easily vitrified than other well-known disaccharides and that its glassy state is more stable than that of the others.

Fig. 12.5. Crystal structures of (a) Tα and (b) Th along the a-axis. Trehalose molecules are drawn as a space-filling model with a partial wireframe model. There are two different holes in Tα: hole 1 and hole 2. The diameter of each circle is 2.1 Å. In Th, these holes are occupied by two crystal water molecules

Table 12.3. Glass transition temperatures Tg and the activation energies ΔErelax of enthalpy relaxation of dry amorphous disaccharides (a)

Sugar           Tg (°C)       ΔErelax (kJ mol−1)
Trehalose       116 (113)     401.0 (360.8)
Neotrehalose    105           223.4
Kojibiose       118.3         273.1
Sophorose       88.6          283.3
Nigerose        81.1          270.5
Laminaribiose   106.7         314.1
Maltose         84.5 (90)     292.4 (245.4)
Isomaltose      89.5          279.9
Cellobiose      100.2         307.4
Gentiobiose     94.4          284.1
Sucrose         (68)          (212.2)

(a) The data in parentheses were cited from [55]. The other data were cited from [57].

Generally, water is a good plasticizer for glassy matrices: with an increase in water content, the glass transition temperature is lowered, which is unfavorable for preserving biomaterials in the dry state. Aldous et al. focused on the ability of a given sugar to form crystalline hydrates from the anhydrous amorphous state. It was found that trehalose can crystallize as hydrous forms from the amorphous state, leading to a decrease in the residual water content of the remaining amorphous matrix [58]. As a result, the glass transition temperature Tg becomes higher, or at least the Tg depression caused by plasticization through water uptake can be more or less avoided [51, 58]. In a similar way, the coexistence of Tα crystals as a sink for water is useful for reducing the risk of plasticization. Indeed, Nagase et al. reported that if a mixture of Tα and amorphous trehalose is exposed to moisture, water is absorbed more rapidly through the transformation from Tα to Th than through absorption by amorphous trehalose [36].
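The lowering of Tg by water is commonly estimated with the Gordon–Taylor equation for a binary mixture, Tg = (w1·Tg1 + k·w2·Tg2)/(w1 + k·w2), where w1 and w2 are weight fractions. The following is a minimal sketch, assuming that this model applies to the trehalose–water system. The Tg of anhydrous trehalose (115°C) is the value quoted in the text, whereas the Tg of amorphous water (taken here as 136 K) and the constant k are assumed, illustrative values and are not taken from the studies cited in this chapter.

```python
def gordon_taylor_tg(w_water, tg_sugar_k, tg_water_k, k):
    """Glass transition temperature (K) of a sugar-water mixture.

    w_water: weight fraction of water (0..1); the sugar fraction is 1 - w_water.
    k:       empirical Gordon-Taylor constant (assumed, system-dependent).
    """
    if not 0.0 <= w_water <= 1.0:
        raise ValueError("w_water must be a weight fraction between 0 and 1")
    w_sugar = 1.0 - w_water
    return (w_sugar * tg_sugar_k + k * w_water * tg_water_k) / (w_sugar + k * w_water)

TG_TREHALOSE_K = 115.0 + 273.15  # anhydrous trehalose, value quoted in the text
TG_WATER_K = 136.0               # assumed value for amorphous water
K_ASSUMED = 5.0                  # hypothetical constant, for illustration only

for w in (0.0, 0.03, 0.10):
    tg_c = gordon_taylor_tg(w, TG_TREHALOSE_K, TG_WATER_K, K_ASSUMED) - 273.15
    print(f"water {w:.0%}: Tg ~ {tg_c:.0f} C")
```

Because k multiplies the water term, even a few weight percent of water lowers the predicted Tg appreciably, which is why removing water from the amorphous matrix into a crystalline hydrate helps to keep the glass stable.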

12.4 Biological Roles of Trehalose

12.4.1 Possible Mechanisms of Anhydrobiosis

It has been widely accepted that trehalose acts as a stabilizer that protects biomolecules against water stresses such as desiccation, freezing and osmotic pressure [1, 2, 4–9]. Among them, the protective mechanism against desiccation stress has been extensively investigated [1, 59, 60], and three main mechanisms have been proposed so far [60]. The vitrification hypothesis suggests that the mobility of cellular components caged by sugar glasses is severely restricted and that they can thus escape destruction [59, 60]. The water replacement hypothesis suggests that sugars can replace water molecules by forming hydrogen bonds with polar residues of lipid and/or protein molecules, thereby stabilizing their structures in the absence of water [59, 60]. The water entrapment hypothesis suggests that sugars concentrate water near the surfaces of membranes and proteins, thus preserving them from destruction [61–63]. Currently, it is thought that these three mechanisms are not mutually exclusive [59, 60]. For instance, vitrification may occur simultaneously with direct interactions between the sugar and the polar residues of biomolecules.

As shown in Table 12.3, dry trehalose vitrifies at a higher temperature than do other disaccharides, and the resultant glassy matrix is highly stable in the sense that enthalpy relaxation occurs with more difficulty than in other disaccharides. Thus, trehalose is one of the sugars with which the vitrification mechanism would work more efficiently. As shown in Table 12.2, trehalose has a larger hydration number and can therefore offer a larger number of hydrogen bonding sites to a biomolecule in place of water. Thus, trehalose is also one of the sugars appropriate for the water replacement mechanism. Its high hydration ability (a larger hydration number and a larger retardation of the water dynamics) is, of course, a great advantage for the water entrapment mechanism as well.

In the past decades, the above three mechanisms have been demonstrated by various in vitro experiments [59, 60] and computer simulations [62–69]. Among them, a recent model study by Albertorio et al. should be noted because it pointed to the importance of the α,α-(1-1) linkage of trehalose for preserving the membrane structure [70]. They found that disaccharide molecules containing an α,α-(1-1) linkage are more effective than other disaccharides at retaining the bilayer structure in the absence of water. They inferred that the specific arrangement of the hydroxyl groups in α,α-trehalose may optimize the hydrogen-bonding arrangement for water replacement, because somewhat less protective behavior was afforded by α,α-galactotrehalose, which differs from α,α-trehalose only by the epimerization of the two 4-hydroxyl groups from the equatorial to the axial position.

The vitrification mechanism has been demonstrated well for anhydrobiotic plants, although in these cases the vitrified sugar is not trehalose but sucrose, probably mixed with proteins [60]. Our previous studies using lyophilized yeast cells provided results that could be reasonably interpreted by the water replacement mechanism [71] or the water entrapment mechanism [72]. However, not enough direct evidence has accumulated to show that these mechanisms work in vivo. In order to obtain rigorous evidence for the functional mechanism of trehalose in vivo, we recently performed a study using the larvae of the sleeping chironomid, Polypedilum vanderplanki, as described below [73].

12.4.2 Strategy for Desiccation Tolerance in the Sleeping Chironomid

P. vanderplanki is the most complex and largest multicellular animal capable of anhydrobiosis [74, 75]. The larvae dwell in temporary rock pools in semiarid regions of Africa. The small, shallow pools occasionally dry up, so that the larvae become severely desiccated, but they are able to recover after rehydration when the next rain arrives. P. vanderplanki can repeat the process of dehydration/rehydration several times as long as it remains in the larval stage. According to one report, larvae of P. vanderplanki can recover from desiccation lasting up to 17 years [76]. Watanabe et al. succeeded in inducing P. vanderplanki larvae to enter anhydrobiosis under laboratory conditions and found that high levels of trehalose (about 20% of the dry body mass) are synthesized in the dehydrated larvae [3, 77].


We focused our attention on seeking evidence for the vitrification and water replacement mechanisms in P. vanderplanki [73]. For this purpose, two kinds of dehydrated larvae with very different trehalose contents were prepared by regulating the dehydration rate. That is, larvae accumulating a large amount of trehalose (36 μg per individual) were obtained by slow dehydration over 72 h, while those with comparatively little trehalose (2 μg per individual) were obtained by quick dehydration within several hours. No apparent difference was found between the two preparations in the contents of total protein, triacylglycerol and water (≈3 wt% per dry individual). Then, the trehalose distribution in the larval body was visualized by FTIR imaging spectroscopy. Trehalose is known to exhibit a unique vibration band at 992 cm−1 [35], which is assigned to its α,α-1,1 linkage. Indeed, a clear shoulder peak was observed at this position for a slowly dehydrated larva, whereas the corresponding peak was not detected for a quickly dehydrated one. The intensity distribution of this peak for the slowly dehydrated larva is shown in Fig. 12.6, where the peak intensity at 992 cm−1 is normalized with respect to that of the amide II band. This clearly indicates that trehalose is almost uniformly distributed through the larval body, at least at this level of resolution.

The physical state of the trehalose accumulated in the larval body was examined using DSC. The resulting thermogram for the slowly dehydrated larvae exhibited a clear step-wise baseline shift (Fig. 12.7a), indicating the occurrence of a glass transition. The onset, middle and end temperatures were 62°C, 65°C, and 71°C, respectively, meaning that the sample was in the glassy state at temperatures <62°C and in the rubbery state at >71°C. In contrast, neither a baseline shift nor a peak was observed for the quickly dehydrated sample. We then compared the viability of the slowly and quickly dehydrated larvae by determining the recovery rate after rehydration following exposure to different temperatures for 5 min or 1 h (Fig. 12.7b). For the slowly dehydrated sample, a high recovery rate of 60–90% was observed up to 50°C exposure, and longer exposures tended to cause a slightly lower survival rate. Exposure to higher temperatures gradually decreased the recovery rate, and no survival occurred beyond ca. 100°C. The quickly dehydrated larvae never recovered, regardless of the temperatures employed. Interestingly, the glass transition curve for the slow sample correlates closely with the corresponding recovery rate. This result clearly indicates that trehalose acts as a protectant only when it is in the glassy state; in other words, vitrification of trehalose is a prerequisite for keeping the anhydrobiotic state stable in P. vanderplanki.

Fig. 12.6. Optical (a) and FTIR (b) imaging data for a slowly dehydrated larva. Mapped are the intensities of the characteristic 992 cm−1 peak, normalized by division by that of the amide II band

Fig. 12.7. Glass in anhydrobiotic larvae and their recovery after heat treatments. (a) DSC thermograms for slowly and quickly dehydrated larvae. (b) Dependence of the recovery rate after rehydration on exposure to high temperatures in slowly (filled symbols) and quickly (open symbols) dehydrated larvae. Circles and triangles show recovery after exposure to high temperature for 5 min and 1 h, respectively. Data from [73]

Fig. 12.8. (a) FTIR spectra in the region of 1,280–1,200 cm−1, which shows the asymmetric stretching vibration of P=O groups. (b) Temperature dependence of FTIR bands in the region 2,849–2,856 cm−1, which shows the symmetric CH2 stretching vibration. Data from [73]

Evidence for the water replacement mechanism was obtained from measurements of the P=O asymmetric stretching vibration appearing at 1,280–1,200 cm−1, which sensitively reflects the hydrogen bonding interactions of the head groups of phospholipids with other molecules. As shown in Fig. 12.8a, the peak position of this band was slightly lower in slowly than in quickly dehydrated larvae. This suggests that in the former sample hydrogen bonds are formed between the polar head groups of phospholipids and, probably, trehalose, although compounds other than phospholipids, such as DNA and RNA, could also contribute to such a peak shift. Indeed, our previous report indicated that the P=O stretching vibration of dry DNA is perturbed by the addition of trehalose [78]. To further assess whether the above shift is related to a change in the physical state of phospholipids, we focused on the symmetric CH2 stretching vibration of fatty acid chains. It was found that this peak shifted from 2,850 to 2,854 cm−1 with increasing temperature (Fig. 12.8b) and, interestingly, that the gel-to-liquid crystalline transition temperature, defined as the midpoint of the transition curve, is significantly lowered in the slowly dehydrated larvae. It should be noted that cellular membranes in this sample are in the liquid crystalline state at room temperature in spite of the absence of water. Thus an unfavorable phase transition is avoided during the subsequent rehydration process, a key factor that allows cellular membranes to recover successfully from desiccation. Combining the observations for both vibration peaks, it is reasonable to conclude that trehalose perturbs the head groups of phospholipids through direct hydrogen bonding interactions, which allows the membrane to be kept in the liquid crystalline state even in a highly dehydrated environment.
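The midpoint definition of the gel-to-liquid crystalline transition temperature can be made concrete. The following is a minimal sketch, assuming that the CH2 band position has been measured at increasing temperatures and rises through the transition between a low-temperature and a high-temperature plateau; the function name and the data points are invented for illustration and are not the measurements reported in [73].

```python
def transition_midpoint(temps_c, wavenumbers):
    """Temperature at which the band position is halfway between its
    first (low-temperature) and last (high-temperature) values,
    found by linear interpolation between neighbouring points."""
    if len(temps_c) != len(wavenumbers) or len(temps_c) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    half = (wavenumbers[0] + wavenumbers[-1]) / 2.0
    points = list(zip(temps_c, wavenumbers))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 <= half <= v1 and v1 != v0:
            return t0 + (half - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("band position never crosses the half-way value")

# Invented curve spanning the 2,850-2,854 cm^-1 shift mentioned in the text.
temps = [0.0, 10.0, 20.0, 30.0, 40.0]
band = [2850.0, 2850.5, 2852.5, 2853.8, 2854.0]
t_m = transition_midpoint(temps, band)
```

Comparing the midpoint obtained in this way for slowly and quickly dehydrated samples is one simple way to express the lowering of the transition temperature as a single number.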

Taken together, our results indicate that the vitrification and water replacement mechanisms are both involved in anhydrobiosis in P. vanderplanki, and that trehalose is a major player in this intriguing biological phenomenon. The successful anhydrobiotic larva is just like a substance assembled mainly from biological organic molecules, with the spatial arrangements required for normal physiology largely maintained by immobilization in the biological glasses. The larvae of P. vanderplanki can reversibly convert from the living state to such an amorphous solid state by replacing the normal intracellular medium with trehalose to enter anhydrobiosis, and vice versa.

Finally, it should be pointed out that factors other than trehalose may be involved when the vitreous state is formed in the body of P. vanderplanki. This is partly because the glass transition temperatures of the slowly dehydrated larvae shifted less with an increase in water content than expected from theoretical values calculated for a binary mixture of pure trehalose and water. For plant anhydrobiotes, it has been reported that proteins as well as soluble sugars may be vitrified in the cytoplasmic glass [79, 80]. Our results do not exclude such a possibility. Late embryogenesis abundant (LEA) proteins are known to occur in various anhydrobiotic organisms [81] and have been suggested to reinforce biological glasses [82]. Recently, LEA-like proteins were also found in P. vanderplanki [83]. Therefore, further studies are required for a complete understanding of desiccation tolerance in P. vanderplanki.

12.4.3 Other Biological Roles of Trehalose

Protein stabilization by trehalose in aqueous solution is another example of the biological functions of trehalose [84–86]. According to reports by Timasheff and coworkers, preferential hydration should occur when the interaction of a cosolvent with water is stronger than its interaction with a protein [85]. In other words, a water structure maker like trehalose is a good cosolvent for causing preferential hydration. The preferential hydration effect should lead to a loss in the entropy of solvation upon protein denaturation, rendering the unfolded state even more unstable and shifting the equilibrium in favor of the native state. In principle, the preferential hydration model should apply not only to proteins but also to other biological components such as membranes. Indeed, our early 31P NMR study indicated that trehalose stabilizes hydrated unilamellar liposomes by increasing the packing density among the constituent phospholipid molecules, leading to inhibition of liposome fusion [87].

According to the preferential hydration model, trehalose is expected to promote the aggregation of unfolded proteins, because the aggregated state should have a smaller protein–solvent interface than the state in which the proteins are dissolved in isolation. However, in contradiction to this expectation, trehalose has been shown to suppress the aggregation of proteins associated with Huntington’s and Alzheimer’s diseases. Tanaka et al. reported that trehalose could be used to inhibit the aggregation of polyglutamine in vivo in a mouse model for Huntington’s disease [88], while an in vitro study by Liu et al. indicated that this sugar effectively inhibits the aggregation and neurotoxicity of β-amyloid (Aβ) 40 and 42 [89]. A similar inhibitory effect on protein aggregation was also observed in yeast cells during heat shock [90]. Although the underlying mechanism of these phenomena is far from fully understood at present, a key to resolving this issue may lie in another peculiar property of trehalose, namely a specific interaction with hydrophobic compounds, as described below.

In addition to its protective function against water stresses, there is growing evidence that trehalose is capable of protecting biological molecules against oxidative damage [7–9]. In particular, we have extensively studied its antioxidant function on unsaturated fatty acids (UFA) from both experimental and theoretical viewpoints [91, 92]. The autoxidation of UFA is initiated by a reaction in which activated oxygen or free radicals abstract hydrogen atoms from the allyl group of UFA as follows:

–CH2–CH=CH–CH2–CH=CH–CH2– → –CH2–CH=CH–•CH–CH=CH–CH2–

We indicated that trehalose suppresses this reaction, while other disaccharides, such as sucrose, maltose and neotrehalose, showed a negligible effect [8]. According to detailed NMR analyses, trehalose interacts specifically with UFA possessing cis-type C=C double bond(s), such as LA (18:2, cis), with a 1:1 stoichiometry. A theoretical model for the trehalose–cis C=C bond complex is shown in Fig. 12.9, where the OH-6′ of trehalose interacts with the π-orbital at the mid position of the double bond and simultaneously the OH-3 forms a C–H···O type of hydrogen bond at a terminal of the double bond. The complex formation energy (stabilization energy) was estimated to be 5.52 and 7.78 kcal mol−1 from quantum chemical calculations at the HF/6-31G** and B3LYP/6-31G** levels of theory, respectively. On complex formation, the activation energy of the above hydrogen abstraction reaction was shown to be greatly increased: in the isolated state, 14.8 kcal mol−1 (UHF/6-31G**) and 9.2 kcal mol−1 (UB3LYP/6-31G**), while in the complexed state, 37.8 kcal mol−1 (UHF) and 38.6 kcal mol−1 (UB3LYP). These results indicate that the multiple OH···π and CH···O hydrogen bonds with trehalose significantly modify the electronic structure of the diene moiety, leading to kinetic suppression of the hydrogen abstraction reaction.

Fig. 12.9. Optimized structures of the trehalose/2-butene complex obtained from the HF/6-31G** calculation
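The kinetic consequence of the higher barrier for hydrogen abstraction can be gauged with the Arrhenius relation, k ∝ exp(−Ea/RT). The following is a rough sketch, assuming that the pre-exponential factor is unchanged by complexation and taking T = 298.15 K; both are assumptions made for illustration, since the text reports only the activation energies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def rate_ratio(ea_isolated, ea_complexed, temp_k=298.15):
    """k_complexed / k_isolated for equal pre-exponential factors.

    Energies in kcal/mol; a ratio below 1 means the reaction is slowed.
    """
    return math.exp(-(ea_complexed - ea_isolated) / (R_KCAL * temp_k))

# Activation energies quoted in the text (kcal/mol): (isolated, complexed).
barriers = {
    "UHF/6-31G**": (14.8, 37.8),
    "UB3LYP/6-31G**": (9.2, 38.6),
}
ratios = {level: rate_ratio(iso, cpx) for level, (iso, cpx) in barriers.items()}
for level, ratio in ratios.items():
    print(f"{level}: k_complexed/k_isolated ~ {ratio:.1e}")
```

Under these assumptions, a barrier increase of more than 20 kcal mol−1 corresponds to a rate reduction of many orders of magnitude at room temperature, which is the sense in which the abstraction reaction is kinetically suppressed.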

The above finding of complexation between trehalose and a cis double bond leads us to expect that this sugar can also interact with benzene and its derivatives, because their double bonds are cis-like. In fact, our preliminary study using NMR and molecular dynamics simulation indicated that in aqueous solution a benzene molecule binds to trehalose in such a manner that the dehydration penalty is minimized [93]. Specifically, it binds to the convex side of trehalose, where there are less hydrated regions, as can be seen from Fig. 12.3. This peculiar interaction may account for the suppressive effect of this sugar on peptide or protein aggregation described above. Namely, there is a possibility that trehalose binds to aromatic side chains that are exposed to the aqueous phase upon unfolding and consequently acts as a spacer that inhibits direct contact between unfolded protein molecules. This interesting issue is now under investigation in our laboratory.

12.5 Conclusion

The physicochemical uniqueness of trehalose originates from the presence of an α,α-1,1-linkage, which brings about a rigid conformation with a clam shell-like shape. Because of this conformational rigidity, trehalose has a unique hydration characteristic: a spatially anisotropic but dynamically stable hydration shell. This in turn brings about several characteristic thermodynamic properties of its aqueous solution. Trehalose has a dual character as both a good water structure maker and a breaker. Additionally, trehalose has characteristic solid-state properties. In particular, its glassy state, with not only a high Tg but also a high ΔErelax, makes trehalose a better desiccation protectant than other saccharides. Furthermore, the actual glassy matrix of trehalose may be partially protected from devitrification through coexistence with the anhydrous Tα crystal, which acts as a sink for water. As revealed for P. vanderplanki, anhydrobiotes successfully maintain their shelf lives by making good use of these characteristic features of trehalose, especially through the vitrification and water replacement mechanisms. The view described here should bring about a significant advance in understanding the limitations and further possibilities of this sugar in various applications and in undertaking the molecular design of more effective protectants in the future.

With progress in the understanding of the fundamental aspects of trehalose, much effort has been made to confer desiccation tolerance on non-anhydrobiotic organisms by introducing trehalose into target cells. Although human platelets were recently freeze-dried successfully with trehalose [94], a major obstacle to application is that cellular membranes are usually impermeable to trehalose. Several attempts to introduce trehalose into target cells have been made and have met with a certain degree of success. For example, introduction of bacterial trehalose biosynthetic enzyme genes into human fibroblasts increases the intracellular trehalose concentration and results in enhanced desiccation tolerance [95]. In another approach, engineered switchable pores or extracellular nucleotide-gated channels (engineered α-hemolysin or the P2X7 purinergic receptor pore) were created in cellular membranes to allow trehalose uptake [96]. Most recently, Kikawada et al. isolated a novel trehalose transporter (TRET1) from P. vanderplanki [97]. The transport activity of TRET1 was stereochemically specific for trehalose, and the direction of transport is reversible depending on the concentration gradient of trehalose. By combining the knowledge obtained from the study of P. vanderplanki with these new techniques, it is expected that long-term storage will become possible for a variety of cells, tissues and even organs in a dry state.

Acknowledgments

This work was supported in part by the Program for Promotion of Basic Research Activities for Innovative Biosciences (PROBRAIN) and also in part by Grants-in-Aid for Scientific Research on Priority Areas (no. 16041212 and 18031012) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan.


References

1. J.H. Crowe, F.A. Hoekstra, L. Crowe, Annu. Rev. Physiol. 54, 579 (1992)
2. J.S. Clegg, Comp. Biochem. Physiol. B 128, 613 (2001)
3. M. Watanabe, T. Kikawada, N. Minagawa, F. Yukuhiro, T. Okuda, J. Exp. Biol. 205, 2799 (2002)
4. R.A. Ring, H.V. Danks, Cryo Lett. 19, 275 (1998)
5. P.O. Montiel, Cryo Lett. 21, 83 (2000)
6. A.V. Laere, FEMS Microbiol. Rev. 63, 201 (1988)
7. N. Benaroudj, D.H. Lee, L.A. Goldberg, J. Biol. Chem. 276, 24261 (2001)
8. K. Oku, M. Kurose, M. Kubota, S. Fukuda, M. Kurimoto, Y. Tujisaka, M. Sakurai, Nippon Shokuhin Kagaku Kougaku Kaishi (in Japanese) 50, 133 (2003)
9. R.S. Herderio, M.D. Pereira, A.D. Panek, E.C.A. Eleutherio, Biochem. Biophys. Acta 1760, 340 (2006)
10. M. Sola-Penna, J.R. Meyer-Fernandes, Arch. Biochem. Biophys. 360, 10 (1998)
11. M-O. Portmann, G. Birch, J. Sci. Food Agric. 69, 275 (1995)
12. P.K. Banipal, T.S. Banipal, B.S. Lark, J.C. Ahluwalia, J. Chem. Soc. Faraday Trans. 93, 81 (1997)
13. S.A. Galema, H. Høiland, J. Phys. Chem. 95, 5321 (1991)
14. S. Magazu, V. Villiari, P. Migliardo, G. Maisano, M.T.F. Telling, J. Phys. Chem. B 105, 1851 (2001)
15. H. Kawai, M. Sakurai, Y. Inoue, R. Chujo, S. Kobayashi, Cryobiology 29, 599 (1992)
16. M. Heyden, E. Brundermann, U. Heugen, G. Niehues, D.M. Leitner, M. Havenith, J. Am. Chem. Soc. 130, 5773 (2008)
17. H. Uedaira, M. Ikura, H. Uedaira, Bull. Chem. Soc. Jpn. 62, 1 (1989)
18. C. Branca, S. Magazu, G. Maisano, P. Migliardo, J. Chem. Phys. 111, 281 (1999)
19. C. Branca, S. Magazu, G. Maisano, S.M. Bennington, B. Fak, J. Phys. Chem. 107, 1444 (2003)
20. M.K. Dowd, P.J. Reilly, A.D. French, J. Comp. Chem. 13, 102 (1992)
21. A.D. French, G.P. Johnson, A-M. Keltere, M.K. Dowd, C.J. Cramer, J. Phys. Chem. A 106, 4988 (2002)
22. M. Sakurai, M. Murata, Y. Inoue, A. Hino, S. Kobayashi, Bull. Chem. Soc. Jpn. 70, 847 (1997)
23. Q. Liu, R.K. Schmit, B. Teo, P.A. Karplus, J.W. Brady, J. Am. Chem. Soc. 119, 7851 (1997)
24. G. Bonanno, R. Noto, S.L. Fornili, J. Chem. Soc. Faraday Trans. 94, 2755 (1998)
25. P.B. Conrad, J.J. de Pablo, J. Phys. Chem. A 103, 4049 (1999)
26. S.B. Engelsen, S. Perez, J. Phys. Chem. B 104, 9301 (2000)
27. A. Lerbret, P. Bordat, F. Affouard, Y. Guinet, A. Hedoux, L. Paccou, D. Prevost, M. Descamps, Carbohydr. Res. 340, 881 (2005)
28. A. Lerbret, P. Bordat, F. Affouard, M. Descamps, F. Migliardo, J. Phys. Chem. B 109, 11046 (2005)
29. Y. Choi, K.W. Cho, K. Jeong, S. Jung, Carbohydr. Res. 341, 1020 (2006)
30. K. Akao, Y. Okubo, T. Ikeda, Y. Inoue, M. Sakurai, Chem. Lett. 8, 759 (1998)
31. T. Furuki, R. Abe, H. Kawaji, T. Atake, M. Sakurai, J. Chem. Thermodyn. 38, 1612 (2006)


32. G.M. Brown, D.C. Rohrer, B. Berking, C.A. Beevers, R.G. Gould, R. Simpson, Acta Crystallogr. B 28, 3145 (1972)
33. T. Taga, M. Senma, K. Osaki, Acta Crystallogr. B 28, 3258 (1972)
34. G.A. Jeffrey, R. Nanni, Carbohydr. Res. 137, 21 (1985)
35. K. Akao, Y. Okubo, N. Asakawa, Y. Inoue, M. Sakurai, Carbohydr. Res. 334, 233 (2001)
36. H. Nagase, T. Endo, H. Ueda, M. Nakagaki, Carbohydr. Res. 337, 167 (2002)
37. F. Sussich, R. Urbani, A. Cesaro, F. Princivalle, S. Bruckner, Carbohydr. Lett. 2, 403 (1997)
38. F. Sussich, R. Urbani, F. Princivalle, A. Cesaro, J. Am. Chem. Soc. 120, 7893 (1998)
39. F. Sussich, C. Skopec, J. Brady, A. Cesaro, Carbohydr. Res. 334, 165 (2001)
40. H. Nagase, T. Endo, H. Ueda, T. Nagai, STP Pharm. Sci. 13, 269 (2003)
41. A.M. Gil, P.S. Belton, V. Felix, Spectrochim. Acta 52, 1649 (1996)
42. K. Akao, Y. Okubo, Y. Inoue, M. Sakurai, Carbohydr. Res. 337, 1729 (2002)
43. H. Nagase, N. Ogawa, T. Endo, M. Shiro, H. Ueda, M. Sakurai, J. Phys. Chem. B 112, 9105 (2008)
44. F. Sussich, A. Cesaro, J. Therm. Anal. Calorim. 62, 757 (2000)
45. J.F. Willart, A. De Gusseme, S. Hemon, M. Descamps, F. Leveiller, A. Rameau, J. Phys. Chem. B 106, 3365 (2002)
46. L.S. Taylor, P. York, J. Pharm. Sci. 87, 347 (1998)
47. L.S. Taylor, A.C. Williams, P. York, Pharm. Res. 15, 1207 (1998)
48. T. Furuki, A. Kishi, M. Sakurai, Carbohydr. Res. 340, 429 (2005)
49. T. Furuki, R. Abe, H. Kawaji, T. Atake, M. Sakurai, J. Therm. Anal. Calorim. 91, 561–567 (2008)
50. D. Kilburn, S. Townrow, V. Meunier, R. Richardson, A. Alam, J. Ubbink, Nat. Mater. 5, 632 (2006)
51. L.M. Crowe, D.S. Reid, J.H. Crowe, Biophys. J. 71, 2087 (1996)
52. D.P. Miller, J.J. de Pablo, J. Phys. Chem. B 104, 8876 (2000)
53. T. Chen, A. Fowler, M. Toner, Cryobiology 40, 277 (2000)
54. R. Surama, A. Pyne, R. Suryanarayanan, Pharm. Res. 21, 867 (2004)
55. K. Kawai, T. Hagiwara, R. Takai, T. Suzuki, Pharm. Res. 22, 490 (2005)
56. J.L. Green, C.A. Angell, J. Phys. Chem. 93, 2880 (1989)
57. K. Oku, M. Kubota, S. Fukuda, M. Kurimoto, Y. Tujisaka, M. Sakurai, Cryobiol. Cryotechnol. 50, 97 (2004)
58. B.J. Aldous, A.D. Affret, F. Franks, Cryo Lett. 16, 181 (1996)
59. J.H. Crowe, J.F. Carpenter, L.M. Crowe, Annu. Rev. Physiol. 60, 73 (1998)
60. J.H. Crowe, in Molecular Aspects of the Stress Response: Chaperones, Membranes and Networks, ed. by P. Csermely, L. Vígh (Landes Bioscience and Springer, New York, 2007), Chapter 13
61. P.S. Belton, A.H. Gil, Biopolymers 34, 957 (1994)
62. G. Cottone, G. Gicotti, L. Gordone, J. Cell. Phys. 117, 9862 (2002)
63. R.D. Lins, C.S. Pereira, P.H. Hunenberger, Proteins 55, 177 (2004)
64. A.K. Sum, R. Faller, J.J. de Pablo, Biophys. J. 85, 2830 (2003)
65. M.A. Villarreal, S.B. Díaz, E.A. Disalvo, G.G. Montich, Langmuir 20, 7844 (2004)
66. C.S. Pereira, R.D. Lins, I. Chandrasekhar, L.C.G. Freitas, P.H. Hunenberger, Biophys. J. 86, 2273 (2004)
67. A. Skibinsky, R.M. Venable, R.W. Pastor, Biophys. J. 89, 4111 (2005)


68. C.S. Pereira, P.P.H. Hunenberger, J. Phys. Chem. B 110, 15572 (2006)
69. L. Lerbret, F. Affouard, P. Bordat, A. Hedoux, Y. Guinet, M. Descamps, Chem. Phys. 345, 267 (2008)
70. F. Albertorio, V.A. Chapa, X. Chen, A.J. Diaz, P.S. Cremer, J. Am. Chem. Soc. 129, 10567 (2007)
71. F. Sano, N. Asakawa, Y. Inoue, M. Sakurai, Cryobiology 39, 80 (1999)
72. M. Sakurai, H. Kawai, Y. Inoue, A. Hino, S. Kobayashi, Bull. Chem. Soc. Jpn. 68, 3621 (1995)
73. M. Sakurai, T. Furuki, K. Akao, D. Tanaka, Y. Nakahara, T. Kikawada, M. Watanabe, T. Okuda, Proc. Natl. Acad. Sci. USA 105, 5093 (2008)
74. H.E. Hinton, J. Insect Physiol. 5, 286 (1960)
75. H.E. Hinton, Nature 188, 336 (1960)
76. S. Adams, Antenna 8, 58 (1985)
77. M. Watanabe, M. Kikawada, T. Okuda, J. Exp. Biol. 206, 2281 (2003)
78. B. Zhu, T. Furuki, T. Okuda, M. Sakurai, J. Phys. Chem. B 111, 5542 (2007)
79. W.Q. Sun, A. Leopold, Comp. Biochem. Physiol. A 117, 327 (1997)
80. J. Buitink, O. Leprince, Cryobiology 48, 215 (2004)
81. A. Tunnacliffe, M.J. Wise, Naturwissenschaften 114, 741 (2007)
82. W.F. Wolkers, S. McCready, W. Brandt, G.G. Lindsey, F.A. Hoekstra, Biochim. Biophys. Acta 1544, 196 (2001)
83. T. Kikawada, Y. Nakahara, Y. Kanamori, K. Iwata, M. Watanabe, B. McGee, A. Tunnacliffe, T. Okuda, Biochem. Biophys. Res. Commun. 348, 56 (2006)
84. T.-Y. Lin, S.N. Timasheff, Protein Sci. 5, 372 (1996)
85. G. Xie, S.N. Timasheff, Biophys. Chem. 64, 25 (1997)
86. J.K. Kaushik, R. Bhat, Proc. Natl. Acad. Sci. USA 278, 26458 (2003)
87. T. Nishiwaki, M. Sakurai, Y. Inoue, R. Chujo, S. Kobayashi, Chem. Lett. 19, 1841 (1990)
88. M. Tanaka, Y. Machida, S. Niu, T. Ikeda, N.R. Jana, H. Doi, M. Kurosawa, M. Nekooki, N. Nukina, Nat. Med. 10, 148 (2004)
89. R. Liu, H. Barkhordarian, S. Emadi, C.B. Park, M.R. Sierks, Neurobiol. Dis. 20, 74 (2005)
90. M.A. Singer, S. Lindquist, Mol. Cell 1, 639 (1998)
91. K. Oku, H. Watanabe, M. Kubota, S. Fukuda, M. Kurimoto, Y. Tsujisaka, M. Komori, Y. Inoue, M. Sakurai, J. Am. Chem. Soc. 125, 12739 (2003)
92. K. Oku, M. Kurose, M. Kubota, S. Fukuda, M. Kurimoto, Y. Tujisaka, A. Okabe, M. Sakurai, J. Phys. Chem. B 109, 3032 (2005)
93. A. Okabe, K. Oku, S. Fukuda, T. Furuki, M. Sakurai, Cryobiol. Cryotechnol. 53, 111 (2007)
94. G. Brumfiel, Nature 428, 14 (2004)
95. N. Guo, I. Puhlev, D.R. Brown, J. Mansbridge, F. Levine, Nat. Biotechnol. 18, 168 (2000)
96. A. Eroglu, M.J. Russo, R. Bieganski, A. Fowler, S. Cheley, H. Bayley, M. Toner, Nat. Biotechnol. 18, 163 (2000)
97. T. Kikawada, A. Saito, Y. Kanamori, Y. Nakahara, K. Iwata, D. Tanaka, M. Watanabe, T. Okuda, Proc. Natl. Acad. Sci. USA 104, 11585 (2007)


13

Protein Misfolding Diseases and the Key Role Played by the Interactions of Polypeptides with Water

C.M. Dobson

Abstract. The manner in which a newly synthesised chain of amino acids folds into the unique structure of a functional globular protein depends both on the intrinsic properties of the amino acid sequence and on multiple influences within the crowded aqueous milieu of the cell. But if proteins misfold, or fail to remain correctly folded, a common consequence is aggregation, a phenomenon that is involved in many highly debilitating and increasingly common medical disorders including Alzheimer’s disease and Type II diabetes. In this chapter we describe first how the concerted application of a wide range of experimental and theoretical techniques under laboratory conditions has allowed the fundamental principles of protein misfolding and aggregation to be understood at an atomic level. Then we discuss approaches that are designed to explore how these principles apply within living systems. Of particular importance in the context of this volume is the emergence of the role of aggregation propensity, closely linked to the solubility of specific states of proteins in the aqueous environment of the cell, as one of the most fundamental properties that is encoded in the sequences of peptide and protein molecules.

13.1 Introduction

One of the essential characteristics of a living system is the ability of its component molecular structures to self-assemble into their functional forms in a largely aqueous environment [1]. The folding of proteins into their compact three-dimensional structures is the most fundamental example of biological self-assembly; understanding this process therefore provides unique insight into the way in which evolutionary selection has influenced the properties of a molecular system for functional advantage [2]. The wide variety of highly specific structures that result from protein folding, and which serve to bring key functional groups into close proximity, has enabled living systems to develop astonishing diversity and selectivity in their underlying chemical processes. A key aspect of this process is the role played by water in the stability of the folded states of proteins in the cellular environment and in enabling the folding process to occur efficiently [3]. In addition, the evolutionary selection of the sequences of proteins ensures that they are able to retain solubility at the level required for the optimal functional efficiency of the organisms in which they are expressed [4].

Another important recent development in molecular biology is that we now know that the folding process does much more than simply generate biological activity, and that it is strongly coupled to many other biological processes including the trafficking of molecules to specific cellular locations and the regulation of cellular growth and differentiation. In addition, only correctly folded proteins have the ability to remain soluble in crowded biological environments and to interact selectively with their natural partners [2]. It is not surprising, therefore, that the failure of proteins to fold correctly, or to remain correctly folded, is the origin of a wide variety of pathological conditions [5]. In this chapter we explore the underlying nature and consequences of misfolding and its links with disease, with particular emphasis on the role of water. In order to achieve these objectives we show how it is possible to relate properties such as solubility, which can be studied in detail in the test tube, to their effects in living systems through the use of model organisms such as the fruit fly [6]. In this context we stress the remarkable correlations between specific physicochemical phenomena and biological phenomena ranging from locomotor abilities to lifespan.

13.2 The Importance of Normal and Aberrant Protein Folding in Biology

The manner by which a polypeptide chain folds to a specific three-dimensional protein structure has not until recently been understood at anything approaching the atomic level. The folded structures of the native states of many proteins are, however, known, and are thought almost always to correspond to the structures that are most thermodynamically stable under physiological conditions [7]. The role of water in determining this stability is critical, and globular proteins have a close-packed hydrophobic core with polar and charged groups on the surface. Burial of the hydrophobic residues is a major driving force in folding, and the nature and distribution of surface groups is crucial for ensuring solubility and independence within the crowded molecular environment of the cell [8]. Despite the fact that the native state is energetically favoured, the total number of possible conformations of any polypeptide chain is so large that a systematic search for this required structure during folding from an ensemble of highly unstructured species would take an astronomical length of time. It is now clear, however, that the folding process does not involve a series of mandatory steps between well-defined partially folded states, but rather a stochastic search of the many conformations accessible to a polypeptide chain [7, 9–11].
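The scale of the "astronomical length of time" mentioned above can be illustrated with a back-of-the-envelope calculation. The sketch below is only an order-of-magnitude illustration: the number of conformations per residue, the chain length and the sampling rate are assumed values chosen for the example and are not taken from the text.

```python
# Rough estimate of the time needed for an exhaustive conformational search.
# All three inputs below are illustrative assumptions, not values from the text.
STATES_PER_RESIDUE = 3         # assumed backbone conformations per residue
CHAIN_LENGTH = 100             # assumed number of residues in a small protein
SAMPLING_RATE_HZ = 1e13        # assumed conformations tested per second
SECONDS_PER_YEAR = 3.156e7
AGE_OF_UNIVERSE_YEARS = 1.38e10


def exhaustive_search_years(states, length, rate_hz):
    """Years needed to visit every conformation once at a fixed sampling rate."""
    total_conformations = states ** length
    return total_conformations / rate_hz / SECONDS_PER_YEAR


years = exhaustive_search_years(STATES_PER_RESIDUE, CHAIN_LENGTH, SAMPLING_RATE_HZ)
print(f"conformations to search: {STATES_PER_RESIDUE ** CHAIN_LENGTH:.2e}")
print(f"exhaustive search time:  {years:.2e} years")
print(f"ratio to age of universe: {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Under these assumptions the exhaustive search exceeds the age of the universe by many orders of magnitude, which is why a systematic search cannot be the route by which real proteins fold.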


Natural proteins are able to fold to specific structures because, on average, native-like interactions between residues are more stable than non-native ones. The former are therefore more persistent and the polypeptide chain is able to find its lowest energy structure by a process of trial and error. Moreover, if the free energy surface or landscape has the right shape (see Fig. 13.1), only a minute fraction of all possible conformations is sampled by any given protein molecule during its transition from a random coil to a native structure [7, 9–11]. As the landscape, describing the free energies of the different possible conformations of the protein in its cellular environment (aqueous for the cytosolic proteins that we largely discuss here, although non-polar for at least some regions of membrane proteins), is encoded by the amino acid sequence, natural selection has enabled proteins to evolve so that they are able to fold rapidly and efficiently. Such a description is often referred to as the new view of protein folding and illustrates how the application of ideas from chemical physics and statistical mechanics has provided a robust and universal conceptual basis for understanding this complex biological process in molecular detail [7, 9–11].

Fig. 13.1. A highly schematic energy landscape for protein folding. This surface is derived from a computer simulation of the folding of a highly simplified model of a small protein. The surface serves to “funnel” the multitude of denatured conformations to the unique native structure. The critical region on a simple surface such as this one is the saddle point corresponding to the transition state, the barrier that all molecules must cross to be able to fold to the native state. Superimposed on this schematic surface is an ensemble of structures corresponding to the experimental transition state for the folding of a small protein; this ensemble was calculated by using computer simulations constrained by experimental data from mutational studies of the protein acylphosphatase [12]. The spheres represent the three “key residues” in the structure; when these residues have formed their native-like contacts, the overall topology of the native fold is established. The structure of the native state is shown at the bottom of the surface, while at the top are indicated schematically some contributors to the distribution of unfolded states that represent the starting point for folding. Also indicated are highly simplified trajectories for the folding of individual molecules. From [2]
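The idea that a trial-and-error search on a funnel-shaped landscape samples only a minute fraction of all conformations can be made concrete with a toy model. The sketch below is a hypothetical illustration and is not the simulation used for Fig. 13.1: a "conformation" is assumed to be a list of on/off contacts, each native-like contact is assumed to lower the energy by one arbitrary unit, and moves are accepted with the standard Metropolis criterion. The number of contacts, the temperature and the random seed are all arbitrary choices.

```python
import math
import random


def fold_steps(n_contacts=20, temperature=0.3, seed=0, max_steps=1_000_000):
    """Metropolis search on a toy funnel: energy = -(number of native contacts).

    Returns the number of trial moves made before every contact is native,
    or None if max_steps is reached first. All parameters are illustrative.
    """
    rng = random.Random(seed)
    # Start from a random "unfolded" state: each contact is native (1) or not (0).
    state = [rng.randint(0, 1) for _ in range(n_contacts)]
    native = sum(state)
    steps = 0
    while native < n_contacts and steps < max_steps:
        i = rng.randrange(n_contacts)
        # Forming a contact lowers the energy by 1; breaking one raises it by 1.
        delta_e = 1 if state[i] else -1
        # Metropolis rule: always accept downhill moves, sometimes accept uphill.
        if delta_e < 0 or rng.random() < math.exp(-delta_e / temperature):
            state[i] ^= 1
            native -= delta_e
        steps += 1
    return steps if native == n_contacts else None


n = 20
steps = fold_steps(n_contacts=n)
print(f"conformations in the landscape: {2 ** n}")
print(f"trial moves needed to fold:     {steps}")
```

Because native-like contacts persist once formed while non-native ones are readily replaced, the biased search reaches the lowest-energy state after visiting far fewer conformations than the 2^20 available in this toy landscape.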

In a living system, proteins are synthesised on ribosomes from the genetic information encoded in the cellular DNA. The nature of the subsequent folding process for a given protein varies significantly for different types of protein, and ranges from co-translational folding, in which the nascent chain becomes at least partially structured prior to its release from the ribosome, to folding within organelles such as mitochondria where folding may occur only after trafficking and translocation through membranes [13–15]. But despite such variety, the fundamental principles of folding, discussed above, are undoubtedly universal. And as incompletely folded proteins must inevitably expose to the solvent at least some regions of structure that are buried in the native state, they are prone to inappropriate interactions with other molecules within the crowded environment of a cell [16]. Living systems have therefore evolved a range of strategies to prevent such behaviour [14], including the presence of proteins that catalyse potentially slow steps in attaining the correct fold, such as proline isomerisation and disulphide bond formation, many varieties of molecular chaperones that play a vital role in reducing misfolding and aggregation, as well as quality control mechanisms that play crucial roles in targeting irreversibly misfolded proteins for degradation [14, 17–19].

It is increasingly recognised, however, that the process of protein folding is much more than just a fascinating example of the ability of a biological system to self-assemble to generate a functional state. Biological phenomena as apparently diverse as the translocation of proteins across membranes, their trafficking to particular locations or secretion to the outside world, the specificity of the immune response and the regulation of cell growth and proliferation are directly dependent on folding and unfolding events [2]. Failure to fold correctly, or to remain correctly folded, will therefore give rise to the malfunctioning of living systems and hence to disease [20–22]. Some of these diseases (e.g., cystic fibrosis [20] and some types of cancer [23]) result from the simple fact that if proteins do not fold correctly they will not be present in sufficient quantities to exercise their proper function; many such disorders, normally called loss of function diseases, are familial as the probability of misfolding is often greater in mutational variants than in the wild-type protein because of the likelihood of their decreased stability and reduced cooperativity. In other cases, proteins with a high propensity to misfold escape all the protective mechanisms and form intractable aggregates within cells or (more commonly) in extracellular space (Fig. 13.2). An increasing number of disorders (see Table 13.1), including Alzheimer’s and Parkinson’s diseases, the spongiform encephalopathies and type II diabetes, are known to be directly associated with the deposition of such aggregates in tissues including the brain, heart and spleen [5]. In the next section we shall look at the underlying molecular origins of the formation of these species, and at the crucial importance of the solubility of proteins in the environments in which they are located within living systems.

Fig. 13.2. Schematic representation of the possible mechanism of amyloid formation by a globular protein such as lysozyme. After synthesis on the ribosome, the protein folds in the endoplasmic reticulum (ER), aided by molecular chaperones that deter aggregation of incompletely folded species. The correctly folded protein is secreted from the cell and functions normally in its extracellular environment. Under some circumstances, the protein unfolds at least partially, and becomes prone to aggregation. This can result in the formation of fibrils and other aggregates that can accumulate in tissue. Small oligomeric or pre-fibrillar aggregates as well as highly organised fibrils and plaques can give rise to pathological conditions in some disorders, notably the neurodegenerative diseases. N, I and U refer to native, partially unfolded (intermediate) and unfolded states of the protein, respectively. QC refers to the quality control mechanism that prevents incompletely folded proteins being secreted from the ER. From [24]


Table 13.1. A selection of some of the major human diseases associated with misfolding and the formation of extracellular amyloid deposits or intracellular inclusions with amyloid-like characteristics (selected from [5], in which a more comprehensive list is given)

| Disease | Aggregating protein or peptide | Length of protein or peptide (a) | Structure of protein or peptide (b) |
|---|---|---|---|
| Neurodegenerative diseases | | | |
| Alzheimer’s disease (c) | Amyloid β peptide | 40 or 42 (d) | Natively unfolded |
| Spongiform encephalopathies (c, e) | Prion protein or fragments thereof | 253 | Natively unfolded (1–120) and α-helical (121–230) |
| Parkinson’s disease (c) | α-Synuclein | 140 | Natively unfolded |
| Amyotrophic lateral sclerosis (c) | Superoxide dismutase 1 | 153 | All-β, IG-like |
| Huntington’s disease (f) | Huntingtin with long polyQ stretches | 3,144 (g) | Largely natively unfolded |
| Familial amyloidotic polyneuropathy (f) | Mutants of transthyretin | 127 | All-β, prealbumin-like |
| Non-neuropathic systemic amyloidoses | | | |
| AL amyloidosis (c) | Immunoglobulin light chains or fragments thereof | ca. 90 (d) | All-β, IG-like |
| AA amyloidosis (c) | Fragments of serum amyloid A protein | 76–104 (d) | All-α, unknown fold |
| Senile systemic amyloidosis (c) | Wild-type transthyretin | 127 | All-β, prealbumin-like |
| Hemodialysis-related amyloidosis (c) | β2-Microglobulin | 99 | All-β, IG-like |
| Finnish hereditary amyloidosis (f) | Fragments of gelsolin mutants | 71 | Natively unfolded |
| Lysozyme amyloidosis (f) | Mutants of lysozyme | 130 | α + β, lysozyme-fold |
| Non-neuropathic localised amyloidoses | | | |
| ApoAI amyloidosis (f) | Fragments of apolipoprotein AI | 80–93 (d) | Natively unfolded |
| Type II diabetes (c) | Amylin | 37 | Natively unfolded |
| Medullary carcinoma of the thyroid (c) | Calcitonin | 32 | Natively unfolded |
| Hereditary cerebral haemorrhage with amyloidosis (f) | Mutants of amyloid β peptide | 40 or 42 (d) | Natively unfolded |
| Injection-localised amyloidosis (c) | Insulin | 21 + 30 (h) | All-α, insulin-like |

(a) The data do not refer to the number of amino acid residues of the precursor proteins, but to the lengths of the processed polypeptide chains that deposit into aggregates.
(b) This column reports the structural class and fold; both refer to the processed peptides or proteins that deposit into aggregates prior to aggregation and not to the precursor proteins.
(c) Predominantly sporadic, although in some of these diseases hereditary forms associated with specific mutations are well documented.
(d) Fragments of various lengths are generated and reported in ex vivo fibrils.
(e) Five per cent of cases are infectious (iatrogenic).
(f) Predominantly hereditary, although in some of these diseases sporadic cases are documented.
(g) Lengths refer to the normal sequences with non-pathogenic traits of polyQ.
(h) Human insulin consists of two chains (A and B with 21 and 30 residues, respectively) covalently bonded by disulphide bridges.

13.3 Protein Aggregation and Amyloid Formation

Each amyloid-associated disease involves predominantly the aggregation of a specific protein, although a range of other components including additional proteins and carbohydrates is incorporated into the deposits when they form in vivo [5]. In the case of neurodegenerative diseases, the quantities of aggregates involved can sometimes be so small as to be almost undetectable, whereas in some systemic diseases – such as that associated with lysozyme discussed below – literally kilograms of protein can be found in one or more organs [29]. The characteristics of the soluble forms of the 40 or so proteins involved in the well-defined amyloid disorders are very varied – they range from intact globular proteins to largely unstructured peptide molecules – but the aggregated forms have many common characteristics [30]. Amyloid deposits all show specific optical behaviour (such as birefringence) on binding certain dye molecules such as Congo red. The fibrillar structures typical of many of the aggregates have very similar morphologies (long, unbranched and often twisted structures a few nanometres in diameter) and a characteristic “cross-β” X-ray fibre diffraction pattern. The latter reveals that the organised core structure is composed of β-sheets whose strands run perpendicular to the fibril axis [30].


The ability of polypeptide chains to form such structures turns out, however, not to be restricted to the relatively small numbers of proteins associated with recognised clinical disorders, and, indeed, we have suggested that it could be a generic feature of polypeptide chains [21, 24]. Compelling evidence for the latter statement is that fibrils can be formed in vitro by many peptides and proteins with no known disease association, including such well-known and highly studied molecules as myoglobin [31], and also by homopolymers such as polyalanine, polythreonine or polylysine [32]. The latter finding indicates that the ability to form the amyloid structure does not need to be encoded in the sequence of the protein; in essence it is inherent in the intrinsic character of polypeptide chains, akin to analogous properties of many synthetic polymers, and this finding is reinforced by recent computer simulations of a simple model of a small homopolymeric peptide that self-assembles into a cross-β structure under a wide range of conditions (Fig. 13.3) [33]. Of particular interest is the fact that a variety of different mechanisms of assembly are observed in the simulations, ranging from the direct assembly of single β-sheets to a process in which the peptides coalesce into a disorganised oligomer within which structural reorganisation takes place to produce the cross-β structure; remarkably, the variety of assembly processes seen in an extended series of computer simulations has been observed experimentally in studies of a wide range of different systems [5].

We have determined the atomic-level structure of a peptide molecule in amyloid fibrils by solid-state NMR techniques, and the results show clearly the extended molecular conformation characteristic of β-strands and also the fact that the side chains are close-packed in remarkably specific orientations, at least within the central region of the structure [25]. Indeed, increasingly detailed models based on data from techniques such as X-ray fibre diffraction, cryo-electron microscopy (EM) and solid-state NMR are now emerging [5]; one early example showing characteristic features that have been observed in general terms in a range of more recent studies of a variety of different systems, representing variations on a common theme, is shown in Fig. 13.4 [26, 34]. An additional development is the ability to crystallise small peptides that show fibrillar-like assemblies within three-dimensional crystals, enabling the nature of the interactions between specific residues in amyloid-like structures to be explored [27]. But it is clear that the increasing capability of solid-state NMR spectroscopy to determine detailed three-dimensional structures of fibrillar structures [28] is the crucial step forward, and that it will soon lead to a knowledge of sufficient amyloid and amyloid-like structures to enable the determinants of their characteristic structural features to be understood in detail.

In addition to defining their molecular structures, it is of considerable interest to understand the physical properties of the fibrils and the nature of the forces that lead to their stability. To this end, we have been studying a range of different fibrils by means of experimental approaches originally developed within the rapidly developing field of nanotechnology, such as atomic force microscopy (AFM), in conjunction with computer simulation methods [35].


Fig. 13.3. Schematic illustration of the “condensation-ordering” mechanism of aggregation. This mechanism is indicated by results from computer simulations of the aggregation of a 12-residue peptide composed of identical amino acids, modelled using a simple “tube” model to describe the peptide structure [33]. The characteristic cross-β structure of amyloid fibrils is observed to emerge spontaneously, and can do so through a variety of apparently distinct processes that have been the focus of intense experimental and theoretical studies [5]. These different processes appear as different manifestations of a common underlying process and depend on the relative importance of hydrogen bonding and hydrophobic interactions. Highly hydrophobic polypeptide chains collapse first into disordered and highly dynamic oligomers and then rearrange into ordered assemblies, while more hydrophilic peptides assemble directly into an array of β-strands. As well as allowing the various processes involved in aggregation to be identified, these simulations enable the nature of the nucleation process to be revealed and provide insight into the origin of the toxicity of the oligomeric aggregates that appear in the intermediate stages of the process. From [33]

Our findings reveal that amyloid fibrils represent a well-defined class of highly organised materials with similar physical properties that can be compared and contrasted on the nanometre scale with well-established types of more conventional materials [35]. Specifically, the core structure of the fibrils is stabilised primarily by interactions, particularly hydrogen bonds, involving the polypeptide main chain (Fig. 13.5). As the main chain is common to all polypeptides, this observation explains why fibrils formed from polypeptides of very different sequences have marked similarities, particularly in the fibril core structure, although differences in detail exist as a result of the influence of the packing of the side chains [24, 30, 35]. In some cases, only a fraction of the residues of a given protein may be involved in this core structure, with the remainder of the chain associated in some other manner with the fibrillar assembly; in other cases, almost the whole polypeptide chain appears to be involved. The generic amyloid structure, characteristic of the polymeric character of polypeptide chains, contrasts strongly with the highly individualistic globular structures of most natural proteins; in these latter structures the interactions associated with the highly complementary packing of the side chains appear to override the main chain preferences (Fig. 13.4) [24, 35]. Because the interactions stabilising the two alternative types of ordered protein structure, the globular and amyloid forms, are similar in nature, their stabilities can be similar under some conditions.

Fig. 13.4. Comparison of examples of native and amyloid structures of protein molecules. On the left are ribbon diagrams of the native structures of three small proteins: an SH3 domain (top), myoglobin (bottom) and acylphosphatase (middle). The native structures differ in their topologies and contents of α-helices and β-sheets resulting from the dominance of side-chain interactions within their highly evolved sequences. On the right is a molecular model of an amyloid fibril (image kindly provided by Helen Saibil, Birkbeck College, London, from work reported in [26]). The fibril was produced from the SH3 domain whose native structure is shown on the left, and consists of four “protofilaments” that twist around one another to form a hollow tube with a diameter of approximately 6 nm. The β-strands (flat arrows) are oriented perpendicular to the fibril axis and are linked together by hydrogen bonds involving main chain amide and carbonyl groups, many of which are intermolecular, to form a continuous structure in each protofilament. The protofilaments are held together by much weaker interactions involving primarily side-chain contacts. As the main chain is common to all polypeptides, the core protofilament structures of fibrils from different sequences have common features, differing only in detail as a result of differences in the non-dominant effects of side-chain packing. The arrow indicates that when the native states of globular proteins are destabilised, they tend to convert into the generic amyloid structure, as described in the text. From [34]

Fig. 13.5. Comparison of the mechanical properties among different classes of materials. The plot shows the bending rigidity of a given material as a function of its cross-sectional moment of inertia. A linear relationship within a specific type of material indicates that the forces stabilising the differently sized samples of that material are identical. The dark grey band in the diagram encompasses the various examples of amyloid fibrils formed from different types of peptide or protein investigated in this study. The close correlation of the rigidity and moment of inertia indicates similar interactions in each type of fibril, and analysis shows that the dominant contribution to the interactions is the main-chain hydrogen bonds between the β-strands of the amyloid cross-β structure; further support for this conclusion comes from the fact that spider silk, the strength of which is also attributed to main-chain hydrogen bonding, correlates closely with amyloid fibrils. The mid-grey band encompasses materials such as actin filaments that are held together by amphiphilic interactions characteristic of amino-acid side chains; the two examples of amyloid protofibrils examined in this study fall within this range, suggesting that strong main-chain interactions are not fully formed at this stage of the assembly process. Further details are given in [35], from which this figure is taken
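The linear relationship described in the caption to Fig. 13.5 can be made concrete with a small sketch. For a homogeneous elastic rod the bending rigidity B is the product of the Young's modulus E and the cross-sectional moment of inertia I, so samples of one material with different cross-sections fall on a line whose slope is E. The code below assumes a solid circular cross-section and uses placeholder radii and a placeholder modulus; none of these numbers comes from the chapter or from [35].

```python
import math


def moment_of_inertia_cylinder(radius_m: float) -> float:
    """Area moment of inertia of a solid circular cross-section: I = pi * r^4 / 4."""
    return math.pi * radius_m ** 4 / 4.0


def youngs_modulus(bending_rigidity: float, moment_of_inertia: float) -> float:
    """For a homogeneous elastic rod B = E * I, hence E = B / I."""
    return bending_rigidity / moment_of_inertia


# Placeholder modulus in Pa (an assumption for illustration, not a measured value).
E_ASSUMED = 3.0e9

# Hypothetical fibril radii in metres (assumptions, not data from the chapter).
radii = [2.0e-9, 3.0e-9, 5.0e-9]

for r in radii:
    inertia = moment_of_inertia_cylinder(r)
    rigidity = E_ASSUMED * inertia  # B = E * I, in N m^2
    # Every sample recovers the same modulus: that is the "linear relationship".
    print(f"r = {r:.1e} m  I = {inertia:.3e} m^4  B = {rigidity:.3e} N m^2  "
          f"E = {youngs_modulus(rigidity, inertia):.2e} Pa")
```

Read in this way, two materials that lie on the same line in a plot of B against I share the same effective modulus, which is the sense in which the caption infers similar stabilising interactions.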

Even though the ability to aggregate to form amyloid fibrils appears to be generic, the propensity to do so under given circumstances can vary dramatically between different sequences [5]. It has proved possible to correlate the relative aggregation rates of a wide range of peptides and proteins with physicochemical features of the molecules such as charge, secondary structure propensities and hydrophobicity (Fig. 13.6) [36] and indeed to predict the regions of a polypeptide chain that have the highest propensity to self-assemble and which are likely to be found in the fibril cores [37]. In a globular protein the polypeptide main chain and the hydrophobic side chains are largely buried within the folded structure. Only when they are exposed, for example when the protein is partially unfolded (e.g., at low pH or as the result of destabilising mutations) or fragmented (e.g., by proteolysis), will conversion into amyloid fibrils be facile. Recent studies are exploring in much greater detail than before the nature and rate of establishment of the equilibrium between the solution and fibrillar states of a protein, and in essence defining both the kinetic behaviour and the solubility of the peptides and proteins involved [38,39].

Fig. 13.6. Calculated vs. observed changes in aggregation rates upon mutation. The experimental data relate to mutations of short peptides or natively unfolded proteins including amylin, the Aβ-peptide and α-synuclein. The calculated values are determined from an equation involving the changes in just three variables (hydrophobicity, charge and secondary structure propensities) caused by the mutations. The plot shows, for both experimental and calculated data, ln(ν_mut/ν_wt), i.e., the natural logarithm of the aggregation rate of the mutant, ν_mut, divided by that of the wild-type molecule, ν_wt. From [36]
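The kind of calculation summarised in Fig. 13.6 can be sketched as a linear combination of the three mutation-induced changes. The chapter does not give the equation or its coefficients, so the linear form, the coefficient values and the names below are illustrative assumptions only; the signs are chosen to agree with the statement in the text that increased hydrophobicity or decreased charge enhances aggregation.

```python
from dataclasses import dataclass


@dataclass
class MutationEffect:
    """Changes caused by a single mutation (mutant minus wild type); hypothetical units."""
    d_hydrophobicity: float        # change in hydrophobicity
    d_structure_propensity: float  # change in secondary structure (beta) propensity
    d_charge: float                # change in magnitude of the net charge


# Placeholder coefficients: assumptions for illustration, NOT the fitted values of [36].
COEFFS = {"hydrophobicity": 0.6, "structure": 0.2, "charge": -0.5}


def predicted_ln_rate_ratio(m: MutationEffect, coeffs: dict = COEFFS) -> float:
    """Return a predicted ln(v_mut / v_wt) under the assumed linear model."""
    return (coeffs["hydrophobicity"] * m.d_hydrophobicity
            + coeffs["structure"] * m.d_structure_propensity
            + coeffs["charge"] * m.d_charge)


# A hypothetical mutation that raises hydrophobicity and removes one unit of charge.
example = MutationEffect(d_hydrophobicity=1.0, d_structure_propensity=0.0, d_charge=-1.0)
print(predicted_ln_rate_ratio(example))  # positive value: faster predicted aggregation
```

A positive result corresponds to a mutant predicted to aggregate faster than the wild type, a negative one to a slower mutant; plotting such predictions against measured values gives a comparison of the type shown in the figure.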

The propensities of folded proteins to aggregate will therefore depend on the accessibility of such aggregation-prone species, a conclusion that is clearly demonstrated by detailed studies of the amyloidogenic mutational variants of lysozyme, which we have found to decrease the stability and cooperativity of the native state (Fig. 13.4) [40–43]. Indeed, these experiments show that the effect of the disease-associated mutations is to decrease the energy difference between the native state and the intermediates populated in the normal folding of the protein, such that the latter are accessible to a much greater extent in the variants than in the wild-type protein [40]. The large mass of evidence now accumulated from studies of lysozyme has provided detailed insight into many aspects of the likely origin of systemic amyloid disease; this topic has recently been reviewed and will not be discussed in this article [41]. Of particular interest, however, is the increasing recognition that fluctuations of native-like species could be of critical importance in the aggregation of proteins to form amyloid structures under physiological conditions without the need for significant perturbations of the environment in which the proteins normally function [44].
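The statement that a smaller energy difference makes the folding intermediates more accessible can be illustrated with a two-state equilibrium between the native state N and an intermediate I. This is a deliberately simplified sketch: the two-state assumption, the temperature and the free-energy values are all hypothetical and are not taken from the lysozyme studies cited above.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def intermediate_fraction(delta_g_kj_mol: float, temperature_k: float = 310.0) -> float:
    """Equilibrium fraction of molecules in I for N <-> I, with delta_g = G_I - G_N.

    K = [I]/[N] = exp(-delta_g / RT), so the fraction in I is K / (1 + K),
    which equals 1 / (1 + exp(delta_g / RT)).
    """
    return 1.0 / (1.0 + math.exp(delta_g_kj_mol / (R_KJ * temperature_k)))


# Hypothetical energy gaps (kJ/mol): a "wild-type" gap and a smaller "variant" gap.
for label, gap in (("wild type (assumed)", 30.0), ("variant (assumed)", 15.0)):
    print(f"{label}: fraction in intermediate = {intermediate_fraction(gap):.2e}")
```

Because the population depends exponentially on the gap, even a modest destabilisation of the native state raises the population of the aggregation-prone species by orders of magnitude in this simple picture.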

13.4 Molecular Evolution and the Control of Protein Misfolding

It is apparent that biological systems have become robust not just by careful manipulation of the sequences of proteins but also by controlling, by means of molecular chaperones and degradation mechanisms, the particular conformational state adopted by a given polypeptide chain at a given time and under given conditions (Fig. 13.7). This process can be thought of as being analogous to, and just as fundamental and important as, the way that biology regulates and controls the various chemical transformations that take place in the cell by means of enzymes. And, just as the aberrant behaviour of enzymes can cause metabolic disease, the aberrant behaviour of the chaperone and other machinery regulating polypeptide conformations can contribute to misfolding and aggregation diseases [45,46].

The ideas encapsulated in Fig. 13.7, therefore, serve as a physicochemical framework for understanding the fundamental events that underlie misfolding diseases. For example, many of the mutations associated with the familial forms of deposition diseases, as discussed earlier for lysozyme, increase the population of partially unfolded states, and hence increase the propensity to aggregate by decreasing the stability or cooperativity of the native structure [41,47,48]. Other familial diseases are associated with the accumulation of amyloid deposits whose primary components are fragments of native proteins; such fragments can be produced by aberrant processing or incomplete proteolysis, and are unable to fold into aggregation-resistant states. Other pathogenic mutations act by enhancing the propensities of such species to aggregate, for example, by increasing their hydrophobicity or decreasing their charge [36]. And, in the case of the prion disorders such as Kuru or Creutzfeldt–Jakob disease, it appears that ingestion of pre-aggregated states of an identical protein, e.g., by voluntary or involuntary cannibalism or by means of contaminated pharmaceuticals or surgical instruments, can increase dramatically the inherent rate of aggregation through seeding and breakage, and hence generate a mechanism for transmission [49,50].

In some aggregation diseases, the large quantities of insoluble protein involved may physically disrupt specific organs and hence cause pathological behaviour [29]. But for neurodegenerative disorders, such as Alzheimer’s disease, the primary symptoms almost certainly result from toxicity associated with aggregation, and these disorders are therefore often described as gain-of-function diseases [51, 52]. The early pre-fibrillar aggregates of proteins associated with such diseases have been shown to be highly damaging to cells; by contrast, the mature fibrils appear relatively benign [52–54]. Moreover, we have recently found that similar aggregates of proteins that are not connected with any known diseases can be equally toxic to cells, both when added to cell culture medium [55] and also when microinjected into the brains of rats [56].

Fig. 13.7. A unified view of some of the multiple types of structure that can be formed by polypeptide chains. An unstructured chain, for example newly synthesised on a ribosome, may fold to a native structure, perhaps via one or more partially folded intermediates. It can, however, experience other fates such as degradation or aggregation. An amyloid fibril is just one form of aggregate, but it is unique in having a highly organised structure, as shown in Fig. 13.4. The populations and interconversions of the various states are determined by their relative thermodynamic and kinetic stabilities under any given conditions. In living systems, however, transitions between the different states are highly regulated by control of the environment, and by the presence of molecular chaperones, proteolytic enzymes and other factors. Failure of such regulatory mechanisms is likely to be a major factor in the onset of misfolding diseases. From [2]

The generic nature of such aggregates and their effects on cells has recently been supported by the remarkable finding that antibodies raised against early aggregates of Aβ cross-react with early aggregates of a range of different peptides and proteins, and moreover inhibit their toxicity [57, 58]. It is possible that there are specific mechanisms for this toxicity, for example, as a result of annular species that resemble the toxins produced by bacteria that form pores in membranes and disrupt the ion balance in cells [59]. It is likely, however, that the relatively disorganised pre-fibrillar aggregates are inherently toxic through a less specific mechanism, for example, as a result of the exposure of non-native hydrophobic surfaces stimulating aberrant interactions with cell membranes or other cellular components [60]. In contrast to the exquisitely designed surfaces of the correctly structured molecules within the crowded cellular environment, which have evolved to interact only with specific partners, the surfaces of any non-evolved polymeric aggregates that escape the various types of protective mechanisms, discussed below, are likely to interact inappropriately with many of the components of a biological system and hence will commonly cause malfunctions and potentially disease.

13.5 Impaired Misfolding Control and the Onset of Disease

Under normal circumstances, molecular chaperones and other “housekeeping” mechanisms are remarkably efficient in ensuring that such potentially toxic species are neutralised before they can do any damage [14, 60, 61]. Such neutralisation could result simply from the efficient targeting of misfolded proteins for degradation, but it appears that molecular chaperones are also able to alter the partitioning between harmful and harmless forms of aggregates, as a result of changing the kinetic or thermodynamic stability of one or more of the multiple species accessible to a protein (Fig. 13.7) [62]. If the efficiency of such protective mechanisms becomes impaired, however, the probability of pathogenic behaviour must increase [45, 61]. Such a scenario would explain why most of the amyloid diseases are associated with old age, where there is likely to be an increased tendency for proteins to become misfolded or damaged, coupled, ultimately at least, with a decreased efficiency of the protective chaperone and unfolded protein responses [63]. It is ironic that through our success in increasing the life expectancy of the populations of the developed world we are now seeing the limitations of our proteins and of the regulatory mechanisms that control their behaviour [60,64].

One of the characteristics of proteins that is implied in this explanation of misfolding diseases is that relatively small changes in their sequences as a result of mutation, or of their biological environment in old age, are, at least in some cases, enough to cause a shift from normal (soluble) to abnormal (aggregation) behaviour. This situation can be qualitatively rationalised by the argument that natural selection can only generate sequences that are good enough to allow the organism concerned to flourish relative to its potential competitors; in this context, the behaviour of proteins in old age is unlikely to be of importance in such a selection process [24,64]. Dramatic evidence for this supposition has recently emerged from an analysis of the relationship between experimental aggregation rates of a set of human proteins and measurements of the level of gene expression that are likely to relate to the concentrations of the corresponding proteins in the organism itself [4].

This analysis [4] shows that the correlation coefficient between the aggregation rates and expression levels of all the proteins for which both sets of data could be found, which includes proteins both associated and not associated with amyloid disease, is an astonishing 0.97 (Fig. 13.8). This very high degree of correlation is, however, exactly that predicted qualitatively by the reasoning given above concerning evolutionary selection. Specifically, it reflects the fact that a protein must be soluble enough to exist at the level that is optimal for the organism concerned, and this solubility is achieved by the selection during evolution of amino acid substitutions which reduce the propensity to aggregate. Most amino acid substitutions, however, increase the aggregation propensity of natural proteins [36]. So once evolutionary selection has achieved a sufficiently low aggregation propensity to allow the optimal level of the protein concerned to be achieved, random mutagenesis will in general prevent the aggregation propensity decreasing further; this combination of effects is likely to be the explanation for cytosolic proteins tending to fall very close to the line indicated in Fig. 13.8. This result reflects the critical role that the interaction of proteins with water plays in the evolution of biological organisms and in the balance between the normal and aberrant behaviour that is associated with the onset of misfolding diseases.

Fig. 13.8. Correlation between expression levels and the measured aggregation rates for a set of human proteins. The aggregation rates represent all the data obtained from a comprehensive search of the amyloid aggregation literature, for studies carried out at pH values between 4.0 and 8.0. The expression levels are estimated from the cellular mRNA concentration and are taken from published databases. The standard deviations of the aggregation rates are reported only in four cases, as these values are generally not available or difficult to extract from the literature. Data for two proteins not involved in any known medical conditions are included in the plot while the other points correspond to proteins that are associated with amyloid diseases. From [4]
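A correlation coefficient of the kind quoted for Fig. 13.8 can be computed as sketched below. The data values are invented placeholders, and the use of logarithms (a common choice for quantities spanning several orders of magnitude) and the resulting sign of the coefficient are assumptions; the chapter quotes only the value 0.97 and does not describe the procedure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression levels and aggregation rates (arbitrary units, not real data).
expression = [1.0e2, 1.0e3, 1.0e4, 1.0e5]
agg_rate = [1.0e-2, 1.2e-3, 0.9e-4, 1.0e-5]

log_expr = [math.log10(v) for v in expression]
log_rate = [math.log10(v) for v in agg_rate]

# With these placeholders, highly expressed proteins aggregate more slowly,
# so r is close to -1; its magnitude is what measures the tightness of the trend.
print(pearson_r(log_expr, log_rate))
```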

13.6 Probing Misfolding and Aggregation in Living Organisms

The conclusions and ideas of the molecular basis of amyloid disease that have been discussed so far have been derived almost completely from experiments carried out in the test tube (in vitro) and in the computer (in silico). Despite the fact that there is strong circumstantial evidence to link them to events occurring in living systems (in vivo), including experiments with cells in culture, we wish to explore much more rigorously the way in which the myriad components of the intra- and extracellular environment affect the quantitative relationship between physicochemical properties such as aggregation propensity and its consequences in a living organism. To this end we are using the fruit fly (Drosophila melanogaster) as a model organism to link the chemistry and physics of aggregation to its biological effects in higher organisms [65]. The advantage of this particular system for our purposes is that the short lifespan (typically about 30 days) and low unit cost relative to, for example, transgenic rodent models permit us to carry out a very large number of experiments in a reasonable timeframe to obtain data that are statistically highly significant.

The approach we have taken is to exploit the existence of transgenic fruit flies in which the 42-residue Aβ-peptide is expressed in the brain. Lines of flies had previously been generated in which deposits of the peptide can be seen to develop with time [66]. In addition, the flies develop locomotor defects, observed most easily in assays of their ability to climb up a glass surface, and have reduced lifespans. The deposits of the Aβ-peptide were found initially to occur within neurons and then to accumulate as extracellular deposits analogous to those seen in sufferers from Alzheimer’s disease as well as in transgenic mouse models designed to study this condition. It had also been found that flies expressing the Aβ-peptide having the E22G (Arctic) mutation, which results in a very early onset form of Alzheimer’s disease in humans, have very much shorter lifespans than those expressing the wild-type peptide and show a much earlier appearance of peptide-containing deposits within brain tissue and of locomotor defects [66,67].

The conceptual basis that underlies these experiments is encapsulated in Fig. 13.8 and the accompanying explanation, which indicates that at least many of our proteins are “on the edge” of aggregation as they can have evolved only to be as robust as is necessary to allow the living system in which they are present to compete successfully for survival [4]. If we were to make mutations in the Aβ-peptide that increase or decrease its propensity for aggregation, we predicted that they should, on the arguments made earlier, increase or decrease respectively the severity of neuronal damage in the transgenic fly system. From our previous studies of aggregation in vitro, we can predict the changes in the intrinsic propensity to aggregate by using the algorithms based on physicochemical principles and derived from the experimental data (Fig. 13.6) [36,37,54].

We have used this approach to design a series of some 20 single mutational variants of the 42-residue peptide in the first instance which we anticipated would give a spread of aggregation propensities. Because this peptide is not intrinsic to the fly, there is no reason to suppose that the mutations will cause any other differences in their behaviour; this assumption can, however, be explored statistically when the results on the whole set of peptides are analysed. The variation in intrinsic aggregation rates is generally predicted to be significantly less than an order of magnitude (rather modest in terms of the variations in the rates of different naturally occurring peptides and proteins, which cover more than six orders of magnitude) and representative studies carried out in vitro have validated the accuracy of these predictions [54]. In addition, quantitative analysis shows that the levels of expression of the different peptides are similar, enabling this factor to be eliminated from the analysis of the origins of any significantly different behaviour found within the series of variants.

The results of this set of experiments are dramatic, and a flavour of their remarkable nature is illustrated in a snapshot of a climbing assay involving a subset of the mutated peptides (Fig. 13.9). This experiment illustrates the effect of introducing two different single-residue mutations designed in each case to reduce the aggregation propensity of the wild-type peptide. It is immediately evident that the mutations result in the dramatic recovery of locomotor skills; similar experiments in which mutations were designed to increase the aggregation propensity show equally striking decreases in such skills [54]. By defining a “toxicity” parameter based on locomotor ability and lifespan, the correlation of the experimental effects of the mutations can be compared with the predictions in a quantitative manner (Fig. 13.9); this procedure reveals that the correlation coefficient relating toxicity to the aggregation propensity of 17 mutational variants is an astonishing 0.85 [6]. We can conclude from this finding that, despite the vast machinery associated with the regulation and management of peptide and protein expression and degradation, the times of onset of restricted movement and the lifespans of the flies are quantitatively dependent simply on the physicochemical properties of the aggregation-prone species.

The value of the correlation coefficient for the data shown in Fig. 13.9 shows that the probability that neuronal dysfunction is not related directly to the aggregation of the Aβ-peptide, in this system at least, is less than 1 in 100,000. In additional studies we have investigated the effects of second mutations introduced to “rescue” flies expressing an aggregation-prone variant of the Aβ-peptide, specifically the Arctic mutation (E22G) (Fig. 13.9). These experiments show that it is possible to neutralise essentially completely the effects of even this highly pathogenic mutation by a further substitution that increases its solubility [6]. Moreover, more detailed analysis shows that the data correlate even more closely with the tendency of the various mutational variants to convert into pre-fibrillar (oligomeric) species than with their propensities to form the fully formed amyloid fibrils themselves [6].

Fig. 13.9. The effect of mutations in the sequence of the 42-residue human Alzheimer Aβ-peptide on neuronal dysfunction in transgenic fruit flies. The upper left panel (a) illustrates a climbing assay of flies expressing the wild-type sequence (left) and two mutational variants predicted to reduce the peptide’s aggregation propensity; the more mobile the flies, the higher up the tube they can climb. The right-hand upper panel (b) represents a similar experiment with flies expressing the Aβ-peptide containing the E22G “Arctic mutation” (left-hand tube). The two right-hand tubes contain flies expressing peptides that contain mutations that decrease the propensity to form pre-fibrillar aggregates (protofibrils). The lower panel (c) shows the degree of correlation between the relative locomotor activity of a series of mutational variants and their predicted propensities to form protofibrils. Figure adapted from [6]
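The link between a correlation coefficient and a statement about probability, as made above for the 17 variants with a coefficient of 0.85, can be sketched with the standard t-statistic for a Pearson correlation. Whether the quoted probability was obtained by this particular test is not stated in the text, so the choice of test here is an assumption.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """t-statistic for testing zero correlation: t = r * sqrt((n - 2) / (1 - r^2)).

    Under the null hypothesis of no correlation (and the usual normality
    assumptions) t follows a Student t distribution with n - 2 degrees of freedom.
    """
    if n < 3:
        raise ValueError("need at least 3 data points")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Values quoted in the text: r = 0.85 for 17 mutational variants.
t_value = correlation_t_statistic(0.85, 17)
print(f"t = {t_value:.2f} with {17 - 2} degrees of freedom")  # t is roughly 6.25
```

A t-value of this size with 15 degrees of freedom lies far in the tail of the distribution, which is consistent in order of magnitude with the very small probability quoted above.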

These experiments therefore provide further very strong evidence for the proposition that oligomeric aggregates are responsible for cellular damage, and that they are the culprits in the onset of at least some of the diseases associated with the eventual appearance of amyloid fibrils. Moreover, studies of the effects of aggregation in another model organism, C. elegans, using gene knockout techniques have provided evidence for the idea that the formation of relatively harmless large aggregates could have evolved to be a protective mechanism against neuronal damage [68,69]. We believe that the use of model organisms in the ways illustrated in these examples will play a major role in the quest to understand the underlying links between physical and chemical principles and biological function: specifically in the context of this chapter, the fundamental origins of the complex and increasingly common diseases that are associated with protein misfolding [6] and the key role played by the interactions of biological systems with water in determining their stability and solubility.

13.7 The Recent Proliferation of Misfolding Diseases and Prospects for Effective Therapies

In the specific context of protein misfolding and misassembly, events that will always have a finite probability of occurring given the complex and stochastic processes involved in normal folding and assembly, these studies have shown that under normal circumstances molecular chaperones and other “housekeeping” mechanisms are remarkably efficient in ensuring that potentially toxic species such as oligomeric or pre-fibrillar amyloid aggregates are neutralised in living systems before they can do significant damage [5,14]. Such neutralisation can result from targeting them efficiently for degradation, from disrupting them to regenerate their soluble precursors or from their conversion into less toxic aggregates such as fibrils and plaques.

The evidence discussed in this chapter suggests that the reason for the recent proliferation of aggregation diseases, in the developed world in particular, is fundamentally due to the fact that at least some of our proteins are poised right at the boundary between solubility and insolubility [4]. In such a situation, relatively small changes in aggregation propensities (e.g., resulting from even a single mutation as in familial amyloid diseases such as that associated with lysozyme [40, 41]), or in protein concentration (e.g., in dialysis-related amyloidosis [74]) or decreases in the efficiency of protective mechanisms or increases in the number of misfolded or damaged proteins (e.g., in old age [63]) can result in the initiation and slow accumulation of aggregates such as amyloid fibrils, which can in some cases result in the presence of significant quantities of toxic species such as fibril precursors.

These ideas, based initially on studies in “test tubes” or of cells in culture, are now being linked to the behaviour of higher organisms through the use of model systems such as fruit flies as “living test tubes” [64]. We see as a result of this type of approach the way that the principles of chemistry and physics translate remarkably directly into the biological and physiological properties of living systems to an extent that can be attributed to the highly interdependent co-evolution of molecules and the biological environments in which they function. It is particularly satisfying, in the light of the fact that living cells contain a remarkable concentration of molecular species, typically more than 300 g L⁻¹ [16], to conclude that the importance of maintaining solubility of these species reflects the key role that the interaction of water with biomolecules plays in determining whether the behaviour of a biological system is normal or aberrant.

This picture that our proteins, the most abundant and ubiquitous of all molecules in biology, are poised on the brink of an aggregation precipice may appear at first sight to be a very negative conclusion about the prospects for avoiding misfolding and deposition diseases in the future. There is, however, a very positive conclusion that can be drawn from these findings: they indicate that only relatively small reductions in intrinsic physicochemical properties such as aggregation propensities or in factors such as protein concentration, or small improvements in the efficiency of the various mechanisms, natural or otherwise, which serve to protect us from disease, can take us into the safety zone of solubility; such a situation is illustrated in the dramatic effects of the “rescue” mutations in the fly model of Alzheimer’s disease [54]. Indeed, the vast increase in our understanding of the origins and means of progression of misfolding and aggregation diseases that has taken place in the last decade is beginning to allow the rational design of strategies to combat these highly debilitating disorders in different ways. The generic process of aggregation that has been outlined earlier indicates that there are several very specific steps in the process where directed therapeutic intervention looks highly promising [70,71].

Ultimately, if one can achieve the ability to manipulate gene sequences in humans (e.g., by “gene therapy” or stem-cell techniques), it should be possible to abolish disorders such as Alzheimer’s disease, as we see in the case of the rescue mutations in transgenic fruit flies discussed above [54]. But until then, certain classes of molecular therapeutics look particularly promising; as an example, a number of approaches based on antibodies or other specific binding agents are being explored, as such binding agents can be targeted against a particular molecular species so as, for example, to stabilise the aggregation-resistant native state or to reduce the concentration of aggregation-prone species [42,72,73]. Moreover, the recent discovery that antibodies can be raised against different generic forms of aggregates, including oligomeric species, suggests that they could in principle play a role analogous to natural chaperones [57,58]. In addition, the remarkable correlation between the events occurring in vitro, in silico and in vivo not only represents a major breakthrough in showing the relevance of carefully designed studies in the test tube for understanding the equivalent processes in a living system, but also indicates the value of model organisms for exploring potential therapeutic strategies [6], and, indeed, in addition provides considerable insight into the relationships between chemistry, physics, biology and medicine.

13.8 Concluding Remarks

Application of the techniques and concepts of experimental and theoretical chemistry and physics over many years has provided great insight into the nature and properties of biological molecules at the atomic level, including the manner in which they undergo normal and aberrant self-assembly in laboratory environments; indeed, many of the fundamental principles of the latter have emerged at least in general terms from these studies [5,7]. Concurrently, the methods of biochemistry and cell biology have revealed much about how the same molecules are associated with specific functional processes in the cellular environment and the ways in which such functions can be impaired [5,60]. Further applications of these approaches are likely to continue to increase the depth of our understanding of the fundamental events associated with the processes of protein folding, misfolding and aggregation.

The results discussed in this chapter also indicate that model organisms such as the fruit fly can be of enormous value in exploring the underlying origins of the phenomena that give rise to disease in humans. They also represent a powerful means of exploring the genetic factors that influence such diseases and the effects of processes such as ageing, and of rapidly screening potential therapeutic compounds [6,67]. The substantial degree of progress that has already been made in recent years provides grounds for great optimism that means will be found in the relatively near future to treat effectively, or even to prevent, at least the most common forms of this set of highly unpleasant and usually fatal disorders. Such progress is urgently needed because of the dramatic increase in the number of people who are suffering from, or vulnerable to, these diseases, an increase that is taking them to the top of the list of challenges to healthcare and social support in many countries around the world. And, from the point of view of the topic of this volume, the results of the studies described in this chapter demonstrate in a dramatic manner the key role that the interaction of biological molecules with water has played in biological evolution, and still plays in determining the narrow boundary between the normal and aberrant behaviour of all living systems.

Acknowledgements

I should like to thank in particular the Wellcome Trust and the Leverhulme Trust for generous funding of the research activities described here over many years, as well as the UK Research Councils, the European Commission, the Royal Society and numerous charitable organisations for their crucial support, without which the work described in this chapter could not have been carried out. I should also like to thank very deeply all the students, research fellows and colleagues who have contributed to all aspects of this work, the names of many of whom appear in the references in this chapter. I should also like to express my gratitude to Professors Kunihiro Kuwajima and Yuji Goto, along with the Co-ordinators and Advisers of the “Water and Biomolecules” Research Project supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT), for giving me the privilege of being associated with their research programme and for the stimulation that this connection has had in the development of many of the ideas discussed in this chapter.

References

1. M. Vendruscolo, J. Zurdo, C.E. MacPhee, C.M. Dobson, Philos. Trans. R. Soc. Lond. A 361, 1205 (2003)
2. C.M. Dobson, Nature 426, 884 (2003)
3. M.S. Cheung, A.E. Garcia, J.N. Onuchic, Proc. Natl. Acad. Sci. USA 99, 685 (2002)
4. G.G. Tartaglia, S. Pechmann, C.M. Dobson, M. Vendruscolo, Trends Biochem. Sci. 32, 204 (2007)
5. F. Chiti, C.M. Dobson, Annu. Rev. Biochem. 75, 333 (2006)
6. L.M. Luheshi, D.C. Crowther, C.M. Dobson, Curr. Opin. Chem. Biol. 12, 25 (2008)
7. C.M. Dobson, A. Sali, M. Karplus, Angew. Chem. Int. Ed. Engl. 37, 868 (1998)
8. J.S. Richardson, D.C. Richardson, Proc. Natl. Acad. Sci. USA 99, 2754 (2002)
9. P.G. Wolynes, J.N. Onuchic, D. Thirumalai, Science 267, 1619 (1995)
10. K.A. Dill, H.S. Chan, Nat. Struct. Biol. 4, 10 (1997)
11. A.R. Dinner, A. Sali, L.J. Smith, C.M. Dobson, M. Karplus, Trends Biochem. Sci. 25, 331 (2000)
12. M. Vendruscolo, E. Paci, C.M. Dobson, M. Karplus, Nature 409, 641 (2001)
13. B. Hardesty, G. Kramer, Prog. Nucleic Acid Res. Mol. Biol. 66, 41 (2001)
14. F.U. Hartl, M. Hayer-Hartl, Science 295, 1852 (2002)
15. S.T. Hsu, P. Fucini, L.D. Cabrita, H. Launay, C.M. Dobson, J. Christodoulou, Proc. Natl. Acad. Sci. USA 104, 16516 (2007)
16. R.J. Ellis, Curr. Opin. Struct. Biol. 11, 114 (2001)
17. C. Hammond, A. Helenius, Curr. Opin. Cell Biol. 7, 523 (1995)
18. R.J. Kaufman, D. Scheuner, M. Schroder, X. Shen, K. Lee, C.Y. Liu, S.M. Arnold, Nat. Rev. Mol. Cell Biol. 3, 411 (2002)
19. M.R. Wilson, S.B. Easterbrook-Smith, Trends Biochem. Sci. 25, 95 (2000)
20. P.J. Thomas, B.H. Qu, P.L. Pedersen, Trends Biochem. Sci. 20, 456 (1995)
21. C.M. Dobson, Philos. Trans. R. Soc. Lond. B 356, 133 (2001)
22. A. Horwich, J. Clin. Invest. 110, 1221 (2002)
23. A.N. Bullock, A.R. Fersht, Nat. Rev. Cancer 1, 68 (2001)
24. C.M. Dobson, Trends Biochem. Sci. 24, 329 (1999)
25. C.P. Jaroniec, C.E. MacPhee, V.S. Bajaj, M.T. McMahon, C.M. Dobson, R.G. Griffin, Proc. Natl. Acad. Sci. USA 101, 711 (2004)
26. J.L. Jimenez, J.I. Guijarro, E. Orlova, J. Zurdo, C.M. Dobson, M. Sunde, H.R. Saibil, EMBO J. 18, 815 (1999)
27. R. Nelson, M.R. Sawaya, M. Balbirnie, A.O. Madsen, C. Riekel, R. Grothe, D. Eisenberg, Nature 435, 773 (2005)
28. C. Ritter, M-L. Maddelein, A.B. Siemer, T. Luhrs, M. Ernst, B.H. Meier, S. Saupe, R. Riek, Nature 435, 844 (2005)
29. S.Y. Tan, M.B. Pepys, Histopathology 25, 403 (1994)
30. M. Sunde, C.C.F. Blake, Adv. Protein Chem. 50, 123 (1997)
31. M. Fandrich, M.A. Fletcher, C.M. Dobson, Nature 410, 165 (2001)
32. M. Fandrich, C.M. Dobson, EMBO J. 21, 5682 (2002)
33. S. Auer, C.M. Dobson, M. Vendruscolo, HFSP J. 1, 137 (2007)
34. C.M. Dobson, in Physical Biology: From Atoms to Medicine, ed. A.H. Zewail (Imperial College Press, London, 2008), pp. 289–335
35. T.P. Knowles, A.W. Fitzpatrick, S. Meehan, H.R. Mott, M. Vendruscolo, C.M. Dobson, M.E. Welland, Science 318, 1900 (2007)
36. F. Chiti, M. Stefani, N. Taddei, G. Ramponi, C.M. Dobson, Nature 424, 805 (2003)
37. A.P. Pawar, K.F. DuBay, J. Zurdo, F. Chiti, M. Vendruscolo, C.M. Dobson, J. Mol. Biol. 350, 379 (2005)
38. S. Shammas, T.P.J. Knowles, A.J. Baldwin, C.E. MacPhee, M.E. Welland, C.M. Dobson, G.L. Devlin, in preparation
39. A.J. Baldwin, G.L. Devlin, C. Waudby, M-F. Massuto, T.J.P. Knowles, S.J. Spencer-Cahill, J. Christodoulou, P.D. Barker, C.M. Dobson, in preparation
40. D.R. Booth, M. Sunde, V. Bellotti, C.V. Robinson, W.L. Hutchinson, P.E. Fraser, P.N. Hawkins, C.M. Dobson, S.E. Radford, C.C.F. Blake, M.B. Pepys, Nature 385, 787 (1997)
41. M. Dumoulin, J.R. Kumita, C.M. Dobson, Acc. Chem. Res. 39, 603 (2006)
42. M. Dumoulin, A.M. Last, A. Desmyter, K. Decanniere, D. Canet, A. Spencer, D.B. Archer, S. Muyldermans, L. Wyns, A. Matagne, C. Redfield, C.V. Robinson, C.M. Dobson, Nature 424, 783 (2003)
43. J.R. Kumita, S. Poon, G.L. Caddy, C.L. Hagan, M. Dumoulin, J.J. Yerbury, E.M. Stewart, C.V. Robinson, M.R. Wilson, C.M. Dobson, J. Mol. Biol. 369, 157 (2007)
44. F. Chiti, C.M. Dobson, Nat. Chem. Biol. 5, 15 (2009)
45. N.F. Bence, R.M. Sampat, R.R. Kopito, Science 292, 1552 (2001)
46. A.J.L. Macario, E.C. Macario, Ageing Res. Rev. 1, 295 (2002)
47. J.W. Kelly, Curr. Opin. Struct. Biol. 8, 101 (1998)
48. M. Ramirez-Alvarado, J.S. Merkel, L. Regan, Proc. Natl. Acad. Sci. USA 97, 8979 (2000)
49. S.B. Prusiner, Science 278, 245 (1997)
50. M. Tanaka, S.R. Collins, B.H. Toyama, J.S. Weissman, Nature 442, 585 (2006)
51. J.P. Taylor, J. Hardy, K.H. Fischbeck, Science 296, 1991 (2002)
52. B. Caughey, P.T. Lansbury Jr., Annu. Rev. Neurosci. 26, 267 (2003)
53. D.M. Walsh, I. Klyubin, J.V. Fadeeva, W.K. Cullen, R. Anwyl, M.S. Wolfe, M.J. Rowan, D.J. Selkoe, Nature 416, 535 (2002)
54. L.M. Luheshi, G.G. Tartaglia, A.C. Brorsson, A.P. Pawar, I.E. Watson, F. Chiti, M. Vendruscolo, D.A. Lomas, C.M. Dobson, D.C. Crowther, PLoS Biol. 5, e290 (2007)
55. M. Bucciantini, E. Giannoni, F. Chiti, F. Baroni, L. Formigli, J. Zurdo, N. Taddei, G. Ramponi, C.M. Dobson, M. Stefani, Nature 416, 507 (2002)
56. S. Baglioni, F. Casamenti, M. Bucciantini, L. Luheshi, N. Taddei, F. Chiti, C.M. Dobson, M. Stefani, J. Neurosci. 26, 8160 (2006)
57. R. Kayed, E. Head, J.L. Thompson, T.M. McIntire, S.C. Milton, C.W. Cotman, C.G. Glabe, Science 300, 486 (2003)
58. R. Kayed, C.G. Glabe, Meth. Enzymol. 413, 326 (2006)
59. H.A. Lashuel, D. Hartley, B.M. Petre, T. Walz, P.T. Lansbury Jr., Nature 418, 291 (2002)
60. M. Stefani, C.M. Dobson, J. Mol. Med. 81, 678 (2003)
61. M.Y. Sherman, A.L. Goldberg, Neuron 29, 15 (2001)
62. P.J. Muchowski, G. Schaffar, A. Sittler, E.E. Wanker, M.K. Hayer-Hartl, F.U. Hartl, Proc. Natl. Acad. Sci. USA 97, 7841 (2000)
63. P. Csermely, Trends Genet. 17, 701 (2001)
64. C.M. Dobson, Nature 418, 729 (2002)
65. A. Finelli, A. Kelkar, H.J. Song, H. Yang, M. Konsolaki, Mol. Cell. Neurosci. 26, 365 (2004)
66. D.C. Crowther, K.J. Kinghorn, E. Miranda, R. Page, J.A. Curry, F.A. Duthie, D.C. Gubb, D.A. Lomas, Neuroscience 132, 123 (2005)
67. J. Bilen, N.M. Bonini, Annu. Rev. Genet. 39, 153 (2005)
68. E. Cohen, J. Bieschke, R.M. Perciavalle, J.W. Kelly, A. Dillin, Science 313, 1604 (2006)
69. P.T. Lansbury, Proc. Natl. Acad. Sci. USA 96, 3342 (1999)
70. C.M. Dobson, Science 304, 1259 (2004)
71. F.E. Cohen, J.W. Kelly, Nature 426, 905 (2003)
72. D. Schenk, Nat. Rev. Neurosci. 4, 49 (2003)
73. M. Dumoulin, C.M. Dobson, Biochimie 86, 589 (2005)
74. C.M. Dobson, Nat. Struct. Mol. Biol. 13, 295 (2006)


14

Effect of UV Light on Amyloidogenic Proteins: Nucleation and Fibril Extension

A.K. Thakur and Ch. Mohan Rao

Abstract. Amyloid fibril formation is associated with a large number of neurodegenerative diseases. Understanding the molecular details of amyloidogenesis is critical for developing strategies to intervene in the pathological process. Formation of amyloid fibrils is a three-stage process: structural perturbation, nucleation and fibril extension. Absorption of UV light is known to perturb protein conformation and lead to aggregation. We have investigated the effect of UV light on three amyloidogenic proteins: prion protein, β2-microglobulin and α-synuclein, representing three different classes of proteins, largely α-helical, β-sheet and natively unstructured, respectively. Of these, only prion protein undergoes amorphous aggregation upon UV exposure. Interestingly, all three proteins, after UV exposure, fail to form amyloid fibrils de novo. It is possible that UV exposure compromises nucleation or fibril extension, or both. Upon seeding, however, these UV-exposed proteins formed amyloid fibrils. The fibrils formed by UV-exposed prion protein were morphologically different from those formed by the unexposed protein. Thus, upon UV exposure all three proteins lose their ability to form fibrils de novo but remain competent for seeded fibril growth. UV exposure, therefore, selectively compromises the ability of these proteins to nucleate. UV exposure might be of use in investigating the amyloidogenic process, especially the different processes associated with nucleation and fibril extension.

14.1 Introduction

Molecular self-assembly is one of the key factors of biological structure and function. The forces that are associated with self-assembly also play a role in the folding of nascent proteins, depending on their amino acid sequences. Although several proteins have been shown to refold to their correct, functionally active structures in vitro, the situation in vivo is quite different. Owing to the molecular crowding that prevails in vivo, several nonnative interactions can cause protein misfolding and aggregation. Molecular chaperones and heat shock proteins prevent such nonproductive interactions and help proteins to achieve and maintain the native state. Small heat shock proteins such as αB-crystallin have been shown to inhibit fibril extension of α-synuclein [1] and β2-microglobulin [2]. Proteins have to strike a balance between thermodynamic stability and the flexibility required for biological function. Thus the native functional state of proteins critically depends on several factors. Misfolding and aggregation of proteins, into either amorphous or ordered aggregates, lead to diseases such as cataract, transmissible spongiform encephalopathies (TSE), Alzheimer’s disease, Parkinson’s disease and dialysis-related amyloidosis. Understanding the molecular details of aggregation and amyloid fibril formation is important in designing strategies to mitigate the complications. Amyloid fibril formation involves three steps: structural perturbation, nucleation and elongation. Several modalities are being used to perturb the native structure to initiate amyloid fibril formation in vitro. Would it be possible to use UV exposure as a structural perturbant to initiate nucleation leading to amyloid fibril formation or aggregation? We have addressed this question using mouse prion protein [3], human β2-microglobulin and human α-synuclein. Interestingly, we find, inter alia, that UV-exposed proteins fail to form amyloid fibrils de novo; however, they remain competent for fibril extension if provided with preformed fibrils as seeds. UV exposure, therefore, selectively compromises the nucleation process. This chapter provides a brief and contextual overview of the several structural perturbants and describes the effect of UV light on the amyloidogenic proteins.

14.2 Amyloid

Amyloid fibrils are characterized by the presence of a cross-β sheet structure and show a structural hierarchy: subprotofibrils twist around each other to form protofilaments, which in turn join laterally and twist around one another to form mature fibrils. Recent lines of evidence suggest that such well-ordered structures lead to extensive H-bonding, resulting in a novel blue fluorescence [4]. The fibrils are chemically and thermodynamically stable. Despite differences in primary structure, all proteins achieve similar cross-β sheet structures in their amyloid form. This led to the suggestion that formation of amyloid fibrils might be a generic property of any polypeptide chain; all proteins can form amyloids under appropriate conditions [5]. To date, around 60 proteins have been shown to form fibrils. Amyloid fibril formation involves three major stages: structural perturbation (prenucleation stage), nucleation and fibril extension.

14.2.1 Structural Perturbation

Structural perturbation or conformational change in the soluble protein is important for amyloid formation. The observation of an amyloidogenic intermediate of transthyretin (TTR) at acidic pH led to the hypothesis of conformational perturbation as a prerequisite for amyloid fibril formation [6].


Several studies since then have supported this suggestion, and it is now widely accepted that conformational change or structural perturbation is a prerequisite for amyloid formation. Structural perturbation involves destabilization of the native state, thus forming nonnative states or partially unfolded intermediates (kinetic or thermodynamic intermediates), which are prone to aggregation. Conditions ranging from mild to harsh, such as low pH, exposure to elevated temperatures, exposure to hydrophobic surfaces and partial denaturation using urea or guanidinium chloride, are used to achieve nonnative states. Stabilizers of intermediate states such as trimethylamine N-oxide (TMAO) are also used for amyloidogenesis. However, natively unfolded proteins, such as α-synuclein, tau protein and yeast prion, require some structural stabilization for the formation of partially folded intermediates that are competent for fibril formation. Conditions for partial structural consolidation include low pH, the presence of sodium dodecyl sulfate (SDS), temperature or chemical chaperones.

pH

In many cases, low pH has been used to form amyloid intermediates. TTR exists as a tetramer at neutral pH; lowering the pH to 4.4 leads to monomerization. At this pH, an intermediate with well-defined, less hydrophobic, tertiary structure was observed. This intermediate forms amyloid fibrils and hence it is called the amyloid intermediate; at pH > 5, no amyloid formation was observed [7]. The recombinant variable domain of immunoglobulin light chain (VL domain) forms two intermediates: one at pH 3 with native-like secondary structure and a large, exposed hydrophobic surface, and the other at pH 2, which is largely disordered but retains a β-sheet structure. Of these two, the intermediate with native-like conformation, formed at pH 3, appears to act as an intermediate for fibril formation [8]. β2-Microglobulin fibril formation has been shown to be rapid below pH 4.0, and, in addition, ionic strength also plays a role in the fibril formation [9,10]. The presence of salts at low pH increases the hydrophobicity of β2-microglobulin. A balance of electrostatic and hydrophobic interactions provided by anion binding was shown to influence the amyloid fibril growth and stability of β2-microglobulin [11].

In contrast, natively unfolded (intrinsically disordered) proteins such as α-synuclein require partially folded intermediates to form fibrils. Low intrinsic hydrophobicity and high net charge at neutral pH result in the natively unfolded structure of α-synuclein. Lowering the pH leads to reduced net charge, inducing α-helical intermediates in α-synuclein. The radius of gyration of α-synuclein at neutral pH is 40 Å, and it decreases to 30 Å upon lowering the pH; this compaction of the protein molecule correlates with an increase in fibril formation [12]. Low pH thus induces conformational changes and facilitates fibril formation.
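As a rough worked estimate of what this compaction means, assume (an illustrative assumption that is not made in the text) that the volume pervaded by the chain scales as the cube of its radius of gyration. The two quoted values then give:

```latex
\frac{V_{\text{low pH}}}{V_{\text{neutral pH}}}
  \approx \left(\frac{R_g^{\text{low pH}}}{R_g^{\text{neutral pH}}}\right)^{3}
  = \left(\frac{30\ \text{\AA}}{40\ \text{\AA}}\right)^{3}
  = \frac{27}{64} \approx 0.42
```

Under that assumption, a 25% decrease in the radius of gyration corresponds to the chain occupying roughly 40% of the volume it pervades at neutral pH.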


Temperature

Temperature is one of the major determinants of protein conformation. Extremes of temperature, whether high or low, lead to unfolding of proteins, referred to as thermal or cold denaturation, respectively. The process of protein folding or unfolding is commonly associated with one or more intermediates. Some of the intermediates thus generated might partition into off-pathway processes such as aggregation or amyloid fibril formation. Temperature-induced formation of partially folded intermediates has been observed in the cases of α-synuclein [12], β2-microglobulin, lysozyme [13], Aβ-peptide [14–16], prion protein [17], insulin [18] and ataxin [19].

Surface Interactions

Apart from pH and temperature, interaction with various surfaces, whether hydrophobic or hydrophilic, plays a major role in fibril formation. It has been suggested that in vitro fibril formation induced by surface interactions could be the best mimic of in vivo fibril formation, as in vivo deposits are associated with surfaces [20]. A few studies indicate the involvement of hydrophobic surfaces such as graphite, mica and teflon; interaction with these surfaces has been shown to facilitate fibril formation [21,22]. Conversely, charged surfaces can also induce the conformational changes required for fibril formation. In light-chain amyloidosis, pathological deposition of amyloid fibrils of immunoglobulin light-chain fragments occurs in several tissues including the walls of blood vessels. The recombinant light-chain variable domain, SMA, forms fibrils on native mica, which has a negatively charged surface. Surface interactions accelerate the rate of fibril formation and also alter the mechanism. No fibrils of SMA were observed on hydrophobic or positively charged surfaces, indicating the role of electrostatic interactions between the surface and proteins [20].

Partially Denaturing Condition

Denaturants such as urea and guanidinium chloride (GdmCl) have been used to perturb the structure of proteins. Partially denaturing conditions such as 2–5 M GdmCl [23–27] or 3 M urea [28] generate partially unfolded intermediates, which facilitate fibril formation. Higher concentrations of denaturants would prevent interprotein interactions and solubilize the aggregating species. Lower concentrations of denaturants might not generate any intermediate species and thus would not facilitate fibril formation. Depending on the protein and denaturant, an optimal concentration of denaturant would be needed for fibril formation. A combination of temperature and GdmCl has also been used to form fibrils of the small heat shock protein bovine α-crystallin [29].


Membrane Interactions

Many pathological amyloid deposits are associated with membranes. Amphiphilic molecules such as SDS and lipids provide a membrane mimetic environment that can be used to investigate the role of membranes in the amyloidogenic process. Such membrane mimetics have been shown to enhance fibril formation [30, 31]. In contrast, a few other studies show that such membrane mimetic conditions inhibit fibrillogenesis [32, 33]. The dual role of membranes is rather intriguing. Recently, we have addressed this apparent contradiction by investigating the interaction of SDS with α-synuclein [34]. The study showed two types of ensembles of α-synuclein and SDS: the fibrillogenic ensembles, formed at an optimal SDS concentration of around 0.5–0.75 mM, are characterized by enhanced accessible hydrophobic surfaces and extended to partially helical conformation, while the less fibrillogenic or nonfibrillogenic ensembles, formed above 2 mM SDS, are characterized by less accessible hydrophobic surfaces and maximal helical content. This finding is consistent with both sets of observations in the literature; the apparent contradiction is attributable to the relative concentrations of SDS [34]. Fibril formation of β2-microglobulin is also reported to be maximal at 0.5 mM SDS [35]. Lipids, particularly negatively charged lipids such as phosphatidylserine [36], and free fatty acids such as palmitic acid, stearic acid, oleic acid and linoleic acid [37] have been implicated in fibril formation. Electrostatic interactions between protein and lipids have been observed [38]. These interactions accelerate the fibril formation of many amyloidogenic proteins such as α-synuclein [39], Aβ-peptide [38], lysozyme, insulin, glyceraldehyde-3-phosphate dehydrogenase, myoglobin, transthyretin, cytochrome c, histone H1 and α-lactalbumin [36]. Cholesterol and lipid rafts have also been investigated for their role in promoting amyloidogenesis [40–42].

Other Perturbants

Organic solvents such as methanol, ethanol, trifluoroethanol, propanol and hexafluoro-2-propanol [33]; osmolytes such as glycerol, betaine, taurine and TMAO [43, 44]; pesticides such as rotenone, dieldrin and paraquat [45, 46]; metal ions [47]; ultrasonication [48] and pressure [49] have been shown to influence the rate of fibril formation. In addition to these external factors, intrinsic changes such as point mutations and truncations of amyloidogenic proteins also facilitate fibril formation. The point mutation D187N leads to exposure of a hidden cleavage site, so that proteolysis yields the amyloidogenic fragment of gelsolin [50]. Mutations such as A30P, A53T and E46K in α-synuclein lead to an increase in self-aggregation and oligomerization into protofibrils, compared to the wild-type protein [51]. Several point mutations in prion protein, such as A117V [52], D178N [53], E200K [54], P102L [55] and F198S [56], cause the onset of disease. Interestingly, metal ions such as copper, aluminum and zinc are known to promote the fibril formation of α-synuclein [47,57,58]. However, metal ions are also known to inhibit fibril formation of the Aβ-peptide [59].

14.2.2 Nucleation

Lansbury and his group have shown that amyloid formation is a nucleation-dependent process and that the nucleation step can be bypassed by using seeds of preformed fibrils. The nucleation process is the rate-limiting step in amyloidogenesis and is characterized by a lag phase. During the time required for nucleus formation, the protein appears to be soluble. Nucleus formation requires a series of association steps that are thermodynamically unfavorable because the resultant intermolecular interactions do not outweigh the entropic cost of association [60]. Once the nucleus has formed, further addition of monomers becomes thermodynamically favorable. Nucleation is concentration dependent [61] and shows the presence of hydrophobic cooperativity in the process [62].
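The lag phase and its removal by seeding can be illustrated with a minimal numerical sketch. The model below is a generic Oosawa-type nucleated polymerization scheme, not the analysis used in the cited studies; the function names, the rate constants `k_n` and `k_plus`, the nucleus size `n_c` and the dimensionless units are all assumptions chosen for illustration.

```python
def simulate_fibril_growth(m_total=1.0, seed_number=0.0, seed_mass=0.0,
                           k_n=1e-5, k_plus=1.0, n_c=2,
                           dt=0.05, t_end=1000.0):
    """Euler integration of a simple nucleated polymerization model.

    Returns a list of (time, fraction of protein in fibrils).
    All parameters are illustrative, in arbitrary units.
    """
    fibril_number = seed_number   # concentration of growing fibrils
    fibril_mass = seed_mass       # protein already incorporated in fibrils
    time = 0.0
    curve = [(time, fibril_mass / m_total)]
    for _ in range(int(round(t_end / dt))):
        monomer = max(m_total - fibril_mass, 0.0)
        # Slow, concentration-dependent formation of new nuclei.
        nucleation_rate = k_n * monomer ** n_c
        # Fast addition of monomers to existing fibrils.
        elongation_rate = k_plus * monomer * fibril_number
        fibril_number += nucleation_rate * dt
        fibril_mass = min(
            fibril_mass + (n_c * nucleation_rate + elongation_rate) * dt,
            m_total,
        )
        time += dt
        curve.append((time, fibril_mass / m_total))
    return curve


def half_time(curve):
    """First time at which half of the protein is in fibrils, or None."""
    for time, fraction in curve:
        if fraction >= 0.5:
            return time
    return None


# Unseeded reaction: fibrils must first nucleate, so growth starts slowly.
unseeded = simulate_fibril_growth()
# Seeded reaction: preformed fibrils are present from the start.
seeded = simulate_fibril_growth(seed_number=0.05, seed_mass=0.02)

print("unseeded half-time:", half_time(unseeded))
print("seeded half-time:", half_time(seeded))
```

In this sketch the unseeded curve stays near zero at early times because few growth-competent fibrils exist, whereas the seeded curve rises immediately, which is the qualitative signature of a nucleation-dependent process described above.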

14.2.3 Fibril Extension

The lag in kinetics persists until the formation of a critical nucleus, after which the reaction proceeds in favor of a rapid increase in size [63]. Bidirectional growth of the elongating fiber was observed at this stage [64]. Binding of the monomer to the continuously growing fiber and subsequent conformational change characterize this event [65, 66]. These amyloid aggregates show Congo-red birefringence and cross-β sheet structure. The organization of these fibrils remains the same among different types of proteins: two to three unbranched subprotofibrils (10–15 Å) arrange helically to form protofilaments (protofibrils) (25–30 Å), which associate laterally or twist together in bundles of five to form mature fibrils [67].

14.3 UV Light as a Potent Structural Perturbant

The effect of light on proteins has been known for several decades. Many proteins such as γ-crystallin, present in the eye lens, aggregate upon exposure to UV light [68]. UV light leads to photo-oxidation of aromatic residues (tryptophan, tyrosine and phenylalanine), which leads to conformational alteration and eventually to aggregation. This process is associated with reactive oxygen species (ROS). We have earlier investigated the photo-aggregation of γ-crystallin upon UV exposure and the prevention of the aggregation using α-crystallin [69]. We observed an increase in the hydrophobic surface due to partial unfolding of this protein upon UV exposure [69]. Eye-lens proteins undergo alteration in conformation, as well as in quaternary packing, leading to the opacity of the lens [70]. Therefore, UV light can be a potential protein structural perturbant. We have investigated the possibility of using UV exposure as a structural perturbant to initiate nucleation leading to amyloid fibril formation. We have used three amyloidogenic proteins: prion protein, β2-microglobulin and α-synuclein. These proteins, interestingly, represent three different classes of structures: prion protein is rich in α-helix, β2-microglobulin is rich in β-sheet and α-synuclein is natively unfolded.

14.3.1 UV-Induced Aggregation of Prion Protein

Prion protein has eight tryptophans, and seven of them are in the flexible N-terminal region of the protein. The abundance of tryptophan raises the question of whether perturbing the N-terminal region via photo-oxidation of tryptophans would have any effect on amyloid aggregation. We have exposed prion protein to UV light of 290 nm. Within a few minutes of exposure, prion protein, in sodium phosphate buffer, pH 7.4, aggregated extensively. We monitored the aggregation by measuring Rayleigh scattering, with the excitation and emission monochromators both set at 465 nm. The scattering profile is shown in Fig. 14.1. Aggregation starts after a lag period of about 4 min and plateaus after 15 min. We have also exposed β2-microglobulin and α-synuclein to UV light under similar conditions. Surprisingly, neither β2-microglobulin nor α-synuclein exhibited any aggregation during the period of the experiment (Fig. 14.1). Extended exposure for 1 h also did not lead to any aggregation (data not shown).

Fig. 14.1. Photo-aggregation of prion protein, β2-microglobulin and α-synuclein. In each case, the protein (0.05 mg ml−1) in 50 mM phosphate buffer was exposed to light of 290 nm. Light scattering was measured using a fluorescence spectrophotometer (Fluorolog FL3-22) by setting excitation and emission monochromators at 465 nm. Mouse full-length prion protein (filled square) exhibits aggregation upon UV exposure, whereas β2-microglobulin (filled triangle) and human α-synuclein (filled circle) do not show significant aggregation

Prion protein aggregates do not show an increase in Thioflavin T (ThT) fluorescence, indicating the formation of amorphous aggregates. In order to probe the nature of aggregation (covalent or noncovalent), a photo-aggregated sample of prion protein was treated with 0.1% SDS. We observed a fast decrease in Rayleigh scattering within minutes, showing that the aggregates are soluble in SDS. SDS solubility indicated the predominance of noncovalent interactions in the aggregation of prion protein. Exposure of prion protein at high concentrations to light under partially denaturing conditions also did not lead to an increase in Rayleigh scattering, further confirming the noncovalent nature of interactions in the aggregation of prion protein. We also analyzed the role of the disulfide bond in the photo-aggregation of prion protein by testing the samples on reducing and nonreducing SDS PAGE. We observed the presence of an intact intramolecular disulfide bond both before and after exposure of prion protein to UV light.

Thus, exposure to UV light causes perturbation of the N-terminal region of prion protein (which has seven out of eight tryptophans) and leads to amorphous aggregation. Noncovalent interactions play a predominant role in the photo-aggregation of prion protein [3].

14.3.2 Prevention of UV-Induced Aggregation of Prion Protein

Tryptophan, upon absorption of light, forms the tryptophanyl radical and generates N-formylkynurenine and kynurenine. In the presence of antioxidants, generation of reactive species such as superoxide, singlet oxygen, hydroxyl and peroxyl radicals will be inhibited. We used several antioxidants to investigate the role of ROS in the photo-aggregation of prion protein (since β2-microglobulin and α-synuclein do not photo-aggregate, the effect of antioxidants has not been investigated with these proteins). The antioxidants mannitol, L-cysteine, superoxide dismutase (SOD) and catalase have been used for scavenging hydroxyl radical, singlet oxygen, superoxide and peroxyl radicals, respectively. Antioxidants were added to the protein sample prior to light exposure, and Rayleigh scattering was monitored during light exposure as described above. The presence of mannitol or catalase did not alter the aggregation profile (data not shown), showing that hydroxyl and peroxyl radicals are not involved in the photo-aggregation of prion protein. On the other hand, SOD prevented 45% of the aggregation of prion protein, whereas L-cysteine prevented photo-aggregation to an extent of ∼97% (Fig. 14.2). These studies thus showed that singlet oxygen and superoxide radicals are involved in the photo-aggregation of prion protein [3].
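Percentages of this kind are usually derived by comparing plateau scattering intensities with and without the additive. The sketch below shows one plausible way to compute such a figure; the function name `percent_protection`, the optional baseline correction and the intensity values are hypothetical and are not taken from the study, which does not state how its percentages were calculated.

```python
def percent_protection(control_plateau, sample_plateau, baseline=0.0):
    """Percent of aggregation prevented relative to an antioxidant-free control.

    Inputs are plateau scattering intensities in arbitrary units;
    baseline is the scattering of the unexposed protein solution.
    """
    control_signal = control_plateau - baseline
    if control_signal <= 0:
        raise ValueError("control plateau must exceed the baseline")
    sample_signal = sample_plateau - baseline
    return 100.0 * (1.0 - sample_signal / control_signal)


# Hypothetical intensities, chosen only so that the results match the
# percentages quoted in the text; they are not measured data.
control = 1000.0
with_sod = 550.0
with_cysteine = 30.0

print(round(percent_protection(control, with_sod)))       # 45
print(round(percent_protection(control, with_cysteine)))  # 97
```

A value near 0% would correspond to the behaviour reported for mannitol and catalase, where the aggregation profile was unchanged.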

14.3.3 UV Exposure Alters Conformation of Prion Protein

The far UV circular dichroism (CD) spectrum of prion protein is known to exhibit an α-helical structure (Fig. 14.3a inset). Upon exposure to UV light, prion protein undergoes aggregation. Hence CD measurements were not possible. However, we could record the CD spectra of the UV-exposed prion protein in partially denaturing conditions (3 M urea and 1 M GdmCl). Under these conditions, prion protein is known to undergo ordered aggregation and form amyloid fibrils; hence this condition is called the amyloid condition. We did not observe any photo-aggregation of prion protein under amyloid conditions. Interestingly, UV exposure was sufficient to cause observable differences in the far UV CD of prion protein compared to the unexposed protein under partially denaturing conditions (Fig. 14.3a). We find that UV exposure leads to a decrease in the α-helical content of prion protein [3]. It is possible that photo-oxidation of aromatic amino acids, leading to side-chain modification, results in conformational change, making prion protein prone to amorphous aggregation.

Fig. 14.2. Effect of antioxidants on the photo-aggregation of prion protein. Prion protein photo-aggregation was monitored in the presence of the antioxidants L-cysteine, superoxide dismutase (SOD), mannitol and catalase. Prion protein (PrP) was used at a concentration of 0.05 mg ml⁻¹. The concentrations of the antioxidants used were: L-cysteine, 1 mM; SOD, 20 μg ml⁻¹ (∼64 U ml⁻¹); mannitol, 50 mM; and catalase, 2.5 ng ml⁻¹ (∼0.895 mU ml⁻¹). L-Cysteine (filled circle) prevents the photo-aggregation of prion protein almost completely, while SOD (filled triangle) prevents it partially. Mannitol and catalase do not prevent the aggregation (data not shown)

We have also investigated the effect of UV light exposure on the secondary structures of β2-microglobulin and α-synuclein. Since β2-microglobulin and α-synuclein do not undergo photo-aggregation, we have studied the effect of exposure to UV light under conditions that lead to their ordered aggregation. β2-Microglobulin forms amyloid aggregates at pH 2.5. We exposed β2-microglobulin in citrate buffer, pH 2.5, to UV light and monitored changes in the far UV CD spectrum. Upon exposure to UV light, we observed minor changes in the far UV CD spectrum of β2-microglobulin (Fig. 14.3b). α-Synuclein is a natively unfolded molecule. In the presence of 0.5 mM SDS in HEPES buffer, pH 7.0, it adopts a partially folded conformation [34]. Interestingly, exposure of α-synuclein to UV light under these conditions leads to no observable change in the far UV CD spectrum (Fig. 14.3c). Thus, we see a differential effect of UV light exposure on the changes in the secondary structures of these proteins.

Fig. 14.3. Secondary structural changes of prion protein, β2-microglobulin and α-synuclein upon UV exposure under their respective amyloid-forming conditions. Far UV CD spectra of (a) prion protein in 20 mM sodium phosphate buffer (pH 6.8) containing 100 mM NaCl, 3 M urea and 1 M GdmCl (the inset shows the far UV CD spectrum of native prion protein), (b) β2-microglobulin in 50 mM citrate buffer (pH 2.5) containing 100 mM KCl and (c) α-synuclein in 20 mM HEPES–NaOH buffer (pH 7.0) containing 100 mM NaCl and 0.5 mM SDS. In each panel, curves 1 and 2 show the far UV CD spectra of the protein before and after exposure to UV light, respectively. Panel (a) reproduced from [3]

14.3.4 UV-Exposed Proteins Failed to Form Amyloid De Novo

As discussed earlier, conformational change/structural perturbation is a prerequisite for amyloid formation. Exposure to UV light did not initiate fibril formation; it led to amorphous aggregation of prion protein and no observable change in the other two proteins. We have employed conditions that are known to favor amyloid fibril formation and investigated the effect of UV light exposure of these proteins on their ability to form amyloid fibrils.

UV-Exposed Prion Protein Failed to Form Amyloid De Novo

We have investigated the amyloid formation of UV-exposed prion protein under amyloidogenic conditions (3 M urea, 1 M GdmCl, 150 mM NaCl, pH 6.8, at 37 °C with continuous shaking at 600 rpm). Under amyloidogenic conditions, prion protein that was not exposed to UV light showed an increase in ThT fluorescence after ∼40 h and attained saturation by 120 h (Fig. 14.4a). Surprisingly, structural perturbation upon light exposure had a negative effect on amyloidogenesis. Even after incubating UV-exposed prion protein for several days under amyloidogenic conditions, we did not observe any increase in ThT fluorescence (Fig. 14.4a), indicating that prion protein completely failed to form fibrils upon mild UV exposure [3].

UV-Exposed β2-Microglobulin and α-Synuclein Failed to Form Amyloid De Novo

We also exposed β2-microglobulin and α-synuclein to UV light and investigated the de novo amyloid fibril formation ability of the UV-exposed proteins. β2-Microglobulin readily forms amyloid fibrils at pH 2.5 in 100 mM KCl. Within 8 h of incubation at 37 °C with continuous shaking at 1,000 rpm, β2-microglobulin attains saturation of fibril formation (Fig. 14.4b). We monitored the fibril formation of β2-microglobulin exposed to UV light and incubated under the above-mentioned conditions. Interestingly, we found that β2-microglobulin, like prion protein, failed to form fibrils upon UV exposure (Fig. 14.4b). In the case of α-synuclein, fibril formation occurred in the presence of 0.5 mM SDS and stirring at 1,000 rpm. α-Synuclein exhibited an increase in ThT fluorescence within 3 h of incubation and reached a plateau at ∼10 h. We exposed α-synuclein to UV light and incubated it under the conditions mentioned above. UV-exposed α-synuclein failed to form fibrils even upon prolonged incubation under amyloidogenic conditions (Fig. 14.4c). Thus, the inability to form fibrils was not confined to exposed prion protein alone. All three proteins, when exposed to UV light, failed to form ordered aggregates.
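ThT time courses of this kind are often summarized with an empirical sigmoid, from which a lag time can be read off. The following is a minimal sketch, assuming the commonly used form F(t) = F0 + A / (1 + exp(−k (t − t_half))) with lag time t_half − 2/k. The function names and parameter values are hypothetical and chosen only to mimic the unexposed prion protein curve described above (rise after ∼40 h, saturation by 120 h). They are not fitted to the data of Fig. 14.4.

```python
import math


def tht_sigmoid(t: float, f0: float, amplitude: float, k: float, t_half: float) -> float:
    """Empirical sigmoid for ThT fluorescence during de novo fibril formation.

    f0        : baseline fluorescence
    amplitude : total fluorescence increase at saturation
    k         : apparent growth rate constant (1/h)
    t_half    : time of half-maximal fluorescence (h)
    """
    return f0 + amplitude / (1.0 + math.exp(-k * (t - t_half)))


def lag_time(k: float, t_half: float) -> float:
    """Lag time from the tangent at the midpoint: t_half - 2/k."""
    return t_half - 2.0 / k


# Hypothetical parameters for illustration only.
k, t_half = 0.1, 60.0
print(f"lag time: {lag_time(k, t_half):.1f} h")
for t in (0, 40, 60, 120):
    print(f"t = {t:>3} h  normalized ThT = {tht_sigmoid(t, 0.0, 1.0, k, t_half):.3f}")
```

In this description, a UV-exposed sample that shows no ThT increase corresponds to an amplitude close to zero over the observation window, so no lag time can be defined for it.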

Fig. 14.4. Effect of UV exposure on de novo amyloid fibril formation of prion protein, β2-microglobulin and α-synuclein. Amyloid fibril formation of (a) prion protein, (b) β2-microglobulin and (c) α-synuclein. In each panel, (filled square) represents the protein not exposed to UV light, and (filled circle) the UV-exposed protein. The fibril formation was monitored by ThT fluorescence. An aliquot of the sample was withdrawn at different time points and added to 0.5 ml of 10 μM ThT in 50 mM glycine–NaOH buffer (pH 8.5), and the fluorescence intensity at 485 nm with the excitation wavelength set at 445 nm was measured using a Fluorolog FL3-22 fluorescence spectrophotometer. The UV-exposed proteins failed to form amyloid fibrils de novo

14.3.5 Is Subcritical Concentration of UV-Exposed Protein Responsible for Failure to Form Amyloid Fibrils?

Figure 14.5 shows the atomic force microscopy (AFM) images of prion protein unexposed and exposed to UV light, which were incubated for 120 h in amyloid-forming conditions. The AFM image of prion protein not exposed to UV light exhibited the typical fibrillar morphology (Fig. 14.5a). The AFM image of the UV-exposed protein, on the other hand, did not show the presence of any fibrils (Fig. 14.5b). The inability of UV-exposed prion protein to form amyloid fibrils is intriguing. To see whether UV exposure causes a loss of available protein that brings its concentration down to a subcritical level, we have investigated the concentration dependence of prion protein amyloidogenesis. Several concentrations of unexposed prion protein, ranging from 0.1 to 1.0 mg ml⁻¹, were prepared for amyloid formation. Figure 14.6 shows a rise in the ThT fluorescence of prion protein after 48 h even at 0.25 mg ml⁻¹, that is, at one-fourth of the initial concentration (equivalent to assuming a 75% loss of available protein). Fibril formation could be seen even at a tenfold dilution (0.1 mg ml⁻¹). We have also studied the amyloidogenic potential of β2-microglobulin and α-synuclein (not exposed to UV light) at one-fifth and one-tenth of the concentrations used in the experiments for fibril formation of UV-exposed proteins (shown in Fig. 14.4b and c). Both β2-microglobulin and α-synuclein showed an increase in ThT fluorescence at each of these concentrations. However, UV-exposed prion protein, β2-microglobulin and α-synuclein, even at much higher concentrations, showed no fibril formation as monitored by ThT fluorescence (Fig. 14.4a–c) and AFM (Fig. 14.5b). Thus, the ability of all three proteins to form amyloid fibrils even at one-tenth of the concentration used for fibril formation of UV-exposed proteins rules out the trivial possibility that loss of protein causes the observed lack of amyloid formation with the UV-exposed samples.

Fig. 14.5. AFM images. (a) AFM image of the amyloid fibrils of prion protein. (b) AFM image of the sample of UV-exposed prion protein not showing the presence of any fibrils. Reproduced from [3]

Fig. 14.6. De novo amyloid formation of prion protein at different concentrations. Different concentrations of unexposed prion protein (0.1, 0.25, 0.5, 0.75 and 1.0 mg ml⁻¹) and UV-exposed prion protein (1.0 mg ml⁻¹) were subjected to amyloid-forming conditions. The figure shows representative data at each concentration. Exposed represents UV-exposed prion protein
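The concentration argument in this section is simple arithmetic: each dilution of unexposed protein that still forms fibrils corresponds to a hypothetical fractional loss of protein that would still leave enough monomer for amyloid formation. The sketch below works through that equivalence for the prion protein concentrations listed in Fig. 14.6. The function name is hypothetical, and the calculation only restates the dilution series and makes no claim about the actual extent of photo-oxidative loss.

```python
def equivalent_loss_percent(initial_mg_per_ml: float, tested_mg_per_ml: float) -> float:
    """Loss of available protein (in %) that would reduce the initial
    concentration to the tested concentration."""
    if initial_mg_per_ml <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * (1.0 - tested_mg_per_ml / initial_mg_per_ml)


initial = 1.0  # mg/ml, concentration used for the UV-exposed prion protein
tested = [1.0, 0.75, 0.5, 0.25, 0.1]  # mg/ml, unexposed prion protein (Fig. 14.6)

for conc in tested:
    loss = equivalent_loss_percent(initial, conc)
    print(f"{conc:.2f} mg/ml  corresponds to a {loss:.0f}% loss of available protein")
```

Because unexposed protein still forms fibrils at 0.1 mg ml⁻¹, even a 90% loss of monomer could not by itself account for the absence of fibrils in the UV-exposed samples.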

14.3.6 UV-Exposed Amyloidogenic Proteins Form Amyloid Upon Seeding

As described earlier, amyloidogenesis involves nucleation and fibril extension. Thus UV exposure could lead to compromised nucleation or fibril extension, or both. Fragments of preformed amyloid fibrils act as seeds when mixed with a monomeric protein solution and lead to fibril extension. Seeded fibril extension reactions have no lag periods, in contrast to de novo fibril formation. Seeding thus eliminates the need for nucleation. Does the UV-exposed protein remain competent for fibril extension under conditions where nucleation is not required? To test this possibility, we have generated fibrils from prion protein, α-synuclein and β2-microglobulin samples and sonicated them to obtain seeds. Seeds were added to the respective UV-exposed monomeric proteins, and fibril formation was monitored using ThT fluorescence. Figure 14.7a shows the increase in ThT fluorescence of prion protein either exposed or not exposed to UV light. The UV-exposed protein shows an increase in ThT fluorescence, albeit with slower kinetics than the unexposed protein. UV-exposed β2-microglobulin (Fig. 14.7b) and α-synuclein (Fig. 14.7c) also exhibited similar behavior in terms of elongation of fibrils as well as kinetics of fibril extension. Thus, UV-exposed proteins retained the ability to form fibrils upon seeding. This is an interesting result, as all these proteins (UV-exposed prion protein, α-synuclein and β2-microglobulin) failed to form fibrils de novo. However, they have the ability to elongate in the presence of seeds of amyloid fibrils obtained from unexposed proteins. These results suggest that UV exposure selectively affects nucleation, leaving the protein competent for fibril extension.

Fig. 14.7. Effect of UV exposure on seeded amyloid fibril formation in prion protein, β2-microglobulin and α-synuclein. Samples of unexposed (filled square) and UV-exposed (filled circle) (a) 1 mg ml⁻¹ prion protein in 20 mM sodium phosphate buffer (pH 6.8) containing 100 mM NaCl, 3 M urea and 1 M GdmCl, (b) 0.5 mg ml⁻¹ β2-microglobulin in 50 mM citrate buffer (pH 2.5) containing 100 mM KCl and (c) α-synuclein in 20 mM HEPES–NaOH buffer (pH 7.0) containing 100 mM NaCl and 0.5 mM SDS, were treated with the respective sonicated fibril seeds, and the fibril growth was monitored with time by ThT fluorescence

14.3.7 UV-Exposed Prion Protein Fibrils Show Altered Fibril Morphology

We further investigated fibril morphology under these conditions using electron microscopy (EM) and AFM. Fibrils formed from monomers of unexposed prion protein in the seeded reaction were slender and long, as shown in Fig. 14.8a. These fibrils showed a canonical organization, with sub-protofibrils of 8.89 ± 0.355 nm twisting around each other to form protofilaments of 20.57 ± 0.833 nm. In contrast, fibrils obtained from monomers of UV-exposed protein in seeded reactions were thick, stout and flat in appearance and showed thicknesses of 30 ± 0.916 and 47.72 ± 2.066 nm, indicating a different organization of fibrils, as observed from the EM image (Fig. 14.8b). We have recorded phase images of these fibrils in the tapping mode of AFM. Phase images provide some insight into the compactness or stiffness of the material under investigation. Compactness (or stiffness) refers to the hardness or softness of the sample. A hard sample gives a larger change in phase angle; soft samples, in contrast, lead to smaller changes in phase angle. The fibrils of unexposed prion protein show a phase angle of 37.5 ± 0.358°, as observed from the phase image. In contrast, phase images of fibrils of UV-exposed prion protein showed a significantly lower phase angle of 3.82 ± 0.1457°. The significantly lower phase angle for UV-exposed prion protein fibrils indicates a less compact packing (or lower stiffness) of these fibrils.

Fig. 14.8. EM images of fibrils of seeded reactions. A small amount of sample was placed on a copper grid and stained with uranyl acetate for EM imaging. EM image of fibrils formed with (a) prion protein and (b) UV-exposed prion protein. Scale bar: 500 nm. Reproduced from [3]

14.4 Discussion

Our investigations on the effect of UV light exposure on the amyloidogenic proteins prion protein, β2-microglobulin and α-synuclein provided interesting results. All these proteins failed to form fibrils when exposed to UV light. This failure to form fibrils might arise for the following plausible reasons: (1) photo-oxidation causing loss of available monomeric protein, leading to a subcritical level of the protein for amyloidogenesis; (2) inability of the UV-exposed protein to participate in the amyloid process, probably owing to loss of crucial structure in the monomers; and (3) an inhibitory effect of the oxidized molecule on the amyloid nucleus. The fact that prion protein, β2-microglobulin and α-synuclein not exposed to UV light form amyloid fibrils at significantly lower concentrations (one-tenth) than those used for fibril formation of the UV-exposed proteins rules out the possibility that subcritical protein concentration is responsible for the observed lack of fibril formation upon UV exposure. We find that all three UV-exposed proteins, if provided with preformed seeds, readily form amyloid fibrils, thus ruling out the possible inhibitory effects of photo-oxidized molecules. Hence it appears that UV exposure renders prion protein incapable of forming an amyloid nucleus, perhaps as a result of some structural changes.

Our far UV CD studies (Fig. 14.3) show some change in the secondary structure of prion protein upon exposure to UV light. UV-exposed β2-microglobulin also shows a small change; α-synuclein, however, does not show any change in its secondary structure. UV exposure of proteins leads to photolysis of tryptophan, which can cause conformational changes in the protein. Our earlier studies on melittin, β-lactoglobulin and crystallins have shown that photo-oxidation of a protein depends upon its conformation [70]. Photo-oxidation also depends upon the polarity of the tryptophan environment [71, 72]. Prion protein has eight tryptophans, seven of which are completely exposed and are present in the N-terminal domain. Thus, photo-oxidation of prion protein can cause damage to the N-terminal region, leading to conformational change, aggregation and loss of the ability to form fibrils de novo. Interestingly, β2-microglobulin has two tryptophans, whereas α-synuclein has none. Absorption of light by other chromophores and subsequent generation of ROS might contribute to the observed failure of de novo fibril formation. Further studies are needed to understand these observations.

Prion protein consists of two domains: the flexible N-terminal domain and the C-terminal domain, which consists of three α-helices and two β-sheets [73, 74]. The aggregation properties of full-length prion protein (PrP 23–231) have not been studied as extensively as its truncated forms (PrP 90–231, PrP 106–126 and PrP 121–231) because the N-terminal flexible domain was not considered important for amyloid formation. However, the N-terminal domain appears to be important, as prion protein with N-terminal deletions has been shown to form abnormal conformations of prion aggregates [75–77]. Moreover, transgenic mice lacking residues 32–106 are not susceptible to prion infection [78]. In the current study, we have exposed full-length prion protein to UV light and followed its amyloid formation. Since most of the tryptophan residues are present in the N-terminal region, we expect this region to be the most affected. Interestingly, UV-exposed prion protein failed to form amyloid fibrils de novo, indicating the importance of the N-terminal domain in amyloid formation.

Our study shows that UV exposure of prion protein, β2-microglobulin and α-synuclein leads to loss of the ability of these proteins to form amyloid fibrils de novo. However, they retained the ability to elongate the fibrils when provided with preformed fibrils as seeds. Thus, UV exposure selectively compromises the ability to nucleate fibril growth.

Figure 14.9 schematically describes the effect of UV light on the amyloidogenic proteins. Prion protein, β2-microglobulin and α-synuclein under amyloidogenic conditions undergo structural changes and form an amyloid nucleus, to which other monomers join to extend the nucleus to protofibrils and subsequently thicker fibrils and amyloid aggregates (grey arrows). UV exposure inhibits the nucleation process and hence fibril formation. UV-exposed prion protein undergoes some structural alterations and forms amorphous aggregates. β2-Microglobulin and α-synuclein, however, do not form such amorphous aggregates upon exposure to UV light. All three proteins remain competent for fibril extension if provided with preformed fibrils as seeds. The morphology of the fibrils formed by UV-exposed β2-microglobulin and α-synuclein is comparable to that of the fibrils of unexposed proteins. The morphology of fibrils of UV-exposed prion protein differs in size and compactness from that of the fibrils formed by the unexposed protein (Fig. 14.9).

The selective loss of the ability to nucleate fibril growth upon UV exposure is an important finding, as research on specific inhibition of the nucleation and elongation processes is scarce and faces the basic problem of separating these two intricately interwoven processes. Apolipoprotein E has been shown to specifically inhibit nucleation of Aβ-amyloid aggregation [79, 80]. Similarly, tetracycline has been shown to specifically inhibit the elongation process of the amyloid-forming W7FW14F mutant of apomyoglobin [81].

Although light exposure might not be a factor in amyloid-associated pathologies, other than perhaps in the eye and skin, it appears to be a useful perturbant to investigate amyloid fibril formation. Since UV exposure leads to failure of de novo amyloid fibril formation of three different amyloidogenic proteins, subtle structural changes that help prevent fibril formation could be investigated further. UV exposure also leads to selective compromise of the nucleation process. Thus, it appears that UV exposure could be exploited as a tool for investigating the amyloidogenic process, especially the different processes that are associated with nucleation and fibril extension.

Fig. 14.9. Schematic representation of the effect of light on amyloid proteins. Adapted from [3]

Acknowledgments

We thank Dr T. Ramakrishna for critically editing the manuscript, which helped in improving its quality; Md. Faiz Ahmad for α-synuclein protein and the construct of β2-microglobulin; and Dr Shashi Singh for electron microscopy. AKT acknowledges the award of a Senior Research Fellowship by the Council of Scientific and Industrial Research, New Delhi, India.

References

1. M.F. Ahmad, B. Raman, T. Ramakrishna, Ch.M. Rao, J. Mol. Biol. 375, 1040 (2008)
2. B. Raman, T. Ban, M. Sakai, S.Y. Pasta, T. Ramakrishna, H. Naiki, Y. Goto, Ch.M. Rao, Biochem. J. 392, 573 (2005)
3. A.K. Thakur, Ch.M. Rao, PLoS ONE 3, e2688 (2008)
4. A. Shukla, S. Mukherjee, S. Sharma, V. Agrawal, K.V. Radha Kishan, P. Guptasarma, Arch. Biochem. Biophys. 428, 144 (2004)
5. C.M. Dobson, Trends Biochem. Sci. 24, 329 (1999)
6. W. Colon, J.W. Kelly, Biochemistry 31, 8654 (1992)
7. Z. Lai, W. Colon, J.W. Kelly, Biochemistry 35, 6470 (1996)
8. S.P. Martsev, A.P. Dubnovitsky, A.P. Vlasov, M. Hoshino, K. Hasegawa, H. Naiki, Y. Goto, Biochemistry 41, 3389 (2002)
9. H. Naiki, N. Hashimoto, S. Suzuki, H. Kimura, K. Nakakuki, F. Gejyo, Amyloid 4, 223 (1997)
10. V.J. McParland, N.M. Kad, A.P. Kalverda, A. Brown, P. Kirwin-Jones, M.G. Hunter, M. Sunde, S.E. Radford, Biochemistry 39, 8735 (2000)
11. B. Raman, E. Chatani, M. Kihara, T. Ban, M. Sakai, K. Hasegawa, H. Naiki, Ch.M. Rao, Y. Goto, Biochemistry 44, 1288 (2005)
12. V.N. Uversky, J. Li, A.L. Fink, J. Biol. Chem. 276, 10737 (2001)
13. K. Sasahara, H. Yagi, H. Naiki, Y. Goto, J. Mol. Biol. 372, 981 (2007)
14. Y. Kusumoto, A. Lomakin, D.B. Teplow, G.B. Benedek, Proc. Natl. Acad. Sci. U. S. A. 95, 12277 (1998)
15. O. Gursky, S. Aleshkov, Biochim. Biophys. Acta 1476, 93 (2000)
16. J. Danielsson, J. Jarvet, P. Damberg, A. Graslund, FEBS J. 272, 3938 (2005)
17. O.V. Bocharova, N. Makarava, L. Breydo, M. Anderson, V.V. Salnikov, I.V. Baskakov, J. Biol. Chem. 281, 2373 (2006)
18. A. Arora, C. Ha, C.B. Park, Protein Sci. 13, 2429 (2004)
19. E. Shehi, P. Fusi, F. Secundo, S. Pozzuolo, A. Bairati, P. Tortora, Biochemistry 42, 14626 (2003)
20. M. Zhu, P.O. Souillac, C. Ionescu-Zanetti, S.A. Carter, A.L. Fink, J. Biol. Chem. 277, 50914 (2002)
21. T. Kowalewski, D.M. Holtzman, Proc. Natl. Acad. Sci. U. S. A. 96, 3688 (1999)
22. Z. Wang, C. Zhou, C. Wang, L. Wan, X. Fang, C. Bai, Ultramicroscopy 97, 73 (2003)
23. Y. Sun, N. Makarava, C.I. Lee, P. Laksanalamai, F.T. Robb, I.V. Baskakov, J. Mol. Biol. 376, 1155 (2008)
24. B.A. Vernaglia, J. Huang, E.D. Clark, Biomacromolecules 5, 1362 (2004)
25. A. Ahmad, I.S. Millett, S. Doniach, V.N. Uversky, A.L. Fink, Biochemistry 42, 11404 (2003)
26. Z. Lai, J. McCulloch, H.A. Lashuel, J.W. Kelly, Biochemistry 36, 10230 (1997)
27. M. Calamai, F. Chiti, C.M. Dobson, Biophys. J. 89, 4201 (2005)
28. O.V. Bocharova, L. Breydo, A.S. Parfenov, V.V. Salnikov, I.V. Baskakov, J. Mol. Biol. 346, 645 (2005)
29. S. Meehan, Y. Berry, B. Luisi, C.M. Dobson, J.A. Carver, C.E. MacPhee, J. Biol. Chem. 279, 3413 (2004)
30. H.J. Lee, C. Choi, S.J. Lee, J. Biol. Chem. 277, 671 (2002)
31. E.N. Lee, S.Y. Lee, D. Lee, J. Kim, S.R. Paik, J. Neurochem. 84, 1128 (2003)
32. V. Narayanan, S. Scarlata, Biochemistry 40, 9927 (2001)
33. L.A. Munishkina, C. Phelan, V.N. Uversky, A.L. Fink, Biochemistry 42, 2720 (2003)
34. M.F. Ahmad, T. Ramakrishna, B. Raman, Ch.M. Rao, J. Mol. Biol. 364, 1061 (2006)
35. S. Yamamoto, K. Hasegawa, I. Yamaguchi, S. Tsutsumi, J. Kardos, Y. Goto, F. Gejyo, H. Naiki, Biochemistry 43, 11075 (2004)
36. H. Zhao, E.K. Tuominen, P.K. Kinnunen, Biochemistry 43, 10302 (2004)
37. Z. Ma, G.T. Westermark, Mol. Med. 8, 863 (2002)
38. E.Y. Chi, C. Ege, A. Winans, J. Majewski, G. Wu, K. Kjaer, K.Y. Lee, Proteins 72, 1 (2008)
39. D.P. Smith, D.J. Tew, A.F. Hill, S.P. Bottomley, C.L. Masters, K.J. Barnham, R. Cappai, Biochemistry 47, 1425 (2008)
40. P. Critchley, J. Kazlauskaite, R. Eason, T.J. Pinheiro, Biochem. Biophys. Res. Commun. 313, 559 (2004)
41. J. Kazlauskaite, N. Sanghera, I. Sylvester, C. Venien-Bryan, T.J. Pinheiro, Biochemistry 42, 3295 (2003)
42. N. Sanghera, T.J. Pinheiro, J. Mol. Biol. 315, 1241 (2002)
43. T. Scheibel, S.L. Lindquist, Nat. Struct. Biol. 8, 958 (2001)
44. M.L. Hegde, K.S.J. Rao, Arch. Biochem. Biophys. 464, 57 (2007)
45. V.N. Uversky, J. Li, A.L. Fink, FEBS Lett. 500, 105 (2001)
46. A.B. Manning-Bog, A.L. McCormack, J. Li, V.N. Uversky, A.L. Fink, D.A. Di Monte, J. Biol. Chem. 277, 1641 (2002)
47. V.N. Uversky, J. Li, A.L. Fink, J. Biol. Chem. 276, 44284 (2001)
48. Y. Ohhashi, M. Kihara, H. Naiki, Y. Goto, J. Biol. Chem. 280, 32843 (2005)
49. E. Chatani, H. Naiki, Y. Goto, J. Mol. Biol. 359, 1086 (2006)
50. S.L. Kazmirski, M.J. Howard, R.L. Isaacson, A.R. Fersht, Proc. Natl. Acad. Sci. U. S. A. 97, 10706 (2000)
51. E.K. Tan, L.M. Skipper, Hum. Mutat. 28, 641 (2007)
52. K. Doh-ura, J. Tateishi, H. Sasaki, T. Kitamoto, Y. Sakaki, Biochem. Biophys. Res. Commun. 163, 974 (1989)
53. L.G. Goldfarb, M. Haltia, P. Brown, A. Nieto, J. Kovanen, W.R. McCombie, S. Trapp, D.C. Gajdusek, Lancet 337, 425 (1991)
54. D. Goldgaber, L.G. Goldfarb, P. Brown, D.M. Asher, W.T. Brown, S. Lin, J.W. Teener, S.M. Feinstone, R. Rubenstein, R.J. Kascsak, J.W. Boellaard, D.C. Gajdusek, Exp. Neurol. 106, 204 (1989)
55. K. Hsiao, H.F. Baker, T.J. Crow, M. Poulter, F. Owen, J.D. Terwilliger, D. Westaway, J. Ott, S.B. Prusiner, Nature 338, 342 (1989)
56. K. Hsiao, S.R. Dlouhy, M.R. Farlow, C. Cass, M. Da Costa, P.M. Conneally, M.E. Hodes, B. Ghetti, S.B. Prusiner, Nat. Genet. 1, 68 (1992)
57. Bharathi, S.S. Indi, K.S. Rao, Neurosci. Lett. 424, 78 (2007)
58. J.A. Wright, D.R. Brown, J. Neurosci. Res. 86, 496 (2008)
59. B. Raman, T. Ban, K. Yamaguchi, M. Sakai, T. Kawai, H. Naiki, Y. Goto, J. Biol. Chem. 280, 16157 (2005)
60. C. Chothia, J. Janin, Nature 256, 705 (1975)
61. A. Lomakin, D.S. Chung, G.B. Benedek, D.A. Kirschner, D.B. Teplow, Proc. Natl. Acad. Sci. U. S. A. 93, 1125 (1996)
62. R.D. Hills, C.L. Brooks Jr, J. Mol. Biol. 368, 894 (2007)
63. V.N. Uversky, A.L. Fink, Biochim. Biophys. Acta 1698, 131 (2004)
64. C. Goldsbury, J. Kistler, U. Aebi, T. Arvinte, G.J. Cooper, J. Mol. Biol. 285, 33 (1999)
65. M. Gobbi, L. Colombo, M. Morbin, G. Mazzoleni, E. Accardo, M. Vanoni, E. Del Favero, L. Cantu, D.A. Kirschner, C. Manzoni, M. Beeg, P. Ceci, P. Ubezio, G. Forloni, F. Tagliavini, M. Salmona, J. Biol. Chem. 281, 843 (2006)
66. M.J. Cannon, A.D. Williams, R. Wetzel, D.G. Myszka, Anal. Biochem. 328, 67 (2004)
67. T. Shirahama, A.S. Cohen, J. Cell Biol. 33, 679 (1967)
68. B. Chakrabarti, S.K. Bose, K. Mandal, J. Indian Chem. Soc. 63, 131 (1986)
69. B. Raman, C.M. Rao, J. Biol. Chem. 269, 27264 (1994)
70. S.C. Rao, C.M. Rao, D. Balasubramanian, Photochem. Photobiol. 51, 357 (1990)
71. L.I. Grossweiner, Curr. Top. Radiat. Res. Quart. 11, 141 (1976)
72. L.I. Grossweiner, A. Blum, A.M. Brendzel, in Trends in Photobiology, ed. by C. Helene, M. Charlier, Th. Montenay-Garestier, G. Laustriat (Plenum, New York, 1982), p. 67
73. D.G. Donne, J.H. Viles, D. Groth, I. Mehlhorn, T.L. James, F.E. Cohen, S.B. Prusiner, P.E. Wright, H.J. Dyson, Proc. Natl. Acad. Sci. U. S. A. 94, 13452 (1997)
74. R. Riek, S. Hornemann, G. Wider, R. Glockshuber, K. Wuthrich, FEBS Lett. 413, 282 (1997)
75. V.A. Lawson, S.A. Priola, K. Wehrly, B. Chesebro, J. Biol. Chem. 276, 35265 (2001)
76. V.A. Lawson, S.A. Priola, K. Meade-White, M. Lawson, B. Chesebro, J. Biol. Chem. 279, 13689 (2004)
77. K.N. Frankenfield, E.T. Powers, J.W. Kelly, Protein Sci. 14, 2154 (2005)
78. Weissmann, J. Biol. Chem. 274, 3 (1999)
79. K.C. Evans, E.P. Berger, C.G. Cho, K.H. Weisgraber, P.T. Lansbury Jr., Proc. Natl. Acad. Sci. U. S. A. 92, 763 (1995)
80. S.J. Wood, W. Chan, R. Wetzel, Biochemistry 35, 12623 (1996)
81. C. Malmo, S. Vilasi, C. Iannuzzi, S. Tacchi, C. Cametti, G. Irace, I. Sirangelo, FASEB J. 20, 346 (2005)

15

Real-Time Observation of Amyloid Fibril Growth by Total Internal Reflection Fluorescence Microscopy

H. Yagi, T. Ban, and Y. Goto

Abstract. Amyloid fibrils form through nucleation and growth. To clarify the mechanism involved, direct observations are important. We developed a unique approach to monitor fibril growth in real time at the single-fibril level using total internal reflection fluorescence microscopy (TIRFM) combined with thioflavin T (ThT), an amyloid-specific fluorescence dye. We succeeded in visualizing the fibril growth with β2-microglobulin (β2-m) and amyloid β peptide. On the basis of significant variations in amyloid morphology revealed by TIRFM, we propose that the taxonomy of amyloid supramolecular assemblies will be useful to clarify the structure–function relationship of amyloid fibrils.

15.1 Introduction

Amyloid fibrils have been a critical subject in recent studies of proteins because they are recognized to be associated with the pathology of more than 20 serious human diseases [1–3]. Additionally, various proteins and peptides that are not related to diseases can also form amyloid-like fibrils, implying that the formation of amyloid fibrils is a generic property of polypeptides. Although no sequence or structural similarity has been found among the amyloid precursor proteins, amyloid fibrils share several common structural and spectroscopic properties. Irrespective of the protein species, electron microscopy (EM) and X-ray fiber diffraction indicate that amyloid fibrils are relatively rigid and straight with a diameter of 10–15 nm and several layers of cross-β sheets. Amyloid fibrils form via a nucleation-dependent process in which nonnative forms of precursor proteins or peptides slowly associate to form a nucleus, which is followed by an extension reaction in which the nucleus grows by the sequential incorporation of precursor molecules. Structural studies using solid-state NMR have shown that amyloid fibrils are stabilized by juxtaposing hydrophobic segments while minimizing electrostatic repulsion [4–6]. From the hydrogen/deuterium exchange of amide protons, amyloid fibrils were shown to be stabilized by an extensive network of hydrogen bonds substantiating the β-sheets [4–7]. On the basis of various approaches, increasingly convincing structural models of amyloid fibrils are emerging.

The heterogeneity of amyloid fibrils has been in focus recently [4–6]. It has been shown that Aβ-amyloid fibrils with different morphological features have different underlying side-chain structures as revealed by solid-state NMR measurements and that both the morphology and molecular structure are self-propagated by seeding [4–6]. A similar observation of the template-dependent propagation of distinct fibrils was made with insulin [8]. More recently, mammalian prion amyloids from different species were shown to differ distinctly in secondary structure and morphology as measured by Fourier transform infrared spectroscopy (FTIR) and atomic force microscopy (AFM), respectively [9]. Importantly, cross-seeding of prion monomers from one species with preformed fibrils from another species produced a new amyloid strain that inherited the secondary structure and morphology of the template fibrils. Strain-specific conformational differences were also found for yeast Sup35 prion amyloid fibrils [10, 11]. These findings may explain the structural basis underlying conformational memory as suggested for prion diseases [12].

To obtain further insight into the structure and heterogeneity of amyloid fibrils, direct observation of individual fibrils is important. Here we describe a unique approach we developed to monitor fibril growth in real time at the single-fibril level [13–17]. On the basis of the observed dramatic diversity and underlying structural basis, we classify amyloid supramolecular assemblies [18].

15.2 Total Internal Reflection Fluorescence Microscopy

TIRFM has been useful for monitoring single molecules because it effectively reduces the background fluorescence by confining excitation to the evanescent field formed on the surface of a quartz slide [19–21] (Fig. 15.1). When a laser is incident on the interface between the quartz slide (high refractive index) and an aqueous solution (low refractive index) at or beyond the critical angle for total internal reflection, an evanescent field is produced beyond the interface in the solution. The illumination is restricted to fluorophores either bound to the quartz slide surface or located close by, resulting in highly reduced background fluorescence. Furthermore, with the careful selection of optical elements, the background fluorescence can be reduced 2,000-fold compared to that in ordinary epi-fluorescence microscopy.
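The dependence of the evanescent field on the optical geometry can be illustrated with a short calculation. The following is a minimal sketch using the standard textbook expression for the intensity decay depth, d = λ / (4π √(n1² sin²θ − n2²)). The refractive indices (1.46 for quartz, 1.33 for water) and the incidence angles are assumed values chosen for illustration and are not taken from the experiments described here.

```python
import math


def critical_angle_deg(n_high: float, n_low: float) -> float:
    """Critical angle (degrees) for light going from n_high into n_low."""
    return math.degrees(math.asin(n_low / n_high))


def penetration_depth_nm(wavelength_nm: float, n_high: float,
                         n_low: float, incidence_deg: float) -> float:
    """1/e intensity decay depth of the evanescent field (nm)."""
    s = (n_high * math.sin(math.radians(incidence_deg))) ** 2 - n_low ** 2
    if s <= 0:
        # Below the critical angle the beam is refracted, not totally reflected.
        raise ValueError("no total internal reflection at this angle")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(s))


# Assumed (illustrative) optical constants, not values from the chapter.
N_QUARTZ = 1.46
N_WATER = 1.33
WAVELENGTH_NM = 455.0  # laser wavelength quoted in the caption of Fig. 15.1

theta_c = critical_angle_deg(N_QUARTZ, N_WATER)
print(f"critical angle: {theta_c:.1f} deg")
for angle in (66.0, 68.0, 72.0):
    depth = penetration_depth_nm(WAVELENGTH_NM, N_QUARTZ, N_WATER, angle)
    print(f"incidence {angle:.0f} deg -> depth {depth:.0f} nm")
```

Under these assumptions the depth falls steeply as the incidence angle moves away from the critical angle, so a depth of the order of 150 nm corresponds to an angle only a few degrees beyond it.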

On the other hand, ThT is a reagent known to become strongly fluorescent upon binding to amyloid fibrils [22], so that one can detect the fibrils specifically without covalent modification. Importantly, because the evanescent field formed by the total internal reflection of the laser light penetrates to a depth of about 150 nm, one can selectively monitor fibrils lying along the slide glass within 150 nm, and thus can obtain the exact length of the fibrils. By combining amyloid fibril-specific ThT fluorescence and TIRFM, it is possible to observe the amyloid fibrils and the process by which they form, without introducing any fluorescence reagent covalently bound to the protein molecule.

15 Real-Time Observation of Amyloid Fibril Growth 291

Fig. 15.1. Schematic representation of amyloid fibrils revealed by total internal reflection fluorescence microscopy. (a) The penetration depth of the evanescent field formed by the total internal reflection of laser light is ∼150 nm for a laser light at 455 nm, so only amyloid fibrils lying parallel to the slide glass surface were observed. (b) Schematic diagram of a prism-type TIRFM system on an inverted microscope. ISIT: image-intensifier-coupled silicon intensified target camera; CCD: charge-coupled device camera

15.3 Real-Time Observation of β2-m and Aβ Fibrils

Real-time observation of the growth of individual β2-m fibrils was carried out at pH 2.5 on the surface of quartz slides (Fig. 15.2) [13]. At time zero, the β2-m seeds appeared as bright fluorescent spots. Then, fibril growth occurred from the seed fibrils, with saturation occurring in a couple of hours when the monomeric β2-m was depleted. The overall time course of fibril growth was similar to that in solution with similar concentrations of seeds and monomers. Intriguingly, most of the fibrils showed unidirectional growth starting from one end of the seeds. Although we cannot exclude the possibility that the interaction with the glass surface was responsible for the unidirectional extension, the unidirectional picture is likely to hold for the formation of fibrils of β2-m and also of Aβ(1–40) (see below).
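The saturation of seeded growth on monomer depletion can be pictured with a simple kinetic sketch. The code below assumes, purely for illustration, that the number of growing ends stays constant and that extension is first order in free monomer (dm/dt = −k m), which gives an added length L(t) = L_max (1 − exp(−t/τ)) with τ = L_max / v0. The model choice, the initial rate v0, and the final length L_max are hypothetical placeholders, not measured values from this work.

```python
import math


def added_length_um(t_min: float, v0_um_per_min: float,
                    l_max_um: float) -> float:
    """Length added to one seed after t_min minutes.

    Assumes a fixed number of growing ends and first-order
    consumption of monomer, so growth slows as monomer runs out.
    """
    if v0_um_per_min <= 0 or l_max_um <= 0:
        raise ValueError("rate and final length must be positive")
    tau_min = l_max_um / v0_um_per_min  # time constant of depletion
    return l_max_um * (1.0 - math.exp(-t_min / tau_min))


# Hypothetical parameters, chosen only to illustrate the curve shape.
V0 = 0.05      # initial extension rate, um per minute
L_MAX = 3.0    # length reached when monomer is exhausted, um
FRAME_TIMES_MIN = (0, 30, 60, 90)  # incubation times shown in Fig. 15.2

lengths = [added_length_um(t, V0, L_MAX) for t in FRAME_TIMES_MIN]
for t, length in zip(FRAME_TIMES_MIN, lengths):
    print(f"t = {t:3d} min  added length = {length:.2f} um")
```

With these placeholder values the curve is nearly linear at early times and levels off over a couple of hours, which is the qualitative behavior described above.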


Fig. 15.2. Direct observation of β2-m amyloid fibril growth obtained by TIRFM. Adapted from ref. [13] with permission. Incubation times are 0, 30, 60, and 90 min

This approach using ThT can be applied to various amyloid fibrils since the binding of ThT is common to amyloid fibrils. This was demonstrated with Aβ(1–40) amyloid fibrils [14], revealing more dramatic images since we could perform the experiments at pH 7.5, where the fluorescence of ThT is much stronger than at pH 2.5 (Fig. 15.3). The growth of fibrils occurred simultaneously at many seeds. Although several fibrils often developed from what appeared to be a single seed, it is likely that clustered seeds produced such a radial pattern. Once started, unidirectional growth continued, producing remarkably long fibrils of more than 15 μm in length. Considering that TIRFM selectively monitors fibrils lying along the slide within 150 nm, we infer that the interaction of fibrils with the quartz surface caused the lateral growth. In addition, the combination of relatively rapid fibril growth and less aggregation of fibrils weakly fixed on the quartz surface enabled the formation of remarkably long fibrils.

The remarkable length of the fibrils enabled an exact analysis of the rate of growth of individual fibrils. The growth at the early and middle stages seems to occur in an all-or-none manner: when the fibril extends, the rate is almost constant (∼0.3 μm min−1) independent of the fibril species. There were cases where the growth paused briefly, possibly because of physical obstacles or local depletion of monomers. When the growth restarted, however, a similar rate of 0.3 μm min−1 was regained. Similar discontinuous growth, termed the stop-and-run mechanism, was also observed during the growth of α-synuclein protofibrils monitored by AFM in situ [23].
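The stop-and-run picture suggests a simple way to analyze a length-versus-time trace of a single fibril: compute the rate over each interval, label intervals as "run" or "pause", and average the rate over the runs only. The sketch below does this for a synthetic trace. The trace, the 2-min sampling interval, and the pause threshold of 0.05 μm min−1 are hypothetical values invented for illustration, and this is not the analysis procedure used in the original work.

```python
from typing import List, Sequence


def segment_rates(times_min: Sequence[float],
                  lengths_um: Sequence[float]) -> List[float]:
    """Growth rate (um per minute) over each consecutive interval."""
    if len(times_min) != len(lengths_um) or len(times_min) < 2:
        raise ValueError("need two or more matching time/length points")
    return [
        (l1 - l0) / (t1 - t0)
        for t0, t1, l0, l1 in zip(times_min, times_min[1:],
                                  lengths_um, lengths_um[1:])
    ]


def classify(rates: Sequence[float],
             pause_threshold: float = 0.05) -> List[str]:
    """Label each interval as 'run' or 'pause' (threshold is an assumption)."""
    return ["run" if r > pause_threshold else "pause" for r in rates]


# Synthetic trace: growth at 0.3 um/min with one pause between 4 and 6 min.
times = [0, 2, 4, 6, 8, 10]                 # minutes
lengths = [0.0, 0.6, 1.2, 1.2, 1.8, 2.4]    # micrometers

rates = segment_rates(times, lengths)
states = classify(rates)
run_rates = [r for r, s in zip(rates, states) if s == "run"]
mean_run_rate = sum(run_rates) / len(run_rates)

print(states)
print(f"mean rate while running: {mean_run_rate:.2f} um/min")
```

Averaging over the run intervals only, rather than over the whole trace, keeps pauses from lowering the apparent extension rate, which is the point of the all-or-none description.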

15.4 Effects of Various Surfaces on the Growth of Aβ Fibrils

The size and the shape of fibrils, as well as the kinetics of formation, are dependent on the physicochemical nature of the surface [24–26]. We studied the effects of the physicochemical properties of the surface on the growth of amyloid fibrils of Aβ [15]. Using specific chemical modifications, it is possible to modify the properties of the quartz surface, both in terms of net charge and hydrophobicity. We observed the seed-dependent formation of Aβ(1–40) fibrils on the surface of various chemically modified substrates that were created either by alternate adsorption of polyelectrolytes or with self-assembled monolayers of silanes.

Fig. 15.3. Direct observation of Aβ(1–40) amyloid fibril growth by TIRFM. Real-time monitoring of fibril growth on glass slides. Arrows indicate the unidirectional growth of Aβ from a single seed fibril. The scale bar represents 10 μm. Reproduced from [14] with permission

In the presence of the Aβ(1–40) seed fibrils, enhanced fibril formation was observed on negatively charged surfaces, including quartz and polyethyleneimine (PEI)/polyvinylsulfonate (PVS). On quartz, intense growth led to remarkably long fibrils as reported previously [14]. We often observed radial growth patterns suggesting the presence of clustered seeds. Extensive fibril formation was generally observed on the surfaces with negative charges, regardless of whether they were modified by a polyelectrolyte or silane. In contrast, fibril growth was largely suppressed on positively charged or hydrophobic surfaces. Aβ(1–40) is negatively charged at pH 7.5, suggesting that tight interactions between Aβ(1–40) and these surfaces prevent fibril growth.


Fig. 15.4. Real-time observations of the formation of Aβ(1–40) spherulite. Real-time observations of Aβ(1–40) amyloid fibril growth on PEI/PVS at pH 7.5 and 37 °C. Concentrations of Aβ(1–40) monomers, seeds, and ThT were 50 μM, 5 μg ml−1, and 5 μM, respectively. White arrows in panels of 0–20 min indicate the hazy area detected before clear images of spherical amyloid fibrils were obtained. At time zero, large clusters were not observed on the surface. At 10 min, hazy globular objects were identified. At 15 min, fibrils emerged. Fibrils grew both in size and number with time, forming huge spherical amyloid assemblies with a radius of more than 20 μm at 120 min. Reproduced from [15] with permission

Fibril growth was especially prominent on the surfaces covered with PEI/PVS, highly negatively charged and hydrophilic polyelectrolytes (Fig. 15.4). We initially presumed that the growth of fibrils on the PEI/PVS initiated from large clustered seeds attached to the surface. However, the real-time observation revealed striking images of fibril growth, producing huge spherical assemblies with a densely packed radial pattern (Fig. 15.4). Importantly, no branching of the growing ends was observed as on quartz.

Considering that TIRFM illumination has a depth of penetration of ∼150 nm and the depth of focus of the objective lens is about 100 nm, we conclude that the large clusters of seeds formed at first in solution and were not in contact with the substrate. The hazy areas observed at the initial stages, as indicated by the arrows in Fig. 15.4, may represent the clustered seeds or aggregated intermediates formed in solution. Since the thickness of the water medium estimated from the fine-focus stroke between the quartz slide and cover slip is about 10 μm, the spherical assemblies observed here are in fact flattened spheres. The surface used for TIRFM observation was located on the upper side of the cell, so the clustered fibrils on the surface were not deposited by gravitational force.

Most importantly, these spherulitic structures resemble the amyloid core of senile plaques observed in the cerebral cortices of patients suffering from Alzheimer’s disease [27]. Similar spherical amyloid deposits are observed in a mouse model of Alzheimer’s disease [28], in patients with Creutzfeldt–Jakob disease [29], and in several other neurodegenerative diseases [30], indicating that they are a common architectural feature of fibrils. Furthermore, spherulites were observed in vitro in many systems including natural and synthetic polymers, for example in insulin [31, 32], pathogenic immunoglobulin chains [33], β-lactoglobulin [34], and synthetic peptides [35], again indicating that they are a common architectural feature of the fibers. We consider that the senile plaque-like spherical objects observed here correspond to “spherulites”, a higher-order spherical assembly of amyloid fibrils ranging in diameter from 10 to 150 μm. Under a polarizing light microscope, spherulites exhibit a typical “Maltese-cross” extinction pattern [31].

15.5 Spontaneous Formation of Aβ(1–40) Fibrils and Classification of Morphologies

We also studied the spontaneous formation of Aβ(1–40) fibrils without seeds on quartz slides [18]. Spontaneous fibrillation of Aβ(1–40), accelerated by a low concentration of sodium dodecyl sulfate and a high concentration of sodium chloride under the quartz slides, produced various remarkable amyloid assemblies. Densely packed spherulitic structures with radial fibril growth were typically observed. When the packing of fibrils was coarse, extremely long fibrils often protruded from the spherulitic cores. In other cases, a large number of worm-like fibrils were formed. TEM and AFM revealed relatively short and straight fibrillar blocks associated laterally without tight interaction, leading to a random-walk-like fibril growth. These results suggest that, during spontaneous fibrillation, the nucleation occurring in contact with surfaces is easily affected by environmental factors, creating various types of nuclei, and hence variations in amyloid morphology.

On the basis of the various assemblies of Aβ(1–40) fibrils formed both with and without seeds, we distinguish three basic types of amyloid supramolecular fibrillar assemblies (Fig. 15.5).

Type I: Basic straight and rigid fibrils with a diameter of about 10–15 nm. Although tremendous lengths can be achieved without lateral association, as observed for the seed-dependent growth on the quartz surface, the preparation in solution tends to form clustered fibrils. Precursors of mature amyloid fibrils can be oligomeric species, protofilaments, or initial short fibrils. Variation in morphology can arise at the level of these amyloid precursors. On the other hand, it is possible that different precursors as shown here produce similar mature fibrils. Thus, although it is clear that interactions with surfaces at the early stages affect the final morphological features, the relationships between amyloid precursors and final products remain unclear. This is also true for type II and III fibrils below.

Fig. 15.5. Schematic models of supramolecular fibrillar assemblies of Aβ(1–40) fibrils. Variation in morphology can arise at the level of oligomeric species, protofilaments, or initial short fibrils. They associate together on the quartz surface, creating three types of supramolecular fibrillar assemblies: straight fibrils (Type I), spherulitic assemblies (Type II), and worm-like fibrils (Type III). A mixed architecture of type I and type II fibrils (Type I/II) was also observed when the internal density is coarse. It is to be noted that the different precursors are represented together in a box and that the relationships between amyloid precursors and final products remain unclear. Reproduced from [18] with permission

Type II: Spherulitic amyloid assemblies typically made of type I basic fibrils. The worm-like fibrils (Type III, see below) can also form spherulitic assemblies. Spherulitic structures were observed in the spontaneous growth of Aβ(1–40) fibrils as well as in seed-dependent growth. The diameter reaches more than 30 μm. Probably, the clustered seeds or precursors initiate the fibril growth in a radial pattern. Internal density varies depending on the spherulitic assembly. Intriguingly, a densely packed spherulitic interior ensures concerted growth, producing globular architectures. On the other hand, when the internal density is coarse, independent growth of constituent fibrils occurs, making a unique mixed architecture of type I and II fibrils, reminiscent of nerve synapses.

Type III: Another intriguing morphology is that of the worm-like fibrils. Although the TIRFM images suggest flexible fibrils, the TEM and AFM images clarified that the worm-like fibrils are in fact made of rigid fibril blocks associated laterally. Incomplete lateral association results in curvature of the longitudinal axis, producing the random-walk-like fibril growth. This incomplete lateral association may also produce branching of fibrils at the growing ends. Thus, in internal structure, the worm-like fibrils of Aβ(1–40) are distinct from the flexible and thin protofilaments often observed for other amyloids [36, 37]. On the other hand, the remarkable length suggests that the nucleation of the worm-like fibrils does not occur frequently. As far as we know, an architecture as unique as that of type III fibrils has not been reported previously. These results suggest that the amyloid fibrils have high potential to form various high-order structures. We anticipate that the present classification will apply to various amyloid fibrils.

15.6 Conclusion

We visualized the formation of amyloid fibrils in real time at the single-fibril level. On the basis of the unique images of fibrils, we classified the amyloid supramolecular fibrillar assemblies of Aβ(1–40) fibrils into three basic types: rigid and straight type I fibrils, spherulitic type II fibrils, and worm-like type III fibrils (Fig. 15.5). This classification is likely to be applicable to the fibrils of other proteins as well. Considering the increased morphological variability in the spontaneous fibrillation, interactions with surfaces at the early stages determine the final morphological features. Different amyloid supramolecular assemblies will have distinct biological impacts on the development and, furthermore, transmission of amyloidosis. Thus, clarifying the structural basis leading to the various types of amyloid fibrils at the different levels, from the structure of amyloid precursors to protofilament packing and interfibrillar interactions, is an important next step. The anatomy and taxonomy of amyloid supramolecular assemblies will be critical to the progress in amyloid structural biology.

Acknowledgments

We would like to acknowledge Hironobu Naiki (Fukui University), Tetsushi Wazawa (Tohoku University), Kenichi Morigaki (AIST), and Daizo Hamada (Kobe University) for their support and encouragement. This work was supported by the Grants-in-Aid from the Japanese Ministry of Education, Culture, Sports, Science and Technology, and by the Japan Society for the Promotion of Science (JSPS) Research Fellowships for Young Scientists to TB.


References

1. J.C. Rochet, P.T. Lansbury Jr., Curr. Opin. Struct. Biol. 10, 60 (2000)
2. C.M. Dobson, Nature 426, 884 (2003)
3. V.N. Uversky, A.L. Fink, Biochim. Biophys. Acta 1698, 131 (2004)
4. A.T. Petkova, R.D. Leapman, Z. Guo, W.M. Yau, M.P. Mattson, R. Tycko, Science 307, 262 (2005)
5. C. Wasmer, A. Lange, H. Van Melckebeke, A.B. Siemer, R. Riek, B.H. Meier, Science 319, 1523 (2008)
6. C. Ritter et al., Nature 435, 844 (2005)
7. M. Hoshino, H. Katou, Y. Hagihara, K. Hasegawa, H. Naiki, Y. Goto, Nat. Struct. Biol. 9, 332 (2002)
8. W. Dzwolak, V. Smirnovas, R. Jansen, R. Winter, Protein Sci. 13, 1927 (2004)
9. E.M. Jones, W.K. Surewicz, Cell 121, 63 (2005)
10. M. Tanaka, P. Chien, K. Yonekura, J.S. Weissman, Cell 121, 49 (2005)
11. M. Tanaka, P. Chien, N. Naber, R. Cooke, J.S. Weissman, Nature 428, 323 (2004)
12. P. Chien, J.S. Weissman, A.H. DePace, T.M. Annu. Rev. Biochem. 73, 617 (2004)
13. T. Ban, D. Hamada, K. Hasegawa, H. Naiki, Y. Goto, J. Biol. Chem. 278, 16462 (2003)
14. T. Ban, M. Hoshino, S. Takahashi, D. Hamada, K. Hasegawa, H. Naiki, Y. Goto, J. Mol. Biol. 344, 757 (2004)
15. T. Ban et al., J. Biol. Chem. 281, 33677 (2006)
16. T. Ban, K. Yamaguchi, Y. Goto, Acc. Chem. Res. 39, 663 (2006)
17. T. Ban, Y. Goto, Methods Enzymol. 413, 91 (2006)
18. H. Yagi, T. Ban, K. Morigaki, H. Naiki, Y. Goto, Biochemistry 46, 15009 (2007)
19. T. Funatsu, Y. Harada, M. Tokunaga, K. Saito, T. Yanagida, Nature 374, 555 (1995)
20. R. Yamasaki et al., J. Mol. Biol. 292, 965 (1999)
21. T. Wazawa, M. Ueda, Adv. Biochem. Eng. Biotechnol. 95, 77 (2005)
22. H. Naiki, K. Higuchi, M. Hosokawa, T. Takeda, Anal. Biochem. 177, 244 (1989)
23. W. Hoyer, D. Cherny, V. Subramaniam, D.M. Jovin, J. Mol. Biol. 340, 127 (2004)
24. T. Kowalewski, H.K. Holtzman, Proc. Natl. Acad. Sci. U. S. A. 96, 3688 (1999)
25. G.H. Blackley, M.C. Sanders, C.J. Davies, S.J. Roberts, M.J. Tendler, P.O. Wilkinson, J. Mol. Biol. 298, 833 (2000)
26. M. Zhu, S.A. Souillac, C. Ionescu-Zanetti, A.L. Carter, L.W. Fink, J. Biol. Chem. 277, 50914 (2002)
27. Y.G. Jin et al., Proc. Natl. Acad. Sci. U. S. A. 100, 15294 (2003)
28. K. Hsiao et al., Science 274, 99 (1996)
29. L. Manuelidis, W. Fritch, J.D. Xi, Science 277, 94 (1997)
30. P.T. Harper, M.R. Lansbury Jr., Annu. Rev. Biochem. 66, 385 (1997)
31. C.E. Krebs, A.F. Macphee, I.E. Miller, C.M. Dunlop, A.M. Dobson, S.S. Donald, Proc. Natl. Acad. Sci. U. S. A. 101, 14420 (2004)
32. M.R. Rogers, E.H. Krebs, A.M. Bromley, E. van der Linden, L.M. Donald, Biophys. J. 90, 1043 (2006)
33. R. Raffen et al., Protein Sci. 8, 509 (1999)
34. D.M. Sagis, C. Veerman, E. van der Linden, Langmuir 20, 924 (2004)
35. Y. Fezoui, D.M. Hartley, D.J. Walsh, J.J. Selkoe, D.B. Osterhout, D.P. Teplow, Nat. Struct. Biol. 7, 1095 (2000)
36. V.J. Hong, M. Gozu, K. Hasegawa, H. Naiki, Y. Goto, J. Biol. Chem. 277, 21554 (2002)
37. T.I. McParland et al., Biochemistry 39, 8735 (2000)


Index

N-formyl kynurenine, 274
θ point, 44
α-synuclein, 252, 268, 273, 292
αB-crystallin, 268
α-lactalbumin, 13, 14
α,α-1,1 linkage, 225, 231
α,α-1,1-glycosidic linkage, 219, 224
β-lactoglobulin, 295
β-sheets, 290
β2-microglobulin, 268, 273, 289
γ-crystallin, 272
Φ-value analysis, 13, 24
17O-NMR spectroscopy, 222
31P NMR, 235
p–T, 174
Aβ-peptide, 252, 257–259, 272
CH/π hydrogen bond, 145
Ca2+-binding protein, 14
(oligomeric) species, 260
l-cysteine, 274
“Maltese-cross” extinction pattern, 295
“condensation-ordering” mechanism of aggregation, 249
3D distribution function, 196, 198, 200, 206
3D-RISM, 190, 192, 196, 200, 202, 205, 207, 208

Aβ(1–40) amyloid fibrils, 292
Aβ-amyloid fibrils, 290
Aβ, 255, 291
AA amyloidosis, 246
ab initio shape prediction, 137
accelerated molecular dynamics, 212, 213
acetylcholinesterase, 213, 214
actin, 217
actin filaments, 251
active site, 213, 215, 216
acylphosphatase, 243, 250
AFM, 290
ageing, 262
aggregation, 241
AL amyloidosis, 246
alanine dipeptide, 80
Alzheimer’s, 245
Alzheimer’s disease, 246, 253, 261, 268, 295
Alzheimer’s disease and Type II diabetes, 241
amylin, 252
amyloid, 245, 268
amyloid β, 289
amyloid diseases, 256
amyloid fibril, 249–251, 289
amyloid intermediate, 269
amyloid supramolecular assemblies, 290
amyloid supramolecular fibrillar assemblies, 297
amyloidogenesis, 267
amyloidogenic, 267
amyotrophic lateral sclerosis, 246
analytical generalized Born plus non-polar (AGBNP), 99
anhydrobiosis, 219, 229
ankyrin-repeat, 130


antibodies, 261
antioxidant function, 235
ApoAI amyloidosis, 246
apomyoglobin, 1
AppA, 149, 157
Arctic mutation, 259
association, 149
atomic force microscopy (AFM), 248, 290
ATP hydrolysis, 217
ATP synthases, 216

B1 domain of streptococcal protein G, 88
bacteriorhodopsin, 86
binding, 215, 216
binding free energy, 201
biological evolution, 262
biological self-assembly, 241
biological switch, 127
biosensor, 150
BLUF, 157
Boltzmann factor, 63
Brownian dynamics simulations, 215

C-peptide of ribonuclease A, 78
C. elegans, 260
calcium binding protein, 204
calcium binding site, 203
cancer, 244
capillary method, 150
catalase, 274
cellular, 214
channel, 213, 214
chaperone, 253
chaperonin, 214
chemical chaperone, 220
chromatin structure, 57
circular dichroism, 124
closure relations, 193
cluster analysis, 29
coarse-grained models, 215
coil-globule transition, 43, 44
computational, 211
computer simulation, 243
computer simulation methods, 248
conformation, 217
conformation change, 149
conformational changes, 138, 216
conformational ensemble, 124
conformational fluctuations, 212, 214
conformational substates, 212, 213
Congo red, 247
coordination numbers, 201
coupled folding and binding, 124
Creutzfeldt–Jakob disease, 253, 295
cross-β, 247
cross-β structure, 249
cross-β sheets, 289
crowded, 215
crowded molecular environment of the cell, 242
cryo-electron microscopy, 248
cryptic binding site, 214
cystic fibrosis, 244
cytochrome c, 154
cytochrome P450, 110
cytoskeleton, 217

degradation, 260
dehydration penalty, 188, 202, 236
densitometric studies, 178
density of states, 64
density pair distribution function, 190
desiccation tolerance, 230
diagrams, 174
dialysis-related amyloidosis, 260, 268
differential scanning calorimetry (DSC), 226
differentiation, 242
diffusion, 212, 214
diffusion coefficient, 149, 150, 154
diffusion detected biosensor, 168, 170
diffusion peak, 166
diffusion-controlled, 213
dimer, 161, 162
dimerization, 161, 163
direct correlation function, 191
dissociation, 149, 163
disulphide bond formation, 244
DNA condensation, 40
docking simulation, 188
donor–acceptor distance, 144
dose-response curve, 201
Drosophila melanogaster, 257
drug design, 188
drug discovery, 211, 214
drugs, 215


DSC, 226, 231
dynamics, 216

electron microscopy, 289
electron transfer, 217
endoplasmic reticulum (ER), 245
energy landscape, 212, 213, 216, 243
energy surface, 243
enthalpy change, 152
enthalpy relaxation, 228
enzymatic reaction, 187, 208
enzyme, 212, 213
enzyme dihydrofolate reductase, 216
evolution, 211, 213
evolutionary selection, 241

familial amyloidotic polyneuropathy, 246
familial diseases, 253
fibril extension, 272
fibrillogenesis, 271
fibrils, 291
fibronectin, 214
final slope, 142
Finnish hereditary amyloidosis, 246
fluctuation analysis, 138
FMN, 163
folding, 13, 138
folding intermediate, 15
folding of proteins, 241
folding pathway, 1
Fourier transform infrared (FTIR) spectroscopy, 225, 290
fractal dimension, 142
fringe length, 151
fruit fly, 242, 257
FTIR, 225
FTIR imaging spectroscopy, 231
FTIR spectra, 233
functional unfolded proteins, 122
funnel, 243

G proteins, 216
G-peptide, 104
gain of function diseases, 254
gate, 214
gate dynamics, 214
gated, 215
gated binding, 215
gating, 213, 214
GDP, 216
gel-to-liquid crystalline temperature, 234
gene expression, 256
gene therapy, 261
generalized Born, 98
generalized ensemble, 61
generalized Langevin equation, 208
generalized-ensemble algorithm, 61, 63
generic feature of, 248
glass transition temperatures, 228
glassy state, 232
gluco-disaccharides, 228
good solvent, 44
GTP, 216

hemodialysis-related amyloidosis, 246
hen egg-white lysozyme, 196
hereditary, 247
hereditary cerebral haemorrhage with amyloidosis, 247
high-angle solution X-ray scattering, 137, 139
high-pressure structure, 205
hinge-bending motions, 21
histones, 56
HIV integrase, 214
HIV protease enzyme, 215
HIV-1 integrase inhibitor, 215
HNC closure, 193
housekeeping mechanisms, 255
human lysozyme, 202
Huntington’s disease, 246
hydration, 30, 221, 223
hydration number, 222
hydration structure, 197
hydrodynamic radius, 154
hydrodynamic volume, 221
hydrogen abstraction reaction, 236
hydrogen bonds, 213, 290
hydrogen exchange, 23
hydrogen exchange pulse labeling, 2
hydrogen/deuterium exchange, 289
hydrophobic collapse, 1
hydrophobic interaction, 199
hydrophobicity, 293
hydroxyl, 274


immune response, 244
immunoglobulin, 295
immunoglobulin-like domains, 130
inelastic light scattering, 223
influenza, 217
inhibitor of NF-κB (IκB), 129
injection-localised amyloidosis, 247
interprotein interaction, 168
intrachain segregation, 41, 42
intramolecular correlation function, 192
intrinsically disordered, 123
ion channel, 208
Isentress, 214
isobaric-isothermal ensemble, 67
isomerization, 138

KH closure, 193
kinetic measurement, 139
kinetic model of the G-Peptide, 108
kinetics, 214, 216
Kirkwood–Buff equation, 194
Kuhn segment, 43
Kuru, 253
kynurenine, 274

lactate dehydrogenase, 216
landscape, 243
late embryogenesis abundant (LEA) proteins, 234
LBHB, 144
Le Chatelier’s law, 205, 206
ligand binding, 214
ligand binding sites, 199
ligand–receptor binding, 213
liquid crystalline state, 234
locomotor defects, 257
loss of function diseases, 244
LOV, 163
low-barrier hydrogen bond, 144
lysozyme, 198, 200, 252, 260
lysozyme amyloidosis, 246

maltose, 221
mannitol, 274
maximum dimension, 137
MC, 61
MD, 61
MD simulation, 224
MD unfolding simulations, 18, 26
mean activity coefficient, 202
Medullary carcinoma of the thyroid, 246
membranes, 217
methionine, 13
Metropolis algorithm, 64
model organisms, 262
molecular chaperones, 244, 254, 255
molecular clocks, 216
molecular dynamics, 13, 61, 207, 212, 215
molecular dynamics simulation, 138, 217
molecular evolution, 253
molecular Ornstein–Zernike (MOZ), 191
molecular recognition, 187, 207
molten globule, 2, 15, 23, 139
Monte Carlo, 61, 207
motions, 211
MREM, 63
MUCA, 62
multibaric-multithermal, 80
multibaric-multithermal algorithm, 68
multicanonical algorithm, 61–63
multicanonical ensemble, 67
multidimensional replica-exchange method, 63
multiple binding partners, 123
multiple hydrogen bonds, 236
myoglobin, 153, 212–214, 248

N-terminal, 13
N-terminal methionine, 16
NADH, 216
NADPH, 216
nanotechnology, 248
native and amyloid structures, 250
natural selection, 256
negative phototaxis, 138
nerve, 213
net charge, 292
network model of protein folding, 103
neurodegenerative diseases, 245, 246
neuromuscular junctions, 213
neuronal dysfunction, 258, 259
neurotransmitter acetylcholine, 213
neutron crystallography, 145
new view of protein folding, 244
nicotinamide, 216


NMR, 211
NMR relaxation, 1
NMR spectroscopy, 1, 124
noble gas, 199
non-neuropathic localised amyloidoses, 246
non-neuropathic systemic amyloidoses, 246
nonequilibrium, 214
nonpolar hydration, 101
Nose-Andersen, 68
nuclear factor-kappaB (NF-κB), 129
nuclear localization signal, 130
nuclear magnetic resonance, 212
nucleation, 272, 295
nucleosome, 56
nucleus, 289

off-pathway, 270
old age, 256, 261
oligomeric or pre-fibrillar aggregates, 245
oligomeric species, 295
oligomerization, 149
on the edge, 258
OPLS all-atom force field, 100
Ornstein–Zernike equation, 191

P. vanderplanki, 230
pair correlation function, 191
pancreatic trypsin inhibitor, 212
Parkinson’s disease, 245, 246, 268
partial molar compressibility, 222
partial molar enthalpy, 83
partial molar heat capacity, 222
partial molar volume, 84, 194, 204, 205, 221
partial molar volume of proteins, 195
PAS domain, 138
pearl-necklace globule, 52
Percus trick, 193
peroxyl radicals, 274
persistent length, 39
phase images, 282
phospholipids, 233
photoactive yellow protein, 137, 138, 149, 155
photodissociation, 167
photoexcitation, 217
photointermediate, 137, 139
photooxidation, 273
photoreceptor, 138
photosensor, 148
photosignal transduction, 138
photosynthetic reaction center, 217
phototropins, 149, 163
plasticizer, 229
polyalanine, 248
polylysine, 248
polymorphism, 225
Polypedilum vanderplanki, 219, 230
polypeptide, 215
polyQ, 247
polythreonine, 248
poor solvent, 44
population grating, 152
positron annihilation lifetime spectroscopy, 227
pre-fibrillar aggregates, 255
preferential hydration model, 235
prenucleation stage, 268
pressure denaturation of protein, 204
pressure perturbation calorimetry, 173
pressure unfolding studies, 174
prion amyloids, 290
prion disorders, 253
prion protein, 268, 273
product release, 211, 216, 217
proline isomerisation, 244
protein, 216, 217
protein aggregation, 235, 236
protein dynamics, 211, 212
protein folding, 1, 13, 61, 97, 207
protein folding, misfolding and aggregation, 262
protein loop prediction, 103
protein misfolding, 260
protein misfolding diseases, 241
protofilaments, 250, 268, 282, 295
proton transfer, 144
PYP, 138, 155

quality control mechanisms, 244

radial distribution function, 190
radius of gyration, 137
raltegravir, 214
Raman spectroscopy, 223

random coil, 40
rate constants, 215
Rayleigh instability, 52
real-time observation, 291
refractive index, 150
regulation of cell growth, 244
regulation of cellular growth, 242
relaxation dispersion, 8
release, 216
REM, 62
reorganization energy, 217
replica exchange molecular dynamics, 102
replica-exchange method, 61, 62, 71
residual entropy of the ordinary ice, 76
ribosome, 244, 254
rings-on-a-string, 52, 53
RISM, 190, 192, 207, 208
rubber state, 232

salt bridge, 217
secondary structure packing, 137
segmental Q-coordinates, 27
selective ion-binding, 201
selectivity, 214
self-assemble, 241
self-assembly, 267
semiflexible polymers, 43
senile plaques, 295
senile systemic amyloidosis, 246
SH3 domain, 250
signal transduction, 216
signaling networks, 123
simulating replica exchange, 112
simulation, 211, 214, 215
single-molecule, 212
single-residue mutations, 258
singlet oxygen, 274
small heat shock proteins, 267
solid-state NMR, 248
solubility, 241, 252, 260
solution X-ray scattering, 137
solvation free energy, 194
species grating, 152
spectral silent processes, 150
spherulites, 295
spider silk, 251
spin label, 4
spongiform encephalopathies, 245, 246
spontaneous fibrillation, 295
sporadic, 247
SPR, 169
staphylococcal nuclease, 176
static measurement, 139
stem-cell techniques, 261
stereospecificity, 216
Stokes–Einstein equation, 154
stop-and-run mechanism, 292
stopped-flow circular dichroism, 17
strong short hydrogen bond, 144
structural change, 137
subprotofibrils, 268, 282
substrate binding, 211, 216
sucrose, 221
Sup35, 290
superoxide, 274
superoxide dismutase (SOD), 274
supramolecular, 214
surface plasmon, 150
surface plasmon resonance (SPR), 150
synapses, 214

Taylor dispersion, 150
terahertz absorption spectroscopy, 222
therapeutic intervention, 261
thermal expansivity, 177
thermal expansivity and ΔV, 179
thermal grating, 152
thioflavin T, 289
third-generation synchrotron radiation sources, 138
three-dimensional distribution, 194
three-dimensional reference interaction site model (3D-RISM), 189
time dependence, 211
timescales, 215
TIRFM, 289
total correlation function, 191
total internal reflection, 290
total internal reflection fluorescence microscopy, 289
toxicity, 255, 258
trafficking, 244
trafficking of molecules, 242
transcriptional activator CBP, 126
transient grating, 149
transient grating method, 139
transition state, 13, 24, 29

translocation, 244
transmissible spongiform encephalopathies (TSE), 268
trehalose, 219, 221
trehalose transporter, 237
triosephosphate isomerase, 213
type II diabetes, 245, 246

ubiquitin, 205
unfolding, 13, 138
unfolding pathway, 29
unsaturated fatty acid, 235
UV Light, 272

viscosity, 221
vitrification hypothesis, 229
volume change, 175
volume grating, 152

water, 212, 213, 241

water channel, 227

water entrapment hypothesis, 229

water replacement hypothesis, 229

water stresses, 229

water structure breaker, 223, 225

water structure maker, 225, 234

water with biomolecules, 261

water-binding sites, 197

WHAM, 61

worm-like fibrils, 295, 297

X-ray diffractometry, 226

X-ray fiber diffraction, 248, 289

xenon, 200

xenon sites, 198