designing rna structure and functionpks/presentation/heidelberg-1312.pdf · designing rna structure...
TRANSCRIPT
Designing RNA Structure and Function A Toolbox for Synthetic Biology
Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria and
The Santa Fe Institute, Santa Fe, New Mexico, USA
Synthetic Biology – From understanding to application
DKFZ-Heidelberg, 09.– 11.12.2013
Web-Page for further information:
http://www.tbi.univie.ac.at/~pks
Prologue
The goals of synthetic biology
A general method for genetically encoding unnatural amino acids in live cells.
Qian Wang, Angela R. Parrish, Lei Wang. Expanding the genetic code for biological studies. Chemistry & Biology 16:323-336, 2009.
Lei Wang, Peter G. Schultz. Expanding the genetic code. Angew.Chem.Int.Ed. 44:34-66, 2005.
1. One RNA sequence – one structure
2. Many RNA sequences – one structure
3. One RNA sequence – many structures
4. RNA switches
1. One RNA sequence – one structure
2. Many RNA sequences – one structure
3. One RNA sequence – many structures
4. RNA switches
one sequence one structure function
The paradigm of structural biology
GCGGA UUGCACCA
OCH2
OHO
O
PO
O
O
N1
OCH2
OHO
PO
O
O
N2
OCH2
OHO
PO
O
O
N3
OCH2
OHO
PO
O
O
N4
N A U G Ck = , , ,
3' - end
5' - end
Na
Na
Na
Na
5'-end 3’-endGCGGAU AUUCGCUUA AGUUGGGA G CUGAAGA AGGUC UUCGAUC A ACCAGCUC GAGC CCAGA UCUGG CUGUG CACAG
Definition of RNA structure
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
Criterion: Minimum free energy (mfe)
Rules: _ ( _ ) _ {AU,CG,GC,GU,UA,UG}
RNA sequence
RNA structure of minimal free energy
RNA folding:
Structural biology, spectroscopy of biomolecules, understanding molecular function
Empirical parameters
Biophysical chemistry: thermodynamics and kinetics
Sequence, structure, and design
Vienna RNA-Package
Version 2.0 http://www.tbi.univie.ac.at
RNA folding into secondary structures
Ivo L. Hofacker, Walter Fontana, Peter F. Stadler, Sebastian Bonhoeffer, Manfred Tacker, Peter Schuster. 1994. Fast folding and comparison of RNA secondary structures. Monath. Chem. 125:167-188.
Michael S. Waterman, T. F. Smith. 1978. RNA secondary structures: A complete mathematical analysis. Math.Biosci. 42:257-266.
Ruth Nussinov, A. B. Jacobson. 1980. Fast algorithm for predicting the secondary structure of single-stranded RNA. Proc.Natl.Acad.Sci. USA 77:6309-6313.
Michael Zuker, P. Stiegler. 1981. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9:133-148.
Conventional definition of RNA secondary structures
Restrictions on physically acceptable mfe-structures: 3 and 2
RNA sequence
RNA structure of minimal free energy
RNA folding:
Structural biology, spectroscopy of biomolecules, understanding molecular function
Inverse folding of RNA:
Biotechnology, design of biomolecules with predefined structures and functions
Inverse Folding Algorithm
Iterative determination of a sequence for the given secondary structure
Sequence, structure, and design
Vienna RNA-Package
Version 2.0 http://www.tbi.univie.ac.at
Inverse folding algorithm I0 I1 I2 I3 I4 ... Ik Ik+1 ... It S0 S1 S2 S3 S4 ... Sk Sk+1 ... St Ik+1 = Mk(Ik) and dS(Sk,Sk+1) = dS(Sk+1,St) - dS(Sk,St) < 0
M ... base or base pair mutation operator
dS (Si,Sj) ... distance between the two structures Si and Sj ‚Unsuccessful trial‘ ... termination after n steps
Approach to the target structure Sk in the inverse folding algorithm
Target structure Sk
Initial trial sequences
Target sequence
Stop sequence of anunsuccessful trial
Intermediate compatible sequences
Intermediate compatible sequences
Department of Statistics University of Oxford
Rune B. Lyngsø, James J.W. Anderson, Elena Sizikova, Amarendra Badugu, Thomas Hyland, Jotun Hein. 2012. fRNAkenstein: Multiple traget inverse RNA folding. BMC Bioinformatics 13:e260.
Mirela Andronescu, Antony P. Fejes, Frank Hutter, Holger H. Hoos, Anne Condon. 2004. A new algorithm for RNA secondary structure design. J. Mol. Biol. 336:607-624.
Department of Computer Science University of British Comlumbia
Vancouver, BC, Canada
Robert M. Dirks, Milo Lin, Erik Winfree, Niles A. Pierce. 2004. Paradigms for computational nucleic acid design. Nucleic Acids Research 32:1392-1403.
California Institute of Technology Pasadena, CA, USA
RNA inverse folding: Secondary structures sequences
1. One RNA sequence – one structure
2. Many RNA sequences – one structure
3. One RNA sequence – many structures
4. RNA switches
The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
Criterion: Minimum free energy (mfe)
Rules: _ ( _ ) _ {AU,CG,GC,GU,UA,UG}
N= 4n
NS < 3n
Every point in sequence space is equivalent
Sequence space of binary sequences with chain length n = 5
Sketch of structure space
Structures are not equivalent in structure space
many genotypes one phenotype
Evolution as a global phenomenon in genotype space
1. One RNA sequence – one structure
2. Many RNA sequences – one structure
3. One RNA sequence – many structures
4. RNA switches
Computation of suboptimal secondary structures
Michael Zuker. On finding all suboptimal foldings of an RNA molecule. Science 244 (1989), 48-52 Stefan Wuchty, Walter Fontana, Ivo L. Hofacker, Peter Schuster. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers 49 (1999), 145-165
Interconversion of suboptimal structures
Base pair probability derived from the partition function Q(T)
( ) ( ) ( ) ( ) ( ) ( )( ) ( )
∑∑
∑∑
≠
−−
−=−=
=
==
ijj ijiij ijiji
k k
kTkkkijk kij
pppps
TTQ
TQegTSaTTXp k
,
/
1withln
/with, 0
γ
γγ εε base pair probability
base pairing entropy
John S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structures. Biopolymers 29:1105-1119, 1990.
3'
5'
UUGGAGUACACAACCUGUACACUCUUUC
Example of a small RNA molecule: n=28
Example of a small RNA molecule with two low-lying suboptimal conformations which contribute substantially to the partition function
„Dot plot“ of the minimum free energy structure (lower triangle) and the partition function (upper triangle) of a small RNA molecule (n=28) with low energy suboptimal configurations
U U G G A G U A C A C A A C C U G U A C A C U C U U U C
U U G G A G U A C A C A A C C U G U A C A C U C U U U CC
UU
UC
UC
AC
AU
GU
CC
AA
CA
CA
UG
AG
GU
U UU
GG
AG
UA
CA
CA
AC
CU
GU
AC
AC
UC
UU
UC
U U G G A G UACACA A C
CUGUACACUCU
UUC
U UG
GA
GUA
C AC
AA
CCU
GUA
CAC
UC
U
UUC
U UG
GAGU
AC
ACA A
CC
UG
UA
CA
CUC
UU
UC
second suboptimal configuration
first suboptimal configuration
minimum free energyconfiguration
∆E = 0.55 kcal / mole0→2
∆E = 0.50 kcal / mole0 1→
G = - 5.39 kcal / mole0
3'
5'
Kinetic folding of RNA secondary structures
Michael T. Wolfinger, W. Andreas Svrcek-Seiler, Christoph Flamm, Ivo L. Hofacker, Peter F. Stadler. 2004. Efficient computation of RNA folding dynamics. Z.Phys.A: Math.Gen. 37:4731-4741.
Christoph Flamm, Walter Fontana, Ivo. L. Hofacker, Peter Schuster. 2000. RNA folding kinetics at elementary step resolution. RNA. 6:325-338.
Christoph Flamm, Ivo. L. Hofacker, Sebastian Maurer-Stroh, Peter F. Stadler, Martin Zehl. 2001. Design of multistable RNA molecules. RNA. 7:254-265
Christoph Flamm, Ivo. L. Hofacker, Peter F. Stadler, Michael T. Wolfinger. 2002. Barrier trees of degenerate landscapes. Z.Phys.Chem. 216:155-173.
Sh
S1(h)
S6(h)
S7(h)
S5(h)
S2(h)
S9(h)
Free
ene
rgy
G
0
Local minimum
Suboptimal conformations
Search for local minima in conformation space
Free
ene
rgy
G
0
"Reaction coordinate"
Sk
S{Saddle point T {k
Free
ene
rgy
G
0
Sk
S{
T {k
"Barrier tree"
Definition of a ‚barrier tree‘
open chain
A nucleic acid molecule folding in two dominant conformations
Folding dynamics of the sequence GGCCCCUUUGGGGGCCAGACCCCUAAAAAGGGUC
Interconversion of suboptimal structures
RNA inverse folding and structure design
Vienna RNA Package
Ivo L. Hofacker, Walter Fontana, Peter F. Stadler, Sebastian Bonhoeffer, Manfred Tacker, Peter Schuster. 1994. Fast folding and comparison of RNA secondary structures. Monath. Chem. 125:167-188.
Christoph Flamm, Ivo L. Hofacker, Sebastian Maurer-Stroh, Peter F. Stadler, Martin Zehl. 2001. Design of multistable RNA molecules. RNA 7:254-265.
Ronny Lorenz, Stephan H. Bernhart, Christian Höner zu Siederissen, Hakim Tafer, Christoph Flamm, Peter F. Stadler, Ivo L. Hofacker. 2011. Vienna RNA Package 2.0. Algorithms for Molecular Bioology 6:e26.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm, Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
Peter Schuster. 2006. Prediction of RNA secondary structures: From theory to models and real molecules. Reports on Progress in Physics 69:1419-1477.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm , Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm , Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm , Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm , Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm , Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
Christian Höner zu Siederissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm , Peter F. Stadler. 2013. Computational design of RNAs with complex energy landscapes. Biopolymers 99:1124-1136.
1. One RNA sequence – one structure
2. Many RNA sequences – one structure
3. One RNA sequence – many structures
4. RNA switches
JN2C
A
A
AGAA
A UU
UCUU
U
UUUUUUU UUU UC
UUUUU
UG
GG
GGG GG
G
C
C
CCC
AGAA
A U
GG
G
CC
C GG C
AAGA
GC
GC
AGAA
GG CC
C
5' 5'3' 3'
CUGUUUUUGCA U AGCUUCUGUUGGCAGAAGC GCAGAAGC
-19.5 kcal·mol-1 -21.9 kcal·mol-1
A
A
A
B
B B C
C
C
3
3
3
15
15
15
36
36
36
24
24
24
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij.
Structural parameters affecting the kinetic competition of RNA hairpin formation. Nucleic Acids Res. 34:3568-3576 (2006)
Synthetic RNA switches
Loop BC
Loop AB
SS A
C K
C T
G15-
G24-
G30-
G36-G39-
G9-
G3-
5 5 5 5 5 5AlT1D
SS C
G15-
G30-
C K 5 5 5Al
T1D
C K 5 5 5Al
T1D
T1 V1 T2K K KT T T
JN2C
T1 V1 T2 T1 V1 T2
5’ hairpin 3’-hairpin
JN2C small fragments
Loop BCLoop AB
-G39
-G24
JN1LH
1D
1D
1D
2D
2D
2D
R
R
R
GGGGUGGAAC GUUC GAAC GUUCCUCCCCACGAG CACGAG CACGAG
-28.6 kcal·mol-1
G/
-31.8 kcal·mol-1
GGG
G
GG
CCC
CC
CAA
UU
UU
G
G C
CUU A
A
GG
G
CCC
AA
AA
G C
GCAA
GC
/G
-28.2 kcal·mol-1
G G
G GG
GGG
CCC
CC C
C C
U
G GG GC C
C CA AA A
A AA AU U
U U
UG
GC
CA A
-28.6 kcal·mol-1
3
3
3
13
13
13
23
23
23
33
33
33
44
44
44
5'
5' 3’
3’
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij.
Structural parameters affecting the kinetic competition of RNA hairpin formation. Nucleic Acids Res. 34:3568-3576 (2006)
Synthetic RNA switches
Loop2D
Loop R
Loop1D
C K
C T
G13-
G23-
G33-
C44-
G3-
5 5 5 5 5 5Al
T1D
5
T1 V1 T2K K KT T T
T13 h
J1LH sequencing gels
45
89
11
19 202425 27
33 3436 38 394146 47
3
49
1
2
67
10
12 13 1415
16 17182122 232628 29 30313235 3740 4243
44454850
-26.0
-28.0
-30.0
-32.0
-34.0
-36.0
-38.0
-40.0
-42.0
-44.0
-46.0
-48.0
-50.0
2.77
5.32
2.09
3.42.362.
44
2.44
2.44
1.46 1.44
1.66
1.9
2.14 2.512.14
2.51
2.14 1.47 1.49
3.04 2.973.04
4.886.136.
8
2.89
Free
ene
rgy
[kc
al /
mol
e]
J1LH barrier tree
A ribozyme switch E.A.Schultes, D.B.Bartel, Science 289 (2000), 448-452
Two ribozymes of chain lengths n = 88 nucleotides: An artificial ligase (A) and a natural cleavage ribozyme of hepatitis--virus (B)
The sequence at the intersection: An RNA molecules which is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
The thiamine-pyrophosphate riboswitch S. Thore, M. Leibundgut, N. Ban. Science 312:1208-1211, 2006.
The thiamine-pyrophosphate riboswitch
S. Thore, M. Leibundgut, N. Ban. Science 312:1208-1211, 2006.
M. Mandal, B. Boese, J.E. Barrick, W.C. Winkler, R.R, Breaker. Cell 113:577-586 (2003)
Coworkers
Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Walter Fontana, Harvard Medical School, MA
Martin Nowak, Harvard University, MA
Christian Reidys, University of Southern Denmark, Odense, DK
Christian Forst, University of Texas, Southwestern Medical Center, TX
Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
Ivo L.Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach , Andreas Wernitznig, Stefanie Widder,
Stefan Wuchty, Jan Cupal, Stefan Bernhart, Ulrike Langhammer, Ulrike Mückstein, Universität Wien, AT
Universität Wien
Universität Wien
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF) Projects No. 09942, 10578, 11065, 13093
13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF) Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank
Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
Österreichische Akademie der Wissenschaften
Siemens AG, Austria
Universität Wien and the Santa Fe Institute
Thank you for your attention!
Web-Page for further information:
http://www.tbi.univie.ac.at/~pks