11/04/05 D Dobbs ISU - BCB 444/544X: Protein Structure & Function



11/4/05

Protein Structure & Function


Announcements

Exam 2 - Has been graded - Will be returned at end of class today

Grade statistics: 444 average = 81/100; 544 average = 100/118

Questions?


Announcements

BCB 544 Projects - Important Dates:

Nov 2 Wed noon - Project proposals due to David/Drena

Nov 4 Fri PM - Approvals/responses & tentative presentation schedule to students

Dec 2 Fri noon - Written project reports due

Dec 5,7,8,9 class/lab - Oral Presentations (20')

(Dec 15 Thurs = Final Exam)


Bioinformatics Seminars

Nov 4 Fri 12:10 PM BCB Faculty Seminar in E164 Lago

How to do sequence alignments on parallel computers

Srinivas Aluru, ECpE & Chair, BCB Program

http://www.bcb.iastate.edu/courses/BCB691F2005.html

Next week:

Nov 10 Thurs 3:40 PM ComS Seminar in 223 Atanasoff

Computational Epidemiology Armin R. Mikler, Univ. North Texas

http://www.cs.iastate.edu/~colloq/#t3


Bioinformatics Seminars

CORRECTION:

Week after next - Baker Center/BCB Seminars: (seminar abstracts available at above link)

Nov 14 Mon 1:10 PM Doug Brutlag, Stanford: Discovering transcription factor binding sites

Nov 15 Tues 1:10 PM Ilya Vakser, Univ Kansas: Modeling protein-protein interactions

Both seminars will be in Howe Hall Auditorium


RNA Structure & Function/Prediction; Protein Structure & Function

Mon: Review - promoter prediction; RNA structure & function

Wed: RNA structure prediction - 2° & 3° structure prediction; miRNA & target prediction - Lab 10

Fri: a few more words re: Algorithms; Protein structure & function


Reading Assignment (for Fri/Mon)

Mount Bioinformatics

• Chp 10 Protein classification & structure prediction: http://www.bioinformaticsonline.org/ch/ch10/index.html

• pp. 409-491

• Ck Errata: http://www.bioinformaticsonline.org/help/errata2.html

Other? That should be plenty…


Review last lecture:

RNA Structure Prediction


miRNA and RNAi pathways

[Figure: two parallel pathways]

microRNA pathway: MicroRNA primary transcript → Drosha → precursor → Dicer → miRNA → RISC → target mRNA: "translational repression" and/or mRNA degradation

RNAi pathway: Exogenous dsRNA, transposon, etc. → Dicer → siRNAs → RISC → target mRNA: mRNA cleavage, degradation

C Burge 2005


miRNA Challenges for Computational Biology

• Find the genes encoding microRNAs

• Predict their regulatory targets

• Integrate miRNAs into gene regulatory pathways & networks

Computational Prediction of MicroRNA Genes & Targets

C Burge 2005

Need to modify traditional paradigm of "transcriptional control" primarily by protein-DNA interactions to include miRNA regulatory mechanisms!


RNA structure prediction strategies: Secondary structure prediction

1) Energy minimization (thermodynamics)

2) Comparative sequence analysis (co-variation)

3) Combined experimental & computational


Secondary structure prediction strategies

1) Energy minimization (thermodynamics)

• Algorithm: Dynamic programming to find high probability pairs (also, some genetic algorithms)

• Software:
Mfold - Zuker
Vienna RNA Package - Hofacker
RNAstructure - Mathews
Sfold - Ding & Lawrence

R Knight 2005


Secondary structure prediction strategies

2) Comparative sequence analysis (co-variation)

• Algorithms: Mutual information, Stochastic context-free grammars

• Software: ConStruct, Alifold, Pfold, FOLDALIGN, Dynalign

R Knight 2005
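
As a rough illustration of the mutual-information idea listed on this slide, the sketch below scores co-variation between two columns of an RNA alignment. The function name and the toy columns are hypothetical and are not taken from any of the packages named above; real tools add corrections (gaps, phylogeny, small sample sizes) that are omitted here.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold one character per aligned sequence, in the same
    sequence order. High values suggest the positions co-vary, which is
    what one expects of two bases that pair in a conserved helix.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from the same alignment")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    total = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        total += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return total


# Toy columns (hypothetical data): the first pair co-varies perfectly,
# the second pair is fully conserved and therefore carries no signal.
covarying = mutual_information("GGCCAAUU", "CCGGUUAA")
conserved = mutual_information("GGGGGGGG", "CCCCCCCC")
print(covarying, conserved)
```

Note that a perfectly conserved base pair scores zero: mutual information only detects pairs that change together, which is one reason it needs many, suitably diverged sequences.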


Secondary structure prediction strategies

3) Combined experimental & computational

• Experiment: Map single-stranded vs double-stranded regions in folded RNA

• How?
Enzymes: S1 nuclease, T1 RNase
Chemicals: kethoxal, DMS

R Knight 2005


Experimental RNA structure determination?

• X-ray crystallography

• NMR spectroscopy

• Enzymatic/chemical mapping

• Molecular genetic analyses


1) Energy minimization method

What are the assumptions?

Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)

Gibbs free energy = ΔG in kcal/mol at 37 °C = equilibrium stability of structure; lower (more negative) values are more favorable

Is this assumption valid?

in vivo? - this may not hold, but we don't really know


Free energy minimization

What are the rules?

A=U base pair stacked on A=U (AA/UU): ΔG = -1.2 kcal/mole

A=U base pair stacked on U=A (AU/UA): ΔG = -1.6 kcal/mole

What gives here?

C Staben 2005


Energy minimization calculations: Base-stacking is critical

Stacked pairs (top strand/bottom strand)    ΔG (kcal/mole)
AA/UU                                       -1.2
AU/UA or UA/AU                              -1.6
AG/UC, AC/UG, CA/GU, GA/CU                  -2.1
CC/GG                                       -4.8
CG/GC                                       -3.0
GC/CG                                       -4.3
GU/UG                                       -0.3
XG/YU, GX/UY                                 0

- Tinoco et al.

C Staben 2005
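
To make the stacking values on this slide concrete, here is a minimal sketch that sums them along a short helix. It assumes the slide's notation means "top strand read 5'→3' over bottom strand read 3'→5'", treats any stack not listed as 0 (as the slide does for the X/Y cases), and ignores loop and end terms entirely. The names `STACK_ENERGY`, `stack_energy` and `helix_energy` are hypothetical.

```python
# Stacking free energies (kcal/mole) transcribed from the slide.
# Key: (top-strand dinucleotide 5'->3', bottom-strand dinucleotide 3'->5').
STACK_ENERGY = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6,
    ("UA", "AU"): -1.6,
    ("AG", "UC"): -2.1,
    ("AC", "UG"): -2.1,
    ("CA", "GU"): -2.1,
    ("GA", "CU"): -2.1,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
    ("GU", "UG"): -0.3,
    ("CC", "GG"): -4.8,
}


def stack_energy(top, bottom):
    """Energy of one stack of two adjacent base pairs.

    A stack looks the same when the helix is rotated by 180 degrees, so if
    (top, bottom) is not listed we also try (reversed bottom, reversed top).
    Anything still missing is scored 0.0 (assumption, following the slide).
    """
    if (top, bottom) in STACK_ENERGY:
        return STACK_ENERGY[(top, bottom)]
    return STACK_ENERGY.get((bottom[::-1], top[::-1]), 0.0)


def helix_energy(top_5to3, bottom_3to5):
    """Sum stacking energies over every pair of neighboring base pairs."""
    if len(top_5to3) != len(bottom_3to5):
        raise ValueError("strands must have equal length")
    return sum(
        stack_energy(top_5to3[i:i + 2], bottom_3to5[i:i + 2])
        for i in range(len(top_5to3) - 1)
    )


# Same base composition, different neighbor order, different stability.
print(helix_energy("AA", "UU"))  # one AA/UU stack
print(helix_energy("AU", "UA"))  # one AU/UA stack
print(helix_energy("GGCC", "CCGG"))
```

The point of the exercise is the one the slide makes: the energy is assigned to neighboring pairs, not to individual base pairs, so two helices with identical A=U content can differ in stability.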


Nearest-neighbor parameters

Most methods for free energy minimization use nearest-neighbor parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of ΔG at 37 °C)

& most available software packages use the same set of parameters: Mathews, Sabina, Zuker & Turner, 1999


Energy minimization - calculations:

Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for:

• helical stacking (sequence dependent)
• loop initiation
• unpaired stacking

(favorable "increments" are < 0)

Fig 6.3, Baxevanis & Ouellette 2005


But how many possible conformations for a single RNA molecule?

Huge number: Zuker estimates (1.8)^N possible secondary structures for a sequence of N nucleotides

for 100 nts (small RNA…) = 3 × 10^25 structures!

Solution? Not exhaustive enumeration… Dynamic programming

O(N^3) in time, O(N^2) in space/storage iff pseudoknots excluded; otherwise O(N^6) in time, O(N^4) in space
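
The arithmetic behind this slide is easy to check. The sketch below, with a hypothetical helper name, evaluates the 1.8^N estimate next to the N^3 step count of the dynamic programming approach; both figures are order-of-magnitude illustrations only.

```python
def estimated_structures(n_nucleotides, growth=1.8):
    """Rough count of secondary structures, using the 1.8**N estimate
    quoted on the slide (growth=1.8 is that quoted base)."""
    return growth ** n_nucleotides


for n in (20, 50, 100):
    # n**3 stands in for the O(N^3) work of dynamic programming.
    print(f"N = {n:3d}: ~{estimated_structures(n):.1e} structures "
          f"vs ~{n ** 3:.1e} DP steps")
```

For N = 100 the estimate comes out a little above 3 × 10^25, in line with the figure on the slide, while N^3 is only 10^6.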


Algorithms based on energy minimization

For outline of algorithm used in Mfold, including description of dynamic programming recursion, please visit Michael Zuker's lecture: http://www.bioinfo.rpi.edu/~zukerm/lectures/RNAfold-html

From this site, you may also download his lecture as either PDF or PS file.

Hmmm, something based on this might make an interesting "Final Exam" question: how could one apply dynamic programming approaches learned in first half of course to RNA structure prediction problem?
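
One way to start thinking about that question is a Nussinov-style recursion, which maximizes the number of base pairs instead of minimizing free energy. This is a deliberately simplified relative of the energy-based recursion and is not the Mfold algorithm; the function name, the choice to allow G-U pairs, and the minimum hairpin loop of 3 unpaired bases are assumptions made for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    best[i][j] holds the optimum for the subsequence seq[i..j]. Filling the
    table takes O(N^3) time and O(N^2) space, matching the complexity quoted
    for dynamic programming without pseudoknots.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    # Process subsequences from short to long so sub-results already exist.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is left unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases
            # between them; the pair splits the problem into two parts.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCCC"))
```

The structure of the table fill is the same idea used for sequence alignment earlier in the course: solve small subproblems once, store them, and combine them. Replacing "+1 per pair" with stacking and loop energy terms moves this sketch toward the energy minimization methods discussed above.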


2) Comparative sequence analysis (co-variation)

Two basic approaches:

• Algorithms constrained by initial alignment

Much faster, but not as robust as unconstrained

Base-pairing probabilities determined by a partition function

• Algorithms not constrained by initial alignment

Genetic algorithms often used for finding an alignment & set of structures


RNA Secondary structure prediction: Performance?

How to evaluate?

• Not many experimentally determined structures; currently, ~50% are rRNA structures

• So the "Gold Standard" (in absence of tertiary structure): compare predicted RNA secondary structure with that determined by comparative sequence analysis (!!??) using Benchmark Datasets

NOTE: Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures!

- Gutell, Pace
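
A comparison of this kind can be sketched as follows. The slide does not define how accuracy is computed, so this example assumes one common choice: the fraction of reference base pairs that also appear in the prediction. It also assumes both structures are written in dot-bracket notation, which is not mentioned in the lecture; the function names are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string,
    where '(' and ')' mark paired positions and '.' marks unpaired ones."""
    open_positions = []
    pairs = set()
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            pairs.add((open_positions.pop(), index))
    return pairs


def fraction_of_reference_pairs_found(predicted, reference):
    """Share of reference base pairs reproduced by the prediction."""
    reference_pairs = parse_dot_bracket(reference)
    if not reference_pairs:
        return 0.0
    predicted_pairs = parse_dot_bracket(predicted)
    return len(reference_pairs & predicted_pairs) / len(reference_pairs)


# Toy example (hypothetical structures): 2 of the 3 reference pairs match.
score = fraction_of_reference_pairs_found("((.....))", "(((...)))")
print(f"{score:.2f}")
```

Published benchmarks often report a second number as well (the share of predicted pairs that are correct), so a single percentage should be read with its definition in mind.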


RNA Secondary structure prediction: Performance?

1) Energy minimization (via dynamic programming)
73% avg. prediction accuracy - single sequence

2) Comparative sequence analysis
97% avg. prediction accuracy - multiple sequences (e.g., highly conserved rRNAs); much lower if sequence conservation is lower &/or fewer sequences are available for alignment

3) Combined - recent developments: combine thermodynamics & co-variation & experimental constraints? IMPROVED RESULTS


RNA structure prediction strategies: Tertiary structure prediction

Requires "craft" & significant user input & insight

1) Extensive comparative sequence analysis to predict tertiary contacts (co-variation), e.g., MANIP - Westhof

2) Use experimental data to constrain model building, e.g., MC-SYM - Major

3) Homology modeling using sequence alignment & reference tertiary structure (not many of these!)

4) Low resolution molecular mechanics, e.g., yammp - Harvey


New Today:

Protein Structure & Function


Protein Structure & Function

Protein structure - primarily determined by sequence

Protein function - primarily determined by structure

• Globular proteins: compact hydrophobic core & hydrophilic surface

• Membrane proteins: special hydrophobic surfaces

• Folded proteins are only marginally stable

• Some proteins do not assume a stable "fold" until they bind to something = Intrinsically disordered

Predicting protein structure and function can be very hard -- & fun!


4 Basic Levels of Protein Structure


Primary & Secondary Structure

Primary
• Linear sequence of amino acids
• Description of covalent bonds linking aa's

Secondary
• Local spatial arrangement of amino acids
• Description of short-range non-covalent interactions
• Periodic structural patterns: α-helix, β-sheet


Tertiary & Quaternary Structure

Tertiary
• Overall 3-D "fold" of a single polypeptide chain
• Spatial arrangement of 2° structural elements; packing of these into compact "domains"
• Description of long-range non-covalent interactions (plus disulfide bonds)

Quaternary
• In proteins with > 1 polypeptide chain, spatial arrangement of subunits


"Additional" Structural Levels

• Super-secondary elements

• Motifs
• Domains
• Foldons