Structural biology, bioinformatics, systems biology: biology of the 21st century
Today's presentation is divided into three parts:
1. Background and general introduction
- NMR
- X-ray crystallography
2. Structural biology: examples from my laboratory
3. A closer look at bioinformatics and systems biology as interdisciplinary fields relevant to computer science
Human & other genome sequences
Functionally cloned DNA molecules coding for proteins of interest.
Amino acid sequences
Protein structure Protein function
Structural biology
Protein chemistry / yeast 2-hybrid screens / proteomics / enzymology / genetics / transgenes / knock-outs / knock-ins / chemical genetics / etc.
Bioinformatics
3D structures of molecules allow us to understand biological processes at the most basic level. We can ‘see’ which molecules interact, how they interact, how they function, how drugs act. They can help us understand disease at an atomic level. 3D structures can be exploited in development of new drugs.
structure-based drug design
Structural Biology
• Structure can sometimes reveal the function of a protein directly
• Structure can rationalise experimental observations about the selectivity and specificity of an enzymatic reaction
• Structure can become the basis for rational drug/inhibitor discovery.
• Structure can reveal the dynamic aspect of protein behaviour.
• Three-dimensional polypeptide topologies provide data for solving the problem of how protein structure forms, the ‘protein folding problem’.
• [Sometimes a 3D structure can be rather uninformative - the ‘structural genomics’ debate.]
Experimental modes of molecular structural biology
X-ray crystallography
Protein crystals
(3-50 mg/ml; ca. 100 µl)
[Se-methionine labelling]
X-ray source / synchrotron
Applicable to proteins of any size (in principle).
NMR spectroscopy
Protein solutions
(> 0.5 mM; min. volume 0.3ml; 10-100 mg)
15N,13C-Isotope labelling
NMR Spectrometer
Applicable to small(ish) proteins (smaller than ca. 30,000 MW)
Cryo-electron microscopy
Macromolecular assemblies/particles frozen in vitreous ice
Electron microscope
Large particles, typically > 500,000 MW
High resolution
Medium resolution
‘Low’ resolution
[Figure: The sizes of cells and of their component parts. Logarithmic scale in steps of x 10: 0.2 nm, 2 nm, 20 nm, 200 nm, 2 µm, 20 µm, 200 µm. Labels: ATOMS, MOLECULES, ORGANELLES, CELLS; electron microscope, light microscope, unaided eye.]
1 m = 10^3 mm = 10^6 µm = 10^9 nm
WHAT IS NMR?
Nuclear Magnetic Resonance (NMR) is a powerful spectroscopic technique that provides information about the structural and chemical properties of molecules.
NMR is a non-destructive method for analysing the structure and dynamics of molecules. NMR exploits the properties of certain atomic nuclei when they are exposed to a very strong magnetic field. For biochemists these are mainly 1H, 15N, 13C and 31P. 1H and 31P are highly abundant isotopes, whilst 15N and 13C are present at only low natural abundance (ca. 0.4% and 1.1%, respectively). Studies using these nuclei generally require isotopic enrichment, by producing the molecule in media that have been enriched in these isotopes.
Prof. Kurt Wüthrich, Nobel Prize in Chemistry 2002
Typically the magnets used in NMR spectroscopy are a few hundred thousand times stronger than the Earth's magnetic field (roughly 5-20 T compared with about 50 µT). The NMR experiment generally consists of applying short bursts, or pulses, of energy in the radio frequency (RF) range, typically 40-800 MHz, to the sample. These RF pulses cause the nuclear spins to rotate away from their equilibrium position, and they start to precess (rotate) around the axis of the magnetic field. The exact frequency at which the nuclei precess depends on both the chemical and the physical environment of the atom in the molecule. By using different combinations of RF pulses and delays it is possible to determine how each atom in the molecule interacts with the other atoms in the molecule.
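The link between field strength and precession frequency can be made concrete with the Larmor relation, frequency = (gamma / 2 pi) x B0. The sketch below is illustrative only: the gyromagnetic ratios are rounded literature values, and the function names are made up for this example rather than taken from any NMR software.

```python
# Gyromagnetic ratios divided by 2*pi, in MHz per tesla (rounded literature values).
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,
    "13C": 10.708,
    "15N": -4.316,  # negative: 15N precesses in the opposite sense to 1H
    "31P": 17.235,
}


def larmor_frequency_mhz(nucleus: str, field_tesla: float) -> float:
    """Magnitude of the precession (Larmor) frequency in MHz at the given field."""
    return abs(GAMMA_BAR_MHZ_PER_T[nucleus]) * field_tesla


def field_for_proton_frequency(proton_mhz: float) -> float:
    """Field strength (tesla) at which 1H precesses at proton_mhz."""
    return proton_mhz / GAMMA_BAR_MHZ_PER_T["1H"]


if __name__ == "__main__":
    b0 = field_for_proton_frequency(500.0)  # a "500 MHz" spectrometer
    for nucleus in GAMMA_BAR_MHZ_PER_T:
        freq = larmor_frequency_mhz(nucleus, b0)
        print(f"{nucleus}: {freq:.1f} MHz at {b0:.2f} T")
```

A spectrometer is usually named after its 1H frequency, so the same magnet excites 13C, 15N and 31P at correspondingly lower radio frequencies.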
[Figure: three 1H-15N correlation spectra, labelled 592 709, 592 732 and 543 709. Axes: 1H (ppm), 10.0 to 7.0; 15N (ppm), 110 to 125.]
The NMR spectrum is exquisitely sensitive to the conformation of the polypeptide chain, and to the presence of interacting chemical ligands.
These and other features of the rich ‘spin physics’ that underlies the NMR phenomenon mean that NMR spectroscopy is a highly versatile tool for the characterisation of:
- Structure
- Dynamics
- Molecular interactions
[Figure: structure views with N and C termini and elements 1 to 6 labelled. Harris et al. (2004) J. Mol. Biol.]
[Figure: structure view with residues E652, E664, E667, E676, E699, E700 and D709 labelled. Harris et al. (2004) J. Mol. Biol.]
X-ray Crystallography
- An experimental technique involving diffraction of X-rays by crystalline material.
- X-ray wavelength is of the order of 1 Å
- From the diffraction pattern, the electron density of the molecule can be reconstructed (this requires both intensities and phases).
- A model is built into the reconstructed electron density
- The model is the molecular picture: the molecular structure, from global fold to atomic detail
- Limited information about the molecule’s dynamics
- Depends on obtaining crystals
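Why both intensities and phases are needed can be illustrated with a toy one-dimensional "crystal". The sketch below is not a crystallographic program: the density values are hypothetical and a plain discrete Fourier sum stands in for the real three-dimensional synthesis. It computes structure factors from a known density, then rebuilds the density once with the correct phases and once with amplitudes only.

```python
import cmath


def structure_factors(density):
    """Discrete Fourier transform of a 1D density sampled over one unit cell."""
    n = len(density)
    return [
        sum(density[x] * cmath.exp(2j * cmath.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]


def fourier_synthesis(factors):
    """Inverse transform: rebuild the density from complex structure factors."""
    n = len(factors)
    return [
        (sum(factors[h] * cmath.exp(-2j * cmath.pi * h * x / n) for h in range(n)) / n).real
        for x in range(n)
    ]


# Hypothetical 1D "electron density" with two peaks.
density = [0, 0, 5, 1, 0, 0, 3, 0]

F = structure_factors(density)

# Amplitudes and phases together recover the original density.
with_phases = fourier_synthesis(F)

# A detector records only intensities (|F|^2); dropping the phases loses the map.
amplitudes_only = fourier_synthesis([abs(f) for f in F])

if __name__ == "__main__":
    print([round(v, 3) for v in with_phases])
    print([round(v, 3) for v in amplitudes_only])
```

The second map no longer resembles the input, which is the essence of the phase problem that methods such as MAD phasing address.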
1. Why X-rays?
2. Why electron density?
3. Why crystals?
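[Figure: light microscopy compared with X-ray diffraction.
Light microscope: VISIBLE LIGHT → OBJECT → scattered radiation → OBJECTIVE LENS (magnification m) → EYEPIECE LENS (magnification n) → enlarged image of object (magnification mn).
X-ray diffraction: X-RAYS → OBJECT (crystal) → scattered radiation → DETECTOR → COMPUTER (PHASES, CRYSTALLOGRAPHER) → COMPUTED ELECTRON-DENSITY MAP.]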
Overview of protein structure determination by X-ray diffraction
1. Production of an isolated, sufficiently large crystal of the candidate protein
2. Mounting the crystal, then collecting and evaluating preliminary diffraction data
3. Complete data collection and phase estimation
4. Building and refining the protein chains
5. Structure validation
European Synchrotron Radiation Facility (Grenoble, France)
Structural characterisation of drug targets from M. tuberculosis
Snezana Djordjevic
Institute of Structural Molecular Biology
M. tuberculosis
• 2-3 million deaths from tuberculosis annually
• 1/3 of world population currently infected with the disease
• Drug resistance
-multidrug-resistant strains
-12.6% of M. tuberculosis isolates resistant to at least one drug
-2.2% resistant to both isoniazid and rifampin
New Drugs
-agents that exhibit activity against drug resistant strains
-completely sterilize infection
-shorten the duration of drug therapy and thus promote drug compliance
METRO – 06/03/2007
Mechanism of resistance to Isoniazid
-Isoniazid is a prodrug that is oxidized by KatG
-KatG is a catalase-peroxidase
-Mutation of KatG leads to resistance
[Diagram: KatG, prodrug activation, resistance]
KatG activity is important for virulence!
-The physiological function of KatG includes protection of the mycobacterium against H2O2 and other ROS produced by the microbe and its host.
?
KatG AhpC AhpD
AhpD
Alkylhydroperoxidase
From M. tuberculosis
Paul Ortiz de Montellano
Dept. of Pharmaceutical Chemistry, UCSF
Space group C2; a = 186.38 Å, b = 117.28 Å, c = 88.99 Å, β = 113.97°
177 residues/monomer
Structure solution: SeMet/MAD
Data collected at 4 wavelengths in Grenoble
2Fo-Fc map
1.9 (1.7) Å resolution
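As a small worked example, the unit-cell volume follows directly from the cell parameters quoted above. The sketch assumes that the single angle listed is the monoclinic β (the only non-90° angle in space group C2); the function name is illustrative only.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume of a monoclinic unit cell in cubic angstroms: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


if __name__ == "__main__":
    # Cell parameters as quoted on the slide (angle assumed to be beta).
    volume = monoclinic_cell_volume(186.38, 117.28, 88.99, 113.97)
    print(f"Unit-cell volume: {volume:.3e} cubic angstroms")
```

With these parameters the volume comes out at roughly 1.78 x 10^6 Å^3, large enough to hold several copies of a 177-residue monomer.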
[Figure: AhpD Monomer Topology, with N and C termini and elements 1 to 8 labelled.]
CXXC
Thioredoxins
-solvent exposed
-pKa ~ 7.1
Peroxiredoxins
From structure to function and the catalytic mechanism
[Figure labels: Cys130, Cys133, His137, Glu118; putative substrate binding site.]
[Scheme: redox couples in sequence.
NADH / NAD+
Lpd(red) / Lpd(ox)
DlaT-LpH2 / DlaT-Lp
AhpD(red) / AhpD(ox)
AhpC(red) / AhpC(ox)
ROOH / ROH]
Novel redox pathway in M. tuberculosis
Lpd: Dihydrolipoamide dehydrogenase
SucB: Dihydrolipoamide acyltransferase
Components of pyruvate dehydrogenase complexes
E3, E2
Pyruvate → Acetyl-CoA + CO2 (NAD+ → NADH)
Molecular Surface
A Prototypical Two-Component Signal Transduction System
[Figure: External stimulus → Histidine Kinase (HK) sensory protein (receptor / input / sensor domain in the periplasmic space; kinase core) → P → Response Regulator (RR) → Response.]
[Figure: Chemotaxis. Labels: Tar, SAM, A, W, Y, R, B, P, +ATP, +CH3, -CH3.]
DosS
• Induced by exposure to hypoxia, NO and ethanol.
• Structural studies have been initiated with the aim of describing the signalling mechanism that leads to histidine kinase activation.
• The histidine kinase (HK) domain undergoes autophosphorylation and can carry out a Mg2+-dependent phosphotransfer reaction onto DosR.
• DosS and DosR form a cognate sensor-regulator pair.
Identification of domain boundaries
PDE2A_B 196 DVSVLLQEIITEARN-------LSNAEICSVFLLDQ------------NELVAKVFDGGVVDDe----sY
DosS GAF_A 3 DLEATLRAIVHSATS-------LVDARYGAMEVHDRQH---------RVLHFVYEGIDEETVR------R
cGMP PDE_1 154 DVTALCHKIFLHIHG-------LISADRYSLFLVCEdss-------ndKFLISRLFDVAEGSTleeasnN
cGMP PDE_2 336 SLEVILKKIAATIIS-------FMQVQKCTIFIVDEdcsdsf-ssvfhMECEELEKSSDTLTR------E
anfA 46 DLADALSIVLGVMQQ-------HLKMQRGIVTLYDMr----------aETIFIHDSFGLTEEEk-----K
cGMP PDE_3 228 DATSLQLKVLRYLQQ-------ETQATHCCLLLVSEd----------nLQLSCKVIGEKVLG-------E
ADEN_CYCL_1 79 GFENILQEMLQSITLkt---geLLGADRTTIFLLDEe----------kQELWSIVAAGEGDRS------L
ADEN_CYCL_2 271 DLEDTLKRVMDEAKE-------LMNADRSTLWLIDRd----------rHELWTKITQDNGST-------K
yebR 27 DLNRDFNALMAGETS-------FLATLANTSALLYErlt-------diNWAGFYLLEDDTLVLg----pF
Hypoth. Pro. 54 LIKATLQKTMEASIH-------QTGAQLGSLFLLDGd----------gRVTESILARGATDQSqk---kN
Nif-regul_1 68 RLEVTLANVVNVLSS-------MLQMRHGMICILDSe-----------GDPDMVATTGWTPEMa-----G
Nif-regul_2 46 RLEVTLANVLGLLQS-------FVQMRHGLVSLFNDd-----------GVPELTVGAGWSEG-------T
Nif-regul_3 35 NTARALAAILEVLHD-------HAFMQYGMVCLFDKe----------rNALFVESLHGIDGERkk--etR
Nif-regul_4 21 DLSKTLREVLNVLSA-------HLETKRVLLSLMQDs-----------GELQLVSAIGLSYEEf-----Q
consensus 1 DLEELLQTILEELRQ-------LLGADRVSIYLVDEDK---------RGELVLVASDGLTLPE------L
PDE2A_B      EIRIPADQ-----GIAGHVATTGQILNIP-DAYAHPl--fYRGVDDSTGFR-----TRNILCFPIKNEn-
DosS GAF_A   IGHLPKGL-----GVIGLLIEDPKPLRLD-DVSAHP----AS-IGFPPYHPP----MRTFLGVPVRVR--
cGMP PDE_1   CIRLEWNK-----GIVGHVAAFGEPLNIK-DAYEDPr--fNAEVDQITGYK-----TQSILCMPIKNHr-
cGMP PDE_2   RDANRINY-----MYAQYVKNTMEPLNIP-DVSKDKr---FPWTNENMGNInq-qcIRSLLCTPIKNGk-
anfA         RGIYAVGE-----GITGKVVETGKAIVAR-RLQEHP-----DFLGRTRVSRng-kaKAAFFCVPIMRA--
cGMP PDE_3   EVSFPLTM-----GRLGQVVEDKQCIQLK-DLTSDD----VQQLQNMLGCE-----LRAMLCVPVISRa-
ADEN_CYCL_1  EIRIPADK-----GIAGEVATFKQVVNIPfDFYHDPrsifAQKQEKITGYR-----TYTMLALPLLSEq-
ADEN_CYCL_2  ELRVPIGK-----GFAGIVAASGQKLNIPfDLYDHPdsatAKQIDQQNGYR-----TCSLLCMPVFNGd-
yebR         QGKIACVRipvgrGVCGTAVARNQVQRIE-DVHVFD-------GHIACDAA-----SNSEIVLPLVVK--
Hypoth. Pro. IVGQVLDK-----GLAGWVRENKRTGLIN-DTTKDY----RWLKLPDEPYQ-----ALSALGVPIVWG--
Nif-regul_1  QIRAHVPQ-----KAIDQIVATQMPLVVQ-DVTADP-----LFAGHEDLFGppeeaTVSFIGVPIKAD--
Nif-regul_2  DERYRTCVp---qKAIHEIVATGRSLMVE-NVAAEt---aFSAADREVLGAsd-siPVAFIGVPIRVD--
Nif-regul_3  HVRYRMGE-----GVIGAVMSQRQALVLP-RISDDQ-----RFLDRLNIYDy----SLPLIGVPIPGAd-
Nif-regul_4  SGRYRVGE-----GITGKIFQTETPIVVR-DLAQEP-----LFLARTSPRQsqdgeVISFVGVPIKAA--
consensus    GVRFPLDE-----GLVGRVAETGRPLVIP-DVEADP----FFFLDLLQRYQL----IRSFLAVPLVAG--
Further structural investigation of GAF domains
Secondary Structure: 1MC0
PDE2A_B      -QEVIGVAELVNK-------------------INGPWFSKFDEDLATAFSIYCGISIAHSLLYKKVN 345
DosS GAF_A   -DESFGTLYLTDK-------------------TNGQPFSDDDEvlvqalaaaagiavanarlyqqak 150
cGMP PDE_1   -EEVVGVAQAINKk-----------------sGNGGTFTEKDEKDFAAYLAFCGIVLHNAQLYETSL 314
cGMP PDE_2   kNKVIGVCQLVNKmee--------------ttGKVKAFNRNDEQFLEAFVIFCGLGIQNTQMYEAVE 503
anfA         -QKVLGTIAAERV-------------------YMNPRLLKQDVELLTMIATMIAPLVELYLIENIER 196
cGMP PDE_3   tDQVVALACAFNK-------------------LGGDFFTDEDERAIQHCFHYTGTVLTSTLAFQKEQ 375
ADEN_CYCL_1  -GRLVAVVQLLNKlkpyspp-----dallaerIDNQGFTSADEQLFQEFAPSIRLILESSRSFYIAT 249
ADEN_CYCL_2  -QELIGVTQLVNKkktgefppynpetwpiapeCFQASFDRNDEEFMEAFNIQAGVALQNAQLFATVK 441
yebR         -NQIIGVLDIDST--------------------VFGRFTDEDEQGLRQLVAQLEKVLATTDYKKFFA 179
Hypoth. Pro. -DELLGILTLMHS--------------------QVNHFTPACATAMEKTAELIALVLNNARIQTKHK 202
Nif-regul_1  -HHVMGTLSIDRIw-----------------dGTARFRFDEDVRFLTMVANLVGQTVRLHKLVASDR 220
Nif-regul_2  -STVVGTLTIDRIp------------------EGSSSLLEYDARLLAMVANVIGQTIKLHRLFAGDR 198
Nif-regul_3  -NQPAGVLVAQPM-------------------ALHEDRLAASTRFLEMVANLISQPLRSATPPESLP 186
Nif-regul_4  -REMLGVLCVFRDg------------------QSPSRSVDHEVRLLTMVANLIGQTVRLYRSVAAER 180
consensus    -GELLGVLALHRK-------------------DSPRPFTEEEEELLQALANQLAIALALAQLYEELR
SAM-T99 was used to detect remote structural homologues of this protein.
Of the 11,149 sequence homologues identified, 24 had a known structure, but none of these produced a significant global alignment. Local alignments covered either the C-terminal or the N-terminal region. No alignment was found that covered both putative GAF domains.
One structural homologue was identified for the DosS GAF A domain: 1MC0
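How weak such remote relationships are can be quantified by the percent identity between two rows of the alignment. The sketch below is a minimal illustration, not the SAM-T99 method; it simply counts identical residues over the columns where neither sequence has a gap. The two example strings are the first 15 alignment columns of the PDE2A_B and DosS GAF_A rows shown earlier.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    # Upper-case both rows so lower-case (insert-state) residues are compared too.
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    pde2a_b = "DVSVLLQEIITEARN"   # first 15 columns of the PDE2A_B row
    doss_gaf_a = "DLEATLRAIVHSATS"  # first 15 columns of the DosS GAF_A row
    print(f"{percent_identity(pde2a_b, doss_gaf_a):.1f}% identity")
```

Identities in this range are typical of remote homologues, which is why profile-based methods rather than simple pairwise comparison are needed to find them.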
UV-Visible Characterisation of GAF A Haem
[Figure: Absorption spectra of DosS GAF A (63-210). Wavelength (nm), 550 to 650; absorbance scale bar 0.005. A. Oxy-ferrous (dashed line); B. Ferric (solid line); C. Ferrous (dotted line); D. Ferrous-CO (solid line); E. Ferrous-NO (solid line).]
[Figure: Absorption spectra of Haemoglobin. Wavelength (nm), 500 to 700; absorbance scale bar 0.1. A. Ferric haemoglobin (solid line); B. Oxy-ferrous haemoglobin (dashed line); C. Ferrous haemoglobin (dotted line); D. Ferrous-CO haemoglobin (solid line); E. Ferrous-NO haemoglobin (solid line).]
[Figure labels: Fe2+, His, CO / NO / O2.]
Visible/UV spectra of the DosS GAF A (63-210) histidine-to-alanine mutants
The Model of Signalling
[Figure labels: OFF, ON; Fe2+ (x2); O2; NO; P; DosR (x2); A, B.]
GAF B - NMR
HSQC of 1H, 15N-labelled DosS GAF B
NMR experiments (HNCO, HNCA, HN(CO)CA, HNCACB, CBCA(CO)NH, HA(CA)NH and HA(CACO)NH) were recorded at a 1H frequency of 500 MHz on 0.6 mM [1H, 13C, 15N]-labelled DosS 231-379 at pH 6 in 20 mM phosphate, 100 mM NaCl.
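The idea behind backbone assignment with such experiments can be sketched in a few lines. An HNCA spectrum correlates each amide with the Cα of its own residue and of the preceding residue, while HN(CO)CA shows only the preceding one, so spin systems can be chained by matching Cα shifts. The code below is a toy illustration under those assumptions: the shift values, labels and tolerance are hypothetical, and real assignment software also uses Cβ and CO shifts and handles ambiguity.

```python
from typing import Dict, List, Tuple

# Hypothetical spin systems: label -> (CA shift of own residue i, CA shift of residue i-1), in ppm.
SpinSystems = Dict[str, Tuple[float, float]]


def link_spin_systems(systems: SpinSystems, tolerance: float = 0.15) -> List[Tuple[str, str]]:
    """Return (predecessor, successor) pairs whose CA shifts agree within tolerance."""
    links = []
    for successor, (_, ca_previous) in systems.items():
        for predecessor, (ca_own, _) in systems.items():
            if predecessor == successor:
                continue
            # The successor's i-1 shift should match the predecessor's own shift.
            if abs(ca_own - ca_previous) <= tolerance:
                links.append((predecessor, successor))
    return links


example: SpinSystems = {
    "a": (56.20, 61.80),
    "b": (52.40, 56.25),
    "c": (61.80, 45.10),
}

if __name__ == "__main__":
    for predecessor, successor in link_spin_systems(example):
        print(f"{predecessor} -> {successor}")
```

In this made-up example the links chain the spin systems as c, a, b. Missing or overlapping cross-peaks break such chains.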
GAF B - NMR
Predicted secondary structure for DosS GAF 2 using PSIPRED.
PROBLEMS:
- 48 residues are still to be assigned
- 21 expected cross-peaks are missing from the spectrum.
- Some cross-peaks appear as multiple peaks rather than a single peak.
- High content of Val, Leu and Ala residues in the sequence.
Sekharan MR, Rajagopal et al. 2005. Backbone 1H, 13C, and 15N resonance assignment of the 46 kDa dimeric GAF A domain of phosphodiesterase 5. J Biomol NMR 33(1):75
Signalling mechanism
[Figure: structure with N and C termini labelled.]
STRUCTURAL GENOMICS CENTRES IN NORTH AMERICA, UK, FRANCE, JAPAN
• OXFORD STRUCTURAL GENOMICS
• Announced in 2003, with operations commencing in July 2004 for an initial three-year period, this initiative received funding from Canadian, Swedish and British sponsors from both the public and private sectors. For the second phase, from July 2007, over £49 million is being made available by public funding agencies in Canada, Sweden and Ontario, charitable foundations in the UK and Sweden, GlaxoSmithKline plc, Novartis and Merck. Laboratories are located at the University of Oxford, the University of Toronto and Karolinska Institutet, Stockholm.
BIOINFORMATICS
Over the last few decades, advances in molecular biology, together with progress in genetic technology, have led to an explosion in the amount of information generated by the scientific community. This mass of information created the need for computerised databases to store, organise and catalogue the data. At the same time, tools had to be developed for viewing, visualising and analysing those data.
Computational biology (the actual process of analysing and interpreting data)
• Development and application of tools that enable access to, use of and organisation of various kinds of information
• Development of new algorithms and statistics with which relationships among the components of a large data set can be assessed. Examples are methods for locating genes within a sequence, predicting protein structure and/or function, and clustering protein sequences into families of related sequences.
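One of the simplest examples of locating genes within a sequence is a scan for open reading frames (ORFs). The sketch below is a deliberately naive illustration, not a real gene finder: it looks only at the forward strand and assumes an ATG start codon and the standard stop codons. The function name and the `min_codons` parameter are made up for this example.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) indices of forward-strand ORFs; end is exclusive and includes the stop codon.

    min_codons is the minimum number of codons before the stop codon, counting the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    sequence = "CCATGAAATTTGGGTAACC"  # hypothetical sequence
    for start, end in find_orfs(sequence):
        print(start, end, sequence[start:end])
```

Real gene-finding methods add statistical models of codon usage, both strands, and (in eukaryotes) splice sites, but the reading-frame scan is the common starting point.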
“Organisms function as integrated systems: our senses, our muscles, our metabolism and our mind work together as a connected whole. Biologists have traditionally studied organisms part by part and have enjoyed the modern power to study them molecule by molecule, gene by gene. ISM is dedicated to a new science, a critical science of the future, whose aim is to understand the integration of the parts that make up a biological system.”
David Baltimore (Nobel Laureate) President, Cal. Institute of Tech., Pasadena
Systems biology requires:
-Integration of biology, technology, computation and medicine
-a strong cross-disciplinary team of researchers.
-Institutes include scientists trained in biology, physics, chemistry, engineering, computing, mathematics, medicine, immunology, biochemistry, and genetics.
-They all speak the language of biology and are assembled into a multiplicity of teams that are attacking focused and important problems of systems biology.
Health Care in the 21st Century:
• Predictive (genetic makeup, protein markers)
• Preventive (probability of disease and response to treatment)
• Personalized (customized therapeutic drugs)
http://csbi.mit.edu:8080/infoglueDeliverWorking/
The MIT CSBI links biologists, computer scientists and engineers in a multi-disciplinary approach to the systematic analysis of complex biological phenomena.
http://www.sbml.org/Main_Page
The Systems Biology Markup Language (SBML) is a computer-readable format for representing models of biochemical reaction networks in software. It's applicable to models of metabolism, cell-signaling, and many others. SBML has been evolving since mid-2000 thanks to an international community of software developers and users. This website is the portal for the global SBML development effort; here you can find information about all aspects of SBML.
Manchester Centre for Integrative Systems Biology (MCISB)
• Experimental (Molecular Biology / Biochemistry / Biophysics), mathematical and computational (Modelling / Data Integration / Text Mining) approaches
• Development and exploitation of methods for the quantitative measurement of kinetic and binding constants on a genome-wide scale
• Combined approaches will lead to computer models of parts of living cells. Some of these 'silicon cells' are already available for in silico experimentation, through the Biomodels and JWS databases.
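To show how measured kinetic constants feed into a computer model, here is a minimal sketch of a single enzyme-catalysed reaction simulated with Michaelis-Menten kinetics and simple Euler integration. It is not taken from MCISB, Biomodels or JWS; the function name and all parameter values are hypothetical, and real models couple many such reactions and use better integrators.

```python
from typing import List, Tuple


def simulate_michaelis_menten(
    s0: float, vmax: float, km: float, dt: float = 0.01, steps: int = 1000
) -> List[Tuple[float, float, float]]:
    """Simulate S -> P with rate v = vmax * S / (km + S); returns (time, S, P) tuples."""
    substrate = s0
    product = 0.0
    trajectory = [(0.0, substrate, product)]
    for n in range(1, steps + 1):
        rate = vmax * substrate / (km + substrate)
        converted = min(rate * dt, substrate)  # never consume more substrate than is left
        substrate -= converted
        product += converted
        trajectory.append((n * dt, substrate, product))
    return trajectory


if __name__ == "__main__":
    # Hypothetical constants: 10 mM substrate, Vmax = 1 mM/s, Km = 2 mM.
    for time, s, p in simulate_michaelis_menten(10.0, 1.0, 2.0)[::250]:
        print(f"t = {time:5.2f} s  S = {s:6.3f} mM  P = {p:6.3f} mM")
```

Formats such as SBML exist so that models like this, with their species, reactions and constants, can be exchanged between simulation tools.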
VIRTUAL CELL
Acknowledgements:
My Group: Sunita Sardiwal, Syeed Hussain, Shreenal Patel, Christine Nunn
NMR: Paul Driscoll, Richard Harris, Mark Jeeves
ITC/CD: John Ladbury, Paul Leonard
Collaborators at RVC: Neil Stoker, Sharon Kendall, Farahnaz Moahedzadeh, Stuart Rison
UV-VIS Spectra: Peter Rich, Doug Marshall
Phosphorylation studies: Irina Tsaneva
EM: Helen Saibil, Nadav Elad