

Page 1: Bioinformatics MEDC601 Lecture by Brad Windle Ph# 628-1956 email: bwindle@vcu.edu Office: Massey Cancer Center, Goodwin Labs Room 319 Web site for lecture:

Bioinformatics

MEDC601

Lecture by Brad Windle
Ph# 628-1956
email: bwindle@vcu.edu
Office: Massey Cancer Center, Goodwin Labs, Room 319

Web site for lecture:

http://www.people.vcu.edu/~bwindle/Courses/MEDC601


Profile

A set of data or characteristics pertaining to an item

Profiles are sometimes referred to as Signatures or Fingerprints


Cellular Profiles

Gene Expression
Protein Expression
Misc Data
SNPs
DNA Methylation
Cell State
Drug Response
Metabolomics
Structural Genomic
Protein States
Disease
Gene/Protein Sequence
Protein Structure
Drug Structure


Cellular Profiles

Gene Expression
Protein Expression
Misc Data
SNPs
DNA Methylation
Drug Response
Metabolomics
Structural Genomic
Protein States
Gene/Protein Sequence
Protein Structure
Drug Structure


Profiles Have Two Sides

Genes / Samples   Sample 1   Sample 2   Sample 3   Sample 4
Gene 1                   1          2          3         32
Gene 2                   5          3         17         22
Gene 3                  23         65         21         23
Gene 4                   2          1          3          3

A row is a gene profile across samples, and a column is a sample profile across genes
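The two sides of a profile can be sketched in a few lines of Python. The numbers are the ones in the table on this slide; the function names `gene_profile` and `sample_profile` are illustrative choices, not part of the lecture.

```python
# Expression table from the slide: rows are genes, columns are samples.
genes = ["Gene 1", "Gene 2", "Gene 3", "Gene 4"]
samples = ["Sample 1", "Sample 2", "Sample 3", "Sample 4"]
table = [
    [1, 2, 3, 32],
    [5, 3, 17, 22],
    [23, 65, 21, 23],
    [2, 1, 3, 3],
]


def gene_profile(gene):
    """Return one row: a gene's values across all samples."""
    return table[genes.index(gene)]


def sample_profile(sample):
    """Return one column: a sample's values across all genes."""
    column = samples.index(sample)
    return [row[column] for row in table]


print(gene_profile("Gene 3"))      # [23, 65, 21, 23]
print(sample_profile("Sample 4"))  # [32, 22, 23, 3]
```

The same table therefore answers two different questions, depending on whether it is read by row or by column.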


Bioinformatics uses tools for learning from the Profiles

There are two basic forms of learning

Unsupervised Learning

Supervised Learning


Unsupervised Learning

Definition: Learning from observation
Exploration
Let the data reveal what you learn
You can learn what you did not expect
Allows you to formulate relevant hypotheses
It’s a hypothesis generator

Supervised Learning

Definition: Learning from example
It’s focused
Allows you to test relevant hypotheses, but it doesn’t usually allow you to prove the hypothesis
It often involves statistical or computational modeling
Usually requires experimental validation of what you learn


Supervised Learning

Examples of methods

QSAR Modeling, Prediction model

Classification Model, e.g., Is a patient a good candidate for a particular drug treatment?

Simulation Modeling, e.g., Cell simulation
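As a rough illustration of a classification model that learns from example, the sketch below uses a nearest-centroid rule. This is one simple choice among many and is not presented as the method used in the lecture. The patient profiles, labels, and marker genes are all invented for the example.

```python
import math

# Hypothetical training examples: expression of two marker genes per patient,
# labelled by whether the patient responded to the drug. Values are invented.
training = [
    ([8.0, 1.0], "responder"),
    ([7.0, 2.0], "responder"),
    ([9.0, 1.5], "responder"),
    ([2.0, 6.0], "non-responder"),
    ([1.0, 7.0], "non-responder"),
    ([3.0, 8.0], "non-responder"),
]


def centroids(examples):
    """Average the profiles of each class (the learning-from-example step)."""
    sums, counts = {}, {}
    for profile, label in examples:
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(profile, class_centroids):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(
        class_centroids,
        key=lambda label: math.dist(profile, class_centroids[label]),
    )


model = centroids(training)
print(classify([7.5, 2.5], model))  # responder
print(classify([2.5, 6.5], model))  # non-responder
```

A prediction like this is still a hypothesis about a new patient; as the previous slide notes, it usually needs experimental validation.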


Why can’t we observe the patterns unaided?

The patterns are too complex or abstract.

There’s too much data.

There’s too much noise.


Drug-related profiles

Drug profile based on structure
drug 1: e1, e2, e3, e4, e5, e6

Structure profile based on drugs
e1: drug 1, drug 2, drug 3, drug 4

How physical and chemical properties relate to biological or biochemical properties, such as cell killing or enzyme activity, is within the realm of QSAR
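A minimal QSAR-style sketch, assuming a single physicochemical descriptor and a straight-line relationship to activity, could look like the following. Real QSAR models typically use many descriptors and more careful validation. The descriptor, the activity values, and the function names are invented for illustration.

```python
# Hypothetical data: one physicochemical descriptor (for example logP) and a
# measured biological activity for five compounds. All numbers are invented.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(descriptor, activity)


def predict(value):
    """Predict activity for a new compound from its descriptor value."""
    return intercept + slope * value


print(f"activity = {intercept:.2f} + {slope:.2f} * descriptor")
print(f"predicted activity at descriptor 6.0: {predict(6.0):.2f}")
```

The fitted line is the prediction model: it turns a structural property of an untested compound into an expected biological activity.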


Cell profile based on various drug sensitivities
cell 1: drug 1, drug 2, drug 3, drug 4, drug 5, drug 6

Drug profile based on cell sensitivities
drug 1: cell 1, cell 2, cell 3, cell 4, cell 5, cell 6


COMPARE

NCI 60 cell lines profiled for sensitivity to drugs

How do drugs relate to each other based on cellular response?

How do cells relate to each other based on drug response?


A major tool in unsupervised learning is

Cluster Analysis

It evaluates how similar items are to each other
Do they have similar patterns within their profiles?
How relatively close the items are to each other
There are various ways to measure closeness
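The choice of closeness measure matters, because two measures can disagree about which items are neighbours. The sketch below compares Euclidean distance with a correlation-based distance (1 minus the Pearson coefficient). These are two common options, chosen here as assumptions; the profiles are invented.

```python
import math

# Hypothetical profiles (invented numbers). A and B share a pattern at
# different magnitudes; C is close to A in magnitude but has the reverse pattern.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],
    "C": [4.0, 3.0, 2.0, 1.0],
}


def euclidean(p, q):
    """Straight-line distance: sensitive to magnitude."""
    return math.dist(p, q)


def pearson(p, q):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    sd_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    sd_q = math.sqrt(sum((b - mean_q) ** 2 for b in q))
    return cov / (sd_p * sd_q)


def correlation_distance(p, q):
    """1 - r: sensitive to pattern, not magnitude."""
    return 1.0 - pearson(p, q)


def nearest(name, distance):
    """Return the other item that is closest to `name` under `distance`."""
    others = [other for other in profiles if other != name]
    return min(others, key=lambda other: distance(profiles[name], profiles[other]))


print(nearest("A", euclidean))             # C
print(nearest("A", correlation_distance))  # B
```

A clustering method groups items using distances like these, so the same data can yield different clusters under different measures.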


Cell response profile

Monks et al., Anti-Cancer Drug Design 12:553 (1997)


Cell line clusters based on drug response

Cell clusters correspond to cell type to a limited extent

Scherf et al., Nature Genetics 24:236 (2000)


Drug clusters correspond to drug targets or mechanisms of action, not necessarily drug structure.

Scherf et al., Nature Genetics 24:236 (2000)


Scherf et al., Nature Genetics 24:236 (2000)


COMPARE is a resource for exploring targets and mechanisms

Your compound of interest can be profiled and compared to the profiles for >70,000 compounds

Compounds with good matches may have known characteristics, such as target and mechanism, thus revealing a possible target and mechanism for your compound

http://dtp.nci.nih.gov/docs/compare/compare.html
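The matching idea can be sketched as a ranking of library compounds by how well their cell-line sensitivity profiles correlate with the query profile. This is an illustration only, assuming Pearson correlation as the match score; it does not call or reproduce the NCI COMPARE service. The compound names, profiles, and mechanism labels are all invented.

```python
import math


def pearson(p, q):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    sd_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    sd_q = math.sqrt(sum((b - mean_q) ** 2 for b in q))
    return cov / (sd_p * sd_q)


# Hypothetical sensitivity profiles across six cell lines (invented values).
library = {
    "compound_A": {"profile": [5.1, 6.3, 4.8, 7.0, 5.5, 6.1], "mechanism": "mechanism X"},
    "compound_B": {"profile": [6.9, 5.0, 6.8, 4.9, 6.5, 5.2], "mechanism": "mechanism Y"},
    "compound_C": {"profile": [5.6, 5.5, 5.7, 5.4, 5.6, 5.5], "mechanism": "unknown"},
}

# Hypothetical profile of the compound of interest across the same cell lines.
query = [5.0, 6.5, 4.6, 7.2, 5.4, 6.0]


def rank_matches(query_profile, compounds):
    """Return (correlation, name) pairs, best match first."""
    scored = [
        (pearson(query_profile, entry["profile"]), name)
        for name, entry in compounds.items()
    ]
    return sorted(scored, reverse=True)


for score, name in rank_matches(query, library):
    print(f"{name}: r = {score:.2f} ({library[name]['mechanism']})")
```

Here the best match is `compound_A`, so its annotated mechanism becomes a candidate mechanism for the query compound, to be checked experimentally.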


Wallqvist Study
Wallqvist et al., Molecular Cancer Therapeutics 1:311-320 (2002)

Found genes that correlated with drug sensitivity

Hypothesized that some of those genes’ proteins are the targets of the drugs that correlated with gene expression


Start with drug response, identify drugs based on correlation with genes
Drugs 1, 2, 3

Identify genes based on correlation with drugs 1, 2, 3
Genes 1, 2, 3

Identify the corresponding protein for each gene from the structural protein database (PDB)
Proteins 1, 2, 3

Identify small compounds (ligands) that have been fitted to proteins using 3D modeling
Ligands 1, 2, 3

Identify ligands with structural correlation with drugs


Compounds screened in NCI 60 cell lines

Genes that correlate with compounds in NCI 60 cell lines

Corresponding proteins in 3D structural database

Compounds that bind in silico to proteins

Compounds with structural similarity to ligands


The similarity between the ligand specific for the protein and the drug that correlates with expression of the protein (gene) suggests that the drug is targeting the protein (interacting)

calcium/calmodulin-dependent protein kinase I

protein kinase C

Wallqvist et al., Molecular Cancer Therapeutics 1:311-320 (2002)


alcohol dehydrogenase 5

Wallqvist et al., Molecular Cancer Therapeutics 1:311-320 (2002)


4 sets of data
DR
GEP
PDB
Ligand

Gene-Drug correlation
Gene to Protein translation
Protein-Ligand prediction
Ligand-Drug correlation
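The four links can be chained as a sequence of filters. The sketch below assumes each link has already been computed and stored as a lookup table. Every identifier, score, and threshold is a hypothetical placeholder and not a value from the Wallqvist study.

```python
# All identifiers, scores, and thresholds below are hypothetical placeholders.

# 1. Gene-drug correlation across the cell lines (precomputed, invented values).
gene_drug_correlation = {
    ("gene_1", "drug_1"): 0.82,
    ("gene_2", "drug_1"): 0.15,
    ("gene_2", "drug_2"): 0.77,
    ("gene_3", "drug_3"): 0.91,
}

# 2. Gene-to-protein translation: genes whose protein has a 3D structure.
gene_to_protein = {"gene_1": "protein_1", "gene_2": "protein_2"}

# 3. Protein-ligand prediction: ligands fitted to each protein by 3D modeling.
protein_to_ligands = {
    "protein_1": ["ligand_1"],
    "protein_2": ["ligand_2", "ligand_3"],
}

# 4. Ligand-drug structural similarity on a 0-to-1 scale.
ligand_drug_similarity = {
    ("ligand_1", "drug_1"): 0.88,
    ("ligand_2", "drug_2"): 0.30,
    ("ligand_3", "drug_2"): 0.45,
}


def candidate_targets(min_correlation=0.7, min_similarity=0.8):
    """Return (drug, protein, ligand) triples that survive all four links."""
    hits = []
    for (gene, drug), r in gene_drug_correlation.items():
        if r < min_correlation:
            continue  # expression does not track sensitivity to this drug
        protein = gene_to_protein.get(gene)
        if protein is None:
            continue  # no 3D structure available for this gene's protein
        for ligand in protein_to_ligands.get(protein, []):
            similarity = ligand_drug_similarity.get((ligand, drug), 0.0)
            if similarity >= min_similarity:
                hits.append((drug, protein, ligand))
    return hits


print(candidate_targets())  # [('drug_1', 'protein_1', 'ligand_1')]
```

Each surviving triple is a hypothesis that the drug targets the protein, supported by two independent lines of evidence: expression correlation and structural similarity to a known ligand.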