cmsc858b: computational systems biology and functional...


Page 1: CMSC858B: Computational systems biology and functional ...users.umiacs.umd.edu/~hcorrada/CMSC858B/lectures/lect1/CMSC858P_Lecture1.pdfCMSC858B: Computational systems biology and functional

CMSC858B: Computational systems biology and functional genomics

Héctor Corrada Bravo

Dept. of Computer Science

Center for Bioinformatics and Computational Biology

University of Maryland

University of Maryland, Spring 2012

Monday, January 30, 2012

• My website:

• http://www.cbcb.umd.edu/~hcorrada

• Class webpage:

• http://www.cbcb.umd.edu/~hcorrada/CMSC858B

• Mailing list (via Google Groups). You will receive an invitation by email. Let me know by next lecture if you don’t.

1. Name
2. email (@umd.edu)
3. Department and degree
4. Are you registered? (Y/N)
5. Relevant CS background
6. Relevant stats background
7. Relevant biology background
8. What do you hope to get out of this class?
9. (a) Favorite, and (b) least favorite CS/stats term/name/word/phrase. Why?
10. (a) Favorite, and (b) least favorite biology term/name/word/phrase. Why?

1. Name: Héctor Corrada Bravo
2. email (@umd.edu): [email protected]
3. Department and degree: CS, PhD
4. Are you registered? (Y/N): No
5. Relevant CS background: Machine Learning, Intro to Bioinformatics, Advanced Bioinformatics
6. Relevant stats background: PhD-level math stats sequence in stats dept.
7. Relevant biology background: undergrad bio course
8. What do you hope to get out of this class? a neat project with publishable results
9. (a) Favorite, and (b) least favorite CS/stats term/name/word/phrase. (a) pigeon-hole principle, (b) machine learning
10. (a) Favorite, and (b) least favorite biology term/name/word/phrase. (a) oligonucleotide, (b) mammalome

Course Introduction

Key terms

• DNA

• Chromosome

• Gene

• Transcription or Gene Expression

• Translation or Proteins

• Genome

• Base-pair (A T C G)

Why are my children such pigs?

Chromosomes

These are actually human, and from a Down syndrome patient.

What is Genomics?

• Each cell contains a complete copy of an organism's genome, or blueprint for all cellular structures and activities.

• The genome is distributed along chromosomes, which are made of compressed and entwined DNA.

• Cells are of many different types (e.g. blood, skin, nerve cells), but all can be traced back to a single cell, the fertilized egg.

• Genomics is the study of molecular information to understand natural human variation and disease.

DNA

Watson and Crick, 1953

DNA (deoxyribonucleic acid) is the molecule that stores the genetic information of a living organism.

DNA consists of two polymers made from four types of nucleotides: adenine (A), guanine (G), cytosine (C) and thymine (T).

Purines: A, G; Pyrimidines: C, T

The two polymers are complementary to each other and form a double-helix structure.

5’-ACCGTTCGACGGTAA-3’
   |||||||||||||||
3’-TGGCAAGCTGCCATT-5’
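The base-pairing rule on this slide (A pairs with T, C pairs with G) can be written as a short lookup. The sketch below is only an illustration in Python; the function names `complement` and `reverse_complement` are chosen here and do not come from the lecture.

```python
# Watson-Crick pairing: A<->T and C<->G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the paired base for every position, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5'->3' direction."""
    return complement(strand)[::-1]


top = "ACCGTTCGACGGTAA"             # 5'->3' strand shown on the slide
print(complement(top))              # partner strand, read 3'->5' as drawn on the slide
print(reverse_complement(top))      # the same partner strand, read 5'->3'
```

Because the two strands run in opposite directions, the partner strand is usually reported as the reverse complement, which is why both helpers are shown.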

Measurement

• For a small enough piece, we can measure the sequence of bases, referred to as sequencing

• Human Genome Project

Genome

TCAGTTGGAGCTGCTCCCCCACGGCCTCTCCTCACATTCCACGTCCTGTAGCTCTATGACCTCCACCTTTGAGTCCCTCCTCTCACACCTGACATGAAAAGGCACATGAGGATCCTCAAATACCCCGTGATCAGTCTCAGGGTAGCTCTCATAGCCTGGACAGGGCCCCCCTCGGGGGTTGCGCCCAGGTCCAGGCGGGGGATGCACAGCAACAGTCACCGAAGCAGAAGCCGTCACAGTGGTGATGGGCTGGCAGTAGCTGGGCACAGAGCTGCCCATGGCGGTGGACGTTGGGTTCCGAGGGTTGTGAGAACGGGCCCCACGGGGCCCTGAGCGGTCCCTATTGCTAGGGCCAGAATGCCCTTCAGTAGAAATTTCAAAAGCGTCTCTGCGCGGTCTGTAGGGGGGTGGCCGCAAGCCTTCTCTAGGGGGATCCCTTCGAGGCTGCTGGCCTTGCCGTCCAGGGGACAAGGAGCCAGAGTCCAGGTGGGGCTGTTGCCGAGGGGTCAAGGGAGGCTGATGTCTGGAGTCCGGATGGACCACCTGCAGAGGAGAGACATAGGTCAACACAGGGAGGTAGGATGGTGGTGATGTTCCACCCACAAAAGAAAACCTATTCCTTTAGAAACCTCCAGGATGTGAATCCTGCCTGCACCTGCACAGCTGGCTGGAGGCATATAGCCACTGCCCATAGATCTCAACTTACCCTCACAACCAACTGCCCCCAGGCCTAAGTTCTCTGCCTCAAAACTGCCAAGGCCTGGATAGCCAAGAGCCTGGGTGTCTTGGAAATATGCAACCATAAATAGTAGCTTTTAGAAGTATAAGGCTCCTGTTTCTGGGTCATATTAGTGTTGTTTTCACCTGTCCCCAGCCCTAAGCCAGGTGTGGCCAGAAGCAAATGTACTGTAAGAGCAGAGCAAAAACTTCCACACAGATAGTTCTGTTAGGCAATACATCTCTGCCTGACTATTAGGAATCTGGTTTCTGGGTCCTCTGTACAAAGCTCGGAGCAACACAGTGGCCACATCAATCAAAAGGACCGTGACCAACTTCAAAGTCGGTGAGCTTGTACCTATTTTTAGGCTCCTGCTGAACAGAACCAGATTCACACTACAGCTCAGCAGGGCATCGTCACGGGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTGGGGGGGGGGGGTGGACAGAGGACGGGGACACAATTCACTGGCCAGCCCTTCTCTCCTTCAAGGAAGGCTGCTCTAGCCTGGGACTGGAATACACATTTCCTGTAAACATGGTGGGGGCCTCAGGCAAGCCAGAGTTTTGGAGCCTTCCTTAACTCTTCAAGGTGAGCATCTTGACTTGGAGGGTGGGGGTGCGGGTAAGGAAGGAACCTGTGGACTCCTCCCTACAAGACAGAAAAGGAATAAGCCACGAAGACAATAACGATTTTTGTATCAAGCGTCCTCTCCCATTTCAGCTTACCTGACAATGAAATCAAATTCGGACCCTGCAAGCATCAGTACACCCAGCAGAGTGGACACAGCACCGTCCAGAACGGGAGCAAACATGTGCTCCAGAGCGAGCATAGCCCTGTGGTTCTTGTCCCCAATGGCTGTCAGAAAGGCCTGAACAAAGGAGAAAATTGACACGGTCACATTCTGGGTGTGGTAAAGTGCTCAGCTGTGTCTATACTTGGGTTTTGTAT…

Total amount of DNA in human genome: 3 x 10^9 base pairs (bp)

Why are these two different?

Differences explained by 1-10% difference in genome

Similarities explained by similar genes

Genes

[Figure: five regions, each labeled "Gene"]

Central Dogma

Measurement: Microarrays

• More on this later

What makes them different?

Much human variation is due to differences in [number illegible in source] million base pairs (0.1% of the genome), referred to as SNPs

Single Nucleotide Polymorphism (SNP)

Genomic DNA: TACATAGCCATCGGTANGTACTCAATGATGATA

SNP: position N is A or G
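A SNP is a single position at which two otherwise identical sequences differ. As a minimal sketch, assuming the two sequences are already aligned and of equal length, the differing positions can be listed by a position-by-position comparison. The function name `mismatch_positions` and the example template with one variable site marked `N` are chosen here for illustration.

```python
from typing import List


def mismatch_positions(seq_a: str, seq_b: str) -> List[int]:
    """Return the 0-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Template with one variable site marked N; the two alleles fill it with A or G.
template = "TACATAGCCATCGGTANGTACTCAATGATGATA"
allele_a = template.replace("N", "A")
allele_g = template.replace("N", "G")

print(mismatch_positions(allele_a, allele_g))   # the single SNP position
print(len(mismatch_positions(allele_a, allele_a)))  # identical sequences: no differences
```

Real genomes also differ by insertions and deletions, which this position-by-position comparison does not handle.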

How many basepair differences?

Liver

TTCGATTACGA

AAGCTAATGCT

Brain

TTCGATTACGA

AAGCTAATGCT

Epigenetics

http://nihroadmap.nih.gov/EPIGENOMICS/images/epigeneticmechanisms.jpg

Course Organization

1. Transcription: Measuring gene "activity"
2. Regulation: Analyzing transcription regulation
3. Genetics: Analyzing genotypes and their association with different phenotypes
4. Epigenetics: Measuring epigenetic profiles
5. Integration: How to put these data together and understand biological systems.

Individualized Medicine

Population Genomics

What Do They Measure?

Gene Expression Arrays

Exon Arrays

Measurements

DATA MATRIX

Rows: probes (genes), numbered 1, 2, ..., G

Columns: samples (individuals), numbered 1, 2, ..., N
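The layout of the data matrix (G probes as rows, N samples as columns) can be made concrete with a tiny example. All gene names, sample names and numbers below are made up for illustration; the helpers `gene_profile` and `sample_profile` are hypothetical names, not part of any library.

```python
# Rows are probes (genes); columns are samples (individuals).
# Every name and value here is invented for illustration.
genes = ["gene_1", "gene_2", "gene_3"]                       # G = 3 rows
samples = ["sample_1", "sample_2", "sample_3", "sample_4"]   # N = 4 columns
data = [
    [7.1, 6.9, 2.0, 2.2],   # gene_1 measured in each sample
    [3.0, 3.1, 3.2, 2.9],   # gene_2
    [1.5, 1.4, 8.0, 7.7],   # gene_3
]


def gene_profile(gene):
    """One row: the measurements of a single gene across all samples."""
    return data[genes.index(gene)]


def sample_profile(sample):
    """One column: the measurements of all genes in a single sample."""
    j = samples.index(sample)
    return [row[j] for row in data]


print(gene_profile("gene_2"))
print(sample_profile("sample_3"))
```

Clustering and classification of individuals work on the columns, while questions about a particular gene work on the rows.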


article

nature genetics • volume 30 • january 2002 41

MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia

Scott A. Armstrong1–4, Jane E. Staunton5, Lewis B. Silverman1,3,4, Rob Pieters6, Monique L. den Boer6, Mark D. Minden7, Stephen E. Sallan1,3,4, Eric S. Lander5, Todd R. Golub1,3,4,5* & Stanley J. Korsmeyer2,4,8*

*These authors contributed equally to this work.

Published online: 3 December 2001, DOI: 10.1038/ng765

Acute lymphoblastic leukemias carrying a chromosomal translocation involving the mixed-lineage leukemia gene (MLL, ALL1, HRX) have a particularly poor prognosis. Here we show that they have a characteristic, highly distinct gene expression profile that is consistent with an early hematopoietic progenitor expressing select multilineage markers and individual HOX genes. Clustering algorithms reveal that lymphoblastic leukemias with MLL translocations can clearly be separated from conventional acute lymphoblastic and acute myelogenous leukemias. We propose that they constitute a distinct disease, denoted here as MLL, and show that the differences in gene expression are robust enough to classify leukemias correctly as MLL, acute lymphoblastic leukemia or acute myelogenous leukemia. Establishing that MLL is a unique entity is critical, as it mandates the examination of selectively expressed genes for urgently needed molecular targets.

1Departments of Pediatric Oncology, 2Cancer Immunology and AIDS and 8Howard Hughes Medical Institute, Dana-Farber Cancer Institute, Boston, Massachusetts, USA. 3Division of Pediatric Hematology/Oncology, Children’s Hospital, Boston, Massachusetts, USA. 4Harvard Medical School, Boston, Massachusetts, USA. 5Whitehead Institute/Massachusetts Institute of Technology Center for Genome Research, Cambridge, Massachusetts, USA. 6Division of Pediatric Hematology/Oncology, Sophia Children’s Hospital, University of Rotterdam, The Netherlands. 7Princess Margaret Hospital, The University of Toronto, Ontario, Canada. Correspondence and requests for materials should be addressed to S.K. (e-mail: [email protected]) or T.G. (e-mail: [email protected]).

A subset of human acute leukemias with a decidedly unfavorable prognosis possess a chromosomal translocation involving the mixed-lineage leukemia gene (MLL, HRX, ALL1) on chromosome segment 11q23 (refs 1–4). The leukemic cells, which typically have a lymphoblastic morphology, have been classified as acute lymphoblastic leukemia (ALL). Unlike other types of childhood ALL, however, the presence of the MLL translocation in ALL often results in an early relapse after chemotherapy. As MLL translocations are typically found in infant leukemias and in chemotherapy-induced leukemia, it has remained uncertain whether host-related factors or tumor-intrinsic biological differences are responsible for poor survival.

Lymphoblastic leukemias with a rearranged MLL or germline MLL are similar in most morphological and histochemical characteristics. Immunophenotypic differences associated with lymphoblasts bearing an MLL translocation include a lack of the early lymphocyte antigen CD10 (ref. 5), expression of the proteoglycan NG2 (ref. 6) and a propensity to co-express the myeloid antigens CD15 and CD65 (ref. 5). This prompted the corresponding gene to be called mixed-lineage leukemia (ref. 1) and gave rise to models that remain largely unresolved, in which the leukemia reflects disordered cell-fate decisions or the transformation of a more multipotent progenitor.

Translocations in MLL result in the production of a chimeric protein in which the amino–terminal portion of MLL is fused to the carboxy–terminal portion of 1 of more than 20 fusion partners (ref. 7). This has led to models of leukemogenesis in which the MLL fusion protein either may confer gain of function or neomorphic properties or may interfere with normal MLL function (with the MLL translocation representing a dominant-negative gene). Moreover, mice heterozygous for Mll (Mll+/–) show developmental aberrations (refs 8,9), suggesting that the disruption of one allele by chromosomal translocation may also manifest itself as haplo-insufficiency in leukemic cells.

The MLL protein is a homeotic regulator that shares homology with Drosophila trithorax (trx) and positively regulates the maintenance of homeotic (Hox) gene expression during development (ref. 8). Studies of Mll-deficient mice indicate that Mll is required for proper segment identity in the axioskeletal system and also regulates hematopoiesis (ref. 9). As Mll normally regulates the expression of Hox genes, its role in leukemogenesis may include altered patterns of HOX gene expression. Much evidence shows that HOX genes are important for appropriate hematopoietic development (ref. 10). In addition, the t(7;11) (p15;p15) found in human acute myelogenous leukemia (AML) results in a fusion of HOXA9 to the nucleoporin NUP98 (refs 11,12). Thus, HOX genes represent one set of transcriptional targets that warrant assessment in leukemias with MLL translocation.

We considered that MLL translocations might maintain a gene expression program that results in a distinct form of leukemia and reasoned that RNA profiles might resolve whether leukemias bearing an MLL translocation represent a truly biphenotypic leukemia of mixed identity, a conventional B-cell precursor ALL with expression of limited myeloid genes, or a less committed hematopoietic progenitor cell. In addition, comparing gene expression profiles of lymphoblastic leukemias with and without rearranged MLL is important because of their markedly different response to standard ALL therapy and because such analysis may identify molecular targets for therapeutic approaches. The expression profiles reported here show that ALLs possessing a rearranged MLL have a highly uniform and distinct pattern that clearly distinguishes them from conventional ALL or AML and warrants designation as the distinct leukemia MLL.

© 2002 Nature Publishing Group http://genetics.nature.com

Results

MLL is distinct from conventional ALL

To further define the biological characteristics specified by MLL translocations, we compared the gene expression profiles of leukemic cells from individuals diagnosed with B-precursor ALL bearing an MLL translocation against those from individuals diagnosed with conventional B-precursor ALL that lack this translocation. Initially, we collected samples from 20 individuals with conventional childhood ALL (denoted ALL), 10 of which had a TEL/AML1 translocation. In addition, we collected samples from 17 individuals affected with the MLL translocation (denoted MLL). Details of the affected individuals and expression data are available online (Methods).

First, we determined whether there were genes among the 12,600 tested whose expression pattern correlated with the presence of an MLL translocation. We sorted the genes by their degree of correlation with the MLL/ALL distinction (Fig. 1) and used permutation testing to assess the statistical significance of the observed differences in gene expression (ref. 13). For the 37 samples tested, roughly 1,000 genes are underexpressed in MLL as compared with conventional ALL, and about 200 genes are relatively highly expressed (data not shown). Thus, MLL shows a gene expression profile markedly different from that of conventional ALL.
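Sorting genes by how strongly they track a two-class distinction can be sketched as follows. The excerpt does not state the exact statistic the authors used, so the signal-to-noise score below (difference of class means divided by the sum of class standard deviations) is an assumption made for illustration, and the gene names and expression values are invented.

```python
from statistics import mean, stdev


def signal_to_noise(values, labels, class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene; an assumed, illustrative score."""
    a = [v for v, lab in zip(values, labels) if lab == class_a]
    b = [v for v, lab in zip(values, labels) if lab == class_b]
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


# Invented toy data: three samples per class, three genes.
labels = ["ALL", "ALL", "ALL", "MLL", "MLL", "MLL"]
expression = {
    "gene_x": [9.0, 8.5, 9.5, 2.0, 2.5, 1.5],   # lower in MLL
    "gene_y": [5.0, 5.5, 4.5, 5.2, 4.8, 5.0],   # no clear difference
    "gene_z": [1.0, 1.5, 0.5, 7.0, 7.5, 6.5],   # higher in MLL
}

scores = {
    gene: signal_to_noise(values, labels, "MLL", "ALL")
    for gene, values in expression.items()
}
ranked = sorted(scores, key=scores.get)   # most underexpressed in MLL first

for gene in ranked:
    print(gene, round(scores[gene], 2))
```

Genes at the two ends of the ranking are the candidates for the "underexpressed" and "overexpressed" groups; whether a score is larger than expected by chance still needs a significance test such as permutation of the class labels.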

MLL shows multilineage gene expression

Inspection of the genes differentially expressed between MLL and ALL is instructive (Fig. 1). Many genes underexpressed in MLL have a function in early B-cell development. These include genes expressed in early B cells (refs 14,15), MME, CD24, CD22 and DNTT (mouse TdT); genes required for appropriate B-cell development (refs 16–19), TCF3, TCF4, POU2AF1 and LIG4; and SMARCA4 (mouse Snf2b), which is correlated with B-precursor ALL in an AML/ALL comparison (ref. 13) (Fig. 1 and Web Note A). Genes encoding certain adhesion molecules are relatively overexpressed in MLL, including LGALS1, ANXA1, ANXA2, CD44 and SPN.

Several genes that are expressed in hematopoietic lineages other than lymphocytes are also highly expressed in MLL. These include genes that are expressed in progenitors (refs 20–22), PROML1, FLT3 and LMO2; myeloid-specific genes (refs 23–25), CCNA1, SERPINB1, CAPG and RNASE3; and at least one natural killer cell–associated gene (ref. 26), the gene encoding NKG2D (Fig. 1 and Web Note A). Overexpression of HOXA9 and PRG1 in MLL is of particular interest, as these genes have been reported to be highly expressed in AML (ref. 13) and overexpression of HOXA9 has been associated with a poor prognosis (ref. 13).

Fig. 1 Genes that distinguish ALL from MLL. The 100 genes that are most highly correlated with the class distinction are shown. Each column represents a leukemia sample, and each row represents an individual gene. Expression levels are normalized for each gene, where the mean is 0, expression levels greater than the mean are shown in red and levels less than the mean are in blue. Increasing distance from the mean is represented by increasing color intensity. The top 50 genes are relatively underexpressed and the bottom 50 genes relatively overexpressed in MLL. Gene accession numbers and the gene symbols or DNA sequence names are labeled on the right. Individual samples are arranged such that column 1 corresponds to ALL patient 1, column 2 corresponds to ALL patient 2, and so on. Information about the samples along with the top 200 genes that make the ALL/MLL distinction and their accession numbers can be found on our web site (http://research.dfci.harvard.edu/korsmeyer/MLL.htm).
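The per-gene normalization described in the caption (mean 0 for each gene, red above the mean, blue below) can be sketched as below. The caption only states that the mean is 0; dividing by the standard deviation is an extra assumption made here so that genes on different scales become comparable, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def normalize_gene(values):
    """Center one gene's expression values at 0 and scale to unit standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:                      # a flat gene carries no signal
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def heatmap_color(z):
    """Color convention from the caption: above the mean is red, below is blue."""
    if z > 0:
        return "red"
    if z < 0:
        return "blue"
    return "white"


row = [2.0, 4.0, 6.0]               # one gene across three samples (made-up values)
z = normalize_gene(row)
print([heatmap_color(v) for v in z])
```

In a heat map the magnitude of each normalized value would set the color intensity, matching the caption's "increasing distance from the mean".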


Population Genomics

• Clustering: Group samples (individuals) that show similar gene expression profiles

• Classification: Discover gene expression profiles that distinguish two populations: e.g., cancer patients vs. healthy people

• Networks: Discover groups of genes whose expression behaves differently in two populations
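As one possible illustration of the classification task, a nearest-centroid classifier averages the expression profiles of each known group and assigns a new sample to the group whose average profile is closest. This is only a sketch of one simple method, not necessarily the one used later in the course; the labels and numbers are invented.

```python
from math import dist
from statistics import mean


def fit_centroids(profiles, labels):
    """Average the expression profiles of each class, gene by gene."""
    groups = {}
    for profile, label in zip(profiles, labels):
        groups.setdefault(label, []).append(profile)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in groups.items()
    }


def classify(profile, centroids):
    """Assign a profile to the class with the nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(profile, centroids[label]))


# Invented training data: each profile holds the expression of two genes.
train_profiles = [[8.0, 1.0], [9.0, 2.0], [2.0, 7.0], [1.0, 8.0]]
train_labels = ["cancer", "cancer", "healthy", "healthy"]

centroids = fit_centroids(train_profiles, train_labels)
print(classify([7.5, 2.5], centroids))
print(classify([1.5, 6.0], centroids))
```

With thousands of genes and few samples, a classifier like this can fit the training data well and still generalize poorly, which is one reason the statistical questions that follow matter.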


Why stats

• If we want to infer things about gene expression in populations, we need to do some statistics

• we want to see whether particular differences we observe are due to chance

• we want to make sure an experiment is set up so that the differences we see are the ones we care about

• we want to have a sense of how general our inferences are (overfitting)
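One way to ask whether an observed difference could be due to chance is a permutation test: shuffle the group labels many times and count how often a difference at least as large as the observed one appears. The sketch below assumes a difference in mean expression between two groups as the statistic; the function name, the number of permutations and the data are all illustrative choices.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often a random relabeling gives a mean difference
    at least as large as the one observed."""
    rng = random.Random(seed)
    n_a = len(group_a)

    def mean_gap(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    pooled = list(group_a) + list(group_b)
    observed = mean_gap(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                      # random reassignment of group labels
        if mean_gap(pooled) >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up expression values for one gene in two groups of five samples.
patients = [9.1, 8.7, 9.4, 9.0, 8.8]
controls = [2.1, 2.5, 1.9, 2.2, 2.4]
print(permutation_p_value(patients, controls))
```

A small value suggests the difference is unlikely under random labeling. When thousands of genes are tested at once, some small values are expected by chance alone, so the results need further adjustment.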


Statistical and ML methods

1. Statistical estimation and inference
2. Model building: supervised learning, semi-supervised learning
3. Clustering analysis (unsupervised learning)
4. Prediction and classification
   1. Sparse methods to deal with high-dimensionality
5. Graphical models
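To make the clustering item concrete, here is a minimal k-means sketch (Lloyd's algorithm) that groups samples without using any labels. The choice of k-means, the fixed starting centers and the toy points are assumptions made for illustration; the course may well cover other clustering methods.

```python
from math import dist
from statistics import mean


def kmeans(points, initial_centers, n_iter=10):
    """Alternate between assigning points to the nearest center
    and moving each center to the mean of its assigned points."""
    centers = [list(c) for c in initial_centers]
    assignment = []
    for _ in range(n_iter):
        assignment = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in points
        ]
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:             # leave a center in place if it has no members
                centers[k] = [mean(column) for column in zip(*members)]
    return assignment, centers


# Made-up samples described by two measurements each.
points = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
assignment, centers = kmeans(points, initial_centers=[points[0], points[3]])
print(assignment)
```

The result of k-means depends on the starting centers and on the number of clusters, both of which are fixed by hand in this sketch.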


A quasi-theme


Personal Genomics


Sequence Once, Read Often

Read what?

- genome
- variants
- methylation
- expression
- other genome features
- medical literature
- risk models
- population information
- ...

Personal Genomics

• We need to produce reliable genome measurements, but on a much bigger scale (Algorithmics, Systems)

• With multiple genome features, we need to decide which are relevant and significant (Information Retrieval, Data Management)

• Population-based science, interpreted individually (Machine Learning/Statistics, Privacy)

The plan

• The first two lectures are background and catch-up

1. Molecular biology

2. Statistical learning (ML)

• Required reading assignment for next lecture (1/27):

• Larry Hunter, “Life and its molecules”

• You can get it from the course website