PresentationMIEV0 - Aalborg Universitet · Alzheimer's disease (AD): the most common form of...


Post on 19-Jun-2020


Page 1: PresentationMIEV0 - Aalborg Universitet · Alzheimer's disease (AD) : the most common form of dementia 26.6 million people worldwide had Alzheimer's (2006) Increase of the number
Page 2:

30/08/09 2

Page 3:

Alzheimer's disease (AD): the most common form of dementia

26.6 million people worldwide had Alzheimer's (2006)

Number of patients worldwide expected to increase (×4 by 2050)

Interest of the biomedical community in discovering the genes involved in AD development

Page 4:

Study of gene expression: information on the synthesis of a functional gene product

Microarrays: to compare the expression of thousands of genes in different tissues, cells or conditions

Making biomedical sense of microarray data is a major challenge because of the large amounts of data

Importance of data mining for discovering previously unknown knowledge from huge volumes of data

Adaptations

Page 5:

Genes

Page 6:

Genes

Microarrays

Page 7:

Genes

Microarrays

Intensity (expression) of a gene measured by a microarray

Page 8:

Genes

Microarrays

Intensity (expression) of a gene measured by a microarray

High density: Affymetrix U133 Plus 2.0 Array, 54,675 probe sets

Page 9:

DNA microarray technologies

New knowledge

Online biological knowledge databases and bibliographical resources

Page 10:

Extracting biological significance from all these data remains very challenging

DNA microarray technologies

Online biological knowledge databases and bibliographical resources

New knowledge

Page 11:

  Collaboration between 

  Objectives: To provide a process which enables experts to interpret transcriptomic data  

Application: To decipher mechanisms of brain ageing and associated pathologies (Alzheimer's disease)

Data: Transcriptome of the temporal cortex of Microcebus murinus, Affymetrix microarrays

Page 12:

Page 13:

Data Mining techniques 

Sequential patterns 

Clustering and visualisation techniques 

Selected sequential patterns 

Interpretation techniques 

New biological knowledge

Page 14:

Data Mining techniques 

Sequential patterns 

Clustering and visualisation techniques 

Selected sequential patterns 

Interpretation techniques 

Page 15:

G2 has a lower expression than genes G1 and G5, whose expressions are close to each other and lower than that of G3

Microarray   Gene expression sequences
M1           <(G2)(G1 G5)(G3)(G4)>
M2           <(G2)(G1 G5)(G4)(G3)>
M3           <(G2)(G4)(G1 G5)(G3)>
M4           <(G2)(G3)(G1 G5)(G4)>

<(G2)(G1 G5)(G3)>
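A gene expression sequence of this kind can be sketched as a sort followed by a grouping step. The slides do not say how "close" expressions are defined, so the `tolerance` threshold, the function name `expression_sequence` and the intensity values below are assumptions made only for illustration.

```python
def expression_sequence(intensities, tolerance):
    """Order genes by increasing intensity and group together genes whose
    intensity is within `tolerance` of the previous gene in the ranking.

    `tolerance` is an assumed closeness criterion, not taken from the slides.
    """
    ranked = sorted(intensities.items(), key=lambda kv: kv[1])
    groups = []
    previous = None
    for gene, value in ranked:
        if previous is not None and value - previous <= tolerance:
            groups[-1].append(gene)   # close to the previous gene: same itemset
        else:
            groups.append([gene])     # clearly higher: start a new itemset
        previous = value
    return [tuple(sorted(group)) for group in groups]


# Hypothetical intensities for one microarray (illustrative values only).
m1 = {"G1": 5.0, "G2": 1.2, "G3": 7.9, "G4": 9.4, "G5": 5.1}
print(expression_sequence(m1, tolerance=0.2))
# [('G2',), ('G1', 'G5'), ('G3',), ('G4',)]
```

With these made-up values the result has the same shape as the sequence shown for M1.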

Page 16:

<(G2)(G1 G5)(G3)>

Sequence

Itemset

Item

Microarray   Gene expression sequences
M1           <(G2)(G1 G5)(G3)(G4)>
M2           <(G2)(G1 G5)(G4)(G3)>
M3           <(G2)(G4)(G1 G5)(G3)>
M4           <(G2)(G3)(G1 G5)(G4)>

Page 17:

<(G2)(G1 G5)(G3)>

Support: 3/4

Microarray   Gene expression sequences
M1           <(G2)(G1 G5)(G3)(G4)>
M2           <(G2)(G1 G5)(G4)(G3)>
M3           <(G2)(G4)(G1 G5)(G3)>
M4           <(G2)(G3)(G1 G5)(G4)>
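The support of 3/4 can be checked with a short sketch. It uses the usual definition of sequence containment (each itemset of the pattern is included in a distinct itemset of the sequence, in the same order); the function names `contains` and `support` are illustrative and do not come from the slides or from the cited algorithm.

```python
def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence`: every itemset of the
    pattern is a subset of a distinct itemset of the sequence, in order."""
    position = 0
    for itemset in pattern:
        # Greedily advance to the first itemset that includes the pattern itemset.
        while position < len(sequence) and not set(itemset) <= set(sequence[position]):
            position += 1
        if position == len(sequence):
            return False
        position += 1
    return True


def support(database, pattern):
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in database) / len(database)


database = [
    [("G2",), ("G1", "G5"), ("G3",), ("G4",)],  # M1
    [("G2",), ("G1", "G5"), ("G4",), ("G3",)],  # M2
    [("G2",), ("G4",), ("G1", "G5"), ("G3",)],  # M3
    [("G2",), ("G3",), ("G1", "G5"), ("G4",)],  # M4
]
pattern = [("G2",), ("G1", "G5"), ("G3",)]
print(support(database, pattern))  # 0.75
```

M4 is the only microarray that does not contain the pattern, because G3 is ranked before the (G1 G5) group there.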

Page 18:

<(G2)(G1 G5)(G3)>

Support: 3/4

DBSAP Algorithm [Salle et al., AIME 2009]

Microarray   Gene expression sequences
M1           <(G2)(G1 G5)(G3)(G4)>
M2           <(G2)(G1 G5)(G4)(G3)>
M3           <(G2)(G4)(G1 G5)(G3)>
M4           <(G2)(G3)(G1 G5)(G4)>

Page 19:

  Results: Discriminant sequential patterns for various supports  

 Frequent for a biological class (young adults)  

 Not frequent for the complementary class (aged animals) 
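The two conditions can be written as a simple filter. This is a minimal sketch, not the cited algorithm: the thresholds `min_support` and `max_support`, the choice of inclusive bounds, and the toy young/aged sequences are all assumptions made for illustration.

```python
def contains(sequence, pattern):
    """True if every itemset of `pattern` is included, in order, in a
    distinct itemset of `sequence`."""
    position = 0
    for itemset in pattern:
        while position < len(sequence) and not set(itemset) <= set(sequence[position]):
            position += 1
        if position == len(sequence):
            return False
        position += 1
    return True


def support(database, pattern):
    return sum(contains(seq, pattern) for seq in database) / len(database)


def is_discriminant(pattern, target, complement, min_support, max_support):
    """Frequent in the target class and not frequent in the complementary one.
    The inclusive comparisons are an assumption of this sketch."""
    return (support(target, pattern) >= min_support
            and support(complement, pattern) <= max_support)


# Hypothetical toy classes (not data from the study).
young = [
    [("G1",), ("G2", "G3")],
    [("G1",), ("G2", "G3")],
    [("G1",), ("G4",), ("G2", "G3")],
    [("G2", "G3"), ("G1",)],
]
aged = [
    [("G2", "G3"), ("G1",)],
    [("G2", "G3"), ("G1",)],
    [("G3",), ("G1",), ("G2",)],
    [("G1",), ("G2", "G3")],
]
pattern = [("G1",), ("G2", "G3")]
print(is_discriminant(pattern, young, aged, min_support=0.75, max_support=0.25))  # True
```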

Page 20:

  Results: Discriminant sequential patterns for various supports  

Frequent for a biological class (young adults)

Not frequent for the complementary class (aged animals)

Too numerous (between 100 and 185,240)

Difficult to interpret

Page 21:

Data Mining techniques 

Sequential patterns 

Clustering and visualisation techniques 

Selected sequential patterns 

Interpretation techniques 

New knowledge

Page 22:

  Similarity measure [Saneifar et al., AusDM’08]  

  Method of hierarchical clustering [Nin Guerero et al., AusDM’08] 

S75% = <(G1)(G2 G3)>

S’75% = <(G2 G3)(G1)>
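The slides cite a dedicated similarity measure but do not describe it. The sketch below is therefore a deliberately simple stand-in, not the cited measure: it multiplies the Jaccard overlap of the two gene sets by the fraction of shared gene pairs that keep the same relative order. The name `order_similarity` is hypothetical, and the sketch assumes each gene occurs at most once per pattern.

```python
from itertools import combinations


def positions(pattern):
    """Map each gene to the index of the itemset it belongs to."""
    return {gene: index for index, itemset in enumerate(pattern) for gene in itemset}


def relation(pos, a, b):
    """-1 if a is ranked before b, 0 if same itemset, +1 if after."""
    return (pos[a] > pos[b]) - (pos[a] < pos[b])


def order_similarity(p, q):
    """Stand-in similarity in [0, 1] between two sequential patterns."""
    pos_p, pos_q = positions(p), positions(q)
    shared = sorted(set(pos_p) & set(pos_q))
    union = set(pos_p) | set(pos_q)
    if not union:
        return 1.0
    jaccard = len(shared) / len(union)
    pairs = list(combinations(shared, 2))
    if not pairs:
        return jaccard
    agreeing = sum(relation(pos_p, a, b) == relation(pos_q, a, b) for a, b in pairs)
    return jaccard * agreeing / len(pairs)


s = [("G1",), ("G2", "G3")]
s_prime = [("G2", "G3"), ("G1",)]
print(order_similarity(s, s))        # 1.0
print(order_similarity(s, s_prime))  # one agreeing pair (G2, G3) out of three
```

The two patterns share all their genes but only the (G2, G3) pair keeps its relation, so the stand-in scores them well below 1. A distance such as `1 - order_similarity(p, q)` could then feed a hierarchical clustering step.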

Page 23:

  Similarity measure [Saneifar et al., AusDM’08]  

  Method of hierarchical clustering [Nin Guerero et al., CBMS’08] 

S75%,25% = <(G1)(G2 G3)>

S75%,25% = <(G2 G3)(G1)>

 Because of the quantity of patterns and the depth of the hierarchy, these results are not easily understandable and actionable by experts 

Page 24:

Collaboration with the PIKKO company

http://www.lirmm.fr/tatoo/spip.php?page=prototypes

Page 25:

 Results: S75=<(MRVI1)(PGAP1)(PLA2R1)(A2M)(GSK3B)> 

  Those proteins might be involved in signalling or metabolism 

  Some of them interfere with Alzheimer's disease cellular events 


Page 26:

 Results: S75=<(MRVI1)(PGAP1)(PLA2R1)(A2M)(GSK3B)> 

  Those proteins might be involved in signalling or metabolism 

  Some of them interfere with Alzheimer's disease cellular events 


What about the other knowledge available online? 

Page 27:

Data Mining techniques 

Sequential patterns 

Clustering and visualisation techniques 

Selected sequential patterns 

Interpretation techniques 

New knowledge

Page 28:

Example: Querying KEGG's Web services

To find all the genes associated with a particular gene in a pathway

To find all diseases related to a particular gene through a pathway

 …. 
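The first query type (genes that share a pathway with a given gene) could be sketched as below. The slides refer to the KEGG Web services available in 2009; this sketch substitutes the KEGG REST interface at rest.kegg.jp and its `link` operation, which is a swap made here and not the presenters' setup. The helper names are hypothetical, and the identifier `hsa:2932` is used on the assumption that it is the KEGG entry for human GSK3B.

```python
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"


def link_url(target_db, entry):
    """URL of a KEGG REST `link` request: entries of `target_db` linked to `entry`."""
    return f"{KEGG_REST}/link/{target_db}/{entry}"


def parse_link(text):
    """Parse tab-separated 'source<TAB>target' lines and return the targets."""
    targets = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) == 2:
            targets.append(fields[1])
    return targets


def fetch_links(target_db, entry):
    """Run one `link` request (needs network access)."""
    with urlopen(link_url(target_db, entry)) as response:
        return parse_link(response.read().decode("utf-8"))


def genes_sharing_pathway(gene, organism="hsa"):
    """Genes that appear in at least one pathway together with `gene`."""
    related = set()
    for pathway in fetch_links("pathway", gene):
        pathway_id = pathway.split(":", 1)[-1]  # 'path:hsa04010' -> 'hsa04010'
        related.update(fetch_links(organism, pathway_id))
    related.discard(gene)
    return related


# Example call, left commented out because it requires network access:
# genes_sharing_pathway("hsa:2932")
```

The disease query would follow the same two-step shape, with the second request targeting a disease database instead of the organism's genes.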


Page 29:

Objectives: validation and search for novelties

Search for documents associated with 1..n genes of a pattern

Identification of the synonyms of the genes in GO

With 2 genes, 73% of the queries return fewer than 15 documents
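Building one query per combination of n genes, with synonyms expanded, might look like the sketch below. The boolean query syntax, the function names and the synonym table are assumptions for illustration; the slides do not specify the search engine or the query format, and real synonyms would come from GO as stated above.

```python
from itertools import combinations


def gene_clause(gene, synonyms):
    """OR-clause over a gene name and its known synonyms."""
    names = [gene] + sorted(synonyms.get(gene, []))
    return "(" + " OR ".join(f'"{name}"' for name in names) + ")"


def build_queries(pattern, synonyms, size):
    """One AND-query per combination of `size` genes taken from the pattern."""
    genes = [gene for itemset in pattern for gene in itemset]
    return [
        " AND ".join(gene_clause(gene, synonyms) for gene in combo)
        for combo in combinations(genes, size)
    ]


pattern = [("G1",), ("G2", "G3")]
synonyms = {"G1": ["G1-alias"]}  # hypothetical synonym table
for query in build_queries(pattern, synonyms, size=2):
    print(query)
# ("G1" OR "G1-alias") AND ("G2")
# ("G1" OR "G1-alias") AND ("G3")
# ("G2") AND ("G3")
```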


S75%,25%=<(G1)(G2 G3)>  Texts 

Page 30:

Page 31:

Data Mining techniques 

Sequential patterns 

Clustering and visualisation techniques 

Selected sequential patterns 

Interpretation techniques 

New knowledge

Page 32:

By discovering biologically significant new knowledge from transcriptomic data, we pave the way for promising research in both data mining and biology.

Page 33:

Improve each step of this process

Other types of patterns (fuzzy patterns)

Other clustering methods and Treemap visualisation for the groups

Generalise these methods to other types of massive data (genomic data)

Use of sequential patterns for prediction tasks

Page 34: