A UMLS-Based System for Literature-Based Discovery in Medicine

Matteo Gabetta, Università di Pavia. MEDINFO, Copenhagen, August 21st 2013.

Page 1: A UMLS-Based System for Literature-Based Discovery in Medicine

UNIVERSITÀ DI PAVIA

A UMLS-Based System for Literature-Based Discovery in Medicine

Matteo Gabetta

MEDINFO, Copenhagen, August 21st 2013

Page 2: A UMLS-Based System for Literature-Based Discovery in Medicine

Angelo Nuzzo IIT@SEMM, Milan, 2011 | MEDINFO 2013 - Copenhagen, August 21st 2013 | Matteo Gabetta

Literature-Based Discovery (LBD)

Discover unknown relationships among pieces of existing scientific knowledge

Swanson DR: “Fish oil, Raynaud’s syndrome, and undiscovered public knowledge”. Perspectives in Biology and Medicine 1986, 30(1):7-18.

Page 3: A UMLS-Based System for Literature-Based Discovery in Medicine

Literature-Based Discovery

Swanson DR: “Fish oil, Raynaud’s syndrome, and undiscovered public knowledge”. Perspectives in Biology and Medicine 1986, 30(1):7-18.

• Methods of discovery: OPEN vs. CLOSED
• Sources of knowledge: abstract, full text, MeSH, …
• Knowledge representation: concepts, (groups of) words
• Knowledge extraction: text mining techniques
• Relationship measurement: citation frequency, association rules, …
• Process automation: user interaction level

Page 12: A UMLS-Based System for Literature-Based Discovery in Medicine

System characteristics

• Methods of discovery: OPEN discovery
• Sources of knowledge: abstracts
• Knowledge representation: UMLS concepts
• Knowledge extraction: text mining techniques
• Relationship measurement: Support/Confidence from association rule theory
• Process automation: highly interactive discovery process
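As a rough illustration of OPEN discovery over co-cited concepts, the sketch below starts from a concept A, collects the B concepts co-cited with it, and proposes C concepts that are co-cited with some B but never with A. It is written in Python for brevity (the system itself is Java), the function names and toy corpus are hypothetical, and it ignores the filtering and ranking steps the system applies.

```python
from collections import defaultdict


def cooccurring(concept, documents):
    """Map each concept co-cited with `concept` to the number of shared documents."""
    counts = defaultdict(int)
    for doc in documents:
        if concept in doc:
            for other in doc - {concept}:
                counts[other] += 1
    return dict(counts)


def open_discovery(a, documents):
    """A -> B -> C: return C concepts linked to A only through B concepts."""
    b_concepts = cooccurring(a, documents)
    candidates = defaultdict(set)
    for b in b_concepts:
        for c in cooccurring(b, documents):
            # Skip A itself and anything already directly co-cited with A.
            if c != a and c not in b_concepts:
                candidates[c].add(b)
    return {c: sorted(bs) for c, bs in candidates.items()}


# Toy corpus (hypothetical): each document is the set of concepts in one abstract.
docs = [
    {"Raynaud's syndrome", "blood viscosity"},
    {"blood viscosity", "fish oil"},
    {"Raynaud's syndrome", "platelet aggregation"},
    {"platelet aggregation", "fish oil"},
]
hypotheses = open_discovery("Raynaud's syndrome", docs)
print(hypotheses)
```

In the toy corpus, "fish oil" is never co-cited with the starting concept but is reached through both intermediate concepts, which mirrors the shape of Swanson's example.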

Page 19: A UMLS-Based System for Literature-Based Discovery in Medicine

System characteristics

Moreover:
• Co-cited UMLS concepts = related concepts
• Semantic Types used for filtering
• Literature-Mining Database as a persistence layer

Technologies:
• Java
• Entrez Programming Utilities – eUtils
• GWT – Google Web Toolkit
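To make the eUtils step concrete, here is a minimal sketch of how a date-restricted PubMed search URL can be built for the ESearch endpoint. It uses Python rather than the system's Java, only constructs the URL (no network call), and the helper name and default `retmax` are assumptions; how the actual system issues its queries is not shown in the slides.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of the Entrez Programming Utilities.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, mindate, maxdate, retmax=100):
    """Build an ESearch URL limited to a publication-date range (YYYY/MM)."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",  # filter on publication date
        "mindate": mindate,
        "maxdate": maxdate,
        "retmax": retmax,    # hypothetical default page size
    }
    return ESEARCH + "?" + urlencode(params)


url = esearch_url("dilated cardiomyopathy", "1982/04", "1999/11")
print(url)
```

The returned ID list would then be passed to EFetch to retrieve the abstracts for concept extraction.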

Page 21: A UMLS-Based System for Literature-Based Discovery in Medicine

System Workflow

Page 22: A UMLS-Based System for Literature-Based Discovery in Medicine

System Workflow (AB)

Page 23: A UMLS-Based System for Literature-Based Discovery in Medicine

System Workflow (BC)

Page 24: A UMLS-Based System for Literature-Based Discovery in Medicine

System Workflow (final)

Page 25: A UMLS-Based System for Literature-Based Discovery in Medicine

Support & Confidence
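The formulas on this slide did not survive in the transcript. As a reference point, the standard association-rule definitions for a rule A → B can be sketched as follows, assuming each abstract is reduced to a set of concepts. The function name and toy data are hypothetical, and the system may use a variant (for example, support as an absolute count rather than a fraction).

```python
def support_and_confidence(a, b, documents):
    """Standard association-rule measures for the rule a -> b.

    support    = fraction of all documents containing both a and b
    confidence = fraction of documents containing a that also contain b
    """
    n_a = sum(1 for doc in documents if a in doc)
    n_ab = sum(1 for doc in documents if a in doc and b in doc)
    support = n_ab / len(documents) if documents else 0.0
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


# Toy corpus (hypothetical): one set of concepts per abstract.
docs = [
    {"DCM", "heart failure"},
    {"DCM", "heart failure", "arrhythmia"},
    {"DCM", "arrhythmia"},
    {"heart failure"},
]
sup, conf = support_and_confidence("DCM", "heart failure", docs)
print(sup, conf)
```

Here two of the four abstracts contain both concepts (support 0.5), and two of the three abstracts mentioning DCM also mention heart failure (confidence 2/3).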

Page 27: A UMLS-Based System for Literature-Based Discovery in Medicine

The INHERITANCE project
Integrated Heart Research In Translational Genetics of Cardiomyopathies in Europe

• Dilated cardiomyopathies
• 3-year health research project
• Funded by the European Commission, Seventh Framework Programme (FP7)
• 11 European centers

Page 28: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation

“Re-discover” DCM/gene association

• Only literature prior to the first explicit DCM/gene association

TNNT2    TPM1    DES    LMNA
TTN      MYH7    DMD    MVCL
MYBPC3   ABCC9   DSP    PLN
ACTC     CLP     LDB3   SGCD

Page 30: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: idea

“Re-discover” DCM/gene association

• Only literature prior to the first explicit DCM/gene association

Angiology. 1975 Nov;26(10):723-33. The differential diagnosis of congestive cardiomyopathy and ischemic cardiomyopathy by echocardiography. Shors CM, et al.

Timeline: DCM (Nov 1975)

Page 31: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: idea

“Re-discover” DCM/gene association

• Only literature prior to the first explicit DCM/gene association

J Biol Chem. 1982 Apr 25;257(8):4328-32. Oligomeric structure of the major nuclear envelope protein lamin B. Shelton KR, et al.

Timeline: DCM (Nov 1975), LMNA (Apr 1982)

Page 32: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: idea

“Re-discover” DCM/gene association

• Only literature prior to the first explicit DCM/gene association

N Engl J Med. 1999 Dec 2;341(23):1715-24. Missense mutations in the rod domain of the lamin A/C gene as causes of dilated cardiomyopathy and conduction-system disease. Fatkin D, et al.

Timeline: DCM (Nov 1975), LMNA (Apr 1982), LMNA+DCM (Dec 1999)
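The timeline suggests how the literature window for one gene could be derived: start at the first paper on the gene and stop just before the first paper linking the gene to DCM. A minimal sketch, assuming month granularity and an end point one month before the explicit association (the helper name and this cut-off rule are assumptions, not taken from the slides):

```python
from datetime import date


def validation_window(first_gene_paper, first_explicit_association):
    """Return ((year, month), (year, month)) for the query window.

    The window ends in the month before the first explicit association,
    so that the associating paper itself is excluded.
    """
    y, m = first_explicit_association.year, first_explicit_association.month
    end = (y - 1, 12) if m == 1 else (y, m - 1)
    start = (first_gene_paper.year, first_gene_paper.month)
    return start, end


# Dates from the LMNA timeline: Shelton et al. (1982) and Fatkin et al. (1999).
window = validation_window(date(1982, 4, 25), date(1999, 12, 2))
print(window)
```

For LMNA this gives April 1982 to November 1999.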

Page 34: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: an example

• A string: “Dilated cardiomyopathy”
• A concept: “Cardiomyopathy, Dilated – (C0007193)”
• Query dates: (Apr 1982 – Nov 1999)
• Literature A obtained
• B concepts:
  o Semantic Type filter (21 types allowed)
  o Support & Confidence (greater than average)
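The B-concept selection above can be sketched as a two-stage filter: keep only concepts whose Semantic Type is allowed, then keep those whose Support and Confidence both exceed the average. The slide does not say whether both measures must pass or whether the averages are taken before or after the type filter, so both choices below are assumptions, as are the function name, the record layout, and the sample values.

```python
from statistics import mean


def select_b_concepts(candidates, allowed_types):
    """Keep type-allowed concepts with above-average support and confidence."""
    typed = [c for c in candidates if c["semantic_type"] in allowed_types]
    if not typed:
        return []
    # Assumption: averages are computed over the type-filtered candidates.
    avg_sup = mean(c["support"] for c in typed)
    avg_conf = mean(c["confidence"] for c in typed)
    return [
        c["name"]
        for c in typed
        if c["support"] > avg_sup and c["confidence"] > avg_conf
    ]


# Hypothetical candidates; the Semantic Type codes are UMLS TUIs.
candidates = [
    {"name": "Heart failure", "semantic_type": "T047", "support": 120, "confidence": 0.30},
    {"name": "Myocardium", "semantic_type": "T024", "support": 80, "confidence": 0.20},
    {"name": "Lamin", "semantic_type": "T116", "support": 10, "confidence": 0.05},
    {"name": "Review", "semantic_type": "T170", "support": 500, "confidence": 0.90},
]
allowed = {"T047", "T024", "T116"}  # stand-in for the 21 allowed types
selected = select_b_concepts(candidates, allowed)
print(selected)
```

The "Review" entry is dropped by the type filter despite its high counts, which is the kind of noise a Semantic Type filter is meant to remove.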

Page 40: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: an example

• Query dates: (Apr 1982 – Nov 1999)
• Literature B obtained
• C concepts:
  o One Semantic Type: “Gene or Genome – T028”

Is LMNA among the C concepts?
Evaluation of Support and Score
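The final check can be illustrated as follows: restrict the concepts extracted from literature B to the Semantic Type “Gene or Genome” (T028), then test whether the target gene is among them. The data structure (concept name mapped to its set of Semantic Type codes) and the sample concepts are hypothetical.

```python
GENE_OR_GENOME = "T028"  # UMLS Semantic Type "Gene or Genome"


def c_concepts(concepts, semantic_type=GENE_OR_GENOME):
    """Return the names of concepts tagged with the given Semantic Type."""
    return {name for name, types in concepts.items() if semantic_type in types}


# Hypothetical extraction result: concept name -> Semantic Type codes.
found = {
    "LMNA": {"T028"},
    "Lamin Type A": {"T116", "T123"},
    "DES": {"T028"},
    "Nuclear Envelope": {"T026"},
}
genes = c_concepts(found)
print("LMNA" in genes)
```

A concept can carry more than one Semantic Type, which is why the filter tests membership rather than equality.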

Page 41: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: results

Gene     First date   First date w/ DCM   B concepts      # Papers
TNNT2    1994 May     2000 Jan            Not Found       5
TTN      1975 Jan     1994 Oct            64              546
MYBPC3   1993 Feb     1997 Mar            Not Found       17
ACTC     1977 Feb     1998 May            98              1313
TPM1     1974 Jan     2000 Jan            Not Found       51
MYH7     1989 Feb     2000 Jan            Not Found       35
ABCC9    2001 Apr     2004 Apr            Not Found       9
CLP      1991 Sep     1997 Feb            Not Found       11
DES      1976 Dec     1990 Jan            82              943
DMD      1978 May     1990 Feb            35              290
DSP      1982 Jan     2000 Oct            189             313
LDB3     1993 Jan     2003 Dec            Not Found       14
LMNA     1983 Jan     1999 Dec            166             214
MVCL     1985 Jan     1997 Jan            Not Found       30
PLN      1975 Jan     1990 May            45              203
SGCD     1999 Aug     1999 Aug            Not Available   2

Page 45: A UMLS-Based System for Literature-Based Discovery in Medicine

Validation: results

Gene   Score    Support   Rank (Support)   Rank (Score)
TTN    26832    92        68/542           41/542
ACTC   203577   1025      7/662            6/662
DES    21598    150       11/349           8/349
DMD    15268    300       2/349            21/349
DSP    256598   1115      5/887            8/887
LMNA   252739   752       9/822            5/822
PLN    7906     47        69/380           75/380
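The rank columns read as "position / number of C concepts" when the C concepts are sorted by Support or by Score in descending order. A small sketch of that computation, with hypothetical concept names and values (the Score formula itself is not given in the transcript, so scores are treated as precomputed inputs):

```python
def rank_of(target, values):
    """Return (position, total) of `target` when sorted by value, highest first."""
    ordered = sorted(values, key=values.get, reverse=True)
    return ordered.index(target) + 1, len(ordered)


# Hypothetical C concepts with precomputed measures.
support = {"GENE_A": 300, "GENE_B": 150, "GENE_C": 47}
score = {"GENE_A": 15268, "GENE_B": 21598, "GENE_C": 7906}

pos, total = rank_of("GENE_B", support)
rank_support = f"{pos}/{total}"
pos, total = rank_of("GENE_B", score)
rank_score = f"{pos}/{total}"
print(rank_support, rank_score)
```

The same concept can rank differently under the two measures, which is what the table compares.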

Page 48: A UMLS-Based System for Literature-Based Discovery in Medicine

Discussion and Future Developments

• Effective in ranking DCM-related genes
• Heuristic score is a good alternative to Support
• Limitation: fails for C concepts with small literature
• Analyze in depth the “threshold problem”
• Practical comparison with other systems
• Improve effectiveness of the Text Mining system

Page 49: A UMLS-Based System for Literature-Based Discovery in Medicine

Discussion and Future Developments

• Effective in ranking DCM-related genes
• Heuristic score is a good alternative to Support
• Limitation: fails for C concepts with small literature
• Overcome the empirical set-up of some parameters
• Practical comparison with other systems
• Improve effectiveness of the Text Mining system

Page 50: A UMLS-Based System for Literature-Based Discovery in Medicine

Thank You.

In loving memory of Gilles Belley