supporting clinical trial data curation and integration with table mining
Post on 14-Apr-2017
Supporting clinical trial data curation and integration with table mining
Nikola Milosevic (1), Cassie Gregson (3), Robert Hernandez (3), Goran Nenadic (1,2)
(1) School of Computer Science, University of Manchester
(2) The Farr Institute @HeRC
(3) AstraZeneca
Clinical trial publications
• Around 800 000 clinical trials in PubMed
• Difficult to digest/search
• Text mining approaches
• But tables and figures are often not processed
Tables in publications
• Present factual information
• Usually:
  • Experimental settings (e.g. demographics)
  • Findings and results (e.g. DDI, side effects, adverse events…)
  • Background information (previous research, datasets, etc.)
• Examples
• Important information about trials
Extraction and curation of table data
Challenges
• Complex structure
• Table dimensionality (1, 2, multi-dimensional)
• Visual relationships
• Dense content
• Ambiguous short text
• Lack of context
• Acronyms and abbreviations
• Incomplete information
Table analysis overview
Table types (1)
• 4 types: list, matrix, super-row and multi-tables
• List table:
Table types (2)
• Matrix table
Table types (3)
• Super-row table
Table types (4)
• Multi-table
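To make the super-row type concrete, here is a minimal sketch. It assumes a table body is a list of rows of strings with empty strings for blank cells. The function name `find_super_rows`, the heuristic itself, and the toy values are all hypothetical illustrations, not the classifier used in this work.

```python
# Toy illustration of a super-row table: some rows act as group headers
# for the rows beneath them. The heuristic below is an assumption made
# for illustration only.

def find_super_rows(rows):
    """Return indices of rows whose only non-empty cell is the first one."""
    indices = []
    for i, row in enumerate(rows):
        if not row:
            continue  # skip completely empty rows
        first, rest = row[0], row[1:]
        if first.strip() and all(not cell.strip() for cell in rest):
            indices.append(i)
    return indices


# Hypothetical table body (header row omitted); all values are made up.
body = [
    ["Sex", "", ""],
    ["Male", "12", "15"],
    ["Female", "18", "14"],
    ["Smoking status", "", ""],
    ["Current", "5", "7"],
]

super_rows = find_super_rows(body)
```

In this toy body the rows "Sex" and "Smoking status" are flagged as super-rows, and the rows below each one would inherit it as extra context.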
Example of decomposition
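The figures for these slides are not preserved in the transcript. As a rough sketch of what decomposing a matrix table could look like, the snippet below turns each data cell into a record that carries its column header and row stub. The function `decompose_matrix`, the record keys, and the table values are assumptions for illustration, not the authors' implementation.

```python
# Sketch: decompose a simple matrix table (one header row, one stub column)
# into per-cell records. Names and values are hypothetical.

def decompose_matrix(table):
    """Return one dict per data cell, linked to its header and stub."""
    header, *body = table
    cells = []
    for r, row in enumerate(body, start=1):
        stub = row[0]  # first column labels the row
        for c, value in enumerate(row[1:], start=1):
            cells.append({
                "row": r,
                "col": c,
                "stub": stub,
                "header": header[c],
                "value": value,
            })
    return cells


# Made-up example table, not taken from any publication.
table = [
    ["Characteristic", "Drug", "Placebo"],
    ["Age, mean", "54.2", "53.8"],
    ["Patients, n", "120", "118"],
]

cells = decompose_matrix(table)
```

Each record is self-describing, so a value such as "54.2" can be read without the visual layout: it belongs to the "Drug" column and the "Age, mean" row.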
Results
Next steps
• Add semantic annotations
• Link patterns in data cells with their meaning
• Build/expand knowledge bases
• Relate to existing knowledge on the semantic web
Annotation schema
• Meta-data
• Paper (name, abstract, authors, publisher)
• Authors (names, emails, affiliations)
• Table (caption, footers)
• Cells (content, role)
• Inter-cell relationships
• Semantics (links to ontologies, dictionaries, knowledge bases)
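One way to picture the schema is as a set of nested records. The sketch below uses Python dataclasses. The field names follow the bullets above, but the class layout, the role labels, and the example values are assumptions, not the actual storage format of the system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Author:
    name: str
    email: Optional[str] = None
    affiliation: Optional[str] = None


@dataclass
class Cell:
    row: int
    col: int
    content: str
    role: str  # assumed labels, e.g. "header", "stub", "super-row", "data"
    semantic_links: List[str] = field(default_factory=list)  # ontology/KB ids


@dataclass
class CellRelation:
    source: Tuple[int, int]  # (row, col) of e.g. a header cell
    target: Tuple[int, int]  # (row, col) of the cell it describes
    kind: str                # assumed label, e.g. "header-of"


@dataclass
class Table:
    caption: str
    footers: List[str] = field(default_factory=list)
    cells: List[Cell] = field(default_factory=list)
    relations: List[CellRelation] = field(default_factory=list)


@dataclass
class Paper:
    name: str
    abstract: str
    publisher: str
    authors: List[Author] = field(default_factory=list)
    tables: List[Table] = field(default_factory=list)


# Hypothetical usage with placeholder values.
header = Cell(row=0, col=1, content="Placebo", role="header")
value = Cell(row=1, col=1, content="118", role="data")
table = Table(
    caption="Baseline characteristics",
    cells=[header, value],
    relations=[CellRelation(source=(0, 1), target=(1, 1), kind="header-of")],
)
paper = Paper(name="Example trial", abstract="", publisher="", tables=[table])
```

Keeping inter-cell relationships as explicit records, rather than inferring them from positions each time, means the visual structure only has to be resolved once, during decomposition.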
Summary
• Tables contain valuable information such as settings or results
• System for extraction and curation of table data
• Decomposition and annotation of the tables
• Accuracy of 85%
• Semantic analysis and information extraction
nikola.milosevic@manchester.ac.uk