
Post on 21-Dec-2015


Information Extraction from Radiology Reports: System Design and Implementation

• Information Model

• System Architecture – UIMA

• Automatic Report Segmentation

• NER

• Negation Discovery

• Coreference Resolution, Relationship Discovery, Inference

Information Schemas

DICOM SR, AIM (Annotation and Image Markup)

Coding Scheme - Code Value - Description

UMLS-2008AA, C0006141, Breast [Body Part, Organ, or Organ Component].
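The coding scheme / code value / description triple above can be held in a small record type. The following is only a sketch: the class name `CodedEntry`, its field names, and the helper `parse_coded_entry` are hypothetical and are not part of DICOM SR, AIM, or UMLS tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedEntry:
    """Hypothetical container for one coded term (names are assumptions)."""
    coding_scheme: str  # e.g. "UMLS-2008AA"
    code_value: str     # e.g. "C0006141"
    description: str    # human-readable meaning of the code


def parse_coded_entry(line: str) -> CodedEntry:
    # Split on the first two commas only: the description may contain commas.
    scheme, code, description = (part.strip() for part in line.split(",", 2))
    return CodedEntry(scheme, code, description.rstrip("."))


entry = parse_coded_entry(
    "UMLS-2008AA,C0006141, Breast [Body Part, Organ, or Organ Component]."
)
print(entry.coding_scheme, entry.code_value, entry.description)
```

Keeping the scheme name next to the code value means the same record shape can carry codes from other vocabularies without ambiguity.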

Overall System Design

Automatic Report Segmentation

An SVM classifier achieved an accuracy of over 0.9.

NER

• Image ROI and Image Referents Discovery

• Imaging Observations and Characteristics

• Imaging Procedure

• Body Parts and Organs

• Findings and Abnormalities

• Persons, Dates, Times

Image and Image ROI Referents

NER – OBA and MetaMap

Negation Discovery

The NegEx Algorithm

• A rule-based system for the discovery of negation of findings and diseases in discharge summaries.

• A list of 35 negation phrases: negations preceding a term (e.g. no signs of, no evidence of, negative for), negations following a term (e.g. declined, unlikely), and what the authors call 'pseudo-negations', false negation triggers such as double negatives or ambiguous negations (e.g. not necessarily, not rule out, not certain whether).

• A 5-token window around the negation cue
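The three ingredients above (trigger lists, pseudo-negations, and a token window) can be sketched as follows. This is a simplified illustration, not the NegEx implementation: the trigger lists contain only the examples from this slide, the bare trigger "not" is added as an assumption to show how pseudo-negations block it, the window is interpreted as the maximum number of tokens between trigger and term, and the tokenizer is deliberately naive.

```python
import re

# Small illustrative subsets; the real list has 35 phrases.
PRE_NEGATION = ["no signs of", "no evidence of", "negative for", "not"]  # "not" is an assumption
POST_NEGATION = ["declined", "unlikely"]
PSEUDO_NEGATION = ["not necessarily", "not rule out", "not certain whether"]
WINDOW = 5  # maximum number of tokens allowed between trigger and term


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def find_phrase(tokens, phrase):
    """Return every start index at which the phrase occurs in tokens."""
    p = tokenize(phrase)
    return [i for i in range(len(tokens) - len(p) + 1) if tokens[i:i + len(p)] == p]


def is_negated(sentence, term):
    tokens = tokenize(sentence)
    term_len = len(tokenize(term))
    term_starts = find_phrase(tokens, term)
    # Mask pseudo-negations first so their words cannot act as triggers.
    masked = list(tokens)
    for phrase in PSEUDO_NEGATION:
        n = len(tokenize(phrase))
        for i in find_phrase(tokens, phrase):
            masked[i:i + n] = ["<pseudo>"] * n
    for start in term_starts:
        end = start + term_len
        # Trigger before the term: gap counts tokens between trigger and term.
        for phrase in PRE_NEGATION:
            n = len(tokenize(phrase))
            if any(0 <= start - (i + n) <= WINDOW for i in find_phrase(masked, phrase)):
                return True
        # Trigger after the term.
        for phrase in POST_NEGATION:
            if any(0 <= i - end <= WINDOW for i in find_phrase(masked, phrase)):
                return True
    return False


print(is_negated("There is no evidence of pneumothorax.", "pneumothorax"))  # True
print(is_negated("We can not rule out pneumonia.", "pneumonia"))            # False
```

Masking pseudo-negations before looking for triggers is what keeps "not rule out" from being read as a negation of the finding that follows it.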

BioScope

A corpus annotated for negation and speculation cues and their scopes. It covers clinical free text (radiology reports), biological full papers, and biological paper abstracts from the GENIA corpus.

Minimal retrocardiac opacity, <xcope id="X382.1.1"> <cue type="speculation" ref="X382.1.1">likely</cue> atelectasis</xcope>.

Normal chest x-ray <xcope id="X394.1.1"> <cue type="negation" ref="X394.1.1"> without</cue> radiographic evidence of residual bronchopulmonary dysplasia</xcope>.
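Annotations in this style can be read with a standard XML parser. The sketch below assumes each annotated sentence is available as a string and wraps it in a made-up `<sentence>` root element so that it parses on its own; the function name `extract_scopes` and the output dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_scopes(annotated_sentence):
    """Return the cue type, cue text and scope text for each <xcope> element."""
    # Wrap in a root element (an assumption) because the fragment has mixed content.
    root = ET.fromstring(f"<sentence>{annotated_sentence}</sentence>")
    results = []
    for xcope in root.iter("xcope"):
        # Join all text inside the scope and normalise whitespace.
        scope_text = " ".join("".join(xcope.itertext()).split())
        for cue in xcope.iter("cue"):
            # A cue belongs to the scope whose id it references.
            if cue.get("ref") == xcope.get("id"):
                results.append({
                    "type": cue.get("type"),
                    "cue": (cue.text or "").strip(),
                    "scope": scope_text,
                })
    return results


example = (
    'Minimal retrocardiac opacity, <xcope id="X382.1.1"> '
    '<cue type="speculation" ref="X382.1.1">likely</cue> atelectasis</xcope>.'
)
scopes = extract_scopes(example)
print(scopes)
```

For the first example sentence this yields one speculation cue, "likely", whose scope is "likely atelectasis".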

Coreference Resolution

Coreference resolution is the process of determining whether two expressions in natural language refer to the same entity in the world.

The largest lymph node is inferiorly positioned in the level IV and measures 29 mm in diameter. Just superior to this, there is a necrotic lymph node measuring 16 mm in size.

Relationship Discovery

The goal of relationship extraction is to detect occurrences of a prespecified type of relationship between a pair of entities of given types.

• Associations between an imaging observation and imaging observation characteristics.

• Between an imaging observation and a body part or organ.

• Between imaging observations/characteristics and inferred diagnosis.

• Spatial relationships.
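Because the relationship types are prespecified per pair of entity types, a first step can be to enumerate only those entity pairs whose types match the schema. The sketch below shows candidate generation only, not a relationship classifier, and leaves out spatial relationships. The entity type names, the relation labels (`has_characteristic`, `located_in`, `suggests`), and the hand-labelled example entities are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Entity:
    text: str
    type: str


# Hypothetical schema: (head type, tail type) -> relation label.
RELATION_SCHEMA = {
    ("ImagingObservation", "ImagingObservationCharacteristic"): "has_characteristic",
    ("ImagingObservation", "BodyPart"): "located_in",
    ("ImagingObservation", "Diagnosis"): "suggests",
    ("ImagingObservationCharacteristic", "Diagnosis"): "suggests",
}


def candidate_relations(entities):
    """List (head, label, tail) triples for every ordered pair the schema allows."""
    candidates = []
    for head, tail in permutations(entities, 2):
        label = RELATION_SCHEMA.get((head.type, tail.type))
        if label is not None:
            candidates.append((head.text, label, tail.text))
    return candidates


# Entities hand-labelled from a sentence like the lymph node example above.
entities = [
    Entity("lymph node", "ImagingObservation"),
    Entity("necrotic", "ImagingObservationCharacteristic"),
    Entity("level IV", "BodyPart"),
]
candidates = candidate_relations(entities)
print(candidates)
```

A downstream step would still have to decide which candidates actually hold in the text; the schema only rules out pairs whose types can never be related.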

Inference Module

A domain-specific inference module that can fill in gaps in the relationships between the named entities present in the report.

The inference module will also be used to validate the output of both named entity and relationship discovery modules.

Q/A