![Page 1: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/1.jpg)
Information Extraction From Medical Records
by Alexander Barsky
![Page 2: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/2.jpg)
Current Methodology:
A broad assessment of the patient appears at the beginning of the chart, with references to more specific areas. The specific divisions follow the broad assessment. Records are listed in chronological order of activity.
![Page 3: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/3.jpg)
Chart Example:
.
![Page 4: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/4.jpg)
Problem:
A patient's medical chart is highly detailed and complex. Any attempt to quickly locate specific information will be met with frustration.
![Page 5: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/5.jpg)
Example:
.
![Page 6: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/6.jpg)
Solution:
Create a system that extracts the wanted information based on a predefined set of parameters. Example: "Hormonal imbalance during puberty". Retrieve all references to hormonal imbalances, but only those recorded between two specific points in time in the medical chart.
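As a rough illustration of the query described above, the following Python sketch filters chart entries by a keyword and a date range. The `ChartEntry` structure, the `find_mentions` helper, and the sample notes are all invented for this example; they are not part of the system described in these slides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChartEntry:
    """One dated note in a chart (hypothetical structure)."""
    when: date
    text: str


def find_mentions(entries, keyword, start, end):
    """Return entries that mention `keyword` and are dated within [start, end]."""
    needle = keyword.lower()
    return [e for e in entries
            if start <= e.when <= end and needle in e.text.lower()]


# Invented sample data, for illustration only.
chart = [
    ChartEntry(date(1998, 3, 2), "Routine check-up, no complaints."),
    ChartEntry(date(2003, 6, 14), "Hormonal imbalance suspected; labs ordered."),
    ChartEntry(date(2004, 1, 9), "Follow-up: hormonal imbalance resolving."),
    ChartEntry(date(2015, 11, 30), "Hormonal imbalance noted after surgery."),
]

hits = find_mentions(chart, "hormonal imbalance",
                     date(2002, 1, 1), date(2006, 12, 31))
for entry in hits:
    print(entry.when.isoformat(), entry.text)
```

A plain substring match like this misses paraphrases and misspellings, which is one reason rule-based annotation is used instead.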
![Page 7: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/7.jpg)
Tools at our disposal:
JAPE: Java Annotation Patterns Engine. Use: pattern matching and semantic extraction.
GATE: General Architecture for Text Engineering. Use: information extraction, document annotation, and XML output.
C#: Visual C# WinForms. Use: medium for conversion between XML and .csv file formats.
![Page 8: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/8.jpg)
Solution Methodology:
1. Create corpus of documents in GATE.
2. Introduce rules for information extraction.
3. Annotate documents in corpus.
4. Output annotated documents in XML.
5. Strip file of unnecessary elements and convert to .csv.
![Page 9: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/9.jpg)
![Page 10: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/10.jpg)
ANNIE
A Nearly-New Information Extraction System
- Tokeniser - splits text into simple tokens.
- Gazetteer - identifies entity names contained in lists.
- Sentence Splitter - splits text into sentences based on lists.
- Part-of-Speech Tagger - tags each token with its part of speech (POS).
- Coreference Matcher - identifies relationships between previously defined entities.
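To make the first three components concrete, here is a toy Python sketch of a tokeniser, a list-based sentence splitter, and a gazetteer lookup. It is not how ANNIE is implemented; the word lists, function names, and sample note are assumptions made for illustration.

```python
import re

# Hypothetical gazetteer list; a real one would be far larger.
GAZETTEER = {"insulin": "Hormone", "estrogen": "Hormone", "diabetes": "Condition"}
# Abbreviations whose trailing period should not end a sentence.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "vs."}


def tokenise(text):
    """Split text into word/number tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_sentences(text):
    """Split on ., ! or ? unless the word is a listed abbreviation."""
    sentences, current = [], []
    for word in text.split():
        current.append(word)
        if word[-1] in ".!?" and word.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


def lookup(tokens):
    """Return (token, type) pairs for tokens found in the gazetteer."""
    return [(t, GAZETTEER[t.lower()]) for t in tokens if t.lower() in GAZETTEER]


note = "Dr. Smith reviewed the labs. Estrogen was low and insulin was normal."
for sentence in split_sentences(note):
    print(sentence, "->", lookup(tokenise(sentence)))
```

The abbreviation list is what keeps "Dr." from ending the first sentence early, which shows in miniature why the splitter depends on lists.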
![Page 11: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/11.jpg)
Success in information extraction depends on integrating most, if not all, ANNIE components.
![Page 12: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/12.jpg)
JAPE : Key to Extraction
-
![Page 13: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/13.jpg)
JAPE Example
-
![Page 14: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/14.jpg)
XML Output:
-
![Page 15: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/15.jpg)
Problem: Too much unorganized information.
Solution:
XSLT to the rescue!
XSLT - Extensible Stylesheet Language Transformations - Add specific rules to separate needed from unnecessary information.
![Page 16: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/16.jpg)
XSLT Example
- Find all the <Lookup> nodes. Output the string between the tags.
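The same selection can be sketched in Python, using the standard library's `xml.etree.ElementTree` in place of an XSLT stylesheet. The sample document, its attribute names, and the `lookup_strings` helper are invented for illustration and only loosely resemble annotated output.

```python
import xml.etree.ElementTree as ET

# Invented sample; tag layout and attribute names are assumptions.
SAMPLE = """<doc>
  <Sentence>Patient shows <Lookup majorType="condition">hormonal imbalance</Lookup>
  treated with <Lookup majorType="drug">estrogen</Lookup>.</Sentence>
</doc>"""


def lookup_strings(xml_text):
    """Return the text enclosed by every <Lookup> element, in document order."""
    root = ET.fromstring(xml_text)
    return ["".join(node.itertext()).strip() for node in root.iter("Lookup")]


print(lookup_strings(SAMPLE))
```

Everything outside the `<Lookup>` tags is dropped, which is the "separate needed from unnecessary" step in miniature.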
![Page 17: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/17.jpg)
CSV File Type
Comma-Separated Values - Used to present information in tabular form. Useful for analyzing large amounts of data in an easy-to-understand format. The most common program used to open it is Excel.
.
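A minimal sketch of writing extracted values to CSV with Python's standard `csv` module follows. The column names and rows are invented for this example, and the project itself performs this step in C#, so treat this only as an illustration of the file format.

```python
import csv
import io


def to_csv(rows, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented rows, for illustration only.
rows = [
    {"type": "condition", "text": "hormonal imbalance"},
    {"type": "drug", "text": "estrogen, oral"},
]
print(to_csv(rows, ["type", "text"]))
```

Note that a value containing a comma is wrapped in quotes, so it still occupies a single column when the file is opened in a spreadsheet.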
![Page 18: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/18.jpg)
Potential Problem:
Regardless of how well the ANNIE tools are used and how well the JAPE rules are defined, recall will never reach 100 percent.
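For reference, recall is the share of relevant items that were actually extracted, and precision is the share of extracted items that are relevant. The short Python sketch below computes both; the note identifiers are invented for illustration.

```python
def precision_recall(extracted, relevant):
    """Return (precision, recall) for a set of extracted items
    measured against the set of truly relevant items."""
    extracted, relevant = set(extracted), set(relevant)
    true_positives = len(extracted & relevant)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented identifiers: four relevant notes, three extracted, two correct.
gold = {"note-3", "note-7", "note-9", "note-12"}
found = {"note-3", "note-7", "note-15"}
p, r = precision_recall(found, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```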
![Page 19: Information Extraction From Medical Records](https://reader036.vdocuments.site/reader036/viewer/2022070400/568134c2550346895d9be56e/html5/thumbnails/19.jpg)
Solution: Machine Learning
Machine learning is our best chance to increase the precision of the output. Training a computer to recognize commonly used reporting phraseology will organize extraction better, with more precise, concise outputs. Luckily for us, GATE includes plugins for machine learning.