Semantic MediaWiki (SMW) for Scientific Literature Management
Bahar Sateli, Rene Witte
Semantic Software Lab, Department of Computer Science and Software Engineering
Concordia University, Montreal
SMWCon Spring 2014
SMW for Scientific Literature Management 1 / 13
Introduction Background System Design Application Conclusion
Outline
1 Introduction
2 Background
3 System Design
4 Application
5 Conclusion
Motivation
- Abundance of publications leads to bottlenecks in curating literature
- Existing bibliography management systems have limited content analysis support
- We need an environment that can encompass various activities of a researcher
- We hypothesize that our tool can improve knowledge-intensive literature analysis tasks through a novel collaboration pattern between humans and AI assistants.
We envision a collaborative, wiki-based solution for the semantic management of research literature that integrates:
- a web-based interface
- semantic knowledge representation
- text mining for automatic content analysis
Related Work
- Post-publication semantic analysis
  - WikiPapers [1] uses Semantic Forms to collect literature focused on wiki research
  - AcaWiki [2] is designed to collect summaries and literature reviews of peer-reviewed academic research
- Pre-publication semantic enrichment
  - The SALT (Groza et al., 2007) framework uses custom LaTeX commands with explicit semantics
- Our work is complementary to these efforts: we generate bibliographical and semantic entities using human-AI collaboration
- We transform papers into queryable artifacts, while remaining amenable to human-created semantic annotation
[1] WikiPapers, http://wikipapers.referata.com
[2] AcaWiki, http://www.acawiki.org
Natural Language Processing (NLP)
- A branch of AI that uses various techniques to process content written in natural language
- A multitude of NLP techniques exist, e.g.,
  - Named Entity Recognition (e.g., finding Persons, Organizations, etc.)
  - Quality Assessment
  - Summarization
- Various NLP APIs (e.g., OpenCalais, GATE, ...)
- Semantic Assistants framework
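[Figure: Semantic Assistants framework. A client (e.g., a word processor) calls an NLP service on the server with parameters and receives an NLP service result. The server hosts NLP Service 1, NLP Service 2, ..., NLP Service n (e.g., Focused Summarization).]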
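The client/server pattern in the figure can be sketched in a few lines of Python. This is only an illustration of calling different NLP services through one entry point; it is not the Semantic Assistants API. The registry, the service names (`person-finder`, `summarizer`), and the toy regex-based "analysis" are all assumptions made for this sketch.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServiceResult:
    """Result returned to the client: which service ran and what it found."""
    service: str
    annotations: List[str] = field(default_factory=list)


# Hypothetical in-process registry standing in for the server's service list.
SERVICES: Dict[str, Callable[..., List[str]]] = {}


def register(name: str):
    """Decorator that publishes a function as a named NLP service."""
    def wrap(fn):
        SERVICES[name] = fn
        return fn
    return wrap


@register("person-finder")
def find_capitalized_pairs(text: str, **params) -> List[str]:
    # Toy stand-in for named entity recognition:
    # two adjacent capitalized words are treated as a person name.
    return re.findall(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", text)


@register("summarizer")
def first_sentences(text: str, max_sentences: int = 1, **params) -> List[str]:
    # Toy stand-in for summarization: keep the first few sentences.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return sentences[:max_sentences]


def invoke(service: str, text: str, **params) -> ServiceResult:
    """Single entry point: the client names a service and passes parameters."""
    if service not in SERVICES:
        raise KeyError(f"unknown NLP service: {service}")
    return ServiceResult(service, SERVICES[service](text, **params))
```

A client would call, for instance, `invoke("summarizer", text, max_sentences=2)` without knowing how the service is implemented.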
Requirements
- Centralized Repository of Knowledge (R1): The system must provide users with the ability to store raw data as well as any information generated by users and analysis tools.
- Automatic Text Analysis Support (R2): The proposed system must provide access to various NLP pipelines in a unified manner.
- Collaborative Analysis Environment (R3): The proposed system shall provide an environment where all researchers have access to the most up-to-date information and can keep track of content modifications.
System Architecture
Design Decisions:
- Wiki-based Collaborative Web Interface (R3)
- Semantic MediaWiki as a Knowledge Base (R1)
- Text Mining Pipelines for Literature Analysis (R2)
[Figure: Zeeva system architecture. Diagram labels: Wiki User; Zeeva; Web Server; MediaWiki Engine; Semantic MediaWiki; Wiki-NLP Wiki Extensions; Wiki-NLP Component; NLP Service Connector; Zeeva Facts; Database; Semantic Assistants; Web Server; NLP Subsystem; Language Service Descriptions; Ontologies; Service Information; Service Invocation; Readability Analysis; Automatic Indexer; ...]
Semantic Metadata Extraction
Given a paper, we are interested in extracting:
- Structural Entities, i.e., parts of text that uniquely identify a paper, like title or author
- Semantic (Rhetorical) Entities, i.e., parts of text that describe the contributions, claims, findings and conclusions postulated by the paper's authors
[Figure: example facts stored in the Semantic Wiki database: #paper_1234 :hasTitle "Towards a semantic ..."; #paper_1234 :hasAuthor #JohnDoe]
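As a minimal sketch, the extracted structural entities can be held as subject-property-value triples and written out as Semantic MediaWiki inline annotations of the form `[[property::value]]`. The helper names `paper_triples` and `to_smw_annotations` are hypothetical, and how Zeeva actually writes its facts to the wiki is not shown in the slides. The title is left truncated exactly as it appears on the slide.

```python
from typing import List, Tuple

# (subject, property, value)
Triple = Tuple[str, str, str]


def paper_triples(paper_id: str, title: str, authors: List[str]) -> List[Triple]:
    """Build one hasTitle triple and one hasAuthor triple per author."""
    triples: List[Triple] = [(paper_id, "hasTitle", title)]
    triples += [(paper_id, "hasAuthor", author) for author in authors]
    return triples


def to_smw_annotations(triples: List[Triple]) -> str:
    """Render triples as SMW inline annotations for the subject's wiki page.

    The subject is implied by the page the text is stored on, so only the
    property and value appear in the markup.
    """
    return "\n".join(f"[[{prop}::{value}]]" for _, prop, value in triples)


triples = paper_triples("paper_1234", "Towards a semantic ...", ["JohnDoe"])
print(to_smw_annotations(triples))
```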
Templating Mechanism
Automatic Processing of Publications
Querying the Zeeva Knowledge Base
- Transform Zeeva from a collaborative analysis environment to a knowledge base
- The generated semantic metadata can be used within wiki or exported to external applications
- Semantic MediaWiki provides a simple inline query syntax, e.g.,
NL Question: “Give me all the contributions of Bahar Sateli.”
Corresponding SMW query:
Advantages:
1. Query results are always up to date
2. Users can discover knowledge created by other users of the wiki
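For illustration, an inline query of this kind can be assembled programmatically. The sketch below builds a Semantic MediaWiki `{{#ask: ...}}` string from property conditions and printout properties. The builder function is hypothetical, and so is the property name `hasContribution`; `hasAuthor` follows the naming on the metadata slide, but the properties Zeeva actually defines may differ.

```python
from typing import Dict, List, Optional


def build_ask_query(
    conditions: Dict[str, str],
    printouts: List[str],
    category: Optional[str] = None,
) -> str:
    """Assemble an SMW inline query.

    conditions: property -> required value, rendered as [[property::value]]
    printouts:  properties to display, rendered as |?property
    category:   optional category restriction, rendered as [[Category:name]]
    """
    parts: List[str] = []
    if category:
        parts.append(f"[[Category:{category}]]")
    parts += [f"[[{prop}::{value}]]" for prop, value in conditions.items()]
    query = "{{#ask: " + " ".join(parts)
    for prop in printouts:
        query += f" |?{prop}"
    return query + " }}"


# Assumed property names, for the natural-language question on this slide.
print(build_ask_query({"hasAuthor": "Bahar Sateli"}, ["hasContribution"]))
```

Because the query is evaluated whenever the page is rendered, its result reflects the current state of the wiki.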
Summary and Future Work
- The main research question is whether concrete literature analysis tasks can be improved using state-of-the-art semantic technologies.
- We introduced Zeeva, an empirical wiki-based evaluation platform with an extensible architecture
- Identify literature analysis tasks that can be improved with semantic technologies
- Develop more NLP services relevant to the context of literature analysis
- Perform an extrinsic evaluation of our hypothesis to assess the usability and efficiency of the proposed approach
http://www.semanticsoftware.info/zeeva