Content Markup / Plazi

Upload: vbrant

Post on 19-Jan-2015

Page 1: Content Markup / Plazi

Virtual Biodiversity ViBRANT

Interactive Content Extraction (semi- and fully automated)

GUIDO SAUTTER, KIT

[email protected]

ViBRANT Virtual Biodiversity

Workpackage 7: Biodiversity Literature Access and Data Mining

Page 2: Content Markup / Plazi

Who we are & what we do

Prof. Dr. Klemens Böhm
Head of Database & Information Systems Group
Computer Science Department @ KIT

Data Analytics, Citizen Science and Crowdsourcing, Query Processing in Databases

Guido Sautter
Researcher / PhD Student
Computer Science Department @ KIT

Semantic Markup of & Data Extraction from Legacy Documents, Interactive NLP, ... GoldenGATE

Page 3: Content Markup / Plazi

What we will do in ViBRANT

Community-contributed bibliography [month 12]:
- “Link as you browse!” (an unlinked reference? link it)
- “Parse & correct as you browse!” (as you encounter the need)

Markup Modules [month 24]:
- “Markup & correct as you browse!” (same idea as above)
- “No time? Then note it!” (add note to public TODO list)

Advanced Search [month 35]:
- “Browse, don't google!” (generate rich in-text search links)
- ... from marked / annotated parts of documents

Page 4: Content Markup / Plazi

Engaging Citizen Scientists

Make the existing parsing facilities for bibliographic references available via Scratchpads to help build the shared bibliography.
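
To give a rough idea of what parsing a bibliographic reference involves, here is a minimal sketch that splits one simple citation style ("Authors (Year) Title. Source") into fields. It is not the parser the project provides: the pattern, the function name `parse_reference`, and the sample string in the usage note are all invented for illustration, and real references vary far more than one regular expression can cover.

```python
import re

# Hypothetical pattern for a single citation style:
#   Authors (Year) Title. Source
# Assumption: the title contains no period; real references often break this.
REFERENCE_PATTERN = re.compile(
    r"^(?P<authors>.+?)\s*"          # everything before the year in parentheses
    r"\((?P<year>\d{4})[a-z]?\)\s*"  # four-digit year, optional suffix like "a"
    r"(?P<title>[^.]+)\.\s*"         # title up to the first period
    r"(?P<source>.+)$"               # remainder: journal, volume, pages, ...
)


def parse_reference(text):
    """Return a dict of reference fields, or None if the style does not match."""
    match = REFERENCE_PATTERN.match(text.strip())
    if match is None:
        # An unparsed reference is the case a user would correct by hand.
        return None
    return {field: value.strip() for field, value in match.groupdict().items()}
```

Calling `parse_reference("Smith J, Doe A (2009) A revision of the genus Example. Journal of Examples 12: 1-20.")` on this made-up string returns a dict with the keys `authors`, `year`, `title`, and `source`. A string that does not fit the pattern returns `None`, which is the point where manual correction would take over.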

Page 5: Content Markup / Plazi

Assisting Taxonomists in Scratchpads

Make parsing functionality for taxonomic names available in Scratchpads.
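
As an illustration only, the sketch below splits a simple binomial name into genus, species epithet, and an optional authority string. The pattern and the function name `parse_taxon_name` are assumptions made for this example and do not reflect how the project's own name parsing works. Subgenera, infraspecific ranks, and abbreviated genera are deliberately left out.

```python
import re

# Hypothetical pattern for a plain binomial: "Genus species [Authority]".
# Assumption: capitalised genus, lower-case epithet, free-text authority.
NAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"     # genus, e.g. "Apis"
    r"(?P<species>[a-z][a-z\-]*)"     # species epithet, hyphens allowed
    r"(?:\s+(?P<authority>.+))?$"     # optional authority, e.g. "Linnaeus, 1758"
)


def parse_taxon_name(text):
    """Return genus, species and authority, or None if the name does not match."""
    match = NAME_PATTERN.match(text.strip())
    if match is None:
        return None
    return {
        "genus": match.group("genus"),
        "species": match.group("species"),
        "authority": match.group("authority"),  # None when no authority is given
    }
```

For example, `parse_taxon_name("Apis mellifera Linnaeus, 1758")` yields genus `Apis`, species `mellifera`, and authority `Linnaeus, 1758`. Input that does not start with a capitalised genus returns `None`.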