So where are we now? The TDM landscape

Horizon 2020 Coordination and Support Action. GARRI-3-2014 Scientific Information in the Digital Age: Text and Data Mining (TDM). Project number: 665940. So where are we now? The TDM Landscape. FutureTDM: Reducing Barriers and Increasing Uptake of Text and Data Mining for Research Environments using a Collaborative Knowledge and Open Information Approach. FTDM workshop, Brussels, 27 September 2016.

Upload: futuretdm

Post on 08-Jan-2017


Page 1: So where are we now? The TDM landscape


Horizon 2020 Coordination and Support Action

GARRI-3-2014 Scientific Information in the Digital Age: Text and Data Mining (TDM)

Project number: 665940

So where are we now? The TDM Landscape

FutureTDM: Reducing Barriers and Increasing Uptake of Text and Data Mining for Research Environments using a Collaborative Knowledge and Open Information Approach

FTDM workshop/Brussels

27 September 2016

Page 2: So where are we now? The TDM landscape

Text and Data Mining: Definition


• “Discovery by computer of new, previously unknown information, by automatically extracting and relating information from different (…) resources, to reveal otherwise hidden meanings” (Hearst, 1999), or

• “Exploratory data analysis that leads to the discovery of heretofore unknown information, or to answers for questions for which the answer is not currently known” (Hearst, 1999).

Raw data (sensor data, text, images, multimedia)

Processed content (diagrams, charts, tables, references, maps, formulas, chemical structures)

Text and Data Mining:
- Business or Competitive Intelligence;
- Research Analytics;
- Learning Analytics or Educational Data Mining;
- Predictive Analytics.

Hearst, M.A., 1999. Untangling Text Data Mining. College Park, Maryland, Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics (ACL’99), pp. 3–10. http://www.aclweb.org/anthology/P99-1001
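The first definition above (extracting information from different resources and relating it to reveal hidden connections) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus, the term sets and the function names `direct_links` and `hidden_links` do not come from the slides, and real TDM pipelines would extract terms with NLP tools rather than start from hand-written sets. The sketch only shows the "relating" step: two terms that never appear in the same document are linked if they share a common neighbour.

```python
from itertools import combinations

# Hypothetical toy corpus (illustrative only, not taken from the slides):
# each resource is reduced to the set of terms it mentions.
DOCUMENTS = {
    "doc_a": {"fish oil", "blood viscosity"},
    "doc_b": {"blood viscosity", "raynaud syndrome"},
    "doc_c": {"fish oil", "platelet aggregation"},
    "doc_d": {"platelet aggregation", "raynaud syndrome"},
}


def direct_links(documents):
    """Return all term pairs that co-occur in at least one document."""
    links = set()
    for terms in documents.values():
        # Sort so that each pair is stored in one canonical order.
        for a, b in combinations(sorted(terms), 2):
            links.add((a, b))
    return links


def hidden_links(documents):
    """Return term pairs never mentioned together but joined by a shared term."""
    links = direct_links(documents)

    # Build a neighbour map from the direct co-occurrence links.
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hidden = {}
    for a, c in combinations(sorted(neighbours), 2):
        if (a, c) in links:
            continue  # already stated explicitly in some document
        bridges = neighbours[a] & neighbours[c]
        if bridges:
            hidden[(a, c)] = sorted(bridges)
    return hidden


if __name__ == "__main__":
    for pair, bridges in hidden_links(DOCUMENTS).items():
        print(pair, "via", bridges)
```

In this toy corpus no single document mentions both "fish oil" and "raynaud syndrome", yet the sketch connects them through the intermediate terms they share. That is the sense in which information is "previously unknown": it is present in the collection but stated in no individual resource.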

Page 3: So where are we now? The TDM landscape

TDM: Technology perspective


• Information Retrieval and Search
• Artificial Intelligence
• Classification
• Clustering
• Regression
• Association Rules
• Statistics
• Information Extraction
• Concept Extraction
• Opinion Mining
• Summarisation
• Negation and Modality Detection
• Textual Entailment
• Natural Language Processing
• Machine Learning
• Machine Translation
• Predictive Analytics
• Multimedia Processing

Source: M. Eskevich, A. van den Bosch, M. Caspers, L. Guibault, A. Bertone, S. Reilly, C. Munteanu, P. Leitner, S. Piperidis. Research Report on TDM Landscape in Europe. Project H2020 GARRI-3-2014--665940 FutureTDM Deliverable D3.1, 2016. http://project.futuretdm.eu/wp-content/uploads/2016/05/D3.1-Research-Report-on-TDM-Landscape-in-Europe-FutureTDM.pdf

Page 4: So where are we now? The TDM landscape

General economic structure: TDM across all economic sectors


Source: M. Eskevich, A. van den Bosch, M. Caspers, L. Guibault, A. Bertone, S. Reilly, C. Munteanu, P. Leitner, S. Piperidis. Research Report on TDM Landscape in Europe. Project H2020 GARRI-3-2014--665940 FutureTDM Deliverable D3.1, 2016. http://project.futuretdm.eu/wp-content/uploads/2016/05/D3.1-Research-Report-on-TDM-Landscape-in-Europe-FutureTDM.pdf

Page 5: So where are we now? The TDM landscape

TDM: business perspective


• Enabling players: 13%
• ICT Enablers: 19%
• Market places: 3%
• Analytics: 27%
• Cross Infrastructures: 2%
• Vertical Applications: 27%
• Data users: 9%

Source: S. Piperidis, K. Pouli, M. Gavriilidou, D. Galanis, J. Bakagianni, M. Eskevich, A. Bertone. European Landscape of TDM Applications Report. Project H2020 GARRI-3-2014--665940 FutureTDM Deliverable D4.1, 2016. http://project.futuretdm.eu/wp-content/uploads/2016/06/FutureTDM_D4.1-European-Landscape-of-TDM-Applications-Report.pdf

Page 6: So where are we now? The TDM landscape

TDM: scientific perspective


Chart values: 97.10%, 0.38%, 2.53%, 2.90%
Chart legend: all projects, TDM specific, TDM associated

Chart values: 51,2 B€, 75,3 M€, 633,8 M€, 852,4 M€, 319,9 M€, 149 M€
Chart legend: All FP7 & Horizon 2020 projects, Primary sector, Secondary sector, Tertiary sector, Quaternary sector, Quinary sector

Source: S. Piperidis, K. Pouli, M. Gavriilidou, D. Galanis, J. Bakagianni, M. Eskevich, A. Bertone. European Landscape of TDM Applications Report. Project H2020 GARRI-3-2014--665940 FutureTDM Deliverable D4.1, 2016. http://project.futuretdm.eu/wp-content/uploads/2016/06/FutureTDM_D4.1-European-Landscape-of-TDM-Applications-Report.pdf

Page 7: So where are we now? The TDM landscape

TDM in Europe: Language perspective


Source: http://www.meta-net.eu/whitepapers/key-results-and-cross-language-comparison

Page 8: So where are we now? The TDM landscape

To conclude and to start the discussion

• TDM technologies are present across all sectors of the EU economy:
  • There is no one-size-fits-all solution, and there cannot be one
  • Shared infrastructures can help cross-pollination of TDM solutions between sectors

• The EU case is specific because of the additional language dimension:
  • Tools, data and overall focus are biased towards English
  • The multilinguality of the population adds localisation costs to TDM solutions
