Horizon 2020 Policy Support Facility
Big data for evidence-informed policy making: A short introduction
Martijn Poel, 9 June 2017, Brussels
Big data…
- A step change in the scale and scope of the sources of materials (and tools for manipulating these sources) available in relation to a given object of interest (Schroeder 2014)
- Compatible with definitions that focus on:
  - Volume, e.g. number of objects, time series
  - Variety, e.g. official statistics, sensor, social media
  - Velocity, e.g. real-time data
  - The challenge of Veracity, e.g. cleaning, addressing noise, bias
- Velocity and Exhaustivity (‘n=all’) often lead to Volume, e.g. when using sensors or web scraping (Kitchin & McArdle 2016)
…for evidence-informed policy making
Data collection and data linking
- Established data sources
- New data sources
Data analytics: theory-based, indicator-based, using statistics, econometrics, machine learning, etc.
Visualisation and visual analytics
Policymaking
- Agenda and priority setting
- Policy options
- Policy design
- Ex ante evaluation and impact assessment
- Implementation
- Monitoring
- Ex post evaluation and impact assessment
Inventory of relevant operational initiatives (58)
- Status of the initiative: pilot, demonstrator, implementation
- Type of responsible authority: department, agency, PPP, etc.
- Policy area(s) addressed by the data for policy initiative: all
- (Expected) use of the data in the policy cycle: first half
- Data sources, variety, volume, velocity: some new data sources
- Data format and data interoperability: good progress
- Data veracity: un/verified, in/complete, in/consistent: ok
- Level of openness of data: half the data open
- Data analysis methodologies: mostly descriptive
- Data analytics tools and platforms: basic and advanced
Examples related to research and innovation
- Data about R&I policies and impact: online, interactive country reports and cross-country thematic reports
- STAR METRICS repository of datasets and tools to assess the impact of US federal R&D investments
- Data on R&D projects, public procurement, patents, trademarks, publications, websites of ICT companies (e.g. employment), blogs/news sites, job portals
Big data for the evaluation of R&D grants in the ICT sector (Spain)
Data sources (bar chart, n = 111)
- Administrative data: 39
- Statistical offices: 24
- Sensor-based data: 14
- (Other) large-scale survey data: 12
- Commercial/traded business data, consumer/market data: 10
- Social media: 8
- Small-scale survey data: 4
Type of data source closely linked to openness of data: ‘half the data open, half the data closed’
58 initiatives using 111 data sources: data linking
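The chart figures can be turned into shares with a few lines of arithmetic. The sketch below assumes that the bar values pair with the labels in the order they appear on the slide (smallest bar with small-scale surveys, largest with administrative data); the variable names are illustrative only.

```python
# Counts read off the "Data sources" bar chart (assumed label/value pairing).
source_counts = {
    "Administrative data": 39,
    "Statistical offices": 24,
    "Sensor-based data": 14,
    "Large-scale survey data": 12,
    "Commercial/traded data": 10,
    "Social media": 8,
    "Small-scale survey data": 4,
}
N_INITIATIVES = 58  # number of initiatives in the inventory

total_sources = sum(source_counts.values())

# Share of each source type, as a percentage of all data sources used.
shares = {
    name: round(100 * count / total_sources, 1)
    for name, count in source_counts.items()
}

# Average number of data sources linked per initiative.
sources_per_initiative = round(total_sources / N_INITIATIVES, 2)

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"Average data sources per initiative: {sources_per_initiative}")
```

Under this reading, the counts add up to the 111 data sources stated on the slide, and each initiative uses roughly two sources on average, which is consistent with the emphasis on data linking.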
Type of data analysis (bar chart, n = 58)
- Descriptive statistics: 43
- Trend analysis: 13
- Other: 12
- Profiling: 10
- Text analytics: 8
- Predictive analytics: 7
- Sentiment mining: 6
- Benchmarking: 5
- Agent-based modelling: 2
Conclusions of our September 2015 workshop
- Transparency, accuracy, inclusion, accountability: no black boxing!
- Skills for developing and interpreting data-driven approaches
- Relevant data or readily available data? Intervention logic!
- Policy-based evidence: lies, statistics and big data
- Policy for data, e.g. privacy, ownership, standardisation
- Dual learning: policy experiment + big data experiment
Possible (sub) themes for this Mutual Learning Exercise
- Data linking and data sharing: the IT and data/information infrastructures and the partnerships required
  - Standardised document exchange formats, unique identifiers of companies, re-coding of data (human intervention or machine learning)
  - Political, cultural and financial aspects of sharing data between organisations such as funding agencies, ministries, statistical offices, Chambers of Commerce, regional governments
- Emerging data sources and related data analytical methods
  - Using new open or proprietary data about companies, to obtain more complete and timely information about R&D, innovation, emerging topics, growth and employment of companies, etc.
  - Text mining of company websites, commercial databases, open databases such as H2020, job search websites
- Ethical issues of linking, sharing and analysing more data
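To make the data linking theme concrete, here is a minimal sketch of how two datasets about companies might be linked: first on a unique company identifier, then, where the identifier is missing, on a re-coded (normalised) company name. All field names, the sample records and the list of legal suffixes are hypothetical and chosen only for illustration; real linking exercises would need agreed identifiers and more careful matching rules.

```python
import re

# Hypothetical legal-form suffixes to drop when re-coding company names.
LEGAL_SUFFIXES = {"bv", "ltd", "sa", "gmbh", "inc"}

def normalise_name(name: str) -> str:
    """Lower-case a company name, drop punctuation and legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)

def link_records(grants, register):
    """Attach register data to grant records, by identifier or by name."""
    by_id = {r["company_id"]: r for r in register if r.get("company_id")}
    by_name = {normalise_name(r["name"]): r for r in register}
    linked = []
    for grant in grants:
        # Preferred route: exact match on the unique identifier.
        match = by_id.get(grant.get("company_id"))
        method = "identifier"
        if match is None:
            # Fallback route: match on the re-coded company name.
            match = by_name.get(normalise_name(grant["name"]))
            method = "name" if match else "unlinked"
        linked.append({
            **grant,
            "employees": match["employees"] if match else None,
            "link_method": method,
        })
    return linked

# Hypothetical sample data: a funding agency file and a business register.
grants = [
    {"company_id": "NL001", "name": "Acme B.V.", "grant_eur": 250000},
    {"company_id": None, "name": "Beta Labs Ltd.", "grant_eur": 90000},
    {"company_id": None, "name": "Gamma", "grant_eur": 10000},
]
register = [
    {"company_id": "NL001", "name": "ACME BV", "employees": 40},
    {"company_id": "UK002", "name": "Beta Labs Ltd", "employees": 12},
]

linked = link_records(grants, register)
for row in linked:
    print(row["name"], row["link_method"], row["employees"])
```

Recording the link method per record keeps the process transparent: records matched only by name can be reviewed by hand, and unlinked records show where a shared identifier would help most.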