Supporting Big Data, Open Data, Data Analytics and Data Science

Dr Simon Price
Research IT Manager

Post on 11-Apr-2017

• Bristol is a research-intensive university

• 6 Faculties: Social Science & Law, Science, Engineering, Arts and two Medical Faculties

• Employs 2000+ researchers (excluding PhDs)

• Each year (approximately):
  • 1500 research funding applications
  • £100M research income
  • 4500 research outputs

Outline

1. Big Data
2. Open Data
3. Data Analytics
4. Data Science
5. Implications for IT support

Big Data

• Lots and lots of technology buzzwords!
• Some important ones:
  • MapReduce
  • The Hadoop stack
  • Distributed file systems
  • Query languages & programming languages
  • NoSQL databases (columns, document, graph, ...)
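To make the MapReduce buzzword concrete, here is a minimal single-machine sketch of the map, shuffle and reduce phases, using a word count as the task. It is an illustration only, not taken from the slides: the function names and the sample documents are hypothetical, and a real Hadoop job would distribute each phase across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence on an in-memory list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    print(word_count(["big data open data", "data science"]))
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the calls can in principle run on different machines, which is the property the Hadoop stack builds on.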

Big Data

• Trends in Hadoop stack
  • Near realtime analytics
  • Streaming analytics
  • In-memory

• Trends in NoSQL
  • Relational and NoSQL moving closer together

Open Data

Open Data - data.bris

• Each PI allocated 5TB "forever"
• Research Data Management
• Open Data Publication

Open Data - public data

Bristol is Open - datasets

• 140+ datasets live on opendata.bristol.gov.uk
• Some real time data
• Transport API repository now available
• Examples:
  • Government: Elections since 2007
  • Community: Quality of Life survey
  • Education: School Results
  • Energy: Installed PV, Energy Use in Council Buildings
  • Environment: Real time & Historic Air Quality, Flood Alerts (EA)
  • Land use: 2013 Planning applications
  • Health: Life expectancy/Mortality, Obesity, NHS Spend

Data Analytics

• Operational focus
  • Variables are "known knowns and known unknowns"

• Descriptive
  • Summarisation of known variables, and alerting

• Predictive
  • Correlations between known variables
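The descriptive and predictive bullets above can be sketched in a few lines of Python. This is a hedged illustration rather than anything shown in the talk: `describe`, `pearson`, the alert threshold and the sample values are all hypothetical, and a Pearson correlation is only one simple way of measuring a relationship between two known variables.

```python
from statistics import mean, pstdev


def describe(values, alert_threshold):
    """Descriptive: summarise a known variable and flag values above a threshold."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "alerts": [v for v in values if v > alert_threshold],
    }


def pearson(xs, ys):
    """Pearson correlation between two known variables of equal length."""
    mx, my = mean(xs), mean(ys)
    # Population covariance divided by the product of population std devs.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))


if __name__ == "__main__":
    # Hypothetical readings, chosen only for illustration.
    readings = [1, 2, 3, 10]
    print(describe(readings, alert_threshold=5))
    print(pearson([1, 2, 3], [2, 4, 6]))
```

Both functions work only on variables that are already known and measured, which matches the operational focus described on the slide; a correlation found this way says nothing about causation.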

Data Science

• Multidisciplinary data-intensive research
• Focus on research insights, causation and prediction
• Usually involves Machine Learning and Statistics

• Different perspectives:
  • Computer Scientists view DS as a research domain
  • Statisticians view DS as a research domain
  • Other academics view DS as a service

Implications for IT support

• Governance
  • Shift from IT-owned to academic-owned (Shadow IT)

• Skills
  • IT experts need to train and trust academics
  • Nurture internal skills pipeline (interns, postgrads)

• Systems
  • Mixed economy of internal and external