learning analytics & linked data – opportunities, challenges, examples
Motivation
Data on the Web
Some eye-catching opener illustrating growth and/or diversity of web data
Learning Analytics & Linked Data – Opportunities, Challenges, Examples
Stefan Dietze (L3S Research Center, DE, @stefandietze)
Mathieu d’Aquin (The Open University, UK)
12/03/13 Stefan Dietze
Linked Data for Education & Learning Analytics
Why is it useful?
1. Linked Data as body of knowledge for education, analytics and TEL recommender systems:
vast amount of publicly available resources and data
HTTP access according to state of the art principles
2. Linked Data as set of principles for data sharing:
to improve interoperability of educational data
facilitate learning analytics and recommender system scenarios across isolated platforms
Further reading:
Interlinking educational Resources and the Web of Data – a Survey of Challenges and Approaches, Stefan Dietze, Salvador Sanchez-Alonso, Hannes Ebner, Hong Qing Yu, Daniela Giordano, Ivana Marenzi, Bernardo Pereira Nunes, Emerald Program: electronic Library and Information Systems, Volume 47, Issue 1 (2013).
Linked Data for Open and Distance Learning, Mathieu d’Aquin, report for the Commonwealth of Learning
Domain | Number of datasets | Triples | % | (Out-)Links
Media | 25 | 1,841,852,061 | 5.82 % | 50,440,705
Geographic | 31 | 6,145,532,484 | 19.43 % | 35,812,328
Government | 49 | 13,315,009,400 | 42.09 % | 19,343,519
Publications | 87 | 2,950,720,693 | 9.33 % | 139,925,218
Cross-domain | 41 | 4,184,635,715 | 13.23 % | 63,183,065
Life sciences | 41 | 3,036,336,004 | 9.60 % | 191,844,090
User-generated content | 20 | 134,127,413 | 0.42 % | 3,449,143
Total | 295 | 31,634,213,770 | | 503,998,829
Source: http://lod-cloud.net/state, September 2011
[Figure: bar chart, Count per Web source (axis 0 to 20). Bar values as extracted: 16, 12, 9, 8, 7, 7, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5. Sources as extracted: http://www.flickr.com, http://www.bbc.co.uk, http://en.cyberdodo.com, http://www.moma.org/, http://www.mocp.org/, http://www.google.it, http://www.oxfam.org, http://www.deutsches-museum.de, http://www.metmuseum.org, http://www.slideshare.com, http://museo.ferrari.com, http://www.ecokids.ca, http://www.nationalgeograohic.com, http://www.ducati.it, http://www.museumofbrands.com, http://learnenglishkids.britishcouncil…, http://www.childrenoftheearth.org, http://kidshealth.org]
[Figure: chart with series M1: R (% of total), M2: Resources R / 1K queries, M3: Resources R´ / 1K queries (axis 0 to 50)]
Educationally relevant Web data
Where does it come from?
Vast amounts of educational resource collections (OpenCourseWare etc) but…
…increasing relevance of (social) Web content for education
Data source: LearnWeb (http://learnweb.l3s.uni-hannover.de/)
Educationally relevant data, e.g. for informal learning
Publications & literature: ACM, PubMed, DBLP (L3S), OpenLibrary
Domain-specific knowledge & resources: BioPortal for Life Sciences, historic artefacts in Europeana, Geonames
Cross-domain knowledge: DBpedia, Freebase, …
(Social) media resource metadata: BBC, Flickr, …
LD as body of knowledge for education
Explicitly educational datasets and schemas
University Linked Data: e.g. The Open University UK, http://data.open.ac.uk, Southampton University, University of Muenster (DE), http://education.data.gov.uk
OER Linked Data: mEducator Linked ER (http://ckan.net/package/meducator), Open Learn LD
Schemas: Learning Resource Metadata Initiative (LRMI, http://www.lrmi.net/), mEducator Educational Resources schema (http://purl.org/meducator/ns)
http://linkededucation.org & http://linkeduniversities.org
LD for integration and analytics
Examples
Further reading:
Linked Education: interlinking educational Resources and the Web of Data, Stefan Dietze, Hong Qing Yu, Daniela Giordano, Eleni Kaldoudi, Nikolas Dovrolis and Davide Taibi, ACM Symposium On Applied Computing (SAC-2012), Special Track on Semantic Web and Applications
1. Integration & analytics of biomedical resources: Linked Data as means to lift, enrich, disambiguate and cluster educational resources from disparate repositories (http://www.meducator.net)
2. Curation and analytics of educational datasets in LinkedUp: towards a unified educational graph (http://linkedup-project.eu)
LinkedUp vision: a global data space for education
The Open University
University of Bristol
University of Southampton
mEducator
University of Muenster, DE
OrganicEduNet
Data.gov.uk education
Orgs., Buildings, Locations
Learning resources
Research outputs
LinkedUp
European-funded „Support Action“
Started Nov/2012
http://linkedup-project.eu
Challenges
Finding the needle in the haystack: which datasets to consider?
Lack of structured & precise descriptions of datasets according to dimensions such as topic coverage, represented types, quality, relevance
Dataset heterogeneity: lack of links between (a) dataset schemas and (b) resources and entities
LinkedUp data cataloging and assessment
Linked Education Cloud & Linked Education Graph
Educational data gathering and cataloging: Linked Education cloud
“LinkedUp/Linked Education cloud” as subset of LOD cloud
CKAN – “The DataHub” for data collection, dedicated group “linked-education”
Public RDF vocabulary of datasets (“Linked Education Catalog”)
Educational data integration & infrastructure: Linked Education graph
Linked Education cloud => Linked Education graph & dataset
Integration of (selected) datasets into coherent (RDF) graph
Infrastructure, unified (SPARQL) endpoint & APIs for querying
Educational Data
Linked Education/LinkedUp @ The DataHub
http://datahub.io/group/linked-education
http://data.linkededucation.org/linkedup/catalog
Analytics on Learning Analytics
in a nutshell
CKAN linkededucation
LAK tutorial
LAK Data
LAK challenge
LILE2013 @ www
http://www.solaresearch.org/resources/lak-dataset/
Linked Data (including full text) of 300+ papers from LAK and Educational Data Mining community
Unprecedented resource for further research & analytics
Further reading: Taibi, D., Dietze, S., Fostering analytics on learning analytics research: the LAK dataset, Technical Report, 03/2013, URL: http://resources.linkededucation.org/2013/03/lak-dataset-taibi.pdf
Dataset analytics: topic coverage
in a nutshell
Dataset | Entities | Categories | Types
Yovisto | 25 | 99 | 91
education.data.gov.uk | 24 | 81 | 22
Educational Programs - SISVU | 23 | 95 | 55
Achievement Standards Networks – ASN:US | 22 | 97 | 64
Linking Italian University Statistics Project | 20 | 78 | 24
Open Data from the Italian National Research Council | 19 | 59 | 54
Nature Publishing Group - ALL | 19 | 36 | 20
Organic Edunet Linked Open Data | 16 | 47 | 19
DBLP Bibliography Database in RDF (FU Berlin) | 16 | 53 | 70
Linked Data from the Open University | 13 | 50 | 62
Italian public schools (LinkedOpenData.it) | 13 | 46 | 80
Learning Analytics & Knowledge (LAK) Data | 12 | 46 | 66
COLINDA - Conference Linked Data | 10 | 28 | 48
mEducator: Linked Educational Resources | 8 | 37 | 128
TheSoz Thesaurus for the Social Sciences (GESIS) | 7 | 23 | 58
DBLP in RDF (L3S) | 7 | 24 | 70
Catalogus Professorum Lipsiensis | 6 | 15 | 55
OxPoints (University of Oxford) | 2 | 9 | 49
Goal
Broader understanding of the topics / disciplines covered within Linked Education cloud (and LOD in general)
Identifying similarities between datasets
Creating richer dataset descriptions
Approach
Enriching sample resources from each dataset with DBpedia entities/categories
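The approach above can be illustrated with a minimal sketch. It assumes that sample resources of each dataset have already been annotated with DBpedia categories by some entity-linking step (not shown). The dataset names and category lists are invented for illustration and are not values from the Linked Education catalog. Datasets are compared by the Jaccard overlap of their category sets, which is one possible similarity measure and not necessarily the one used in LinkedUp.

```python
from collections import Counter
from itertools import combinations

# Hypothetical enrichment output: dataset -> DBpedia categories of its sample resources.
enriched = {
    "dataset_a": ["Category:Teaching", "Category:Learning",
                  "Category:Cognition", "Category:Learning"],
    "dataset_b": ["Category:Learning", "Category:Management", "Category:Teaching"],
    "dataset_c": ["Category:Concepts_in_physics", "Category:Design"],
}


def category_counts(enriched):
    """Count how often each category occurs across all datasets."""
    counts = Counter()
    for categories in enriched.values():
        counts.update(categories)
    return counts


def jaccard(a, b):
    """Jaccard similarity of two category collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def dataset_similarities(enriched):
    """Pairwise similarity of datasets based on shared categories."""
    return {
        (x, y): jaccard(enriched[x], enriched[y])
        for x, y in combinations(sorted(enriched), 2)
    }


counts = category_counts(enriched)
sims = dataset_similarities(enriched)
```

The category counts correspond to a topic-coverage ranking, while the pairwise scores hint at which datasets cover similar disciplines.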
Dataset analytics: topic coverage
in a nutshell
Top-20 Categories in Linked Education Catalog
Category #Occurrence
http://dbpedia.org/resource/Category:Educational_psychology 133
http://dbpedia.org/resource/Category:Philosophy_of_science 115
http://dbpedia.org/resource/Category:Thought 111
http://dbpedia.org/resource/Category:Concepts_in_metaphysics 104
http://dbpedia.org/resource/Category:Management 97
http://dbpedia.org/resource/Category:Concepts_in_physics 96
http://dbpedia.org/resource/Category:Cognition 95
http://dbpedia.org/resource/Category:Teaching 92
http://dbpedia.org/resource/Category:Mental_processes 90
http://dbpedia.org/resource/Category:Systems_science 90
http://dbpedia.org/resource/Category:Developmental_psychology 86
http://dbpedia.org/resource/Category:Concepts_in_epistemology 83
http://dbpedia.org/resource/Category:Cognitive_science 82
http://dbpedia.org/resource/Category:Neuropsychology 82
http://dbpedia.org/resource/Category:Design 81
http://dbpedia.org/resource/Category:Greek_loanwords 81
http://dbpedia.org/resource/Category:Learning 79
http://dbpedia.org/resource/Category:Evaluation 74
Data assessment: topic coverage
Identifying dataset similarities and correlations in a nutshell
<dc:title> <akt:has-title>?
OER
Publication
VideoLecture
LinkedUniversities educational videos
Step 1 – Alignment of types/properties
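Step 1 can be sketched as a simple property mapping. The mapping table is an assumption for illustration: it rewrites dc:title and akt:has-title (both named on the slide) to a single unified property, here called led:title. The alignment rules actually used are not given in the slides, and the example triples are invented.

```python
# Triples are (subject, predicate, object) tuples with prefixed names for brevity.
# Hypothetical alignment table: source-specific property -> unified property.
PROPERTY_ALIGNMENT = {
    "dc:title": "led:title",
    "akt:has-title": "led:title",
}


def align_properties(triples, alignment=PROPERTY_ALIGNMENT):
    """Rewrite aligned predicates to the unified one; leave all others untouched."""
    return [(s, alignment.get(p, p), o) for s, p, o in triples]


# Invented example triples from two differently modelled sources.
source_triples = [
    ("ex:oer1", "dc:title", "Laws of gravity"),
    ("ex:video7", "akt:has-title", "Gravitation lecture"),
    ("ex:video7", "dc:creator", "ex:someone"),
]

aligned = align_properties(source_triples)
```

After alignment, a single query on led:title reaches the titles of both sources.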
Linked Education graph: resource analytics
6 million distinct (but linked) resources
97 million RDF triples
21.6 GB of data
Schema: http://data.linkededucation.org/ns/linked-education.rdf
SPARQL: http://data.linkededucation.org/request/linked-learning/sparql
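A hedged sketch of how the endpoint listed above might be queried over the standard SPARQL protocol (HTTP GET with a `query` parameter). The endpoint URL is taken from the slide and may no longer be online. The query is a generic triple count, not one from the talk, and the network call sits in a separate function that is not invoked here.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as given on the slide; availability is not guaranteed.
ENDPOINT = "http://data.linkededucation.org/request/linked-learning/sparql"

# Generic example query (an assumption, not taken from the talk).
COUNT_QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint, query, timeout=30):
    """Send the query and return the result bindings (requires network access)."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=timeout) as resp:
        return json.load(resp)["results"]["bindings"]


request = build_request(ENDPOINT, COUNT_QUERY)
```

Calling `run_query(ENDPOINT, COUNT_QUERY)` would return the bindings as a list of dictionaries, provided the endpoint is reachable and supports JSON results.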
Entity enrichment => disambiguation & correlation
Via DBpedia/Freebase
<led:Resource-OpenLearn-2139393292>…<led:title>…laws of gravity…</led:title>…</led:Resource-OpenLearn-2139393292>
<led:Resource-BBC-519215>…<led:title>…gravitating…</led:title>…</led:Resource-BBC-519215>
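The idea can be sketched as follows: once titles from different sources are linked to the same entity, the resources become correlated even though their wording differs. The lookup table is a toy stand-in for a real NER and disambiguation service. Mapping both phrases to http://dbpedia.org/resource/Gravitation is an assumption for illustration, and the full titles are invented around the fragments shown on the slide.

```python
from collections import defaultdict

# Toy surface-form dictionary (assumption); a real system would call an
# entity-linking service instead.
SURFACE_FORMS = {
    "laws of gravity": "http://dbpedia.org/resource/Gravitation",
    "gravitating": "http://dbpedia.org/resource/Gravitation",
}


def enrich(title):
    """Return the set of entity URIs whose surface form occurs in the title."""
    text = title.lower()
    return {uri for form, uri in SURFACE_FORMS.items() if form in text}


def correlate(resources):
    """Map each entity to the resources mentioning it; keep entities shared by 2+."""
    by_entity = defaultdict(set)
    for resource_id, title in resources.items():
        for uri in enrich(title):
            by_entity[uri].add(resource_id)
    return {uri: ids for uri, ids in by_entity.items() if len(ids) > 1}


# Resource identifiers from the slide; titles are invented placeholders.
resources = {
    "Resource-OpenLearn-2139393292": "An introduction to the laws of gravity",
    "Resource-BBC-519215": "Gravitating bodies explained",
}

linked = correlate(resources)
```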
LD for integration & analytics of heterogeneous Web Data
Use case: biomedical education
=> http://metamorphosis.med.duth.gr/ Metamorphosis+
Tailored (L)CMS plugins
=> http://www.meducator3.net/
Educational Web Resources
Data/services integration & retrieval/search APIs
Linked Educational Resources
http://linkededucation.org/meducator http://purl.org/smartlink
Semi-structured RDF description of educational resource
Data enrichment via DBpedia & Freebase
NER & disambiguation, eg, via
Semi-automated data enrichment
http://metamorphosis.med.duth.gr/ Metamorphosis+
Example: OER annotation in MetaMorphosis+
Further reading: Dietze, S., Kaldoudi, E., Dovrolis, N., Yu, H.Q., Taibi, D. (2011) MetaMorphosis+ – A social network of educational Web resources based on semantic integration of services and data, 10th International Semantic Web Conference (ISWC2011), Bonn, Germany
Semi-automated data enrichment
Metamorphosis+ (http://metamorphosis.med.duth.gr/)
Access to 324 ontologies and over 5 million entities (http://bioportal.bioontology.org/)
1. User-specified term during learning resource annotation
2. Suggested Entities
3. Selected entities from BioPortal used to describe discipline, keywords of resource
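The three annotation steps (user-specified term, suggested entities, selected entities stored as keywords) can be sketched as below. The in-memory label index and the `ex:` identifiers are invented stand-ins for an ontology search service. This does not reproduce the BioPortal API or the MetaMorphosis+ implementation.

```python
# Toy label index standing in for an ontology search service; entries are invented.
LABEL_INDEX = {
    "cervical cancer": "ex:CervicalCancer",
    "cervical screening": "ex:CervicalScreening",
    "ventilation": "ex:Ventilation",
}


def suggest_entities(term, index=LABEL_INDEX, limit=5):
    """Step 2: return (label, entity) pairs whose label contains the user's term."""
    needle = term.strip().lower()
    if not needle:
        return []
    hits = [(label, uri) for label, uri in index.items() if needle in label]
    # Shorter labels first, as a crude proxy for closer matches.
    return sorted(hits, key=lambda hit: (len(hit[0]), hit[0]))[:limit]


def annotate(resource, selected):
    """Step 3: attach the entities the user selected as keywords of the resource."""
    resource.setdefault("keywords", []).extend(uri for _, uri in selected)
    return resource


# Step 1: the user types a term while annotating a learning resource.
suggestions = suggest_entities("cervical")
# The user picks the first suggestion (simulated here).
resource = annotate({"title": "Example lecture"}, suggestions[:1])
```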
Data analytics: clustering & correlation
Number of resources per DBpedia reference/enrichment (subject) in mEducator dataset
DBpedia references used most frequently to describe the „subject“ of particular educational resources
DBpedia concept (http://dbpedia.org/resource/....) | Linked by number of resources
Cervical_cancer | 59
Screening | 31
Cervical | 29
Hpv | 29
Oxygenation | 26
Childhood | 22
differential_diagnosis | 19
Knowledge | 18
Learning | 17
decision_making | 16
Training | 15
Lecture | 15
Risk | 15
hpv_infection | 15
Fear | 15
pap_smear | 15
Abnormal | 14
Ventilation | 14
Ecg | 14
Clustering of resources graph (blue nodes: resources, green nodes: enrichments)
Cluster of educational resources relating to „cervical cancer“ subject
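A minimal sketch of such clustering, assuming that resources and their DBpedia enrichments form a bipartite graph and that a cluster is simply a connected component of it. The resource identifiers and links are invented, and the clustering method actually used for the slide is not stated.

```python
from collections import defaultdict


def cluster_resources(enrichments):
    """Group resources connected via shared enrichments (connected components)."""
    # Invert the mapping: concept -> resources described by it.
    by_concept = defaultdict(set)
    for resource, concepts in enrichments.items():
        for concept in concepts:
            by_concept[concept].add(resource)

    seen, clusters = set(), []
    for start in sorted(enrichments):
        if start in seen:
            continue
        cluster, stack = set(), [start]
        while stack:
            resource = stack.pop()
            if resource in seen:
                continue
            seen.add(resource)
            cluster.add(resource)
            # Follow every enrichment to the other resources that share it.
            for concept in enrichments[resource]:
                stack.extend(by_concept[concept] - seen)
        clusters.append(cluster)
    return clusters


# Invented example: resource -> DBpedia concepts used to describe its subject.
enrichments = {
    "res1": {"Cervical_cancer", "Screening"},
    "res2": {"Cervical_cancer", "Hpv"},
    "res3": {"Hpv"},
    "res4": {"Ventilation"},
    "res5": {"Ventilation", "Oxygenation"},
}

clusters = cluster_resources(enrichments)
```

In this toy data, res3 joins the cervical cancer cluster only indirectly, through the Hpv enrichment it shares with res2.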
Exploratory search enabled via clustering
Example: search results of OER in MetaMorphosis+
http://metamorphosis.med.duth.gr/ Metamorphosis+
Educational resources retrieved based on particular user query
Related resources (ranked)
Conclusions
Summary
Linked Data as knowledge resource: growing amount of educationally related datasets available
Linked Data principles for data interoperability as enabler for Learning Analytics and educational recommender systems across platform boundaries
Challenges:
Data heterogeneity (in particular when considering all forms of related data and resources)
Insufficient knowledge & descriptions of data(sets)
Ongoing and future work
Linked Education data catalog (http://linkedup-project.eu, http://data.linkededucation.org/linkedup/catalog)
Assessment & annotation of datasets according to topic / type coverage, educational relevance, …
Exploitation in innovative Learning Analytics scenarios and applications => LinkedUp Challenge (http://linkedup-challenge.org)
References
Interlinking educational Resources and the Web of Data – a Survey of Challenges and Approaches, Stefan Dietze, Salvador Sanchez-Alonso, Hannes Ebner, Hong Qing Yu, Daniela Giordano, Ivana Marenzi, Bernardo Pereira Nunes, Emerald Program: electronic Library and Information Systems, Volume 47, Issue 1 (2013).
Linked Education: interlinking educational Resources and the Web of Data, Stefan Dietze, Hong Qing Yu, Daniela Giordano, Eleni Kaldoudi, Nikolas Dovrolis and Davide Taibi, ACM Symposium On Applied Computing (SAC-2012), Special Track on Semantic Web and Applications
As Simple As It Gets – A sentence simplifier for different learning levels and contexts Nunes, B. P., Kawase, R., Siehndel, P., Casanova, M.A., Dietze, S., in ICALT 2013: 13th IEEE International Conference on Advanced Learning Technologies (ICALT), Beijing, China, July 15-18 (2013).
Fostering analytics on learning analytics research: the LAK dataset, Taibi, D., Dietze, S., Technical Report, 03/2013, URL: http://resources.linkededucation.org/2013/03/lak-dataset-taibi.pdf
Semantic Web Journal Special Issue on Linked Data for Science and Education, Kessler C., d’Aquin M. and Dietze S. (eds), http://iospress.metapress.com/content/m87017012802/
Putting Linked Data to Use in a Large Higher-Education Organisation, Mathieu d’Aquin, Interacting with Linked Data workshop 2012
Information Organization on the Internet based on Heterogeneous Social Networks, Kaldoudi, E., Dovrolis, N., Dietze, S., 29th ACM International Conference on Design of Communication (ACM SIGDOC’11), Pisa, 2011.
MetaMorphosis+ – A social network of educational Web resources based on semantic integration of services and data, Dietze, S., Kaldoudi, E., Dovrolis, N., Yu, H.Q., Taibi, D. (2011), 10th International Semantic Web Conference (ISWC2011), Bonn, Germany
Thank you!
Contact http://purl.org/dietze | @stefandietze
See also (general)
http://linkedup-project.eu
http://linkedup-challenge.org
http://linkededucation.org
http://linkeduniversities.org
See also (data)
http://datahub.io/group/linked-education
http://data.linkededucation.org/linkedup/catalog
http://www.solaresearch.org/resources/lak-dataset/
http://datahub.io/dataset/meducator
http://datahub.io/dataset/smartlink