Semantic Web in an SMS, as presented at EKAW 2016
TRANSCRIPT
The Semantic Web in an SMS
Onno Valkering, Victor de Boer, Awa Gossa Lô, Romy Valkering, Stefan Schlobach
Vrije Universiteit Amsterdam
Can the (Semantic) Web (be made to) mean something for knowledge sharing even under very constraining conditions?
No internet, no computer, no electricity
Multitude of languages, levels of literacy
Web for Development challenge
http://worldwidesemanticweb.org
Low-resource knowledge sharing platform
Low-power, ubiquitous, cheap hardware
FLOSS components
Rapid prototyping + deployment of (knowledge-intensive) services
User interfaces: voice services, SMS-based, visual
Wi-Fi, 3G or GSM network
RDF Data store (Linked Data) allows for flexible data integration across applications, deployments
allows for easy development of information services relevant to local communities in their preferred language
Kasadaka (“talking box”)
Data store
GSM/Voice interface
Web interface
Text-To-Speech / ASR
Radio users
Field operative
Users
Information Services
Market Information System (Mali)
Veterinary service (N-Ghana)
Knowledge base and diagnosis system (CommonKADS)
Information sharing across locations
Poultry vaccination service (Mali)
Seed market (Mali and Burkina Faso)
Seed quality
Current cases
Machine to machine information integration
GSM network
SPARQL in an SMS
Converters translate SPARQL HTTP requests into SMS messages (140 or 160 chars) and vice versa
CONSTRUCT, INSERT/DELETE DATA
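The conversion between a SPARQL request and a series of size-limited SMS messages can be sketched as follows. This is only an illustration, not the converter from the sparql-over-sms project: the three-byte header (message id, segment index, segment count), the function names and the example.org query are all assumptions made here, and a 140-byte binary payload is assumed per message.

```python
import math

SMS_PAYLOAD_BYTES = 140  # assumed payload of one binary SMS
HEADER_BYTES = 3         # hypothetical header: message id, segment index, segment count


def split_into_sms(payload: bytes, message_id: int) -> list[bytes]:
    """Split a payload into segments that each fit in one SMS."""
    body_size = SMS_PAYLOAD_BYTES - HEADER_BYTES
    count = max(1, math.ceil(len(payload) / body_size))
    if count > 255:
        raise ValueError("payload too large for a one-byte segment count")
    segments = []
    for index in range(count):
        chunk = payload[index * body_size:(index + 1) * body_size]
        segments.append(bytes([message_id % 256, index, count]) + chunk)
    return segments


def reassemble(segments: list[bytes]) -> bytes:
    """Rebuild the payload from segments that may arrive out of order."""
    ordered = sorted(segments, key=lambda segment: segment[1])
    expected = ordered[0][2]
    if len(ordered) != expected:
        raise ValueError("missing segments")
    return b"".join(segment[HEADER_BYTES:] for segment in ordered)


# Hypothetical query against a made-up vocabulary
query = (
    "PREFIX ex: <http://example.org/market#> "
    "CONSTRUCT { ?offer ex:product ?product ; ex:price ?price } "
    "WHERE { ?offer a ex:Offering ; ex:product ?product ; ex:price ?price . "
    "FILTER(?price < 500) }"
).encode("utf-8")

parts = split_into_sms(query, message_id=7)
restored = reassemble(list(reversed(parts)))  # simulate out-of-order delivery
```

Carrying the segment index and count in every message lets the receiver detect missing segments and reorder late arrivals, at the cost of three bytes per SMS.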
Challenges
Blending synchronous and asynchronous messaging
SPARQL/RDF compression
Unpredictable query result sizes
Compression experiments for small datasets
Strategies
Different serializations: RDF/XML, N-Triples, Turtle, HDT [1], EXI [2]
Compression (zip)
Assume shared vocabularies (top 20 from prefix.cc) and remove redundant (inferable) triples (RDFS reasoning)
Evaluated on real-world datasets from LOD Laundromat [3]
232,822 small datasets (1-1,000 triples)
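Two of the strategies above, general-purpose compression and a vocabulary table shared by both ends, can be illustrated with a small sketch. It makes several assumptions that are not in the slides: the three-entry prefix table stands in for the top 20 from prefix.cc, zlib (DEFLATE, the algorithm behind gzip and zip) is used as the compressor, the example.org data is invented, and one SMS is assumed to carry 140 bytes. The regular expression is naive and would also rewrite angle-bracketed text inside literals.

```python
import math
import re
import zlib

SMS_PAYLOAD_BYTES = 140  # assumed payload of one binary SMS

# Hypothetical prefix table; sender and receiver must hold the same copy.
SHARED_PREFIXES = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#": "rdf:",
    "http://www.w3.org/2000/01/rdf-schema#": "rdfs:",
    "http://xmlns.com/foaf/0.1/": "foaf:",
}

IRI = re.compile(r"<([^>]*)>")


def shorten(ntriples: str) -> str:
    """Replace IRIs in shared namespaces by prefixed names."""
    def replace(match):
        iri = match.group(1)
        for namespace, prefix in SHARED_PREFIXES.items():
            if iri.startswith(namespace):
                return prefix + iri[len(namespace):]
        return match.group(0)  # unknown namespace: keep the full IRI
    return IRI.sub(replace, ntriples)


def sms_count(payload: bytes) -> int:
    """Number of messages needed to carry the payload."""
    return math.ceil(len(payload) / SMS_PAYLOAD_BYTES)


# Five invented N-Triples statements
data = "\n".join(
    f"<http://example.org/person/{i}> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://xmlns.com/foaf/0.1/Person> ."
    for i in range(1, 6)
)

raw = data.encode("utf-8")
packed = zlib.compress(shorten(data).encode("utf-8"), 9)
print(sms_count(raw), "SMS uncompressed,", sms_count(packed), "SMS compressed")
```

The receiver would decompress the payload and expand the prefixed names with the same table before parsing the triples.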
[1] Fernández et al. “Compact Representation of Large RDF Data Sets for publishing and exchange” (ISWC 2010)
[2] Käbisch et al. “Standardized and Efficient RDF Encoding for Constrained Embedded Networks” (ESWC 2015)
[3] http://lodlaundromat.org/ and Rietveld et al. “LOD Lab: Experiments at LOD Scale” (ISWC 2015)
Compression experiment results
Number of SMSes   Avg. number of triples
1                 0
2                 3
3                 8
4                 16
5                 24
6                 84
7                 98
8                 126
9                 189
10                301
For very small datasets (<40 triples), N-Triples + gzip works best
For larger datasets, Turtle + gzip compresses best
Removing redundancies using shared vocabularies adds additional compression
[Figure: compression (percentage w.r.t. N-Triples) against size of dataset in triples (binned), for N-Triples+gzip, Turtle+gzip and Best + vocabulary-based]
Evaluation: 4 scenarios in 2 cases
Digivet and RadioMarche applications
Four scenarios / SPARQL queries in total
Results
Conclusions
Semantic Web over SMS is feasible using data compression + semantic background knowledge
Economically feasible for ICT4D services for small datasets
Semantic Web without the Web is possible (cf. IoT)
Knowledge engineering, knowledge sharing for all
Towards a Computer Science for Development (CS4D)
Thank you
github.com/onnovalkering/sparql-over-sms
github.com/abaart/KasaDaka
kasadaka.com
worldwidesemanticweb.org
w4ra.org
Compression (percentage w.r.t. N-Triples) per dataset size bin
Triples   N-Triples+gzip  RDF/XML  RDF/XML+gzip  Turtle  Turtle+gzip  HDT    HDT+gzip  EXI   EXI+gzip  Best + vocabulary-based
1-10      50.7            103.8    77            102     70.3         495.5  180.1     57.5  65.9      44.2
11-20     22.5            62       27.1          50.5    24.2         122.2  47        23.3  24.9      18.9
21-30     16.2            58.2     18.5          48.7    16.3         79.5   31.1      16.5  17.5      13.6
31-40     28.3            69.1     30.9          62.1    28.6         86.5   40.7      28.2  29.1      23.5
41-50     9.8             51.2     10.2          42.3    8.6          38.1   14.8      9.3   9.7       8
51-60     17.2            59.2     17.5          50.1    15.9         50.5   22.8      15.8  16.3      8.7
61-70     11.8            58.5     12.4          42.4    10           43     17.7      11.1  11.6      6
71-80     8.8             54.8     8.5           40.9    7            31.6   11.2      7.5   7.8       6.4
81-90     6.7             52       6.3           40.6    5.1          25.4   9.1       5.8   6         4.4
91-100    8.1             54.9     7.6           40.4    6.2          26.9   9.7       6.8   7         5.7
101-200   8.8             62       8.3           39.2    6.7          24.7   10.1      7.6   7.9       5.7
201-300   4.8             50.8     3.6           39      2.8          13.4   4         3.6   3.6       2.7
301-400   4.8             51.5     3.3           37.7    2.5          11.4   3.3       3     3.1       2.5
401-500   4.4             51.5     2.9           37.4    2.2          10.4   2.7       2.6   2.7       2.2
501-600   5               53.8     3.4           38.7    2.5          8.9    3         2.9   3         2.4
601-700   4.1             51       2.5           35.9    1.7          8.5    2.2       2.3   2.4       1.7
701-800   4.5             51.1     2.7           36.2    1.9          8.1    2.1       2.4   2.4       1.9
801-900   4.4             51.1     2.6           36.4    1.8          7.9    1.9       2.3   2.3       1.8
901-1000  4.1             50.9     2.4           36.5    1.7          7.7    1.7       2.1   2.1       1.7
SPARQL in an SMS
Enable (Semantic) Web data exchange over GSM networks.
Practical differences between HTTP and SMS:
SMS works with phone numbers, HTTP works with URLs.
SMS has a size restriction, HTTP practically has none.
SMS is one-way messaging, HTTP follows request-response.
Basic M2M communication based on SPARQL.
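One way to bridge the differences listed above is to map endpoint URLs to phone numbers and to tag each outgoing message with an id, so that a later incoming SMS can be matched to the request it answers. The sketch below is an assumption-laden illustration, not the design of the actual converters: the `SmsBridge` class, the `id|body` message layout, the routing table, the endpoint URL and the phone number are all hypothetical, and the modem is replaced by an in-memory outbox.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: endpoint URL -> phone number of the remote gateway
ROUTES = {"http://example.org/sparql": "+22300000000"}


@dataclass
class SmsBridge:
    outbox: list = field(default_factory=list)   # (phone number, text) pairs for the modem
    pending: dict = field(default_factory=dict)  # request id -> response body, or None
    next_id: int = 0

    def send_request(self, url: str, query: str) -> int:
        """Queue a query as an SMS and return the id used to find its answer."""
        number = ROUTES[url]
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = None
        self.outbox.append((number, f"{request_id}|{query}"))
        return request_id

    def on_incoming(self, sender: str, text: str) -> None:
        """Handle a received SMS; store it if it answers a pending request."""
        request_id, _, body = text.partition("|")
        if request_id.isdigit() and int(request_id) in self.pending:
            self.pending[int(request_id)] = body

    def poll(self, request_id: int):
        """Return the response body, or None while no answer has arrived."""
        return self.pending.get(request_id)


bridge = SmsBridge()
rid = bridge.send_request("http://example.org/sparql", "ASK { ?s ?p ?o }")
before = bridge.poll(rid)                       # no answer yet
bridge.on_incoming("+22300000000", f"{rid}|true")
after = bridge.poll(rid)                        # answer matched by id
```

Polling keeps the caller free to treat the exchange either synchronously (wait until `poll` returns a body) or asynchronously (check back later), which is one possible reading of the blending challenge mentioned earlier.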