Graph Databases and Web Frameworks (NodeJS, AngularJS, GridFS, OpenLink Virtuoso)

Web frameworks and graph databases Overview and code demos João Rocha da Silva May 2014 [email protected]

Upload: joao-rocha-da-silva

Post on 25-May-2015


DESCRIPTION

Invited lecture on NoSQL databases and modern web-development frameworks.

• JavaScript + JSON = easy parsing, less verbose code

• NodeJS = asynchronous everything. Needs precise flow control

• ElasticSearch = Scalable indexing, easy-to-use JSON API

• GridFS = Transparent scaling for huge numbers of large files; querying using a JSON-based API

• Graph Databases = Model certain problems better than their relational counterparts. Simpler queries using SPARQL. Less mature than RDBMSs. No transactions in standard SPARQL.

• Socket.io = Real-time library for client-server-client push communication

TRANSCRIPT

Page 1: Graph Databases and Web Frameworks (NodeJS, AngularJS, GridFS, OpenLink Virtuoso)

Web frameworks and graph databases

Overview and code demos

João Rocha da Silva

May 2014

[email protected]

Page 2

Contents

• Modeling limits of relational databases

• Entities with variable attributes

• Time-variant values

• Inheritance

• Hierarchies (parents of parents of parents…)

Page 3

Contents (cont’d)

• Modeling problems in a graph

• Ontologies and SPARQL

• OpenLink Virtuoso

• Scalable file storage: GridFS within MongoDB

• Scalable document indexing : ElasticSearch

Page 4

Contents (cont’d)

• NodeJS and asynchronous flow control

• AngularJS for dynamic web interfaces

• BONUS: Socket.io sneak peek

Page 5

Relational databases

• Good when you know everything about the problem at the time of modeling

• A column can only be of a single type (VARCHAR, int, etc)

• Hard to document

• Model can become too attached to the code

Page 6

Relational databases

• Handling historical values = complex SQL

• Hierarchies = Foreign Key loops

• Variable attributes, inheritance = NULL-filled columns plus "if hell" in the code, or many JOINs
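To make the hierarchy point concrete, here is a minimal sketch of what "parents of parents of parents" costs with a self-referencing foreign key. The `folders` rows and the `parent_id` column are hypothetical, and the in-memory lookup only stands in for one SQL query per level.

```javascript
// Hypothetical rows of a self-referencing table: folders(id, name, parent_id).
const folders = [
  { id: 1, name: "root", parent_id: null },
  { id: 2, name: "projects", parent_id: 1 },
  { id: 3, name: "dendro", parent_id: 2 },
  { id: 4, name: "docs", parent_id: 3 },
];

// Stand-in for "SELECT * FROM folders WHERE id = ?".
function findById(rows, id) {
  return rows.find((row) => row.id === id) || null;
}

// Walks up the foreign key one level at a time; in a real RDBMS each
// iteration would be another query (or another self-JOIN).
function ancestorsOf(rows, id) {
  const names = [];
  let current = findById(rows, id);
  while (current !== null && current.parent_id !== null) {
    current = findById(rows, current.parent_id);
    if (current === null) break; // dangling foreign key
    names.push(current.name);
  }
  return names; // nearest parent first
}

console.log(ancestorsOf(folders, 4));
```

The depth of the hierarchy is not known in advance, so the loop (or the number of JOINs) cannot be fixed at modeling time.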

Page 7

Relational models

Page 8

[Figure: screenshot of an SAP table definition (one of 78,826 tables and counting), annotated "Beautiful, meaningful column names ;-)" and "Even better table names". Source: SAP]

Page 11

Entity with variable, time-dependent attributes

[Figure: table schema with annotations marking the fixed attributes, the attribute name, the value (always varchar) and the timestamps. Source: CKAN]
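The layout annotated on this slide (attribute name, string value, timestamp) can be sketched as follows. This is only an illustration: the row shape, the entity name and the value "100" are hypothetical and are not taken from the CKAN schema.

```javascript
// Hypothetical entity-attribute-value rows; every value is stored as a string.
const rows = [
  { entity: "dataset-1", attribute: "title", value: "DCB Base Data", timestamp: "2014-01-01" },
  { entity: "dataset-1", attribute: "specimenWidth", value: "100", timestamp: "2014-01-01" },
  { entity: "dataset-1", attribute: "specimenWidth", value: "120", timestamp: "2014-02-01" },
];

// Rebuilds the state of one entity at a given date: for each attribute,
// keep the most recent row whose timestamp is not after `asOf`.
// ISO dates (YYYY-MM-DD) compare correctly as plain strings.
function snapshot(allRows, entity, asOf) {
  const latest = {};
  for (const row of allRows) {
    if (row.entity !== entity || row.timestamp > asOf) continue;
    const seen = latest[row.attribute];
    if (seen === undefined || row.timestamp >= seen.timestamp) {
      latest[row.attribute] = row;
    }
  }
  const result = {};
  for (const [attribute, row] of Object.entries(latest)) {
    result[attribute] = row.value; // still a string; the caller must parse it
  }
  return result;
}

console.log(snapshot(rows, "dataset-1", "2014-01-15"));
```

Even this small case needs a "latest row per attribute" pass plus type parsing in application code, which is the kind of complexity the slide points at.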

Page 12

Graph models

Page 13

Graph databases

• Represent entities (Users, Products, Places…) as vertexes (entity types are called classes)

• Connections between them are directed graph edges (edge types are called properties)

• The meaning of these connections is expressed in ontologies that can be shared and reused

Page 14

Representing a person using ontologies

Page 15

[Figure: RDF graph centered on http://www.fe.up.pt/~pro11004, with three outgoing edges: foaf:name "João Rocha"; rdf:type up:PhDStudent; org:memberOf http://www.fe.up.pt/. Ontologies shown: http://www.w3.org/TR/rdf-schema/ and http://www.foaf-project.org/]
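The person graph on this slide can be written down as subject-predicate-object triples. The sketch below keeps the prefixed names as plain strings and uses a hypothetical helper, `describe`, to gather the edges of one vertex; a real triple store would resolve the prefixes to full URIs.

```javascript
// Triples read off the slide: [subject, predicate, object].
const person = "http://www.fe.up.pt/~pro11004";
const triples = [
  [person, "foaf:name", "João Rocha"],
  [person, "rdf:type", "up:PhDStudent"],
  [person, "org:memberOf", "http://www.fe.up.pt/"],
];

// Collects every outgoing edge of a subject into a plain object.
// A predicate may repeat, so each key maps to an array of objects.
function describe(allTriples, subject) {
  const description = {};
  for (const [s, p, o] of allTriples) {
    if (s !== subject) continue;
    if (!description[p]) description[p] = [];
    description[p].push(o);
  }
  return description;
}

console.log(describe(triples, person));
```

Adding a new attribute to this person means appending one more triple; no schema change is involved.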

Page 16

Getting all the students

    SELECT ?uri ?attribute ?value
    FROM <http://myorganization.com/data>
    WHERE {
      ?uri rdf:type up:Student .
      ?uri ?attribute ?value
    }

• Will fetch all the students, regardless of their specific type (this relies on subclass inference, covered on the next slide)

• Will also return their attributes (“database columns”)

• Different types of students will have different attributes
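As a rough sketch of how a NodeJS application might send this query, the code below builds a SPARQL Protocol GET URL and regroups the JSON results by student. The endpoint address, the `up:` namespace URI and both function names are assumptions made for illustration; the slides give none of them.

```javascript
// Assumed endpoint: Virtuoso usually serves SPARQL at /sparql on port 8890.
const ENDPOINT = "http://localhost:8890/sparql";

// The up: namespace URI is hypothetical; the slides do not give it.
const STUDENTS_QUERY = `
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX up: <http://example.org/up#>
SELECT ?uri ?attribute ?value
FROM <http://myorganization.com/data>
WHERE {
  ?uri rdf:type up:Student .
  ?uri ?attribute ?value
}`;

// Builds a SPARQL Protocol GET request URL; the query travels in the
// standard "query" parameter.
function buildQueryUrl(endpoint, query) {
  const url = new URL(endpoint);
  url.searchParams.set("query", query.trim());
  return url.toString();
}

// Groups result bindings (SPARQL JSON results format) by subject URI.
// If an attribute repeats for a subject, the last value wins here.
function groupBySubject(results) {
  const bySubject = {};
  for (const binding of results.results.bindings) {
    const uri = binding.uri.value;
    if (!bySubject[uri]) bySubject[uri] = {};
    bySubject[uri][binding.attribute.value] = binding.value.value;
  }
  return bySubject;
}
```

The URL can then be requested with any HTTP client, sending an `Accept: application/sparql-results+json` header, and the parsed response passed to `groupBySubject`. Each student ends up with its own set of keys, which matches the last bullet above.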

Page 17

Inference

• Transitive Properties (subclass of subclass…)

• Subclasses

• Multiple Inheritance Handling (Student + Researcher + ScholarshipHolder)

Saves coding time spent writing complex queries
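What subclass inference spares the application from doing can be sketched in a few lines. The class hierarchy and the two example individuals below are hypothetical (the slides name the classes but not how they relate), and a real store would apply this reasoning inside the query engine rather than in application code.

```javascript
// Hypothetical rdfs:subClassOf edges; one class may have several parents.
const subClassOf = new Map([
  ["up:PhDStudent", ["up:Student", "up:Researcher"]],
  ["up:ScholarshipHolder", ["up:Person"]],
  ["up:Student", ["up:Person"]],
  ["up:Researcher", ["up:Person"]],
]);

// Hypothetical rdf:type assertions: [individual, asserted class].
const typeAssertions = [
  ["http://example.org/ana", "up:PhDStudent"],
  ["http://example.org/rui", "up:ScholarshipHolder"],
];

// Transitive closure: every class reachable by following subClassOf edges.
function superClasses(cls, hierarchy) {
  const seen = new Set();
  const stack = [cls];
  while (stack.length > 0) {
    const current = stack.pop();
    for (const parent of hierarchy.get(current) || []) {
      if (!seen.has(parent)) {
        seen.add(parent);
        stack.push(parent);
      }
    }
  }
  return seen;
}

// An individual is an instance of `cls` if its asserted class is `cls`
// or any (transitive) subclass of it.
function instancesOf(cls, assertions, hierarchy) {
  return assertions
    .filter(([, type]) => type === cls || superClasses(type, hierarchy).has(cls))
    .map(([uri]) => uri);
}

console.log(instancesOf("up:Student", typeAssertions, subClassOf));
```

With inference handled by the database, a query for `up:Student` stays a single triple pattern regardless of how many subclasses exist.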

Page 18

Nothing comes for free

• Aggregation operators are slow

• Transactions are not supported in standard SPARQL

• (“SPARQL 1.1 Query/Update Services should be atomic but that they are not required to be atomic.”)

• Graph DBMS Solutions are in early stages (many bugs, many “beta”s, many mailing lists…)

Page 19

An example application

Page 20

Dendro (dendro-dev.fe.up.pt:3001)

• Dropbox and File/Folder description platform

• Variable descriptions

• Time-dependent values

• Directory structures (hierarchy)

• Need for simple querying…

Page 21

Change recording

[Figure: graph showing the current dataset version next to its past revisions. Node labels: Pn, Dn, Dn-1, Fn, Un, C. Properties: nie:isLogicalPartOf, dc:title, dcb:specimenWidth, dcb:initialCrackLength, dc:isReferencedBy, dc:isVersionOf, dc:created, dc:modified, dc:creator, ddr:pertainsTo, ddr:changedDescriptor, ddr:operation. Literal values: “DCB Base Data”, 120, 280mm, 01/01/2014^^xsd:date, “add”, ddr:initialCrackLength. Annotations: "Added property instance", "Changed modification timestamp", "Revision creation timestamp"]

Page 22

[Figure: architecture of the example application]

• NodeJS: Business logic

• Socket.io: Real-time events

• AngularJS: Dynamic interfaces à la Google Docs

• GridFS: Files

• OpenLink Virtuoso: Database

• ElasticSearch: Free-text search

Page 23

Code Demos

NodeJS (Dendro) http://192.168.5.75:3001

GridFS http://192.168.5.75:27017

OpenLink Virtuoso http://192.168.5.75:8890

ElasticSearch http://192.168.5.75:9200/_plugin/head/

Socket.io (BattleBits) http://localhost:3000

Page 24

Conclusions

• JavaScript + JSON = easy parsing, less verbose code

• NodeJS = asynchronous everything. Needs precise flow control

• ElasticSearch = Scalable indexing, easy to use JSON API

• GridFS = Transparent scaling for huge numbers of large files; querying using JSON-based API

• Graph Databases = Model certain problems better than their relational counterparts. Simpler queries using SPARQL. Less mature than RDBMSs. No transactions in standard SPARQL.

• Socket.io = Real-time library for client-server-client push communication
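The flow-control point can be illustrated with a minimal sketch of a `series` helper for Node-style callbacks. The helper and the three step names are hypothetical and written from scratch for this example; they are not code from Dendro, and dedicated flow-control libraries or Promises serve the same purpose.

```javascript
// Runs callback-style tasks one after another. Each task receives a
// Node-style callback (err, result); the first error stops the chain.
function series(tasks, done) {
  const results = [];
  function next(index) {
    if (index === tasks.length) return done(null, results);
    tasks[index]((err, result) => {
      if (err) return done(err, results);
      results.push(result);
      next(index + 1);
    });
  }
  next(0);
}

// Hypothetical steps of an upload; each one finishes asynchronously.
function step(name, delayMs) {
  return (callback) => setTimeout(() => callback(null, name), delayMs);
}

series(
  [step("store file", 30), step("save metadata", 10), step("index document", 20)],
  (err, results) => {
    if (err) throw err;
    // Results arrive in task order, not in order of delay.
    console.log(results.join(" -> "));
  }
);
```

Without such a helper, the three steps would either be nested three callbacks deep or, if started together, would finish in an order that depends on their delays.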

Page 25

João Rocha da Silva is an Informatics Engineering PhD student at the Faculty of Engineering of the University of Porto. He specializes in research data management, applying the latest Semantic Web technologies to the adequate preservation and discovery of research data assets.

He is experienced in many programming languages (JavaScript-Node, PHP with MVC frameworks, Ruby on Rails, J2EE, etc.) running on the major operating systems (everyday Mac user). Regardless of language, he is a quick learner who can adapt to any new technology quickly and effectively.

He is also an experienced freelance iOS developer with several apps published on the App Store, and a self-taught DIY mechanic with a special interest in classic cars, particularly his 1987 Toyota Corolla GT Twin Cam, also known as Hachi-Roku or AE86.

Research Data Management and Semantic Web Researcher, Web & iPhone Developer

João Rocha da Silva

[email protected]