Applied Databases
Sebastian Maneth
Lecture 1: Introduction, Basics of XML
University of Edinburgh - January 16th, 2017
Applied Databases
Apply database technology (e.g. MySQL) in varying contexts
Together with other technologies:
- XML
- Lucene (full-text search)
- RDF
WARNING: the Course Catalogue mentions Similarity Search and Data Analytics.
Unfortunately, these will NOT be covered this year.
Course Organization
Lectures Monday 14:10–15:00 G.07, Medical School
Thursday 14:10–15:00 Lecture Theatre 2, Appleton Tower
Lecturer Sebastian Maneth ([email protected])
TA Fabian Peternek
Assessment Exam (60%)
Assignment 1 (20%) due 17th February, 4:00pm
Assignment 2 (20%) due 24th March, 4:00pm
Course Format
20 Lectures: all material covered in the lectures is examinable
Assignments: Lectures 1–12 cover material relevant to the Assignments
Assignments
taken, with consent and warm thanks, from UCLA lecture “CS144: Web Applications”
Assignments 1 & 2
- Programming assignments, in Java & SQL
- Pair programming: you are allowed to program in pairs
Rules:
- work either alone or with a partner
- you may change partner for the 2nd assignment
- submit one solution
- same mark for both members of the team
Assignment 1
1) design a relational schema for EBAY data
2) convert EBAY data from XML into relational tables (csv files)
3) import csv files into a MySQL database
4) execute some SQL queries over the database
Requires
- XML parsing (DTDs, DOM, SAX)
- basic DB knowledge (schema design, basic SQL queries)
XML parsing is covered in Lectures 1 – 4
Basic DB knowledge is covered in Lectures 5 – 8
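Steps 2 and 3 of Assignment 1 can be sketched roughly as follows. This is only an illustration, not the assignment solution: the element names Items, Item, Name, Price and the id attribute are hypothetical, because the actual EBAY data format is not shown in these slides. The sketch is written in Python for brevity (the assignments themselves use Java) and relies only on the standard xml.etree.ElementTree and csv modules.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input: these element and attribute names are assumptions.
XML_DATA = """<Items>
  <Item id="1"><Name>Lamp</Name><Price>12.50</Price></Item>
  <Item id="2"><Name>Desk, oak</Name><Price>80.00</Price></Item>
</Items>"""


def items_to_csv(xml_text):
    """Turn every Item element into one csv row (id, name, price)."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for item in root.iter("Item"):
        writer.writerow(
            [item.get("id"), item.findtext("Name"), item.findtext("Price")]
        )
    return out.getvalue()


csv_text = items_to_csv(XML_DATA)
print(csv_text)
```

The csv writer quotes the field that contains a comma, so each row still maps to exactly one tuple of a relational table. A file written this way could then be imported into MySQL, for instance with its LOAD DATA statement.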
Assignment 1
Pair programming
together design database schema
individually write load functions for different tables
Ideally together find abstractions that make the code small, elegant, and readable
Assignment 2
1) create a Lucene full-text Index (from Java)
2) implement a basic keyword search function
3) build a spatial index in MySQL
4) implement spatial search
5) create web interface for keyword & spatial search and for display of results
Requires
- spatial search
- basic knowledge of Lucene / text-indexing
Lectures 10–12
Lecture 9
Assignments 1 & 2
hands-on experience to implement a web store such as EBAY or similar!
Applied Databases
Main Topics
XML (Lectures 1 – 4)
DB schema design, SQL (Lectures 5 – 8)
Lucene (Lectures 9 – 12)
String Matching (Lectures 13 – 16)
XPath, XSLT, RDF, SPARQL (Lectures 17 – 19)
Lecture 1
Basics of XML
Outline
1. Motivations for XML
2. Well-formed XML
3. Parsing / DTD Validation: Introduction
XML
Similar to HTML (Berners-Lee, CERN, W3C), but you use your own tags
XML is the de-facto standard for data exchange on the web
1. XML
Motivation
to have one language to speak about data
XML is a Data Exchange Format
1974 SGML Standard Generalized Markup Language (Charles Goldfarb at IBM Research)
1989 HTML (Tim Berners-Lee at CERN/Geneva)
1994 Berners-Lee founds Web Consortium (W3C)
1996 XML (W3C draft, v1.0 in 1998)
1. XML Motivation
http://www.w3.org/TR/REC-xml/
Text file
Philip Wadler
U. of Edinburgh
[email protected]
…
Helmut Seidl
TU Munich
[email protected]
“mark it up!”
XML document
<Related>
  <colleague>
    <name>Philip Wadler</name>
    <affil>U. of Edinburgh</affil>
    <email>[email protected]</email>
  </colleague>
  …
  <friend>
    <name>Helmut Seidl</name>
    <affil>TU Munich</affil>
    <email>[email protected]</email>
  </friend>
</Related>
XML = data + structure (mark-up)
Is this a “good” structure?
XML Documents
Ordinary text files (UTF-8, UTF-16, US-ASCII …)
Originates from typesetting/DocProcessing community
Idea of labeled brackets (“mark up”) for structure is not new! (already used by Chomsky in the 1960’s)
Brackets describe a tree structure
Allows applications from different vendors to exchange data!
standardized, extremely widely accepted!
Social Implications! All sciences (biology, geography, meteorology, astrology…) have their own XML “dialects” to exchange their data optimally
Contra: highly verbose, lots of repetitive markup
Pro: we have a standard! A STANDARD! You never need to write a parser again! Use XML!
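To illustrate that the brackets describe a tree and that an off-the-shelf parser does the work, here is a small sketch using Python's standard xml.etree.ElementTree module. The document is an abbreviated version of the Related example (email elements left out); the helper name outline is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the Related example from the slides.
DOC = """<Related>
  <colleague><name>Philip Wadler</name><affil>U. of Edinburgh</affil></colleague>
  <friend><name>Helmut Seidl</name><affil>TU Munich</affil></friend>
</Related>"""


def outline(node, depth=0):
    """Return one indented line per element node, in document order."""
    text = (node.text or "").strip()
    line = "  " * depth + node.tag + (": " + text if text else "")
    lines = [line]
    for child in node:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The indentation of each printed line corresponds to the depth of the element in the tree; no hand-written parser is involved.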
XML: Validation & Parsing
… instead of writing a parser, you simply fix your own “XML dialect”,
by describing all “admissible structures” (+ maybe even the specific
data types that may appear inside).
You do this, using an XML Type definition language such as DTD, XML Schema, or Relax NG.
Type definition languages must be SIMPLE, because you want the parsers to be efficient!
They are similar to EBNF: context-free grammars with regular expressions in the right-hand sides.
XML Documents
Element names and their content
Example DTD (Document Type Definition)
Related (colleague | friend | family)*
colleague (name,affil*,email*)
friend (name,affil*,email*)
family (name,affil*,email*)
name (#PCDATA)
…
[Figure: the document drawn as a tree. The root Related has children friend, …, colleague, family; below them are name, affil, and email nodes, with text such as “Helmut ..” at the leaves]
ordered, unranked tree
“Element node”
“Text node”
Terminology
The document is valid with respect to (wrt) the DTD
“It validates”
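What "valid wrt the DTD" means for the example can be sketched as follows. This is not a real DTD validator: the content models of the example DTD are rewritten by hand as regular expressions over the sequence of child names, and affil and email are assumed to be text-only like name (the slide elides their declarations). The names CONTENT_MODELS, TEXT_ONLY, and is_valid are made up for this sketch.

```python
import re
import xml.etree.ElementTree as ET

# Content models of the example DTD as regular expressions over the
# space-terminated names of an element's children (an encoding chosen
# for this sketch).
CONTENT_MODELS = {
    "Related": r"((colleague|friend|family) )*",
    "colleague": r"name (affil )*(email )*",
    "friend": r"name (affil )*(email )*",
    "family": r"name (affil )*(email )*",
}
# Assumption: affil and email are #PCDATA, like name.
TEXT_ONLY = {"name", "affil", "email"}


def is_valid(node):
    """Check the subtree rooted at node against the content models."""
    if node.tag in TEXT_ONLY:
        return len(node) == 0  # text only, no child elements
    model = CONTENT_MODELS.get(node.tag)
    if model is None:
        return False  # element name not declared
    children = "".join(child.tag + " " for child in node)
    if re.fullmatch(model, children) is None:
        return False
    return all(is_valid(child) for child in node)


good = ET.fromstring(
    "<Related><friend><name>Helmut Seidl</name><email>x</email></friend></Related>"
)
bad = ET.fromstring(
    "<Related><friend><email>x</email><name>Helmut Seidl</name></friend></Related>"
)
print(is_valid(good), is_valid(bad))
```

The second document is rejected because email appears before name, which the content model (name,affil*,email*) does not allow: the tree is ordered.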
XML Documents
What else: (besides element and text nodes)
- attributes
- processing instructions
- comments
- namespaces
- entity references (two kinds)
<entry date="2017-01-16"><name>…</entry>
→ at most one date attribute
→ no substructure possible
versus:
<date>2017-01-16</date>
<date> <year>2017</year> <month>01</month> <day>16</day> </date>
Processing instructions are intended to carry instructions to the application:
<?php sql ("SELECT * FROM …") …?>
Comments: <!-- some comment -->
<!-- the 'price' element's namespace is http://ecommerce.org/schema -->
<edi:price xmlns:edi='http://ecommerce.org/schema' units='Euro'>32.18</edi:price>
Namespaces provide unique element and attribute names
Character reference: Type <key>less-than</key> (&#x3C;) to save options.
Entity reference: This document was prepared on &docdate; and …
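How a parser exposes some of these constructs can be seen in the following sketch, which uses Python's standard xml.etree.ElementTree module. The snippets from the slide are combined into one small document; the wrapping note element is an assumption added so that the character-reference example has a parent. The entity reference &docdate; is left out, because it would need a declaration in a DTD before a parser accepts it.

```python
import xml.etree.ElementTree as ET

# Snippets from the slide combined into one document; <note> is an
# assumed wrapper element, not part of the original examples.
DOC = """<entry date="2017-01-16">
  <!-- some comment -->
  <edi:price xmlns:edi='http://ecommerce.org/schema' units='Euro'>32.18</edi:price>
  <note>Type <key>less-than</key> (&#x3C;) to save options.</note>
</entry>"""

root = ET.fromstring(DOC)

# Attribute: looked up by name on the element.
date = root.get("date")

# Namespace: the prefix edi is resolved to its URI, which becomes
# part of the element name.
price = root.find("{http://ecommerce.org/schema}price")
units = price.get("units")

# Character reference: &#x3C; arrives as the character "<".
note_text = "".join(root.find("note").itertext())

# Comment: dropped by the default ElementTree parser.
child_tags = [child.tag for child in root]

print(date, units, price.text)
print(note_text)
print(child_tags)
```

Note that the application never sees the prefix edi itself, only the namespace URI it stands for.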
XML: not tree but Graph
attributes of type ID: must be unique, i.e., no duplicate values
may be referenced via attributes of type IDREF
[Figure: the tree of the Related document; one node carries the ID attribute UID="173478", and a friend node refers to it via the IDREF attribute uidREF="173478"]
→ ID-attributes are similar to keys in relational DBs
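The two checks behind ID and IDREF (uniqueness of IDs, existence of referenced IDs) can be sketched like this. ElementTree does not read attribute types from a DTD, so the sketch simply assumes that UID plays the role of the ID attribute and uidREF that of the IDREF attribute, as in the figure. Which element carries the UID is also an assumption, and check_ids is a name made up here.

```python
import xml.etree.ElementTree as ET

# Assumption: the colleague carries the ID; the figure does not say which
# node does.
DOC = """<Related>
  <colleague UID="173478"><name>Philip Wadler</name></colleague>
  <friend uidREF="173478"><name>Helmut Seidl</name></friend>
</Related>"""


def check_ids(root, id_attr="UID", ref_attr="uidREF"):
    """Return (index of ID values, list of violations)."""
    ids = {}
    errors = []
    # First pass: collect IDs and check uniqueness (like a key).
    for node in root.iter():
        value = node.get(id_attr)
        if value is not None:
            if value in ids:
                errors.append("duplicate ID " + value)
            ids[value] = node
    # Second pass: every reference must point to an existing ID.
    for node in root.iter():
        ref = node.get(ref_attr)
        if ref is not None and ref not in ids:
            errors.append("dangling IDREF " + ref)
    return ids, errors


ids, errors = check_ids(ET.fromstring(DOC))
print(errors, ids["173478"].tag)
```

Following a reference through the index (ids[ref]) is the extra edge that turns the tree into a graph.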
XML, typical usage scenario
XML data source:
<Product>
  <product_id> m101 </product_id>
  <name> Sony walkman </name>
  <currency> AUD </currency>
  <price> 200.00 </price>
  <gst> 10% </gst>
</Product>
…
[Figure: the XML data passes through three XML Stylesheets (presentation, format info), each producing a view]
One data source, several dynamically generated views
Document structure (def. of price, gst, …): DTD, XML Schema
XML: has it succeeded?
Yes and No:
has become *very* popular and widely adopted
technically it is still (!) challenging:
(*) standard too complex
(*) causes, e.g., slowness of XML parsers (a “threat to databases”)
JSON (JavaScript Object Notation)
- invented in 2001 by Douglas Crockford
- took off around 2005/2006
XML vs JSON
<Related>
  <colleague>
    <name>Philip Wadler</name>
    <affil>U. of Edinburgh</affil>
    <email>[email protected]</email>
  </colleague>
  …
  <friend>
    <name>Helmut Seidl</name>
    <affil>TU Munich</affil>
    <email>[email protected]</email>
  </friend>
</Related>
Related = {
  "colleague": {"name": "Philip Wadler", "affil": "U. of Edinburgh", "email": "[email protected]"},
  …
  "friend": {"name": "Helmut Seidl", "affil": "TU Munich", "email": "[email protected]"}
}
XML vs JSON
XML
- 7 node types
- DTDs are built in
Very rich schema languages, e.g.,
- XML Schema (e.g., XHTML schema: >2000 lines)
JSON
6 data types:
- number
- string
- boolean (true / false)
- array
- object (set of name:value pairs)
- empty value (null)
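The six JSON data types can be observed directly with a standard JSON parser. The sketch below uses Python's json module; the keys and values are invented for illustration, one per data type.

```python
import json

# One illustrative member per JSON data type (keys and values are made up).
TEXT = """{
  "number": 32.18,
  "string": "TU Munich",
  "boolean": true,
  "array": [1, 2],
  "object": {"k": "v"},
  "empty": null
}"""

value = json.loads(TEXT)

# Map each member to the name of the Python type the parser produced.
type_names = {key: type(val).__name__ for key, val in value.items()}
print(type_names)
```

Each JSON data type maps onto exactly one built-in Python type here, which reflects how small the JSON data model is compared to XML's node types.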
2. Well-Formed XML
http://www.w3.org/TR/REC-xml/
From the W3C XML recommendation
“A textual object is well-formed XML if,
(1) taken as a whole, it matches the production labeled document
(2) it meets all the well-formedness constraints given in this specification ..”
document = start symbol of a context-free grammar (“XML grammar”)
(1) contains the context-free properties of well-formed XML
(2) contains the context-dependent properties of well-formed XML
There are 10 WFCs (well-formedness constraints).
E.g.: Element Type Match: “The Name in an element’s end tag must match the element name in the start tag.”
Why is this not context-free?
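The Element Type Match constraint can be checked with a stack, as in the sketch below. It also hints at an answer to the question: the set of possible names is unbounded, so the check has to compare arbitrary strings rather than a fixed, finite set of bracket symbols. The regular expression is a simplification made for this sketch (it ignores, for example, ">" inside attribute values), and tags_match is a made-up name.

```python
import re

# Simplified tag pattern: optional "/", a name, anything up to an
# optional "/" and the closing ">". Comments and processing instructions
# do not match because their second character is not a name character.
TAG = re.compile(r"<(/?)([A-Za-z_:][A-Za-z0-9._:-]*)[^<>]*?(/?)>")


def tags_match(text):
    """True if every end tag carries the name of its matching start tag."""
    stack = []
    for m in TAG.finditer(text):
        closing, name, empty = m.group(1), m.group(2), m.group(3)
        if empty:
            continue  # empty-element tag such as <br/>: nothing to match
        if closing:
            if not stack or stack.pop() != name:
                return False  # end tag without matching start tag
        else:
            stack.append(name)
    return not stack  # all start tags must have been closed


print(tags_match("<a><b></b></a>"))
print(tags_match("<a><b></a></b>"))
```

The first call reports a match, the second does not, because the end tag </a> meets the open element b.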
XML Grammar - EBNF-style
[1] document ::= prolog element Misc*
[2] Char ::= a Unicode character
[3] S ::= (' ' | '\t' | '\n' | '\r')+
[4] NameChar ::= (Letter | Digit | '.' | '-' | ':')
[5] Name ::= (Letter | '_' | ':') (NameChar)*
[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[25] Eq ::= S? '=' S?
[26] VersionNum ::= '1.0'
[39] element ::= EmptyElemTag | STag content ETag
[40] STag ::= '<' Name (S Attribute)* S? '>'
[41] Attribute ::= Name Eq AttValue
[42] ETag ::= '</' Name S? '>'
[43] content ::= (element | Reference | CharData?)*
[44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
[67] Reference ::= EntityRef | CharRef
[68] EntityRef ::= '&' Name ';'
[84] Letter ::= [a-zA-Z]
[88] Digit ::= [0-9]
in: Proceedings of the 2003 ACM CIKM International Conference on Information and Knowledge Management, New Orleans, Louisiana, USA, November 2-8, 2003
How expensive is XML Parsing?
DTD is part of XML
DTDs may contain (deterministic) regular expressions
How expensive is it to match a text of size n against a regular expression of size m?
DTDs allow recursive definitions
DTDs allow ID and IDREF attributes (ID: check uniqueness, IDREF: check existence)
Compare this to the parsing complexity of
- JSON
- csv files (csv = “comma-separated values”) [IBM Fortran, 1967]
END Lecture 1