XML & JSON: interchangeability and case studies

Page 1: XML & JSON: interchangeability and case studies

Salvatore Cristofaro, Pietro Sichera and Daria Spampinato

Consiglio Nazionale delle Ricerche Istituto di Scienze e Tecnologie della Cognizione

Catania

XML & JSON: interchangeability and case studies Part 1: from text to XML/JSON

Page 2: XML & JSON: interchangeability and case studies

Semantic web

Salvatore Cristofaro, Pietro Sichera and Daria Spampinato – 1st March 2021

•  Classic web enhancement
•  Information encoding
•  Information ambiguity
•  Information transfer systems
•  Searching, maintaining and preserving reliable data
•  Methods for data use and exchange

XML and JSON

Page 3: XML & JSON: interchangeability and case studies

XML and JSON

•  Created for data exchange between client and server
•  Readable
•  Hierarchical
•  Many tools read and use them

Page 4: XML & JSON: interchangeability and case studies

XML and JSON: differences

XML and JSON or XML vs JSON

XML
•  Longer
•  Needs an XML parser to be interpreted
•  No "array" data type

JSON
•  Shorter
•  Can be parsed by a standard JavaScript function
•  Native "array" data type

Page 5: XML & JSON: interchangeability and case studies

Information encoding

•  Communication
•  Character encoding
•  Text storing
•  Text transmission

Page 6: XML & JSON: interchangeability and case studies

Information encoding

Definitions

•  String
•  Repertoire of characters
•  Charset

Page 7: XML & JSON: interchangeability and case studies

Information encoding

•  Morse
•  Enigma
•  ASCII

Page 10: XML & JSON: interchangeability and case studies

Information encoding

ASCII

01001000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100

48 65 6C 6C 6F 20 77 6F 72 6C 64

Hello world
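The three renderings of the same string on this slide (binary, hexadecimal, text) can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `to_binary` and `to_hex` are illustrative and do not come from the slides.

```python
def to_binary(text: str) -> str:
    """Return the ASCII bytes of text as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("ascii"))


def to_hex(text: str) -> str:
    """Return the ASCII bytes of text as space-separated hexadecimal pairs."""
    return " ".join(f"{byte:02X}" for byte in text.encode("ascii"))


message = "Hello world"
print(to_binary(message))
print(to_hex(message))  # 48 65 6C 6C 6F 20 77 6F 72 6C 64
```

Encoding with `"ascii"` raises `UnicodeEncodeError` for any character outside the 128-character ASCII repertoire, which is the limitation the following slides address.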

Page 11: XML & JSON: interchangeability and case studies

Information encoding

ASCII

•  From 128 to 256 characters (from 7 bits to 8 bits)
•  Charsets from IBM, HP, Apple, Microsoft
•  From code pages to ISO standards
•  ISO vs ANSI

Page 12: XML & JSON: interchangeability and case studies

Information encoding

UNICODE

•  143,859 characters
•  Covering 154 modern and historic scripts
•  Character encodings:
    •  UTF-32
    •  UTF-16
    •  UTF-8

Page 13: XML & JSON: interchangeability and case studies

Information encoding

UTF-16

•  2 or 4 bytes per character
•  3 encoding schemes:
    •  UTF-16
    •  UTF-16LE (Little Endian)
    •  UTF-16BE (Big Endian)
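The difference between the three UTF-16 schemes is easiest to see on actual bytes. The following Python sketch uses the standard codec names `utf-16`, `utf-16-le` and `utf-16-be`; the helper `utf16_hex` is a hypothetical name introduced only for this illustration.

```python
def utf16_hex(text: str, scheme: str) -> str:
    """Encode text with the given UTF-16 scheme and show the bytes in hex."""
    return text.encode(scheme).hex(" ")


# "A" (U+0041) fits in one 16-bit unit; only the byte order differs.
print(utf16_hex("A", "utf-16-le"))  # 41 00
print(utf16_hex("A", "utf-16-be"))  # 00 41

# Plain "utf-16" prepends a byte order mark (BOM) so a reader can detect the order.
print(utf16_hex("A", "utf-16"))

# A character outside the BMP needs two 16-bit units (a surrogate pair): 4 bytes.
print(utf16_hex("\U0001F600", "utf-16-be"))  # d8 3d de 00
```

The output of the plain `utf-16` call depends on the byte order Python picks for the platform, so no expected value is given for it.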

Page 14: XML & JSON: interchangeability and case studies

Information encoding

UTF-8

•  1 to 4 bytes per character
•  1,112,064 valid code points in Unicode:
    •  1 byte: standard ASCII
    •  2 bytes: Arabic, Hebrew, most European scripts
    •  3 bytes: the rest of the BMP (Basic Multilingual Plane)
    •  4 bytes: all remaining Unicode characters
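The variable length of UTF-8 can be checked directly. This is a small Python sketch; the sample characters and the helper name `utf8_length` are chosen for illustration and are not taken from the slides.

```python
def utf8_length(char: str) -> int:
    """Return how many bytes UTF-8 needs for a single character."""
    return len(char.encode("utf-8"))


# One illustrative character per length class.
samples = {
    "A": "Latin letter (ASCII)",
    "\u00e9": "Latin letter with accent",
    "\u05d0": "Hebrew letter alef",
    "\u20ac": "euro sign (elsewhere in the BMP)",
    "\U0001F600": "emoji (outside the BMP)",
}

for char, description in samples.items():
    print(f"U+{ord(char):04X} {description}: {utf8_length(char)} byte(s)")
```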

Page 15: XML & JSON: interchangeability and case studies

Information encoding

Mojibake

Figure: the UTF-8-encoded Japanese Wikipedia article for Mojibake as displayed if interpreted as Windows-1252 encoding.
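The same effect can be reproduced on a small scale by encoding text as UTF-8 and decoding the bytes with the wrong charset. The sketch below assumes Python's `cp1252` codec as a stand-in for Windows-1252; the function names and the sample word are illustrative.

```python
def mojibake(text: str, wrong_charset: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    # errors="replace" is needed because a few byte values have no
    # Windows-1252 character and would otherwise raise an error.
    return text.encode("utf-8").decode(wrong_charset, errors="replace")


def repair(garbled: str, wrong_charset: str = "cp1252") -> str:
    """Undo the mistake; works only if no byte was lost to replacement."""
    return garbled.encode(wrong_charset).decode("utf-8")


garbled = mojibake("perché")
print(garbled)          # perchÃ©
print(repair(garbled))  # perché
```

The repair step only works while every byte survived the wrong decoding, which is why mojibake in stored data is often not fully reversible.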

Page 16: XML & JSON: interchangeability and case studies

Information encoding

UTF-8

•  The most common encoding for the World Wide Web
•  Accounting for 97% of all web pages
•  Up to 100% for some languages

Page 17: XML & JSON: interchangeability and case studies

Data exchange

FAIR principles

•  Findable
•  Accessible
•  Interoperable
•  Reusable

Page 18: XML & JSON: interchangeability and case studies

Data exchange

CSV

Page 19: XML & JSON: interchangeability and case studies

Data exchange

CSV

CSV advantages
•  CSV is human readable and easy to edit manually
•  CSV is simple to implement and parse
•  CSV is processed by almost all existing applications
•  CSV provides a straightforward information schema
•  CSV is faster to handle
•  CSV is smaller in size
•  CSV is considered to be a standard format
•  CSV is compact: in XML you need a start tag and an end tag for each column in each row, while in CSV you write the column headers only once
•  CSV is easy to generate

CSV disadvantages
•  CSV can carry only basic data; complex configurations cannot be imported and exported this way
•  There is no distinction between text and numeric values
•  No standard way to represent binary data
•  Problems with importing CSV into SQL (no distinction between NULL and an empty quoted string)
•  Poor support of special characters
•  No standard way to represent control characters
•  Lack of a universal standard
•  Field data may also contain commas or even embedded line breaks
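Two of the disadvantages listed above, embedded commas or line breaks and the missing distinction between text and numbers, can be observed with Python's standard `csv` module. The sample rows are invented for illustration.

```python
import csv
import io

# Hypothetical sample data: one field contains a comma, another a line break.
rows = [
    ["name", "note", "count"],
    ["Rossi, Mario", "line one\nline two", "42"],
]

# Write the rows; the writer quotes fields that contain commas or line breaks.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read them back: the structure survives, but every value is a string.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
print(parsed[1])
print(type(parsed[1][2]))  # <class 'str'>: "42" is text, not a number
```

A naive reader that splits each line on commas would break on this file, which is why a real CSV parser is preferable even for such a simple format.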

Page 20: XML & JSON: interchangeability and case studies

Data exchange

ISO/OSI

Page 23: XML & JSON: interchangeability and case studies

Data exchange

HTML - The Web 1.0

•  World Wide Web (WWW)
•  Tim Berners-Lee
•  SGML
•  Netscape vs Microsoft

Page 24: XML & JSON: interchangeability and case studies

Data exchange

HTML - The Web 1.0

•  Programming language
•  Standard markup language
•  Web browser

Page 25: XML & JSON: interchangeability and case studies

Data exchange

HTML - The Web 1.0

•  Syntax
•  Semantics
•  Representation
•  Behaviour

Page 26: XML & JSON: interchangeability and case studies

Data exchange

HTML - The Web 1.0

Page 27: XML & JSON: interchangeability and case studies

Data exchange

HTML - The Web 1.0

Figure: EUPORIA web page source

Page 28: XML & JSON: interchangeability and case studies

Data exchange

XML - The Web 1.1

•  eXtensible Markup Language
•  Specification for the definition of markup languages
•  World Wide Web Consortium (W3C)
•  HTML as an XML application -> XHTML

Page 29: XML & JSON: interchangeability and case studies

Data exchange

XML - The Web 1.1

•  Integrity of data in any XML document
•  Technology to interoperate with any platform

Page 30: XML & JSON: interchangeability and case studies

Data exchange

The way to JSON: Java, .NET and AJAX

•  Sun and Microsoft
•  Java
    •  object-oriented programming language
    •  “write once, run anywhere”
•  .NET, C#
    •  XML to solve the data interoperability puzzle

Page 31: XML & JSON: interchangeability and case studies

Data exchange

The way to JSON: Java, .NET and AJAX

•  AJAX: “Asynchronous JavaScript and XML”
•  Communications in background
•  Single-page Application (SPA)
•  JavaScript for everyone
•  Web 2.0

Page 32: XML & JSON: interchangeability and case studies

Data exchange

JSON

•  HTML document containing some JavaScript
•  Interoperability across all browsers
•  Data interchange between arbitrary languages

Page 33: XML & JSON: interchangeability and case studies

Data exchange

JSON

“XML is the most fully developed means of getting data in and out of an AJAX client, but there’s no reason you couldn’t accomplish the same effects using a technology like JavaScript Object Notation or any similar means of structuring data.”

Page 34: XML & JSON: interchangeability and case studies

XML vs JSON

XML

•  eXtensible Markup Language
•  Store and transport data
•  Human- and machine-readable

Page 35: XML & JSON: interchangeability and case studies

XML vs JSON

XML vs HTML

•  XML was designed to carry data
•  HTML was designed to display data
•  XML tags are not predefined
•  HTML tags are predefined

Page 36: XML & JSON: interchangeability and case studies

XML vs JSON

XML

Page 37: XML & JSON: interchangeability and case studies

XML vs JSON

XML syntax rules

•  Documents must have a root element
•  The prolog is optional
•  All elements must have a closing tag
•  Elements must be properly nested
•  Attribute values must always be quoted
•  A document that follows these rules is well formed
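The syntax rules above are what a parser checks when it decides whether a document is well formed. A minimal sketch with Python's standard `xml.etree.ElementTree` module follows; the function name `is_well_formed` and the sample documents are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Illustrative documents, one per rule.
examples = {
    "single root, closed and nested": "<note><to>Ann</to></note>",
    "missing closing tag": "<note><to>Ann</to>",
    "improper nesting": "<note><to>Ann</note></to>",
    "two root elements": "<note/><note/>",
    "unquoted attribute value": "<note lang=en/>",
}

for label, document in examples.items():
    print(f"{label}: {is_well_formed(document)}")
```

Only the first document is accepted. Well-formedness says nothing about whether the document is valid against a schema, which is a separate check.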

Page 38: XML & JSON: interchangeability and case studies

XML vs JSON

XML elements and attributes

•  An element can contain:
    •  text
    •  attributes
    •  other elements
    •  or a mix of the above
•  Attribute values must be quoted
•  Avoid attributes (if unnecessary):
    •  attributes cannot contain multiple values (elements can)
    •  attributes cannot contain tree structures (elements can)
    •  attributes are not easily expandable (for future changes)
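The practical difference between attributes and child elements shows up as soon as a value has to repeat. In the Python sketch below, the `book` document is a made-up example: the identifier is a single value and fits an attribute, while the authors repeat and therefore need elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one attribute, one repeated child element.
DOCUMENT = """
<book id="b1">
  <title>Example title</title>
  <author>First Author</author>
  <author>Second Author</author>
</book>
"""

book = ET.fromstring(DOCUMENT)

# An attribute holds exactly one string value.
book_id = book.get("id")

# Elements can repeat, so they can carry several values.
authors = [author.text for author in book.findall("author")]

print(book_id)   # b1
print(authors)   # ['First Author', 'Second Author']
```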

Page 39: XML & JSON: interchangeability and case studies

XML vs JSON

XML elements and attributes

Page 40: XML & JSON: interchangeability and case studies

XML vs JSON

XML and XSLT

•  XSLT is a style sheet language for XML
•  XSLT is far more sophisticated than CSS

Page 41: XML & JSON: interchangeability and case studies

XML vs JSON

XML and XSLT

Page 42: XML & JSON: interchangeability and case studies

XML vs JSON

XML schema

•  Describes the structure of an XML document
•  “Well Formed”
•  “Valid”

Page 43: XML & JSON: interchangeability and case studies

XML vs JSON

XML example: TEI

Page 44: XML & JSON: interchangeability and case studies

XML vs JSON

JSON

•  JSON: JavaScript Object Notation
•  JSON is a syntax for storing and exchanging data
•  JSON is text, written with JavaScript object notation
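Because JSON is plain text, any language with a JSON library can read and write it, not only JavaScript. The sketch below uses Python's standard `json` module; the record itself is invented for illustration.

```python
import json

# Hypothetical JSON text with an object, an array, a number, a boolean and null.
TEXT = """
{
  "title": "Example title",
  "authors": ["First Author", "Second Author"],
  "year": 2021,
  "published": true,
  "doi": null
}
"""

# Parsing turns the text into native data structures.
record = json.loads(TEXT)
print(type(record["authors"]))  # <class 'list'>
print(record["year"] + 1)       # 2022
print(record["doi"])            # None

# Serialising turns the data back into JSON text.
round_trip = json.loads(json.dumps(record))
print(round_trip == record)     # True
```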

Page 45: XML & JSON: interchangeability and case studies

XML vs JSON

JSON syntax

Page 46: XML & JSON: interchangeability and case studies

XML vs JSON

JSON schema

Page 47: XML & JSON: interchangeability and case studies

XML vs JSON

JSON example

Page 48: XML & JSON: interchangeability and case studies

XML vs JSON

JSON vs XML

•  JSON is like XML because:
    •  Both JSON and XML are "self describing" (human readable)
    •  Both JSON and XML are hierarchical (values within values)
    •  Both JSON and XML can be parsed and used by lots of programming languages
•  JSON is unlike XML because:
    •  JSON doesn't use end tags
    •  JSON is shorter
    •  JSON is quicker to read and write
    •  JSON can use arrays
•  XML is much more difficult to parse than JSON
•  JSON is parsed into a ready-to-use JavaScript object
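The comparison above can be made concrete by moving one small record between the two formats. This Python sketch is only one possible mapping, chosen for illustration: the element names, the dictionary keys and the helper functions `book_to_dict` and `dict_to_book` are all assumptions, and a general XML-to-JSON conversion would need rules for attributes and mixed content that are not covered here.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; repeated <author> elements stand in for an array.
XML_TEXT = (
    "<book><title>Example title</title>"
    "<author>First Author</author><author>Second Author</author></book>"
)


def book_to_dict(xml_text: str) -> dict:
    """Map the <book> element to a dictionary with an explicit list."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "authors": [author.text for author in root.findall("author")],
    }


def dict_to_book(record: dict) -> str:
    """Map the dictionary back to a <book> element serialised as text."""
    root = ET.Element("book")
    ET.SubElement(root, "title").text = record["title"]
    for name in record["authors"]:
        ET.SubElement(root, "author").text = name
    return ET.tostring(root, encoding="unicode")


record = book_to_dict(XML_TEXT)
json_text = json.dumps(record)
print(json_text)
print(len(XML_TEXT), len(json_text))  # compare the sizes of the two encodings
print(dict_to_book(json.loads(json_text)) == XML_TEXT)
```

The JSON side gets the list for free, while the XML side has to express it by repeating an element, which is the "arrays" point in the list above.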

Page 49: XML & JSON: interchangeability and case studies

XML vs JSON

JSON vs XML

Page 50: XML & JSON: interchangeability and case studies

XML vs JSON

JSON vs XML

•  XML keeps its schema outside the document
•  XML has a more powerful schema language

Page 51: XML & JSON: interchangeability and case studies

XML vs JSON

JSON and XML

Page 52: XML & JSON: interchangeability and case studies

XML vs JSON

Thank you!