Using XML In ViPER and More

Upload: kerry-snow

Post on 17-Dec-2015

XML

• Direct access to information, without worrying about parsing.

• XML Information Set
  – XML provides a way to access information independent of access method, format, etc.
  – XML is just a serialization of a set of information arranged in a tree.
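
A minimal sketch of the "serialization of a tree" point, assuming Python's standard xml.etree.ElementTree module. The two strings and the helper as_tree are made up for illustration: they differ in quoting, attribute order, and spacing, yet reduce to the same tree.

```python
import xml.etree.ElementTree as ET

# Two serializations that differ in quoting, attribute order, and whitespace
# inside the tags, but carry the same information (illustrative strings).
a = '<sourcefile filename="file.mpg"><file id="0" name="Information"/></sourcefile>'
b = "<sourcefile  filename='file.mpg' ><file name='Information'   id='0' ></file></sourcefile>"


def as_tree(elem):
    """Reduce an element to (tag, attributes, children), ignoring how it was written."""
    return (elem.tag, dict(elem.attrib), [as_tree(child) for child in elem])


# Compare the parsed trees rather than the text.
same = as_tree(ET.fromstring(a)) == as_tree(ET.fromstring(b))
print(same)
```

The comparison works on the parsed tree, which is why the textual differences do not matter.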

ViPER Tree

• viper
  – config
    • descriptor
    • descriptor
  – data
    • sourcefile filename="file.mpg"
      – file
      – object

ViPER File Format

<?xml version="1.0" encoding="UTF-8"?>
<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Information" type="FILE">
      <attribute name="SOURCEDIR" dynamic="false" type="svalue"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="comm-001_00001.jpg">
      <file name="Information" id="0" framespan="0:0">
        <attribute name="SOURCEDIR">
          <data:svalue value="/fs/lampa/FaceTextDB/JPEG/advertisements"/>
        </attribute>
      </file>
    </sourcefile>
  </data>
</viper>
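
As a rough sketch of reading this format, assuming Python's standard xml.etree.ElementTree: the snippet below walks the sample document and collects each file-level attribute value. The prefixes "v" and "d" in NS and the dictionary layout of values are choices made for this example only; what matters to the parser is the namespace URIs.

```python
import xml.etree.ElementTree as ET

# Prefix names are arbitrary here; only the namespace URIs must match the file.
NS = {
    "v": "http://lamp.cfar.umd.edu/viper",
    "d": "http://lamp.cfar.umd.edu/viperdata",
}

SAMPLE = """<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config>
    <descriptor name="Information" type="FILE">
      <attribute name="SOURCEDIR" dynamic="false" type="svalue"/>
    </descriptor>
  </config>
  <data>
    <sourcefile filename="comm-001_00001.jpg">
      <file name="Information" id="0" framespan="0:0">
        <attribute name="SOURCEDIR">
          <data:svalue value="/fs/lampa/FaceTextDB/JPEG/advertisements"/>
        </attribute>
      </file>
    </sourcefile>
  </data>
</viper>"""

root = ET.fromstring(SAMPLE)

# Map (source file name, attribute name) -> svalue string.
values = {}
for sourcefile in root.findall("v:data/v:sourcefile", NS):
    for file_elem in sourcefile.findall("v:file", NS):
        for attr in file_elem.findall("v:attribute", NS):
            svalue = attr.find("d:svalue", NS)
            if svalue is not None:
                key = (sourcefile.get("filename"), attr.get("name"))
                values[key] = svalue.get("value")

print(values)
```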

Accessing Via XPath

• Get data from a specific file
  – /viper/data/sourcefile[@filename="f.mpg"]
    • Gets the sourcefile node
  – //sourcefile[@filename="f.mpg"]//bbox
    • Gets all bbox nodes
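
The two queries above can be tried out with the XPath subset in Python's standard xml.etree.ElementTree. This is a sketch under assumptions: the document DOC, including its object descriptor "Face" and the bbox attribute layout, is invented for illustration. Because ViPER elements live in namespaces, the paths need prefixes ("v" and "d" are arbitrary names bound in NS), and ElementTree paths are written relative to the root element.

```python
import xml.etree.ElementTree as ET

NS = {
    "v": "http://lamp.cfar.umd.edu/viper",
    "d": "http://lamp.cfar.umd.edu/viperdata",
}

# Hypothetical document, made up for this example.
DOC = """<viper xmlns="http://lamp.cfar.umd.edu/viper"
       xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <config/>
  <data>
    <sourcefile filename="f.mpg">
      <object name="Face" id="0" framespan="1:2">
        <attribute name="LOCATION">
          <data:bbox framespan="1:1" x="10" y="20" width="30" height="40"/>
          <data:bbox framespan="2:2" x="12" y="21" width="30" height="40"/>
        </attribute>
      </object>
    </sourcefile>
    <sourcefile filename="g.mpg"/>
  </data>
</viper>"""

root = ET.fromstring(DOC)

# Counterpart of /viper/data/sourcefile[@filename="f.mpg"],
# written relative to the <viper> root element.
sourcefiles = root.findall("v:data/v:sourcefile[@filename='f.mpg']", NS)

# Counterpart of //sourcefile[@filename="f.mpg"]//bbox.
boxes = root.findall(".//v:sourcefile[@filename='f.mpg']//d:bbox", NS)
xs = [int(box.get("x")) for box in boxes]

print(len(sourcefiles), xs)
```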

Matlab with Java

% Add xerces.jar to classpath.txt (find it using 'which classpath.txt').
% MATLAB must be restarted after changing classpath.txt.
import org.apache.xerces.parsers.* org.w3c.dom.*;
import java.lang.String org.xml.sax.*;

input = InputSource('C:\MATLAB6p1\work\advertisements.xml');
parser = DOMParser;
% Turn off schema validation.
parser.setFeature('http://apache.org/xml/features/validation/schema', 0)
parser.parse(input);
doc = parser.getDocument;
sfs = doc.getElementsByTagName('sourcefile');

% Collect the filename attribute of every sourcefile element.
files = cell(sfs.getLength, 1);
i = 0;
while i < sfs.getLength
    fileattr = sfs.item(i).getAttributes.getNamedItem('filename');
    i = i + 1;
    % Convert the Java string to a MATLAB char array before storing it in the cell.
    files{i} = char(fileattr.getValue);
end

Perl

use XML::LibXML;

# @datafiles and $attribType are defined elsewhere in the script.
my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($datafiles[0]);
my $root = $tree->getDocumentElement;
foreach my $source ($root->findnodes('sourcefile')) {
    my $image = $source->findvalue('@filename');
    foreach my $d ($source->findnodes('content|object')) {
        # framespan has the form "start:end".
        ($startFrame, $endFrame) = split(/:/, $d->findvalue('@framespan'));
        foreach my $shape ($d->findnodes(lc($attribType))) {
            $orig_x = $shape->findvalue('@x');
            $orig_y = $shape->findvalue('@y');
            # ... (the slide ends here)
        }
    }
}

C with libxml2

#include <libxml/xmlmemory.h>
#include <libxml/parser.h>

/* ... */

xmlDocPtr doc = xmlParseFile("truth.xml");
if (doc == NULL)
    return(NULL);
xmlNodePtr cur = xmlDocGetRootElement(doc);
xmlNsPtr viperns = xmlSearchNsByHref(doc, cur,
    (const xmlChar *) "http://lamp.cfar.umd.edu/viper");
cur = cur->xmlChildrenNode;
while (cur != NULL) {
    /* parseConfig and parseData are application functions defined elsewhere. */
    if ((!xmlStrcmp(cur->name, (const xmlChar *) "config")) && (cur->ns == viperns))
        parseConfig(doc, viperns, cur);
    else if ((!xmlStrcmp(cur->name, (const xmlChar *) "data")) && (cur->ns == viperns))
        parseData(doc, viperns, cur);
    cur = cur->next;
}
xmlFreeDoc(doc);
xmlCleanupParser();

XML Databases

• Uses existing tools to access persistent data
  – DOM and XPath
  – XQuery and XUpdate

• Many different implementations
  – Open Source: Apache Xindice, eXist
  – Proprietary: TextML, X-Hive
  – Relational: MS SQL, Oracle

XSL Transformations

• The idea is to look at the incoming data as a tree, using XPath, and select various nodes to copy to the output.

• While the output does not have to be XML, the input and the stylesheet itself must be well-formed XML.

• On system 7, 'testXSLT' runs stylesheets.

XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gtf="http://lamp.cfar.umd.edu/viper"
    xmlns:data="http://lamp.cfar.umd.edu/viperdata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <xsl:strip-space elements="gtf:viper"/>
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <!-- continued -->

XSLT

  <xsl:template match="/gtf:viper">
    <xsl:text>#VIPER_VERSION_3.01
</xsl:text>
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <xsl:template match="//gtf:sourcefile[starts-with(@filename, 'comm-001')]">
    <xsl:value-of select="@filename"/>
    <xsl:text>
</xsl:text>
  </xsl:template>
</xsl:stylesheet>
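
Python's standard library has no XSLT processor, so the following is only a hand-written approximation of what the stylesheet selects, not a way to run it. It assumes xml.etree.ElementTree; the function list_sourcefiles and the document INPUT are hypothetical, and whitespace handling is simplified compared with a real XSLT engine.

```python
import xml.etree.ElementTree as ET

VIPER_NS = "http://lamp.cfar.umd.edu/viper"


def list_sourcefiles(xml_text, prefix="comm-001"):
    """Emit the version header, then one line per matching sourcefile name."""
    root = ET.fromstring(xml_text)
    lines = ["#VIPER_VERSION_3.01"]
    # Visit every sourcefile element in the viper namespace, in document order.
    for sourcefile in root.iter("{%s}sourcefile" % VIPER_NS):
        name = sourcefile.get("filename", "")
        # Same test as starts-with(@filename, 'comm-001') in the template match.
        if name.startswith(prefix):
            lines.append(name)
    return "\n".join(lines) + "\n"


# Hypothetical input with one matching and one non-matching source file.
INPUT = """<viper xmlns="http://lamp.cfar.umd.edu/viper">
  <config/>
  <data>
    <sourcefile filename="comm-001_00001.jpg"/>
    <sourcefile filename="news-002_00001.jpg"/>
  </data>
</viper>"""

output = list_sourcefiles(INPUT)
print(output, end="")
```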

CSS-1

• Supported in the majority of browsers in use today.

• Basic styling. Hopefully will reduce reliance on HTML tables as a way to lay out web pages.

<style type="text/css">
p {
  font-size: 12pt;
  line-height: 18pt;
}

p:first-letter {
  font-size: 200%;
  float: left;
}
</style>

CSS-2

• Added support for pagination, including widow and orphan control, page breaks, and margins.

• Aural style sheets for voice browsing.

• Can be applied directly to XML.

• Possible to do some multi-column layout.

CSS-3

• Modularized

• Through Ruby, support for Japanese, Arabic, etc.

• Multi-column layout

• Support for other W3C specs, like
  – SVG
  – MathML
  – SMIL

XSL:FO

• Basically, the idea is to put CSS-2 in an XML dialect, and use XPath and other XML technologies to make printed media look nice.

• Extremely verbose; designed to be generated from semantic markup.
  – However, its lack of semantics leads Opera CTO Lie to call formatting objects "Harmful."

• Additions include footnotes, hyphenation, odd/even pages, citations for indices and tables of contents.

• Implementations: RenderX, Apache FOP

Defining an XML Dialect

• Document Type Definitions
  – Simple, BNF-type definition of tags, attributes, and how they may be arranged.

• Schema
  – XML-based replacement for non-XML DTDs.
  – Complex.
  – Define data types, and associate them with tag names.

• Rule-based constraints
  – Schematron

ViPER Schema

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://lamp.cfar.umd.edu/viper"
            xmlns:viper="http://lamp.cfar.umd.edu/viper"
            elementFormDefault="qualified">
  <xsd:element name="viper" type="viper:viperType"/>
  <xsd:complexType name="viperType">
    <xsd:sequence>
      <xsd:element name="config" type="viper:configType"/>
      <xsd:element name="data" type="viper:dataType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- remaining type definitions are not shown on the slide -->
</xsd:schema>

ViPER Data Schema

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://lamp.cfar.umd.edu/viperdata"
            xmlns:viperdata="http://lamp.cfar.umd.edu/viperdata"
            xmlns:viper="http://lamp.cfar.umd.edu/viper"
            elementFormDefault="qualified">
  <xsd:import namespace="http://lamp.cfar.umd.edu/viper" schemaLocation="file:viper.xsd"/>
  <xsd:element name="point" substitutionGroup="viper:null">
    <xsd:complexType>
      <xsd:complexContent>
        <xsd:extension base="viper:descriptorAttributeData">
          <xsd:attribute name="x" type="xsd:integer"/>
          <xsd:attribute name="y" type="xsd:integer"/>
        </xsd:extension>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:element>
  <!-- remaining element definitions are not shown on the slide -->
</xsd:schema>
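
To make the effect of the two xsd:integer declarations concrete, here is a hand-rolled check written with Python's standard library. It is not schema validation (the standard library has no XSD validator) and covers only the x and y attributes of point. The function bad_points, the pattern INTEGER, and the fragment FRAGMENT are assumptions made for this sketch.

```python
import re
import xml.etree.ElementTree as ET

DATA_NS = "http://lamp.cfar.umd.edu/viperdata"

# Lexical form of an integer: optional sign followed by decimal digits.
INTEGER = re.compile(r"[+-]?[0-9]+")


def bad_points(xml_text):
    """Return (attribute name, value) pairs on data:point that are not integers."""
    problems = []
    root = ET.fromstring(xml_text)
    for point in root.iter("{%s}point" % DATA_NS):
        for name in ("x", "y"):
            value = point.get(name)
            # The schema does not mark x or y as required, so a missing one is accepted.
            if value is not None and not INTEGER.fullmatch(value.strip()):
                problems.append((name, value))
    return problems


# Hypothetical fragment: the second point has a non-integer x.
FRAGMENT = """<attribute xmlns:data="http://lamp.cfar.umd.edu/viperdata">
  <data:point x="3" y="4"/>
  <data:point x="3.5" y="-2"/>
</attribute>"""

problems = bad_points(FRAGMENT)
print(problems)
```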

MPEG-7

• Based on XML-Schema.

• Extensions to deal better with video type data, including matrix data types, etc.

• Designed to work with any level of description, from low level to high.

• W3C has only a working draft for DOM access to schemas, so using generic MPEG-7 documents is currently difficult.

Resources

• www.xml.com
  – O'Reilly's XML resource

• www.w3.org
  – The standards themselves, and lots of good links to implementations.

• xml.apache.org
  – DOM, SAX, and XSLT for C and Java

• xmlsoft.org
  – libxml creators

• msdn.microsoft.com/xml
  – MS-XML parser is the one to use on Windows.

• mpeg.telecomitalialab.com
  – MPEG-7 Working Group

• pyxml.sourceforge.net
  – Using XML with Python.

• okmij.org/ftp/Scheme/xml.html
  – Using XML with Scheme.