SNU OOPSLA Lab. DOM/SAX Applications: The Ubiquitous XML (9). © Copyright 2001 SNU OOPSLA Lab.
Post on 13-Dec-2015
3SNUOOPSLA Lab.The ubiquitous XML
Contents of DOM
• What is the DOM?
• Java implementation
• Nodes
• Elements
• Attributes
• Node lists
DOM
What is DOM?
• DOM (Document Object Model)
  - Was developed by the W3C
  - Specifies how future Web browsers and embedded scripts should access HTML and XML documents
Java implementation
• Sun provides a class for parsing XML, called XmlDocument.
• An XmlDocument method parses an XML file and builds the document tree.
• To use the Sun parser:
    import org.w3c.dom.*;
    import com.sun.xml.tree.*;
    import org.xml.sax.*;
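The imports above target Sun's early parser. As a minimal sketch of the same step (parse some XML, get the document tree back), the example below assumes the standard JAXP `DocumentBuilderFactory` API that ships with current JDKs instead of `com.sun.xml.tree.XmlDocument`. The class name `DomParseSketch` and the sample XML string are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; it swaps in JAXP for the Sun XmlDocument class.
class DomParseSketch {
    // Parses XML text and returns the Document node at the top of the tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wraps configuration, I/O and well-formedness errors alike.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<sonnet type=\"Shakespearean\"><title>Sonnet 130</title></sonnet>");
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
    }
}
```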
Nodes (1/4)
• Nodes describe elements, text, comments, processing instructions, CDATA sections, entity references, ...
• The Node interface itself defines a number of methods, which reflect three aspects of a node:
  1. Each node has characteristics (type, name, value).
  2. Each node has a contextual location in the document tree.
  3. Each node is capable of modifying its contents.
Nodes (2/4) Node characteristics
• getNodeType => returns the type of the node
• getNodeName => returns the name of the node
• setNodeValue => replaces the value of the node
• hasChildNodes => reports whether the node has children
• getAttributes => gives access to the node's attributes
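A short sketch of these methods, assuming the standard `org.w3c.dom` interfaces and a JAXP parser. `NodeInfoSketch`, `describe`, and the sample XML are illustrative names, not part of the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeInfoSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Summarizes the characteristics every node exposes.
    static String describe(Node n) {
        return "type=" + n.getNodeType()
                + " name=" + n.getNodeName()
                + " value=" + n.getNodeValue()
                + " hasChildren=" + n.hasChildNodes();
    }

    public static void main(String[] args) {
        Node title = parse("<title>Sonnet 130</title>").getDocumentElement();
        System.out.println(describe(title));   // an element: value is null
        Node text = title.getFirstChild();
        text.setNodeValue("Sonnet 18");        // replaces the text node's value
        System.out.println(describe(text));
    }
}
```

Element nodes report a null value; their text lives in child Text nodes, which is why `setNodeValue` is applied to the child here.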
Nodes (3/4) Node navigation
• When processing a document via the DOM interface, nodes are used as stepping-stones.
• Each node has methods that return references to the surrounding nodes:
  - getParentNode()
  - getPreviousSibling()
  - getFirstChild()
  - getChildNodes()
  - getLastChild()
  - getNextSibling()
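The stepping-stone idea can be sketched as a loop that starts at the first child and follows sibling links. This assumes the standard `org.w3c.dom.Node` methods; `NodeWalkSketch` and `childNames` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeWalkSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Visits the children in document order by hopping from sibling to sibling.
    static List<String> childNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            names.add(c.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Node author = parse("<author><last-name>Shakespeare</last-name>"
                + "<first-name>William</first-name></author>").getDocumentElement();
        System.out.println(childNames(author));
        System.out.println(author.getLastChild().getPreviousSibling().getNodeName());
    }
}
```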
Nodes (4/4) Node manipulation
• removeChild method
• appendChild method
• insertBefore method
• replaceChild method
  Ex) [Figure: replaceChild puts "New Child" in the place of "Old Child"]
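The four manipulation methods can be tried on a tiny tree built in memory. This is a sketch assuming the standard `org.w3c.dom` API. The `lines`/`line` element names are borrowed from the sonnet example, and `NodeEditSketch` and its helpers are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeEditSketch {
    // Creates <line>text</line> owned by the given document.
    static Element line(Document doc, String text) {
        Element e = doc.createElement("line");
        e.appendChild(doc.createTextNode(text));
        return e;
    }

    // Joins the text of each child with commas.
    static String texts(Element parent) {
        StringBuilder sb = new StringBuilder();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(c.getTextContent());
        }
        return sb.toString();
    }

    static String demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element lines = doc.createElement("lines");
        doc.appendChild(lines);
        lines.appendChild(line(doc, "old"));                            // old
        lines.replaceChild(line(doc, "new"), lines.getFirstChild());    // new
        lines.appendChild(line(doc, "last"));                           // new,last
        lines.insertBefore(line(doc, "first"), lines.getFirstChild());  // first,new,last
        lines.removeChild(lines.getLastChild());                        // first,new
        return texts(lines);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note the argument order: `replaceChild(newChild, oldChild)` and `insertBefore(newChild, refChild)` both take the new node first.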
Documents
• An entire XML document is represented by a special type of node.
  - getDoctype
  - getImplementation
  - getDocumentElement
  - getElementsByTagName
Elements
• Element interface
  - Extends the Node interface
  - Adds element-specific functionality
• General element processing
  - getTagName method
  - getElementsByTagName method
  - normalize method
    Ex) The adjacent text nodes "Here is some " and "text" are merged into the single text node "Here is some text".
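A minimal sketch of the normalize example, assuming the standard `org.w3c.dom` API. The element name `para` and the class `NormalizeSketch` are made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {
    static String demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element para = doc.createElement("para");   // hypothetical element name
        doc.appendChild(para);
        para.appendChild(doc.createTextNode("Here is some "));
        para.appendChild(doc.createTextNode("text"));

        int before = para.getChildNodes().getLength();  // two adjacent Text nodes
        para.normalize();                               // merges adjacent Text nodes
        int after = para.getChildNodes().getLength();
        return before + "->" + after + ": " + para.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```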
Attributes
• Attribute characteristics
  - getName method
  - getValue
  - setValue
  - getSpecified
• Creating an attribute
  - createAttribute
Node lists
• The NodeList interface contains two methods:
  - Node item(int index);
  - int getLength();
  [Figure: a list holding node 0, node 1 and node 2; getLength() returns 3 and item(1) selects node 1]
Named node maps
• The NamedNodeMap interface is designed to contain nodes, in no particular order, that can be accessed by name.
  [Figure: a map holding the nodes Lang, ID, Security and Added; getLength() returns 4. Calls shown: item(1), getNamedItem("Security"), setNamedItem(O), removeNamedItem("Added")]
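The calls in the figure can be sketched against an element's attributes, which the DOM exposes as a `NamedNodeMap`. The attribute names come from the figure. The attribute values, the `doc` element, the added `Owner` attribute, and the class `AttrMapSketch` are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

class AttrMapSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static String demo() {
        Document doc = parse(
                "<doc Lang=\"en\" ID=\"x1\" Security=\"low\" Added=\"2001\"/>");
        NamedNodeMap attrs = doc.getDocumentElement().getAttributes();

        String security = attrs.getNamedItem("Security").getNodeValue(); // lookup by name
        Attr owner = doc.createAttribute("Owner");
        owner.setValue("lab");
        attrs.setNamedItem(owner);        // adds (or replaces) a node by name
        attrs.removeNamedItem("Added");   // removes a node by name

        // item(i) order is unspecified, so sort the names before printing.
        TreeSet<String> names = new TreeSet<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            names.add(attrs.item(i).getNodeName());
        }
        return security + " " + String.join(",", names);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```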
Contents of SAX
• What is SAX?
• Call-backs and interfaces
• The Parser
• Document handlers
• Attribute lists
• Error handlers
• Locators
• Handler bases
SAX
What is SAX?
• SAX (the Simple API for XML)
  - Is a standard API for event-driven processing of XML data
  - Allows parsers to deliver information to applications in digestible chunks
Call-backs and interfaces
• The SAX interfaces are:
  - Parser
  - DocumentHandler
  - AttributeList
  - ErrorHandler
  - EntityResolver
  - Locator
  - DTDHandler
The Parser
• The work of the parser
  - The parser developer creates a class that actually parses the XML document or data stream.
  - The parser reads the XML source data.
  - It stops reading when it encounters a meaningful object.
  - It sends the information to the main application by calling an appropriate method.
  - It waits for this method to return before continuing.
Document handlers
• In order for the application to receive basic markup events from the parser, the application developer must create a class that implements the DocumentHandler interface.
  [Figure: the Application creates a Document Handler and gives it to the Parser. While parsing, the Parser feeds events back to the handler: startDocument(), startElement(), characters(), endElement(), endDocument().]
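The create/give/feedback cycle in the figure can be sketched as a handler that records each call-back it receives. One substitution to be aware of: `DocumentHandler` belongs to SAX 1 and was deprecated in SAX 2, so this sketch extends the SAX 2 `DefaultHandler` and uses the JAXP `SAXParserFactory` instead. `SaxEventLogSketch` and `log` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// The application's handler: the parser calls these methods as it reads.
class SaxEventLogSketch extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one run of text in several chunks.
        events.add("characters " + new String(ch, start, length));
    }

    // Creates the handler, gives it to a parser, and returns what was reported.
    static List<String> log(String xml) {
        SaxEventLogSketch handler = new SaxEventLogSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        log("<title>Sonnet 130</title>").forEach(System.out::println);
    }
}
```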
Attribute lists
• A wrapper object for all attribute details
  - int getLength(); … to find out how many attributes are present
  - String getName(int i); … to discover the name of one of the attributes
  - String getType(int i); String getType(String name); … when a DTD is in use, to get the data type assigned to each attribute
  - String getValue(int i); String getValue(String name); … to get the value of an attribute
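A sketch of reading attribute details inside a start-element call-back. It assumes the SAX 2 `Attributes` interface, the successor of `AttributeList`, where `getQName(i)` plays the role of `getName(i)`. `SaxAttributesSketch` and `attributesOf` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxAttributesSketch extends DefaultHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        for (int i = 0; i < attrs.getLength(); i++) {
            // The type is reported as CDATA when no DTD declares the attribute.
            seen.add(qName + "@" + attrs.getQName(i) + "=" + attrs.getValue(i)
                    + " (" + attrs.getType(i) + ")");
        }
    }

    static List<String> attributesOf(String xml) {
        SaxAttributesSketch handler = new SaxAttributesSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        System.out.println(attributesOf("<sonnet type=\"Shakespearean\"/>"));
    }
}
```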
Error handlers
• When the application needs to be informed of warnings and errors, it can implement the ErrorHandler interface.
Locators
• Necessity
  - An error message is not particularly helpful when no indication is given as to where the error occurred.
• The Locator interface can tell the entity, line number and character number of the warning or error.
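A sketch of how a handler might use position information, assuming SAX 2's `DefaultHandler`, whose `setDocumentLocator` and `fatalError` call-backs correspond to the Locator and ErrorHandler roles described here. `SaxLocatorSketch`, `run`, and the deliberately broken input are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SaxLocatorSketch extends DefaultHandler {
    private Locator locator;
    final List<String> notes = new ArrayList<>();

    // The parser hands over a Locator before it reports any other event.
    @Override public void setDocumentLocator(Locator locator) { this.locator = locator; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        int line = (locator != null) ? locator.getLineNumber() : -1;
        notes.add(qName + " at line " + line);
    }

    // The exception passed to an error call-back carries its own position.
    @Override
    public void fatalError(SAXParseException e) {
        notes.add("fatal error at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber());
    }

    static List<String> run(String xml) {
        SaxLocatorSketch handler = new SaxLocatorSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Parsing cannot continue after a fatal error; it is already recorded.
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return handler.notes;
    }

    public static void main(String[] args) {
        // The <b> element is never closed, so the parser reports a fatal error.
        run("<a>\n<b>\n</a>").forEach(System.out::println);
    }
}
```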
Handler bases
• HandlerBase class
  - Provides some sensible default behavior for each event, and can be subclassed to add application-specific functionality
DOM/SAX Applications
• DOM
• SAX
• How to make an XML application?
Making XML Application
Contents
• XML Application Architecture
• Parser Basics
• Kinds of Parsers
• The Document Object Model (DOM)
  - DOM Application
• The Simple API for XML (SAX)
  - SAX Application
XML Application Architecture
• An XML application is typically built around an XML parser.
• It has an interface to its users, and an interface to some sort of back-end data store.
  [Figure: an XML Application, built around an XML Parser, sits between the User Interface and the Data Store]
Parser Basics
• A parser is a piece of code that reads a document and analyzes its structure.
• How to use a parser
  - Create a parser object
  - Pass your XML document to the parser
  - Process the results
• Building an XML application is obviously more involved than this.
Kinds of Parsers
• Validating versus non-validating parsers
  - Validating parsers validate XML documents as they parse them
  - Non-validating parsers ignore any validation errors
• Parsers that support the Document Object Model (DOM)
• Parsers that support the Simple API for XML (SAX)
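The validating versus non-validating distinction can be sketched with one well-formed document that breaks its own DTD. This assumes JAXP's `DocumentBuilderFactory.setValidating`. The tiny inline DTD, the class `ValidationSketch`, and `countValidationErrors` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ValidationSketch {
    // Well-formed, but <author> is not allowed by the DTD.
    static final String XML =
            "<!DOCTYPE sonnet [<!ELEMENT sonnet (title)><!ELEMENT title (#PCDATA)>]>"
            + "<sonnet><author/></sonnet>";

    // Parses the text and returns how many validity errors were reported.
    static int countValidationErrors(String xml, boolean validating) {
        final int[] errors = {0};
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) { errors[0]++; }
                @Override public void fatalError(SAXParseException e)
                        throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("document is not well-formed", e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        System.out.println("non-validating: " + countValidationErrors(XML, false));
        System.out.println("validating:     " + countValidationErrors(XML, true));
    }
}
```

Validity problems arrive through `error`, while well-formedness problems arrive through `fatalError`, so a non-validating parse of this document reports nothing.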
DOM Parser
• Builds a tree structure that contains all of the elements of a document
• Provides a variety of functions to examine the contents and structure of the document
SAX Parser
• Generates events at various points in the document
• It’s up to you to decide what to do with each of those events
DOM vs SAX
• Why use DOM?
  - Need to know a lot about the structure of a document
  - Need to move parts of the document around
  - Need to use the information in the document more than once
• Why use SAX?
  - Only need to extract a few elements from an XML document
DOM
• DOM interfaces
  - Node: the base data type of the DOM.
  - Element: the vast majority of the objects you’ll deal with are Elements.
  - Attr: represents an attribute of an element.
  - Text: the actual content of an Element or Attr.
  - Document: represents the entire XML document.
Common DOM methods
• Document class
  - getDocumentElement(): returns the root element of the document.
• Node class
  - getFirstChild() and getLastChild(): return the first or last child of a given Node.
  - getNextSibling() and getPreviousSibling(): return the next or previous sibling of a given Node.
• Element interface
  - getAttribute(attrName): for a given Element, returns the value of the attribute with the requested name.
Our first DOM Application!
• The first application simply reads an XML document and writes the document’s contents to standard output.
• Parse sonnet.xml:

    <?xml version="1.0"?>
    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>
        <first-name>William</first-name>
        <nationality>British</nationality>
        <year-of-birth>1564</year-of-birth>
        <year-of-death>1616</year-of-death>
      </author>
      <title>Sonnet 130</title>
      <lines>
        <line> My mistress’s eyes are …

  sonnet.xml
domOne to Watch Over Me
• Create a new class called domOne. It has two methods, parseAndPrint and printDOMTree.
• In the main method: process the command line, create a domOne object, and pass the file name to the domOne object.
• The domOne object creates a parser object, parses the document, then processes the DOM tree via the printDOMTree method.

    public class domOne
    {
      public void parseAndPrint(String uri) ...
      public void printDOMTree(Node node) ...
      public static void main(String argv[]) ...

  domOne.java
Create a domOne object
• Create a separate class called domOne.
• To parse the file and print the results, create a new instance of the domOne class.
• Use a recursive function to go through the DOM tree and print out the results.

    public static void main(String argv[])
    {
      if (argv.length == 0)
      {
        System.out.println("Usage: ... ");
        ...
        System.exit(1);
      }
      domOne d1 = new domOne();
      d1.parseAndPrint(argv[0]);
    }
Create a parser object
• In the parseAndPrint method, create a new parser object using the DOMParser class.
• DOMParser: a Java class that implements the DOM interfaces
• Exceptions: an invalid URI, a DTD that can’t be found, or an XML document that isn’t valid or well-formed

    try
    {
      DOMParser parser = new DOMParser();
      parser.parse(uri);
      doc = parser.getDocument();
    }
Parse the XML document
• Parsing the document is done with a single line of code.
• Get the Document object created by the parser.
• Pass it to the printDOMTree method.

    try
    {
      DOMParser parser = new DOMParser();
      parser.parse(uri);
      doc = parser.getDocument();
    }
    ...
    if (doc != null)
      printDOMTree(doc);
Process the DOM tree
• Call printDOMTree recursively for each of the node’s children.

    public void printDOMTree(Node node)
    {
      // getNodeType is an instance method, so call it on the node itself
      int nodeType = node.getNodeType();
      switch (nodeType)
      {
        case Node.DOCUMENT_NODE:
          printDOMTree(((Document)node).getDocumentElement());
          ...
        case Node.ELEMENT_NODE:
          ...
          NodeList children = node.getChildNodes();
          if (children != null)
          {
            for (int i = 0; i < children.getLength(); i++)
              printDOMTree(children.item(i));
          }
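The slide elides parts of printDOMTree. As one possible reading rather than the original domOne source, the sketch below fills in a complete recursive walk using only the standard `org.w3c.dom` API and a JAXP parser in place of `DOMParser`. `DomPrintSketch` and `toXml` are hypothetical names, and the code writes to a `StringBuilder` so the result can be inspected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomPrintSketch {
    // Recursively writes a node and its descendants as XML-like text.
    // Special characters are not escaped; this is only a sketch.
    static void printDOMTree(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                printDOMTree(((Document) node).getDocumentElement(), out);
                break;
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node a = attrs.item(i);
                    out.append(' ').append(a.getNodeName())
                       .append("=\"").append(a.getNodeValue()).append('"');
                }
                out.append('>');
                NodeList children = node.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    printDOMTree(children.item(i), out);   // recurse into each child
                }
                out.append("</").append(node.getNodeName()).append('>');
                break;
            }
            case Node.TEXT_NODE:
                out.append(node.getNodeValue());
                break;
            default:
                break;   // other node types are skipped in this sketch
        }
    }

    static String toXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            printDOMTree(doc, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(
                "<sonnet type=\"Shakespearean\"><title>Sonnet 130</title></sonnet>"));
    }
}
```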
Nodes a-plenty
• Just run the domCounter program, which counts the number of nodes.
• In sonnet.xml, there are twenty-four tags. Why not twenty-four nodes?
• There are actually 69 nodes in sonnet.xml: one document node, 23 element nodes, and 45 text nodes.

    Document Statistics for sonnet.xml:
    ====================================
    Document Nodes: 1
    Element Nodes: 23
    Entity Reference Nodes: 0
    CDATA Sections: 0
    Text Nodes: 45
    Processing Instructions: 0
    ----------
    Total: 69 Nodes
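Why text nodes outnumber elements can be checked on a small scale. The sketch below is not the domCounter program itself. It is a hypothetical `DomCountSketch` that counts nodes of one type in a subtree, run on an indented fragment of the sonnet.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomCountSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Counts the nodes of the given type in the subtree rooted at node.
    static int count(Node node, short type) {
        int n = (node.getNodeType() == type) ? 1 : 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            n += count(c, type);
        }
        return n;
    }

    public static void main(String[] args) {
        // Line breaks and indentation between tags become Text nodes of their own.
        Document doc = parse("<sonnet>\n  <author>\n    <last-name>Shakespeare</last-name>\n"
                + "  </author>\n</sonnet>");
        System.out.println("documents: " + count(doc, Node.DOCUMENT_NODE));
        System.out.println("elements:  " + count(doc, Node.ELEMENT_NODE));
        System.out.println("text:      " + count(doc, Node.TEXT_NODE));
    }
}
```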
Sample node listing
• The nodes returned by the parser:

    <?xml version="1.0"?>
    <!DOCTYPE sonnet SYSTEM "sonnet.dtd">
    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>

  1. The Document node
  2. The Element node corresponding to the <sonnet> tag
  3. A Text node containing the carriage return at the end of the <sonnet> tag and the two spaces in front of the <author> tag
  4. The Element node corresponding to the <author> tag
  5. A Text node containing the carriage return at the end of the <author> tag and the four spaces in front of the <last-name> tag
  6. The Element node corresponding to the <last-name> tag
  7. A Text node containing the characters “Shakespeare”
• All of the blank spaces at the start of the lines in the listing are Text nodes.
Brief: DOM
• Believe it or not, that’s about all you need to know to work with DOM objects.
• Our domOne code did several things:
  - Created a Parser object
  - Gave the Parser an XML document to parse
  - Took the Document object from the Parser and examined it
A wee listing of SAX events
• startDocument: signals the start of the document.
• endDocument: signals the end of the document.
• startElement: signals the start of an element.
• endElement: signals the end of an element.
• characters: contains character data, similar to a DOM Text node.
SAX interfaces
• The SAX API actually defines four interfaces for handling events:
  - EntityResolver
  - DTDHandler
  - DocumentHandler
  - ErrorHandler
• All of these interfaces are implemented by HandlerBase.
Our first SAX Application!
• This application is similar to domOne, except it uses the SAX API instead of DOM.
• Parse sonnet.xml:

    <?xml version="1.0"?>
    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>
        <first-name>William</first-name>
        <nationality>British</nationality>
        <year-of-birth>1564</year-of-birth>
        <year-of-death>1616</year-of-death>
      </author>
      <title>Sonnet 130</title>
      <lines>
        <line> My mistress’s eyes are …

  sonnet.xml
SAX methods in saxOne.java
• SAX methods that handle SAX events

    public class saxOne extends HandlerBase
    {
      public void startDocument() ...
      public void startElement(String name, AttributeList attrs) ...
      public void characters(char ch[], int start, int length) ...
      public void ignorableWhitespace(char ch[], int start, int length) ...
      public void endElement(String name) ...
      public void endDocument() ...
      public void warning(SAXParseException ex) ...
      public void error(SAXParseException ex) ...
      public void fatalError(SAXParseException ex) throws SAXException

  saxOne.java
Create a saxOne object
• Create a separate class called saxOne.
• The main procedure creates an instance of this class and uses it to parse the XML document.
• Because saxOne extends the HandlerBase class, we can use saxOne as an event handler for a SAX parser.

    public static void main(String argv[])
    {
      if (argv.length == 0)
      {
        System.out.println("Usage: ... ");
        ...
        System.exit(1);
      }
      saxOne s1 = new saxOne();
      s1.parseURI(argv[0]);
    }
Create a Parser object
• It first creates a new Parser object. In this sample, we use the SAXParser class instead of DOMParser.
• setDocumentHandler and setErrorHandler tell our newly created SAXParser to use saxOne to handle events.

    SAXParser parser = new SAXParser();
    parser.setDocumentHandler(this);
    parser.setErrorHandler(this);
    try
    {
      parser.parse(uri);
    }
Parse the XML document
• Once our SAXParser object is set up, it takes a single line of code to process our document.

    SAXParser parser = new SAXParser();
    parser.setDocumentHandler(this);
    parser.setErrorHandler(this);
    try
    {
      parser.parse(uri);
    }
Process SAX events
• As the SAXParser object parses our document, it calls our implementations of the SAX event handlers as the various SAX events occur.
• Each event handler writes the appropriate information to System.out.
  Ex) For startElement events, we write the XML syntax of the original tag out to the screen.

    public void startDocument() ...
    public void startElement(String name, AttributeList attrs) ...
    public void characters(char ch[], int start, int length) ...
    public void ignorableWhitespace(char ch[], int start, int length) ...
A cavalcade of ignorable events
• The SAX interface returns more events than you might think.
• One advantage of the SAX interface is that the twenty-five ignorableWhitespace events are simply ignored. We don’t have to write code to handle those events.

    Document Statistics for sonnet.xml:
    ====================================
    DocumentHandler Events:
      startDocument 1
      endDocument 1
      startElement 23
      endElement 23
      processingInstruction 0
      character 20
      ignorableWhitespace 25
    ErrorHandler Events:
      warning 0
      error 0
      fatalError 0
    ----------
    Total: 93 events
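A counter of this kind can be sketched as a handler that tallies each call-back by name. This is not the program that produced the statistics above. It assumes SAX 2's `DefaultHandler` rather than `HandlerBase`, and `SaxCounterSketch` is a hypothetical name. Whether whitespace arrives as `characters` or `ignorableWhitespace` depends on the parser and on whether a DTD tells it which elements hold only child elements, so the sketch tallies both without assuming a split.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCounterSketch extends DefaultHandler {
    final Map<String, Integer> counts = new TreeMap<>();

    private void tally(String event) { counts.merge(event, 1, Integer::sum); }

    @Override public void startDocument() { tally("startDocument"); }
    @Override public void endDocument() { tally("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        tally("startElement");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        tally("endElement");
    }

    @Override
    public void characters(char[] ch, int start, int length) { tally("characters"); }

    // Events not overridden at all are silently dropped by DefaultHandler.
    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        tally("ignorableWhitespace");
    }

    static Map<String, Integer> count(String xml) {
        SaxCounterSketch handler = new SaxCounterSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler.counts;
    }

    public static void main(String[] args) {
        count("<sonnet>\n  <title>Sonnet 130</title>\n</sonnet>")
                .forEach((event, n) -> System.out.println(event + " " + n));
    }
}
```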
Sample event listing
• The events returned by the parser:

    <?xml version="1.0"?>
    <!DOCTYPE sonnet SYSTEM "sonnet.dtd">
    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>

  1. A startDocument event
  2. A startElement event for the <sonnet> element
  3. An ignorableWhitespace event for the line break and the two blank spaces in front of the <author> tag
  4. A startElement event for the <author> element
  5. An ignorableWhitespace event for the line break and the four blank spaces in front of the <last-name> tag
  6. A startElement event for the <last-name> tag
  7. A character event for the characters “Shakespeare”
  8. An endElement event for the <last-name> tag
SAX vs DOM – part one

    <book id="1">
      <verse> Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. Many a brave soul did it send hurrying down to Hades, and many a hero did it yield a prey to dogs and vultures, for so were the counsels of Jove fulfilled from the day on which the son of Atreus, king of men, and great Achilles, first fell out with one another.</verse>
      <verse> And which of the gods was it that set them on to quarrel? It was the son of Jove and Leto; for he was angry with the king and sent a pestilence upon ...

• The SAX API would be much more efficient.
• Doing this with the DOM would take a lot of memory.
SAX vs DOM – part two
• What if we were parsing an XML document containing 10,000 addresses and wanted to sort them by last name?
• DOM would automatically store all of the data.
• We could use DOM functions to move the nodes in the DOM tree.

    ...
    <address>
      <name>
        <title>Mrs.</title>
        <first-name>Mary</first-name>
        <last-name>McGoon</last-name>
      </name>
      <street>1401 Main Street</street>
      <city>Anytown</city>
      <state>NC</state>
      <zip>34829</zip>
    </address>
    <address>
      <name>
    ...
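Sorting by moving nodes could look roughly like this. The sketch assumes the `<address>` elements are direct children of a root element, here called `addresses` (the slide does not show the root), and relies on the DOM rule that `appendChild` moves a node that is already in the tree. `DomSortSketch` and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomSortSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static String lastName(Element address) {
        return address.getElementsByTagName("last-name").item(0).getTextContent();
    }

    // Copies the <address> elements into a list, because a NodeList is live.
    static List<Element> addresses(Document doc) {
        NodeList found = doc.getElementsByTagName("address");
        List<Element> copy = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            copy.add((Element) found.item(i));
        }
        return copy;
    }

    static void sortByLastName(Document doc) {
        List<Element> sorted = addresses(doc);
        sorted.sort(Comparator.comparing(DomSortSketch::lastName));
        Element root = doc.getDocumentElement();
        // Re-appending in sorted order moves each element to the end in turn.
        for (Element address : sorted) {
            root.appendChild(address);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<addresses>"
                + "<address><name><last-name>McGoon</last-name></name></address>"
                + "<address><name><last-name>Adams</last-name></name></address>"
                + "</addresses>");
        sortByLastName(doc);
        for (Element address : addresses(doc)) {
            System.out.println(lastName(address));
        }
    }
}
```

The whole document has to be in memory for this to work, which is the trade-off against SAX that the slide points at.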