Java + XML = JDOM
by Jason Hunter and Brett McLaughlin
co-creators of JDOM
Mountain View Java User's Group
April 26, 2000



Introductions

Jason Hunter
[email protected] (an @jdom.org address)
K&A Software
http://www.servlets.com
Author of "Java Servlet Programming" (O'Reilly)

Introductions

Brett McLaughlin
[email protected] (an @jdom.org address)
Metro Information Services
http://www.newInstance.com
Author of upcoming "Java and XML" (O'Reilly)

What is JDOM?

• JDOM is the Java Document Object Model
• A way to represent an XML document for easy and efficient reading, manipulation, and writing
  – Straightforward API
  – Lightweight and fast
  – Java-optimized
• Despite the name similarity, it's not built on DOM or modeled after DOM
  – Although it integrates well with DOM and SAX
  – Name chosen for accuracy, not similarity to DOM
• An open source project with an Apache-style license

The JDOM Philosophy

• JDOM should be straightforward for Java programmers
  – Use the power of the language (Java 2)
  – Take advantage of method overloading, the Collections APIs, reflection, weak references
  – Provide conveniences like type conversions
• JDOM should hide the complexities of XML wherever possible
  – An Element has content, not a child Text node, which has content (à la DOM)
  – Exceptions should contain useful error messages
  – Give line numbers and specifics, use no SAX or DOM classes or constructs

More JDOM Philosophy

• JDOM should integrate with DOM and SAX
  – Support reading and writing DOM documents and SAX events
  – Support runtime plug-in of any DOM or SAX parser
  – Easy conversion from DOM/SAX to JDOM
  – Easy conversion from JDOM to DOM/SAX
• JDOM should stay current with the latest XML standards
  – DOM Level 2, SAX 2.0, XML Schema
• JDOM does not need to solve every problem
  – It should solve 80% of the problems with 20% of the effort
  – We think we got the ratios to 90% / 10%

The Historical Alternatives: DOM

• DOM is a large API designed for complex environments
  – Represents a document tree fully held in memory
  – Has to 100% accurately represent any XML document (well, it attempts to)
  – Has to have the same API on multiple languages
  – Reading and changing the document is non-intuitive
  – Fairly heavyweight to load and store in memory
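
To make the "non-intuitive" point concrete, here is a minimal sketch that reads the text of an element with the DOM implementation bundled in the JDK (javax.xml.parsers and org.w3c.dom). It is not JDOM code; the class name DomTextDemo and the sample XML string are assumptions made for illustration. In DOM the text is not a property of the Element: it sits in child Text nodes that have to be walked and filtered by type.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class (not part of JDOM or of the original slides).
public class DomTextDemo {

    // Returns the text directly inside the root element by walking its
    // child nodes and keeping only the Text nodes.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();

            StringBuilder text = new StringBuilder();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.TEXT_NODE) {
                    text.append(n.getNodeValue());
                }
            }
            return text.toString();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers stay simple.
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample input chosen for this sketch.
        System.out.println(rootText("<description>A cool demo</description>"));
    }
}
```

Compare this loop with the single call on an Element that the JDOM philosophy slide argues for.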

The Historical Alternatives: SAX

• SAX is a lightweight API designed for fast reading
  – Callback mechanism reports when document elements are encountered
  – Lightweight since the document is never entirely in memory
  – Does not support modifying the document
  – Does not support random access to the document
  – Fairly steep learning curve to use correctly
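
The callback mechanism can be sketched with the SAX parser that ships with the JDK. This is an illustration only, not JDOM code; the class name SaxCallbackDemo and the sample XML are assumptions. The handler is told about each start tag as the parser reaches it, and anything the program wants to keep (here, the element names) must be recorded by the handler itself, because no tree is built.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class (not part of JDOM or of the original slides).
public class SaxCallbackDemo {

    // Collects element names in document order using SAX callbacks.
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Called once per start tag; the parser keeps no history.
                names.add(qName);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Sample input chosen for this sketch.
        String xml = "<gui><window-manager><name>Enlightenment</name>"
                   + "</window-manager></gui>";
        System.out.println(elementNames(xml));
    }
}
```

Going back to an earlier element, or changing one, is not possible from inside the handler, which is the "no random access, no modification" limitation listed above.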

Do you need JDOM?

• JDOM is a lightweight API
  – Benchmarks of "load and print" show performance on par with SAX
  – Manipulation and output are also lightning fast
• JDOM can represent a full document
  – Not all must be in memory at once
• JDOM supports document modification
  – And document creation from scratch, no "factory"
• JDOM is easy to learn
  – Optimized for Java programmers
  – Doesn't require in-depth XML knowledge
  – Allows easing into SAX and DOM, if needed
  – Simple support for namespaces, validation

The Document class

• Documents are represented by the org.jdom.Document class
  – A lightweight object holding a DocType, ProcessingInstructions, a root Element, and Comments
• It can be constructed from scratch:

    Document doc = new Document(new Element("rootElement"));

• Or it can be constructed from a file, stream, or URL:

    Builder builder = new SAXBuilder();
    Document doc = builder.build(url);

The Build Process

• A Document can be constructed using any build tool
  – The SAX build tool uses a SAX parser to create a JDOM document
• Current builders are SAXBuilder and DOMBuilder
  – org.jdom.input.SAXBuilder is fast and recommended
  – org.jdom.input.DOMBuilder is useful for reading an existing DOM tree
  – A builder can be written that lazily constructs the Document as needed
  – Other possible builders: LDAPBuilder, SQLBuilder

Builder Classes

• Builders have optional parameters to specify implementation classes and whether DTD-based validation should occur.

    SAXBuilder(String parserClass, boolean validate);
    DOMBuilder(String adapterClass, boolean validate);

• Not all DOM parsers have the same API
  – Xerces, XML4J, Project X, Oracle (V1 and V2)
  – The DOMBuilder adapterClass implements org.jdom.adapters.DOMAdapter
  – Implements standard methods by passing through to an underlying parser
  – Adapters for all popular parsers are provided
  – Future parsers require just a small adapter class
• Once built, documents are not tied to their build tool
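
The adapter idea (a small class per parser, chosen by class name at runtime) can be sketched as follows. This is not JDOM's DOMAdapter interface: the names AdapterDemo, ParserAdapter, and JaxpAdapter are hypothetical, and the only parser wrapped is the JAXP one bundled with the JDK. The sketch assumes each adapter has a public no-argument constructor so it can be created reflectively.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class (not part of JDOM or of the original slides).
public class AdapterDemo {

    // Hypothetical adapter contract: one uniform method, whatever the parser.
    public interface ParserAdapter {
        Document parse(InputStream in, boolean validate) throws Exception;
    }

    // Adapter that passes the call through to the JDK's built-in parser.
    public static class JaxpAdapter implements ParserAdapter {
        public JaxpAdapter() { }

        @Override
        public Document parse(InputStream in, boolean validate) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate);
            return factory.newDocumentBuilder().parse(in);
        }
    }

    // Loads the adapter by class name at runtime, parses the XML through it,
    // and returns the name of the root element.
    static String rootName(String adapterClass, String xml) {
        try {
            ParserAdapter adapter = (ParserAdapter)
                Class.forName(adapterClass).getDeclaredConstructor().newInstance();
            InputStream in =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            return adapter.parse(in, false).getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("Adapter failed: " + adapterClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            rootName(JaxpAdapter.class.getName(), "<linux-config/>"));
    }
}
```

Supporting another parser in this sketch would mean writing one more class that implements ParserAdapter and passing its name as the string argument.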

The Output Process

• A Document can be written using any output tool
  – org.jdom.output.XMLOutputter tool writes the document as XML
  – org.jdom.output.SAXOutputter tool generates SAX events
  – org.jdom.output.DOMOutputter tool creates a DOM document (coming soon)
  – Any custom output tool can be used
• To output a Document as XML:

    XMLOutputter outputter = new XMLOutputter();
    outputter.output(doc, System.out);

• For machine-consumption, pass optional parameters
  – Zero-space indent, no new lines

    outputter = new XMLOutputter("", false);
    outputter.output(doc, System.out);

Pretty Printer

    import java.io.*;
    import org.jdom.*;
    import org.jdom.input.*;
    import org.jdom.output.*;

    public class PrettyPrinter {
      public static void main(String[] args) {
        // Assume filename argument
        String filename = args[0];
        try {
          // Build w/ SAX and Xerces, no validation
          Builder b = new SAXBuilder();
          // Create the document
          Document doc = b.build(new File(filename));

          // Output as XML to screen
          XMLOutputter outputter = new XMLOutputter();
          outputter.output(doc, System.out);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    }

The DocType class

• A Document may have a DocType

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

• This specifies the DTD of the document
  – It's easy to read and write

    DocType docType = doc.getDocType();
    System.out.println("Element: " + docType.getElementName());
    System.out.println("Public ID: " + docType.getPublicID());
    System.out.println("System ID: " + docType.getSystemID());

    doc.setDocType(
      new DocType("html", "-//W3C...", "http://..."));

The Element class

• A Document has a root Element:

    <web-app id="demo">
      <description>
        Gotta fit servlets in somewhere!
      </description>
      <distributable/>
    </web-app>

• Get the root as an Element object:

    Element webapp = doc.getRootElement();

• An Element represents something like web-app
  – Has access to everything from the open <web-app> to the closing </web-app>

Playing with Children

• An element may contain child elements

    // Get a List of direct children as Elements
    List allChildren = element.getChildren();
    // The untyped List returns Object, so cast before calling getName()
    out.println("First kid: " +
      ((Element) allChildren.get(0)).getName());

    // Get all direct children with a given name
    List namedChildren = element.getChildren("name");

    // Get the first kid with a given name
    Element kid = element.getChild("name");

    // Namespaces are supported
    kid = element.getChild("nsprefix:name");
    kid = element.getChild("nsprefix", "name");

• getChild() may throw NoSuchElementException

Playing with Grandchildren

    <linux-config>
      <gui>
        <window-manager>
          <name>Enlightenment</name>
          <version>0.16.2</version>
        </window-manager>
        <!-- etc -->
      </gui>
    </linux-config>

• Grandkids can be retrieved easily:

    String manager = root.getChild("gui")
                         .getChild("window-manager")
                         .getChild("name")
                         .getContent();

• Future JDOM versions are likely to support XPath

Managing the Population

• Children can be added and removed through List manipulation or convenience methods:

    List allChildren = element.getChildren();

    // Remove the fourth child
    allChildren.remove(3);

    // Remove all children named "jack"
    allChildren.removeAll(element.getChildren("jack"));
    element.removeChildren("jack");

    // Add a new child
    allChildren.add(new Element("jane"));
    element.addChild(new Element("jane"));

    // Add a new child in the second position
    allChildren.add(1, new Element("second"));

Making Kids

• Elements are constructed directly, no factory method needed

    Element element = new Element("kid");

• Some prefer a nesting shortcut, possible since addChild() returns the Element on which the child was added:

    Document doc = new Document(
      new Element("family")
        .addChild(new Element("mom"))
        .addChild(new Element("dad")
          .addChild("kidOfDad")));

• A subclass of Element can be made, already containing child elements and content

    root.addChild(new FooterElement());

Making the linux-config Document

• This code constructs the <linux-config> seen previously:

    Document doc = new Document(
      new Element("linux-config")
        .addChild(new Element("gui")
          .addChild(new Element("window-manager")
            .addChild(new Element("name")
              .setContent("Enlightenment"))
            .addChild(new Element("version")
              .setContent("0.16.2"))
          )
        )
    );

Getting Element Attributes

• Elements often contain attributes:

    <table width="100%" border="0"> </table>

• Attributes can be retrieved several ways:

    String width = table.getAttribute("width").getValue();

    // Get "border" as an int, default of 2
    int value = table.getAttribute("border").getIntValue(2);

    // Get "border" as an int, no default
    try {
      value = table.getAttribute("border").getIntValue();
    }
    catch (DataConversionException e) { }

• getAttribute() may throw NoSuchAttributeException

Setting Element Attributes

• Element attributes can easily be added or removed

    // Add an attribute
    table.addAttribute("vspace", "0");

    // Add an attribute more formally
    table.addAttribute(
      new Attribute("prefix", "name", "value"));

    // Remove an attribute
    table.removeAttribute("border");

    // Remove all attributes
    table.getAttributes().clear();

Element Content

• Elements can contain text content:

    <description>A cool demo</description>

• The content is directly available:

    String content = element.getContent();

• And can easily be changed:

    // This blows away all current content
    element.setContent("A new description");

Mixed Content

• Sometimes an element may contain comments, text content, and children

    <table>
      <!-- Some comment -->
      Some text
      <tr>Some child</tr>
    </table>

• Text and children can be retrieved as always:

    String text = table.getContent();
    Element tr = table.getChild("tr");

• This keeps the standard uses simple

Reading Mixed Content

• To get all content within an Element, use getMixedContent()
  – Returns a List containing Comment, String, and Element objects

    List mixedContent = table.getMixedContent();
    Iterator i = mixedContent.iterator();
    while (i.hasNext()) {
      Object o = i.next();
      if (o instanceof Comment) {
        // Comment has a toString()
        out.println("Comment: " + o);
      }
      else if (o instanceof String) {
        out.println("String: " + o);
      }
      else if (o instanceof Element) {
        out.println("Element: " + ((Element)o).getName());
      }
    }

The ProcessingInstruction class

• Some documents have ProcessingInstructions

    <?cocoon-process type="xslt"?>

• PIs can be retrieved by name and their "attribute" values are directly available:

    ProcessingInstruction cp = doc.getProcessingInstruction(
      "cocoon-process");
    cp.getValue("type");

• All PIs can be retrieved as a List with doc.getProcessingInstructions()
  – For simplicity JDOM respects PI order but not the actual placement
• getProcessingInstruction() may throw NoSuchProcessingInstructionException

Namespaces

• Namespaces are a DOM Level 2 addition
  – JDOM always supports them, even with DOM Level 1 parsers and even with validation on!
• Namespace prefix to URI mappings are held in the Document object
  – Element knows prefix and local name
  – Document knows prefix to URI mapping
  – Lets Elements easily move between Documents
• Retrieve and set a namespace URI for a prefix with:

    String uri = doc.getNamespaceURI("linux");
    doc.addNamespaceMapping(
      "linux", "http://www.linux.org");

• This mapping applies even for elements added previously

Using Namespaces

• Elements have "full names" with a prefix and local name
  – Can be specified as two strings
  – Can be specified as one "prefix:localname" string
• Allows apps to ignore namespaces if they want.

    kid = elt.getChild("JavaXML", "Contents");
    kid = elt.getChild("JavaXML:Contents");
    kid = elt.getChild("Contents");

• Element constructors work the same way.

List Details

• The current implementation uses LinkedList for speed
  – Speeds growing the List, modifying the List
  – Slows the relatively rare index-based access
• All List objects are mutable
  – Modifications affect the backing document
  – Other existing list views do not see the change
  – Same as SQL ResultSets, etc.
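
The "modifications affect the backing document" behavior can be sketched with plain java.util collections. The Node class below is a hypothetical stand-in for an element, not a JDOM class, and the sketch assumes the simplest possible design in which getChildren() hands out the backing LinkedList itself.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class (not part of JDOM or of the original slides).
public class LiveListDemo {

    // Hypothetical stand-in for an element with named children.
    static class Node {
        final String name;
        private final List<Node> children = new LinkedList<>();

        Node(String name) { this.name = name; }

        // Returns the backing list, so edits to it change this node.
        List<Node> getChildren() { return children; }
    }

    // Edits the child list and reports how many children the node ends up with.
    static int demo() {
        Node family = new Node("family");
        List<Node> kids = family.getChildren();

        kids.add(new Node("mom"));          // append: cheap on a LinkedList
        kids.add(new Node("dad"));
        kids.add(1, new Node("second"));    // index-based insert: walks the list
        kids.remove(0);                     // removes "mom" from the node itself

        return family.getChildren().size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The list is never copied back into the node; the node sees each edit because the list the caller holds is the node's own storage.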

Exceptions

• JDOMException is the root exception
  – Thrown for build errors
  – Always includes a useful error message
  – May include a "root cause" exception
• Subclasses include:
  – NoSuchAttributeException
  – NoSuchElementException
  – NoSuchProcessingInstructionException
  – DataConversionException
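
The "useful message plus root cause" pattern can be sketched with the standard Throwable cause mechanism. BuildFailedException below is a hypothetical type invented for this illustration, not JDOMException, and the simulated I/O failure is likewise an assumption.

```java
import java.io.IOException;

// Hypothetical demo class (not part of JDOM or of the original slides).
public class RootCauseDemo {

    // Hypothetical exception carrying a readable message and the original error.
    static class BuildFailedException extends Exception {
        BuildFailedException(String message, Throwable rootCause) {
            super(message, rootCause);
        }
    }

    // Simulates a build step that fails on I/O and rethrows with context.
    static void build(String source) throws BuildFailedException {
        try {
            throw new IOException("cannot open " + source);
        } catch (IOException e) {
            throw new BuildFailedException(
                "Error building document from " + source, e);
        }
    }

    // Shows how a caller can report both the message and the root cause.
    static String describe(String source) {
        try {
            build(source);
            return "ok";
        } catch (BuildFailedException e) {
            return e.getMessage()
                 + " (root cause: " + e.getCause().getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("missing.xml"));
    }
}
```

The caller only has to catch one exception type, while the underlying parser or I/O error stays available for diagnosis.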

Future

• There may be a new high-speed builder
  – Builds a skeleton but defers full analysis
  – Use of the List interface allows great flexibility
• There could be other implementations outside org.jdom
  – They should follow the specification
  – The current implementation is flexible
  – We don't expect alternate implementations to be necessary

Get Involved

• Download the software
  – http://jdom.org
• Read the specification
  – Coming soon
• Sign up for the mailing lists (see jdom.org)
  – jdom-announce
  – jdom-interest
• Watch for JavaWorld and IBM developerWorks articles
  – http://www.javaworld.com
  – http://www.ibm.com/developerWorks
• Help improve the software!