
XML, Databases and Business Intelligence

Presentation to the GCPCUG Data Warehousing SIG - March 19, 2001

Copyright © 2001 by Michael A. Mina - mikeamina@aol.com

Presentation Overview

Introduction to XML
XML and Databases
XML and Business Intelligence
XML Resources

What is XML?

Extensible Markup Language - born 2/1998
Extensible - allows new markup languages
More than HTML, less than SGML
XML family of specifications
  • XML, XSL, DOM, XML Namespaces, XLink, XPointer, XPath, etc.
More specifications on the way
  • XML Schema, XML Query Language

Uses of XML

Data Storage
Data Interchange
Data Display/Rendering
It’s about data

Uses of XML

Data Storage
• Products marketed as “XML databases”
  – Tamino
  – TEXTML
• Texts dealing with XML databases
• XML-enabled databases

Uses of XML

When is XML Suited for Data Storage?
• Data needs to be accessed by many systems
• Hierarchical data
• Smaller data set
• Speed not critical
• Simpler queries used
• Data types not critical
• Data must be stored for a long time

Uses of XML

Data Interchange
• No middleware needed if applications can read and write XML
• By 2003, up to 80% of data interchange between applications over public networks will be in XML (per Gartner Group)

Uses of XML

Data Display/Rendering
• Present the same content differently for different devices

Before XML . . .
• Either support older standard only (e.g., HTML 3.2)
• Or develop multiple sets of pages and redirect user based on their browser

Uses of XML

With XML . . .
• One set of XML documents
  – One XSL document for each browser/device
• If a new device or new use for existing device emerges…
  – develop new standard protocol (e.g., WAP)
  – develop another XSL document

Uses of XML

Then either
• serve XML and XSL to client
Or
• transform XML with XSL at server
• serve appropriate markup to client
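The server-side option above can be sketched in Python. This is only an illustration under stated assumptions: the `render_html` and `render_text` functions are hypothetical stand-ins for per-device XSL documents, and the device names "desktop" and "handheld" are invented for the example. The point is that one XML source feeds every renderer.

```python
import xml.etree.ElementTree as ET
from html import escape

# One XML source, shared by every device (trimmed, hypothetical sample data).
SOURCE = (
    "<CONTACTS><ENTRY><NAME><LAST_NAME>Sanford</LAST_NAME>"
    "<FIRST_NAME>Bill</FIRST_NAME></NAME><CITY>Parma</CITY></ENTRY></CONTACTS>"
)


def full_names(root):
    """Collect 'First Last' for every ENTRY element."""
    return [
        f"{e.findtext('NAME/FIRST_NAME')} {e.findtext('NAME/LAST_NAME')}"
        for e in root.findall("ENTRY")
    ]


def render_html(root):
    """Hypothetical renderer for a desktop browser: an HTML list."""
    items = "".join(f"<li>{escape(name)}</li>" for name in full_names(root))
    return f"<ul>{items}</ul>"


def render_text(root):
    """Hypothetical renderer for a small device: plain text, one name per line."""
    return "\n".join(full_names(root))


# One renderer per device plays the role of one XSL document per device.
RENDERERS = {"desktop": render_html, "handheld": render_text}


def serve(device):
    """Transform the XML at the server and return markup suited to the device."""
    root = ET.fromstring(SOURCE)
    return RENDERERS[device](root)


print(serve("desktop"))
print(serve("handheld"))
```

Supporting a new device in this sketch means adding one entry to `RENDERERS`; the XML source is untouched.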

Why is XML needed?

Consider HTML
• HyperText Markup Language
• Based on SGML
• Most web pages use HTML

Why is XML needed?

Advantages of HTML
• Easy to learn compared to most programming languages
• Readily available authoring tools (even a text file editor)
• Readily available rendering tool
  – Browsers are free, all new PCs have browsers installed

Why is XML needed?

Disadvantages of HTML
• Deviation from its original purpose
  – Presentation should be based on a styling language
• Lack of extensibility
• Toleration of faulty code
  – acceptable for web page design
  – unacceptable for transmission of drug data

Why is XML needed?

Consider SGML
• Standard Generalized Markup Language
  – No toleration of faulty code
  – Completely extensible
• HTML, XML based on SGML

Why is XML needed?

The advantages of SGML are actually disadvantages in the web environment
Complete extensibility of SGML means
• It is not cost-effective to develop browsers to support SGML
• Potentially huge bandwidth and storage issues

Why is XML needed?

XML allows the use of metadata - “data about data”
HTML tags
• <p>The Gettysburg Address was written by Abraham Lincoln</p>
XML elements
• <document>The Gettysburg Address</document> was written by <president><author>Abraham Lincoln</author></president>
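To make the metadata point concrete, here is a minimal Python sketch that reads the XML version of the sentence and pulls out the document title and the author by element name, something the `<p>` version cannot offer. The `<sentence>` wrapper is an assumption added only so the fragment has a single root element and can be parsed.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: <sentence> is not in the slide; it supplies the
# single root element that a parser requires.
xml_text = (
    "<sentence><document>The Gettysburg Address</document> was written by "
    "<president><author>Abraham Lincoln</author></president></sentence>"
)

root = ET.fromstring(xml_text)

# The element names act as metadata: a program can ask for "the document"
# or "the author" without understanding the English sentence.
document_title = root.findtext("document")
author_name = root.findtext("president/author")

print(document_title)
print(author_name)
```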

Basic XML

<?xml version="1.0"?>
<CONTACTS>
  <DATABASE USERTYPE="PERSONAL">Contact List</DATABASE>
  <ENTRY NUMBER="1">
    <NAME>
      <LAST_NAME>Sanford</LAST_NAME>
      <FIRST_NAME>Bill</FIRST_NAME>
    </NAME>
    <TITLE>VP, Controller</TITLE>
    <COMPANY>SDC, Inc.</COMPANY>
    <WEBSITE>www.sdcinc.biz</WEBSITE>
    <ADDRESS>4132 Homestead Rd.</ADDRESS>
    <CITY>Parma</CITY>
    <STATE>OH</STATE>
    <ZIP>44134</ZIP>
    <PHONE>
      <DIRECT>440-398-2098</DIRECT>
      <CELLULAR>440-123-4567</CELLULAR>
    </PHONE>
    <EMAIL>bsanford@sdc.biz</EMAIL>
  </ENTRY>
</CONTACTS>

XML Markup includes:
• XML declaration
• Root Element
• Elements
• Attributes
• Entities
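The parts of the contact list above can be inspected programmatically. The following Python sketch uses a trimmed copy of the document (an assumption made for brevity) and the standard-library ElementTree parser to show where the root element, child elements, and attributes end up after parsing.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the contact list; only a few elements are kept.
contacts_xml = """<?xml version="1.0"?>
<CONTACTS>
  <DATABASE USERTYPE="PERSONAL">Contact List</DATABASE>
  <ENTRY NUMBER="1">
    <NAME>
      <LAST_NAME>Sanford</LAST_NAME>
      <FIRST_NAME>Bill</FIRST_NAME>
    </NAME>
    <COMPANY>SDC, Inc.</COMPANY>
  </ENTRY>
</CONTACTS>"""

root = ET.fromstring(contacts_xml)

root_tag = root.tag                                  # the root element
user_type = root.find("DATABASE").get("USERTYPE")    # an attribute value
entry = root.find("ENTRY")                           # a child element
entry_number = entry.get("NUMBER")                   # attribute values are strings
last_name = entry.findtext("NAME/LAST_NAME")         # text of a nested element

print(root_tag, user_type, entry_number, last_name)
```

Note that `entry_number` comes back as the string `"1"`, not a number, which reflects the lack of data typing in plain XML.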

XHTML

Next-generation of HTML
HTML specification rewritten to be XML compliant
XML is not going to replace HTML, XHTML is
Differences between HTML, XHTML include:
• lower case tags required
• proper nesting and closure of tags
• quoting attributes

Parsers

A parser is a program that processes an XML document.
IE includes a parser that allows the rendering of XML documents.
Parsers are either validating or non-validating.

Well-formedness

An XML document is well-formed if
• attribute values are in quotes
• tags are properly nested
• start and end tags are the same case
• there is one root element
• empty elements are formatted properly

If it’s not well-formed, it’s not XML
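A non-validating parser enforces these rules by refusing to parse a document that breaks them. As a minimal sketch, the hypothetical helper `is_well_formed` below wraps Python's standard-library parser and reports whether a string parses; the sample strings are invented to trigger each rule in the list above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "ok": '<a b="1"><c/></a>',
    "unquoted attribute": "<a b=1/>",
    "bad nesting": "<a><b></a></b>",
    "case mismatch": "<A></a>",
    "two root elements": "<a/><b/>",
    "unclosed empty element": "<a><br></a>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```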

Document Type Definition (DTD)

Used to specify how elements, attributes, etc. relate to each other
DTDs are not XML documents, but are used by them
DTDs do not support data typing
XML Schema being developed to address lack of data typing
• Schemas currently exist (e.g., Microsoft XDR)
• The W3C is working on an XML Schema recommendation

Document Type Definition (DTD)

<!ELEMENT CONTACTS (DATABASE, ENTRY+)>
<!ELEMENT DATABASE (#PCDATA)>
<!ATTLIST DATABASE USERTYPE (PERSONAL|CORPORATE) "PERSONAL">
<!ELEMENT ENTRY (NAME, TITLE, COMPANY, WEBSITE?, ADDRESS, CITY, STATE, ZIP, PHONE, PAGER?, FAX?, EMAIL?)>
<!ATTLIST ENTRY NUMBER CDATA #IMPLIED>
<!ELEMENT NAME (LAST_NAME, FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT COMPANY (#PCDATA)>
<!ELEMENT WEBSITE (#PCDATA)>
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT CITY (#PCDATA)>
<!ELEMENT STATE (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT PHONE (OFFICE?, DIRECT?, CELLULAR?)>
. . . etc.

Validating XML

An XML document that conforms to its DTD is valid
Validating parsers
• IBM's XML4J Parser
  – online at http://www.oasis-open.org/cover/xml4j-check00.html
• IBM's DOMit: A servlet for XML validation
  – online at http://www.networking.ibm.com/xml/XmlValidatorForm.htm
• IE itself, modified by installing a download from http://msdn.microsoft.com


XSL

Extensible Stylesheet Language
Two specifications
• XSL Transformations (XSLT)
• XSL Formatting Objects
XSLT is a W3C recommendation, XSL Formatting Objects is not (yet)

XSLT

Transforms XML into other markup languages
Often used to transform XML to HTML
Limited query-like functionality

An XSL Document

<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
<xsl:template match="/">
<html>
  <head>
    <title>
      <xsl:value-of select="CONTACTS/DATABASE" />
    </title>
  </head>
  <body style="background-color: DDDDDD;">
    <h2 align="center">
      <xsl:value-of select="CONTACTS/DATABASE" />
      <hr />
    </h2>
    <!-- -->
    <xsl:for-each select="CONTACTS/ENTRY[COMPANY='SDC, Inc.']" order-by="NAME/LAST_NAME">
      <table align="center" width="400" style="font-family: sans-serif; font-size: 10pt; background-color: EEEEEE;">
        <tr>
          <td width="200"><b>
            <xsl:value-of select="NAME/FIRST_NAME" />
(excerpt ends here)

XSLT query-like functionality:
• SELECT - the select attribute of xsl:value-of
• WHERE - the [COMPANY='SDC, Inc.'] filter in xsl:for-each
• ORDER BY - the order-by attribute of xsl:for-each
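The SELECT / WHERE / ORDER BY analogy can also be shown outside XSLT. The Python sketch below is not an XSLT processor; it applies the same three steps to a small contact list using the standard-library ElementTree module. Entries 2 and 3 are hypothetical rows added so that the filter and the sort have something to do.

```python
import xml.etree.ElementTree as ET

# Entry 1 follows the slide; entries 2 and 3 are invented for illustration.
contacts_xml = """<CONTACTS>
  <DATABASE USERTYPE="PERSONAL">Contact List</DATABASE>
  <ENTRY NUMBER="1">
    <NAME><LAST_NAME>Sanford</LAST_NAME><FIRST_NAME>Bill</FIRST_NAME></NAME>
    <COMPANY>SDC, Inc.</COMPANY>
  </ENTRY>
  <ENTRY NUMBER="2">
    <NAME><LAST_NAME>Adams</LAST_NAME><FIRST_NAME>Ann</FIRST_NAME></NAME>
    <COMPANY>SDC, Inc.</COMPANY>
  </ENTRY>
  <ENTRY NUMBER="3">
    <NAME><LAST_NAME>Jones</LAST_NAME><FIRST_NAME>Pat</FIRST_NAME></NAME>
    <COMPANY>Other Co.</COMPANY>
  </ENTRY>
</CONTACTS>"""

root = ET.fromstring(contacts_xml)

# WHERE: keep only entries whose COMPANY child has this exact text.
matches = root.findall("ENTRY[COMPANY='SDC, Inc.']")

# ORDER BY: sort the surviving entries by last name.
ordered = sorted(matches, key=lambda e: e.findtext("NAME/LAST_NAME"))

# SELECT: pull one value out of each entry.
first_names = [e.findtext("NAME/FIRST_NAME") for e in ordered]

print(first_names)
```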

An XSL Document

Other functionality in the same stylesheet:
• XSLT
• HTML
• CSS

XML, XSL and JavaScript

<html>
<head><title>Test XML Page</title></head>
<body>
<script language="JavaScript">
  // Load the XML document synchronously (IE-only ActiveX parser)
  var xmlObject = new ActiveXObject("microsoft.xmldom");
  xmlObject.async = false;
  xmlObject.load("contacts.xml");

  // Load the XSL stylesheet the same way
  var xslObject = new ActiveXObject("microsoft.xmldom");
  xslObject.async = false;
  xslObject.load("contacts.xsl");

  // Transform the XML with the stylesheet and write the result into the page
  document.write(xmlObject.transformNode(xslObject));
</script>
</body>
</html>


XML and Databases

Microsoft SQL Server 2000
Oracle products (various)
IBM DB2 UDB v. 7.1

Microsoft SQL Server 2000

SQL can retrieve results in XML format
Three XML modes: Raw, Auto, Explicit
Raw mode - result row tagged <row>
Auto mode - more control over tags
Explicit mode
• Default tags - table names, field names
• Overwrite by specifying DTD with query
• Specify shape of the XML tree
• Requires relatively complex SQL queries
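To give a feel for the shape of raw-mode output, the Python sketch below emulates it: each result row becomes a `<row>` element with one attribute per column. This is only an emulation written for illustration. In SQL Server the work is done by appending the `FOR XML RAW` clause to a query; the helper name `rows_to_raw_xml` and the sample column names are assumptions.

```python
import xml.etree.ElementTree as ET


def rows_to_raw_xml(rows):
    """Emulate raw mode: one <row> element per result row, columns as attributes."""
    fragments = []
    for row in rows:
        element = ET.Element("row", {name: str(value) for name, value in row.items()})
        fragments.append(ET.tostring(element, encoding="unicode"))
    return "".join(fragments)


# Hypothetical result set standing in for the output of a SELECT statement.
result_rows = [
    {"LastName": "Sanford", "City": "Parma"},
    {"LastName": "Adams", "City": "Akron"},
]

raw_xml = rows_to_raw_xml(result_rows)
print(raw_xml)
```

The output is a sequence of sibling `<row>` elements with no enclosing root, so a consumer has to wrap it in a root element before treating it as a well-formed document.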

Microsoft SQL Server 2000

XML View Mapper
• Create schema file to relate XML Data Reduced (XDR) schema to SQL Server schema
Updategrams
• Express changes to XML document as database inserts, updates, and deletes

Oracle Products

Intelligent Webhouse Initiative
Oracle 8i - “the world’s first XML-enabled database”
Oracle Reports 6i
• Reports can be stored as XSL

Oracle Products

Oracle JDeveloper 3.1
• Allows development of web applications that process XML data
• Syntax-checking for XML, XSL
• XSQL: Java programs that read XML from and write XML to database
• Integration with Oracle 8i

IBM DB2 UDB v. 7.1

DB2 XML Extender
• facility to enable DB2 to work with XML
Net.Data
• macro language for DB2 UDB

IBM DB2 UDB v. 7.1

DB2 XML Extender
• Repository for XML and DTDs
• Storage methods
  – XML column
  – XML collection

IBM DB2 UDB v. 7.1

XML column
• Entire XML document stored in one column as an XML UDT
• Data Access Definition (DAD) defines indexes based on elements and attributes
XML collection
• Relational tables mapped to/from XML
• DAD maps DTD to tables and columns

IBM DB2 UDB v. 7.1

DB2 XML Extender also allows
• SQL to query XML based on elements and attributes
• Stored procedures to generate XML from DB2

IBM DB2 UDB v. 7.1

Net.Data
• Allows conversion of SQL results to XML
• Is not restricted to DB2 UDB as a data source

XML and Query Languages

XPath
• not based on XML
• limited functionality
• relatively difficult to understand
XSLT
• based on XML
• works with XPath, HTML, CSS
• also has limited functionality
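For readers who have not seen XPath, a few path expressions may help. The sketch below uses Python's standard-library ElementTree, which supports only a subset of XPath, so it illustrates the flavor of the syntax and is not a full XPath engine. The second entry and its values are hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<CONTACTS>"
    "<ENTRY NUMBER='1'><CITY>Parma</CITY>"
    "<PHONE><DIRECT>440-398-2098</DIRECT></PHONE></ENTRY>"
    "<ENTRY NUMBER='2'><CITY>Akron</CITY>"
    "<PHONE><DIRECT>330-555-0100</DIRECT></PHONE></ENTRY>"
    "</CONTACTS>"
)

# Child path: CITY elements directly under ENTRY elements of the root.
cities = [e.text for e in root.findall("ENTRY/CITY")]

# Descendant search: every DIRECT element at any depth.
direct_numbers = [e.text for e in root.findall(".//DIRECT")]

# Attribute predicate: the CITY of the entry whose NUMBER attribute is '2'.
entry_two_city = root.findtext("ENTRY[@NUMBER='2']/CITY")

print(cities, direct_numbers, entry_two_city)
```

The path strings themselves are not XML, which is the sense of "not based on XML" in the list above.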

XML and Query Languages

Per the W3C website:

“The mission of the XML Query working group is to provide flexible query facilities to extract data from real and virtual documents on the Web, therefore finally providing the needed interaction between the web world and the database world. Ultimately, collections of XML files will be accessed like databases.”

(emphasis added)

XML Editors

Microsoft - XML Notepad
Tanyitech - Easy XML 1.0
  – $39 at http://www.tanyitech.com
Altova - XML Spy
  – $199 at http://www.xmlspy.com
Extensibility - Turbo XML
  – $269 at http://www.entensibility.com
Popkin Software - Envision XML
  – http://www.popkin.com


XML Servers/Databases

IxiaSoft - TEXTML Server
• http://www.ixiasoft.com
• TEXTML Server Lite, a free evaluation version, is available
Software AG - Tamino
• http://www.softwareag.com/tamino

XML and Business Intelligence

XML for Analysis
Common Warehouse Metamodel (CWM)
Predictive Model Markup Language (PMML)

XML for Analysis

A platform-independent Microsoft specification
Enable access to analytical data from XML for Analysis-compliant clients
Based on HTTP, XML, SOAP, OLE DB for OLAP, OLE DB for Data Mining
Supporters include AlphaBlox, Brio, Business Objects, Cognos, SAS, SPSS

Common Warehouse Metamodel

Per the CWM website (http://www.cwmforum.org):

“The purpose of OMG’s Common Warehouse Metadata Initiative (CWMI) is to enable easy interchange of metadata between data warehousing tools and metadata repositories in distributed heterogeneous environments.”

Common Warehouse Metamodel

The CWM is a specification for modeling metadata (relational, non-relational, multidimensional) found in a data warehousing environment.
Instances of the metamodel are exchanged via XMI (XML Metadata Interchange) documents.
“The ultimate goal of CWM is to do for data warehousing and business intelligence tools what HTML did for web browsers.”

PMML

Predictive Model Markup Language
• Developed by the Data Mining Group (http://www.dmg.org/html/pmml_v1_1.html)
Allows reuse of predictive models between PMML-compliant applications

XML Resources

World Wide Web Consortium
• http://www.w3.org
The XML Industry Portal
• http://www.xml.org
XML101.com
• http://www.xml101.com
XML Magic
• http://www.xmlmagic.com

<closing>Thank You For Attending</closing>
