1
Advanced Topics
XML and Databases
2
XML
Overview
Structure of XML Data
– XML Document Type Definition (DTD)
– Namespaces
– XML Schema
Query and Transformation
– XPath
– XSLT
– XQuery
3
XML Overview
eXtensible Markup Language (XML) is related to Hyper-Text Markup Language (HTML), used for document presentation, and to Standard Generalized Markup Language (SGML), used for document management.
XML can handle structured data typical of a DBMS.
XML is flexible and can handle semi-structured data that a relational DBMS cannot handle.
XML is the de facto representation for exchanging data between applications on the Web.
4
XML Overview
Markup Language
– Separation of content and markup.
– Meaning of the markup.
– E.g., HTML uses document markup for presentation.
– Tags: <title> Database System Concepts </title>
– HTML has a specific set of tags.
– XML is extensible and applications can specify tags as needed.
5
XML Overview
Comparison with DBMS
– Focus is on the exchange of data between applications.
– Storage and management of XML is more complex than for a relational DBMS since XML is semi-structured.
– Tagged XML means that the message is self-documenting; no separate catalog is needed to interpret it.
– The XML format is not rigid, and an application can ignore any fields it does not need.
– Versatile, since most browsers are XML-enabled and most DBMS vendors support XML data.
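The points about self-documenting messages and ignorable fields can be illustrated with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the message and the promo-code field are hypothetical and only loosely follow the bank example used in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: the tags name each value, so the receiver
# needs no separate catalog to interpret it.
MESSAGE = """
<account>
  <account-number>A-101</account-number>
  <branch-name>Downtown</branch-name>
  <balance>500</balance>
  <promo-code>SPRING</promo-code>
</account>
"""

def read_balance(xml_text):
    """Return (account-number, balance) and ignore every other field."""
    account = ET.fromstring(xml_text)
    number = account.findtext("account-number")
    balance = float(account.findtext("balance"))
    return number, balance

print(read_balance(MESSAGE))
```

The receiver never mentions promo-code, so a sender can add such fields without breaking it.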
6
Structure of XML Data
An XML document has a single root element, e.g., bank in Figure 10.1.
Element: bank is the root element, and the document also contains customer, account and depositor elements.
Elements in the XML document must be properly nested, i.e., each start tag has a matching end tag within the same parent.
– <account> <balance> </balance> </account> is properly nested.
– <account> <balance> </account> </balance> is not properly nested.
Figure 10.2 combines unstructured data (text) and semi-structured data. This is one of the strengths of XML data exchange.
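A parser enforces the single-root and proper-nesting rules. The sketch below assumes Python's standard xml.etree.ElementTree module; the memo document is a hypothetical example of text mixed with tagged data, not a copy of Figure 10.2.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as a single, properly nested XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<account> <balance> </balance> </account>"
bad = "<account> <balance> </account> </balance>"   # end tags in the wrong order
two_roots = "<account/><customer/>"                  # more than one root element

# Hypothetical mixed content: free text around a tagged value.
memo = ET.fromstring(
    "<memo>Please credit <account-number>A-101</account-number> today.</memo>"
)
text_before = memo.text                # text before the first subelement
tagged_value = memo[0].text            # content of the subelement
text_after = memo[0].tail              # text following the subelement

print(is_well_formed(good), is_well_formed(bad), is_well_formed(two_roots))
```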
7
Structure of XML Data
Nested data in XML can be considered similar to the output of a join over multiple tables, or to an unnormalized (nested) relational table.
Figure 10.3 shows account elements nested within customer elements.
– Advantage: there is no need to join customer and account.
– E.g., the shipping address is stored with each shipment.
– Disadvantage: if customer and account are in a many-to-many relationship, the account information is replicated, with all the disadvantages of replicated information.
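Both sides of this trade-off show up in a small sketch, again assuming Python's standard xml.etree.ElementTree module. The document is hypothetical (it is not Figure 10.3): account A-401 is shared by two customers, so it appears twice.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical nested document: accounts are stored inside their customers.
NESTED = """
<bank-1>
  <customer>
    <customer-name>Joe</customer-name>
    <account><account-number>A-401</account-number><balance>600</balance></account>
  </customer>
  <customer>
    <customer-name>Mary</customer-name>
    <account><account-number>A-401</account-number><balance>600</balance></account>
    <account><account-number>A-402</account-number><balance>900</balance></account>
  </customer>
</bank-1>
"""

root = ET.fromstring(NESTED)

# Advantage: a customer's accounts are found by walking down, with no join.
accounts_by_customer = {
    customer.findtext("customer-name"): [
        account.findtext("account-number") for account in customer.findall("account")
    ]
    for customer in root.findall("customer")
}

# Disadvantage: a shared account is replicated under every owner.
copies = Counter(a.findtext("account-number") for a in root.iter("account"))

print(accounts_by_customer)
print(copies)
```

An update to the balance of A-401 would have to be applied to every copy, which is the usual cost of replicated information.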
8
Structure of XML Data
Element and subelement
– <element> </element> or <element/>
Attribute – Figure 10.4
– An attribute is of type string; it cannot be repeated within an element and cannot have subelements.
– account is an element; acct-type is an attribute; account-number, branch-name and balance are subelements of element account.
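The difference between an attribute and a subelement is visible in how a parser exposes them. This sketch assumes Python's standard xml.etree.ElementTree module; the values are hypothetical and only the names come from the slide.

```python
import xml.etree.ElementTree as ET

ACCOUNT = (
    '<account acct-type="checking">'
    "<account-number>A-102</account-number>"
    "<branch-name>Perryridge</branch-name>"
    "<balance>400</balance>"
    "</account>"
)

account = ET.fromstring(ACCOUNT)

acct_type = account.get("acct-type")          # attribute: a plain string
subelements = [child.tag for child in account]  # subelements: child nodes, in order

def parses(xml_text):
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# An attribute cannot be repeated within one element.
repeated_attribute_ok = parses('<account acct-type="checking" acct-type="savings"/>')

print(acct_type, subelements, repeated_attribute_ok)
```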
9
XML Namespace
A namespace allows organizations to specify globally unique names for element tags.
Each tag or attribute name is associated with a URI, and the combination of URI and tag (or attribute) name is unique.
A namespace can be declared in the root element.
<bank xmlns:FB="http://www.FirstBank.com">
  ….
  <FB:branch>
    <FB:branchname> …. </FB:branchname>
    <FB:branchaddress> … </FB:branchaddress>
  </FB:branch>
</bank>
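A namespace-aware parser identifies a tag by its URI rather than by the prefix. The sketch below assumes Python's standard xml.etree.ElementTree module; the branch name and address values are hypothetical fillers for the elided parts of the example.

```python
import xml.etree.ElementTree as ET

DOC = """
<bank xmlns:FB="http://www.FirstBank.com">
  <FB:branch>
    <FB:branchname>Downtown</FB:branchname>
    <FB:branchaddress>Brooklyn</FB:branchaddress>
  </FB:branch>
</bank>
"""

root = ET.fromstring(DOC)

# The parser stores the full name as {URI}local-name.
branch = root[0]
qualified_tag = branch.tag

# The prefix used for lookup is local to this mapping; only the URI must match.
ns = {"first": "http://www.FirstBank.com"}
branch_name = root.findtext("first:branch/first:branchname", namespaces=ns)

# The same local name without the namespace is a different tag and is not found.
unqualified = root.find("branch")

print(qualified_tag, branch_name, unqualified)
```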
10
XML DTD
XML documents do not have to conform to any schema or set of pre-defined tags.
However, in most cases, applications require that data conforms to some pre-defined tags.
XML DTD
– Allowed list of elements and subelements within elements.
– Does not identify data types and other constraints.
– Operators: | (or), + (1 or more), * (0 or more), ? (0 or 1)
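DTD content models use the operators | (or), + (1 or more), * (0 or more) and ? (0 or 1), which behave like regular-expression operators over the sequence of child tags. The sketch below is a toy illustration of that idea, not a DTD validator: the function names are hypothetical, and it ignores #PCDATA, EMPTY, ANY and attribute declarations. It assumes Python's standard re and xml.etree.ElementTree modules.

```python
import re
import xml.etree.ElementTree as ET

NAME = re.compile(r"[A-Za-z_][\w.\-]*")

def model_to_regex(content_model):
    """Turn a content model such as '(account | customer)+' into a regex
    over a string of child tags, each tag terminated by ';'."""
    compact = re.sub(r"\s+", "", content_model)
    # Each element name becomes a group that matches 'name;'.
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", compact)
    # A comma means 'followed by', which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))

def conforms(element, content_model):
    """Check the direct children of element against the content model."""
    children = "".join(child.tag + ";" for child in element)
    return model_to_regex(content_model).fullmatch(children) is not None

bank_model = "(account | customer | depositor)+"
account_model = "(account-number, branch-name, balance?)"

bank = ET.fromstring("<bank><customer/><account/><account/></bank>")
empty_bank = ET.fromstring("<bank/>")
account = ET.fromstring("<account><account-number/><branch-name/></account>")
swapped = ET.fromstring("<account><branch-name/><account-number/></account>")

print(conforms(bank, bank_model), conforms(empty_bank, bank_model))
print(conforms(account, account_model), conforms(swapped, account_model))
```

The account model with an optional balance is a hypothetical variation chosen to exercise the ? operator.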
11
XML DTD
Figure 10.6 DTD Example
– bank element consists of one or more account, customer or depositor elements, in any order.
– account element has subelements account-number, branch-name, balance, etc.
– Elements account-number, branch-name, etc. are of type #PCDATA (text or string).
– EMPTY: element has no contents.
– ANY: element can have any subelements.
– Attributes must have a type declaration and a default declaration (a default value, #REQUIRED or #IMPLIED).
  <!ATTLIST account acct-type CDATA "checking">
12
XML DTD
ID and IDREF and IDREFS
Figure 10.7
ID
– An attribute of type ID provides a unique (document-wide) identifier or key for its element.
– An element can have at most one attribute of type ID.
– <!ATTLIST account account-number ID #REQUIRED>
An attribute of type IDREF is a reference to an element; its value must be the ID value of some element in the document.
An attribute of type IDREFS holds a set of ID values.
ID, IDREF and IDREFS capture the primary key and foreign key functionality of the relational data model.
Figure 10.8: example of an XML document with ID and IDREFS.
An IDREF must point to an ID, but there is no type checking, so it can point to the ID of an account, of a customer or of a branch.
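Following IDREFS values, and the lack of type checking on them, can be sketched as below. The code assumes Python's standard xml.etree.ElementTree module, which does not read ID declarations from a DTD, so the ID_ATTRIBUTES table stands in for them. The document, the customer-id attribute name and the helper names are hypothetical; the data is modeled on, not copied from, Figure 10.8.

```python
import xml.etree.ElementTree as ET

DOC = """
<bank-2>
  <account account-number="A-401" owners="C100 C102">
    <balance>500</balance>
  </account>
  <account account-number="A-999" owners="A-401">
    <balance>10</balance>
  </account>
  <customer customer-id="C100"><customer-name>Joe</customer-name></customer>
  <customer customer-id="C102"><customer-name>Mary</customer-name></customer>
</bank-2>
"""

# Assumption: stands in for the <!ATTLIST ... ID ...> declarations of a DTD.
ID_ATTRIBUTES = {"account": "account-number", "customer": "customer-id"}

root = ET.fromstring(DOC)

# One document-wide index: ID values must be unique across all element types.
by_id = {}
for element in root.iter():
    id_attribute = ID_ATTRIBUTES.get(element.tag)
    if id_attribute is not None:
        value = element.get(id_attribute)
        if value in by_id:
            raise ValueError("duplicate ID: " + value)
        by_id[value] = element

def dereference(element, idrefs_attribute):
    """Return the elements referenced by a blank-separated IDREFS attribute."""
    return [by_id[ref] for ref in element.get(idrefs_attribute, "").split()]

owners = [e.findtext("customer-name") for e in dereference(by_id["A-401"], "owners")]

# No type checking: this 'owner' is an account, and nothing objects.
odd_owner_tags = [e.tag for e in dereference(by_id["A-999"], "owners")]

print(owners, odd_owner_tags)
```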
13
XML Schema – Figure 10.9
XML Schema is closer in spirit to relational schemas.
It is closely associated with namespaces, e.g., xmlns:xsd="http://www.w3.org/2001/XMLSchema"
Supports uniqueness of primary keys and constraints on foreign keys.
An element has a name and a type.
A complexType (account or customer or depositor) is a sequence of subelements.
complexType BankType is a sequence of references to elements of type account or customer or depositor.
– More well defined than an XML DTD, where an IDREF could refer to an element irrespective of whether it was an account or a customer.
minOccurs and maxOccurs are multiplicity constraints.
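The multiplicity constraints can be read straight out of a schema, since an XML Schema is itself an XML document. The sketch below assumes Python's standard xml.etree.ElementTree module and checks occurrence counts only; it is not a schema validator (it ignores element order and types). The schema fragment is a hypothetical variation on a BankType definition, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# Hypothetical fragment: customer is required, depositor is capped at 2.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="BankType">
    <xsd:sequence>
      <xsd:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element ref="customer" minOccurs="1" maxOccurs="unbounded"/>
      <xsd:element ref="depositor" minOccurs="0" maxOccurs="2"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
"""

def occurrence_bounds(schema_text, type_name):
    """Map each referenced element to (minOccurs, maxOccurs); None means unbounded."""
    schema = ET.fromstring(schema_text)
    bounds = {}
    for complex_type in schema.findall(f"{{{XSD}}}complexType"):
        if complex_type.get("name") != type_name:
            continue
        for element in complex_type.iter(f"{{{XSD}}}element"):
            low = int(element.get("minOccurs", "1"))      # both default to 1
            high_text = element.get("maxOccurs", "1")
            high = None if high_text == "unbounded" else int(high_text)
            bounds[element.get("ref") or element.get("name")] = (low, high)
    return bounds

def count_violations(parent, bounds):
    """Return the tags whose number of occurrences under parent is out of bounds."""
    violations = []
    for tag, (low, high) in bounds.items():
        count = len(parent.findall(tag))
        if count < low or (high is not None and count > high):
            violations.append(tag)
    return violations

bounds = occurrence_bounds(SCHEMA, "BankType")
ok_bank = ET.fromstring("<bank><account/><customer/></bank>")
bad_bank = ET.fromstring("<bank><depositor/><depositor/><depositor/></bank>")

print(bounds)
print(count_violations(ok_bank, bounds), count_violations(bad_bank, bounds))
```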
14
Query and Transformation of XML
3 kinds of query languages
– XPath is the building block of path expressions.
– XSLT is a transformation language.
  » Originally designed to convert XML to HTML.
  » XSLT can transform one XML document into another, so it is also a query language.
  » Most widely supported.
– XQuery is more like an object query language.
Tree model of XML data
– Root
– Nodes are either elements or attributes.
– Element nodes can have children, which are subelements or attributes of that element.
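The tree model can be made concrete by walking a parsed document and listing each node with its kind. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the describe function and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def describe(element, depth=0):
    """Return one line per node: the element, then its attributes, then its subelements."""
    lines = ["  " * depth + "element " + element.tag]
    for name, value in element.attrib.items():
        lines.append("  " * (depth + 1) + "attribute " + name + "=" + value)
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines

DOC = '<bank><account acct-type="checking"><balance>400</balance></account></bank>'
tree_lines = describe(ET.fromstring(DOC))

print("\n".join(tree_lines))
```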
15
Query and Transformation of XML
Path expression
– A sequence of steps /xx/yy/zz, where the leading / refers to the root.
– The result is a set of values from the XML document.
– /bank-2/customer/customer-name on Figure 10.8 returns <customer-name>Joe</customer-name> and <customer-name>Lisa</customer-name> and <customer-name>Mary</customer-name>
– /bank-2/customer/customer-name/text() would return only the values and not the tagged elements.
– /bank-2/account/@account-number returns the set of account numbers. @ does not follow IDREFS references; it returns the attribute value as text.
Selection
– /bank-2/account[balance > 400]
– /bank-2/account[balance > 400]/@account-number
Count
– /bank-2/account[count(customer) > 2]
Skip intermediate elements
– /bank-2//name
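Several of these path expressions can be tried with Python's standard xml.etree.ElementTree module. That module supports only a subset of XPath (no text() step and no numeric comparisons in predicates), so in this sketch those parts are emulated in plain Python. The document is hypothetical and much smaller than Figure 10.8.

```python
import xml.etree.ElementTree as ET

DOC = """
<bank-2>
  <account account-number="A-401"><balance>500</balance></account>
  <account account-number="A-402"><balance>300</balance></account>
  <customer><customer-name>Joe</customer-name></customer>
  <customer><customer-name>Mary</customer-name></customer>
</bank-2>
"""

# fromstring returns the bank-2 element, so the paths below start under it.
root = ET.fromstring(DOC)

# /bank-2/customer/customer-name : the tagged elements
name_elements = root.findall("customer/customer-name")

# /bank-2/customer/customer-name/text() : only the values
names = [element.text for element in name_elements]

# /bank-2/account/@account-number : attribute values
account_numbers = [a.get("account-number") for a in root.findall("account")]

# /bank-2/account[balance > 400]/@account-number : selection, filtered in Python
large_accounts = [
    a.get("account-number")
    for a in root.findall("account")
    if float(a.findtext("balance")) > 400
]

# /bank-2//customer-name : skip intermediate elements
all_names = [element.text for element in root.findall(".//customer-name")]

print(names, account_numbers, large_accounts, all_names)
```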
16
BMGTG402 Namespace
<402s04grade xmlns:402s04="http://www.rhsmith.umd.edu/is/aqiuol/402s04">
<402s04:grade>
<402s04:student> …. </402s04:student>
<402s04:team> … </402s04:team>
</402s04:grade>