XML Overview
By Ajarn Prapoj Sukmanont
eXtensible Markup Language
Markup Language
SGML
WML
GML
..ML
Markup Language
• A markup language is a computer language that specifies the structure and content of a document by breaking the document down into a series of elements
XML
• XML stands for eXtensible Markup Language
• XML is a markup language much like HTML
• It is a markup language developed by the W3C (www.w3.org), mainly to overcome limitations in HTML
• XML was designed to describe data
• XML tags are not predefined; you must define your own tags
XML & HTML
• HTML and XML were designed with different goals
• HTML was designed to display data and to focus on how data looks
• XML was designed to describe data and to focus on what data is
• HTML is about displaying information; XML is about describing information
XML & HTML
• HTML tag: predefined tag
  <b>John</b>
• XML tag: user-defined tag
  <Name>John</Name>
HTML
<b>025447891</b>
Markup Language
XML
<myphonenumber>025447891</myphonenumber>
Markup Language Example
XML Example
<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name>Prapoj Sukmanont</name>
  <city>Bangkok</city>
  <email>[email protected]</email>
</employee>
XML
xml01.xml
Element Structure
<tag> Content </tag>
Open Tag(start-tag) Close Tag(end-tag)
Element Name
XML Structure
<?xml version="1.0"?>
<root>
  <child>
    <sub_child></sub_child>
  </child>
  <child>
    <sub_child></sub_child>
  </child>
</root>
root
child child
sub_child sub_child
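The root/child/sub_child hierarchy above can be walked programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `outline` is a hypothetical name chosen for this illustration, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The same document as on the slide, with straight quotes.
DOCUMENT = """<?xml version="1.0"?>
<root>
  <child>
    <sub_child></sub_child>
  </child>
  <child>
    <sub_child></sub_child>
  </child>
</root>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(DOCUMENT))
print("\n".join(tree_lines))
```

Each level of nesting in the document becomes one level of indentation in the printed outline.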
XML File Structure
Document Elements
Prolog
File name : *.xml
Prolog
• XML Declaration
  <?xml version="1.0"?>
• Document Type Declaration
  <!DOCTYPE PurchaseOrder SYSTEM "po.dtd">
• Processing Instruction (PI)
  <?xml-stylesheet type="text/css" href="test.css"?>
Document Elements
<employee>
  <id>001</id>
  <name>Prasit Lee</name>
  <city>Bangkok</city>
  <email>[email protected]</email>
</employee>
XML File Structure : Prolog
Document Elements
File name : *.xml
1. XML Declaration
2. Document Type Declaration (DOCTYPE, which references a DTD)
3. Processing Instruction (PI)
Prolog
XML Example
<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name>Prasit Lee</name>
  <city>Bangkok</city>
  <email>[email protected]</email>
</employee>
xml01.xml
Thai Language XML Example
<?xml version="1.0" encoding="windows-874"?>
<employee>
  <id>001</id>
  <name>ประสิทธิ์ ลี</name>
  <city>กรุงเทพ</city>
  <email>[email protected]</email>
</employee>
xml01th.xml
XML Benefits
Self-describing data
Data exchange
Messaging format for applications
XML Benefits
And so on… (RSS, ebXML, XML applications…)
CML, MathML, MusicML, VoiceML…
Text File & XML File
Text File
1,John,Bangkok
2,David,New York
3,Peter,London
XML File
<?xml version="1.0"?>
<employee>
  <id>1</id><name>John</name><city>Bangkok</city>
  <id>2</id><name>David</name><city>New York</city>
  <id>3</id><name>Peter</name><city>London</city>
</employee>
employee.txt employee.xml
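To make the comparison concrete, here is a minimal Python sketch that turns the comma-separated text into the XML form shown on the slide. It assumes the flat layout used there (repeated id/name/city elements under one employee root); the function name `text_to_xml` is hypothetical and chosen only for this illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Contents of employee.txt as shown on the slide.
TEXT_DATA = "1,John,Bangkok\n2,David,New York\n3,Peter,London\n"


def text_to_xml(text):
    """Build the flat <employee> document from comma-separated rows."""
    root = ET.Element("employee")
    for emp_id, name, city in csv.reader(io.StringIO(text)):
        ET.SubElement(root, "id").text = emp_id
        ET.SubElement(root, "name").text = name
        ET.SubElement(root, "city").text = city
    return root


root = text_to_xml(TEXT_DATA)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Unlike the text file, every value in the result carries its own label, which is what the slides mean by self-describing data.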
Data Exchange Example
[Diagram: DB Server 1 (Database, XML Parser) and DB Server 2 (Database, XML Parser) exchange data through the XML file employee.xml shown above]
XML Processor
• After an XML document is created, it needs to be evaluated by an application known as an XML processor or XML parser
• Part of the function of the parser is to interpret the document's code and verify that it satisfies all of the XML specifications for document structure and syntax
• Microsoft developed an XML parser called MSXML (msxml.exe) for its Internet Explorer browser
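As an illustration of what a parser does, the sketch below uses the parser in Python's standard library (`xml.etree.ElementTree`) rather than MSXML; that substitution is an assumption made only to keep the example self-contained. It parses the employee document from the earlier slide and reads values back out.

```python
import xml.etree.ElementTree as ET

# Employee document from the earlier slide (email element omitted).
DOCUMENT = """<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name>Prapoj Sukmanont</name>
  <city>Bangkok</city>
</employee>"""

# fromstring() raises ET.ParseError if the text violates XML syntax.
root = ET.fromstring(DOCUMENT)

root_tag = root.tag
employee_name = root.findtext("name")
child_tags = [child.tag for child in root]

print(root_tag)
print(employee_name)
print(child_tags)
```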
Well-Formed XML
“A well-formed XML document contains no syntax errors and satisfies the specifications for XML code as laid out by the W3C”
1. Root Element
<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name>Prapoj Sukmanont</name>
  <city>Bangkok</city>
  <email>[email protected]</email>
</employee>
Root Element
2. Element Naming
• XML element names must follow these naming rules:
• Names can contain letters, numbers, and other characters
• Names must not start with a number or a punctuation character (an underscore is allowed)
• Names must not start with the letters xml (or XML, Xml, ...), which are reserved
• Names cannot contain spaces or the characters *, ? and +
• Avoid "-" and "." in names
• The ":" should not be used in element names, because it is reserved for namespaces
Element Naming : Example
OK:
<Company> <_Company> <My_company> <First-name> <Last.name>
Not OK:
<-Company> <9Company> <.My_company> <Name*> <Name?> <Name+> <xmlbook> <first name>
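The naming rules can be approximated with a short check. This is a simplified sketch, assuming ASCII-only names (the XML specification also permits many non-ASCII letters); `NAME_PATTERN` and `is_acceptable_name` are hypothetical names introduced for this illustration.

```python
import re

# Simplified, ASCII-only pattern: the first character is a letter or an
# underscore; the rest are letters, digits, underscores, hyphens, or periods.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_acceptable_name(name):
    """Apply the slide's naming rules to a candidate element name."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    # Names beginning with "xml" in any letter case are reserved.
    return not name.lower().startswith("xml")


ok_names = ["Company", "_Company", "My_company", "First-name", "Last.name"]
bad_names = ["-Company", "9Company", ".My_company", "Name*", "Name?",
             "Name+", "xmlbook", "first name"]

for name in ok_names + bad_names:
    verdict = "OK" if is_acceptable_name(name) else "Not OK"
    print(f"{name!r}: {verdict}")
```

The check follows the slide in accepting hyphens and periods after the first character, even though the rules advise avoiding them.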
3. Closed Tag
<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name>Prapoj Sukmanont</name>
  <city>Bangkok</city>
  <email>[email protected]</email>
</employee>
4. Proper Nesting Tag
Not OK (tags overlap):
<ID><name>
</ID></name>
OK (inner element closed first):
<ID><name>
</name></ID>
5. Case-sensitive
Not OK (<name> is closed by </Name>):
<ID><name>
</Name></ID>
OK (start-tag and end-tag match exactly):
<ID><name>
</name></ID>
6. Attribute Value
Not OK (attribute value is not quoted):
<employee id=001><name>
</name></employee>
OK (double quotes):
<employee id="001"><name>
</name></employee>
OK (single quotes):
<employee id='001'><name>
</name></employee>
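Rules 4 to 6 can be checked by handing each sample to a parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper `is_well_formed` and the sample labels are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# One sample per rule, taken from the slides above.
SAMPLES = {
    "improper nesting": "<ID><name></ID></name>",
    "proper nesting": "<ID><name></name></ID>",
    "case mismatch": "<ID><name></Name></ID>",
    "unquoted attribute": "<employee id=001><name></name></employee>",
    "double-quoted attribute": '<employee id="001"><name></name></employee>',
    "single-quoted attribute": "<employee id='001'><name></name></employee>",
}


def is_well_formed(text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


results = {label: is_well_formed(text) for label, text in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```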
Element Attribute
• An attribute describes a feature or characteristic of an element. Attributes are often used to provide additional information about an element. The syntax for adding an attribute to an element is:
  <element_name attribute_name="Attribute value"> … </element_name>
• Attribute name constraints:
  • The name must begin with a letter or underscore (_)
  • Spaces are not allowed in attribute names
  • Attribute names should not begin with the text string "xml"
Well-Formed XML Example
<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name prefix="Mr">Prapoj Sukmanont</name>
  <city>Bangkok</city>
  <email>[email protected]</email>
</employee>
Well-Formed XML
xml02.xml
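Once a document with attributes has been parsed, an attribute value is read separately from the element's text. A minimal sketch with Python's standard `xml.etree.ElementTree`, using the document above with the email element omitted:

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<employee>
  <id>001</id>
  <name prefix="Mr">Prapoj Sukmanont</name>
  <city>Bangkok</city>
</employee>"""

root = ET.fromstring(DOCUMENT)
name_element = root.find("name")

prefix = name_element.get("prefix")    # attribute value
full_name = name_element.text          # element content
missing = name_element.get("suffix")   # absent attribute gives None

print(prefix, full_name)
```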
XML Element Types
• Normal element
  <Name>Prasit Lee</Name>
• Empty element
  <telephone></telephone>
  <telephone/>
Element Content
• Nested element
• Character data
• Entity reference
• CDATA
• Comment
Element Content
<Start-tag> Content </End-tag>
1. Nested Element
…
<BOOK>
  <TITLE> XML Book </TITLE>
  <AUTHOR>Prasit Lee </AUTHOR>
</BOOK>
2. Character Data
…
<TITLE> XML Book </TITLE>
<AUTHOR>Prasit Lee </AUTHOR>
…
3. Entity Reference
Entity reference   Character
&amp;              &
&lt;               <
&gt;               >
&quot;             "
&apos;             '
XML Example
<?xml version="1.0"?>
<Example>
  <Statement>if x < y</Statement>
</Example>
xmlerror1.xml
Not well-formed XML
XML Example
<?xml version="1.0"?>
<Example>
  <Statement>if x &lt; y</Statement>
</Example>
xml03.xml
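Replacing special characters with entity references does not have to be done by hand. The sketch below uses `escape` from Python's standard `xml.sax.saxutils` module, which replaces &, < and > by default; the variable names are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

statement = "if x < y"

# escape() turns "<" into "&lt;", so the text is safe inside an element.
escaped = escape(statement)
document = f"<Example><Statement>{escaped}</Statement></Example>"

# The parser converts the entity reference back into the original character.
parsed = ET.fromstring(document).findtext("Statement")

print(escaped)
print(parsed)
```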
4. CDATA Section
<?xml version="1.0"?>
<Example>
  <Statement><![CDATA[
    if x > y and a < b
  ]]></Statement>
</Example>
xml04.xml
CDATA
• Sometimes an XML document needs to store large blocks of text containing the < and > symbols. Replacing every < and > with the &lt; and &gt; entity references would be cumbersome, and the code itself would be difficult to read
• Instead of using entity references, you can place large blocks of text into a CDATA section
• A CDATA section is a large block of text that the XML processor interprets only as text
<![CDATA[
  Text block
]]>
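A parser hands the content of a CDATA section back as ordinary text, with the < and > symbols untouched. A minimal sketch using Python's standard `xml.etree.ElementTree`, with a document modeled on xml04.xml:

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<Example>
  <Statement><![CDATA[
    if x > y and a < b
  ]]></Statement>
</Example>"""

root = ET.fromstring(DOCUMENT)

# The CDATA markers are not part of the text; strip() removes the
# surrounding line breaks and indentation.
statement = root.findtext("Statement").strip()
print(statement)
```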
5. Comment
…
<BOOK>
  <!-- This comment was created by "Prasit" on 03.02.2012 -->
  <TITLE> XML Book </TITLE>
  <AUTHOR>Prasit Lee </AUTHOR>
</BOOK>