Alex He


The Semantic Web and XML [reading notes on A Semantic Web Primer]

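1. HTML documents do not carry structured information. In XML, the values of elements can be constrained (for example with regular-expression patterns), and XML separates content from presentation.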

2. XML Schema is gradually replacing DTDs; XPath is used to address and query XML documents; XSL and XSLT are used to transform and display XML documents.

3. An XML document consists of a prolog, a number of elements, and an optional epilog:

1) Prolog

<?xml version="1.0" encoding="UTF-16"?>

<?xml version="1.0" encoding="UTF-16" standalone="no"?>

<!DOCTYPE book SYSTEM "book.dtd">

2) Elements

3) Attributes

<lecturer name="David Billington" phone="+61-7-3875 507"/>

4) Comments

<!-- This is a comment -->

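5) Processing Instructions (PIs)

The format is <?target instruction?>, for example <?stylesheet type="text/css" href="mystyle.css"?>

This hands the instruction type="text/css" href="mystyle.css" to the application identified by the target stylesheet.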
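
To make the target/instruction split concrete, here is a minimal Python sketch using the standard library's xml.dom.minidom. The one-element document DOC and the helper name processing_instructions are assumptions made for illustration, not something from the book.

```python
from xml.dom import minidom

# Hypothetical document: the processing instruction from the notes, followed by an empty root.
DOC = '<?stylesheet type="text/css" href="mystyle.css"?><book/>'

def processing_instructions(xml_text):
    """Return (target, instruction) pairs for the top-level processing instructions."""
    document = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in document.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]

pis = processing_instructions(DOC)
for target, instruction in pis:
    print("target:", target)
    print("instruction:", instruction)
```

The parser does not interpret the instruction itself; it only reports the target and the raw instruction text, and it is up to the application to act on them.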

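6) Well-formed XML documents: there is exactly one root element; every element is closed; tags do not overlap; the attributes of an element have unique names; element and tag names are permissible.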
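
The well-formedness rules can be tried out with any XML parser. The following is a rough Python sketch with xml.etree.ElementTree; the sample strings and the helper name is_well_formed are made up for this note.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it reports a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples, one per rule listed in the notes.
SAMPLES = {
    "single closed root": "<book><title>XML</title></book>",
    "two root elements": "<book/><book/>",
    "unclosed element": "<book><title>XML</book>",
    "overlapping tags": "<b><i>text</b></i>",
    "duplicate attribute": '<book id="1" id="2"/>',
    "illegal element name": "<1book/>",
}

for label, text in SAMPLES.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample should be accepted. Note that this checks well-formedness only; it says nothing about validity against a DTD or schema.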

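4. Note that elements are ordered, whereas attributes are not. An XML document that follows the syntactic rules is well-formed. A document is valid if it is well-formed, uses structuring information (a DTD or an XML Schema), and respects that structuring information.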

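5. There are two ways to define the structure of XML documents: DTDs and XML Schema. XML Schema offers extended possibilities, above all for defining data types.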

6. DTDs

Element definitions

<!ELEMENT lecturer (name,phone)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT phone (#PCDATA)>

<!ELEMENT lecturer (name|phone)>

<!ELEMENT lecturer ((name,phone)|(phone,name))>
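
The three alternative content models for lecturer behave like regular expressions over the sequence of child element names. The Python sketch below makes that explicit. It is not a DTD validator: the regexes in MODELS are a hand translation assumed for illustration, and the helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written regexes over the space-terminated list of child tag names.
# Each key is one of the content models from the notes.
MODELS = {
    "(name,phone)": r"name phone ",
    "(name|phone)": r"(name |phone )",
    "((name,phone)|(phone,name))": r"(name phone |phone name )",
}

def child_sequence(xml_text):
    """Return the child tag names of the root element, e.g. 'name phone '."""
    root = ET.fromstring(xml_text)
    return "".join(child.tag + " " for child in root)

def matches(model, xml_text):
    """Check whether the children of the root element fit the given content model."""
    return re.fullmatch(MODELS[model], child_sequence(xml_text)) is not None

NAME_THEN_PHONE = "<lecturer><name>David</name><phone>507</phone></lecturer>"
PHONE_THEN_NAME = "<lecturer><phone>507</phone><name>David</name></lecturer>"
NAME_ONLY = "<lecturer><name>David</name></lecturer>"

for model in MODELS:
    print(model, [matches(model, doc) for doc in (NAME_THEN_PHONE, PHONE_THEN_NAME, NAME_ONLY)])
```

The comma fixes the order, the vertical bar picks exactly one alternative, and the last model accepts both orders by listing them as alternatives.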

Attribute definitions

<!ELEMENT order (item+)>

<!ATTLIST order

orderNo ID #REQUIRED

customer CDATA #REQUIRED

date CDATA #REQUIRED>

<!ELEMENT item EMPTY>

<!ATTLIST item

itemNo ID #REQUIRED

quantity CDATA #REQUIRED

comments CDATA #IMPLIED>

Attribute types

• CDATA, a string (sequence of characters),

• ID, a name that is unique across the entire XML document,

• IDREF, a reference to another element with an ID attribute carrying the same value as the IDREF attribute,

• IDREFS, a series of IDREFs,

• (v1| . . . |vn), an enumeration of all possible values.
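
The uniqueness requirement for ID attributes can be illustrated with a small Python sketch. Python's standard library does not validate against a DTD, so the set ID_ATTRS below restates by hand which attributes the order/item DTD above declares as ID. The sample order documents and the helper name duplicate_ids are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# (element name, attribute name) pairs declared with type ID in the order/item DTD.
ID_ATTRS = {("order", "orderNo"), ("item", "itemNo")}

def duplicate_ids(xml_text):
    """Return the ID values that occur more than once anywhere in the document."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        elem.get(attr)
        for elem in root.iter()
        for (tag, attr) in ID_ATTRS
        if elem.tag == tag and elem.get(attr) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)

# Hypothetical documents; customer, date, and quantities are made up.
GOOD_ORDER = (
    '<order orderNo="o23456" customer="John Smith" date="2007-10-17">'
    '<item itemNo="a528" quantity="1"/><item itemNo="c817" quantity="3"/></order>'
)
BAD_ORDER = (
    '<order orderNo="o23456" customer="John Smith" date="2007-10-17">'
    '<item itemNo="a528" quantity="1"/><item itemNo="a528" quantity="3"/></order>'
)

print(duplicate_ids(GOOD_ORDER))
print(duplicate_ids(BAD_ORDER))
```

All ID values share one pool, so an orderNo that equals some itemNo would also be reported as a clash.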

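Value types

• #REQUIRED. The attribute must appear.

• #IMPLIED. The attribute is optional.

• #FIXED "value". The attribute always has the given value.

• "value". The default value, used when the attribute is omitted.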

XML entities

<!ENTITY thisyear "2007">

The entity can then be referenced as &thisyear; (the trailing semicolon is required).
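
A quick way to see an entity reference being expanded is to parse a document that declares the entity in its internal DTD subset. This is a minimal Python sketch with xml.etree.ElementTree; the note element and its text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity from the notes declared in an internal DTD subset.
DOC = """<!DOCTYPE note [
  <!ENTITY thisyear "2007">
]>
<note>Copyright &thisyear; by the authors</note>"""

note = ET.fromstring(DOC)
print(note.text)
```

The parser replaces the reference with the entity's value while reading, so the application sees only the expanded text.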

7. XML Schema

Element types

<element name="email"/>

<element name="head" minOccurs="1" maxOccurs="1"/>

<element name="to" minOccurs="1"/>

Attribute types

<attribute name="id" type="ID" use="required"/>

<attribute name="speaks" type="Language" use="optional" default="en"/>

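Data types

In addition to the predefined basic types, user-defined complex types can be built with one of three compositors:

sequence: the elements appear in the given order

all: all of the elements must appear, in any order

choice: exactly one of the elements appears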

<complexType name="lecturerType">

     <sequence>

          <element name="firstname" type="string"

               minOccurs="0" maxOccurs="unbounded"/>

          <element name="lastname" type="string"/>

     </sequence>

     <attribute name="title" type="string" use="optional"/>

</complexType>

Data type extension

<complexType name="extendedLecturerType">

     <!-- W3C XML Schema requires complexContent around extension -->

     <complexContent>

          <extension base="lecturerType">

               <sequence>

                    <element name="email" type="string"

                         minOccurs="0" maxOccurs="1"/>

               </sequence>

               <attribute name="rank" type="string" use="required"/>

          </extension>

     </complexContent>

</complexType>

Data type restriction

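<complexType name="restrictedLecturerType">

     <!-- W3C XML Schema requires complexContent around restriction -->

     <complexContent>

          <restriction base="lecturerType">

               <sequence>

                    <element name="firstname" type="string"

                         minOccurs="1" maxOccurs="2"/>

                    <!-- a restriction repeats the parts of the base type it keeps -->

                    <element name="lastname" type="string"/>

               </sequence>

               <attribute name="title" type="string" use="required"/>

          </restriction>

     </complexContent>

</complexType>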

Restriction of simple types

<simpleType name="dayOfMonth">

     <restriction base="integer">

          <minInclusive value="1"/>

          <maxInclusive value="31"/>

     </restriction>

</simpleType>

<simpleType name="dayOfWeek">

     <restriction base="string">

          <enumeration value="Mon"/>

          <enumeration value="Tue"/>

          <enumeration value="Wed"/>

          <enumeration value="Thu"/>

          <enumeration value="Fri"/>

          <enumeration value="Sat"/>

          <enumeration value="Sun"/>

     </restriction>

</simpleType>
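
What these two simple types accept can be spelled out as plain checks. The Python sketch below mirrors the facets by hand (minInclusive, maxInclusive, enumeration). It is not a schema validator, and the function names are hypothetical.

```python
import re

# The values enumerated by the dayOfWeek type.
DAYS_OF_WEEK = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Simplified lexical form of an integer: optional sign followed by decimal digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")

def is_day_of_month(text):
    """Mirror dayOfMonth: an integer with minInclusive 1 and maxInclusive 31."""
    text = text.strip()
    if not _INTEGER.fullmatch(text):
        return False
    return 1 <= int(text) <= 31

def is_day_of_week(text):
    """Mirror dayOfWeek: a string restricted to the seven enumerated values."""
    return text in DAYS_OF_WEEK

for candidate in ("1", "31", "0", "32", "abc"):
    print(candidate, is_day_of_month(candidate))
for candidate in ("Mon", "Monday"):
    print(candidate, is_day_of_week(candidate))
```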

A complete example

<element name="email" type="emailType"/>
<complexType name="emailType">
     <sequence>
          <element name="head" type="headType"/>
          <element name="body" type="bodyType"/>
     </sequence>
</complexType>
<complexType name="headType">
     <sequence>
          <element name="from" type="nameAddress"/>
          <element name="to" type="nameAddress"
               minOccurs="1" maxOccurs="unbounded"/>
          <element name="cc" type="nameAddress"
               minOccurs="0" maxOccurs="unbounded"/>
          <element name="subject" type="string"/>
     </sequence>
</complexType>
<complexType name="nameAddress">
     <attribute name="name" type="string" use="optional"/>
     <attribute name="address" type="string" use="required"/>
</complexType>
<complexType name="bodyType">
     <sequence>
          <element name="text" type="string"/>
          <element name="attachment" minOccurs="0"
               maxOccurs="unbounded">
               <complexType>
                    <attribute name="encoding" use="optional" default="mime">
                         <simpleType>
                              <restriction base="string">
                                   <enumeration value="mime"/>
                                   <enumeration value="binhex"/>
                              </restriction>
                         </simpleType>
                    </attribute>
                    <attribute name="file" type="string" use="required"/>
               </complexType>
          </element>
     </sequence>
</complexType>

8. Namespaces

     <?xml version="1.0" encoding="UTF-16"?>

     <vu:instructors

          xmlns:vu="http://www.vu.com/empDTD"

          xmlns:gu="http://www.gu.au/empDTD"

          xmlns:uky="http://www.uky.edu/empDTD">

     <uky:faculty

          uky:title="assistant professor"

          uky:name="John Smith"

          uky:department="Computer Science"/>

     <gu:academicStaff

          gu:title="lecturer"

          gu:name="Mate Jones"

          gu:school="Information Technology"/>

     </vu:instructors>

A namespace is declared as: xmlns:prefix="location"

The default namespace is declared as: xmlns="location"
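
To see what a parser makes of the prefixes, the instructors document can be loaded with Python's xml.etree.ElementTree, which reports each name together with its namespace location in the form {location}localname. This is only a sketch; the variable names are my own.

```python
import xml.etree.ElementTree as ET

# The instructors document from the notes, without the XML declaration.
DOC = """
<vu:instructors
    xmlns:vu="http://www.vu.com/empDTD"
    xmlns:gu="http://www.gu.au/empDTD"
    xmlns:uky="http://www.uky.edu/empDTD">
  <uky:faculty uky:title="assistant professor" uky:name="John Smith"
               uky:department="Computer Science"/>
  <gu:academicStaff gu:title="lecturer" gu:name="Mate Jones"
                    gu:school="Information Technology"/>
</vu:instructors>
"""

UKY = "http://www.uky.edu/empDTD"
GU = "http://www.gu.au/empDTD"

root = ET.fromstring(DOC)

# Look up an element by its fully qualified name ...
faculty = root.find(f"{{{UKY}}}faculty")
# ... or by a prefix that is mapped to the namespace location for this query only.
staff = root.find("gu:academicStaff", {"gu": GU})

print(root.tag)
print(faculty.get(f"{{{UKY}}}name"))
print(staff.get(f"{{{GU}}}school"))
```

The prefix is only shorthand inside the document; what identifies an element or attribute is the pair of namespace location and local name.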

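9. Addressing and Querying XML Documents

This is the syntax of XPath.

/library/author --> addresses all author elements that are children of the root element library

//author --> addresses all author elements anywhere in the document

/library/@location --> addresses the location attribute node of the library element

//book/@title[.="Artificial Intelligence"] --> addresses the title attribute nodes of book elements whose value is "Artificial Intelligence"

//book[@title="Artificial Intelligence"] --> addresses all book elements whose title attribute has the value "Artificial Intelligence"

//author[1] --> addresses the first author element node (more precisely, every author that is the first author child of its parent)

//author[1]/book[last()] --> addresses the last book element under the first author element node

//book[not(@title)] --> addresses all book elements that have no title attribute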
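
Several of these paths can be tried with Python's xml.etree.ElementTree, which supports only a subset of XPath. In the sketch below the paths are written relative to the root element, the attribute is read with get(), and the not(...) case is replaced by a list comprehension because that function is not part of the supported subset. The LIBRARY document is hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical library document; names and titles are sample data.
LIBRARY = """
<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web"/>
    <book/>
  </author>
</library>
"""

root = ET.fromstring(LIBRARY)  # root is the library element

authors_under_library = root.findall("author")        # /library/author
all_authors = root.findall(".//author")               # //author
location = root.get("location")                       # /library/@location
ai_books = root.findall(".//book[@title='Artificial Intelligence']")
first_authors = root.findall(".//author[1]")          # //author[1]
last_book_of_first = root.findall("author[1]/book[last()]")
# //book[not(@title)] has no ElementTree equivalent, so filter in Python instead.
untitled = [book for book in root.iter("book") if "title" not in book.attrib]

print(len(authors_under_library), len(all_authors), location)
print([book.get("title") for book in ai_books])
print([author.get("name") for author in first_authors])
print([book.get("title") for book in last_book_of_first])
print(len(untitled))
```

A full XPath implementation would accept the expressions exactly as written in the notes; the rewrites here are only a workaround for the standard library's limits.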

10. Processing: transforming and displaying XML documents with XSL and XSLT

An example

<?xml version="1.0" encoding="UTF-16"?>

<xsl:stylesheet version="1.0"

     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

     <xsl:template match="/author">

          <html>

               <head><title>An author</title></head>

               <body bgcolor="white">

                    <b><xsl:value-of select="name"/></b><br></br>

                    <xsl:value-of select="affiliation"/><br></br>

                    <i><xsl:value-of select="email"/></i>

               </body>

          </html>

     </xsl:template>

</xsl:stylesheet>

Another example

For the following XML document

<authors>

     <author>

          <name>Grigoris Antoniou</name>

          <affiliation>University of Bremen</affiliation>

          <email>ga@tzi.de</email>

     </author>

     <author>

          <name>David Billington</name>

          <affiliation>Griffith University</affiliation>

          <email>david@gu.edu.net</email>

     </author>

</authors>      

define the following XSLT stylesheet

<?xml version="1.0" encoding="UTF-16"?>

<xsl:stylesheet version="1.0"

     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

     <xsl:template match="/">

          <html>

               <head><title>Authors</title></head>

               <body bgcolor="white">

                    <xsl:apply-templates select="authors"/>

                    <!-- Apply templates for AUTHORS children -->

               </body>

          </html>

     </xsl:template>

     <xsl:template match="authors">

          <xsl:apply-templates select="author"/>

     </xsl:template>

     <xsl:template match="author">

          <h2><xsl:value-of select="name"/></h2>

          Affiliation:<xsl:value-of select="affiliation"/><br/>

          Email: <xsl:value-of select="email"/>

          <p/>

     </xsl:template>

</xsl:stylesheet>
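
Python's standard library has no XSLT processor, so the following is only a hand-written imitation of how the three templates cooperate: one function per template, each delegating to the next in the way xsl:apply-templates does. The function names are hypothetical, and a real processor would differ in whitespace and in how it serializes empty elements.

```python
import xml.etree.ElementTree as ET
from html import escape

# The authors document from the notes.
AUTHORS = """<authors>
  <author>
    <name>Grigoris Antoniou</name>
    <affiliation>University of Bremen</affiliation>
    <email>ga@tzi.de</email>
  </author>
  <author>
    <name>David Billington</name>
    <affiliation>Griffith University</affiliation>
    <email>david@gu.edu.net</email>
  </author>
</authors>"""

def render_author(author):
    """Plays the role of the template with match="author"."""
    name = escape(author.findtext("name", default=""))
    affiliation = escape(author.findtext("affiliation", default=""))
    email = escape(author.findtext("email", default=""))
    return f"<h2>{name}</h2>Affiliation:{affiliation}<br/>Email: {email}<p/>"

def render_authors(authors):
    """Plays the role of match="authors": apply the author template to each author child."""
    return "".join(render_author(author) for author in authors.findall("author"))

def render_document(xml_text):
    """Plays the role of match="/": emit the HTML frame and delegate the body."""
    root = ET.fromstring(xml_text)
    body = render_authors(root) if root.tag == "authors" else ""
    return (
        "<html><head><title>Authors</title></head>"
        f'<body bgcolor="white">{body}</body></html>'
    )

html_page = render_document(AUTHORS)
print(html_page)
```

The point of the comparison is the control flow: the root template builds the page frame, and each nested template contributes the fragment for the nodes it matches, in document order.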

An XSLT template that reads attribute values:

<xsl:template match="person">

     <person

          firstname="{@firstname}"

          lastname="{@lastname}"/>

</xsl:template>

posted on 2013-02-21 09:06 by Alex木头