IWMW 2002: Web Standards Briefing (Session C2)

A centre of expertise in digital information management Web Standards Briefing Brian Kelly UKOLN University of Bath Bath, BA2 7AY Email [email protected] URL http://www.ukoln.ac.uk/ UKOLN is supported by:

Upload: iwmw

Post on 09-Jan-2017


A centre of expertise in digital information management

www.ukoln.ac.uk

Web Standards Briefing

Brian Kelly
UKOLN
University of Bath
Bath, BA2 7AY

Email: [email protected]
URL: http://www.ukoln.ac.uk/

UKOLN is supported by:


Contents
• Introduction
• Standards
• The Original Web Architecture
• Architectural Developments
• Deployment Issues
• Discussion

Aims of Talk
• To give brief overview of Web architecture
• To describe developments to Web standards
• To briefly address implementation models

Please feel free to ask questions at any time, especially to clarify any unexplained TLAs or XTLAs


About Me
Brian Kelly:

• UK Web Focus – a JISC-funded post to advise HE and FE communities on Web developments

• Based in UKOLN - a national focus of expertise in digital information management based at the University of Bath

• Involved in Web since 1993, while working in the Computing Service at University of Leeds

• Represent JISC on the World Wide Web Consortium (W3C)


Standards in HE/FE Context
Standards are important in the HE and FE sector to:

• Ensure widespread access to resources
• Enable resources to be reused and repurposed
• Ensure scholarly resources can be preserved
• Address accountability of public funding
• Minimise resource costs for upgrading systems
• Provide universal access to resources (cf. disability legislation)


Standards

Need for standards to provide:
• Platform and application independence
• Avoidance of patented technologies
• Flexibility and architectural integrity
• Long-term access to data

Ideally look at standards first, then find applications which support the standards. However it can be difficult to achieve this ideal!

Before the Web
Access to resources typically required use of a software vendor’s software, which was only available on a limited number of platforms. Often the software would be licensed.
The goal of the Web was to provide universal access to resources. Who could argue with this goal?


Standards and the Web

W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, informed by member & public review

IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols may be developed by interested individuals
• "Rough consensus and working code"

ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produce robust standards

Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint, PDF)
• May emerge as standards

[Diagram labels giving examples for the four groups, in the order they appear on the slide:]
• PNG, HTML, Z39.50, Java
• HTML, XML, PNG, …
• HTTP, URN, whois++
• HTML extensions, PDF and Java?


The Case For W3C Standards
Why use open standards developed by the W3C? Why not leave it to the marketplace?

W3C’s open standards have been developed in an open environment, with the aim of achieving platform and application independence

Commercial companies develop proprietary formats in order to maximise their profits and dividends to shareholders

W3C’s open standards have been developed to interoperate with each other according to W3C’s design vision

Commercial companies typically develop proprietary formats in isolation, or along the lines of a company vision


Standards, Architectures, Applications, Resources
This talk touches on several areas:

Standards: concerned with protocols and file formats
• Open standards vs. proprietary
• HTML / XML vs. PDF
• CSS / XSL vs. HTML
• GIF vs. PNG

Architectures: models for implementing systems
• Which standards are applicable
• NT / Unix
• File system / database application
• HTML tools / content management

Applications: software products used to implement systems
• Apache / IIS
• FrontPage / Dreamweaver
• Oracle / SQLServer
• ColdFusion vs. ASP

Resources: financial and staff costs needed to implement systems
• Development vs. migration costs
• Use of in-house expertise
• In-house vs. out-sourced
• Licensed vs. open source


GIF
As an example of the dangers of using proprietary solutions, consider the GIF file format:

• Unisys announce that they hold the patent on the compression algorithm used in GIF images and that users of GIF will have to pay
• Following much debate, Unisys require payment for a licence from software developers - and also from end users of unlicensed software ($5,000!)
• Web community responds with PNG format
• See <http://burnallgifs.org/>

WARNING:
• There is no guarantee that payment will not be required for proprietary file formats which are currently free


How Does The Web Work?
The Web has three fundamental concepts:

• URLs: addresses of resources
• HTTP: dialogue between client and server
• HTML: format of resources

1 User clicks on link to the address (URL): http://www.netsoft.com/hello.html
2 Browser converts link to HTTP command (METHOD):
  Connect to computer at www.netsoft.com
  GET /hello.html
3 Remote computer sends file:
  <HTML><TITLE>Welcome</TITLE>..<P>The <A HREF="…">Netsoft</A> home page</P>
4 Local computer displays HTML file

[Diagram: a Web browser and a Web server exchanging the request and the file; the browser window shows "Welcome to Netsoft" and "The Netsoft home page"]
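
Steps 1 and 2 can be sketched in code. This is a minimal illustration under stated assumptions, not how any particular browser is implemented: the function name build_get_request is hypothetical, the HTTP/1.0 version string and Host header are added for completeness, and nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Split a URL into host, port and the text of an HTTP GET request.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80      # default HTTP port when the URL gives none
    path = parts.path or "/"     # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return host, port, request.encode("ascii")


host, port, request = build_get_request("http://www.netsoft.com/hello.html")
print(f"Connect to computer at {host}, port {port}")
print(request.decode("ascii"))
```

The URL supplies everything the browser needs: the host name says where to connect and the path says what to ask for. The server's reply (step 3) would be the HTML file itself.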


Approaches To HTML
Emphasis on managing HTML resources is inappropriate:

• HTML is an output format, which cannot easily be reused (e.g. WAP, e-Books, etc.)
• Need to manage HTML fragments (only partly achievable with SSIs)
• Need to manage collections of resources
• Need to have single master source of data
• Need to support new developments such as personalisation
• Difficult to integrate with new formats

Issues
• Should we stop giving HTML courses?
• Should we stop buying HTML authoring tools?


XML
XML:

• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc)

• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998

• Support from industry (SGML vendors, Microsoft, etc.)

• Support in latest versions of Web browsers


XML Concepts (1)
Well-formed XML resources:

• Make end-tags explicit: <li>...</li>
• Make empty elements explicit: <img ... />
• Quote attributes: <img src="logo.gif" height="20" />
• Use consistent upper/lower case: <p> and <P> are different

XML Namespaces:
Mechanism for ensuring unique XML elements:

<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>
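
The well-formedness rules and the namespace mechanism can be tried out with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up for illustration. The namespace is declared with an xmlns:i attribute, the form used in the W3C Namespaces in XML Recommendation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = '<ul><li>One</li><li><img src="logo.gif" height="20" /></li></ul>'
unclosed = "<ul><li>One<li>Two</ul>"      # end-tags not explicit
mixed_case = "<p>text</P>"                # <p> and <P> are different
unquoted = "<img src=logo.gif />"         # attribute value not quoted

for sample in (good, unclosed, mixed_case, unquoted):
    print(is_well_formed(sample), sample)

# Namespaces: the prefix "i" is only shorthand for the namespace URI.
doc = '<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>'
part = ET.fromstring(doc).find("{http://foo.org/1998-001}PART")
print(part.tag, part.text)
```

Only the first sample is accepted. The last two lines show that the parser identifies the element by its namespace URI plus local name, which is what makes the element name unique.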


XML Concepts (2)
XML Schemas
• Allow constraints to be applied on XML attributes
• Express shared vocabularies and allow machines to carry out rules made by people
• Richer than DTDs
• See <http://www.w3.org/XML/Schema>

XSLT
• A language for transforming XML from one DTD to another, or to another format (e.g. PDF)
• Written in XML
• Knows about XML (e.g. tree structures, etc.)
• See <http://www.xslt.com/>


XML Concepts (3)
XLink provides sophisticated hyperlinking:
• Links that allow you to choose multiple destinations
• Bidirectional links
• Links with special behaviours:
  • Expand-in-place / Replace / Create new window
  • Link on load / Link on user action
• Link databases
• See <http://www.xml.com/pub/a/2000/09/xlink/>

XPointer
• Provides access to arbitrary portions of XML resource
• See <http://www.devshed.com/Server_Side/XML/XPointer/page1.html>

[Diagram labels: England, France]


Getting to XML With XHTML
XHTML:

• HTML represented in XML
• Some small changes to HTML:
  • Elements in lowercase: <p> not <P>
  • Attributes must be quoted: <img src="logo" height="50" />
  • Elements must be closed: <p>...</p> and <img src="logo" ... />
• Gain benefits from XML
• Tools available (e.g. HTML-Kit from http://www.chami.com/html-kit/)

• See <http://www.webreference.com/xml/column6/>, <http://groups.yahoo.com/group/XHTML-L/> and <http://www.ariadne.ac.uk/issue27/web-focus/>

Note the IWMW 2002 Web site is (mostly) XHTML
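
The three changes listed for XHTML are mechanical, so simple cases can be converted automatically. The following is a rough sketch only, assuming Python's standard html.parser; the class XHTMLWriter, the function to_xhtml and the short list of empty elements are illustrative names, and the sketch does not insert end-tags that are missing from the input (a real tool does much more).

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Illustrative subset of elements that have no content in HTML.
EMPTY_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class XHTMLWriter(HTMLParser):
    """Re-emit parsed HTML with lowercase names, quoted attributes, closed empties."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        rendered = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        closer = " />" if tag in EMPTY_ELEMENTS else ">"
        self.out.append(f"<{tag}{rendered}{closer}")

    def handle_endtag(self, tag):
        if tag not in EMPTY_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))


def to_xhtml(fragment):
    writer = XHTMLWriter()
    writer.feed(fragment)
    writer.close()
    return "".join(writer.out)


xhtml = to_xhtml("<P>Logo: <IMG SRC=logo.gif HEIGHT=50></P>")
print(xhtml)
ET.fromstring(xhtml)  # raises ParseError if the result is not well-formed XML
```

Once the output parses as XML, it can be handled by generic XML tools, which is the benefit the slide refers to.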


CSS
CSS:

• Cascading Style Sheets
• XHTML/XML defines structure, CSS describes the appearance
• CSS 1.0 and 2.0 now W3C recommendations
• CSS 3.0 in preparation (modularised)
• We should be using CSS:
  • Part of architecture
  • Ease of maintenance
  • Becoming much richer
  • Accessibility

• See <http://www.w3c.org/Style/CSS/>


SVG
SVG:

• Scalable Vector Graphics
• A language for describing two-dimensional graphics in XML
• See <http://www.w3.org/Graphics/SVG/Overview.htm8>

• Also see presentation on XML written in SVG at <http://www.w3c.org/Talks/2001/12/IH-Euroweb/W3CInTheWorldslide.svgz>

• WWW 2002 talk at <http://www.w3c.org/2002/Talks/www2002-SVG/>
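
Because SVG is XML, a graphic can be produced with ordinary XML tools. The sketch below is a minimal example using Python's standard xml.etree.ElementTree; the shape, sizes and colours are arbitrary choices made for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialise SVG as the default namespace

# Root element: a 120 x 120 drawing area.
svg = ET.Element(
    f"{{{SVG_NS}}}svg",
    {"width": "120", "height": "120", "viewBox": "0 0 120 120"},
)
# One shape: an outlined circle in the middle of the drawing.
ET.SubElement(
    svg,
    f"{{{SVG_NS}}}circle",
    {"cx": "60", "cy": "60", "r": "50", "fill": "none", "stroke": "black"},
)

markup = ET.tostring(svg, encoding="unicode")
print(markup)
```

The result is plain text that can be stored, searched, transformed or styled like any other XML resource, unlike a bitmap image.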


SVG Example

http://www.karto.ethz.ch/neumann/cartography/vienna/


SVG and XSLT
This example:

• Originally written in Java

• Author realised that XSLT would be easier

• Uses SVG for chess board and pieces

• Uses XSLT to move pieces

http://people.w3.org/maxf/ChessGML/


CML, SVG and XSLT
http://www.adobe.com/svg/demos/cml2svg/html/index.html

A molecule described in CML can be transformed using XSLT into SVG, allowing it to be displayed and manipulated


SMIL
SMIL:

• Synchronized Multimedia Integration Language
• A language for authoring of interactive audiovisual presentations
• Allows you to synchronize text, images, audio and video in a document
• An XML application
• See <http://www.w3c.org/AudioVideo/>


SMIL Example
http://www.kevlindev.com/tutorials/basics/animation/svg_smil/index.htm

http://www.reseau.it/smil/smilapp_en.html


MathML
MathML:

• An XML application for maths

• Various plugins, dedicated readers, etc.

• Mozilla renders natively

See <http://www.mozilla.org/projects/mathml/>


Modularisation
How can you:

• Include XML resources such as MathML, ChemML, etc. in XHTML documents?
• Provide a subset of XHTML features in browsers on devices such as mobile phones, PDAs, etc.?

The answer is:
• XHTML modularisation (modularization)
• See <http://www.w3.org/TR/xhtml-modularization/> and <http://www.xml.com/pub/a/2002/01/16/xhtml-m12n.html>


Addressing (1)
URLs have limitations:

• Lack of long-term persistency
  • University changes name, or department is shut down or merged
  • Directory structure reorganised
• Inability to support multiple versions (mirroring)

URIs:
• Were an address of a resource – and moving a resource was annoying but not critical
• With the development of "Web services", structured resources, B2B communications, etc. the availability of URIs will be of great importance


Addressing (2)
Solutions:

• Unique identifiers possible, but resolution difficult
• Solutions include DOIs, PURLs, OpenURLs, etc.
• Interest mostly in publishing sector
• "URIs don’t break - people break them"
• Think about URL persistency & naming guidelines: <http://www.ariadne.ac.uk/issue31/web-focus/>
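
The idea behind resolver-based solutions such as PURLs can be shown with a toy lookup table: the published identifier never changes, and only the table entry is updated when a resource moves. This is a sketch of the principle, not of any real resolver; the class Resolver, the identifier and the example.ac.uk URLs are all hypothetical.

```python
class Resolver:
    """Toy persistent-identifier resolver (illustration only)."""

    def __init__(self):
        self._table = {}

    def register(self, identifier, url):
        """Create or update the current location for an identifier."""
        self._table[identifier] = url

    def resolve(self, identifier):
        """Return an HTTP-style (status, location) pair."""
        url = self._table.get(identifier)
        if url is None:
            return 404, None
        return 302, url   # redirect the client to the current location


resolver = Resolver()
resolver.register("/net/example/prospectus",
                  "http://www.example.ac.uk/dept-a/prospectus.html")
before = resolver.resolve("/net/example/prospectus")

# The department merges and the directory structure is reorganised:
# only the table entry changes, published links stay the same.
resolver.register("/net/example/prospectus",
                  "http://www.example.ac.uk/faculty-b/prospectus.html")
after = resolver.resolve("/net/example/prospectus")
print(before, after)
```

The hard part in practice is organisational rather than technical: someone has to keep the table up to date, which is why naming guidelines matter.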


Transport - The Original Roadmap
HTTP/0.9 and HTTP/1.0:
• Design flaws and implementation problems

HTTP/1.1:
• Addresses some of these problems
• 60% server support
• Performance benefits! (60% packet traffic reduction)
• Is acting as fire-fighter
• Not sufficiently flexible or extensible

HTTP/NG:
• Radical redesign using object-oriented technologies
• Undergoing trials
• Gradual transition (using proxies)


Transport - Today
Today:

• Responsibility for development moved from W3C to IETF
• Little progress with HTTP/NG
• Problems with HTTP/1.1:
  • Lengthy (176-page) specification without much explicit rationale for design decisions
  • Environment has become more complex
  • Lack of a clean underlying data model
  • …
• See "Clarifying the Fundamentals of HTTP" <http://www2002.org/CDROM/refereed/444/>


SOAP
SOAP:

• Simple Object Access Protocol
• Facilitates development of machine-to-machine communications using Web protocols by providing a richer XML-based messaging mechanism

• A protocol for invoking methods on servers, services, components and objects

• Codifies existing practice of using XML and HTTP as a method invocation mechanism

• See FAQ at <http://www.develop.com/soap/soapfaq.htm>
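
To make "XML and HTTP as a method invocation mechanism" concrete, the sketch below builds the XML message for one method call. It assumes the SOAP 1.1 envelope namespace; the service namespace urn:example:courses, the method GetCourseList and its department parameter are invented for illustration, and the message is only printed, not sent.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "urn:example:courses"                      # hypothetical service


def build_envelope(method, params):
    """Wrap a method name and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


message = build_envelope("GetCourseList", {"department": "Physics"})
print(message)

# The receiving server would parse the body to find out what to invoke.
call = ET.fromstring(message).find(f"{{{SOAP_ENV}}}Body")[0]
print(call.tag, call.find("department").text)
```

In use, this text would be the body of an HTTP POST to the service, and the reply would be another envelope carrying the result.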


Metadata
Metadata - the missing architectural component from the initial implementation of the web

[Diagram: the Web architecture components Addressing (URL), Data format (HTML) and Transport (HTTP), joined by Metadata (RDF), with the labels PICS, TCN, MCF, DSig, DC, ...]

Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management


Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertions:
  • This page is from the University of Bath
  • This page is a legally-binding list of courses provided by the University

P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users

Note that discussions about additional rights management metadata are currently taking place


RDF
RDF (Resource Description Framework):

• Highlight of WWW 7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)
• Applications include:
  • cataloging resources
  • resource discovery
  • electronic commerce
  • intelligent agents
  • digital signatures
  • content rating
  • intellectual property rights
  • privacy

• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>


RDF Model
RDF:

• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model

[Diagram: RDF Data Model. Generic form: a Resource is linked to a Value by an arc labelled with a PropertyType. Example: page.html, Cost, £0.05, with ValidUntil 23-Mar-99. A second graph shows the property as a node of its own, with arcs labelled InstanceOf (Property), PropName (Cost), PropObj, Value and ValidUntil.]
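
The data model can be mimicked with plain (resource, property, value) triples. The sketch below is one reading of the slide's diagram and is an assumption, not RDF syntax: the first graph states the cost directly, and the second describes that statement as a node of its own (the identifier stmt1 is hypothetical) so that a ValidUntil date can be attached to it. The arc names are the labels shown on the slide.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "resource property value")

# Simple statement: page.html has a Cost of £0.05.
simple = [Triple("page.html", "Cost", "£0.05")]

# The same statement described as a node, so that more can be said about it.
described = [
    Triple("stmt1", "InstanceOf", "Property"),
    Triple("stmt1", "PropName", "Cost"),
    Triple("stmt1", "PropObj", "page.html"),
    Triple("stmt1", "Value", "£0.05"),
    Triple("stmt1", "ValidUntil", "23-Mar-99"),
]


def values(graph, resource, prop):
    """Return every value linked to a resource by the given property."""
    return [t.value for t in graph if t.resource == resource and t.property == prop]


print(values(simple, "page.html", "Cost"))
print(values(described, "stmt1", "ValidUntil"))
```

Everything is expressed with the same three-part structure, which is what lets independently developed vocabularies be combined in one graph.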


Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF.
Mozilla supports site maps in RDF, as well as bookmarks and history lists.
See Netscape's or HotWired home page for a link to the RDF file.

[Diagram labels: Trusted 3rd Party Metadata; Embedded Metadata (e.g. sitemaps)]

Image from http://purl.oclc.org/net/eric/talks/www7/devday/


RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust

But:
• Is it too complex?
• Is it the right approach?


RSS – An XML/RDF Application
RSS (Rich / RDF Site Summary):

• Initially XML, now an RDF application
• Used for news feeds
• Lightweight approach that we should be investigating (e.g. see news page on IWMW 2002 Web site)

See example of an RSS authoring tool and parser at <http://rssxpress.ukoln.ac.uk/>
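
Reading a feed takes very little code, which is part of RSS's appeal. The sketch below parses a cut-down RSS 1.0 document with Python's standard library; the feed content and the example.ac.uk addresses are invented, and the sample omits parts of the format (such as the channel's list of items) to keep it short.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical RSS 1.0 feed.
FEED = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.example.ac.uk/news/">
    <title>Example University News</title>
    <link>http://www.example.ac.uk/news/</link>
    <description>Hypothetical institutional news feed</description>
  </channel>
  <item rdf:about="http://www.example.ac.uk/news/1">
    <title>Workshop announced</title>
    <link>http://www.example.ac.uk/news/1</link>
  </item>
</rdf:RDF>
"""

NS = {"rss": "http://purl.org/rss/1.0/"}

root = ET.fromstring(FEED)
channel_title = root.findtext("rss:channel/rss:title", namespaces=NS)
items = [
    (item.findtext("rss:title", namespaces=NS),
     item.findtext("rss:link", namespaces=NS))
    for item in root.findall("rss:item", NS)
]

print(channel_title)
for title, link in items:
    print(f"{title}: {link}")
```

A page that lists headlines from several sources only needs to repeat this for each feed and merge the results.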


Model For News Feeds

Good For User
The end user can choose her news feeds, including local news, news from JISC services and news from third parties.

Good For Service
The service can choose its own information flow model. Its news is disseminated automatically.

[Diagram: RSS feeds combined into one news page for the user (Local News, JISC News, National News)]

RSS sources:
• Institution (e.g. Bath)
• Community (e.g. MIDAS)
• External (e.g. BBC)

Ways of producing RSS:
• XHTML converted to RSS
• Structured database converted to RSS
• Zope CMS outputs to RSS & XHTML


What About Tomorrow?
Two interesting areas:

The Semantic Web
• Will allow intelligent agents to know about resources
• AI and ontologists meet the Web
• Uses RDF (Resource Description Framework) – W3C’s framework for metadata
• Some concerns over scale of problem
• See <http://www.w3.org/2001/sw/>

Web Services
• Highlight of the WWW 10 and WWW 2002 conferences


Web Services
The Web:

• Initially used for viewing static resources
• Then interactive services built (e.g. e-learning)

We now want:
• Programmable Web services which can be used by other Web services using standard Web protocols

We have experience of the first generation of externally-hosted Web services (stats services, voting systems, etc.) - see <http://www.ariadne.ac.uk/issue23/web-focus/>.
The next generation will be programmable and machine-understandable.
Note that concerns over outsourcing may be an issue.


Example
Some examples at gotdotnet.com:

• Mailsender
• Thumbnail Generator

Concepts have been around for some time (see Auditing & Evaluating Web Sites workshop).
Now being standardised (UDDI, WSDL, SOAP, …)

http://www.gotdotnet.com/playground/services/thumbnailgen.aspx


We’ve Been Here Before
Reusable components available on the network:
• Sounds like COM/DCOM, CORBA, etc. for reusable program components

Network services for use within a community:
• Sounds like JISCmail, RDN, EDINA, MIMAS, BIDS, Mirror Service and other JISC Services
• It’s outsourcing – but it’s OK!

Web Services And UK HE / FE Communities
Sounds like a great idea:
• We’ve the organisational framework to develop national services (JISC, etc.)
• We’ve got the network
• We’ve a community which is willing to exploit centrally-provided services and wants to avoid reinventing the wheel (haven’t we?)


Currently...

[Diagram: the end user reaches local, national and international content through a separate Web interface for each service]

We should be moving away from providing separate Web services with their own interfaces …


Currently...

[Diagram: the end user, local, national and international content each behind its own Web interface, plus separate services for Collection Description (e.g. Agora), User Profile (e.g. Headline) and Authentication (Athens)]

… and separate metadata repositories and access services (which are sometimes centralised) …

Agora and Headline are eLib Hybrid libraries


Future...

[Diagram: the end user reaches Content through brokered access provided by an institutional portal (MLE, …). Metadata Services / Access (Web) Services: User profile, Collection description, Authentication. Application Services?: Bookmarks, Spell-checker.]

.. and move to Web-accessible, machine-understandable Web services as well as seamless access to content


Other W3C Areas
See:

• W3C site map at <http://www.w3c.org/Help/siteindex>

• TimBL’s Web Design Issues at <http://www.w3c.org/DesignIssues>

• Web Architecture from 50,000 feet at <http://www.w3.org/DesignIssues/Architecture.html>


Architectures
Let us consider the following areas:

• Content Management
• Systems Architecture
• Access (Browser support)


Position Today
What should we be doing today?

• Move away from creating new content in HTML
• Move to XHTML as part of the migration
• Deploy XML applications
• Store structured information in a neutral database
• Use a CMS to manage our content
• Deploy B2B applications (such as RSS) to avoid the human bottleneck

Note that these are aspirations. We will, of course, be constrained by existing systems, resource implications, vested interests, inertia, etc.


The CMS To The Rescue
HTML authoring tools have limitations (as has HTML). A CMS (Content Management System):

• Allows fragments to be managed
• Allows collections to be managed
• Allows resources to be stored in a neutral format (backend database)
• Allows resources to be reused
• Often provides access control
• Often provides workflow processes and project management

Issues
• CMS can be expensive
• CMS can be free but have support implications
• Which one to choose?


Content Management
Storing resources in HTML and GIF/JPEG:

• Is easy to do and is a low cost solution
• Makes reuse and management of resources difficult

[Diagram labels: XML; TIFF / …; GIF / JPEG; on-the-fly or batch conversion; user-agent negotiation; HTML; WML]

Content Management System for:
• Management of content (content maintenance, metadata management, access rights, project management, …)
• Delivery of content (e.g. user-agent negotiation, alternative file formats [such as WML], etc.)
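
Delivery from a single master source with user-agent negotiation can be sketched as follows. This is a simplified illustration, not a description of any CMS: the function names choose_format and render and the sample content are hypothetical, only the presence of the WML media type in the Accept header is checked, and quality values are ignored.

```python
from xml.sax.saxutils import escape, quoteattr

WML_TYPE = "text/vnd.wap.wml"   # media type for WML


def choose_format(accept_header):
    """Pick an output format from the media types the client says it accepts."""
    offered = {part.split(";")[0].strip().lower()
               for part in accept_header.split(",")}
    return "wml" if WML_TYPE in offered else "html"


def render(master, fmt):
    """Produce one delivery format from the neutral master record."""
    title, body = master["title"], escape(master["body"])
    if fmt == "wml":
        return f"<wml><card title={quoteattr(title)}><p>{body}</p></card></wml>"
    return (f"<html><head><title>{escape(title)}</title></head>"
            f"<body><p>{body}</p></body></html>")


# One master record, two delivery formats.
master = {"title": "News", "body": "Workshop booking is open"}
phone = render(master, choose_format("text/vnd.wap.wml, image/vnd.wap.wbmp"))
desktop = render(master, choose_format("text/html, */*;q=0.8"))
print(phone)
print(desktop)
```

The content is written once and stored in a neutral form; adding a further output format means adding a further branch to the delivery step, not re-authoring the content.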


Systems Architecture
Issues for you to consider:

• Operating System: Should you go for a Unix OS or Windows NT? If Unix, should you go for Linux?
• Open Source vs Licensed Solution: Should you go for an open source solution or buy a licensed application?
• Package vs Do It Yourself: Should you make use of a pre-packaged solution or develop your own solution based on a toolkit (e.g. database, scripting language, …)?

There are no global solutions – your choice should be based on expertise available locally, resourcing issues, discussions with partners, solutions provider, etc.


Browser Issues
Which approach to browser issues should you take?

Web sites should be usable in old browsers as these are still in use and we aim to maximise access. Therefore you should deliver HTML 3.2 / 4.0 and avoid technologies such as JavaScript and CSS.

Old browsers are broken and fail to implement new technologies which provide (a) richer functionality (b) support for new devices and (c) better support for people with disabilities. Therefore you should use the latest stable versions of HTML (XHTML), CSS, etc.

NOTE
• Use of ‘clean’ HTML should degrade gracefully
• XHTML is a useful transition
• User-agent negotiation may be relevant

QUESTION
• Should organisations / community implement a browser policy?


Conclusions
To conclude:

• Standards are important
• HTML won’t do the job
• XHTML is a useful transition
• Many new standards being developed
• Need to keep up-to-date and avoid developing systems with built-in obsolescence
• We’ll need a CMS to manage richly functional institutional Web services
• “Web services” should be important – and we shouldn’t be too concerned about using remote services


Questions
Any questions?