IWMW 2002: Web Standards Briefing (Session C2)
TRANSCRIPT
A centre of expertise in digital information management
www.ukoln.ac.uk
Web Standards Briefing
Brian Kelly
UKOLN
University of Bath
Bath, BA2 7AY
[email protected]
http://www.ukoln.ac.uk/
UKOLN is supported by:
Contents
• Introduction
• Standards
• The Original Web Architecture
• Architectural Developments
• Deployment Issues
• Discussion
Aims of Talk
• To give a brief overview of Web architecture
• To describe developments to Web standards
• To briefly address implementation models
Please feel free to ask questions at any time, especially to clarify any unexplained TLAs or XTLAs
About Me
Brian Kelly:
• UK Web Focus – a JISC-funded post to advise HE and FE communities on Web developments
• Based in UKOLN - a national focus of expertise in digital information management based at the University of Bath
• Involved in Web since 1993, while working in the Computing Service at University of Leeds
• Represent JISC on the World Wide Web Consortium (W3C)
Standards in HE/FE Context
Standards are important in the HE and FE sector to:
• Ensure widespread access to resources
• Enable resources to be reused and repurposed
• Ensure scholarly resources can be preserved
• Address accountability of public funding
• Minimise resource costs for upgrading systems
• Provide universal access to resources (cf. disability legislation)
Standards
Need for standards to provide:
• Platform and application independence
• Avoidance of patented technologies
• Flexibility and architectural integrity
• Long-term access to data
Ideally look at standards first, then find applications which support the standards. However, it can be difficult to achieve this ideal!
Before the Web
Access to resources typically required use of a software vendor’s software, which was only available on a limited number of platforms. Often the software would be licensed.
The goal of the Web was to provide universal access to resources. Who could argue with this goal?
Standards and the Web
W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, informed by member & public review
• Examples: HTML, XML, PNG, …
IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols may be developed by interested individuals
• "Rough consensus and working code"
• Examples: HTTP, URN, whois++
ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produces robust standards
• Examples: PNG, HTML, Z39.50, Java
Proprietary
• De facto standards
• Often initially appealing (cf. PowerPoint, PDF)
• May emerge as standards
• Examples: HTML extensions, PDF and Java?
The Case For W3C Standards
Why use open standards developed by the W3C? Why not leave it to the marketplace?
W3C’s open standards have been developed in an open environment, with the aim of achieving platform and application independence
Commercial companies develop proprietary formats in order to maximise their profits and dividends to shareholders
W3C’s open standards have been developed to interoperate with each other according to W3C’s design vision
Commercial companies typically develop proprietary formats in isolation, or along the lines of a company vision
Standards, Architectures, Applications, Resources
This talk touches on several areas:
Standards: concerned with protocols and file formats
• Open standards vs. Proprietary
• HTML / XML vs. PDF
• CSS / XSL vs. HTML
• GIF vs. PNG
Architectures: models for implementing systems
• Which standards are applicable
• NT / Unix
• File system / database application
• HTML tools / content management
Applications: software products used to implement systems
• Apache / IIS
• FrontPage / Dreamweaver
• Oracle / SQLServer
• ColdFusion vs. ASP
Resources: financial and staff costs needed to implement systems
• Development vs. Migration costs
• Use of in-house expertise
• In-house vs. out-sourced
• Licensed vs. open source
GIF
As an example of the dangers of using proprietary solutions, consider the GIF file format:
• Unisys announce that they hold the patent to the compression algorithm used in GIF images and that users of GIF will have to pay
• Following much debate, Unisys require payment for a licence from software developers, and also from end users of unlicensed software ($5,000!)
• Web community responds with PNG format
• See <http://burnallgifs.org/>
WARNING:
• There is no guarantee that payment will not be required for proprietary file formats which are currently free
How Does The Web Work?
The Web has three fundamental concepts:
• URLs: addresses of resources
• HTTP: dialogue between client and server
• HTML: format of resources
Example (a Web browser talking to a Web server):
1 User clicks on the link "The Netsoft home page", which points to the address (URL) http://www.netsoft.com/hello.html
2 Browser converts link to HTTP command (METHOD): connect to computer at www.netsoft.com and send GET /hello.html
3 Remote computer sends file: <HTML><TITLE>Welcome</TITLE>..<P>The <A HREF="…">Netsoft</A> home page</P>
4 Local computer displays HTML file: "Welcome to Netsoft"
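The browser's step from URL to HTTP command can be sketched in a few lines of Python. This is only an illustration of the idea on the slide: the function name `build_get_request` is hypothetical, the request uses a minimal HTTP/1.0 form, and nothing is actually sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> tuple[str, int, bytes]:
    """Turn a URL into (host, port, raw HTTP request bytes)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = parts.port or 80       # default HTTP port when none is given
    path = parts.path or "/"      # an empty path means the site root
    # Minimal HTTP/1.0-style request: method line, Host header, blank line.
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return host, port, request.encode("ascii")


host, port, request = build_get_request("http://www.netsoft.com/hello.html")
print(host, port)
print(request.decode("ascii"))
```

A real browser would then open a connection to `host` on `port`, send these bytes and render the HTML that comes back.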
Approaches To HTML
An emphasis on managing HTML resources is inappropriate:
• HTML is an output format, which cannot easily be reused (e.g. WAP, e-Books, etc.)
• Need to manage HTML fragments (only partly achievable with SSIs)
• Need to manage collections of resources
• Need to have single master source of data
• Need to support new developments such as personalisation
• Difficult to integrate with new formats
Issues
• Should we stop giving HTML courses?
• Should we stop buying HTML authoring tools?
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc.)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Support in latest versions of Web browsers
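To illustrate what "arbitrary elements" means in practice, here is a small Python sketch using the standard library's ElementTree parser. The record and its values are invented for the example; only the element name `STUDENT-NUMBER` comes from the slide.

```python
import xml.etree.ElementTree as ET

# A made-up record using application-defined element names.
record = """<STUDENT>
  <STUDENT-NUMBER>12345</STUDENT-NUMBER>
  <NAME>A. Student</NAME>
</STUDENT>"""

root = ET.fromstring(record)

# A generic XML parser can read the data without knowing the vocabulary in advance.
student_number = root.findtext("STUDENT-NUMBER")
fields = {child.tag: child.text for child in root}

print(student_number)
print(fields)
```

The same parser would handle `<PART-NO>` or any other vocabulary, which is the evolvability that fixed-vocabulary HTML lacks.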
XML Concepts (1)
Well-formed XML resources:
• Make end-tags explicit: <li>...</li>
• Make empty elements explicit: <img ... />
• Quote attributes: <img src="logo.gif" height="20" />
• Use consistent upper/lower case: <p> and <P> are different
XML Namespaces:
Mechanism for ensuring unique XML elements:
<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>
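The well-formedness rules and the effect of a namespace can be checked with any XML parser. The sketch below uses Python's standard ElementTree; the helper name `is_well_formed` is hypothetical, and the namespace example assumes the `xmlns:prefix` attribute form defined in the W3C Namespaces in XML Recommendation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = '<li><img src="logo.gif" height="20" /></li>'
mixed_case = "<p>Hello</P>"             # <p> and <P> are different elements
unquoted = "<img src=logo.gif />"       # attribute values must be quoted
unclosed = "<ul><li>One</ul>"           # end-tag for <li> is missing

# Namespace example: the prefix i is bound to a URI, making PART unique.
doc = '<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>'
part = ET.fromstring(doc)[0]

print(is_well_formed(good), is_well_formed(mixed_case))
print(part.tag, part.text)
```

ElementTree reports the qualified name as the namespace URI in braces followed by the local name, so two `PART` elements from different vocabularies can never be confused.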
XML Concepts (2)
XML Schemas
• Allow constraints to be applied on XML attributes
• Express shared vocabularies and allow machines to carry out rules made by people
• Richer than DTDs
• See <http://www.w3.org/XML/Schema>
XSLT
• A language for transforming XML from one DTD to another, or to another format (e.g. PDF)
• Written in XML
• Knows about XML (e.g. tree structures, etc.)
• See <http://www.xslt.com/>
XML Concepts (3)
XLink provides sophisticated hyperlinking:
• Links that allow you to choose multiple destinations
• Bidirectional links
• Links with special behaviours:
  • Expand-in-place / Replace / Create new window
  • Link on load / Link on user action
• Link databases
• See <http://www.xml.com/pub/a/2000/09/xlink/>
XPointer
• Provides access to arbitrary portions of XML resource
• See <http://www.devshed.com/Server_Side/XML/XPointer/page1.html>
[Figure labels: England, France]
Getting to XML With XHTML
XHTML:
• HTML represented in XML
• Some small changes to HTML:
  • Elements in lowercase: <p> not <P>
  • Attributes must be quoted: <img src="logo" height="50" />
  • Elements must be closed: <p>...</p> and <img src="logo" ... />
• Gain benefits from XML
• Tools available (e.g. HTML-Kit from http://www.chami.com/html-kit/)
• See <http://www.webreference.com/xml/column6/>, <http://groups.yahoo.com/group/XHTML-L/> and <http://www.ariadne.ac.uk/issue27/web-focus/>
Note the IWMW 2002 Web site is (mostly) XHTML
CSS
CSS:
• Cascading Style Sheets
• XHTML/XML defines structure, CSS describes the appearance
• CSS 1.0 and 2.0 now W3C recommendations
• CSS 3.0 in preparation (modularised)
• We should be using CSS:
  • Part of architecture
  • Ease of maintenance
  • Becoming much richer
  • Accessibility
• See <http://www.w3c.org/Style/CSS/>
SVG
SVG:
• Scalable Vector Graphics
• A language for describing two-dimensional graphics in XML
• See <http://www.w3.org/Graphics/SVG/Overview.htm8>
• Also see presentation on XML written in SVG at <http://www.w3c.org/Talks/2001/12/IH-Euroweb/W3CInTheWorldslide.svgz>
• WWW 2002 talk at <http://www.w3c.org/2002/Talks/www2002-SVG/>
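Because SVG is XML, a graphic can be produced with ordinary XML tools rather than a drawing package. The following Python sketch builds a tiny SVG document with the standard library; the shapes, sizes and colours are arbitrary choices made for the illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)   # serialise SVG as the default namespace

# Root element: a 120 x 120 drawing area.
svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "120", "height": "120"})

# Two shapes described as elements and attributes, not as pixels.
ET.SubElement(svg, f"{{{SVG_NS}}}rect",
              {"x": "10", "y": "10", "width": "100", "height": "100", "fill": "silver"})
ET.SubElement(svg, f"{{{SVG_NS}}}circle",
              {"cx": "60", "cy": "60", "r": "40", "fill": "navy"})

svg_text = ET.tostring(svg, encoding="unicode")
print(svg_text)
```

Saving `svg_text` to a file with a `.svg` extension should give an image that an SVG-capable viewer can scale without loss of quality.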
SVG Example
http://www.karto.ethz.ch/neumann/cartography/vienna/
SVG and XSLT
This example:
• Originally written in Java
• Author realised that XSLT would be easier
• Uses SVG for chess board and pieces
• Uses XSLT to move pieces
http://people.w3.org/maxf/ChessGML/
CML, SVG and XSLT
http://www.adobe.com/svg/demos/cml2svg/html/index.html
A molecule described in CML can be transformed using XSLT into SVG, allowing it to be displayed and manipulated
SMIL
SMIL:
• Synchronized Multimedia Integration Language
• A language for authoring of interactive audiovisual presentations
• Allows you to synchronize text, images, audio and video in a document
• An XML Application
• See <http://www.w3c.org/AudioVideo/>
SMIL Example
http://www.kevlindev.com/tutorials/basics/animation/svg_smil/index.htm
http://www.reseau.it/smil/smilapp_en.html
MathML
MathML:
• An XML application for maths
• Various plugins, dedicated readers, etc.
• Mozilla renders natively
See <http://www.mozilla.org/projects/mathml/>
Modularisation
How can you:
• Include XML resources such as MathML, ChemML, etc. in XHTML documents?
• Provide a subset of XHTML features in browsers on devices such as mobile phones, PDAs, etc.?
The answer is:
• XHTML modularisation (modularization)
• See <http://www.w3.org/TR/xhtml-modularization/> and <http://www.xml.com/pub/a/2002/01/16/xhtml-m12n.html>
Addressing (1)
URLs have limitations:
• Lack of long-term persistency
  • University changes name, or department shut down or merged
  • Directory structure reorganised
• Inability to support multiple versions (mirroring)
URIs:
• Were once just the address of a resource, so moving a resource was annoying but not critical
• With the development of “Web services”, structured resources, B2B communications, etc. the availability of URIs will be of great importance
Addressing (2)
Solutions:
• Unique identifiers possible, but resolution difficult
• Solutions include DOIs, PURLs, OpenURLs, etc.
• Interest mostly in publishing sector
• "URIs don’t break - people break them"
• Think about URL persistency & naming guidelines: <http://www.ariadne.ac.uk/issue31/web-focus/>
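The idea behind resolver-based schemes such as PURLs is a level of indirection: users cite a stable identifier, and a resolver maps it to wherever the resource currently lives. The Python sketch below shows only that principle. The `Resolver` class, the identifier and the example.ac.uk URLs are all hypothetical and do not reflect how any particular resolution service is implemented.

```python
class Resolver:
    """Toy resolver mapping persistent identifiers to current locations."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, persistent_id: str, url: str) -> None:
        # Registering the same identifier again simply updates its location.
        self._locations[persistent_id] = url

    def resolve(self, persistent_id: str) -> str:
        return self._locations[persistent_id]


resolver = Resolver()
resolver.register("/net/example/prospectus",
                  "http://www.example.ac.uk/admissions/prospectus.html")
before = resolver.resolve("/net/example/prospectus")

# The directory structure is reorganised: only the resolver entry changes,
# while every published citation of the identifier stays valid.
resolver.register("/net/example/prospectus",
                  "http://www.example.ac.uk/study/prospectus/")
after = resolver.resolve("/net/example/prospectus")

print(before)
print(after)
```

The hard part noted on the slide is not the lookup itself but running and maintaining the resolution service over the long term.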
Transport - The Original Roadmap
HTTP/0.9 and HTTP/1.0:
• Design flaws and implementation problems
HTTP/1.1:
• Addresses some of these problems
• 60% server support
• Performance benefits! (60% packet traffic reduction)
• Is acting as fire-fighter
• Not sufficiently flexible or extensible
HTTP/NG:
• Radical redesign using object-oriented technologies
• Undergoing trials
• Gradual transition (using proxies)
Transport - Today
Today:
• Responsibility for development moved from W3C to IETF
• Little progress with HTTP/NG
• Problems with HTTP/1.1:
  • Lengthy (176-page) specification without much explicit rationale for design decisions
  • Environment has become more complex
  • Lack of a clean underlying data model
  • …
• See “Clarifying the Fundamentals of HTTP” <http://www2002.org/CDROM/refereed/444/>
SOAP
SOAP:
• Simple Object Access Protocol
• Facilitates development of machine-to-machine communications using Web protocols by providing a richer XML-based messaging mechanism
• A protocol for invoking methods on servers, services, components and objects
• Codifies existing practice of using XML and HTTP as a method invocation mechanism
• See FAQ at <http://www.develop.com/soap/soapfaq.htm>
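A SOAP method invocation is an XML envelope that would normally be sent as the body of an HTTP POST. The Python sketch below only builds such an envelope. The envelope namespace is the SOAP 1.1 one; the service namespace, the method name `GetCourse` and its parameter are hypothetical, invented for the illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/courses"               # hypothetical service namespace


def build_soap_request(method: str, params: dict[str, str]) -> bytes:
    """Build a SOAP envelope that invokes `method` with the given parameters."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The method call is an element in the service's own namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


payload = build_soap_request("GetCourse", {"code": "CS101"})
print(payload.decode("utf-8"))
```

The receiving server parses the Body, dispatches on the method element and returns its result in another envelope, so the whole exchange is machine-to-machine XML over HTTP.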
Metadata
Metadata - the missing architectural component from the initial implementation of the web
Architectural components:
• Addressing: URL
• Data format: HTML
• Transport: HTTP
• Metadata: RDF (PICS, TCN, MCF, DSig, DC, ...)
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management
Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertions:
  • This page is from the University of Bath
  • This page is a legally-binding list of courses provided by the University
P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users
Note that discussions about additional rights management metadata are currently taking place
RDF
RDF (Resource Description Framework):
• Highlight of WWW 7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)
• Applications include:
  • cataloging resources
  • resource discovery
  • electronic commerce
  • intelligent agents
  • digital signatures
  • content rating
  • intellectual property rights
  • privacy
• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>
RDF Model
RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model
[Figure: RDF Data Model. A Resource is linked to a Value by a PropertyType; for example, page.html has Cost £0.05 and ValidUntil 23-Mar-99. A second diagram labels the same example with Property, InstanceOf, PropObj, PropName and Value.]
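The data model in the figure reduces to statements of the form (resource, property, value). A minimal Python sketch, using the values from the slide and a hypothetical helper `properties_of`, shows how such statements can be stored and queried. It is a plain list of tuples and not an RDF library.

```python
# Each statement is a (resource, property, value) triple, as in the slide's example.
triples = [
    ("page.html", "Cost", "£0.05"),
    ("page.html", "ValidUntil", "23-Mar-99"),
]


def properties_of(resource: str, statements: list[tuple[str, str, str]]) -> dict[str, str]:
    """Collect every property/value pair asserted about one resource."""
    return {prop: value for subject, prop, value in statements if subject == resource}


description = properties_of("page.html", triples)
print(description)
```

Because every statement has the same shape, metadata from different vocabularies can be merged simply by concatenating the lists.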
Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF. Mozilla supports site maps in RDF, as well as bookmarks and history lists. See Netscape's or HotWired home page for a link to the RDF file.
[Figure: trusted 3rd party metadata and embedded metadata (e.g. sitemaps). Image from http://purl.oclc.org/net/eric/talks/www7/devday/]
RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust
But:
• Is it too complex?
• Is it the right approach?
RSS – An XML/RDF Application
RSS (Rich / RDF Site Summary):
• Initially XML, now an RDF application
• Used for news feeds
• Lightweight approach that we should be investigating (e.g. see news page on IWMW 2002 Web site)
See example of an RSS authoring tool and parser at <http://rssxpress.ukoln.ac.uk/>
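To show how lightweight RSS is, the Python sketch below parses a small RSS 1.0 (RDF-based) feed with the standard library. The feed content and the example.ac.uk URLs are invented for the illustration; a real feed would be fetched over HTTP rather than embedded in the script.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}

# A made-up two-item feed in RSS 1.0 syntax.
feed = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.example.ac.uk/news.rss">
    <title>Example University News</title>
    <link>http://www.example.ac.uk/news/</link>
    <description>Local news items</description>
  </channel>
  <item rdf:about="http://www.example.ac.uk/news/1">
    <title>Workshop programme published</title>
    <link>http://www.example.ac.uk/news/1</link>
  </item>
  <item rdf:about="http://www.example.ac.uk/news/2">
    <title>New Web server installed</title>
    <link>http://www.example.ac.uk/news/2</link>
  </item>
</rdf:RDF>"""

root = ET.fromstring(feed)
channel_title = root.findtext("rss:channel/rss:title", namespaces=NS)

# Each item becomes a (title, link) pair ready to be rendered on another site.
items = [
    (item.findtext("rss:title", namespaces=NS), item.findtext("rss:link", namespaces=NS))
    for item in root.findall("rss:item", NS)
]

print(channel_title)
for title, link in items:
    print(f"{title} <{link}>")
```

A portal or departmental page could run a script like this periodically and embed the resulting headlines, with no human re-keying of the news.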
Model For News Feeds
Good For User
The end user can choose her news feeds, including local news, news from JISC services and news from third parties
Good For Service
The service can choose its own information flow model. Its news is disseminated automatically.
[Figure: RSS feeds from the institution (e.g. Bath), the community (e.g. MIDAS) and external sources (e.g. BBC) supply Local News, JISC News and National News. Sources shown: XHTML converted to RSS; structured database converted to RSS; Zope CMS outputs to RSS & XHTML.]
What About Tomorrow?
Two interesting areas:
The Semantic Web
• Will allow intelligent agents to know about resources
• AI and ontologists meet the Web
• Uses RDF (Resource Description Framework) – W3C’s framework for metadata
• Some concerns over scale of problem
• See <http://www.w3.org/2001/sw/>
Web Services
• Highlight of the WWW 10 and WWW 2002 conferences
Web Services
The Web:
• Initially used for viewing static resources
• Then interactive services built (e.g. e-learning)
We now want:
• Programmable Web services which can be used by other Web services using standard Web protocols
We have experience of the first generation of externally-hosted Web services (stats services, voting systems, etc.) - see <http://www.ariadne.ac.uk/issue23/web-focus/>.
The next generation will be programmable and machine-understandable
Note that concerns over outsourcing may be an issue
Example
Some examples at gotdotnet.com:
• Mailsender
• Thumbnail Generator: http://www.gotdotnet.com/playground/services/thumbnailgen.aspx
Concepts have been around for some time (see Auditing & Evaluating Web Sites workshop)
Now being standardised (UDDI, WSDL, SOAP, …)
We’ve Been Here Before
Reusable components available on the network:
• Sounds like COM/DCOM, CORBA, etc. for reusable program components
Network services for use within a community:
• Sounds like JISCmail, RDN, EDINA, MIMAS, BIDS, Mirror Service and other JISC Services
• It’s outsourcing – but it’s OK!
Web Services And UK HE / FE Communities
Sounds like a great idea:
• We’ve the organisational framework to develop national services (JISC, etc.)
• We’ve got the network
• We’ve a community which is willing to exploit centrally-provided services and wants to avoid reinventing the wheel (haven’t we?)
Currently...
[Figure: the end user reaches local, national and international content through separate Web interfaces]
We should be moving away from providing separate Web services with their own interfaces …
Currently...
[Figure: the end user, separate services for Collection Description (e.g. Agora), User Profile (e.g. Headline) and Authentication (Athens), and local, national and international content, each behind its own Web interface]
… and separate metadata repositories and access services (which are sometimes centralised) …
Agora and Headline are eLib Hybrid libraries
Future...
[Figure: the end user reaches Content via brokered access provided by an institutional portal (MLE, …), drawing on Metadata Services / Access (Web) Services (user profile, collection description, authentication) and Application Services? (bookmarks, spell-checker)]
.. and move to Web-accessible, machine-understandable Web services as well as seamless access to content
Other W3C Areas
See:
• W3C site map at <http://www.w3c.org/Help/siteindex>
• TimBL’s Web Design Issues at <http://www.w3c.org/DesignIssues>
• Web Architecture from 50,000 feet at <http://www.w3.org/DesignIssues/Architecture.html>
Architectures
Let us consider the following areas:
• Content Management
• Systems Architecture
• Access (Browser support)
Position Today
What should we be doing today?
• Move away from creating new content in HTML
• Move to XHTML as part of the migration
• Deploy XML applications
• Store structured information in a neutral database
• Use a CMS to manage our content
• Deploy B2B applications (such as RSS) to avoid human bottlenecks
Note that these are aspirations. We will, of course, be constrained by existing systems, resource implications, vested interests, inertia, etc.
The CMS To The Rescue
HTML authoring tools have limitations (as has HTML). A CMS (Content Management System):
• Allows fragments to be managed
• Allows collections to be managed
• Allows resources to be stored in a neutral format (backend database)
• Allows resources to be reused
• Often provides access control
• Often provides workflow processes and project management
Issues
• CMS can be expensive
• CMS can be free but have support implications
• Which one to choose?
Content Management
Storing resources in HTML and GIF/JPEG:
• Is easy to do and is a low cost solution
• Makes reuse and management of resources difficult
[Figure: resources held as XML and TIFF / … are converted on the fly or in batch to HTML, WML and GIF / JPEG, with user-agent negotiation selecting what is delivered]
Content Management System for:
• Management of content (content maintenance, metadata management, access rights, project management, …)
• Delivery of content (e.g. user-agent negotiation, alternative file formats [such as WML], etc.)
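User-agent negotiation from a single neutral source can be sketched as follows. This Python example is a simplified illustration: the functions `choose_format` and `render` are hypothetical, the negotiation looks only at the HTTP Accept header, and a real CMS would use proper templates and honour quality values.

```python
from html import escape


def choose_format(accept_header: str) -> str:
    """Pick an output format from the client's Accept header."""
    accepted = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    # WAP devices advertise the WML media type; everything else gets HTML.
    if "text/vnd.wap.wml" in accepted:
        return "wml"
    return "html"


def render(title: str, body: str, fmt: str) -> str:
    """Render the same neutral content in the negotiated format."""
    title, body = escape(title), escape(body)
    if fmt == "wml":
        return f'<wml><card title="{title}"><p>{body}</p></card></wml>'
    return f"<html><head><title>{title}</title></head><body><p>{body}</p></body></html>"


phone = choose_format("text/vnd.wap.wml, image/vnd.wap.wbmp")
desktop = choose_format("text/html, application/xhtml+xml;q=0.9")

print(render("News", "Workshop programme published", phone))
print(render("News", "Workshop programme published", desktop))
```

The content is written once; only the final rendering step differs per device, which is the reuse argument made on the slide.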
Systems Architecture
Issues for you to consider:
• Operating System: Should you go for a Unix OS or Windows NT? If Unix, should you go for Linux?
• Open Source vs Licensed Solution: Should you go for an open source solution or buy a licensed application?
• Package vs Do It Yourself: Should you make use of a pre-packaged solution or develop your own solution based on a toolkit (e.g. database, scripting language, …)?
There are no global solutions – your choice should be based on expertise available locally, resourcing issues, discussions with partners, solutions providers, etc.
Browser Issues
Which approach to browser issues should you take?
• Web sites should be usable in old browsers as these are still in use and we aim to maximise access. Therefore you should deliver HTML 3.2 / 4.0 and avoid technologies such as JavaScript and CSS.
• Old browsers are broken and fail to implement new technologies which provide (a) richer functionality (b) support for new devices and (c) better support for people with disabilities. Therefore you should use the latest stable versions of HTML (XHTML), CSS, etc.
NOTE
• Use of ‘clean’ HTML should degrade gracefully
• XHTML is a useful transition
• User-agent negotiation may be relevant
QUESTION
• Should organisations / community implement a browser policy?
Conclusions
To conclude:
• Standards are important
• HTML won’t do the job
• XHTML is a useful transition
• Many new standards being developed
• Need to keep up-to-date and avoid developing systems with built-in obsolescence
• We’ll need a CMS to manage richly functional institutional Web services
• “Web services” should be important – and we shouldn’t be too concerned about using remote services