1
Metadata for Joined-up Government
Paul Miller
Interoperability Focus
UK Office for Library & Information Networking (UKOLN)
[email protected] http://www.ukoln.ac.uk/
UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher and Further Education Funding Councils, as well as by project funding from the JISC and the EU. UKOLN also receives support from the Universities of Bath and Hull where staff are based.
2
me.gov
“Not me, ‘Gov”
3
my.gov
my.schools.gov
my.health.gov
my.environment.gov
my.library.gov
my.trains.gov
my.farming.gov
4
The Vision thing
Vision is two–fold:
• Access to government for the Citizen
– me.gov
– NELfH
– People’s Network, etc.
• Access to government by government
– Information Asset Register
– GSI
– Joined–up Government.
5
The Premise
Government needs to be visible on the Internet
• Use of metadata will increase recall from the major commercial search engines
– Not really…
• Use of metadata will increase recall from customised search engines deployed on government Portal sites
– Absolutely.
6
What is ‘Metadata’?
– meaningless jargon
– or a fashionable, and terribly misused, term for what we’ve always done
– or “a means of turning data into information”
– and “data about data”
– and the name of an author (‘William Golding’)
– and the title of a book (‘The Name of the Rose’).
7
The Portal mentality
Portals are becoming very common…
…but what are they?
• In HE and FE’s DNER, we distinguish between Portals and Gateways;
– A Portal is ‘deep’, and provides access to the contents of a set of resources
– A Gateway is ‘shallow’, and provides descriptions of the contents of a set of resources.
8
The Portal mentality
In the wider Web world, they might be defined more as:
“Portal: a Web-based network service that provides access to a range of heterogeneous network services, local and remote, structured and unstructured. Such network services might typically include resource discovery services, email access and online discussion fora. Portals are aimed at human end-users using common Web 'standards' such as HTTP, HTML, Java and JavaScript.”
(DRAFT RDN definition)
9
Portals and Government
There need not be only one government portal:
• me/y.gov
– General public face of Government
• me/y.schools.gov
– Interface tailored to primary and secondary education ‘customers’, drawing information from DfEE, DSS (?), Benefits Agency, etc.
• etc.
All presenting information drawn from a common data pool, according to common, or interoperable, standards…
10
A little language...
• Semantics: standardisation of content
– “Let’s talk English”
– “cat milk sat drank mat”
• Structure: standardisation of form
– “Here’s how to make a sentence”
– “Cat sat on mat. Drank milk.”
• Syntax: standardisation of expression
– “These are the rules of grammar”
– “The cat sat on the mat. It drank some milk.”
11
Semantics: the Dublin Core
• An attempt to improve resource discovery on the Web
– now adopted more broadly
• Building an interdisciplinary consensus about a core element set for resource discovery
– simple and intuitive
– cross–disciplinary, not just libraries!!
– international
– open and consensual
– flexible.
See http://purl.org/dc/
12
Semantics: the Dublin Core
• 15 elements of descriptive metadata
• All elements optional
• All elements repeatable
• The whole is extensible
– offers a starting point for semantically richer descriptions
• Interdisciplinary
– libraries, government, museums, archives…
• International
– available in more than 20 languages, with more on the way...
13
Semantics: the Dublin Core
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
http://purl.org/dc/
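One common way to carry these elements is as <meta> tags in the header of a Web page. The following is a minimal sketch in Python, assuming the "DC." name prefix convention for HTML embedding. The `dc_meta_tags` helper is hypothetical and the Subject values are illustrative only.

```python
from html import escape

# The 15 Dublin Core elements, as listed on the slide.
DC_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]


def dc_meta_tags(record):
    """Render a record as HTML <meta> tags (hypothetical helper).

    `record` maps an element name to a list of values, because every
    element is optional and repeatable.
    """
    lines = []
    for element in DC_ELEMENTS:
        for value in record.get(element, []):
            lines.append(
                '<meta name="DC.%s" content="%s">'
                % (element, escape(value, quote=True))
            )
    return "\n".join(lines)


# Illustrative record; the Subject terms are assumptions.
example = {
    "Title": ["Metadata for Joined-up Government"],
    "Creator": ["Paul Miller"],
    "Subject": ["metadata", "interoperability"],
}
print(dc_meta_tags(example))
```

Elements missing from the record are simply skipped, and a repeated element (Subject here) yields one tag per value.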
14
Syntax: XML
• eXtensible Markup Language
• World Wide Web Consortium recommendation
• Simplified subset of SGML for use on Web
• Addresses HTML’s lack of evolvability
• Easily extended
• Supported by major vendors
• Increasingly used as a transfer syntax, but capable of far more….
See http://www.w3.org/XML/
15
Structure: RDF
• Resource Description Framework
• W3C Recommendation
• Improves upon XML, HTML, PICS…
• Machine understandable metadata!
• Usually XML as syntax
• Locally defined semantics
• Supports structure
• Increasing interest.
See http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD–dc–rdf/
See http://www.w3.org/RDF/
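To make "usually XML as syntax" concrete, here is a rough sketch that wraps Dublin Core properties in an RDF description and serialises it as XML, using Python's standard `xml.etree.ElementTree` module. The namespace URIs are the ones published by the W3C and DCMI today, and are an assumption with respect to the drafts current when these slides were written. The `describe` helper and the property values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs (assumed: the currently published RDF and DC 1.1 namespaces).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_url, properties):
    """Build an RDF/XML description of one resource (hypothetical helper).

    `properties` is a list of (dublin_core_element, value) pairs.
    """
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root,
        "{%s}Description" % RDF_NS,
        {"{%s}about" % RDF_NS: resource_url},
    )
    for name, value in properties:
        child = ET.SubElement(description, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
xml_text = describe(
    "http://www.ukoln.ac.uk/",
    [("title", "UKOLN home page"), ("language", "en")],
)
print(xml_text)
```

The semantics (Dublin Core), the structure (an RDF description of a resource) and the syntax (XML) stay separable: the same properties could be serialised differently without changing their meaning.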
16
What do we need?
• Common Semantics: Definitely.
– Without these, we cannot be sure that different sets of metadata are describing the same concept.
• Common Syntax: Almost certainly.
– Although data need not always be stored this way; simply converted for transfer.
• Common Structure: Almost certainly.
– Although RDF might not be ready for this. A conceptual structure (a model) might be enough.
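The point that data can be stored locally in any form and converted only for transfer can be sketched as a simple crosswalk. This is an illustration under stated assumptions: the local field names, the `CROSSWALK` table and the `to_dublin_core` helper are all hypothetical, standing in for whatever mapping a department agrees with its partners.

```python
# Hypothetical mapping from one department's local field names
# to the shared Dublin Core element names.
CROSSWALK = {
    "doc_name": "Title",
    "author": "Creator",
    "issued": "Date",
    "keywords": "Subject",
}


def to_dublin_core(local_record):
    """Convert a locally stored record to Dublin Core for transfer."""
    converted = {}
    for field, value in local_record.items():
        element = CROSSWALK.get(field)
        if element is None:
            continue  # no agreed equivalent, so the field is not transferred
        values = value if isinstance(value, list) else [value]
        converted.setdefault(element, []).extend(values)
    return converted


# Illustrative local record.
local = {
    "doc_name": "Annual Report",
    "author": "Example Agency",
    "keywords": ["finance", "audit"],
    "internal_ref": "X-17",
}
shared = to_dublin_core(local)
print(shared)
```

The local database keeps its own field names; only the exported view uses the common semantics.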
17
Issues
• Dublin Core: Qualification mechanisms not fully finalised yet…
• XML: XML Query specification probably a year away
• RDF: Few real–world implementations or applications. Query specification not even begun.
18
The need for Guidelines
• Many standards are flexible and permissive.
• Dublin Core, for example, actually does little more than define 15 optional free–text fields.
• We can usefully specify
– Minimum set of required fields
– Controlled term lists (Language, Type, etc.)
– Best practice
• Without this, we probably won’t do much good…
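As a sketch of what such guidelines might look like once written down, the check below enforces a minimum set of required fields and controlled term lists for Language and Type. The particular required fields and permitted terms are assumptions chosen for illustration, not agreed government guidelines, and `check_record` is a hypothetical helper.

```python
# Assumed guideline: these elements must always be present.
REQUIRED = {"Title", "Creator", "Date"}

# Assumed controlled term lists for two elements.
CONTROLLED = {
    "Language": {"en", "cy", "gd"},
    "Type": {"Text", "Image", "Dataset"},
}


def check_record(record):
    """Return a list of guideline problems; an empty list means the record passes."""
    problems = []
    for element in sorted(REQUIRED):
        if not record.get(element):
            problems.append("missing required element: " + element)
    for element, allowed in CONTROLLED.items():
        for value in record.get(element, []):
            if value not in allowed:
                problems.append(
                    "%s value not in term list: %s" % (element, value)
                )
    return problems


# Illustrative record with two missing fields and one free-text Language value.
draft = {
    "Title": ["Annual Report"],
    "Language": ["english"],
    "Type": ["Text"],
}
for problem in check_record(draft):
    print(problem)
```

A check of this kind is also the sort of thing a cataloguing tool could run automatically as records are created.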
19
The need for tools
• Tools can…
– Ensure enforcement of Cataloguing Guidelines
– Automate aspects of the process
– Incorporate agreed term lists
– Facilitate the update process.
20
The need for tools
See http://www.ukoln.ac.uk/metadata/dcdot/
21
The Z thing
• Seen as a cornerstone of the DNER
• Widely used by GILS services
• Part of most modern library systems
• Used in the NGDF Gateway
• Useful across Government when/if we want to provide access to the current contents of diverse distributed databases.
See http://www.ariadne.ac.uk/issue21/z3950/
22
What is Z39.50?
• ANSI/NISO Z39.50–1995, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification
• ISO 23950:1998, Information and Documentation — Information Retrieval (Z39.50) — Application Service Definition and Protocol Specification.
See http://lcweb.loc.gov/z3950/agency/1995doce.html
23
What is Z39.50?
“This standard specifies a client/server based protocol for Information Retrieval. It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a ‘help’ facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user.”
(Z39.50–1995, page 0).
See http://lcweb.loc.gov/z3950/agency/1995doce.html
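The four operations named in the quotation (search a database, retrieve records from the result, scan a term list, sort a result set) can be illustrated with a toy in-memory stand-in. This is not an implementation of the Z39.50 protocol: there is no networking and no standard encoding, and the `ToyTarget` class, its method names and the sample records are all hypothetical.

```python
class ToyTarget:
    """In-memory stand-in for a Target (server). Not the real protocol."""

    def __init__(self, records):
        self.records = list(records)
        self.result_sets = {}  # result sets are held on the server side

    def search(self, name, field, term):
        """Run a query, keep the hits as a named result set, return the hit count."""
        hits = [
            r for r in self.records
            if term.lower() in r.get(field, "").lower()
        ]
        self.result_sets[name] = hits
        return len(hits)

    def present(self, name, start, count):
        """Retrieve `count` records from a result set, from 1-based position `start`."""
        return self.result_sets[name][start - 1:start - 1 + count]

    def scan(self, field):
        """List the distinct terms held in one field, in sorted order."""
        return sorted({r[field] for r in self.records if field in r})

    def sort(self, name, field):
        """Reorder a stored result set by one field."""
        self.result_sets[name].sort(key=lambda r: r.get(field, ""))


# Illustrative records.
target = ToyTarget([
    {"title": "Lord of the Flies", "creator": "William Golding"},
    {"title": "The Name of the Rose", "creator": "Umberto Eco"},
    {"title": "The Inheritors", "creator": "William Golding"},
])
hit_count = target.search("default", "creator", "golding")
target.sort("default", "title")
first = target.present("default", 1, 1)
creators = target.scan("creator")
print(hit_count, first, creators)
```

The client first learns only how many records matched, then asks for the records it wants. Keeping the result set on the server is what makes that two-step pattern possible.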
24
Some gory details…
• Z39.50 follows client/server model
• But calls them Origin and Target
Client/origin
Server/target
25
Client/Server architecture
26
Client/Server architecture
27
Using Z39.50
• Z39.50 widely deployed in the library sector and elsewhere, although often invisibly
• The Origin can be either a human user or a second Origin computer
– e.g. Z39.50 portals, summing resources from multiple targets
• Users access Z39.50 Targets using proprietary clients or, increasingly, via web interfaces
– e.g. WinWillow, ZNavigator, many WOPACs.
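The portal case, where one Origin sums resources from multiple Targets, can be sketched as a simple fan-out and merge. This is an illustration only: each target is represented by a plain Python function rather than a real Z39.50 connection, and `search_all`, `make_target`, the target names and the titles are hypothetical.

```python
def search_all(targets, term):
    """Send one query to every target and merge the hits, tagged by source."""
    merged = []
    for name, search in targets.items():
        try:
            hits = search(term)
        except Exception:
            # Assumed policy: skip a target that fails rather than
            # failing the whole portal query.
            continue
        for hit in hits:
            merged.append({"source": name, "title": hit})
    return merged


def make_target(titles):
    """Build a stand-in search function over a fixed list of titles."""
    def search(term):
        return [t for t in titles if term.lower() in t.lower()]
    return search


# Illustrative targets and holdings.
targets = {
    "library_a": make_target(["Metadata basics", "Cataloguing rules"]),
    "archive_b": make_target(["Government metadata survey"]),
}
results = search_all(targets, "metadata")
for row in results:
    print(row["source"], "-", row["title"])
```

The user sees one merged result list and need not know how many databases were queried, which is the sense in which Z39.50 is often deployed invisibly.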
28
Using Z39.50
© Arts & Humanities Data Service
29
Using Z39.50
© Arts & Humanities Data Service
30
Using Z39.50
© University of California
31
Using Z39.50
© University of California