School of Electronics and Computer Science Knowledge Repositories: The Next 10 Years Professor Nigel Shadbolt


Post on 27-Mar-2015


Page 1:

School of Electronics and Computer Science

Knowledge Repositories: The Next 10 Years

Professor Nigel Shadbolt

Page 2:

Drivers for Change

The Open Access debate and the Open Archives Initiative

Moore’s Law

The Semantic Web

The Nature of Research Publications

Page 3:

Drivers for Change

The Open Access debate and the Open Archives Initiative

Moore’s Law

The Semantic Web

The Nature of Research Publications

Page 4:

Faster and Smaller

Devices are getting smaller and faster all the time

Moore’s Law has held for 40 years

This leads to orders of magnitude:

Increase in power

Increase in memory

Decrease in size

Decrease in cost

Constant migration and obsolescence

Our processors will have a very limited shelf life

Our storage does too

Our physics does too
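
The "orders of magnitude" claim can be checked with a little arithmetic. The sketch below assumes a doubling period of two years, which is a common reading of Moore's Law but is not stated on the slide; only the 40-year span comes from the slide.

```python
import math

years = 40                    # from the slide: "held for 40 years"
doubling_period_years = 2.0   # assumption: one doubling every two years

# Number of doublings over the period, and the resulting growth factor.
doublings = years / doubling_period_years
growth_factor = 2 ** doublings
orders_of_magnitude = math.log10(growth_factor)

print(f"{doublings:.0f} doublings -> factor of {growth_factor:,.0f} "
      f"(about {orders_of_magnitude:.1f} orders of magnitude)")
```

Under that assumption, 40 years gives 20 doublings, a factor of roughly a million. A shorter or longer doubling period changes the figure, but not the conclusion that hardware generations turn over many times within the life of a repository.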

Page 5:

Drivers for Change

The Open Access debate and the Open Archives Initiative

Moore’s Law

The Semantic Web

The Nature of Research Publications

Page 6:

Making the Web Semantic…

Page 7:

Via meta content…

This is an object of type event, and this is its title

This is the URL of the web page for the event

This is an object of type photograph, and the photograph is of Tim Berners-Lee

Tim Berners-Lee is an invited speaker at the event

That is machine readable…
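
The annotations on this slide amount to subject-predicate-object statements. A minimal sketch in Python follows, using plain tuples rather than any particular RDF library. The identifiers (`ex:event1`, `ex:photo1`), the predicate names, the event title and the URL are all hypothetical, invented only to illustrate the idea.

```python
# Hypothetical identifiers and predicate names, chosen only for illustration.
triples = [
    ("ex:event1", "rdf:type", "ex:Event"),
    ("ex:event1", "ex:title", "Example Event"),
    ("ex:event1", "ex:homepage", "http://example.org/event"),
    ("ex:photo1", "rdf:type", "ex:Photograph"),
    ("ex:photo1", "ex:depicts", "ex:TimBernersLee"),
    ("ex:TimBernersLee", "ex:invitedSpeakerAt", "ex:event1"),
]


def objects(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# A program can now follow the links: who is in the photograph,
# and at which event is that person speaking?
for person in objects("ex:photo1", "ex:depicts"):
    print(person, "is an invited speaker at", objects(person, "ex:invitedSpeakerAt"))
```

Because each statement names its subject and relationship explicitly, software can follow the chain from photograph to person to event without parsing free text.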

Page 8:

Can Annotate Anything

Publications…

Databases…

Metadata on scientific structures

Web data set (XHTML)

Page 9:

The SW Community: Structured Spaces

Linkage of heterogeneous information

• web content

• databases

• meta-data repository

• multimedia

Via ontologies as information mediation structures

Using Semantic Web languages

Oncogene(MYC):
    Found_In_Organism(Human).
    Gene_Has_Function(Transcriptional_Regulation).
    Gene_Has_Function(Gene_Transcription).
    In_Chromosomal_Location(8q24).
    Gene_Associated_With_Disease(Burkitts_Lymphoma).

NCI Cancer Ontology (OWL)

<meta>
  <classifications>
    <classification type="MYC" subtype="old_arx_id">bcr-2-1-059</classification>
  </classifications>
</meta>

BioMedCentral Metadata (XML)

Web data set (XHTML)

Vocabulary (RDFS)
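
One way to read this slide is that the ontology supplies the key that joins otherwise unrelated sources. The sketch below links the publication metadata to the ontology facts through the shared term `MYC`. Treating the `type` attribute as a gene symbol, and joining on it, is an assumption made for illustration; the slide does not say how the linkage is implemented. The curly quote in the slide's XML is normalised to a straight quote so that the fragment parses.

```python
import xml.etree.ElementTree as ET

# Ontology facts from the slide, as (predicate, value) pairs keyed by gene.
ontology = {
    "MYC": [
        ("Found_In_Organism", "Human"),
        ("Gene_Has_Function", "Transcriptional_Regulation"),
        ("Gene_Has_Function", "Gene_Transcription"),
        ("In_Chromosomal_Location", "8q24"),
        ("Gene_Associated_With_Disease", "Burkitts_Lymphoma"),
    ]
}

# Publication metadata from the slide, with quote characters normalised.
metadata_xml = """
<meta>
  <classifications>
    <classification type="MYC" subtype="old_arx_id">bcr-2-1-059</classification>
  </classifications>
</meta>
""".strip()

root = ET.fromstring(metadata_xml)

# Assumption: the "type" attribute names a gene that the ontology knows about.
diseases_by_article = {}
for classification in root.iter("classification"):
    gene = classification.get("type")
    if gene in ontology:
        diseases_by_article[classification.text] = [
            value
            for predicate, value in ontology[gene]
            if predicate == "Gene_Associated_With_Disease"
        ]

print(diseases_by_article)
```

The article record never mentions a disease, yet the join through the ontology term lets a query for a disease reach the article.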

Page 10:

Ontologies: Fundamental Building Blocks of the Semantic Web

Page 11:

The Ontology

A shared conceptualisation of a domain

Provides the semantic backbone

Lightweight, and deployed using a W3C-recommended standard language

Page 12:

Genetics: Gene Ontology

One of the earliest examples of the benefits of ontologies

Integration and interoperability were big wins

Specific tool support

Considerable resources invested and continuing in maintenance

Spawned more generic biological ontology efforts

Page 13:

Standards are fundamental

HTML

XML + Namespaces + XML Schema

Topic Maps

SMIL

RDF(S)

XOL

OWL

RDF

Unicode

URI

Page 14:

Advanced Knowledge Technologies IRC

AKT started September 2000: 6 years, £8.8 million, funded by EPSRC

www.aktors.org

Around 65 investigators and research staff

Page 15:

Infrastructures and Components

Built core infrastructures

Constructed component technologies that cover the knowledge life cycle in a number of applications

Page 16:

Exemplar Technology: ClassAKT

Page 17:

Semantic Spaces: Integrating Knowledge Technologies

Page 18:

The CS AKTive Space: International Semantic Web Challenge Winner

24/7 update of content

Content continually harvested and acquired against community agreed ontology

Easy access to information gestalts - who, what, where

Hot spots

• Institutions

• Individuals

• Topics

Impact of research

• citation services etc

• funding levels

• Changes and deltas

Dynamic Communities of Practice…

Page 19:

Components of a Solution

Information sets

Ontology to mediate information sets

Semantic Storage Capability

Query Capability on Storage

Network and graph analysis tools

Browsing and Visualisation tools
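
The first four components can be sketched in a few lines: information sets with differing field names, an ontology mapping that mediates between them, an in-memory triple store, and a pattern query over it. This is an illustrative toy, not the AKT implementation. The class, the `ex:` predicate names, the field names and the sample records are all hypothetical.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping from source-specific field names to shared ontology terms.
ONTOLOGY_MAP = {
    "author_name": "ex:hasAuthor",
    "creator": "ex:hasAuthor",
    "title": "ex:hasTitle",
}


class TripleStore:
    """A toy semantic store: a set of triples plus wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for triple in sorted(self._triples):
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


def load(store: TripleStore, record_id: str, record: dict) -> None:
    """Mediate one source record into the store via the ontology mapping."""
    for field, value in record.items():
        predicate = ONTOLOGY_MAP.get(field)
        if predicate is not None:
            store.add(record_id, predicate, value)


store = TripleStore()
# Two made-up information sets that name the same concept differently.
load(store, "ex:paper1", {"author_name": "A. Author", "title": "First Paper"})
load(store, "ex:paper2", {"creator": "A. Author", "title": "Second Paper"})

papers = [s for s, _, _ in store.match(p="ex:hasAuthor", o="A. Author")]
print(papers)
```

Once both sources are expressed in the shared vocabulary, a single query finds the author's papers regardless of which source supplied them. Graph analysis and visualisation tools would then sit on top of the same store.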

Page 20:

CS AKTive Space

Page 21:

Extending the model

Page 22:

EPSRC: Knowing what they know

data sources

gatherers and mediators

ontology

knowledge repository (triplestore)

applications

Page 23:

Visualising Interaction

Page 24:

Visualising Interaction: Programmes

Page 25:

Drivers for Change

The Open Access debate and the Open Archives Initiative

Moore’s Law

The Semantic Web

The Nature of Research and Publication

Knowledge Mapping

Page 26:

New ways of discovery: e-Science

A large part of scientific discovery is now a joint human-machine endeavour

Without considerable compute power there is no hope of progress

Examples from physics, astronomy, biology, chemistry and engineering

Page 27:

Entire E-Science Cycle: encompassing experimentation, analysis, publication, research, learning

[Figure: the e-Science cycle. Labels: Grid; E-Experimentation; E-Scientists; Graduate Students; Undergraduate Students; Virtual Learning Environment; Digital Library; Institutional Archive; Local Web Publisher; Holdings; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; Data, Metadata & Ontologies]

Page 28:

CombeChem

The need for xtl-Prints

DATA PUBLICATION

DISSEMINATION

Page 29:

Structural Eprints

Page 30:

Drivers for Change

The Open Access debate and the Open Archives Initiative

Moore’s Law

The Semantic Web

The Nature of Research and Publication

Knowledge Mapping

Page 31:

Increasing Use of Value Added Services

Page 32:

Communities of Authors

● An example of a small coauthorship network depicting collaborations among scientists at a private research institution. Newman, M. E. J. (2004)

● Web services to run over archives at varying grain sizes
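
A coauthorship network of this kind can be derived directly from the author lists held in an archive. The sketch below uses made-up author lists standing in for harvested records. It counts joint papers per author pair (edge weight) and distinct collaborators per author (degree). It is a minimal illustration of the idea, not the method used in the cited work.

```python
from collections import Counter
from itertools import combinations

# Made-up author lists standing in for records harvested from an archive.
papers = [
    ["Author A", "Author B"],
    ["Author A", "Author B", "Author C"],
    ["Author C", "Author D"],
]


def coauthor_edges(author_lists):
    """Count how many papers each unordered pair of authors wrote together."""
    edges = Counter()
    for authors in author_lists:
        # sorted(set(...)) gives each pair a canonical order and drops repeats.
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges


edges = coauthor_edges(papers)

# Degree: the number of distinct collaborators each author has.
degree = Counter()
for first, second in edges:
    degree[first] += 1
    degree[second] += 1

for (first, second), weight in sorted(edges.items()):
    print(f"{first} -- {second}: {weight} joint paper(s)")
print("Most connected:", degree.most_common(1))
```

Running the same computation over a department, an institution or a whole discipline is one way to vary the grain size.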

Page 33:

Evolving Domains: Impact Analysis

● Three time periods in the PNAS high-impact map show the progression from the basic gene and protein work and techniques that dominated the 1980s to more diverse applications in the 1990s (Boyack, Kevin W. 2004)

Page 34:

Bursting onto the scene: New Topics

● Co-word space of the top 50 highly frequent and bursty words used in the top 10% most highly cited PNAS publications in 1982-2001

Page 35:

Self Organising Maps: Topic Landscapes

● Use of k-means clustering in combination with a term dominance landscape to support semantic zooming (Skupin et al., 2004)

Page 36:

Detecting Key Moments: Pathfinder

A 624-node merged network with global pruning using Pathfinder. Chen (2004)

Page 37:

A future…

With institutional OAI at its heart…

A semantic web of knowledge

Knowledge repositories as key holdings

Knowledge mapping services increasing in range and capability

Beyond bibliometrics…