TNC2012: Federated and scholarly identity - match made in heaven?

Page 1: TNC2012 Federated and scholarly identity - match made in heaven?

Federated identity and scholarly identity - a match made in heaven?

Gudmundur A. Thorisson, PhD <[email protected]>

Research associate, University of Leicester

Guest scientist, University of Iceland

Participant in the GEN2PHEN Consortium and the ORCID Technical Working Group

This work is published under the Creative Commons Attribution license (CC BY: http://creativecommons.org/licenses/by/3.0/) which means that it can be freely copied, redistributed and adapted, as long as proper attribution is given.

TNC2012 TERENA Networking Conference, Reykjavik, May 21-24 2012

Page 2: TNC2012 Federated and scholarly identity - match made in heaven?

Overview

๏ Crash course in scholarly identity

๏ Some problems: name ambiguity and online identity fragmentation

๏ The Open Researcher & Contributor ID initiative - ORCID background, current status and roadmap

๏ Applications of ORCID in the scholarly identity landscape

๏ Some thoughts on collaboration between ORCID and identity federations

Page 3: TNC2012 Federated and scholarly identity - match made in heaven?

Scholarly identity

‣ A scientific researcher’s publication record

- Defined by “authorship” of mostly “traditional” kinds of works

- Articles in peer-reviewed journals, books, conf. proceedings

‣ The “publish or perish” culture in scientific research

- Authorship of papers in top-tier, high-impact journals is the single biggest factor in career advancement

- Not enough high-profile papers? no grant, no tenure, etc.

‣ At the heart of peer recognition / professional reputation

Page 4: TNC2012 Federated and scholarly identity - match made in heaven?
Page 5: TNC2012 Federated and scholarly identity - match made in heaven?

Digital scholarship in the 21st century

‣ Creation of online digital research outputs is an increasingly common & important part of doing scientific work

- Research datasets deposited in online repositories

- Data curation - adding value to research data

- Scientific software

- Research blogging

- Contributions to scientific articles in Wikipedia

- [and so on]

Page 6: TNC2012 Federated and scholarly identity - match made in heaven?

Big Science, Big Data

• Scientific research increasingly large-scale and data-driven

• High-profile examples

– High-energy particle physics - experiments performed in the Large Hadron Collider

– Astronomy - data from ground-based and space telescopes, the Virtual Observatory (VO)

• Doctorow, C. Big data: Welcome to the petacentre. Nature 455, 16-21 (2008). http://dx.doi.org/10.1038/455016a

Page 7: TNC2012 Federated and scholarly identity - match made in heaven?

Biological research too is increasingly “Big” and data-driven

‣ From: small-scale datasets that fit into a printed journal article

Richards, M. et al. Paleolithic and neolithic lineages in the European mitochondrial gene pool. American journal of human genetics 59, 185-203 (1996). http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1915109/

Page 8: TNC2012 Federated and scholarly identity - match made in heaven?

Biological research too is increasingly “Big” and data-driven

‣ To: large-scale collection of biological data in digital form

‣ Huge technological advances in last 5-10 years

- experimental / observations <-- gathering data with high-throughput equipment

- computer technology <-- storing & analyzing massive data volumes

‣ Example: massively-parallel sequencing

- Determine human genome sequence in <1 day - the $1000 genome

- Metagenomics: sequence *everything* in environment samples

- Large bio-specimen collections: 100,000s of individuals in disease/population biobanks

Page 9: TNC2012 Federated and scholarly identity - match made in heaven?

Prof Anthony J Brookes, GEN2PHEN coordinator

Chair, Bioinformatics and Genomics

Department of Genetics

University of Leicester, UK

http://www.gen2phen.org

Page 10: TNC2012 Federated and scholarly identity - match made in heaven?

Identifying contributors

‣ Why? So we can do:

- Attribution - link content creators with their works and attribute credit appropriately

- Discovery - who contributed to publication X? which publications has person/organization Y contributed to?

‣ What kind of contributions?

- Characterizing ‘contributorship’ by role (author, creator, analyst, reviewer) and by contribution (‘conceived of study & designed experiment’, ‘wrote paper’, ‘performed experiments’)

‣ LHC example: ~2000 ‘authors’ and ~170 institutions

Page 11: TNC2012 Federated and scholarly identity - match made in heaven?
Page 12: TNC2012 Federated and scholarly identity - match made in heaven?

Problem #1: name ambiguity

Are these authors all the same person?

G. Thorisson, University of Leicester

G. A. Thorisson, University of Leicester

G. A. Thorisson, Cold Spring Harbor Laboratory

How about these?

Or these?

J. Smith, J. Smith, J. Smith, J. Smith, J. Smith [...]

[..] ∼2/3 of the ∼6 million authors in MEDLINE share a last name and first initial with at least one other author, and an ambiguous name refers to ∼8 persons on average.

Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data (2009) vol. 3 (3)
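
To make the collision concrete, here is a minimal Python sketch (not part of the original slides) that groups author strings by last name plus first initial, the kind of key the MEDLINE figure above refers to. The function names and the two J. Smith affiliations are invented for illustration; real disambiguation needs far more than string matching.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def ambiguity_key(name: str) -> Tuple[str, str]:
    """Reduce a name such as 'G. A. Thorisson' to (last name, first initial)."""
    parts = name.replace(".", " ").split()
    return parts[-1].lower(), parts[0][0].lower()


def ambiguous_groups(records: List[Tuple[str, str]]) -> Dict[Tuple[str, str], list]:
    """Return only those keys that are shared by more than one author record."""
    groups = defaultdict(list)
    for name, affiliation in records:
        groups[ambiguity_key(name)].append((name, affiliation))
    return {key: members for key, members in groups.items() if len(members) > 1}


# The first three records come from the slide; the J. Smith
# affiliations are hypothetical placeholders.
RECORDS = [
    ("G. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "Cold Spring Harbor Laboratory"),
    ("J. Smith", "Institution A"),
    ("J. Smith", "Institution B"),
]

for key, members in ambiguous_groups(RECORDS).items():
    print(key, "->", len(members), "candidate records")
```

The sketch can only report that records collide; it cannot tell whether the three Thorisson entries are one person or three, which is the gap a unique person-level identifier is meant to close.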

Page 13: TNC2012 Federated and scholarly identity - match made in heaven?

Page 14: TNC2012 Federated and scholarly identity - match made in heaven?

Problem #1: name ambiguity

‣ Number of authors and other scholarly contributors is increasing

‣ Number & kinds of “works” they contribute to is increasing

‣ The scholarly record is broken

‣ Reliable attribution of authors and contributors is impossible without unique person-level identifiers

Page 15: TNC2012 Federated and scholarly identity - match made in heaven?

Problem #2: digital identity crisis

‣ Session title: Scientific Schizophrenia - How many identities do YOU have?

‣ Well, I have several! <-- identity crisis??

- 2x Universities I’m affiliated with

- Several scholarly/professional profile services

- LinkedIn professional profile / CV

- Twitter microblogging (for professional purposes)

- Several other author profiles that are not under my control (Web of Science, Scopus, others)

‣ Identity fragmentation - big, big mess!!!!

Page 16: TNC2012 Federated and scholarly identity - match made in heaven?

How to Make a Tackle in Rugby

Tackling in rugby is one of the most important aspects of the game. [...]

Credit: http://djamba.com/how-to-make-a-tackle-in-rugby.html

Page 17: TNC2012 Federated and scholarly identity - match made in heaven?

The Open Researcher & Contributor ID initiative

‣ ORCID is an international, interdisciplinary organization involving multiple stakeholders:

- Research institutions, libraries, funding organizations, publishers, intermediaries and individual researchers

‣ Started in late 2009 to solve the name ambiguity problem in scholarly communication.

‣ Incorporated as a non-profit with a Board of Directors in August 2010.

Page 18: TNC2012 Federated and scholarly identity - match made in heaven?

The Open Researcher & Contributor ID initiative

ORCID will work to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors through unique identifiers

Page 19: TNC2012 Federated and scholarly identity - match made in heaven?

ORCID Participants

ORCID has 328 participant organizations from across the world, 50 of which have provided sponsorship funding.

Page 20: TNC2012 Federated and scholarly identity - match made in heaven?

Given a work, tell me who is responsible for it and describe the nature of that responsibility.

Credit: Geoff Bilder http://irisc-workshop.org/wp-content/uploads/2011/09/irisc2011-geoff-bilder.ppt

Some knowledge discovery use cases

Page 21: TNC2012 Federated and scholarly identity - match made in heaven?

Page 22: TNC2012 Federated and scholarly identity - match made in heaven?

Given a contributor, tell me what works he/she has contributed to and describe the nature of the contributions.

Credit: Geoff Bilder http://irisc-workshop.org/wp-content/uploads/2011/09/irisc2011-geoff-bilder.ppt

Some knowledge discovery use cases

Page 23: TNC2012 Federated and scholarly identity - match made in heaven?

Given a contributor, tell me which other contributors are “related” to the first one and tell me the nature of that relationship.

Credit: Geoff Bilder http://irisc-workshop.org/wp-content/uploads/2011/09/irisc2011-geoff-bilder.ppt

Some knowledge discovery use cases
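
The three use cases above can be sketched as queries over a small in-memory list of contribution records. This is an illustrative sketch only: the `Contribution` type, the function names and the `person-N` / `paper-X` identifiers are all hypothetical and do not describe the ORCID data model.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Contribution:
    contributor_id: str  # hypothetical unique person identifier
    work_id: str         # hypothetical identifier of a paper, dataset, etc.
    role: str            # e.g. author, creator, analyst, reviewer


def who_is_responsible(work_id: str, records: List[Contribution]) -> List[Tuple[str, str]]:
    """Given a work, list (contributor, role) pairs."""
    return [(r.contributor_id, r.role) for r in records if r.work_id == work_id]


def works_by(contributor_id: str, records: List[Contribution]) -> List[Tuple[str, str]]:
    """Given a contributor, list (work, role) pairs."""
    return [(r.work_id, r.role) for r in records if r.contributor_id == contributor_id]


def related_contributors(contributor_id: str, records: List[Contribution]) -> Set[str]:
    """Given a contributor, find others who share at least one work with them."""
    own_works = {r.work_id for r in records if r.contributor_id == contributor_id}
    return {
        r.contributor_id
        for r in records
        if r.work_id in own_works and r.contributor_id != contributor_id
    }


RECORDS = [
    Contribution("person-1", "paper-X", "author"),
    Contribution("person-2", "paper-X", "analyst"),
    Contribution("person-1", "dataset-Y", "creator"),
    Contribution("person-3", "dataset-Y", "reviewer"),
]

print(who_is_responsible("paper-X", RECORDS))
print(works_by("person-1", RECORDS))
print(related_contributors("person-1", RECORDS))
```

Here "related" means only "contributed to the same work"; other relationships (same institution, same grant) would need extra fields. All three queries rely on `contributor_id` being unambiguous, which name strings are not.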

Page 24: TNC2012 Federated and scholarly identity - match made in heaven?

WHO CARES!?

‣ Publishers who publish researchers’ work

- Accurate author info, dealing with coauthors, generally managing the peer-review & publishing process

‣ Institutions that employ researchers

- Evaluating performance of research staff, tenure decisions

‣ Funders who give researchers money

- Which PI scientists are getting funded, who are their co-applicants, track which research outputs were produced by a given grant

‣ Researchers themselves!

- Automated CVs, receive credit, save time when submitting manuscripts to journals

Page 25: TNC2012 Federated and scholarly identity - match made in heaven?

Page 26: TNC2012 Federated and scholarly identity - match made in heaven?

Page 27: TNC2012 Federated and scholarly identity - match made in heaven?

[Figure: the identity spectrum. Labels: Self-asserted identity, Socially-validated identity, Organisationally-validated identity]

Credit: Kaliya Hamlin http://www.identitywoman.net/the-identity-spectrum ; Geoff Bilder http://about.orcid.org/sites/default/files/bilderorcidoutreachmay2012.ppt

Page 28: TNC2012 Federated and scholarly identity - match made in heaven?

Examples of publication claims by different parties

Page 29: TNC2012 Federated and scholarly identity - match made in heaven?

[Figure labels: Self-asserted identity, Socially-validated identity, Organisationally-validated identity, Automated tools, Disambiguated identity]

Credit: Geoff Bilder http://about.orcid.org/sites/default/files/bilderorcidoutreachmay2012.ppt

Disambiguation without de-duplication - Modeling authority and trust in the ORCID system http://about.orcid.org/sites/default/files/disambiguation-deduplication_wp_v4.pdf

Page 30: TNC2012 Federated and scholarly identity - match made in heaven?

ORCID development timeline

[Figure: timeline. Axis: Q1 2012, Q2 2012, Q3 2012, Q4 2012, Q1 2013, Q2 2013, Q3. Labels: Build Phase 1.0, Phase 2 Development, Build Phase 1.1, Launch, Phase 1.1. Identity types shown: Self-asserted identity, Socially-validated identity, Organisationally-validated identity]

Credit: Geoff Bilder http://about.orcid.org/sites/default/files/bilderorcidoutreachmay2012.ppt

Page 31: TNC2012 Federated and scholarly identity - match made in heaven?

What makes ORCID different?

• Some key facts:

• ORCID is the only researcher identifier that is not limited to discipline, institution or geographic area

• ORCID is backed by a non-profit organization with >300 participants

• ORCID is backed by many different stakeholders

• Publishers are an important ORCID stakeholder but are just one part

• ORCID is serious about building an open system

Page 32: TNC2012 Federated and scholarly identity - match made in heaven?

Tackling problems & creating new opportunities

Page 33: TNC2012 Federated and scholarly identity - match made in heaven?

ORCID uptake by “usual suspect” stakeholders in scholarly communication

‣ Publishers, funding agencies, universities, libraries

‣ Big payoffs from solving big identification problems - BUT, big, sprawling organizations take a long time to move

‣ HOWEVER, adoption could well happen relatively rapidly

- .. via integration with manuscript tracking systems

- .. via deposit of profile data from large organizations

‣ Several publishers & their software vendors are already working on ORCID integration

Page 34: TNC2012 Federated and scholarly identity - match made in heaven?

Publisher integration - NPG example

- Link your account now

- User authenticates and approves NPG accessing their data

- ORCID returns User to the NPG registration

- NPG registration form is pre-populated with data ORCID sends back

Credit: Veronique Kiermer http://about.orcid.org/sites/default/files/kiermerorcidoutreachmay2012.pptx

Page 35: TNC2012 Federated and scholarly identity - match made in heaven?

Creating opportunities in the Long Tail

‣ Lots of small, diverse online scholarly services - more nimble than bigger players so faster to onboard

‣ Rich flora of grassroots initiatives that can benefit from integration with ORCID

- Example: #altmetrics movement: Total Impact (http://total-impact.org), ScienceCard (http://sciencecard.org)

- Example: genetic variation databases: small-to-medium size data submissions & expert curation ** Part of the GEN2PHEN mission **

Page 36: TNC2012 Federated and scholarly identity - match made in heaven?

How? Play the social networking card

‣ Now in modern social networking arena:

- Rich flora of 3rd party applications built around social IDs that users already have on Twitter, Facebook and other sites

‣ Coming soon:

- Lots of online scholarly communication tools built around ORCID IDs that scholars already have

- Ease of use - build on users’ familiarity with mainstream apps

- Rich ecosystem of ‘ORCID apps’

- Lower the barrier to participation - tackle the “multiple profiles syndrome”

Page 37: TNC2012 Federated and scholarly identity - match made in heaven?

Technologically, this is not rocket science

Page 38: TNC2012 Federated and scholarly identity - match made in heaven?

Page 39: TNC2012 Federated and scholarly identity - match made in heaven?

The Phase 1 ORCID service will support this stuff!!!

‣ Simple RESTful API - focus on making integration *easy*

‣ Standard OAuth 2 authn/authz so users can:

- link their local accounts with their ORCID ID

- authorize client apps to access non-public profile data

- authorize client apps to add/update profile data on their behalf

‣ USER DRIVEN - up to individual author/contributors whether to link accounts
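
As a rough illustration of the account-linking step, the sketch below builds a standard OAuth 2 authorization-code request and reads the code back from the redirect. It is not taken from the slides: the endpoint URL, client ID, redirect URI and scope name are hypothetical placeholders, and real values would have to come from the ORCID developer documentation. Only the query parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `code`) are defined by OAuth 2 itself.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical authorization endpoint; not a real ORCID URL.
AUTHORIZE_URL = "https://orcid.example.org/oauth/authorize"


def build_authorization_url(client_id: str, redirect_uri: str, scope: str, state: str) -> str:
    """Build the URL the user is sent to in order to approve the client app."""
    params = {
        "response_type": "code",  # authorization-code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,           # opaque value echoed back to prevent CSRF
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def extract_code(callback_url: str, expected_state: str) -> str:
    """Pull the authorization code out of the redirect, checking the state value."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged callback")
    return query["code"][0]


state = secrets.token_urlsafe(16)
url = build_authorization_url(
    client_id="my-client-id",                         # hypothetical
    redirect_uri="https://app.example.org/callback",  # hypothetical
    scope="read-profile",                             # hypothetical scope name
    state=state,
)
print(url)
```

The client would then exchange the returned code for an access token at the provider's token endpoint. That step is omitted here because it needs a live server. The user-driven nature of the flow is visible in the design: nothing is linked unless the user follows the URL and approves the request.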

Page 40: TNC2012 Federated and scholarly identity - match made in heaven?
Page 41: TNC2012 Federated and scholarly identity - match made in heaven?

IRISC2011 workshop @CSC, Helsinki

‣ Workshop themes

- unambiguously identifying authors/creators & attributing their scholarly works

- individual identification and access mgmt in the context of identity federations

‣ Workshop aims

- Raising overall awareness of key technical and non-technical challenges, opportunities and developments.

- Facilitating a dialogue, cross-pollination of ideas, collaboration and coordination between diverse – and largely unconnected – communities.

- Identifying & discussing existing/emerging technologies, best practices and requirements for researcher identification.

‣ >60 participants, ~2/3 from the identity federation (IDF) community

Page 42: TNC2012 Federated and scholarly identity - match made in heaven?

Page 43: TNC2012 Federated and scholarly identity - match made in heaven?

IRISC2011 workshop @CSC, Helsinki

‣ Workshop report published online http://irisc-workshop.org/irisc2011-helsinki/workshop-report/

Page 44: TNC2012 Federated and scholarly identity - match made in heaven?

Page 45: TNC2012 Federated and scholarly identity - match made in heaven?

‣ Recommendations / suggested actions from report

[...]

Opportunities for collaboration and interoperability

Service providers should investigate possibilities for authenticating ‘homeless’ users (i.e. freelance researchers with no affiliation, or affiliated researchers at institutions which aren't part of an IDF) via ORCID or other trusted source of author identifiers that may join IDFs in the future.

The IDF community and ORCID should work to harmonize core profile fields/attributes which are likely to hold institution-validated information.

Establish a pilot on federated access management to a biomedical data provider together with EGA, eduGAIN and related national IDFs.

Investigate how an ORCID or other author identifier and its provenance can be modelled as an attribute in IDF and interfederation services, as part of a set of attributes automatically released by the identity provider.

[...]

IRISC2011 workshop @CSC, Helsinki
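
One way to picture the last recommendation above, an author identifier carried together with its provenance as a released attribute, is the small Python sketch below. It is purely illustrative: the class, field names and example values are hypothetical and do not correspond to any existing IDF attribute schema. The three validation levels are borrowed from the identity spectrum shown earlier in the talk.

```python
from dataclasses import asdict, dataclass
from enum import Enum


class Validation(Enum):
    SELF_ASSERTED = "self-asserted"
    SOCIALLY_VALIDATED = "socially-validated"
    ORGANISATIONALLY_VALIDATED = "organisationally-validated"


@dataclass(frozen=True)
class AuthorIdentifierAttribute:
    scheme: str             # which identifier system, e.g. "orcid"
    value: str              # the identifier itself
    asserted_by: str        # provenance: who vouches for the link
    validation: Validation  # how strongly the link has been validated

    def to_released_attribute(self) -> dict:
        """Flatten to plain strings, as an identity provider might release it."""
        data = asdict(self)
        data["validation"] = self.validation.value
        return data


def is_institution_validated(attr: AuthorIdentifierAttribute) -> bool:
    """A service provider could use this to decide how much to trust the link."""
    return attr.validation is Validation.ORGANISATIONALLY_VALIDATED


attr = AuthorIdentifierAttribute(
    scheme="orcid",
    value="example-identifier",          # placeholder, not a real identifier
    asserted_by="idp.university.example", # hypothetical identity provider
    validation=Validation.ORGANISATIONALLY_VALIDATED,
)
print(attr.to_released_attribute())
```

Keeping provenance next to the identifier lets a receiving service distinguish a self-asserted link from one an institution has checked, rather than treating every identifier as equally trustworthy.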

Page 46: TNC2012 Federated and scholarly identity - match made in heaven?

Match made in heaven, no?

- Opportunities for collaboration -

‣ Pilot ORCID <-> IDF integration in high-value use cases

‣ Starting points - some suggestions

- A) Authenticate via federated identity to central ORCID system

- User authenticates the first time, registers & their new profile is populated on the fly with organisationally-validated information released by the IdP

- B) Starting from institution, link ORCID account with inst. user account and pull in ORCID identifier + publication data

- Would need IDF attribute to carry universal, validated author identifier

Page 47: TNC2012 Federated and scholarly identity - match made in heaven?

Where do we go from here?

‣ Get involved - join the discussion

- http://about.orcid.org - Main website, general info

- http://dev.orcid.org - Developer web portal - NEW!!

- Test “sandbox” system (bring your own sand!) http://devsandbox.orcid.org http://api.devsandbox.orcid.org

- Contact me, as (provisionally) co-chair of ORCID’s Technical Outreach Working Group, together with Elsevier’s Mike Taylor

Page 48: TNC2012 Federated and scholarly identity - match made in heaven?

Acknowledgements

GEN2PHEN Consortium - http://www.gen2phen.org/about-gen2phen/partners

Prof Anthony J. Brookes, Bioinformatics Group, Leicester

This work has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.

Contact me! <[email protected]>

http://www.linkedin.com/in/mummi

http://www.twitter.com/gthorisson

http://www.gthorisson.name

Published under the CC BY license: http://creativecommons.org/licenses/by/3.0/