Taxonomy Strategies LLC

Copyright 2009 Taxonomy Strategies LLC. All rights reserved.

Assorted Slides on Taxonomy & Metadata Governance

Ron Daniel, Jr.

Taxonomy Strategies LLC: The business of organized information

Creating a Governance Structure for the Ongoing Maintenance of the Taxonomy

Taxonomies must change if they are to remain relevant. But what will it cost to make those changes to the taxonomy and to the data which is categorized by it? Organizations must have appropriate maintenance processes so that the taxonomy changes are based on rational cost/benefit decisions, without becoming mired in endless paperwork. This interactive workshop will highlight the framework for creating taxonomy governance teams and what their specific responsibilities should be. Special attention will be given to defining maintainable taxonomies and metadata for achieving business needs.

Agenda

10:15 Introduction

10:30 Background

10:35 Maintainable Taxonomies

10:45 Maintainable Metadata

10:50 ROI Estimation

11:00 Governance Environment

11:10 Controlled Items

11:30 Team Structures

11:45 Change Process

12:00 Exercises

12:15 Adjourn

Three Problems

Taxonomy development and maintenance is the LEAST of three problems:

The Taxonomy Problem: How are we going to build and maintain the lists of pre-defined values that can go into some of the metadata elements?

The Tagging Problem: How are we going to populate metadata elements with complete and consistent values? What can we expect to get from automatic classifiers? What kind of error detection and error correction procedures do we need? What fields do we need?

The ROI (Return On Investment) Problem: How are we going to use content, metadata, and vocabularies in applications to obtain business benefits? More sales? Lower support costs? Greater productivity? Risk avoidance? How much content? How big an operating budget? How to expose to users? Tolerance for poor data quality?

Business Goals and Cultural Factors are major influences on tagging and taxonomy. These must be acknowledged at the start to avoid rework.

There’s more to maintaining the Taxonomy than just maintaining the Taxonomy

What must change when the Taxonomy changes?

The master copy of the taxonomy.

The data tagged with the taxonomy?

The user interface which uses the taxonomy?

Backend system software which uses the taxonomy?

The training set for automatic classifiers?

The educational material for users, catalogers, programmers, etc.?

The information sent to downstream users of the taxonomy?
• The versions of the taxonomy distributed to others.
• The list of changes.

Announcements for stakeholders?

This is a set of items that the taxonomy team might maintain and that need to be updated when the taxonomy changes. Few groups will have all of these under maintenance by the taxonomy team.

Metadata and Taxonomy

Field       Data Type  Example
Title       String     “The Perl Directory”
Creator     String     The Perl Foundation
Identifier  URL        http://www.perl.org/
Date        DateTime   Jan. 12, 2006
Subject     List       Computers : Programming : Languages : Perl

Metadata

Taxonomy

A big, simple hierarchy has lots of nodes and is a lot of work to maintain.

DMOZ: A worst case example of a unified ‘subject’

Business : Biotechnology & Pharmaceuticals : Education & Training

Regional : Europe : Ireland : Business & Economy : Employment : Health & Medical

Reference : Education : Colleges & Universities : North America : United States : Maryland : Columbia Union College : Athletics

Reference : Education : K-12 : Home Schooling : Unschooling : Chats and Forums

Science : Math : Academic Departments : South America : Colombia

Society : People : Women : Science & Technology : Mathematics

Science : Social Sciences : Linguistics : Translation : Associations

Business : Small Business : Finance : Accounting

Business : Accounting : Firms : Directories

Business : Employment : By Industry

Business : Healthcare : Employment : Regional

Facet                    Count
Competency (discipline)  11
Geography                9
Audience                 9
Topic                    7
Organization             5
Doc Type                 4
Industry                 4
Process                  4

DMOZ has over 600k categories

Most are a combination of common facets – Geography, Organization, Person, Document Type, …

(e.g.) Top: Regional: Europe: Spain: Travel and Tourism: Travel Guides

(BTW – DMOZ Governance model is out of whack)

If you want to get technical here, you can explain that lots of big hierarchies are pre-coordinated combinations of items that could come from separate facets. This introduces some arbitrary choices (do we list content type first and location second, or …). It also leads to a lot of repeated substructure which means there have to be edits in many places to make what is in concept a pretty small change.

The power of taxonomy facets

Categorize in multiple, independent, categories.

Allow combinations of categories to narrow the choice of items.

4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
• Easier to maintain
• Can be easier to navigate

Main Ingredients
• Chocolate • Dairy • Fruits • Grains • Meat & Seafood • Nuts • Olives • Pasta • Spices & Seasonings • Vegetables

Meal Type
• Breakfast • Brunch • Lunch • Supper • Dinner • Snack

Cuisines
• African • American • Asian • Caribbean • Continental • Eclectic/Fusion/International • Jewish • Latin American • Mediterranean • Middle Eastern • Vegetarian

Cooking Methods
• Advanced • Bake • Broil • Fry • Grill • Marinade • Microwave • No Cooking • Poach • Quick • Roast • Sauté • Slow Cooking • Steam • Stir-fry

42 values to maintain (10+6+11+15)

9900 combinations (10x6x11x15)
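
The arithmetic on this slide can be checked with a short sketch. The facet names and value counts come from the slide; the dictionary layout is only an illustration.

```python
from math import prod

# Number of values in each facet, as counted on the slide.
facet_sizes = {
    "Main Ingredients": 10,
    "Meal Type": 6,
    "Cuisines": 11,
    "Cooking Methods": 15,
}

# Terms the taxonomy team has to maintain: the sum of the facet sizes.
values_to_maintain = sum(facet_sizes.values())

# Distinct combinations a user can select: the product of the facet sizes.
combinations = prod(facet_sizes.values())

print(values_to_maintain, combinations)
```

Adding a value to one facet adds one term to maintain but multiplies the number of reachable combinations, which is the maintenance argument for facets.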

How do I get a good Taxonomy? – Seven practical rules

1) Incremental, extensible process that identifies and enables users, and engages stakeholders.

2) Quick implementation that provides measurable results as quickly as possible.

3) Not monolithic—has separately maintainable facets.

4) Re-uses existing IP as much as possible.

5) A means to an end, and not the end in itself.

6) Not perfect, but it does the job it is supposed to do—such as improving search and navigation.

7) Improved over time, and maintained.

Some vocabulary construction rules

Don’t just have names, also have identifiers
• This will reduce retagging later when names change
• When tagging content, use the most specific code. Let software handle the hierarchy.
• Bonus: Use URIs for node IDs & publish on the web (See LINKED DATA in the futures chapter)

Develop scope notes
• Not just a definition, also say what kind of content the node applies to

Metadata specification must state the vocabulary for an element.

Gather data from multiple sources
• Talk with users and experts
• Analyze query logs and content

Choose and arrange terms

Test and finalize first version

Shift into maintenance mode
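
A minimal sketch of the "identifiers, not just names" rule, assuming a vocabulary kept as a simple ID-to-label mapping. The IDs (`T100`, `T101`), item names, and labels are hypothetical and exist only for illustration.

```python
# Hypothetical vocabulary: stable node IDs map to display labels.
labels = {
    "T100": "Programming Languages",
    "T101": "Perl",
}

# Content is tagged with IDs, never with the label text.
tagged_items = {
    "doc-1": ["T101"],
    "doc-2": ["T100", "T101"],
}

def display_tags(item_id):
    """Resolve an item's tag IDs to their current labels."""
    return [labels[tag] for tag in tagged_items[item_id]]

before = display_tags("doc-1")

# Renaming a term is a single edit to the vocabulary; no item is retagged.
labels["T101"] = "Perl (Language)"
after = display_tags("doc-1")

print(before, after)
```

Had the items stored the label text instead, the rename would have required touching every tagged item.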

What do I do with all these facets?

Either expose them directly in the user interface (post-coordinating)

or

Combine them in a minimal hierarchy (pre-coordination)

Post-coordination takes software support, which may be fancy or basic.

How many facets? (See elsewhere)

Maintainable Metadata

Design metadata specification for future changes
• Lessons from the Dublin Core

Provide metadata tagging and storage that will deal with changes

Dublin Core: A little more complicated over time

Elements
1. Identifier 2. Title 3. Creator 4. Contributor 5. Publisher 6. Subject 7. Description 8. Coverage 9. Format 10. Type 11. Date 12. Relation 13. Source 14. Rights 15. Language

Refinements
• Abstract • Access rights • Alternative • Audience • Available • Bibliographic citation • Conforms to • Created • Date accepted • Date copyrighted • Date submitted • Education level • Extent • Has format • Has part • Has version • Is format of • Is part of • Is referenced by • Is replaced by • Is required by • Issued • Is version of • License • Mediator • Medium • Modified • Provenance • References • Replaces • Requires • Rights holder • Spatial • Table of contents • Temporal • Valid

Encodings
• Box • DCMIType • DDC • IMT • ISO3166 • ISO639-2 • LCC • LCSH • MESH • Period • Point • RFC1766 • RFC3066 • TGN • UDC • URI • W3CDTF

Types
• Collection • Dataset • Event • Image • Interactive Resource • Moving Image • Physical Object • Service • Software • Sound • Still Image • Text

Design Metadata Specification for future changes

Degree of future changes will depend on organization size, sophistication of use, number of repositories and amount of content.
• Don’t over-engineer

For all organizations: start with the Dublin Core with a few additions and deletions for specific needs

At large/sophisticated organizations: “Refinements” will be unavoidable in the future.

Start with “DatePublished” so that later additions of “DateModified”, “DateApproved”, “DateVerified”, etc. fit in easily.

Identify broad “integration metadata” vs. division-specific fields. Coordinate with others to set up a working understanding of a corporate multi-level metadata standard.

Provide metatagging and storage that will deal with changes

Tag with identifiers, not names.
• This will reduce retagging later when names change
• Not good if people need to view raw tagging, but usually software will be involved to show labels.

When tagging content, use the most specific concept. Let software handle the hierarchy.

Metadata is easier to manage if it is stored in a central repository, instead of spread out in the individual files.
• Exception – when sending files out to other systems (e.g. photo metadata)
• Warning – ‘metadata repositories’ are usually a different class of software than what we are discussing.
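
One way to "let software handle the hierarchy" is sketched below, assuming the taxonomy is available as child-to-parent links. The node IDs and item names are hypothetical; the path mirrors the Computers : Programming : Languages : Perl example used earlier in the deck.

```python
# Hypothetical parent links for part of a Subject facet (child ID -> parent ID).
parent = {
    "perl": "languages",
    "languages": "programming",
    "programming": "computers",
    "computers": None,
}

def with_ancestors(node_id):
    """Return the node followed by each of its ancestors, most specific first."""
    chain = []
    while node_id is not None:
        chain.append(node_id)
        node_id = parent[node_id]
    return chain

# Each item stores only its most specific concept.
item_tags = {"doc-1": "perl", "doc-2": "languages"}

def items_under(node_id):
    """Find items tagged with the node or with any of its descendants."""
    return sorted(item for item, tag in item_tags.items()
                  if node_id in with_ancestors(tag))

print(items_under("programming"))
```

Because broader concepts are derived at query time, moving a node in the hierarchy changes only the `parent` table, not the tags stored on the content.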

Fundamentals of taxonomy ROI

Tagging content using a taxonomy is a cost, not a benefit.

There is no benefit without exposing the tagged content to users in some way that cuts costs, improves revenues, reduces risk, or achieves some other clear business goal.

Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.

You need to determine those changes, and their costs, as part of the ROI.

Key Factors in ROI

Breadth “How many people will metadata affect?”

Repeatability “How many times a day will they use it?”

Cost/Benefit “Is this a costly effort with little or no benefits?”

How to estimate costs — Tagging

Taxonomy Facet        Hier?  Typical CV Size  Time/Value (min)  Avg # Values/Item  $/Min   Cost/Element
Audience              N      10               0.25              2                  $ 0.42  $ 0.21
Content Type          N      20               0.25              1                  $ 0.42  $ 0.11
Organizational Unit   Y      50               0.5               2                  $ 0.42  $ 0.42
Products & Services   Y      500              1.5               4                  $ 0.42  $ 2.52
Geographic Region     Y      100              0.5               2                  $ 0.42  $ 0.42
Broad Topics          Y      400              2                 4                  $ 0.42  $ 3.36
TOTALS                       1080             5                 15                         $ 7.04

Inspired by: Ray Luoma, BAU Solutions

Consider complexity of facet and ambiguity of content to estimate time per value.

Estimated cost of tagging one item. This can be reduced with automation, but cannot be eliminated.

Is this field worth the cost?
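
The Cost/Element column appears to be time per value times average values per item times the dollar rate per minute. The sketch below reproduces the table under that assumption; the figures are copied from the slide, while the variable and function names are illustrative.

```python
# (facet, minutes per value, average values per item), taken from the table.
facets = [
    ("Audience", 0.25, 2),
    ("Content Type", 0.25, 1),
    ("Organizational Unit", 0.5, 2),
    ("Products & Services", 1.5, 4),
    ("Geographic Region", 0.5, 2),
    ("Broad Topics", 2.0, 4),
]
DOLLARS_PER_MINUTE = 0.42  # the table's $/Min figure

def cost_per_element(minutes_per_value, values_per_item, rate=DOLLARS_PER_MINUTE):
    """Cost of filling one metadata element on one content item."""
    return minutes_per_value * values_per_item * rate

costs = {name: cost_per_element(m, v) for name, m, v in facets}
cost_per_item = sum(costs.values())

for name, cost in costs.items():
    print(f"{name:<22} ${cost:.3f}")
print(f"{'TOTAL':<22} ${cost_per_item:.3f}")
```

The unrounded total is $7.035 per item, which the slide shows as $7.04. Dropping or automating a facet shows up directly as a change in `cost_per_item`, which is one way to answer "Is this field worth the cost?".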

How to estimate costs — Assumptions

ASSUMPTIONS
Enterprise SW License   $ 100,000
Maintenance/Support     15%
SW Implementation       200%
Legacy Content Items    100,000
Content Growth Rate     15%
Tagging/Item            $ 7.04
Enterprise Taxonomy     $ 100,000

Your numbers will vary.

How to estimate costs — Total cost of ownership (TCO)

Description           Year 1        Year 2     Year 3     Year 4     Year 5
SW
  Licenses            $ 100,000
  Maintenance                       $ 15,000   $ 15,000   $ 15,000   $ 15,000
  Implementation      $ 200,000
  App Tech Support                  $ 30,000   $ 30,000   $ 30,000   $ 30,000
Tagging
  Legacy Content      $ 704,000
  Ongoing                           $ 105,600  $ 121,440  $ 139,656  $ 160,604
Taxonomy
  Creation            $ 100,000
  Maintenance                       $ 15,000   $ 15,000   $ 15,000   $ 15,000
TOTAL                 $ 1,103,500   $ 165,600  $ 181,440  $ 199,656  $ 220,604
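
The table can be regenerated from the assumptions slide with a short sketch. Two points are assumptions rather than statements from the deck: that App Tech Support and Taxonomy Maintenance are each 15% of their Year 1 cost (this matches the $30,000 and $15,000 rows), and that new content each year is 15% of the content base at the start of that year. Function and parameter names are illustrative.

```python
def tco_by_year(years=5, license_fee=100_000, maintenance_rate=0.15,
                implementation_rate=2.0, legacy_items=100_000,
                growth_rate=0.15, tag_cost=7.04, taxonomy_cost=100_000):
    """Return the total cost for each year, Year 1 first, in whole dollars."""
    implementation = license_fee * implementation_rate
    totals = []
    for year in range(1, years + 1):
        if year == 1:
            # One-time costs: license, implementation, legacy tagging, taxonomy.
            total = (license_fee + implementation
                     + legacy_items * tag_cost + taxonomy_cost)
        else:
            # New items: growth_rate times the content base at the start of the year.
            new_items = legacy_items * growth_rate * (1 + growth_rate) ** (year - 2)
            total = (license_fee * maintenance_rate        # SW maintenance
                     + implementation * maintenance_rate   # app tech support (assumed rate)
                     + new_items * tag_cost                # ongoing tagging
                     + taxonomy_cost * maintenance_rate)   # taxonomy maintenance (assumed rate)
        totals.append(round(total))
    return totals

totals = tco_by_year()
print(totals)
```

With the rounded $7.04 per item, Year 1 legacy tagging is $704,000 and the Year 1 total is $1,104,000; the slide's $1,103,500 total matches the unrounded $7.035 per item instead (`tco_by_year(tag_cost=7.035)[0]`). Years 2 to 5 match the table at $7.04.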

Sample ROI Calculations

Description                      Year 1          Year 2     Year 3       Year 4       Year 5
Costs
  Software Licenses/Maintenance  $ 100,000       $ 15,000   $ 15,000     $ 15,000     $ 15,000
  Implementation/Support         $ 200,000       $ 30,000   $ 30,000     $ 30,000     $ 30,000
  Taxonomy Creation/Maintenance  $ 100,000       $ 15,000   $ 15,000     $ 15,000     $ 15,000
  Legacy/Ongoing Tagging         $ 703,500       $ 105,600  $ 121,440    $ 139,656    $ 160,604
Benefits
  Productivity increases         $ -             $ 125,000  $ 1,250,000  $ 1,250,000  $ 1,250,000
  Service efficiency gains       $ -             $ 129,600  $ 1,296,000  $ 1,296,000  $ 1,296,000
Yearly Net Benefits              $ (1,103,500)   $ 89,000   $ 2,364,560  $ 2,346,344  $ 2,325,396

Payback period: 1.4 Years until Benefits = Costs

Inspired by: Todd Stephens, Dublin Core Global Corporate Circle

Ongoing cost of tagging due to 15% content growth.
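
The Yearly Net Benefits row is benefits minus costs for each year, which the sketch below recomputes from the figures in the table. It also reports the first year in which the running total turns positive. It does not attempt to reproduce the slide's 1.4-year payback figure, whose exact method (starting point, interpolation) is not stated; the variable names are illustrative.

```python
from itertools import accumulate

# Yearly totals copied from the table (Year 1 first).
costs = [1_103_500, 165_600, 181_440, 199_656, 220_604]
benefits = [
    0,
    125_000 + 129_600,
    1_250_000 + 1_296_000,
    1_250_000 + 1_296_000,
    1_250_000 + 1_296_000,
]

# Net benefit per year, and the running total across years.
net = [b - c for b, c in zip(benefits, costs)]
cumulative = list(accumulate(net))

# First year (1-based) in which cumulative benefits exceed cumulative costs.
break_even_year = next(i + 1 for i, total in enumerate(cumulative) if total > 0)

print(net)
print(cumulative)
print(break_even_year)
```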

Where do the benefits come from? Common taxonomy ROI scenarios

Catalog site - ROI based on increased sales through improved:
• Product findability
• Product cross-sells and up-sells
• Customer loyalty

Call center - ROI based on cutting costs through:
• Fewer customer calls due to improved website self-service
• Faster, more accurate CSR responses through better information access

Compliance – ROI based on:
• Avoiding penalties for breaching regulations
• Following required procedures (e.g. Medical claims)

Knowledge worker productivity - ROI based on cutting costs through:
• Less time searching for things
• Less time recreating existing materials, with knock-on benefits of less confusion and reduced storage and backup costs

Executive mandate
• No ROI at the start, just someone with a vision and the budget to make it happen

Generic, yet Important, Advice

It’s not about the tools. It’s not about the taxonomy. It’s about the business goals and the processes people use to meet those goals.

Metrics are grossly underused in metadata and search.

Taxonomy governance overview

• Taxonomy governance can be viewed as a standards process
• Closely linked to organizational metadata standard
• Taxonomy must evolve, but in a predictable way
• Take tips from other standards efforts
• Team structure, with an appeals process
• Taxonomy stewardship is a part-time role at most organizations
• Team needs to make decisions based on costs and benefits
• Documentation and educational material on Taxonomy and Metadata
• Announcements
• Comment-handling responsibilities (part of error-correction process)
• Issue Logs
• Release Schedule

These practices are in rough order of implementation.

Taxonomy governance environment

[Diagram labels. Sources: ISO3166-1, Other External, ERP, Other Internal. Taxonomy Governance Environment: Vocabulary Management System, CVs, Other Controlled Items, Custodians. Flows: Change Requests & Responses, Notifications, Published Facets. Consuming Applications: Intranet Search, Intranet Nav., Web CMS, DAM, Archives, ERMS.]

1: External vocabularies change on their own schedule, with some advance notice.

2: Team decides when to update facets within Taxonomy

3: Team adds value via mappings, translations, synonyms, training materials, etc.

4: Updated versions of facets published to consuming applications

CV (Controlled Vocabulary) – The list of values for one facet in the Taxonomy.

Controlled Items

Taxonomy Team will have several items to manage:
• Controlled Vocabularies
• Metadata Standard
• Editorial Rules
• Tagger Training Materials (manual and automatic)
• Charter, Goals, Performance Measures
• Team Processes
• Outreach & ROI: Website, Communication plan, Presentations, Announcements
• “Roadmap”: Advanced practice, requires long planning horizon for organization's IT projects

Even small taxonomy teams should develop many of these items, although not to the same level of formality.

Controlled Vocabularies are not just tabbed lists

Source: NASA Taxonomy Competencies Facet, http://nasataxonomy.jpl.nasa.gov/nascomp/index_tt.htm

Controlled Item: Metadata Specification

Element Name                     XML Map           Repeatable  Source            Purpose
General Purpose Metadata
Unique ID                        dc:identifier     1           System supplied   System identifier to retrieve item.
Owner                            dc:creator        ?           System supplied   POC for content maintenance
Title                            dc:title          1           User supplied     Text search & results display
Date                             dc:date           1           System supplied   Publish, feature, & review content.
Subject Metadata
Organization                     x:corp            *           Corp Classif CV   Search for, browse, group & filter search results.
Asset                            x:asset           *           Asset CV
Region/Country                   dc:coverage       *           Country CV
Basin/Platform/Well              x:well            *           B/P/Well CV
Content Type                     dc:type           ?           Content Types CV
Company/Client/Operator/Partner  x:company         *           Company CV
Project                          x:project         *           Project CV
Use Metadata
Discipline                       dcTerms:audience  *           Discipline CV     Target, personalize content.
Retention                        x:retention       1           System supplied   Remove expired content

Legend: ? – 1 or more   * - 0 or more

Controlled Item: Editorial Rules

Akin to “Chicago Manual of Style”

Issues commonly addressed in the rules:
• Abbreviations • Ampersands • Capitalization • Continuations (More… or Other…) • Duplicate Terms • Fidelity to External Source • Hierarchy and Polyhierarchy • Languages and Character Sets • Length Limits • “Other” – Allowed or Forbidden? • Plural vs. Singular Forms • Relation Types and Limits • Scope Notes • Serial Comma • Sources of Terms • Spaces • Synonyms and Acronyms • Translations • Term Order (Alphabetic or …) • Term Label Order (Direct vs. Inverted)

What to do when rules conflict – how do people decide which rule is more important?

Rule Name: Editorial Rule

Use Existing Vocabularies: Other things being equal, reusing an existing vocabulary is preferred to creating a new one.

Ampersands: The character '&' is preferred to the word ‘and’ in Term Labels. Example: Use Type: “Manuals & Forms”, not “Manuals and Forms”.

Special Characters: Retain accented characters in Term Labels. Example: Use “España”, not “Espana”.

Serial comma: If a category name includes more than two items, separate the items by commas. The last item is separated by the character ‘&’ which IS NOT preceded by a comma. Example: “Education, Learning & Employment”, not “Education, Learning, & Employment”.

Capitalization: Use title case (where all words except articles are capitalized). Example: “Education, Learning & Employment”, NOT “Education, learning & employment”, NOT “EDUCATION, LEARNING & EMPLOYMENT”, NOT “education, learning & employment”.

… …
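
Rules written this precisely can be checked mechanically. Below is a minimal sketch of a checker for three of the sample rules (Ampersands, Serial comma, Capitalization); the function names are hypothetical. It is deliberately naive: it would flag acronyms such as "USA" as a capitalization problem and does not check articles at all.

```python
import re

ARTICLES = {"a", "an", "the"}

def is_title_cased(word):
    """True when the first letter is uppercase and the rest are lowercase."""
    return word[0].isupper() and (len(word) == 1 or word[1:].islower())

def check_label(label):
    """Return the names of the sample editorial rules that a term label breaks."""
    problems = []
    # Ampersands: '&' is preferred to the word 'and'.
    if re.search(r"\band\b", label, flags=re.IGNORECASE):
        problems.append("Ampersands")
    # Serial comma: the '&' before the last item must not follow a comma.
    if re.search(r",\s*&", label):
        problems.append("Serial comma")
    # Capitalization: title case; articles are skipped, and so is 'and',
    # which the Ampersands rule already reports.
    words = re.findall(r"[^\W\d_]+", label)
    skipped = ARTICLES | {"and"}
    if any(not is_title_cased(w) for w in words if w.lower() not in skipped):
        problems.append("Capitalization")
    return problems

for label in ["Education, Learning & Employment",
              "Manuals and Forms",
              "Education, Learning, & Employment",
              "EDUCATION, LEARNING & EMPLOYMENT"]:
    print(label, "->", check_label(label))
```

A check like this could run whenever the vocabulary is edited, so that rule conflicts reach the team as a list of specific labels rather than as a style debate.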

Controlled Item: Training Materials

Staff will require training on
• The UI they use to tag the content
• The rules to follow when deciding what codes to apply
• The end-effect of the codes they apply
• The structure of the taxonomy

Tagging examples come from earlier stages in taxonomy development process

Hardcopies of the taxonomy, and yellow highlighters, are helpful during training

Indexing rules

Rule: Description

Specificity rule: Apply the most specific terms when tagging assets. Specific terms can always be generalized, but generic terms cannot be specialized.

Repeatable rule: All attributes should be repeatable. Use as many terms as necessary to describe What the asset is about and Why it is important. Storage is cheap. Re-creating content is expensive.

Appropriateness rule: Not all attributes apply to all assets. Only supply values for attributes that make sense.

Usability rule: Anticipate how the asset will be searched for in the future, and how to make it easy to find it. Remember that search engines can only operate on explicit information.

Indexing UI

Controlled item: Communications Plan

Stakeholders: Who are they and what do they need to know?

Channels: Methods available to send messages to stakeholders.
• Need a mix of narrow vs. broad, formal vs. informal, interactive vs. archival, …

Messages: Communications to be sent at various stages of project.
• Bulk of the plan is here

Channel       Description
Demo          Live, or screen capture for download
Presentation  Tailored message for specific audience
Website       Overview info for all, link to files
Memo          Formal notification
…             …

Stakeholders      Info. Needed
Project Sponsors  Progress, Issues, Policies
Dept. Reps        Progress, Priorities,
…                 …
Users             Progress, How-Tos
Vendors           RFPs & SOWs

Trigger     Msg. Descrip      From        To   Chan.
Initiation  Project overview  Dept. head  All  Memo
…           …                 …           …    …

Controlled Item: Team Charter

Taxonomy Team is responsible for maintaining:
• The Taxonomy, a multi-faceted classification scheme
• Associated materials, including a website providing: Corporate Metadata Standard; Editorial Style Guide; Taxonomy Training Materials; Team rules and procedures (subject to CIO review)

Team evaluates costs and benefits of suggested changes.

Taxonomy Team will:
• Manage relationship between providers of source vocabularies and consumers of the Taxonomy
• Identify new opportunities for use of the Taxonomy across the Enterprise to improve information management practices
• Promote awareness and use of the Taxonomy

Remaining Controlled Items

Performance Measures to go along with Charter?

Team Processes (see later in this presentation)

Automatic Classifier Training Materials

Website

Presentations and Announcements

Change Request List (see later in this presentation)

“Taxonomy Roadmap”
• Advanced practice, requires long planning horizon for organization's IT projects

Exercise 2: Editorial Rules

Look at sample taxonomy

Think of ways to clean it up and make it ‘better’
• Smaller
• More professional looking
• Easy to use

Write editorial rules for the cleanups.

Provide an example with each rule:

Rule Name Editorial Rule

Plumem Lorne ipso ernum de jura fino el

Symosyit Esr Dirgin a periso de forestima

Himerisf Faleoin fi ribska firn eowkds

Capitalization All terms in lowercase. “programming”, NOT “Programming”

Exercise 2: Sample Taxonomy

Source: http://del.icio.us/tag/

Exercise 2: Editorial Rules Worksheet

Rule Name | Editorial Rule
Plurals | Use plural form of names, not singular.
Capitalization | All terms, except proper nouns, are lowercase. E.g. “programming”, NOT “Programming”. E.g. “Schwab”, not “schwab”.

Provide a name for each rule, the rule itself, and an example of the rule of the form “X, not Y”.
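An editorial rule written in the “X, not Y” form can often be checked mechanically. The following is a minimal sketch of the Capitalization rule from the worksheet, assuming the team keeps a list of proper nouns; the `PROPER_NOUNS` set and the function names are hypothetical and not part of the original slides.

```python
# Hypothetical list of proper nouns that keep their capitalization.
PROPER_NOUNS = {"Schwab"}


def check_capitalization(term, proper_nouns=PROPER_NOUNS):
    """Return a suggested replacement if the term breaks the rule, else None."""
    if term in proper_nouns:
        return None
    # A known proper noun written in the wrong case, e.g. "schwab".
    for noun in proper_nouns:
        if term.lower() == noun.lower():
            return noun
    # Everything else must be lowercase.
    if term != term.lower():
        return term.lower()
    return None


def check_terms(terms):
    """Map each offending term to its suggested correction."""
    suggestions = {}
    for term in terms:
        fixed = check_capitalization(term)
        if fixed is not None:
            suggestions[term] = fixed
    return suggestions
```

Calling `check_terms(["Programming", "schwab"])` would suggest “programming” and “Schwab”, which matches the two worksheet examples. Rules such as Plurals are harder to automate and probably still need an editor's judgment.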


Page 44

Organization 1: Taxonomy Governance Team

Organization 1 – Internal portal for Fortune 50 Diversified Multinational.

Taxonomy Strategist, Taxonomist, Information Architect 2, Communications Specialist*

Executive Sponsor
• Advocate for the taxonomy team

Business Lead
• Keeps team on track with larger business objectives
• Balances cost/benefit issues to decide appropriate levels of effort
• Specialists help in estimating costs
• Obtains needed resources if those in team can’t accomplish a particular task

Technical Specialist
• Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.
• Helps obtain data from various systems

Content Specialist
• Team’s liaison to content creators
• Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.
• Small-scale Metadata QA Responsibility

Taxonomy Specialist
• Suggests potential taxonomy changes based on analysis of query logs, indexer feedback
• Makes edits to taxonomy, installs into system with aid of IT specialist

Content Owner
• Reality check on process change suggestions

Changes

Page 45

Organization 2: Vocabulary Policy Committee

Organization 2 – A non-profit international organization. Goal is to improve information management practices to reduce overlap between many similar vocabularies across many systems.
Constraint: Even when number of vocabularies reduced, some must still have very close links.

Business Lead
• Chairs group.
• Assures CVs fit with organization’s larger information management effort.
• Small group management experience, information management background.

Vocabulary Custodians (3)
• Responsible for content in a specific CV, typically based on organizational lines.
• Team lead experience, detail-oriented. Familiar with databases and organization processes

IT Representative
• Backups, admin of CV Tool
• IT administration experience

IT Steering Group
• Oversees Vocabulary Policy Committee

Stakeholders
• Managers of systems using the vocabularies, thus affected by changes.
• They have a lot of visibility into the process.
• Control over CV changes is limited, but they schedule their system’s adoption of changes.

Additional Roles – available during startup of team, and on an as-needed basis later

Training Representative
• Develops communications plan, training materials

Work Practices Representative
• Develops processes, monitors adherence

Other Relevant Staff

Page 46

Organization 3: Taxonomy Team

Organization 3 – Public catalog site for Fortune 50 Retailer. Data for products provided by manufacturers.

Business Lead
• Chairs committee, resolves disputes

Marketing Representatives
• Provide product marketing expertise
• Advocate for product manufacturers
• Represent data entry concerns

Website Representative
• Provides input on search and navigation impacts
• Advocate for customers and other website users
• Provides search log and click trail analysis

Taxonomy Specialist
• Maintains taxonomy and product catalog
• Provides data feeds to drive site

Larger team than many retailers, where a single person is responsible. A single person still makes the changes here, but there is some oversight.

Fast-Track Process – A fast-track process exists, likely to be used very often. Representative will ask Taxonomy Specialist for a change and he will get approval from Website Representative.

Likely Changes

Page 47

What if I have to do it solo?

Realize:
• It's not totally solo – IT help, Graphics & UI help, Business Goals help, Funding help, Review & QA help…
• You are the general contractor
• It needs to be part of your objectives
• Limit the objectives to what can be achieved by you, and by your organization

Concentrate:
• Resource allocation (i.e. manage your time)
• Fundamental processes
  • Query log examination
  • Error correction procedure
• Communications!!!

Cherry-pick from Roles
• Business Lead – align with organization goals, get needed resources, make cost/benefit decisions, report upstairs
• IT Liaison – work with IT specialists to get software installed, logs gathered, content harvested, etc. Consider impact of changes on tools and data
• Taxonomy / Search Specialist – analyze behavior and suggest changes. Implement changes which pass cost/benefit muster
• Website/User Representative – consider impact of changes on users and job performance

Page 48

Exercise 3: Team & Stakeholder Identification

Role Applicable/Modify Name(s)

Taxonomy Team Members

Team Lead

Taxonomy Editor(s)

Vocabulary Custodian(s)

Liaisons with external vocabularies

Liaisons with applications using vocabularies

User advocate(s)

Training / Communications

IT / Data & System Maintenance

External Stakeholders

Team Supervisory Group

Representatives of external vocabularies

Representatives of consuming applications

Representatives of users

Other representatives of organization


Page 50

Taxonomy editing tools

[Figure: quadrant chart of taxonomy editing tools. Vertical axis: Ability to Execute (low to high). Horizontal axis: Completeness of Vision. Labeled quadrants include Visionaries and Niche Players.]

• Widely used, cheap, good reporting, bad IDs
• All upper-end tools are high functionality and high cost.
• Most popular taxonomy editor? MS Excel
• Immature industry – no vendors in upper-right quadrant!

This slide is out of date. Don’t know if we want to include this.

Page 51

Taxonomy editor functionality requirements

Basic
• Hierarchy Browser
• Term Editing
• Standard and Custom Fields
• Standard and Custom Relations
• Data Typing and Restrictions
• Consistency Enforcement
• Flexible Reporting
• Flexible Importing?

Advanced
• Workflow
• Voting
• Change Request Mgmt.
• Stylistic rules enforcement
• Programmability

Midrange
• UNICODE
• Multiple Vocabulary Support
• Inter-Vocabulary Relations
• Unique IDs (ISO Codes not sufficient)

Page 52

Taxonomy governance: Where changes come from

[Figure: diagram of where taxonomy changes come from. Components shown: End User experience, Firewall, Application UI, Application Logic, Content, Taxonomy, Tagging Logic, Tagging UI, Tagging Staff, Taxonomy Editor, Taxonomy Team.]

Sources of change requests
• Staff notes ‘missing’ concepts
• Query log analysis
• Requests from other parts of the organization

Recommendations by Editor
1. Small taxonomy changes (labels, synonyms)
2. Large taxonomy changes (retagging, application changes)
3. New “best bets” content

Team considerations
1. Business goals
2. Changes in user experience
3. Retagging cost

I think three sources of change requests is a big concept to communicate to readers.

Page 53

Processes

Different organizations will need to consider their own change processes.
• Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes
• Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency
• Organization 3: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.

Change process MUST also consider cost of implementing the change
• Retagging data
• Reconfiguring auto-classifier
• Retraining staff
• Changes in user expectations

Taxonomy Change Cases
Case 1. Renaming a term
Case 2. Adding a new leaf term
Case 3. Inserting a new term
Case 4. Splitting a term
Case 5. Deleting a leaf term or subtree
Case 6. Deleting a term
Case 7. Moving a subtree
Case 8. Merging terms
Case 9. Adding a CV
Case 10. Deleting a CV
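The retagging cost of a proposed change can be roughed out by counting how many documents carry the affected term. The sketch below is illustrative only: the `doc_tags` structure, the function names, and the split between "mechanical" and "manual" change cases are assumptions, not something the slides specify.

```python
from collections import Counter


def term_usage(doc_tags):
    """Count how many documents carry each term.

    doc_tags: {document_id: [term, ...]} (hypothetical structure).
    """
    counts = Counter()
    for tags in doc_tags.values():
        counts.update(set(tags))
    return counts


def docs_to_review(change, term, counts):
    """Rough number of documents needing manual review for one change."""
    # Assumption: renames and merges can be applied mechanically,
    # while splits and deletions need a person to look at each document.
    manual_cases = {"split", "delete"}
    return counts[term] if change in manual_cases else 0
```

A team could multiply the returned count by its own per-document review time to get an effort estimate to weigh against the benefit of the change.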

Page 54

Taxonomy governance: Taxonomy maintenance workflow

[Figure: workflow diagram with roles Analyst, Editor, Copywriter, and Sys Admin. Steps: Suggest new name/category; Review new name (Problem? Yes/No); Copy edit new name (Problem? Yes/No); Add to enterprise Taxonomy. Systems shown: Taxonomy, Taxonomy Tool.]

Can contrast this process with others that are less formal and/or less like a newsroom. Couple more are described on next slide.

Page 55

Other change processes

Processes may be diagramed or written

Provide an ‘emergency’ change process because it will be needed.
• How can emergency changes be requested?
• Who makes the change and who approves it?
• Who are backups for the people when they are out?
• Who are escalation points?

Change Request Process should call out decision criteria, e.g.
• Cost of retagging
• Benefit of change
• Conflict with editorial rules

Organization X:

Change Request Process
• Anyone can ask a team member for a change. Team members responsible for figuring out details and bringing to team for decision.
• Pending changes list for low priority/high cost items.

Change Process
• Includes preview of change on site and data mockup

Fast-Track Change Process
• Anyone can ask editor, he gets team leader or deputy approval

Page 56

Fundamental Processes & Outlooks

Two fundamental processes every organization should implement to maintain its metadata and taxonomies:
• Query log / Click trail examination
• Error Correction

What are the key outlooks a taxonomist should try to instill in their organization?
• Integrated approach to Taxonomy, Metadata, Search, and UI
• Measure & Improve Mindset

Another biggie

Page 57

Fundamental process #1 – Query log examination

How can we characterize users and what they are looking for?

Query Log & Click Trail Examination
• Only 30-40% of organizations interested in Taxonomy Governance examine query logs*
• Basic reports provide plenty of real value
• Greatest value comes from:
  • Identifying a person as responsible for search quality
  • Starting a “Measure & Improve” mindset
• Greatest challenge:
  • Getting a person assigned (≥ 10%)
  • Getting logs turned back on

UltraSeek Reporting
• Top queries
• Queries with no results
• Queries with no click-through
• Most requested documents
• Query trend analysis
• Complete server usage summary

Click Trail Packages
• iWebTrack
• NetTracker
• OptimalIQ
• SiteCatalyst
• Visitorville
• WebTrends

* Source: Metadata Maturity Model Presentation, Ron Daniel, ESS’05
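The basic reports listed above need very little code once the log is in hand. The following is a minimal sketch, assuming each log entry has already been parsed into a `(query, result_count, clicked)` tuple; that format and the function name are hypothetical, since real search engines each export their own log layout.

```python
from collections import Counter


def query_reports(log, top_n=3):
    """Build three basic reports from a parsed query log.

    log: iterable of (query, result_count, clicked) tuples (assumed format).
    """
    top = Counter()
    no_results = Counter()
    no_clicks = Counter()
    for query, result_count, clicked in log:
        q = query.strip().lower()  # fold case and stray whitespace
        top[q] += 1
        if result_count == 0:
            no_results[q] += 1
        elif not clicked:
            no_clicks[q] += 1
    return {
        "top_queries": top.most_common(top_n),
        "no_results": no_results.most_common(top_n),
        "no_click_through": no_clicks.most_common(top_n),
    }
```

Queries that show up in the no-results report are natural candidates for new synonyms or new terms, which ties this process back to the change request list.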

Page 58

Fundamental process #2 – Error correction

Errors will happen, and some will be found. What are you going to do about them?
• Tagging errors, content errors, taxonomy errors, …

Define an error correction process. Process will accommodate questions like:
• Who looks at it?
• Is it an error?
• What are the costs to correct vs. not correct?
• Does the correction need to be scheduled?
• etc.

Once a tagging error is corrected, NEVER lose that fact.
• Manually reviewed pages are vital for training automatic classifiers
• Has implications for metadata specification and review procedures

Over time, multiple error detection methods will be defined
• e.g. Statistical sampling of newly added pages

Gradually, additional error correction processes may be defined to deal with particular types of errors

You have an error correction process. Would you hate to see it on paper?
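Two of the points above, statistical sampling of new pages and never losing the fact that a page was manually corrected, can be sketched in a few lines. The field names (`manually_reviewed`, `reviewed_by`), the 10% default rate, and the function names are hypothetical choices for illustration.

```python
import random


def sample_for_review(page_ids, rate=0.1, seed=None):
    """Pick a random sample of newly added pages for manual tag review."""
    pages = sorted(page_ids)
    if not pages:
        return []
    size = max(1, round(len(pages) * rate))
    return random.Random(seed).sample(pages, size)


def record_review(record, corrected_tags, reviewer):
    """Return a copy of the record with the correction and the review fact kept."""
    updated = dict(record)
    updated["tags"] = list(corrected_tags)
    # Keeping this flag lets reviewed pages be reused as classifier training data.
    updated["manually_reviewed"] = True
    updated["reviewed_by"] = reviewer
    return updated
```

Storing the review flag alongside the tags is one way to act on the note that corrections have implications for the metadata specification: the specification needs a field for it.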

Page 59

Fundamental Outlooks

Measure & Improve Mindset
• Query logs and click trails are prime example
• Next place to instrument: Error correction and error detection processes

Integrated handling of Taxonomy, Metadata, UI, & Search
• To be most effective, these must work together
• Governance structure must help that happen
• Cross-functional team structure is a start

Page 60

Actions to define taxonomy governance

Initial vocabularies should be selected for stability as well as utility.

Custodians of shared vocabularies must be identified, educated re. impacts of changes.

Group of custodians and stakeholders must be established.

(Simple) System for sharing the CVs and tracking the update process must be established.


Page 62

Exercise 4: Self-Diagnosis

1. Does your organization know what it is, or wants to be, doing around search & taxonomy yet?

2. Is the cost basis for the taxonomy ROI clear to you?

3. Is the benefits basis for the taxonomy ROI clear to you?

4. Is the cost basis for the taxonomy ROI clear to your CFO?

5. Is the benefits basis for the taxonomy clear to your CFO?

6. Do you know how content will be tagged?

7. Do you know how tagged content will be displayed to users?

8. Do you know how users will fetch the content?

9. Do users know how they should report errors in the tagging?

10. Do you know what information will be logged for later analysis?

11. Do you know what information has to be reported to management to justify the taxonomy team?

12.Does management expect the taxonomy team to justify its existence?

13. Is your organization planning a tightly focused taxonomy effort?

14. Is your organization planning a credible ‘Enterprise Taxonomy Strategy’?

15.Does your organization expect its taxonomies to change frequently?

16.Has your organization identified some facets as stable and some facets as volatile?

17.Does your organization have a plan for retagging data when the taxonomy is changed?

18.Do you have an identified taxonomy “team” with at least one person?

19. Is there at least one person working on taxonomy/metadata/search more than ½ time?

20.Does the team contain members who represent search, UI, and metadata tagging?

21.Does the organization have any hiring and training criteria for taxonomy, metadata, and search positions?

22.Does the team maintain Editorial Rules?

23.Does the team maintain a corporate metadata specification?

24.Does the team maintain educational materials?

25.Does the team have a communications plan?

26.Does the team examine query logs?

27.Does the team examine click trails?

28.Does the team have a documented error correction process?

29.Does the organization have a procedure to locate ROT (Redundant, Obsolete, or Trivial content)?

30.Does the organization have any qualitative or quantitative measures of data quality?

31.Do you use a tool other than MS Excel for editing and maintaining the Taxonomy?

32.Were taxonomy, metadata, search, or content management tools purchased with money other than “use it or lose it” funds?

I think a self-diagnosis quiz like this could be nice to have in the book. Also see the “Metadata Maturity Model” stuff in the next set of slides.

Page 63

Data Governance Maturity:

When the business depends on clear description of fuzzy objects

Presented to San Francisco DAMA

Sept. 10, 2008

Ron Daniel, Jr.

Page 64

Goals for this talk

Provide you with background on maturity models.

Provide the results of our surveys of Search, Metadata, & Taxonomy practices and discuss interesting findings.

Review the practices in use at stock photo houses, and compare them to methods that may be used in typical information management projects.

Give you the tools to do a simple self-assessment of your organization’s metadata maturity

Page 65

Agenda

9:15 Metadata Definitions

9:30 Maturity Models

9:45 Metadata Maturity Model (ca. 2006)

10:15 Break

10:30 Stock Photo Business

10:40 Data Governance Practices in Stock Photo Agencies

11:40 Summary

11:45 Questions

12:00 Adjourn

Page 66

Taxonomy and metadata definitions

Metadata
• “Data about data”.
• Different communities have very different assumptions about the types of data being described.
• I’m from the Information Science community, not the database, statistics, or massive storage communities.

Taxonomy
1. The classification of organisms in an ordered system that indicates natural relationships.
2. The science, laws, or principles of classification; systematics.
3. Division into ordered groups, categories, or hierarchies.

Page 67

Examples of taxonomy used to populate metadata fields

[Figure: documents of several file types (Excel, PDF, Office, PowerPoint icons) described by metadata fields whose values are drawn from facets of the taxonomy.]

Metadata
• Title
• Author
• Department
• Audience
• Topic

Metadata Values (Facets within the overall Taxonomy)

Topics
• Employee Services
• Compensation
• Retirement
• Insurance
• Further Education
• Finance and Budget
• Products and Services
• Support Services
• Infrastructure
• Supplies

Audience
• Internal
  • Executives
  • Managers
• External
  • Suppliers
  • Customers
  • Partners
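Using taxonomy facets to populate metadata fields amounts to checking each field's values against a controlled list. The sketch below illustrates this with the Audience and Topic values from the slide; the `FACETS` layout, the function name, and the sample record are hypothetical.

```python
# Controlled values taken from the slide; the dict layout is an assumption.
FACETS = {
    "Audience": {"Executives", "Managers", "Suppliers", "Customers", "Partners"},
    "Topic": {
        "Employee Services", "Compensation", "Retirement", "Insurance",
        "Further Education", "Finance and Budget", "Products and Services",
        "Support Services", "Infrastructure", "Supplies",
    },
}


def invalid_values(record, facets=FACETS):
    """Return {field: [values not found in that field's controlled vocabulary]}."""
    problems = {}
    for field, allowed in facets.items():
        bad = [value for value in record.get(field, []) if value not in allowed]
        if bad:
            problems[field] = bad
    return problems
```

Free-text fields such as Title and Author are deliberately left out of `FACETS`, so only the taxonomy-controlled fields are validated.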

Page 68

Example faceted taxonomy

ABC Computers.com

Audience: All; Business; ABC Employee; Education; Gaming Enthusiast; Home; Investor; Job Seeker; Media; Partner; Shopper (First Time, Experienced, Advanced); Supplier

Line of Business: All; Home & Home Office; Gaming; Government, Education & Healthcare; Medium & Large Business; Small Business

Region-Country: All; Asia-Pacific; Canada; ABC EMEA; Japan; Latin America & Caribbean; United States

Product Family: Desktops; MP3 Players; Monitors; Networking; Notebooks; Printers; Projectors; Servers; Services; Storage; Televisions; Non-ABC Brands

Content Type: Award; Case Study; Contract & Warranty; Demo; Magazine; News & Event; Product Information; Services; Solution; Specification; Technical Note; Tool; Training; White Paper; Other Content Type

Competency: Business & Finance; Interpersonal Development; IT Professionals Technical Training; IT Professionals Training & Certification; PC Productivity; Personal Computing Proficiency

Industry: Banking & Finance; Communications; E-Business; Education; Government; Healthcare; Hospitality; Manufacturing; Petrochemicals; Retail / Wholesale; Technology; Transportation; Other Industries

Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training

Page 69

Manually tagged metadata sample

Attribute | Values
Title | Jupiter’s Ring System
URL | http://ringmaster.arc.nasa.gov/jupiter/
Description | Overview of the Jupiter ring system. Many images, animations and references are included for both the scientist and the public.
Content Types | Web Sites; Animations; Images; Reference Sources
Audiences | Educators; Students
Organizations | Ames Research Center
Missions & Projects | Voyager; Galileo; Cassini; Hubble Space Telescope
Locations | Jupiter
Business Functions | Scientific and Technical Information
Disciplines | Planetary and Lunar Science
Time Period | 1979-1999

Page 70

Other things sometimes called Taxonomy

Type: Remarks

Synonym Ring
• Connects a series of terms together
• Treats them as equivalent for search purposes
• e.g. (Dog, Canine, Pooch, Mutt) (Cat, Feline, Kitty), …

Authority File
• Used to control variant names with a preferred term
• Typically used for names of countries, individuals, organizations
• e.g. (IBM, Big Blue, International Business Machines Inc.)

Classification Scheme
• A hierarchical arrangement of terms
• May or may not follow strict “is-a” hierarchy rules
• Usually enumerated; i.e., LC or Dewey

Thesaurus
• Expresses semantic relationships of:
  • Hierarchy (broader & narrower terms)
  • Equivalence (synonyms)
  • Associative (related terms)
• May include definitions

Ontology
• Resembles faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules
• A model of reality, allowing inferences to be made.
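The simplest structure in the list above, the synonym ring, can be illustrated with a small query-expansion sketch. The two rings come from the slide; the `SYNONYM_RINGS` layout and the `expand_query` function are hypothetical.

```python
# Each ring is a set of terms treated as equivalent for search purposes.
SYNONYM_RINGS = [
    {"dog", "canine", "pooch", "mutt"},
    {"cat", "feline", "kitty"},
]


def expand_query(term, rings=SYNONYM_RINGS):
    """Return the term plus every term its synonym ring treats as equivalent."""
    key = term.strip().lower()
    for ring in rings:
        if key in ring:
            return sorted(ring)
    # Terms outside every ring are searched as entered.
    return [key]
```

An authority file differs in that one member of each group is designated the preferred term, so it would map every variant to a single value rather than returning the whole set.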


Page 72

Organizational benchmarking

A common goal of organizations is to ‘benchmark’ themselves against other organizations.

Different organizations have:
• Different levels of sophistication in their planning, execution, and follow-up for CMS, Search, Portal, Metadata, and Taxonomy projects.
• Different reasons for pursuing Search, Metadata, and Taxonomy efforts
• Different cultures

Benchmarks should be to similar organizations.

Page 73

Is unnecessary capability harmful?

Tool Vendors continue to provide ever-more capable tools with ever-more sophisticated features.
• But we live in a world where a significant fraction of public, commercial, web pages don’t have a <title> tag.
• Organizations that can’t manage <title> tags stand a very poor chance of putting an entity extractor to use, which requires some ongoing management of the lists of entities to be extracted.
• Organizations that can’t create and maintain clean metadata can’t put a faceted search UI to good use.

Unused capability is poor value-for-money.
• Organizations over-spend on tools and under-spend on staff & processes.
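Whether a site manages its <title> tags is easy to measure before investing in more capable tools. The following is a minimal sketch using Python's standard `html.parser` module; the class and function names are hypothetical, and fetching the pages is left out.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text found inside a page's <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def has_title(html):
    """True if the page has a non-empty <title>."""
    finder = TitleFinder()
    finder.feed(html)
    return bool(finder.title.strip())


def fraction_missing_title(pages):
    """Share of pages (HTML strings) with a missing or empty <title>."""
    pages = list(pages)
    if not pages:
        return 0.0
    missing = sum(1 for html in pages if not has_title(html))
    return missing / len(pages)
```

A high fraction here would suggest spending on staff and process before spending on entity extraction or faceted search.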

Page 74

Towards better benchmarking…

Wanted a method to:
• Generally identify good and bad practices.
• Help clients identify the things they can do, and the things that stand an excellent chance of failing.
• Predict likely sources of problems in engagements.

We have started to develop a Metadata Maturity Model, inspired by Maturity Models from the software industry.

To keep the model tied to reality, we are conducting surveys to determine the actual state of practice around search, metadata, taxonomy, and supporting business functions such as staffing and project management.

Page 75

A Tale of Two Software Maturity Models

CMMI (Capability Maturity Model Integration)

vs.

The Joel Test

Page 76

CMMI structure

Source: http://chrguibert.free.fr/cmmi

Maturity Models are collections of Practices.

Main differences in Maturity Models concern:

• Descriptivist or Prescriptivist Purpose

• Degree of Categorization of Practices

• Number of Practices (~400 in CMMI)

Page 77

22 Process Areas, keyed to 5 Maturity Levels…
• Process Areas contain Specific and Generic Practices, organized by Goals and Features, and arranged into Levels
• Process Areas cover a broad range of practices beyond simple software development

CMMI Axioms:
• Individual processes at higher levels are AT RISK from supporting processes at lower levels.
• A Maturity Level is not achieved until ALL the Practices in that level are in operation.
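The second axiom implies a simple scoring rule: walk up the levels and stop at the first one with any practice not in operation. The sketch below shows that rule under stated assumptions; the input structure and practice names are hypothetical, and it treats level 1 as the starting point with no required practices.

```python
def maturity_level(practices_by_level):
    """Highest level whose practices, and those of every level below, all operate.

    practices_by_level: {level: {practice_name: in_operation}} (assumed shape).
    """
    achieved = 1  # Assumption: level 1 is the baseline with no required practices.
    for level in sorted(practices_by_level):
        if level <= 1:
            continue
        if not all(practices_by_level[level].values()):
            break  # Higher levels do not count once a lower level is incomplete.
        achieved = level
    return achieved
```

The early `break` reflects the first axiom as well: practices in place at a higher level earn no credit while a supporting level below them is incomplete.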

Page 78

CMMI Positives

Independent audits of an organization’s level of maturity are a common service
• Level 3 certification frequently required in bids

“…compared with an average Level 2 program, Level 3 programs have 3.6 times fewer latent defects, Level 4 programs have 14.5 times fewer latent defects, and Level 5 programs have 16.8 times fewer latent defects”.

Michael Diaz and Jeff King – “How CMM Impacts Quality, Productivity, Rework, and the Bottom Line”

‘If you find yourself involved in product liability litigation you're going to hear terms like "prevailing standard of care" and "what a reasonable member of your profession would have done". Considering the fact that well over a thousand companies world-wide have achieved level 3 or above, and the body of knowledge about the CMM is readily available, you might have some explaining to do if you claim ignorance’.

Linda Zarate in a review of A Guide to the Cmm: Understanding the Capability Maturity Model for Software by Kenneth M. Dymond

Page 79

CMMI Negatives

Complexity and Expense
• Reading and understanding the materials
• Putting it into action – identifying processes, mapping processes to model, gathering required data, …
• Audits are expensive

CMMI does not scale down well to small shops
• Has been accused of restraint of trade

Page 80

At the other extreme, The Joel Test

Developed by Joel Spolsky as reaction to CMMI complexity

Positives - Quick, easy, and inexpensive to use.

Negatives - Doesn’t scale up well:
• Not a good way to assure the quality of nuclear reactor software.
• Not suitable for scaring away liability lawyers.
• Not a longer-term improvement plan.

The Joel Test
1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?

Scoring: 1 point for each ‘yes’. A score of 12 is perfect and 11 is tolerable; 10 or lower indicates serious trouble.
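The scoring rule is simple enough to sketch in a few lines. The following is a minimal illustration, not part of the original slides: the function names are hypothetical, and the cut-off is left as a parameter whose default follows Spolsky's article, which treats a score of 10 or lower as a sign of serious problems.

```python
# The twelve Joel Test questions, in order.
JOEL_TEST = [
    "Do you use source control?",
    "Can you make a build in one step?",
    "Do you make daily builds?",
    "Do you have a bug database?",
    "Do you fix bugs before writing new code?",
    "Do you have an up-to-date schedule?",
    "Do you have a spec?",
    "Do programmers have quiet working conditions?",
    "Do you use the best tools money can buy?",
    "Do you have testers?",
    "Do new candidates write code during their interview?",
    "Do you do hallway usability testing?",
]


def joel_score(answers):
    """Return one point for each 'yes' (True) answer."""
    if len(answers) != len(JOEL_TEST):
        raise ValueError(f"expected {len(JOEL_TEST)} answers, got {len(answers)}")
    return sum(1 for answer in answers if answer)


def in_serious_trouble(score, trouble_at_or_below=10):
    """Hypothetical helper: the default cut-off follows Spolsky's article."""
    return score <= trouble_at_or_below
```

For example, a team answering 'yes' to the first nine questions only scores 9, which falls in the trouble range.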


What does software development “Maturity” really mean?

A low score on a maturity audit DOES NOT mean that an organization can’t develop good software

It DOES mean that whether the organization will do a good job depends on the specific mix of people assigned to the project

In other words, it sets a floor for how badly an organization is likely to do, not a ceiling on how well it can do

Probability of failure is a good thing to know before spending a lot of time and money


Towards a Metadata Maturity Model

Caveats:

Maturity is not a goal, it is a characterization of an organization’s methods for achieving its core goals.

Mature processes impose expenses which must be justified by consequent cost savings, revenue gains, or service improvements.

Nevertheless, Maturity Models are useful as collections of best practices and stages in which to try to adopt them.


Basis for initial maturity model

CEN study on commercial adoption of Dublin Core

Small-scale phone survey
• Organizations which have world-class search and metadata externally
• Not necessarily the most mature overall processes or the best internal search and metadata

Literature review

Client experiences

Structure from software maturity models


Initial Metadata Maturity Model (ca. May, 2005)

Columns: Practice Area; Maturity Level (Basic, Intermediate, Advanced, Bleeding-Edge); Limiting

Search Capabilities: Uniform Search Box; Query Log Exam.; Index Multiple Repos.; Best Bets; Simple Grouping; Intranet Facet Navigation; Improved Ranking

Metadata and taxonomy standards: System MD Stds.; Organization MD Std.; Reuse ERP; Multiple Repos Comply; Taxonomy Roadmap; Highly Abstract Subject Taxos.

Tools and tool selection: Requirements, then Tools; Bakeoff Datasets; Budget for Bakeoffs; Unneeded Capabils.; Tools, then Reqs.

Staff training and hiring: Search Analyst Role; Librarian Expertise; Pre-hire Testing; SME Catalogers

Data creation and QA: CM Introduced; ROT-Elimination; Hybrid Creation Model; Adaptive Qualification; Quality Measures

Project management: Project Plan; Std. Proj. Methodol.; X-Functional Teams; Communication Plan; Multi-Year Plan; Early Termination

Executive support and ROI: External Search ROI; Intranet ROI Model; CEO knows Search ROI; Use it or Lose It Budgets

37 Practices, Categorized by Area, Level, and Importance


Shortcomings of the initial model

No idea of how it corresponds to actual practice across multiple organizations
• Some indications that it over-emphasized the sophisticated practices and under-emphasized beginning practices.

The initial metadata maturity model can be regarded as a hypothesis about how an organization progresses through various practices as it matures
• How to test it? Let’s ask!
• Two surveys to date
• Surveys are being run in stages because of large number of practices.
• Ask about future, current, and former practices to gather information on progression


Agenda

9:15 Metadata Definitions

9:30 Maturity Models

9:45 Metadata Maturity Model (ca. 2006)

10:15 Break

10:30 Stock Photo Business

10:40 Data Governance Practices in Stock Photo Agencies

11:40 Summary

11:45 Questions

12:00 Adjourn


Survey 1: Search, Metadata, & Taxonomy Practices

The data in this section comes from a survey conducted in the autumn of 2005.


Participants by Organization Size


Participants by Job Role


Participants by Industry


Search Practices

Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Search Box in standard place on all web pages. | 20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3)
Search engine indexes multiple repositories in addition to web sites. | 25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5)
Spell Checking. | 31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8)
Synonym Searching. | 41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4)
Search results grouped by date, location, or other factors in addition to simple relevance score. | 37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4)
Queries are logged and the logs are regularly examined | 31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5)
Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top. | 46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5)
Advanced computation of relevance based on data in addition to the text of the document. | 43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10)
A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search. | 68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9)
A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal. | 57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7)
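Each cell in the survey results pairs a percentage with a respondent count in parentheses. The sketch below shows one way to reproduce the percentages from the counts. It assumes (this is not stated on the slide) that each percentage is the count divided by the row total, rounded to a whole number; the function name is hypothetical.

```python
def row_percentages(counts):
    """Convert one survey row of respondent counts into whole-number percentages.

    Assumption: the base is the total number of responses in the row.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("row has no responses")
    return [round(100 * count / total) for count in counts]


# Counts from the "Search Box in standard place on all web pages" row:
# not current, being developed, in practice, former practice, NA or unknown.
search_box_counts = [12, 7, 38, 1, 3]
search_box_percentages = row_percentages(search_box_counts)
```

Under that assumption the computed values agree with the figures reported for the row, which suggests the row total (61 responses) is the base used for that question.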


Metadata Practices

Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Metadata standards are developed for the needs of each system with no overall attempt to unify them. | 22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6)
An Organization-wide metadata standard exists and new systems consider it during development. | 37% (22) | 37% (22) | 20% (12) | 0% (0) | 7% (4)
The Organization-wide metadata standard is based on the Dublin Core. | 52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7)
Multiple repositories comply with metadata standard. | 52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7)
A Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard. | 48% (29) | 20% (12) | 20% (12) | 0% (0) | 12% (7)
The Cataloging Policy document is revised periodically. | 48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12)
A centralized metadata repository exists to aggregate and unify metadata from disparate sources. | 57% (34) | 17% (10) | 17% (10) | 0% (0) | 10% (6)
Metadata is manually entered into web forms. | 15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5)
Metadata is generated automatically by software. | 38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9)
Metadata is generated automatically, then reviewed manually for correction. | 48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9)

These two questions were the only ones with much correlation to organization size


Taxonomy Practices

Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
'Org Chart' Taxonomy - One based primarily on the structure of the organization. | 36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9)
'Products' Taxonomy - One based primarily on the products and/or services offered by the organization. | 37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9)
'Content Types' Taxonomy - One based primarily on the different types of documents. | 28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4)
'Topical' Taxonomy - One based primarily on topics of interest to the site users. | 20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4)
'Faceted' Taxonomy - One which uses several of the approaches above. | 32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3)
The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor. | 75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5)
The Taxonomy follows a written 'style guide' to ensure its consistency over time. | 47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6)
The Taxonomy is maintained using a taxonomy editing tool other than MS Excel. | 35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4)
The Taxonomy was validated on a representative sample of content during its development. | 28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8)
A Roadmap for the future evolution of the Taxonomy has been developed. | 38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5)


Survey 2: Business Drivers, Processes, and Staffing

The data in this section comes from a survey conducted in the spring of 2006.


Participants by Job Role


Participants by Tenure


Participants by Industry


Participants by Organization Size


Business Drivers: Search, Metadata, and Taxonomy (SMT) Applications


Business Drivers: Desired Benefits

Other desired benefits:

1. Innovation
2. Core to our business product
3. Clients do all the above [From a consultant]
4. Better navigation to diverse State web sites
5. Increased knowledge sharing across the corporation
6. Interoperability
7. Dynamic web applications
8. Improved user search experience
9. Improve R&D
10. Higher value to members [From a non-profit membership org.]
11. For organization to have better understanding of their content


ROI: Cost Estimation


Processes

Use of search logs is improving

Surprisingly sophisticated

Basic data quality and communications need improvement

Many solo operators


Team Structures & Staffing


Salary Survey

Experience 0.6 Nice to see it really counts.

Geography 0.5 California and the Northeast have highest salaries.

Co. Size 0.5 Not very reliable, big changes from one datapoint

Education 0.4 Many taxonomists have MLS or above.

Industry 0.4 Surprisingly, retail has high salaries for taxonomists.

Role 0.04 Taxonomists paid about like Information Architects

Time at current job -0.07


Notes from Participants

There is the constant struggle with individual [magazine] titles to hire trained librarians or data specialists instead of trying to save money by hiring an editor who can build articles AND create and assign metadata. This is a governance issue we have been struggling with since we have no monetary stake in the individual publications. We make recommendations, but have no higher level authority to require titles to hire trained staff for metadata.

Reporting metrics have become a new area of confusion as we move to portalized pages consisting of objects in portlets, each with their own metadata.

Key organizational issue is that the "problems" that stem from lack of systematic metadata/taxonomy creation are not "owned" by anyone, and consequently have no budget for their solution.


Interim Conclusions


Observations (1)

Practices which a single person or a small group can carry out are more commonly used
• Not surprising
• Very different than ERP/BPR, indicates that information management is not being sold to the “C-level” staff.

People need to question how inclusive their “Organizational Metadata Standards” and “Taxonomy Roadmaps” actually are.
• We have found Taxonomy Roadmaps to be an advanced practice, due to a dependence on knowing upcoming IT development schedule


Observations (2)

Many of the basics are being skipped
• More organizations doing “Spell Checking” than “Query Log Analysis”.
• 69% have a taxonomy change plan, but only 41% have a plan for revisiting data if the taxonomy changes.
• 64% have a communications plan, but only 56% have a website.

This seems to be linked to the previous observation – things that are easy for an individual get done before things that need an organizational effort, despite their level of ‘sophistication’.


Interim Metadata Maturity Model (ca. May, 2006)

Columns: Practice Area; Basic; Intermediate; Advanced; Limiting

Search Capabilities: Uniform Search Box; Query Log Exam.; Index Multiple Repos.; Best Bets; Facet Navigation UI

Metadata and taxonomy standards: System MD Stds.; Organization MD Std.; Multiple Repos Comply w/ MD Std.; Reuse ERP Taxos; Taxo Maint. Doc; Taxonomy Roadmap; Highly Abstract Subject Taxos (e.g. “Moods”); Metadata Maint. Doc

Tools and tool selection: Requirements, then Tools; Bakeoff Datasets; Budget for Bakeoffs; Tools, then Reqs.

Staff training and hiring: Librarian or IA Expertise; Search Analyst Role; Cross-Functional Taxonomy Creation; Cross-functional taxonomy maint.; SME Catalogers; Pre-hire Testing

Data creation and QA: CM Introduced; ROT-Elimination; Semi-auto tagging; Quality Measures

Project management: Project Plan; X-Functional Teams; Std. Proj. Methodol.; Multi-Year Plan; Communication Plan; SMT Business Manager, instead of IT Manager; Early Termination

Executive support and ROI: External Search ROI; SMT in separate silos; Intranet ROI Model; CEO knows Search ROI; Use it or Lose It Budgets


Search and Metadata Maturity Quick Quiz

Basic
1) Is there a process in place to examine query logs?
2) Is there a process for adding directories and content to the repository, or do people just do what they want?
3) Is there an organization-wide metadata standard, such as an extension of the Dublin Core, for use by search tools, multiple repositories, etc.?

Intermediate
4) Does the search engine index more than 4 repositories around the organization?
5) Does the search engine integrate with the taxonomy to improve searches and organize results?
6) Are there hiring and training practices especially for metadata and taxonomy positions?
7) Is there an ongoing data cleansing procedure to look for ROT (Redundant, Obsolete, Trivial content)?
8) Are tools only acquired after requirements have been analyzed, or are major purchases sometimes made to use up year-end money?

Advanced
9) Are there established qualitative and quantitative measures of metadata quality?
10) Can the CEO explain the ROI for search and metadata?


Stock Photo Business

Advertising, Editorial Content, Corporate Communications, and many other types of content rely on images to convey information and moods.

When time and/or budget does not allow a commissioned shoot, stock photo houses can supply images.

Fundamental problem for users: How to search for an image that conveys what you want?

Fundamental problem for houses: How to describe images so that users can find them?


How would you search for this image?


Tagging by emotions


“silence”

Conceptual refinement

Objective criteria

Conceptual refinement

Image Rights Criteria


Clarification: Finger on Lips


Scrolling through results…

This is more of the mood I’m looking for…


More like this


Facets at gettyimages.com


Key Questions

Getty Images (and Corbis) have put a lot of effort into their websites for image purchase*.

Internal staff at such organizations tell me that their intranets are nowhere near as easy to use.
• ROI is the reason why.
• Recall that retail had high salaries for taxonomists, because the ROI for a better shopping site is so clear.

The front-ends are dependent on data.
• How is that data governed?
• How does that differ from how their intranets are governed?

*Licensing, not purchasing, to be pedantic.


Who are the users & what are they looking for?

Only 30-40% of organizations regularly examine their logs.

Sophisticated software available, but don’t wait. 80% of value comes from basic reports


Query log & click trail examination – Click trail packages

• iWebTrack
• NetTracker
• OptimalIQ
• SiteCatalyst
• Visitorville
• WebTrends

Overkill


Query log & click trail examination – Query log

UltraSeek Reporting
• Top queries
• Queries with no results
• Queries with no click-through
• Most requested documents
• Query trend analysis
• Complete server usage summary

Basic queries provide most of the value if organization has a process to review what is going on.
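Three of the basic reports named on this slide (top queries, queries with no results, queries with no click-through) need very little tooling. The sketch below is an illustration only: the log record layout (`query`, `result_count`, `clicked`) and the function name are assumptions, not the format of UltraSeek or of any other product.

```python
from collections import Counter


def basic_query_reports(log, top_n=10):
    """Build three basic query-log reports from a list of log entries.

    Each entry is assumed to be a dict with the hypothetical keys:
      query        - the text the user typed
      result_count - number of results returned
      clicked      - True if the user clicked any result
    """
    # Normalize case and whitespace so trivially different queries are pooled.
    rows = [
        (entry["query"].strip().lower(), entry["result_count"], entry["clicked"])
        for entry in log
    ]
    top_queries = Counter(q for q, _, _ in rows)
    no_results = Counter(q for q, n, _ in rows if n == 0)
    no_click = Counter(q for q, n, clicked in rows if n > 0 and not clicked)
    return {
        "top_queries": top_queries.most_common(top_n),
        "no_results": no_results.most_common(top_n),
        "no_click_through": no_click.most_common(top_n),
    }


sample_log = [
    {"query": "Expense Report", "result_count": 12, "clicked": True},
    {"query": "expense report", "result_count": 12, "clicked": False},
    {"query": "holiday schedule", "result_count": 0, "clicked": False},
    {"query": "org chart", "result_count": 5, "clicked": False},
]
reports = basic_query_reports(sample_log)
```

Queries that return nothing point to missing content or missing synonyms; queries that return results nobody clicks point to ranking or labeling problems. Either list is only useful if someone is assigned to review it regularly.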


Key Governance Aspects

Roles and Responsibilities –
• Managers
• Reviewers

Policies –
• For naming
• Required Fields

Procedures –
• For reviewing and approving metadata placement
• For acting on poor metadata application
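A required-fields policy and a review procedure can be connected with a simple check. The sketch below is illustrative only: the field list is a hypothetical policy (the names happen to be Dublin Core element names), and the function names are not taken from any particular system.

```python
# Hypothetical policy: fields every record must carry.
REQUIRED_FIELDS = ("title", "creator", "date", "subject")


def missing_required_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent, None, or blank strings."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def review_queue(records, required=REQUIRED_FIELDS):
    """Map each non-compliant record id to its missing fields, for reviewers."""
    queue = {}
    for record_id, record in records.items():
        missing = missing_required_fields(record, required)
        if missing:
            queue[record_id] = missing
    return queue


sample_records = {
    "doc-1": {"title": "Travel policy", "creator": "HR", "date": "2006-05-01", "subject": "Travel"},
    "doc-2": {"title": "  ", "creator": "Finance"},
}
needs_review = review_queue(sample_records)
```

The output of such a check only becomes governance once a named reviewer is responsible for acting on it.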


Recommended Measure and Improve Mindset

Measure – Determine current situation and what is wrong.
• Too many documents in a category? Too many categories? People complaining about not finding material that is on the site? People asking for materials not on the site? Common searches without results?

Decide – Decide how to change things to fix the problem.
• Change navigation list? Add new categories? Add synonyms to search? Create new content?

Confirm – Before rolling out changes, test them to make sure they will improve the problem.
• Usability tests, Card sorts, Internal functionality tests, …

Implement – Roll out the changes.

Repeat – Monitor people’s behavior on the site as well as responding to reported problems.
• Query log examination, Clicktrail examination, Google search result position, Stakeholder feedback, User surveys, Site analytics, etc.
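One of the Measure questions, whether a category holds too many documents, can be answered with a count per category. This is a minimal sketch under stated assumptions: the input maps a document id to its list of categories, and the threshold of 200 is an arbitrary placeholder that each site would set for itself.

```python
from collections import Counter


def overloaded_categories(doc_categories, max_docs=200):
    """Return (category, document count) pairs above a hypothetical threshold.

    doc_categories maps a document id to the list of categories assigned to it.
    Results are sorted with the largest categories first, then alphabetically.
    """
    counts = Counter(
        category
        for categories in doc_categories.values()
        for category in categories
    )
    flagged = [(cat, n) for cat, n in counts.items() if n > max_docs]
    return sorted(flagged, key=lambda pair: (-pair[1], pair[0]))


sample_docs = {
    "d1": ["HR"],
    "d2": ["HR", "Finance"],
    "d3": ["HR"],
}
flagged = overloaded_categories(sample_docs, max_docs=2)
```

A flagged category is a candidate for the Decide step, for example by splitting it into narrower categories, and the same count can be rerun after the change to confirm the effect.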


Taxonomy team: Generic roles

Business Lead

Technical Specialist

Content Specialist

Taxonomy Specialist

Content Owners

Keeps team on track with larger business objectives.

Reality check on process change suggestions.

Balances cost/benefit issues to decide appropriate levels of effort.

Obtains needed resources if those on committee can’t accomplish a particular task.

Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.

Helps obtain data from various systems.

Committee’s liaison to content creators.

Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.

Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.

Makes edits to taxonomy, installs into system with aid of IT specialist.

Stakeholder Committee
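The role descriptions above include estimating the cost of a proposed change in terms of the amount of data to be retagged. A rough sketch of that estimate follows. The data layout, the function name, and the minutes-per-document figure are all assumptions for illustration; real figures would come from the organization's own tagging records.

```python
def retagging_estimate(doc_tags, affected_terms, minutes_per_doc=2.0):
    """Estimate the retagging effort for a proposed taxonomy change.

    doc_tags        - maps a document id to the taxonomy terms applied to it
    affected_terms  - terms that the change would rename, split, or remove
    minutes_per_doc - hypothetical average review time per affected document
    """
    affected = set(affected_terms)
    doc_ids = sorted(
        doc_id for doc_id, tags in doc_tags.items() if affected & set(tags)
    )
    return {
        "documents": len(doc_ids),
        "hours": len(doc_ids) * minutes_per_doc / 60,
        "doc_ids": doc_ids,
    }


sample_tags = {
    "d1": ["Laptops", "Sales"],
    "d2": ["Desktops"],
    "d3": ["Laptops"],
}
estimate = retagging_estimate(sample_tags, ["Laptops"], minutes_per_doc=3)
```

An estimate like this gives the committee a number to weigh against the expected benefit before a change is approved.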


Recommended Reading

CMMI: http://chrguibert.free.fr/cmmi

(Official site is http://www.sei.cmu.edu/cmmi/, but that is not the most comprehensible.)

Joel Test: http://www.joelonsoftware.com/articles/fog0000000043.html

EIA Roadmap: http://www.louisrosenfeld.com/presentations/031013-KMintranets.ppt

Enterprise Search Report: http://www.cmswatch.com/EntSearch/


Fun Questions

The animals are divided into: (a) belonging to the emperor, (b) embalmed, (c) tame, (d) sucking pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies.

Jorge Luis Borges, "THE ANALYTICAL LANGUAGE OF JOHN WILKINS", Works in 3 volumes (in Russian). St. Petersburg, "Polaris", 1994. V. 2: 87.

This was created to be as bad a classification as possible.

What makes it so bad?


Contact Info

Ron Daniel, Jr.

925-368-8371

[email protected]