Metadata Strategies: Alternatives for creating value from metadata
Tom Reamy, Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
2
Agenda
To Metadata or not to Metadata
Issues and Approaches to Metadata
Infrastructure Solution
– Metadata Contexts
– Tools
– People
Why an Infrastructure Solution?
– Decreasing cost and increasing value
Conclusion
3
Metadata about Metadata: Two Sources
Global Corporate Circle DCMI 2003 Workshop
– Importance of Metadata
– Difficulty of implementation and justification
• http://dublincore.org/groups/corporate/Seattle
KAPS Group Experience
– Consulting, Taxonomy & Metadata, Strategy
– Knowledge architecture audit
– Partners – Inxight, Convera, etc.
– Intellectual infrastructure for organizations
• Knowledge organization, technology, people and processes
• Search, CM, portals, collaboration, KM, e-learning, etc.
4
To Metadata or not to Metadata: That is the Question
Whether ’tis nobler in the mind to suffer the slings and arrows of outrageous search results
Or to take up metadata against a sea of irrelevance
And by organizing them find them?
Why not Metadata?
– Costly - $200K to set up, maintenance costs
– Difficult to do
• Missing, incorrect, confusing, inconsistent
• Poor quality metadata can make search worse
5
To Metadata or not to Metadata: That is the Question
Why Metadata?
Metadata is expensive – only if it is a unique job every time
Not having metadata is even more expensive
– $8,200 per employee per year
– IDC – 1,000 people - $2.5 mil a year lost
Need more sophisticated ROI
– Saved time per search
– Easy to measure, hard to believe
– Use relative not absolute numbers
• 60 people years for metadata for 1 million documents
• 6,000 people years to write them, so metadata = 1% of the writing effort
– Regulatory and Legal requirements
– Stories – business needs, improvements
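The relative-numbers argument above can be checked with a few lines of arithmetic. The sketch below uses only the two person-year figures from the slide; the 2,000 working hours per person-year is an assumption added here to turn them into per-document effort, and the function names are illustrative.

```python
# Figures taken from the slide.
METADATA_PERSON_YEARS = 60       # to add metadata to 1 million documents
AUTHORING_PERSON_YEARS = 6_000   # to write the same 1 million documents
DOCUMENTS = 1_000_000

# Assumption (not from the slide): about 2,000 working hours per person-year.
HOURS_PER_PERSON_YEAR = 2_000


def relative_cost(metadata_years: float, authoring_years: float) -> float:
    """Metadata effort expressed as a fraction of authoring effort."""
    return metadata_years / authoring_years


def minutes_per_document(person_years: float, documents: int) -> float:
    """Average effort per document in minutes, under the hours assumption."""
    return person_years * HOURS_PER_PERSON_YEAR * 60 / documents


if __name__ == "__main__":
    share = relative_cost(METADATA_PERSON_YEARS, AUTHORING_PERSON_YEARS)
    meta_min = minutes_per_document(METADATA_PERSON_YEARS, DOCUMENTS)
    write_hours = minutes_per_document(AUTHORING_PERSON_YEARS, DOCUMENTS) / 60
    print(f"Metadata effort relative to authoring: {share:.0%}")
    print(f"Metadata effort per document: {meta_min:.1f} minutes")
    print(f"Authoring effort per document: {write_hours:.1f} hours")
```

Under that assumption the figures work out to 7.2 minutes of metadata work against 12 hours of writing per document, which is the kind of relative number that is easier to believe than an absolute dollar total.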
6
Metadata Issues and Approaches: Alternatives
Metadata, we don’t need no stinking metadata
– Condemned to wander search results lists forever
– Need to answer these people
KA Team – Consultants
– Costly, Still need to maintain
Automatic metadata (clustering & categorization)
– Uneven, poor quality
Author generated metadata
– Uneven quality, inconsistent
– Cultural – getting authors to want to do it
7
Metadata Issues and Approaches: Alternatives
Content Value Tiers - Rosenfeld
Different levels of metadata for different documents
– Not too much, not too little, just right
High value documents get full metadata, others less
Criteria: authority, strategic value, popularity, currency, reusability
Practical solution, 80-20 rule, focus on business value
BUT – doesn’t answer how to do metadata, simply limits problem
AND – adds some new problems
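One way to picture content value tiers is as a scoring rule over the five criteria listed above. The sketch below is only an illustration: the 0 to 5 scale, the equal weighting, the thresholds and the tier names are all assumptions made here, not part of the Rosenfeld approach as presented.

```python
from dataclasses import dataclass, field
from typing import Dict

# The five criteria named on the slide.
CRITERIA = ("authority", "strategic_value", "popularity", "currency", "reusability")


@dataclass
class Document:
    title: str
    # Hypothetical 0-5 rating per criterion; missing criteria count as 0.
    scores: Dict[str, int] = field(default_factory=dict)


def value_score(doc: Document) -> float:
    """Unweighted mean of the criterion ratings (equal weights are an assumption)."""
    return sum(doc.scores.get(c, 0) for c in CRITERIA) / len(CRITERIA)


def metadata_tier(doc: Document) -> str:
    """Map a value score to a metadata tier using hypothetical thresholds."""
    score = value_score(doc)
    if score >= 4.0:
        return "full"      # high value: full metadata
    if score >= 2.0:
        return "standard"  # middle tier: a reduced field set
    return "minimal"       # low value: system-generated fields only


if __name__ == "__main__":
    policy = Document("Travel policy", {c: 5 for c in CRITERIA})
    memo = Document("Lunch memo", {"popularity": 3})
    print(policy.title, "->", metadata_tier(policy))
    print(memo.title, "->", metadata_tier(memo))
```

Note that the ratings themselves have to come from somewhere, which is the cost the next slide raises: deciding what is good content is itself a metadata task.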
8
Metadata Issues and Approaches: Alternatives
Content Value Tiers - Rosenfeld
Who decides what is good content?
– Publishers, Authors, KA Team
– Political and Quality Issues
Objective – metrics of use
– Popularity is not high value
– High impact if not found
– Who decides which measure to use?
Authority, popularity, etc. are metadata
– Almost as much effort to decide what is good as add metadata
Points in the right direction – multiple approaches
9
Infrastructure Solutions: The Right Context
No one solution
– Can’t answer content questions from the perspective of content alone – need to understand users, activities and organization
Context – understanding your context
– Match amount of metadata to value
– Match type of metadata to content and use
– Lower the cost and increase the value
The problem is not that metadata initiatives have been too complex; it is that they have been too simple.
– Metadata is more than adding keywords as an afterthought
For the same or less effort, you can go from metadata that makes search worse to a set of solutions
10
Infrastructure Solutions: The Right Context
Content – structured & unstructured, external & internal
Taxonomies, Metadata and Controlled Vocabularies
– Standards, best practices
Publishing Policy and Procedures
Technologies – search, portals, CM, applications
People
– Central Team and Subject matter experts
– Communities of users and information behaviors
Business processes and requirements
11
Infrastructure Solutions: Metadata Contexts
Why are you adding metadata?
– Ranking for Retrieval – lower value
– Context – dynamic browse, multiple views
– Get content to users - agents
Metadata Standards – matching the level and unit of organization
– Paragraph – XML, RDF – High value, training
– Document – Titles, keywords
– Collection – Publisher, Functions – Facets, low cost
• Documents into metadata
12
Infrastructure Solutions: Metadata Contexts
Keywords – most difficult
• Common terms, unique terms, aboutness terms
• Need to do it right and completely to get real value
Keywords - Need Taxonomy, Controlled Vocabulary
– Enhance quality, consistency
– Supports author generated metadata
Value from all fields
– Titles and Descriptions – Balance of system and description
– Publisher and Author – Automated and easy
– DocumentType – FAQs, Policy Doc – support user behavior
– Audience – target information, agents – variety of apps
• Purpose per Audience
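The point that keywords need a controlled vocabulary can be sketched as a normalization step: author-supplied keywords are mapped to preferred terms, and anything outside the vocabulary is set aside for review. The vocabulary contents and function names below are hypothetical examples, not terms from the presentation.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical controlled vocabulary: preferred term -> accepted variants.
CONTROLLED_VOCABULARY: Dict[str, Set[str]] = {
    "content management": {"cm", "cms", "content mgmt"},
    "knowledge management": {"km"},
    "taxonomy": {"taxonomies", "classification scheme"},
}


def build_lookup(vocabulary: Dict[str, Set[str]]) -> Dict[str, str]:
    """Flatten the vocabulary into a variant -> preferred-term lookup table."""
    lookup: Dict[str, str] = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred] = preferred
        for variant in variants:
            lookup[variant] = preferred
    return lookup


def normalize_keywords(
    raw_keywords: Iterable[str], lookup: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (preferred terms without duplicates, keywords not in the vocabulary)."""
    accepted: List[str] = []
    rejected: List[str] = []
    for raw in raw_keywords:
        key = " ".join(raw.lower().split())  # case-fold and collapse whitespace
        preferred = lookup.get(key)
        if preferred is None:
            rejected.append(raw)             # candidate for review by the central team
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, rejected


if __name__ == "__main__":
    lookup = build_lookup(CONTROLLED_VOCABULARY)
    print(normalize_keywords(["CMS", "KM", "cm", "folksonomy"], lookup))
```

The rejected list is where author-generated metadata and the central team meet: unknown terms are either mapped to an existing preferred term or proposed as additions to the vocabulary.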
13
Infrastructure Solutions: Tools
Content Management – the right place for adding metadata
– Metadata generation - Keywords within a taxonomy
– Take advantage of automation, rules, work flow
– Hybrid and Distributed
Tools for central team
– Unstructured Data management, Visualization
• KA Team reduced costs, improved quality
Applications – Search, Portals, CRM, Text Mining
– Need to be able to integrate and apply metadata
– Analytics based on meaning, metadata
– Faceted search results – high value (Marti Hearst)
14
15
Example
16
Infrastructure Solutions: People
Central Team supported by software and offering services
– Creating, acquiring, evaluating taxonomies, metadata standards, vocabularies
– Input into technology decisions and design – content management, portals, search
– Socializing the benefits of metadata, creating a content culture
– Evaluating metadata quality, facilitating author metadata
– Analyzing the results of using metadata, how communities are using it
– Research metadata theory, user centric metadata
– Design content value structure – more nuanced than good / poor content.
17
Infrastructure Solutions: Why?
Needed to implement any alternative approach
– Justification for metadata - measure and present realistic ROI
– Supplement consultants
– Integrate automated and author supplied metadata
– Integrate content tiers into broader context
Needed for tailoring solutions to organizations
Metadata as add on to a search engine purchase will fail
Most cost effective way to produce valuable metadata
Needed to support variety of cognitive behaviors
– Monkey, Panda, Banana
18
Infrastructure Solutions: Why?
Decrease the cost of creating metadata
– Tools and Processes
• Content management, categorization and visualization software
• Large batch of legacy content
• Spread the cost – automation, author, central team
– Leverage intellectual infrastructure elements – taxonomies, controlled vocabularies, metadata standards
– Increase cost of not adding metadata – policy and culture, but support with software
19
Infrastructure Solutions: Why?
Increase the value of creating metadata
– Better quality metadata
• Categorization experts and subject matter experts
– Beyond Search and relevance ranking
• Multiple facets – contextual, entity, concept, document type
• Dynamic classification – intersection of 2 subjects
• Applications – integrated metadata for portals, agents, etc.
• Personalization by categorization
– Beyond content – people metadata:
• Community personalization, information behaviors
• Community categorization
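Dynamic classification and multiple facets can both be read as simple set operations over document metadata. The sketch below assumes each document carries a set of subjects and a document type; the sample records, field names and function names are invented for illustration.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical sample records; field names are illustrative.
DOCUMENTS: List[Dict] = [
    {"title": "Expense policy", "subjects": {"finance", "travel"}, "document_type": "Policy"},
    {"title": "Travel FAQ", "subjects": {"travel"}, "document_type": "FAQ"},
    {"title": "Budget FAQ", "subjects": {"finance"}, "document_type": "FAQ"},
]


def dynamic_class(documents: List[Dict], subject_a: str, subject_b: str) -> List[Dict]:
    """Documents at the intersection of two subjects; the class is computed on demand."""
    wanted = {subject_a, subject_b}
    return [doc for doc in documents if wanted <= doc["subjects"]]


def facet_counts(documents: List[Dict], facet: str) -> Counter:
    """Count documents per value of a single-valued facet such as document_type."""
    return Counter(doc[facet] for doc in documents)


if __name__ == "__main__":
    both = dynamic_class(DOCUMENTS, "finance", "travel")
    print([doc["title"] for doc in both])
    print(facet_counts(DOCUMENTS, "document_type"))
```

Nothing here depends on relevance ranking: the value comes entirely from consistent subject and document-type values, which is why metadata quality matters more than quantity.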
20
Infrastructure Solutions: What if I can’t get there from here?
First Step – Create an infrastructure strategic vision
– Including metadata standards
KA Team – can be part time, needs official recognition
Content Management is essential
Don’t start with keywords
Buy or develop taxonomies, controlled vocabularies
Relevance ranking as last resort
– Best bet metadata
– Browse and dynamic classifications
– Faceted Displays
Think Big, Start Small, Scale Fast
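As a small starting point for "relevance ranking as last resort", best-bet metadata can be sketched as a lookup that runs before any ranking. The best-bet table and the naive title-overlap ranking below are hypothetical stand-ins, not a description of any particular search product.

```python
from typing import Dict, List

# Hypothetical best-bet table: normalized query -> editor-chosen document titles.
BEST_BETS: Dict[str, List[str]] = {
    "expense report": ["Expense policy"],
}


def keyword_rank(query: str, titles: List[str]) -> List[str]:
    """Naive fallback ranking: order titles by how many query words they contain."""
    words = set(query.lower().split())
    scored = [(len(words & set(title.lower().split())), title) for title in titles]
    ranked = sorted(scored, key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [title for score, title in ranked if score > 0]


def search(query: str, titles: List[str]) -> List[str]:
    """Best bets first; relevance-ranked results only fill in after them."""
    key = " ".join(query.lower().split())
    best = BEST_BETS.get(key, [])
    ranked = [title for title in keyword_rank(query, titles) if title not in best]
    return best + ranked


if __name__ == "__main__":
    corpus = ["Travel FAQ", "Expense policy", "Expense report template"]
    print(search("Expense Report", corpus))
```

In this example the editor-chosen policy document is listed ahead of the template, even though the template matches more query words.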
21
Conclusion
You wouldn’t run a company without organizing your employees and computers. Why think you can create information access without organizing your information?
More metadata, not less.
Better Metadata
– Better metadata system
– Better values in fields
– Better value from metadata
– Better processes for creating metadata
Questions?
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com