Lecture 21: Facetted Classification
Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS. Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2004. SIMS 202: Information Organization and Retrieval.
2004.11.094 - SLIDE 1 - IS 202 - FALL 2004
Lecture 21: Facetted Classification
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm
Fall 2004
SIMS 202: Information Organization and Retrieval
SLIDE 2
Agenda
• Facetted Classification
  – Traditional vs. Facetted Classification
  – Designing Facetted Classifications
  – Thesaurus Design
  – Assignment 6
  – Discussion Questions
• Action Items for Next Time
SLIDE 4
Controlled Vocabularies
• Vocabulary control is the attempt to provide a standardized and consistent set of terms (such as subject headings, names, classifications, etc.) with the intent of aiding the searcher in finding information
• That is, it is an attempt to provide a consistent set of descriptions for use in (or as) metadata
SLIDE 5
Hierarchical Classification
• Each category is successively broken down into smaller and smaller subdivisions
• No item occurs in more than one subdivision
• Each level divided out by a “character of division” (also known as a feature)
  – Example: Distinguish “Literature” based on:
    • Language
    • Genre
    • Time Period
Slide author: Marti Hearst
SLIDE 6
Hierarchical Classification
[Tree diagram: Literature is divided by language (English, French, Spanish), then by genre (Prose, Poetry, Drama), then by century (16th, 17th, 18th, 19th); further branches are elided with "..."]
Slide author: Marti Hearst
SLIDE 7
Labeled Categories for Hierarchical Classification
• LITERATURE
  – 100 English Literature
    • 110 English Prose
      – English Prose 16th Century
      – English Prose 17th Century
      – English Prose 18th Century
      – ...
    • 111 English Poetry
      – 121 English Poetry 16th Century
      – 122 English Poetry 17th Century
      – ...
    • 112 English Drama
      – 130 English Drama 16th Century
      – …
  – 200 French Literature
Slide author: Marti Hearst
SLIDE 8
Faceted Categories
• Mutually exclusive
  – Non-overlapping, distinct categories
• Relational
  – Relations between facets, subfacets, and foci (elements) are not restricted to hierarchical generalization-specialization relations
• Composable
  – Combined using grammars of order and relation to form compound descriptions
SLIDE 9
Faceted Classification Along With Labeled Categories
• A Language
  – a English
  – b French
  – c Spanish
• B Genre
  – a Prose
  – b Poetry
  – c Drama
• C Period
  – a 16th Century
  – b 17th Century
  – c 18th Century
  – d 19th Century
• Aa English Literature
• AaBa English Prose
• AaBaCa English Prose 16th Century
• AbBbCd French Poetry 19th Century
• BcCd Drama 19th Century
Slide author: Marti Hearst
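A minimal sketch of how such a notation could be expanded mechanically, assuming each compound code is a sequence of facet-letter/focus-letter pairs as in the tables on this slide. The function names `decode` and `label`, and the rule that "Literature" stands in when no Genre focus is given, are illustrative assumptions rather than part of the lecture.

```python
import re

# Facet tables copied from the slide: facet letter -> (facet name, {focus letter: label}).
FACETS = {
    "A": ("Language", {"a": "English", "b": "French", "c": "Spanish"}),
    "B": ("Genre", {"a": "Prose", "b": "Poetry", "c": "Drama"}),
    "C": ("Period", {"a": "16th Century", "b": "17th Century",
                     "c": "18th Century", "d": "19th Century"}),
}


def decode(notation):
    """Expand a compound code such as 'AaBaCa' into {facet name: focus label}."""
    pairs = re.findall(r"([A-Z])([a-z])", notation)
    # Reject anything that is not purely a run of uppercase/lowercase pairs.
    if not pairs or "".join(f + c for f, c in pairs) != notation:
        raise ValueError(f"malformed notation: {notation!r}")
    chosen = {}
    for facet, focus in pairs:
        name, foci = FACETS[facet]  # KeyError for an unknown facet letter
        chosen[name] = foci[focus]  # KeyError for an unknown focus letter
    return chosen


def label(notation):
    """Build a readable label; 'Literature' is an assumed filler when Genre is absent."""
    chosen = decode(notation)
    parts = []
    if "Language" in chosen:
        parts.append(chosen["Language"])
    parts.append(chosen.get("Genre", "Literature"))
    if "Period" in chosen:
        parts.append(chosen["Period"])
    return " ".join(parts)


print(label("AaBaCa"))
print(label("AbBbCd"))
```

The point of the sketch is that the labels are never stored: ten table entries are enough to name every combination.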
SLIDE 10
Ranganathan
• PMEST Facets
  – P(ersonality)
    • WHO: Types of things
  – M(atter)
    • WHAT: Constituent materials
  – E(nergy)
    • HOW: Action or activity terms
  – S(pace)
    • WHERE: Where things occur
  – T(ime)
    • WHEN: When things occur
SLIDE 11
“Classical” Facet Analysis
• Entity
• Kind
• Part
• Property
• Material
• Process
• Operation
• Patient
• Product
• By-Product
• Agent
• Space
• Time
SLIDE 12
“Classical” Facet Analysis
• What is being done?
  – Entity
  – Kind
  – Product
  – By-Product
• What are its parts?
  – Part
• What are its properties?
  – Property
  – Material
• How is this achieved?
  – Process
• By what means?
  – Operation
• By whom?
  – Agent
  – Patient
• Where?
  – Space
• When?
  – Time
SLIDE 13
“Classical” Facet Analysis
• Nouns
  – Entity
  – Kind
  – Part
  – Patient
  – Product
  – By-Product
  – Agent
• Adjectives
  – Property
  – Material
• Intransitive Verb
  – Process
• Transitive Verb
  – Operation
• Adverb
  – Space
  – Time
SLIDE 14
Semantic and Syntactic Relationships
• Semantic relationships
  – Is-A (thing/kind, genus/species)
    • Mammals
      – Primates
        » Humans
  – Has-Parts
    • Human
      – Head
        » Eyes
• Syntactic relationships
  – Compounds
    • Wheat + harvesting = “wheat harvesting”
    • Object + operation = operation on object
SLIDE 15
Faceted Classification
• Clearly distinguishes between semantic relationships and syntactic relationships
  – Semantic relationships
    • Within a facet
    • Containment relations
  – Syntactic relationships
    • Across facets
    • Combinatoric relations
• Has a “syntax” for syntactic combination of semantic terms
SLIDE 16
Power of Facet Combinations
• The syntactic relations of faceted classifications enable a small controlled vocabulary to produce
  – Many, many structured descriptions
  – Complex, but formally structured descriptions using nested compound descriptions
  – Descriptions for things we do not have words for
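To put a number on "many, many", here is a small sketch that counts the descriptions obtainable from the Language, Genre, and Period facets of the earlier literature example. It assumes the simplest possible grammar (at most one focus per facet, in a fixed facet order); the variable names are illustrative.

```python
from itertools import product

# Facets and foci from the literature example earlier in the lecture.
FACETS = {
    "Language": ["English", "French", "Spanish"],
    "Genre": ["Prose", "Poetry", "Drama"],
    "Period": ["16th Century", "17th Century", "18th Century", "19th Century"],
}

# Size of the controlled vocabulary: the total number of foci.
vocabulary_size = sum(len(foci) for foci in FACETS.values())

# Descriptions that use exactly one focus from every facet.
full = [" ".join(combo) for combo in product(*FACETS.values())]

# Descriptions where any facet may be left unspecified (None),
# excluding the empty description in which every facet is unspecified.
partial = [
    combo
    for combo in product(*[[None] + foci for foci in FACETS.values()])
    if any(combo)
]

print(vocabulary_size, len(full), len(partial))
```

Under these assumptions, 10 terms yield 36 fully specified descriptions and 79 descriptions when facets may be omitted; the count multiplies with every facet added, while the vocabulary only grows additively.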
SLIDE 17
Example: Objects
Red Plastic Glass
Blue Paper Straw
SLIDE 18
Project Team Facetted Classifications
• 007
  – Personality
    • Straw
    • Glass
  – Operation
    • Drinking
    • Slurping
    • Sipping
  – Material
    • Plastic
    • Paper
  – Color
    • Blue
    • Red
• ARTery
  – Color
  – Size
  – Material
  – Weight
  – Shape
  – Radius/Circumference
  – Density
  – Volume/Capacity
  – Function/Use
  – Hardness/Softness
  – Yin/Yang
SLIDE 19
Project Team Facetted Classifications
• Culture Feed
  – Color
    • Red
    • Blue
  – Material
    • Plastic
    • Paper
  – Use
    • Drink from
    • Drink with
  – Dimensions
    • Circumference
    • Height
    • Diameter
• Picture Portal
  – Color
    • Red
    • Blue
  – Material
    • Paper
    • Plastic
  – Use
    • Containment
    • Transport
  – Shape
    • Torus
    • Planar
  – # Holes
    • 0
    • 1
SLIDE 20
Project Team Facetted Classifications
• F.U.N.
  – Shape
  – Color
  – Material
    • Rigidity
  – Function
    • Container
    • Conduit
  – Locale
  – Weight
  – Size
• MNM
  – Functionality
    • What it does
    • What you can do with it
  – Physical Properties
    • Color
    • Shape
    • Material
SLIDE 21
Project Team Facetted Classifications
• pillBox
  – Function
    • Container
    • Conduit
  – Form
    • Shape
      – Cylinder
    • Composition
      – Paper
      – Plastic
    • Color
      – Blue
      – Red
    • Size
      – Tall and skinny
      – Short and fat
• Team iTour
  – Color
    • Red
    • Blue
  – State
    • Solid
    • Non-porous
    • Flexible
  – Material
    • Plastic
    • Paper
  – Geometry
    • Cylindrical
    • Hollow
  – Function
    • Container
    • Drinking
    • Sucking
    • Blowing
SLIDE 22
Example: Objects
Gray Metal Glass
Two Yellow Plastic Straws
SLIDE 23
Example: Objects
• Function
• Form
  – Shape
  – Material
  – Color
  – Number
Function: Drinking
Form
  Shape: Cylinder
  Material: Plastic
  Color: Red
  Number: 1
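A faceted description like the one above can be treated as a record with one field per facet, which makes selection by any combination of facets straightforward. The sketch below is illustrative only: the red plastic glass follows the values on this slide, while the facet values given to the other three objects, the class name `ObjectDescription`, and the helper `select` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectDescription:
    """One focus per facet, following the Function/Form scheme on the slide."""
    function: str
    shape: str
    material: str
    color: str
    number: int


# Only the first entry is taken from the slide; the rest are assumed values.
ITEMS = {
    "red plastic glass": ObjectDescription("Drinking", "Cylinder", "Plastic", "Red", 1),
    "blue paper straw": ObjectDescription("Drinking", "Cylinder", "Paper", "Blue", 1),
    "gray metal glass": ObjectDescription("Drinking", "Cylinder", "Metal", "Gray", 1),
    "yellow plastic straws": ObjectDescription("Drinking", "Cylinder", "Plastic", "Yellow", 2),
}


def select(items, **constraints):
    """Return the names of items whose facets match every given constraint."""
    return sorted(
        name
        for name, description in items.items()
        if all(getattr(description, facet) == value
               for facet, value in constraints.items())
    )


print(select(ITEMS, material="Plastic"))
print(select(ITEMS, material="Plastic", number=1))
```

Note that with these assumed values the Function and Shape facets do not separate a glass from a straw, which is the kind of gap the team schemes on the preceding slides tried to close with facets such as Use or # Holes.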
SLIDE 25
Faceted Classification Design
• Collect examples that need to be classified
• Identify candidates for facets and subfacets
  – Test classification scheme on examples for facet orthogonality
• Order foci within facets
• Explicate grammar for ordering and combining facets and subfacets
  – Test classification scheme on examples for combinatoric power
• Extend foci for comprehensiveness where applicable
• Create new facets and subfacets where needed
  – Test classification scheme on new examples, especially boundary cases
• Iterate and refine throughout
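One narrow, mechanical reading of the orthogonality test is that no focus should be listed under more than one facet. The sketch below checks only that; it cannot detect facets that overlap in meaning under different words, which still needs human judgment. The draft scheme and the function name `overlapping_foci` are hypothetical.

```python
from collections import defaultdict

# Hypothetical draft scheme with a deliberate defect:
# "Plastic" is listed both as a material and as a state.
DRAFT_SCHEME = {
    "Color": ["Red", "Blue"],
    "Material": ["Plastic", "Paper"],
    "Function": ["Container", "Conduit"],
    "State": ["Solid", "Flexible", "Plastic"],
}


def overlapping_foci(scheme):
    """Map each focus listed under more than one facet to the facets that list it."""
    seen = defaultdict(list)
    for facet, foci in scheme.items():
        for focus in foci:
            seen[focus.lower()].append(facet)
    return {focus: facets for focus, facets in seen.items() if len(facets) > 1}


print(overlapping_foci(DRAFT_SCHEME))
```

An empty result is necessary but not sufficient evidence of orthogonality, so the check belongs inside the iterate-and-refine loop rather than at the end of it.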
SLIDE 26
Facet Guidelines
• Terms on the same level in the ontology should be of the same level and type
• Facets, subfacets, and foci should have a discernible order
• Use of capitalization and singular/plural forms should be uniform
Inconsistent levels:
– Sports
  • Team Sports
    – Baseball
  • Football
  • Basketball
  • Solo Sports
  • Marathon Running
Consistent levels:
– Sports
  • Team Sports
    – Baseball
    – Football
    – Basketball
  • Solo Sports
    – Marathon Running
SLIDE 27
Ordering Foci (“Array”)
• Simple to complex
  – (Locomotions: walk, run, jump, skip, hurdle, cartwheel)
• Common/popular to uncommon/unpopular
  – (Vegetarian Pizza Toppings: mushroom, onion, olive, artichoke, pineapple, pine nuts)
• Spatial, geographical, or geometric
  – (Southwestern States: California, Nevada, Arizona, New Mexico)
• Chronological, historical, or evolutionary
  – (Dinosaur Eras: Triassic, Jurassic, Cretaceous)
• Canonical (pre-established order)
  – (Playground Counting: Eenie, Meenie, Mynee, Mo)
• Alphabetical
  – (Boys’ Names: Al, Bob, Chuck, David, Ed, Frank, George, Harry)
• Size
  – (T-Shirts: Small, Medium, Large, XL, XXL)
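Most of these orderings cannot be derived from the terms themselves, so a system has to store the array order explicitly. A minimal sketch using the T-shirt example; the helper name `order_foci` is an assumption.

```python
# The array order is stored explicitly as a list; position defines rank.
SIZE_ORDER = ["Small", "Medium", "Large", "XL", "XXL"]


def order_foci(foci, canonical):
    """Sort foci by their position in a stored canonical order."""
    rank = {focus: position for position, focus in enumerate(canonical)}
    return sorted(foci, key=rank.__getitem__)  # KeyError for a focus not in the array


in_stock = ["XL", "Small", "Large"]
print(order_foci(in_stock, SIZE_ORDER))  # ordered by size
print(sorted(in_stock))                  # alphabetical order, for comparison
```

Alphabetical sorting would put Large before Small, which is why only the Alphabetical case above comes for free.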
SLIDE 29
Why Develop a Thesaurus?
• To provide a conceptual structure or “space” for a body of information
  – To make it possible to adequately describe the topical content of information resources at an appropriate level of generality or specificity
  – To provide enhanced search capabilities and to improve the effectiveness of searching (i.e., to retrieve most of the relevant material without too much irrelevant material)
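"Most of the relevant material" and "without too much irrelevant material" correspond to the standard retrieval measures of recall and precision. A small sketch with made-up document IDs; the function name and the convention of returning 0.0 for an empty set are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved items that are relevant.
    Recall: share of relevant items that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: four documents returned, three documents actually relevant.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
print(precision, recall)
```

In this made-up case two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall 2/3).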
SLIDE 30
Why Develop a Thesaurus?
• To provide vocabulary (or terminological) control
  – When there are several possible terms designating a single concept, the thesaurus should lead the indexer or searcher to the appropriate concept, regardless of the terms they start with
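In its simplest form this is a lookup from every entry term to one preferred term. The sketch below assumes a tiny hand-made table of USE references; the terms, the table names, and the function `lookup` are hypothetical.

```python
# Hypothetical USE references: non-preferred entry term -> preferred term.
USE = {"automobiles": "cars", "autos": "cars", "motorcars": "cars"}
# Hypothetical preferred terms (descriptors).
PREFERRED = {"cars", "trucks"}


def lookup(term):
    """Lead any known entry term to its preferred term; return None if unknown."""
    key = term.strip().lower()
    if key in PREFERRED:
        return key
    return USE.get(key)


print(lookup("Automobiles"))
print(lookup("cars"))
print(lookup("bicycles"))
```

Indexers and searchers both go through the same lookup, so documents indexed under one synonym are found by people who start from another.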
SLIDE 31
Preliminary Considerations
• What is used now?
  – Continue using an existing thesaurus?
  – Ad hoc modification of existing thesaurus?
  – Develop a new well-structured thesaurus?
• What is the scope and complexity of the subject field?
• What kind of retrieval objects or data will be dealt with?
• How exhaustive and specific is the desired description of objects?
SLIDE 32
Preliminary Considerations
• The scope and complexity of the field will provide some indication of the scope and complexity of the thesaurus
  – It is better to plan for a larger and more comprehensive system than a smaller system that rapidly will become inadequate as the database grows
• Development of a good thesaurus requires a major intellectual effort as well as clerical operations like data entry and production of sorted lists
SLIDE 33
Development of a Thesaurus
• Term selection
• Merging and development of concept classes
• Definition of broad subject fields and subfields
• Development of classificatory structure
• Review, testing, application, revision
SLIDE 34
Flow of Work in Thesaurus Construction
[Flowchart; boxes listed in reading order, arrows not preserved]
Select Sources
Assign codes
Select Terms
Record Selected Terms
Sort Terms
Merge identical Terms
Define Broad Subject Fields
Merge Terms in Same Concept class
Sort Terms into Broad Subject Fields
Define Subfields within one Subject Field
Work out detailed structure of the Subject Field
Select Preferred Terms
All Subfields of Broad Subject finished? (Yes/No)
All Broad Subjects finished? (Yes/No)
Improve Class Structure
Print Classified Index and review
Discuss with Experts and Users
Select descriptors and checklist items
Produce Full Thesaurus and Check references
Assign Notation
Review and Test
Many Modifications? (Yes/No)
Revise as needed
Based on Soergel, pp 327-333
SLIDE 35
1. Term Selection
• Select sources for the collection of terms
  – Prearranged Sources
  – Open-ended Sources
• Assign codes to each source
• Selection of terms
  – For part of pre-arranged and for all open-ended sources
• Enter terms into database with all information
SLIDE 36
1.1 Kinds of Sources
• Prearranged Sources
  – Existing descriptor lists, classification schemes, thesauri
    • This includes universal schemes like DDC or LCSH
  – Nomenclatures of single disciplines
  – Treatises on the terminology of a field
  – Encyclopedias, lexica, dictionaries and glossaries
  – Tables of contents of textbooks and handbooks
  – Indexes of journals or abstracting journals
  – Indexes of other publications in the field
SLIDE 37
1.1 Kinds of Sources
• Open-ended sources
  – Lists of search requests or interest profiles
  – Description of projects/activities to be served by the information retrieval system
  – Discussion with specialists in the field
  – Sample of documents in the field
    • Ask users why and how these documents relate to the field
    • Have documents indexed by experts in the field
  – Lists of titles of documents in the field
  – Abstracts and reviews of documents
  – Your own knowledge
SLIDE 38
Selection of Sources
• Prearranged sources require less effort in gathering the material, and may already indicate some relationships between terms and concepts and among the terms themselves
• Open-ended sources can reflect current terminology and may provide more complete coverage
• Choose a set of sources that are current, as complete as possible, and considered authoritative
SLIDE 39
Selection of Sources
• Each selected source is assigned an ID for tracking its use in the development of the thesaurus
  – Useful when making decisions about which terms to prefer
  – Useful for backtracking when questions arise (where did this come from?)
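A minimal sketch of source tracking, assuming each selected term is recorded together with the ID of the source it came from. The source IDs, source descriptions, and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical source register: source ID -> description of the source.
SOURCES = {
    "S1": "existing descriptor list",
    "S2": "sample of search requests",
}


def record_terms(selections):
    """Build a term -> set of source IDs table from (term, source ID) pairs."""
    provenance = defaultdict(set)
    for term, source_id in selections:
        if source_id not in SOURCES:
            raise KeyError(f"unknown source ID: {source_id}")
        provenance[term.lower()].add(source_id)
    return provenance


provenance = record_terms([("Cars", "S1"), ("cars", "S2"), ("automobiles", "S2")])

# "Where did this come from?" is answered by one lookup.
print(sorted(provenance["cars"]))
print(sorted(provenance["automobiles"]))
```

A term attested in several sources, or in a source considered authoritative, becomes a stronger candidate when preferred terms are chosen later.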
SLIDE 40
Selection of Terms
• Terms can be transferred directly from prearranged sources to the recording medium (cards or database)
  – You have to decide which terms and references to include, or whether to take the whole source
SLIDE 41
Selection of Terms
• In open-ended sources you read through the source and pick out terms (i.e. words and phrases) that might be useful in retrieval or as references to other terms
• Alternatively, use keyword and phrase extraction software to create lists of terms and select from those
• Transfer selected terms to the recording medium (cards or database)
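The lecture does not name a particular extraction tool. As a rough stand-in, the sketch below counts single words and two-word phrases that contain no stopword, producing a candidate list for a person to select from. The stopword list, sample text, and function name are assumptions, and real extraction software is considerably more sophisticated.

```python
import re
from collections import Counter

# A deliberately tiny, assumed stopword list.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "for", "is", "on", "with"}


def candidate_terms(text, max_len=2):
    """Count words and adjacent-word phrases (up to max_len words) with no stopword."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(word in STOPWORDS for word in gram):
                counts[" ".join(gram)] += 1
    return counts


SAMPLE = ("Faceted classification of photos. "
          "Faceted classification supports browsing of photos.")
counts = candidate_terms(SAMPLE)
for term, frequency in counts.most_common(5):
    print(frequency, term)
```

This naive version ignores sentence boundaries, so it can produce phrases that span two sentences; the output is a list to choose from, not a finished term list.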
SLIDE 42
2. Merging and Development of Concept Classes
• Sort Term DB into alphabetical order
• First Round
  – Merge information for identical terms, possibly pulling info from additional sources
• Second Round
  – Merge synonyms or terms in the same concept class
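The two rounds can be sketched as two small passes over the term database. The first pass is mechanical; the second depends on an editorial decision about which terms share a concept class, represented here by a hand-made mapping. All records, IDs, and names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records: (term as found, source ID).
RECORDS = [("Cars", "S1"), ("automobiles", "S2"), ("cars", "S2"), ("Trucks", "S1")]

# Hypothetical editorial decision: term -> head term of its concept class.
CONCEPT_CLASS = {"automobiles": "cars"}


def merge_identical(records):
    """First round: sort alphabetically and merge records for identical terms."""
    merged = defaultdict(set)
    for term, source in sorted(records, key=lambda record: record[0].lower()):
        merged[term.lower()].add(source)
    return dict(merged)


def merge_concepts(merged, concept_class):
    """Second round: pool terms and sources that belong to the same concept class."""
    classes = defaultdict(lambda: {"terms": set(), "sources": set()})
    for term, sources in merged.items():
        head = concept_class.get(term, term)
        classes[head]["terms"].add(term)
        classes[head]["sources"] |= sources
    return dict(classes)


round_one = merge_identical(RECORDS)
round_two = merge_concepts(round_one, CONCEPT_CLASS)
print(round_one)
print(round_two)
```

The head term used as the key here is only a grouping label; which term in each class becomes the preferred term is decided in a later step.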
SLIDE 43
3. Definition of Broad Subject Fields and Subfields
• Define broad subject fields and sort terms into these broad fields
• Define subfields within each broad field and sort terms into these subfields
• Work out the detailed structure
  – Select preferred terms
  – Merge information for terms in the same concept class
• Repeat these steps
  – For each subfield within a broad field
  – And for each broad field
  – Until all terms have been consolidated and preferred terms selected
SLIDE 44
4. Development of Classificatory Structure
• Produce preliminary version of classified index and update the working database
• Improve classificatory structure
• Reality check
  – Produce and distribute a version of the classified index
  – Distribute to users/experts
SLIDE 45
5. Final Stages
• Review
• Testing
• Application
• Revision
SLIDE 46
Review
• Discuss classified index with users/experts
  – Select descriptors and checklist descriptors
• Assign notational symbols
• Produce main thesaurus and indexes
SLIDE 47
Review (cont.)
• Check cross references and insert where needed
• Produce test version
• Test by indexing
• Modify as needed
• Produce production version
SLIDE 48
Testing a Thesaurus
• Assign descriptors to a sample set of NEW documents (use enough to get an idea of any gaps in the thesaurus)
• Test retrieval with sample questions and see how effectively the thesaurus maps each question to the appropriate descriptors
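The indexing test can be sketched as follows, assuming the candidate terms for each new document have already been picked out by a person or a tool. Terms with no descriptor are reported as gaps. The vocabulary, documents, and names are hypothetical.

```python
# Hypothetical entry vocabulary: every known term (preferred or not) -> descriptor.
ENTRY_VOCABULARY = {"cars": "cars", "automobiles": "cars", "trucks": "trucks"}

# Hypothetical candidate terms from NEW documents not used to build the thesaurus.
NEW_DOCUMENTS = {
    "doc1": ["Automobiles", "trucks"],
    "doc2": ["cars", "motorcycles"],
}


def index_documents(documents, vocabulary):
    """Assign descriptors to each document and collect terms with no descriptor."""
    assigned, gaps = {}, {}
    for doc_id, terms in documents.items():
        keys = [term.lower() for term in terms]
        assigned[doc_id] = sorted({vocabulary[key] for key in keys if key in vocabulary})
        gaps[doc_id] = sorted(key for key in keys if key not in vocabulary)
    return assigned, gaps


assigned, gaps = index_documents(NEW_DOCUMENTS, ENTRY_VOCABULARY)
total_terms = sum(len(terms) for terms in NEW_DOCUMENTS.values())
coverage = 1 - sum(len(missing) for missing in gaps.values()) / total_terms

print(assigned)
print(gaps)
print(coverage)
```

The same vocabulary can be run against sample questions instead of documents to test the retrieval side; either way, the gap list feeds the revision step.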
SLIDE 49
Art and Architecture Thesaurus
• http://orange.sims.berkeley.edu/cgi-bin/flamenco/aa/Flamenco
SLIDE 51
Phone Project Assignments
• Photo Metadata Design (Assignment 6)
  – With your application and the overall project goals in mind, you will design a metadata framework for annotating photos so that all photos are accessible not only for the needs of your particular application, but also so that your photos and metadata can be reused by other applications.
SLIDE 53
Discussion Questions
• Paul Poling on “Broughton”
  – What are the major inadequacies of 19th century classification systems which faceted classification overcomes?
  – Some answers:
    • They don't "display very much in the way of internal logic, or fundamental structural principles"
    • They are ineffectual at addressing the specific problems of vocabulary; they do not consider the precise relations between concepts
    • Multilingual switching is difficult, particularly in group/set names
    • They "fail to make adequate distinction between permanent hierarchical relationships, and relationships of syntactic association in complexes. As a result, structures are not logical (since the analysis is not rigorous), positioning of compound subjects is not predictable (since no operating rules for combination are normally present), and retrieval is unreliable"
SLIDE 54
Discussion Questions
• Paul Poling on “Broughton”
  – The author makes the somewhat startling claim that, "the fundamental thirteen categories have been found to be sufficient for the analysis of vocabulary in almost all areas of knowledge." Are there any exceptions to this that come to mind?
SLIDE 55
Discussion Questions
• Paul Poling on “Broughton”
  – Broughton later notes that some aspects of digital materials cannot be represented by the 13 categories used for the BC2 system. For use with our cameraphones, what are some categories that would need to be included? More importantly, what is the minimum set of additional categories needed?
SLIDE 56
Discussion Questions
• Paul Poling on “Broughton”
  – Broughton states that, "There is no obvious way in which the core vocabulary can be dealt with by machines...the initial allocation of vocabulary to categories must be carried out intellectually." The author goes on to suggest that all but the initial category assignments can be done by a computer. How feasible is the BC2 system for the web, considering this requirement, when one considers the fairly rapidly expanding categories in so many fields of human knowledge?
SLIDE 57
Discussion Questions
• Steve Chan on “Broughton”
  – The category system used in Bliss/BC2 is based on a general-to-specific ordering and on 13 functional categories. How do you think that Lakoff's ideas of basic-level categories, and the importance of metaphor/embodiment, relate to the categories chosen in Bliss/BC2?
SLIDE 58
Discussion Questions
• Steve Chan on “Broughton”
  – Many of the relationships in the categories fall into types such as "is a kind of" or "is a part of". These are very similar to the predicates in WordNet. As a thought experiment, what would it take to interface WordNet into something like BC2, so that documents could be parsed for content and then automatically categorized? Would you want to let such a system generate the categories?
SLIDE 59
Discussion Questions
• Scott Fisher on “Faceted Classification”
  – What are some different ways of ordering the facets within a classification notation? When might one ordering be more appropriate than another? Why might the result be especially important for non-electronic documents?
SLIDE 60
Discussion Questions
• Scott Fisher on “Faceted Classification”
  – Why is it important that characteristics of division be mutually exclusive? Explain what might happen if they are not.
SLIDE 61
Discussion Questions
• Morgan Ames on Vickery
  – Though facets are a powerful tool for organizing information, they can be very time-consuming to define. Vickery describes the creation of facets, starting with the analysis of terms used by a user group, then the sorting of the terms into facets, the development of facets (depending on how often they're used), the arrangement of the facets, and finally, the establishment of a notation for the facets. Could one automate some or all of the process of defining facets for a particular area - say, an online community? If so, which parts could be automated, and how? If not, why not - what are the limitations of automation?
SLIDE 62
Discussion Questions
• Morgan Ames on Vickery
  – How do the properties of facets compare with the properties of relational databases?
SLIDE 63
Discussion Questions
• Lilia Manguy on “Thesaurus Construction”
  – The reading mentions thesauri being constructed for institutions. What are some examples of institutions with specialized thesauri? Why were they deemed necessary?
![Page 64: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/64.jpg)
Discussion Questions
• Lilia Manguy on “Thesaurus Construction”
– In our field, what are some scenarios in which a thesaurus would need to be constructed? How would you determine who would be your ‘expert’ consultants? Who would you choose?
![Page 65: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/65.jpg)
Discussion Questions
• Lilia Manguy on “Thesaurus Construction”
– Using the process outlined in the reading for constructing a thesaurus, how would you qualify whether your thesaurus is good or bad?
![Page 66: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/66.jpg)
Discussion Questions
• Christine Jones on “Card Sorting”
– Considering the "vocabulary problem" laid forth in "The Vocabulary Problem in Human System Communication," by Furnas et al., do you think the card sorting technique is an effective approach for categorizing information for the SunWeb Intranet, i.e., do you think menus and the search function contain vocabulary users will understand? Would you recommend any other tools for the user to increase their understanding of the SunWeb information space?
![Page 67: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/67.jpg)
Discussion Questions
• Christine Jones on “Card Sorting”
– Usability studies including card sorting, icon intuitiveness testing, card distribution to icons, and thinking-aloud walkthroughs were performed, and the results were based in part on subjective interpretation. For example, the data were eyeballed instead of analyzed with formal statistics, and when deciding whether to keep icons, the user interface designers made the final decisions. Do you think this level of subjective interpretation was justified for a project of this nature? What (if any) changes would you make to this approach if the project were a redesign or design of Sun's external Website?
![Page 68: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/68.jpg)
Discussion Questions
• Carrie Burgener on “Flamenco”
– How do the search and browse functions used by Flamenco compare to Bates’ Berry Picking Model?
![Page 69: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/69.jpg)
Discussion Questions
• Carrie Burgener on “Flamenco”
– The examples in the article were collections of images that had existing metadata associated. It has been presented in IS203 that people take pictures and generally do not organize them. How can the UI design of Flamenco be applied to photo annotation?
![Page 70: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/70.jpg)
Discussion Questions
• Carrie Burgener References for “Flamenco”
– PhotoCompas: tool using Flamenco interface
• http://shark.stanford.edu:4230/cgi-bin/flamenco/mor_full/Flamenco?username=default
– Presentation by Professor Hearst
• http://bailando.sims.berkeley.edu/talks/dli02.ppt
– Different article
• http://www.sims.berkeley.edu/~hearst/papers/cacm02.pdf
![Page 71: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/71.jpg)
Agenda
• Facetted Classification
– Traditional vs. Facetted Classification
– Designing Facetted Classifications
– Thesaurus Design
– Assignment 6
– Discussion Questions
• Action Items for Next Time
![Page 72: Lecture 21: Facetted Classification](https://reader036.vdocuments.site/reader036/viewer/2022070401/56813627550346895d9d9fde/html5/thumbnails/72.jpg)
Homework (!)
• Assignment 6
– Due Thursday, November 18
• Read
– Textbook: Organization of Information, Chapters 3-5 (Taylor)
• Chitra 3
• Shufei 4
• Jaime 5