
Page 1: Search Text Mining Web Site Usability

UCB HCC Retreat

Search

Text Mining

Web Site Usability

Marti Hearst

SIMS

Page 2: Search Text Mining Web Site Usability

BAILANDO Projects

Better Access to Information using Language Analysis and Novel Dynamic Organizations

Page 3: Search Text Mining Web Site Usability

Current BAILANDO Projects

CHA-CHA: Web Search results in Context

LINDI: UI support for Search Text Data Mining

TANGO: Automated Web Site Usability

Page 4: Search Text Mining Web Site Usability

Search UIs

Combine Browsing & Search

Place Search Results in Context

Large Category Hierarchies

Page 5: Search Text Mining Web Site Usability

Cha-Cha

Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen

Page 6: Search Text Mining Web Site Usability

Medical Category Hierarchy

Medicine
  Disease
    Migraine
    MS
  Anatomy
    Carotid Artery
    Spinal Cord
  Drugs
    Tamoxifen
    Steroids

Page 7: Search Text Mining Web Site Usability

DynaCat (Pratt, Hearst, & Fagan 99)

Page 8: Search Text Mining Web Site Usability

DynaCat Study Design

Three queries

24 cancer patients

Compared three interfaces: ranked list, clusters, categories

Results
  Participants strongly preferred categories
  Participants found more answers using categories
  Participants took same amount of time with all three interfaces

Similar results have been verified by another study by Chen and Dumais (CHI 2000)

Page 9: Search Text Mining Web Site Usability

Cat-a-Cone Interface (Hearst & Karadi 97)

Page 10: Search Text Mining Web Site Usability

Improving Search via Large Category Hierarchies

How to show intersections across category types?

How to preview related categories in a user-tailored, dynamic manner?

Page 11: Search Text Mining Web Site Usability

Information retrieval

Text Data Mining

Page 12: Search Text Mining Web Site Usability

Information retrieval

Selection or rejection of existing documents based on a function of word match.
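
A minimal sketch of what "a function of word match" could look like. The scoring function, the threshold, and the names word_match_score and retrieve are assumptions made for illustration; the slide does not specify any of them.

```python
def word_match_score(query, document):
    """Fraction of distinct query words that also occur in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def retrieve(query, documents, threshold=0.5):
    """Select documents whose score reaches the threshold; reject the rest."""
    return [d for d in documents if word_match_score(query, d) >= threshold]
```

Note that a function like this can only select or reject documents that already exist; it does not combine information across documents.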

Page 13: Search Text Mining Web Site Usability

Text Data Mining

Relationships between information in documents can create new facts, not previously known.

Page 14: Search Text Mining Web Site Usability

Imagine

You are a medical researcher

Your patient has
  spinal inflammation
  numbness in fingers
  low TC levels
  negative results for all tests

How can you help her?

Page 15: Search Text Mining Web Site Usability

Idea

A new way of searching text.

Link pieces of information together to formulate hypotheses …

Page 16: Search Text Mining Web Site Usability

LINDI: Linking Information for New DIscoveries

Students: Barbara Rosario, David Blei

Three main parts

Search UI for building and reusing hypothesis-seeking strategies.

Statistical language analysis techniques for interpreting the text.

Backend for interfacing with various databases and translating different formats.

Page 17: Search Text Mining Web Site Usability

Gathering Evidence

Spinal Inflammation

Numbness in fingers

Low TC Levels

Page 18: Search Text Mining Web Site Usability

Gathering Evidence

Spinal Inflammation

Numbness in fingers

Low TC Levels

Find diseases associated with each

Page 19: Search Text Mining Web Site Usability

Supporting Cascaded Search Operations

Spinal Inflammation

Numbness in fingers

Low TC Levels
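
One way to picture a cascaded search is as two chained operations: look up the diseases associated with each finding, then intersect the result sets. The sketch below assumes a hypothetical lookup function, and the index entries (disease_A and so on) are made-up placeholders, not medical facts.

```python
def cascade(findings, lookup):
    """Return the diseases associated with every finding.

    lookup is a hypothetical function mapping one finding to an
    iterable of associated disease names.
    """
    per_finding = [set(lookup(f)) for f in findings]
    if not per_finding:
        return set()
    return set.intersection(*per_finding)


# Placeholder index for illustration only.
PLACEHOLDER_INDEX = {
    "spinal inflammation": ["disease_A", "disease_B"],
    "numbness in fingers": ["disease_B", "disease_C"],
    "low TC levels": ["disease_B"],
}

candidates = cascade(PLACEHOLDER_INDEX, lambda f: PLACEHOLDER_INDEX.get(f, []))
```

The output of one step feeds the next, so the same strategy could be stored and rerun with different findings.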

Page 20: Search Text Mining Web Site Usability

Page 21: Search Text Mining Web Site Usability

New Language Analysis

First use category labels to retrieve candidate documents

Then use language analysis to detect causal relationships between concepts
  Title: Magnesium deficiency implicated in increased stress levels.
  Interpretation: <nutrient><reduction> related-to <increase><symptom>

Use these to find relationships and formulate hypotheses
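
The title-to-interpretation mapping above can be illustrated with a toy lookup. This is a deliberately simplified stand-in, not the project's language analysis: the CATEGORY and RELATION tables and the interpret function are hypothetical and hand-written to cover only the example title.

```python
import string

# Hypothetical hand-built lexicon covering only the example title.
CATEGORY = {
    "magnesium": "<nutrient>",
    "deficiency": "<reduction>",
    "increased": "<increase>",
    "stress": "<symptom>",
}
RELATION = {"implicated": "related-to"}


def interpret(title):
    """Map known words to category tags and relation labels; skip the rest."""
    words = [w.strip(string.punctuation).lower() for w in title.split()]
    parts = []
    for w in words:
        if w in RELATION:
            parts.append(" " + RELATION[w] + " ")
        elif w in CATEGORY:
            parts.append(CATEGORY[w])
    return "".join(parts).strip()


pattern = interpret("Magnesium deficiency implicated in increased stress levels.")
```

Patterns extracted from many titles could then be matched against each other to suggest links between concepts.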

Page 22: Search Text Mining Web Site Usability

Statistical Semantic Parsing

Modern statistical techniques
  Mainly applied to syntactic structure

Probabilistic knowledge representation
  Represent hypotheses with different degrees of certainty.

Page 23: Search Text Mining Web Site Usability

Automating Assessment of Web Site Usability

Page 24: Search Text Mining Web Site Usability

Why Worry?

Problem: IBM's extranet
  Heavy use of help and search
  Unhappy users

Solution
  Massive web site redesign
  Focus on info-organization, not the purchasing process.
  Cost: "in the millions"

Results
  Not announced or trumped up
  Use of "help" decreased 84%
  Sales increased 400%

Page 25: Search Text Mining Web Site Usability

Web TANGO: Tool for Assessing NaviGation & Organization

Student: Melody Ivory

Goal: automated support for comparing design alternatives

How: Assess usability of the information architecture

Approximate people’s information-seeking behavior (Monte Carlo simulation)

Output quantitative usability metrics

Page 26: Search Text Mining Web Site Usability

Anatomy of Web Site Design

Courtesy of Mark Newman

Information Architecture

Navigation Design

Information Design

Graphic Design

Page 27: Search Text Mining Web Site Usability

Usability Evaluation: Standard Techniques

User studies
  Have people use the interface to complete some tasks
  Requires an implemented interface
  "Discount" vs. Scientific Results

Heuristic Evaluation
  An expert assesses a design or implementation according to certain guidelines

Page 28: Search Text Mining Web Site Usability

Automated Usability Evaluation

Logging/capture
  Pro: Easy
  Con: Requires implemented system
  Con: Don't know the user task (web)
  Con: Don't present alternatives
  Con: Don't distinguish error from success

Analytical Modeling
  Pro: doable at design phase
  Con: models an expert
  Con: academic exercise

Simulation

Page 29: Search Text Mining Web Site Usability

Existing Metrics

Web metric analysis tools report on what is easy to measure, e.g.:
  Predicted download time
  Depth/breadth of site

We want to worry about
  Content
  User goals/tasks

Not available from logs

We also want to compare alternative designs.

Page 30: Search Text Mining Web Site Usability

Monte Carlo Simulation

Have a model of information structure

Have a set of user goals

Want to assess navigation structure
  Compare alternatives/tradeoffs
  Identify bottlenecks
  Identify critically important pages/links
  Check all pairs of start/end points
  Check overall reachability before and after a change.
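
The last two checks (all start/end pairs, reachability before and after a change) can be sketched with a plain breadth-first search over the link graph. This is a deterministic graph check rather than a Monte Carlo simulation, and the site representation (a dict from page name to linked page names) and function names are assumptions for illustration.

```python
from collections import deque


def reachable_from(site, start):
    """Return every page reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in site.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unreachable_pairs(site):
    """Return all (start, end) pairs where end cannot be reached from start."""
    pages = set(site) | {p for links in site.values() for p in links}
    return {(a, b) for a in pages for b in pages - reachable_from(site, a)}


# Hypothetical three-page site, before and after removing one link.
before = {"home": ["products", "support"], "products": ["home"], "support": ["home"]}
after = {"home": ["products", "support"], "products": ["home"], "support": []}
broken_by_change = unreachable_pairs(after) - unreachable_pairs(before)
```

Comparing the two results shows which start/end pairs a proposed change would cut off.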

Page 31: Search Text Mining Web Site Usability

Monte Carlo Simulation

At each step in the simulation
  Assume a probability distribution over a set of next choices.
  The next choice is a function of:
    The current goal
    The understandability of the choice
    The overall complexity of the set of choices
    Prior interaction history
  These can use models of "scent"

Varying the distribution corresponds to varying properties of the links

Spot-check important choices
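
A minimal sketch of such a simulation, under strong simplifying assumptions: the probability of following a link depends only on a single hypothetical "scent" weight per link, standing in for the goal, understandability, complexity, and history factors listed above. The data layout and the names simulate_run and estimate are illustrative, not the Web TANGO implementation.

```python
import random


def simulate_run(site, scent, start, target, rng, max_steps=50):
    """Simulate one visit; return clicks needed to reach target, or None."""
    page = start
    for step in range(max_steps + 1):
        if page == target:
            return step
        links = site.get(page, [])
        if not links:
            return None  # dead end
        # Assumed model: choice probability is proportional to link scent.
        weights = [scent.get((page, link), 1.0) for link in links]
        page = rng.choices(links, weights=weights, k=1)[0]
    return None


def estimate(site, scent, start, target, runs=1000, seed=0):
    """Return (success rate, mean clicks over successful runs)."""
    rng = random.Random(seed)
    results = [simulate_run(site, scent, start, target, rng) for _ in range(runs)]
    reached = [r for r in results if r is not None]
    mean_clicks = sum(reached) / len(reached) if reached else None
    return len(reached) / runs, mean_clicks
```

Changing a scent weight and rerunning estimate corresponds to varying a property of that link, for example rewording its label.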

Page 32: Search Text Mining Web Site Usability

One Monte Carlo simulation step for Design 1, Task 1. Simulation starts from the home page and the target information is at Renter Support.

Page 33: Search Text Mining Web Site Usability

Monte Carlo simulation results for Design 1, Task 1. Simulation runs start from all pages in the site. Average Navigation times are shown for Tasks 2 & 3.

Page 34: Search Text Mining Web Site Usability

Using Simulator Results

Design Decisions
  Use Design 1
  Improve Tasks 1 & 2

Next Steps
  Analyze results for Tasks 1 & 2
  Create new Design 1
  Repeat simulation to compare old & new designs
  Iterate if necessary

        Design 1           Design 2
Task    Time     Errors    Time     Errors
1       41 sec   2         38 sec   4
2       38 sec   4         43 sec   5
3       32 sec   2         74 sec   6
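
The comparison behind these decisions can be reproduced from the simulator figures on this slide. The sketch below only sums the reported times and errors per design; the data layout and the totals function are assumptions for illustration.

```python
# (time in seconds, errors) per task, copied from the slide's table.
RESULTS = {
    "Design 1": {1: (41, 2), 2: (38, 4), 3: (32, 2)},
    "Design 2": {1: (38, 4), 2: (43, 5), 3: (74, 6)},
}


def totals(design):
    """Return (total seconds, total errors) across all tasks for a design."""
    rows = RESULTS[design].values()
    return sum(t for t, _ in rows), sum(e for _, e in rows)


design1_total = totals("Design 1")  # (111, 8)
design2_total = totals("Design 2")  # (155, 15)
```

Design 1 comes out ahead on both totals, while its weakest rows (slower than Design 2 on Task 1, most errors on Task 2) are the ones singled out for improvement.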

Page 35: Search Text Mining Web Site Usability

Research Issues: Navigation Predictions

Develop IR model for predicting link selection

Requirements
  Information need (task metadata)
  Representation of pages (page metadata)
  Method for selecting links (relevance ranking)
  Maintaining user’s conceptual model during site traversal (scent [Fur97,LC98,Pir97])

One possible approach
  Information Foraging Theory [PC95,Pir97,PPR96]
  Functional categorization of pages based on features
  Prediction of relevance to current page
    Consider link connectivity, text similarity & usage
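
As an illustration of relevance ranking for link selection, the sketch below scores each link's text against the information need using bag-of-words cosine similarity. It covers only the text-similarity signal; link connectivity and usage are left out, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two texts treated as bags of words."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank_links(information_need, link_texts):
    """Order link texts from most to least similar to the information need."""
    return sorted(
        link_texts, key=lambda text: cosine(information_need, text), reverse=True
    )
```

A simulated user could then follow the top-ranked link, or sample among links in proportion to their scores.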

Page 36: Search Text Mining Web Site Usability

Other HCC-Related Projects

Using a large digital desk in design
  Ame Elliot

Using visualization for light design
  Dan Glaser

User interfaces and computer security
  Prof. Doug Tygar, Rachna Dhamija