Analyzing Web Logs
Sarah Waterson, Group for User Interface Research
SIMS 213, 18 April 2002

Page 1: Analyzing Web Logs

Analyzing Web Logs
Sarah Waterson
18 April 2002, SIMS 213
Group for User Interface Research

Page 2: Analyzing Web Logs

SIMS 213, 18 April 2002

Talk Outline

What is a web log?
Where do they come from?
Why are they relevant?
How can we analyze them?
Study
Discussion

Page 3: Analyzing Web Logs

What is a web log?

A record of a visit to a web page:
  Visitor (IP address)
  URL
  Time of visit
  Time spent on a page
  Browser used
  Referring URL
  Type of request
  Reply code
  Number of bytes in the reply
  etc.

Page 4: Analyzing Web Logs

What is a clickstream?

A record of a path through web pages:
  Visitor (IP address)
  URL
  Time of visit
  Time spent on a page
  Browser used
  Referring URL
  Type of request
  Reply code
  Number of bytes in the reply
  Next URL
  etc.
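To make the difference between a log and a clickstream concrete, here is a minimal sketch of turning per-request records into one path per visitor. It assumes each record is a (visitor, seconds, url) tuple and that the IP address identifies the visitor. The function name `build_clickstreams` and the sample data are hypothetical, and time on a page is only estimated from the gap to the visitor's next request.

```python
from collections import defaultdict

def build_clickstreams(records):
    """Group (visitor, seconds, url) records into one time-ordered path per visitor.

    Each step is (url, next_url, seconds_on_page). The last step of a path has
    next_url None and seconds_on_page None, because nothing follows it.
    """
    by_visitor = defaultdict(list)
    for visitor, seconds, url in records:
        by_visitor[visitor].append((seconds, url))

    paths = {}
    for visitor, visits in by_visitor.items():
        visits.sort()  # order each visitor's requests by time
        steps = []
        for i, (seconds, url) in enumerate(visits):
            if i + 1 < len(visits):
                next_seconds, next_url = visits[i + 1]
                steps.append((url, next_url, next_seconds - seconds))
            else:
                steps.append((url, None, None))
        paths[visitor] = steps
    return paths

# Hypothetical sample data: two visitors, interleaved requests.
records = [
    ("10.0.0.1", 0, "/index.html"),
    ("10.0.0.2", 5, "/index.html"),
    ("10.0.0.1", 30, "/products.html"),
    ("10.0.0.1", 75, "/contact.html"),
]
paths = build_clickstreams(records)
for step in paths["10.0.0.1"]:
    print(step)
```

The time on the final page of a path cannot be recovered this way, which is one reason client-side or proxy logging may capture more than a server log alone.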

Page 5: Analyzing Web Logs

What is a Web Log?

Apache web log:

205.188.209.10 - - [29/Mar/2002:03:58:06 -0800] "GET /~sophal/whole5.gif HTTP/1.0" 200 9609 "http://www.csua.berkeley.edu/~sophal/whole.html" "Mozilla/4.0 (compatible; MSIE 5.0; AOL 6.0; Windows 98; DigExt)"

216.35.116.26 - - [29/Mar/2002:03:59:40 -0800] "GET /~alexlam/resume.html HTTP/1.0" 200 2674 "-" "Mozilla/5.0 (Slurp/cat; [email protected]; http://www.inktomi.com/slurp.html)"

202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/indextop.html HTTP/1.1" 200 3510 "http://www.csua.berkeley.edu/~tahir/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/animate.js HTTP/1.1" 200 14261 "http://www.csua.berkeley.edu/~tahir/indextop.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
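The entries above follow the layout usually called the Apache "combined" log format: host, identity, user, timestamp, request line, status code, bytes sent, referrer, and user agent. As a rough sketch of how such a line might be split into fields, the following uses a regular expression. The pattern and the `parse_line` helper are illustrative assumptions, not part of the talk, and real logs may need a more forgiving parser.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [time] "request" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamps look like 29/Mar/2002:03:00:14 -0800
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request line is "METHOD URL PROTOCOL"; keep method and URL if present.
    parts = entry["request"].split()
    entry["method"], entry["url"] = (parts[0], parts[1]) if len(parts) >= 2 else (None, None)
    return entry

sample = ('202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] '
          '"GET /~tahir/indextop.html HTTP/1.1" 200 3510 '
          '"http://www.csua.berkeley.edu/~tahir/" '
          '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"')
entry = parse_line(sample)
print(entry["host"], entry["url"], entry["status"], entry["size"])
```

A referrer of "-" (as in the second entry above) means the request arrived without one, which is typical of crawlers and of visitors who typed the address directly.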

Page 6: Analyzing Web Logs

Where do they come from?

Servers
  Done on most web servers
  Standard formats
Clients
  Browsers, loggers on client machine
  Must send data back

Proxy Log

Proxies
  Similar to servers
  Hang out in between client and server

Page 7: Analyzing Web Logs

Why are web logs relevant?

Lots of data
  Quantitative analysis is much more fun!
User behavior, patterns
  Real users, tasks
  Or at least more realistic users and tasks
Leaving the usability lab
  Testing effect
Fast, easy, cheap
  Automatic or almost-automatic

Page 8: Analyzing Web Logs

Ed Chi asks…

Usage:
  How has information been accessed?
  How frequently?
  What’s popular? What’s not?
  How do people enter the site? Exit?
  Where do people spend time?
  How long do they spend there?
  How do people travel within the site?
  Who are the people visiting?
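Several of these usage questions (what is popular, where people enter, where they exit) reduce to simple counting once requests are grouped per visitor. The sketch below assumes that grouping has already been done; the `usage_summary` function and the sample sessions are hypothetical.

```python
from collections import Counter

def usage_summary(sessions):
    """Count page popularity, entry pages, and exit pages.

    sessions maps a visitor id to the ordered list of URLs that visitor requested.
    """
    popularity = Counter()
    entries = Counter()
    exits = Counter()
    for urls in sessions.values():
        if not urls:
            continue
        popularity.update(urls)   # every request counts toward popularity
        entries[urls[0]] += 1     # first page of the visit
        exits[urls[-1]] += 1      # last page of the visit
    return popularity, entries, exits

# Hypothetical sample data.
sessions = {
    "visitor-a": ["/index.html", "/products.html", "/contact.html"],
    "visitor-b": ["/products.html", "/index.html"],
    "visitor-c": ["/index.html"],
}
popularity, entries, exits = usage_summary(sessions)
print(popularity.most_common(1))
print(entries.most_common(1))
print(exits.most_common(1))
```

Counts like these say what happened, not why; the "who are the people visiting?" question generally needs data beyond the log itself.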

Page 9: Analyzing Web Logs

Ed Chi asks…

Structural:
  What information has been added, deleted, modified, moved?
Usage + Structural:
  What happens when the site changes? (Google)
  Does navigation change?
  Does popularity change?
  What about missing data?

Page 10: Analyzing Web Logs

How do you analyze web logs?

1. Data Mining: task or intent unknown
  “Automated extraction of hidden predictive information from (large) databases” – Kurt Thearling
  Server log analysis
2. Remote Usability Testing: task or intent known
  Similar to traditional lab usability testing
  Clickstream analysis

What are people doing?

How well does the site support what people are doing?

Page 11: Analyzing Web Logs

How? Data Mining

Statistics and numbers galore!
Gazillions of tools for server log analysis
  Computers>Software>Internet>Site Management>Log Analysis
Usually charts, graphs, numbers galore
Analog & NetTracker typical statistics
In 3D too (eBizinsights)

Page 12: Analyzing Web Logs

How? Data Mining cont’d

Other interesting work:
Web Ecologies (Chi)
  Development over time
Information scent (Chi)
  Behavior patterns
  Understand how to organize info

“Information scent is made of cues that people use to decide whether a path is interesting.”

Useful for web designers?

Page 13: Analyzing Web Logs

Web Ecologies (Chi 1998)

Page 14: Analyzing Web Logs

How? Remote Usability Testing

Analyze clickstream in the context of the task and user intentions

Can be gathered on client, server, and via proxy

Varied granularities of interaction
  Mouse movements to page access
Varied levels of user awareness
  Interactive to invisible
Varied levels of access
  Site only to entire web

Page 15: Analyzing Web Logs

How? Remote Usability Testing

WebVip and VisVip (NIST)
  Server side logging
  JavaScript instrumentation
  Individual paths within context of site
  Animation/replay sessions
Questions:
  What part of site used for a task? Not used?
  How long to finish task? Per page?
  What sorts of behavior for task?

Page 16: Analyzing Web Logs

How? Remote Usability Testing

ClickViz (Blue Martini)
  Server side logging
  Custom instrumentation
  Aggregate paths based on file system
  Include demographics, purchase history
  Filtering
Questions:
  How does visitor of type X compare to type Y?
  Success vs. “failure”

Page 17: Analyzing Web Logs

How? Remote Usability Testing

NetRaker Clickstream

Vividence ClickStreams

Not restricted to servers
Testing suites
Interesting aggregation methods

Page 18: Analyzing Web Logs

How? Remote Usability Testing

WebQuilt (GUIR)

Logging Design Goals:
  Extensible, scalable
  Allow for unobtrusive, “naturalistic” user interaction
  Multi-platform, multi-device compatibility
  Fast and easy to deploy on any website
Solution: Proxy-based logger rewrites links
  Nearly invisible to user
  Independent of client browser
  Infer actions (e.g. back button clicks)
  Stand alone or use with other tools
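The idea of a proxy that rewrites links can be illustrated in a few lines: every link on a served page is pointed back at the proxy, which can then log the click before fetching the real target. This is only a sketch of the general technique, not WebQuilt's actual implementation. The proxy address, the `url` query parameter, the `rewrite_links` helper, and the sample page are all hypothetical, and a regular expression is a crude stand-in for a real HTML parser.

```python
import re
from urllib.parse import urljoin, quote

# Hypothetical proxy endpoint; the original target travels in the "url" parameter.
PROXY_PREFIX = "http://proxy.example/log?url="

# Matches double-quoted href attributes only (a simplification).
HREF = re.compile(r'href="([^"]*)"', re.IGNORECASE)

def rewrite_links(html, page_url):
    """Point every double-quoted href at the proxy, carrying the absolute target URL."""
    def redirect(match):
        # Resolve relative links against the page they appeared on.
        absolute = urljoin(page_url, match.group(1))
        return 'href="' + PROXY_PREFIX + quote(absolute, safe="") + '"'
    return HREF.sub(redirect, html)

page = '<a href="dealers.html">Find a dealer</a>'
rewritten = rewrite_links(page, "http://site.example/index.html")
print(rewritten)
```

Because the rewriting happens on the proxy, nothing has to be installed on the client or changed on the server, which fits the "independent of client browser" and "fast and easy to deploy" goals listed above.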

Page 19: Analyzing Web Logs

How? Remote Usability Testing

WebQuilt (GUIR)

Visual Analysis Tool:
  Put data within context of the design
  Show deviations from expected paths
  Interactive graph
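"Deviations from expected paths" can be illustrated without the visual tool: given the path a designer expects for a task and the path a participant actually took, list the visited pages that are not on the expected path. This is a simplified sketch, not the analysis WebQuilt performs. The `path_deviations` function and both paths are hypothetical.

```python
def path_deviations(expected, actual):
    """Return (index, page) pairs for each visited page not on the expected path."""
    on_path = set(expected)
    return [(i, page) for i, page in enumerate(actual) if page not in on_path]

# Hypothetical task: find safety information for a car model.
expected = ["/home", "/models", "/sentra", "/sentra/safety"]
actual = ["/home", "/models", "/dealers", "/models", "/sentra", "/sentra/safety"]
deviations = path_deviations(expected, actual)
print(deviations)  # → [(2, '/dealers')]
```

Here the participant wandered to one off-path page and then returned, which is the kind of detour a graph view would highlight.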

Page 20: Analyzing Web Logs

Study: Purpose

Exploratory comparison of lab and remote usability testing with mobile devices

What types of usability issues can we:
  find with either method?
  find with one that we can’t find with the other?
Design implications
  testing tools
  testing strategies

Page 21: Analyzing Web Logs

Study: The Mobile Web

Limited and/or new interaction methods
  Small screens
  Graffiti, keypads, thumb-pads
Beyond the desktop
  Driving, traveling, walking
  Noisy, public

Gathering good usability data is vital to making these interfaces, and subsequently these devices, successful.

Page 22: Analyzing Web Logs

Study: Design

10 users asked to find:
  Anti-lock brake information on the latest Nissan Sentra
  The closest Nissan dealer
http://pda.edmunds.com
Handspring Visor Edge with OmniSky wireless modem
5 users in the lab
5 users in the wild
Web-based questionnaires

Page 23: Analyzing Web Logs

Study: Identifying Usability Issues

Lab Data
  Tester observations
  Participant comments
  Questionnaire
Remote Data
  Clickstream analysis
  Questionnaire
Severity Levels
  0 indicates a comment
  1 to 5 (minor to critical)
Four Categories
  Device
  Browser
  Site Design
  Test Design

Page 24: Analyzing Web Logs

Study: Caveats

Analysis and observation for both tests done by same person

Issues identified from remote tests first
  Avoids biasing remote analysis tools

Looking for potential problem areas

Page 25: Analyzing Web Logs

Study: Results

Totals: 18 unique issues, 7 found remotely

Category      Lab  Remote
Device          4       1
Browser         2       0
Test Design     6       2
Site Design     9       5

Site Design
  5 of the 9 issues
  3 of the 4 with severity level > 3

1/3 device or browser related

Test Design
  2 of the 6 issues
  2 of the 4 with severity level > 3

Page 26: Analyzing Web Logs

Study: Process Observations

Remote usability testing can capture some usability issues that lab testing already discovers

Lab testing gets me:
  Qualitative observations
  Thinking aloud comments
  Non-content usability issues

Page 27: Analyzing Web Logs

Study: Process Observations

What can remote testing get us that labs can’t?

Lab effect
  Quitting a task is easier when not in lab
  Network problems more realistic
With more users
  Patterns emerge
  Can reduce uncertainty
Faster

Page 28: Analyzing Web Logs

Study: Conclusions

Remote usability testing is a promising technique for capturing realistic usage data for mobile web site design

Main concerns
  Gathering user feedback on mobile devices is even more difficult because of limited input
  Understanding users can be ambiguous
    Potentially alleviated by ability to test larger number of users

Page 29: Analyzing Web Logs

Discussion

Comments, questions

Where does web log analysis fit into a design cycle?

Understanding what methods to use when and where

Experiences? These or other tools?

[Figure: design cycle with the stages Design, Prototype, Evaluate]