Page 1: Making Mashups  with Marmite

Making Mashups with Marmite

Jeff Wong, Jason I. Hong

Carnegie Mellon University

Page 2: Making Mashups  with Marmite

The Big Picture Problem

• Lots of content out there on the web

– But not always in a form amenable to your needs

– Ex. Easy to get a list of hotels in San Jose, not so easy to sort by distance to convention center

• Two observations:

– In many cases, all of the data and services people need already exist, but not connected together

– Unlikely that a web site can predict all possible needs

Page 8: Making Mashups  with Marmite

A Solution: Mashups

• Rapidly growing community of users creating “mashups” combining content from multiple web sites

– Ex. Housingmaps.com

– Ex. MySpace child predators

– Ex. Friendster locations

– Ex. Most popular videos on YouTube, Yahoo Video, …

• ProgrammableWeb.com statistics

– ~1500 mashups created since April 2005

– 356 open web-based APIs available

Page 9: Making Mashups  with Marmite

But Creating Mashups is Hard

• Requires lots of skill to create a mashup

– Ex. Housingmaps creator has PhD in computer science

– Ex. MySpace child predator list took months

• Requires programming expertise in many areas

– Web crawling

– Text parsing

– Pattern matching

– Databases

– HTML

Page 10: Making Mashups  with Marmite

Marmite: End-User Programming for Mashups

• Main idea: make it easy to create web mashups

• Use a dataflow approach connecting small operators

– Inspired by Unix pipes and Apple’s Automator

• Example:

– Get all events from Upcoming.org

– Filter out events that are too old

– Put them all onto a map

• Runs inside of a standard web browser
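
The example flow on this slide can be sketched as a chain of small operators, in the spirit of Unix pipes. This is a minimal illustration in Python, not Marmite's own code (the slides describe that as JavaScript). The function names, the sample events, and the dictionary row format are all assumptions, and the source returns hard-coded rows rather than calling Upcoming.org.

```python
from datetime import date

# Hypothetical rows standing in for events fetched from an event service.
SAMPLE_EVENTS = [
    {"name": "Jazz night", "date": date(2007, 5, 1), "address": "500 Forbes Ave"},
    {"name": "Old concert", "date": date(2006, 1, 15), "address": "4400 Fifth Ave"},
]

def get_events():
    """Source operator: yields rows (hard-coded here, no network access)."""
    yield from SAMPLE_EVENTS

def drop_older_than(rows, cutoff):
    """Processor operator: keeps only rows dated on or after the cutoff."""
    return (row for row in rows if row["date"] >= cutoff)

def to_map_markers(rows):
    """Sink operator: turns rows into marker labels a map view could show."""
    return [f'{row["name"]} @ {row["address"]}' for row in rows]

def run_flow(cutoff):
    """Connects the operators: source -> filter -> sink."""
    return to_map_markers(drop_older_than(get_events(), cutoff))

markers = run_flow(date(2007, 1, 1))
```

Each operator only consumes and produces rows, so operators can be reordered or swapped without touching the others.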

Page 11: Making Mashups  with Marmite

Set of Operators

Page 12: Making Mashups  with Marmite

Data Flow View

Page 13: Making Mashups  with Marmite

Data View

Page 14: Making Mashups  with Marmite

Using Marmite (Envisioned)

• Extract content from one or more web pages

– names, addresses, dates, phone #, URLs

• Process it in a data flow manner

– filtering out values or adding metadata

– integrating with other data sources (similar to a database join operation)

• Direct the output to a variety of sinks

– databases, map services, text files, visualizations, web pages, or source code that can be further edited
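
The "database join" style of integration mentioned on this slide can be illustrated with a short sketch. The two row lists and the shared `address` key are invented for the example; an actual flow would obtain them from web pages or services.

```python
def join_rows(left, right, key):
    """Inner join: merge each left row with every right row sharing `key`."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined

# Hypothetical data: hotel listings and separately obtained coordinates.
hotels = [
    {"name": "Hotel A", "address": "1 Main St"},
    {"name": "Hotel B", "address": "2 Oak Ave"},
]
coords = [{"address": "1 Main St", "lat": 37.33, "lon": -121.89}]

result = join_rows(hotels, coords, "address")
```

Rows without a match on the other side (Hotel B here) are dropped, which is the inner-join behaviour; a left join would keep them with empty coordinates.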

Page 15: Making Mashups  with Marmite

Marmite

• Motivation and Examples

• Features and Design Rationale

• User Evaluation

Page 16: Making Mashups  with Marmite

Features and Design Rationale

• Conducted a series of quick evaluations to understand design space and potential problems

– Automator

– Lo-fi prototypes

Page 17: Making Mashups  with Marmite

Automator

Page 18: Making Mashups  with Marmite

Informal Automator Evaluation

• Had three novices try three simple web-based tasks

– Warm-up task

– Traverse a set of web pages

– Download a set of images

• Some findings:

– Some difficulties knowing how to start and what to do next

– Little feedback about state of system between operations

– Difficult to iterate due to network speed issues

Page 19: Making Mashups  with Marmite

Lo-Fi Prototypes

• 6 paper prototypes with 20 participants

Page 20: Making Mashups  with Marmite

Design Solutions

• Problem: how to start and what to do next

• Solution: Suggest next actions

– Weak data typing to find types (addresses, numbers, etc)

– Filter operators to only show relevant ones

– Suggest operators that might be applicable
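
One way to picture weak data typing and operator suggestion is the sketch below. It is an assumption-laden illustration, not Marmite's actual logic: the regular expressions, the type names, and the operator catalog are all hypothetical.

```python
import re

# Hypothetical patterns; real type detection would need to be more robust.
TYPE_PATTERNS = [
    ("number", re.compile(r"^-?\d+(\.\d+)?$")),
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
    ("url", re.compile(r"^https?://\S+$")),
    ("address", re.compile(r"^\d+\s+\w+.*\b(St|Ave|Rd|Blvd)\b", re.IGNORECASE)),
]

# Hypothetical operator catalog: operator name -> input type it accepts.
OPERATORS = {
    "Geocode": "address",
    "Filter by date": "date",
    "Fetch page": "url",
    "Filter by value": "number",
}

def guess_type(values):
    """Return the first type whose pattern matches every value, else 'text'."""
    for type_name, pattern in TYPE_PATTERNS:
        if values and all(pattern.match(v) for v in values):
            return type_name
    return "text"

def suggest_operators(columns):
    """List operators whose input type appears among the guessed column types."""
    types = {guess_type(vals) for vals in columns.values()}
    return sorted(name for name, needs in OPERATORS.items() if needs in types)

columns = {
    "where": ["500 Forbes Ave", "4400 Fifth Ave"],
    "when": ["2007-05-01", "2007-05-03"],
}
suggestions = suggest_operators(columns)
```

The typing is "weak" in that a column falls back to plain text whenever no pattern fits every value, so a wrong guess hides suggestions rather than breaking the flow.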

Page 21: Making Mashups  with Marmite
Page 22: Making Mashups  with Marmite

Design Solutions

• Problem: little feedback about state of system between operations

• Solution: link data flow and data view together

– Many systems take program-centric view (ex. Automator) or data-centric view (ex. spreadsheets)

– Use hybrid data flow / data view, showing an operation and its effects together

– Data view usually “spreadsheet”, other views possible too (for example, maps)

Page 23: Making Mashups  with Marmite
Page 24: Making Mashups  with Marmite
Page 25: Making Mashups  with Marmite

Design Solutions

• Problem: difficult to iterate due to network speeds

• Solution: cache data, let people “replay” data

– Reload, pause, play
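
The caching idea can be sketched as a wrapper that fetches once and replays the stored rows afterwards. The class and method names below are hypothetical, the fetch function is a stand-in for a network request, and "pause" is not modelled.

```python
class CachedSource:
    """Wraps a slow fetch function and replays its last result on demand."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that does the slow (network) work
        self._cache = None
        self.fetch_count = 0     # how many times the slow fetch actually ran

    def reload(self):
        """Force a fresh fetch and store the result."""
        self._cache = list(self._fetch())
        self.fetch_count += 1
        return self._cache

    def play(self):
        """Replay cached rows, fetching only if nothing is cached yet."""
        if self._cache is None:
            return self.reload()
        return self._cache

def slow_fetch():
    # Stand-in for a network request; returns fixed rows for the example.
    return [{"title": "Event 1"}, {"title": "Event 2"}]

source = CachedSource(slow_fetch)
first = source.play()
second = source.play()   # served from the cache, no second fetch
```

With this shape, downstream operators can be edited and re-run against the cached rows, and only an explicit reload touches the network again.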

Page 26: Making Mashups  with Marmite

Other Design Findings

• Screen real estate issues

– Collapsible operators, leaving a readable label

Page 27: Making Mashups  with Marmite

Extracting Generic Content

• Can’t have pre-defined extractor operators for every possible web site

– Need a more general way of extracting data from pages

• Developed a generic wizard UI for selecting links

– Content from that set could be extracted via other operators

– Uses Solvent (MIT), an XPath-based algorithm for finding patterns in web pages

• Finds “groups” of related web content based on how HTML is structured
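
To give a feel for grouping content by HTML structure, the sketch below buckets links by the chain of tags that encloses them, which acts as a crude XPath-like key. It is not Solvent's algorithm, only an illustration of the general idea; the sample page is invented, and the parser handling assumes simple, well-nested markup without void elements such as `<br>`.

```python
from collections import defaultdict
from html.parser import HTMLParser

class LinkGrouper(HTMLParser):
    """Groups link URLs by the chain of enclosing tags (a crude XPath-like key)."""

    def __init__(self):
        super().__init__()
        self.stack = []                   # currently open tags
        self.groups = defaultdict(list)   # tag path -> list of hrefs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.groups["/".join(self.stack)].append(href)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates simple nesting only.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

PAGE = """
<html><body>
<div><a href="/about">About</a></div>
<ul>
  <li><a href="/listing/1">Apartment 1</a></li>
  <li><a href="/listing/2">Apartment 2</a></li>
</ul>
</body></html>
"""

grouper = LinkGrouper()
grouper.feed(PAGE)
groups = dict(grouper.groups)
```

The two listing links share the same structural path and land in one group, while the navigation link ends up in a group of its own.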

Page 28: Making Mashups  with Marmite

Marmite

Page 29: Making Mashups  with Marmite

Operators

• Operators have input types

– Operator uses this to guess which columns it wants

• Operators have output types
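
A small sketch of how declared input and output types could drive column selection follows. The table schema, the operator description, and the helper name are hypothetical; the geocoding example (street address in, latitude and longitude out) follows the description of processors later in the deck.

```python
def pick_input_column(column_types, wanted_type):
    """Return the first column whose type matches the operator's input type."""
    for column, col_type in column_types.items():
        if col_type == wanted_type:
            return column
    return None  # no suitable column; the user would have to choose manually

# Hypothetical table schema and a hypothetical geocoding operator.
column_types = {"title": "text", "venue": "address", "starts": "date"}
geocode = {"input": "address", "output": ["latitude", "longitude"]}

chosen = pick_input_column(column_types, geocode["input"])

# After the operator runs, its output columns join the table with their own types.
output_types = {**column_types, **{name: "number" for name in geocode["output"]}}
```

Because the output columns carry types too, the next operator in the flow can make the same kind of guess.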

Page 30: Making Mashups  with Marmite

Implementation

• JavaScript (for underlying code) and Extensible Binding Language (XBL for UI)

• Operators currently in JavaScript

– Ideally could be scriptable in any programming language

– Currently ~15 operators

Page 31: Making Mashups  with Marmite

Marmite

• Motivation and Examples

• Features and Design Rationale

• User Evaluation

Page 32: Making Mashups  with Marmite

Evaluation

• Informal user study with 6 people

– 2 novices

– 2 people with spreadsheet experience (formulas)

– 2 people with programming experience

• Tasks (in increasing difficulty)

– Warmup task showing how to retrieve a set of addresses and how to geocode an address

– Search for and filter out events further than a week away

– Compile a list of events from two event services and plot them on a map

– Recreate the housingmaps site

Page 33: Making Mashups  with Marmite

Results

• Three people able to complete all tasks in ~1 hour

– First two users confused about suggested actions (automatically popped up, made manual for other 4 users)

– Novice made some progress, not able to finish all tasks

• Able to re-create housingmaps in ~15 minutes

Page 34: Making Mashups  with Marmite

Marmite

Page 35: Making Mashups  with Marmite

More Results

• Biggest barrier was understanding the data flow

– Did not understand input and output concept

– Applied operators as one-off, did not realize that it was a static representation of flow

– Did not understand data flow and data view were linked

Page 36: Making Mashups  with Marmite

Future Directions

• Short-term

– Better screen-scraping operators

– More operators

– Better connection with web services (WSDL and REST)

– Better help for starting a data flow

• Long-term

– Intelligence analysis

– Better visualizations

– Location-based services

Page 37: Making Mashups  with Marmite

Conclusions

• Marmite, a tool for creating web-based mashups

– Extract content from one or more web pages

– Process it in a data flow manner

– Direct the output to a variety of sinks

• Hybrid data flow / data view

• User evaluation shows some promising results

Jeff Wong, Jason Hong, Making Mashups with Marmite: Re-purposing Web Content through End-User Programming, CHI 2007

Page 38: Making Mashups  with Marmite
Page 39: Making Mashups  with Marmite
Page 40: Making Mashups  with Marmite
Page 41: Making Mashups  with Marmite

Marmite

Page 42: Making Mashups  with Marmite

Types of Operators

• Sources

– Add data into Marmite by querying databases, extracting information from web pages, and so on.

• Processors

– Modify, combine, or delete existing rows. Example operators include geocoding (converting street addresses to latitude and longitude) and filtering. Processor operators might add or remove columns as well.

• Sinks

– Redirect the flow of data out of Marmite. Examples include showing data on a map, saving it to a file, or to a web page.