using mahout and a search engine for recommendation

47
1 ©MapR Technologies 2013- Multi-input Recommendations

Upload: ted-dunning

Post on 10-May-2015

4.162 views

Category:

Technology


0 download

DESCRIPTION

I presented this talk at the Open World Forum in Paris in 2013. The ideas here are that you can do basic recommendations and extended forms of recommendation such as intelligent search or cross recommendation or multi-modal recommendation using Mahout's cooccurrence analysis together with a search engine.

TRANSCRIPT

Page 1: Using Mahout and a Search Engine for Recommendation

1©MapR Technologies 2013-

Multi-input Recommendations

Page 2: Using Mahout and a Search Engine for Recommendation

2©MapR Technologies 2013-

Me, Us

Ted Dunning, Chief Application Architect, MapRCommitter PMC member, Mahout, Zookeeper, DrillBought the beer at the first HUG

MapRDistributes more open source components for HadoopAdds major technology for performance, HA, industry standard API’s

InfoHash tag - #maprSee also - @ApacheMahout @ApacheDrill

@ted_dunning and @mapR

Page 3: Using Mahout and a Search Engine for Recommendation

3©MapR Technologies 2013-

Topic For Today

What is recommendation? What makes it different? What is multi-model recommendation? How can I build it using common household items?

Page 4: Using Mahout and a Search Engine for Recommendation

4©MapR Technologies 2013-

Oh … Also This

Detailed break-down of a live machine learning system running with Mahout on MapR

With code examples

Page 5: Using Mahout and a Search Engine for Recommendation

5©MapR Technologies 2013-

I may have to summarize

Page 6: Using Mahout and a Search Engine for Recommendation

6©MapR Technologies 2013-

I may have to summarize

just a bit

Page 7: Using Mahout and a Search Engine for Recommendation

7©MapR Technologies 2013-

Part 1:5 minutes of background

Page 8: Using Mahout and a Search Engine for Recommendation

8©MapR Technologies 2013-

Part 2:10 minutes: I want a pony

Page 9: Using Mahout and a Search Engine for Recommendation

9©MapR Technologies 2013-

Part 3:Implementation Examples

Page 10: Using Mahout and a Search Engine for Recommendation

10©MapR Technologies 2013-

Page 11: Using Mahout and a Search Engine for Recommendation

11©MapR Technologies 2013-

Part 1:5 minutes of background

Page 12: Using Mahout and a Search Engine for Recommendation

12©MapR Technologies 2013-

What Does Machine Learning Look Like?

Page 13: Using Mahout and a Search Engine for Recommendation

13©MapR Technologies 2013-

Recommendations as Machine Learning

Recommendation: – Involves observation of interactions between people taking action (users)

and items for input data to the recommender model– Goal is to suggest additional appropriate or desirable interactions– Applications include: movie, music or map-based restaurant choices;

suggesting sale items for e-stores or via cash-register receipts

Page 14: Using Mahout and a Search Engine for Recommendation

14©MapR Technologies 2013-

Page 15: Using Mahout and a Search Engine for Recommendation

15©MapR Technologies 2013-

Page 16: Using Mahout and a Search Engine for Recommendation

16©MapR Technologies 2013-

Part 2:How recommenders work

(I still want a pony)

Page 17: Using Mahout and a Search Engine for Recommendation

17©MapR Technologies 2013-

Recommendations

Recap:Behavior of a crowd helps us understand what individuals will do

Page 18: Using Mahout and a Search Engine for Recommendation

18©MapR Technologies 2013-

Recommendations

Alice got an apple and a puppy

Charles got a bicycle

Alice

Charles

Page 19: Using Mahout and a Search Engine for Recommendation

19©MapR Technologies 2013-

Recommendations

Alice got an apple and a puppy

Charles got a bicycle

Bob got an apple

Alice

Bob

Charles

Page 20: Using Mahout and a Search Engine for Recommendation

20©MapR Technologies 2013-

Recommendations

What else would Bob like??

Alice

Bob

Charles

Page 21: Using Mahout and a Search Engine for Recommendation

21©MapR Technologies 2013-

Recommendations

A puppy, of course!

Alice

Bob

Charles

Page 22: Using Mahout and a Search Engine for Recommendation

22©MapR Technologies 2013-

You get the idea of how recommenders work… (By the way, like me, Bob also wants a pony)

Page 23: Using Mahout and a Search Engine for Recommendation

23©MapR Technologies 2013-

Recommendations

What if everybody gets a pony?

?

Alice

Bob

Charles

Amelia What else would you recommend for Amelia?

Page 24: Using Mahout and a Search Engine for Recommendation

24©MapR Technologies 2013-

Recommendations

?

Alice

Bob

Charles

AmeliaIf everybody gets a pony, it’s not a very good indicator of what to else predict...

Page 25: Using Mahout and a Search Engine for Recommendation

25©MapR Technologies 2013-

Problems with Raw Co-occurrence

Very popular items co-occur with everything (or why it’s not very helpful to know that everybody wants a pony…)– Examples: Welcome document; Elevator music

Very widespread occurrence is not interesting as a way to generate indicators – Unless you want to offer an item that is constantly desired, such as razor

blades (or ponies)

What we want is anomalous co-occurrence– This is the source of interesting indicators of preference on which to

base recommendation

Page 26: Using Mahout and a Search Engine for Recommendation

26©MapR Technologies 2013-

Get Useful Indicators from Behaviors

1. Use log files to build history matrix of users x items– Remember: this history of interactions will be sparse compared to all

potential combinations

2. Transform to a co-occurrence matrix of items x items

3. Look for useful co-occurrence by looking for anomalous co-occurrences to make an indicator matrix

– Log Likelihood Ratio (LLR) can be helpful to judge which co-occurrences can with confidence be used as indicators of preference

– RowSimilarityJob in Apache Mahout uses LLR

Page 27: Using Mahout and a Search Engine for Recommendation

27©MapR Technologies 2013-

Log Files

Alice

Bob

Charles

Alice

Bob

Charles

Alice

Page 28: Using Mahout and a Search Engine for Recommendation

28©MapR Technologies 2013-

History Matrix: Users by Items

Alice

Bob

Charles

✔ ✔ ✔

✔ ✔

✔ ✔

Page 29: Using Mahout and a Search Engine for Recommendation

29©MapR Technologies 2013-

Co-occurrence Matrix: Items by Items

-

1 2

1 1

1

1

2 1

How do you tell which co-occurrences are useful?.

00

0 0

Page 30: Using Mahout and a Search Engine for Recommendation

30©MapR Technologies 2013-

Co-occurrence Binary Matrix

1

1not

not

1

Page 31: Using Mahout and a Search Engine for Recommendation

31©MapR Technologies 2013-

Indicator Matrix: Anomalous Co-Occurrence

Result: The marked row will be added to the indicator field in the item document…

Page 32: Using Mahout and a Search Engine for Recommendation

32©MapR Technologies 2013-

Indicator Matrix

id: t4title: puppydesc: The sweetest little puppy ever.keywords: puppy, dog, pet

indicators: (t1)

That one row from indicator matrix becomes the indicator field in the Solr document used to deploy the recommendation engine.

Note: data for the indicator field is added directly to meta-data for a document in Solr index. You don’t need to create a separate index for the indicators.

Page 33: Using Mahout and a Search Engine for Recommendation

33©MapR Technologies 2013-

Internals of the Recommender Engine

33

Page 34: Using Mahout and a Search Engine for Recommendation

34©MapR Technologies 2013-

Internals of the Recommender Engine

34

Page 35: Using Mahout and a Search Engine for Recommendation

35©MapR Technologies 2013-

A Quick Simplification

Users who do h (a vector of things a user has done)

Also do r

User-centric recommendations(transpose translates back to things)

Item-centric recommendations(change the order of operations)

A translates things into users

Page 36: Using Mahout and a Search Engine for Recommendation

36©MapR Technologies 2013-

Nice. But we can do better?

Page 37: Using Mahout and a Search Engine for Recommendation

37©MapR Technologies 2013-

users

things

Page 38: Using Mahout and a Search Engine for Recommendation

38©MapR Technologies 2013-

users

thingtype 1

thingtype 2

Page 39: Using Mahout and a Search Engine for Recommendation

39©MapR Technologies 2013-

Page 40: Using Mahout and a Search Engine for Recommendation

40©MapR Technologies 2013-

Now again, without the scary math

Page 41: Using Mahout and a Search Engine for Recommendation

41©MapR Technologies 2013-

Part 3:Implementation Examples

Page 42: Using Mahout and a Search Engine for Recommendation

42©MapR Technologies 2013-

For example

Users enter queries (A)– (actor = user, item=query)

Users view videos (B)– (actor = user, item=video)

ATA gives query recommendation– “did you mean to ask for”

BTB gives video recommendation– “you might like these videos”

Page 43: Using Mahout and a Search Engine for Recommendation

43©MapR Technologies 2013-

The punch-line

BTA recommends videos in response to a query– (isn’t that a search engine?)– (not quite, it doesn’t look at content or meta-data)

Page 44: Using Mahout and a Search Engine for Recommendation

44©MapR Technologies 2013-

Real-life example

Query: “Paco de Lucia” Conventional meta-data search results:– “hombres del paco” times 400– not much else

Recommendation based search:– Flamenco guitar and dancers– Spanish and classical guitar– Van Halen doing a classical/flamenco riff

Page 45: Using Mahout and a Search Engine for Recommendation

45©MapR Technologies 2013-

Real-life example

Page 46: Using Mahout and a Search Engine for Recommendation

46©MapR Technologies 2013-

Search-based Recommendations

Sample document– Merchant Id– Field for text description– Phone– Address– Location

– Indicator merchant id’s– Indicator industry (SIC) id’s– Indicator offers– Indicator text– Local top40

Sample query– Current location– Recent merchant descriptions– Recent merchant id’s– Recent SIC codes– Recent accepted offers– Local top40

Original data and meta-data

Derived from cooccurrence and cross-occurrence analysis

Recommendation query

Page 47: Using Mahout and a Search Engine for Recommendation

47©MapR Technologies 2013-

Me, Us

Ted Dunning, Chief Application Architect, MapRCommitter PMC member, Mahout, Zookeeper, DrillBought the beer at the first HUG

MapRDistributes more open source components for HadoopAdds major technology for performance, HA, industry standard API’s

InfoHash tag - #maprSee also - @ApacheMahout @ApacheDrill

@ted_dunning and @mapR