A Cognitive Psychologist's Approach to Data Mining
How I beat Netflix Cinematch
Maggie Xiong
April 22, 2014

General Outline
Parallel Frameworks: Cognitive psychology & data mining
Case Study: The Netflix Prize Project

Abstraction and Generalization
Categorization
  Prototype
  Exemplar
  Decision boundary
  Theory-based categories
Semantic space / LSA
Connectionism

Abstraction
Linguistic ideas (Bransford & Franks, 1971)
  “The ants in the kitchen ate the sweet jelly which was on the table.”
  “The ants in the kitchen ate the sweet jelly.”
  “The ants in the kitchen ate the jelly.”
  “The ants were in the kitchen.”
  Participants were more confident in “recognizing” fuller sentences.
Prototype (Posner & Keele, 1968)
  Participants studied instances generated from distortions of prototypes.
  They showed the same accuracy and response time for never-seen prototypes and memorized instances in a later test.

Categorization
Category structure (Collins & Quillian, 1969)
  Economy of organization
  Participants take longer to respond to statements across category levels.
  Typicality
Exemplar (Jacoby & Brooks, 1984)

Decision Boundary, Theory-based Categories
Decision boundary (Ashby & Gott, 1988)
Theory-based categories (Murphy & Medin, 1985)
  Categories organized around theories about the world.
  clean vs unclean foods; apples and prime numbers

Semantic Space, Latent Semantic Analysis
Shepard, 1987
  Probability of generalization decays exponentially with distance.
Osgood, 1957
  Factor analysis
  Evaluative, potency, activity
Dumais et al., 1988
  SVD, cosine similarity
Landauer & Dumais, 1997

Connectionism
Selfridge, 1958
  Pandemonium
Rumelhart, McClelland, & PDP Research Group, 1986
  Parallel Distributed Processing, 2 Vol Set

Connectionism
Rumelhart & Todd, 1993

Common Ground
Prototype: K-means
Exemplar: K-Nearest Neighbor
Theory-based categories: Collaborative filtering, decision tree
Decision boundary: Support Vector Machine
Semantic space / LSA
Connectionism: artificial neural net
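To make the prototype/exemplar pairing concrete, here is a minimal sketch in Python. The training points and function names are invented for illustration. The prototype classifier compares a new point with each category's mean, much as a point is assigned to its nearest K-means centroid. The exemplar classifier compares it with every stored instance, which is 1-nearest-neighbor.

```python
import math

# Hypothetical 2-D training instances for two categories (illustration only).
training = {
    "A": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 5.5), (5.5, 6.5)],
}

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def prototype_classify(point):
    """Prototype view: pick the category whose mean is closest to the point."""
    return min(training, key=lambda label: math.dist(point, centroid(training[label])))

def exemplar_classify(point):
    """Exemplar view: pick the category of the single closest stored instance."""
    candidates = (
        (math.dist(point, exemplar), label)
        for label, exemplars in training.items()
        for exemplar in exemplars
    )
    return min(candidates)[1]

print(prototype_classify((2.0, 2.0)), exemplar_classify((5.0, 6.0)))
```

The two views agree on well-separated categories like these. They can disagree when a category is spread out and a new point sits near an outlying instance.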

How Cognitive Psychologists Analyze Data
Task completion rate:
           1 Cup      3 Cups
Morning    12,13,10   10,8,10
Evening    14,15,12   23,18,15
Main effect of coffee
  avg(10,8,10,23,18,15) - avg(12,13,10,14,15,12)
Main effect of time-of-day
  avg(14,15,12,23,18,15) - avg(12,13,10,10,8,10)
Interaction
  [avg(23,18,15) - avg(14,15,12)] - [avg(10,8,10) - avg(12,13,10)]
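The three contrasts above can be checked with a short Python sketch. The numbers come from the table on this slide. The dictionary layout and the helper name `pooled_mean` are only one convenient way to organize them.

```python
from statistics import mean

# Task completion rates from the slide, keyed by (time of day, cups of coffee).
rates = {
    ("morning", 1): [12, 13, 10],
    ("morning", 3): [10, 8, 10],
    ("evening", 1): [14, 15, 12],
    ("evening", 3): [23, 18, 15],
}

def pooled_mean(*cells):
    """Average over all observations in the given cells."""
    return mean(x for cell in cells for x in rates[cell])

# Main effect of coffee: 3 cups minus 1 cup, pooled over time of day.
coffee_effect = (pooled_mean(("morning", 3), ("evening", 3))
                 - pooled_mean(("morning", 1), ("evening", 1)))

# Main effect of time of day: evening minus morning, pooled over cups.
time_effect = (pooled_mean(("evening", 1), ("evening", 3))
               - pooled_mean(("morning", 1), ("morning", 3)))

# Interaction: the coffee effect in the evening minus the coffee effect in the morning.
interaction = ((pooled_mean(("evening", 3)) - pooled_mean(("evening", 1)))
               - (pooled_mean(("morning", 3)) - pooled_mean(("morning", 1))))

print(round(coffee_effect, 2), round(time_effect, 2), round(interaction, 2))
# prints: 1.33 5.67 7.33
```

The interaction is the largest of the three: extra coffee helps in the evening and slightly hurts in the morning.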

Graph It
Main effects and interaction
[Figure: task completion rate (y-axis) against cups of coffee (x-axis), with separate series for Evening and Morning]

The Netflix Prize Problem, 2006/10/02
Training set: 17770 movies, 500K users, 100M ratings
  user_id, movie_id, rating, date_of_rating
  movie_id, title, year
Probe set (1.4M ratings)
Qualifying set (2.8M ratings)
  user_id, movie_id, date_of_rating
RMSE
  sqrt( sum((X - X.pred)^2) / N )
  Cinematch: 0.9514
  0.8563 => $1M

Standard Deviation and RMSE
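A minimal Python sketch of the RMSE formula from the problem statement, using made-up ratings. It also illustrates a standard identity: if every prediction is simply the overall mean rating, the RMSE equals the population standard deviation of the ratings.

```python
import math
from statistics import mean, pstdev

def rmse(actual, predicted):
    """Root mean squared error: sqrt(sum((X - X.pred)^2) / N)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(x - p) ** 2 for x, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

# Hypothetical ratings, for illustration only.
ratings = [4, 1, 5, 3, 2]

# Predict the overall mean for every rating.
baseline = [mean(ratings)] * len(ratings)

print(rmse(ratings, baseline), pstdev(ratings))
```

Both printed values are the same, so the overall standard deviation is the score to beat before any movie or user information is used.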

The Netflix Problem, Interpreted
Overall average movie rating: 3.620*
Main effect of movie:
  Miss Congeniality: avg(u1,u2,u3...)
  Mission Impossible: avg(u1,u2,u3...)
Main effect of user:
  Alex: avg(m1,m2,m3...)
  Brian: avg(m1,m2,m3...)
Interaction:
  Alex - Miss Congeniality, Mission Impossible, ...
  Brian - Miss Congeniality, Mission Impossible, ...

RMSE, Appreciated
Overall standard deviation: 1.0822*
“Trivial approach” (main effect of movie): 1.0540
Main effects of movie and user: 0.9889*
  R.pred = M.avg + U.dev
Cinematch: 0.9514
...
...
Prize: 0.8563
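The main-effects predictor R.pred = M.avg + U.dev can be sketched in a few lines of Python. The rating triples below are invented, not Netflix data. U.dev is taken to be a user's mean deviation from the movie averages, which is how the deck's own worked example for Alex computes it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (user, movie, rating) triples; not taken from the Netflix data.
ratings = [
    ("alex", "m1", 4), ("alex", "m2", 1),
    ("brian", "m1", 5), ("brian", "m2", 4), ("brian", "m3", 5),
    ("carol", "m2", 3), ("carol", "m3", 4),
]

# Main effect of movie: each movie's average rating (M.avg).
ratings_by_movie = defaultdict(list)
for user, movie, rating in ratings:
    ratings_by_movie[movie].append(rating)
movie_avg = {movie: mean(rs) for movie, rs in ratings_by_movie.items()}

# Main effect of user: mean deviation from the movie averages (U.dev).
deviations_by_user = defaultdict(list)
for user, movie, rating in ratings:
    deviations_by_user[user].append(rating - movie_avg[movie])
user_dev = {user: mean(devs) for user, devs in deviations_by_user.items()}

def predict(user, movie):
    """R.pred = M.avg + U.dev."""
    return movie_avg[movie] + user_dev[user]

# Alex has not rated m3; he tends to rate below the movie average.
print(round(predict("alex", "m3"), 2))
```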

The Arithmetic Approach
R = M.avg + U.dev + interaction
interaction = R - (M.avg + U.dev)
R.pred = M.avg + U.dev + w.avg(interaction * sim(M.p, M))

Alex                 R   M.avg   dev    interaction
Mission Impossible   4   4.3     -0.3   4 - [4.3 + (-1.4)] = 1.1
Coyote Ugly          1   3.5     -2.5   1 - [3.5 + (-1.4)] = -1.1
Miss Congeniality    ?   4.5

Alex U.dev = ((4 - 4.3) + (1 - 3.5)) / 2 = -1.4
sim(Miss Congeniality, Coyote Ugly) = 0.8
sim(Miss Congeniality, Mission Impossible) = 0.2
? = 4.5 + (-1.4) + (-1.1 * 0.8 + 1.1 * 0.2) / (|0.8| + |0.2|) = 2.44
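The worked example for Alex can be reproduced step by step in Python. All numbers are the ones given above. The variable names are only illustrative.

```python
# Numbers from the worked example for Alex.
movie_avg = {"Mission Impossible": 4.3, "Coyote Ugly": 3.5, "Miss Congeniality": 4.5}
alex_ratings = {"Mission Impossible": 4, "Coyote Ugly": 1}
sim_to_target = {"Mission Impossible": 0.2, "Coyote Ugly": 0.8}
target = "Miss Congeniality"

# U.dev: Alex's mean deviation from the movie averages.
u_dev = sum(r - movie_avg[m] for m, r in alex_ratings.items()) / len(alex_ratings)

# interaction = R - (M.avg + U.dev), one value per movie Alex has rated.
interaction = {m: r - (movie_avg[m] + u_dev) for m, r in alex_ratings.items()}

# Similarity-weighted average of the interactions, normalized by |sim|.
weighted = sum(interaction[m] * sim_to_target[m] for m in interaction)
weighted /= sum(abs(sim_to_target[m]) for m in interaction)

prediction = movie_avg[target] + u_dev + weighted
print(round(u_dev, 2), round(prediction, 2))
# prints: -1.4 2.44
```

Because Miss Congeniality is much more similar to Coyote Ugly than to Mission Impossible, Alex's negative interaction with Coyote Ugly dominates and pulls the prediction below the main-effects value of 3.1.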

Similarity Measures
Romesburg, 1984
  Shape difference vs. size displacement
Euclidean distance
Cosine similarity
Correlation coefficient
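A small Python sketch of the three measures, applied to two invented rating profiles that have the same shape but differ by a constant offset. It illustrates the shape-versus-size distinction: correlation ignores the displacement, while Euclidean distance registers it in full.

```python
import math
from statistics import mean

def euclidean_distance(a, b):
    """Straight-line distance; sensitive to both shape and size differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a, mean_b = mean(a), mean(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])

# Hypothetical profiles: b has the same shape as a, displaced upward by 2.
a = [1, 2, 3, 2]
b = [3, 4, 5, 4]

print(euclidean_distance(a, b), cosine_similarity(a, b), correlation(a, b))
```

Here the correlation is 1 (up to rounding) and the Euclidean distance is 4, so the two measures give opposite answers about whether the profiles are "the same".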

Movie Similarity
Similarity measures
  Co-occurrence count
    How often a person rented both movies.
  Correlation
    A function of the difference in ratings when a person rented both movies.
  Correlation weighted by probability (significance)
  Mean Euclidean distance of movie x user interactions
    interaction = R - (M.avg + U.dev)
Weighted by similarities in
  movie release times, rental frequencies, mean ratings
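Two of the measures above can be sketched in Python on invented data. The co-occurrence count follows the slide directly. The slide does not define "mean Euclidean distance", so the distance function is only one possible reading of it: the root of the mean squared difference in interactions over users who rated both movies. The interaction values and function names are hypothetical.

```python
import math

# Hypothetical interaction values, interaction = R - (M.avg + U.dev), keyed by user.
interactions = {
    "Miss Congeniality": {"alex": -0.5, "brian": 0.4, "carol": 0.1},
    "Coyote Ugly": {"alex": -1.1, "brian": 0.6, "dana": 0.3},
}

def shared_users(movie_a, movie_b):
    """Users who have rated both movies."""
    return interactions[movie_a].keys() & interactions[movie_b].keys()

def co_occurrence(movie_a, movie_b):
    """How many users rated both movies."""
    return len(shared_users(movie_a, movie_b))

def mean_interaction_distance(movie_a, movie_b):
    """Assumed definition: root mean squared interaction difference over shared users."""
    users = shared_users(movie_a, movie_b)
    if not users:
        return None  # no overlap, so no distance can be computed
    squared = sum(
        (interactions[movie_a][u] - interactions[movie_b][u]) ** 2 for u in users
    )
    return math.sqrt(squared / len(users))

print(co_occurrence("Miss Congeniality", "Coyote Ugly"))
print(mean_interaction_distance("Miss Congeniality", "Coyote Ugly"))
```

A distance like this would still need converting into a similarity, with smaller distances mapping to larger weights, before it could be used in the weighted average. The slide does not say how that was done.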

User Clusters
Differentiate movie mean rating and similarity
  R.pred = M.cluster_avg + U.cluster_dev + w.avg(interaction * sim_cluster(M,M.p))
By experience (number of movies rated)
  [2,180], [81,180], [181,240], [240,400], [401,3000]
By gender
  Inferred from preference for different movie clusters
By cluster analysis
  PCA, K-means

Blend It
Generate different sets of predictions using different movie similarity and user cluster strategies
Use linear regression to combine the sets of predictions into one final prediction
Weak learners are good too, as long as they provide unique information.
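A minimal sketch of blending by linear regression, written in plain Python for two prediction sets. The ratings and predictions are invented. The slide does not say whether an intercept was used or which data the regression was fit on, so the no-intercept least-squares fit here is an assumption.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(actual, predicted)) / len(actual))

def blend_weights(pred_a, pred_b, actual):
    """Least-squares weights (no intercept) for combining two prediction sets."""
    s_aa = sum(a * a for a in pred_a)
    s_bb = sum(b * b for b in pred_b)
    s_ab = sum(a * b for a, b in zip(pred_a, pred_b))
    s_ay = sum(a * y for a, y in zip(pred_a, actual))
    s_by = sum(b * y for b, y in zip(pred_b, actual))
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("prediction sets are collinear; nothing to blend")
    # Solve the 2x2 normal equations with Cramer's rule.
    return (s_ay * s_bb - s_by * s_ab) / det, (s_by * s_aa - s_ay * s_ab) / det

# Hypothetical ratings and two imperfect prediction sets.
actual = [4, 1, 5, 3, 2]
pred_a = [3.6, 2.1, 4.2, 3.4, 2.9]
pred_b = [4.4, 1.2, 4.1, 2.2, 2.6]

w_a, w_b = blend_weights(pred_a, pred_b, actual)
blended = [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]

print(rmse(actual, pred_a), rmse(actual, pred_b), rmse(actual, blended))
```

On the data used for fitting, the blend can never score worse than either input, because using one input alone is itself one of the candidate weightings. That is why a weak predictor with different errors can still lower the combined RMSE.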

RMSE, 2008/04/01
Overall standard deviation: 1.0822*
“Trivial approach” (main effect of movie): 1.0540
Main effects of movie and user: 0.9889*
  R.pred = M.avg + U.dev
Cinematch: 0.9514
...
Naga FX: 0.9063
...
Prize: 0.8563

Looking Back
Cognitive theories and data mining methods
  Prototype: K-means
  Exemplar: K-Nearest Neighbor
  Theory-based categories: Collaborative filtering, decision tree
  Decision boundary: Support Vector Machine
  Semantic space / LSA
  Connectionism: artificial neural net
Abstraction and generalization
  It’s all about similarity.
  Tversky, 1977
  Murphy & Medin, 1985