Current steps to become a researcher and faculty member
DESCRIPTION
* Short introduction to myself (where I am from, what my hobbies are)
* Presentation of my research activities over the last 2 years, with a more detailed presentation of the last paper I wrote with Xavier Amatriain, to be presented at UMAP 2011
TRANSCRIPT
1
Denis studying/working to be a faculty/researcher
(Denis Parra || Denis Parra-Santander)
PhD Student
http://www.sis.pitt.edu/~dparra/
March 18th 2011 PAWS Lab – School of Information Sciences – University of Pittsburgh
2
What is this presentation about?
I. A short introduction of myself
II. A description of my research interests and what I have been doing about them in recent years
3
I.1 Where are you from?
• I am from Chile, a country that looks like a chile pepper but where, paradoxically, people don’t eat much spicy food.
Chile ≠ [red hot chile pepper] && Chile ≠ México
4
I.2 Are you from Santiago, the capital?
• Good try. One third of the 16 million Chileans live in Santiago. But Chile is a looong country: the north is hot and dry, and the south is very cold. I live in Valdivia, a city with rainy weather.
Very Hot!
Very Cold!
Here I live! Valdivia
5
I.3 Which activities do you like to do?
• I like playing tennis, running & rowing
• I like writing poetry. Check some poems here in Spanish (translated to English)
• I like reading novels; my favorite authors are J. L. Borges, Fyodor Dostoyevsky & James Joyce (right now I’m reading a novel by Roberto Bolaño)
• I like listening to music, from Blues to Lady Gaga, by way of Pink Floyd, Radiohead and Los Jaivas.
• I like watching movies like “A Clockwork Orange” by S. Kubrick and “Underground” by E. Kusturica. I also like surrealistic movies like “The Holy Mountain” by Alejandro Jodorowsky.
6
I.4 OK, but now let’s talk about work…
• (1997 - 2002) I earned a BS in Engineering with an emphasis in Informatics from Universidad Austral de Chile. This is a 6-year program; my undergrad thesis was titled “SPORAS: An Adaptive Web Platform based on a Multiagent System and Ontologies” (my first link to Dr. Brusilovsky’s field, Adaptive Hypermedia)
• Then I worked on several e-learning projects, developing on Open Source LMSs such as Dokeos and Moodle (2003-2004). Later on, I worked as IT Manager and consultant for an aquaculture company, Aqua Cards, in the South of Chile (2005-2007)
• I also taught OOP (Java), Matlab, and Introduction to Software Engineering (2004, 2006-2007)
• In 2007 I co-founded a company, Perceptum TI.
7
I.5 … and what about research?
• In 2008 I started the PhD program and joined the PAWS lab (led by Dr. Brusilovsky)
• … so here is where this presentation starts
– Tag-based recommendations
– Spreading Activation for recommender systems
– Related projects
• CourseAgent
• TagTheMap
• Conference Navigator
• Latent Communities
– This Presentation
• “Walk the Talk”: Mapping explicit
8
I.6.1 Tag-based recommendations
Main topic: Most items in many systems have few or no ratings, which pushes us to look for alternative ways to apply user-based and item-based Collaborative Filtering. We explore 2 variants: neighbor-weighted and tag-based BM25.
1. Presented a workshop paper at HT ’09, p1
2. Presented a short paper (poster) at RecSys ’09, p2
3. Presented a short paper at WI ’09, p3
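To make the tag-based BM25 idea concrete, here is a minimal sketch of textbook Okapi BM25 applied to tags: each item's bag of tags plays the role of a document and the user's tags play the role of the query. The function name, the parameter defaults (k1, b) and the toy data are assumptions for illustration; the exact variant used in the papers may differ.

```python
import math
from collections import Counter


def bm25_scores(query_tags, item_tags, k1=1.2, b=0.75):
    """Score every item against a bag of query tags with Okapi BM25."""
    n_items = len(item_tags)
    avg_len = sum(len(tags) for tags in item_tags.values()) / n_items
    # Document frequency: number of items carrying each tag at least once.
    df = Counter(tag for tags in item_tags.values() for tag in set(tags))
    scores = {}
    for item, tags in item_tags.items():
        tf = Counter(tags)
        score = 0.0
        for tag in set(query_tags):
            if tag not in tf:
                continue
            idf = math.log((n_items - df[tag] + 0.5) / (df[tag] + 0.5) + 1.0)
            # Length normalisation: long tag bags are penalised via b.
            norm = tf[tag] + k1 * (1 - b + b * len(tags) / avg_len)
            score += idf * tf[tag] * (k1 + 1) / norm
        scores[item] = score
    return scores


# Hypothetical toy data: items described by the tags users assigned to them.
item_tags = {
    "paper_a": ["recsys", "tags", "tags", "folksonomy"],
    "paper_b": ["hypertext", "navigation"],
    "paper_c": ["recsys", "evaluation"],
}
scores = bm25_scores(["tags", "recsys"], item_tags)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Items sharing no tag with the query get a score of 0, so the ranking depends only on tag overlap and needs no ratings.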
9
I.6.2 Spreading activation
• Presented a paper at a workshop of RecSys 2009, p4
• Looked for a way to apply spreading activation to recommendations in order to:
– Make use of the multidimensional network structure of folksonomies (users, items, tags)
– Find a scalable algorithm (compared to the state-of-the-art FolkRank, SVD and LDA-based approaches) that makes use of local topology/neighborhood
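As a rough illustration of the idea, the sketch below runs a generic spreading-activation pass over a user/tag/item graph. It is not the algorithm from the paper: the graph construction, the number of pulses, the decay factor and the toy tag assignments are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


def build_graph(assignments):
    """Build an undirected graph from (user, tag, item) tag assignments."""
    graph = defaultdict(set)
    for user, tag, item in assignments:
        u, t, i = ("user", user), ("tag", tag), ("item", item)
        for a, b in ((u, t), (t, i), (u, i)):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def spread_activation(graph, source, pulses=3, decay=0.5):
    """Propagate activation from `source`, touching only local neighbourhoods."""
    total = defaultdict(float)
    frontier = {source: 1.0}
    for _ in range(pulses):
        nxt = defaultdict(float)
        for node, energy in frontier.items():
            neighbours = graph.get(node, ())
            if not neighbours:
                continue
            # Each node passes a decayed, equal share to every neighbour.
            share = decay * energy / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        for node, energy in nxt.items():
            total[node] += energy
        frontier = nxt
    return total


def recommend(graph, user, top_n=5):
    """Rank items the user has not tagged yet by accumulated activation."""
    source = ("user", user)
    seen = graph.get(source, set())
    activation = spread_activation(graph, source)
    candidates = [
        (node[1], score)
        for node, score in activation.items()
        if node[0] == "item" and node not in seen
    ]
    return sorted(candidates, key=lambda pair: -pair[1])[:top_n]


# Hypothetical toy folksonomy.
graph = build_graph([
    ("ana", "jazz", "album1"),
    ("ben", "jazz", "album2"),
    ("ben", "rock", "album3"),
    ("cat", "rock", "album3"),
])
recs = recommend(graph, "ana")
```

Because each pulse only visits the neighbours of currently active nodes, the cost depends on the local neighbourhood rather than on the size of the whole folksonomy.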
10
I.6.3. Related Projects
Course Agent
Allows you to plan your courses and career, for SIS and CS students. I am supporting the current version and developing the new one.
TagTheMap
A Facebook application that allows users to tag places on a map and share them with Facebook friends.
Conference Navigator
A web system that allows conference attendees to plan their conference, make contacts and receive paper and contact recommendations.
Latent Communities
In this project we are exploring ways to identify “hidden” communities in social networks. We are using Gephi and developing plugins for that platform.
11
Part II: so finally…
• This project is based on the work of my internship at Telefónica Research (Barcelona, Spain) in the Summer of 2010
• Paper submitted to UMAP 2011: Walk the Talk
Analyzing the relation between implicit and explicit feedback for preference elicitation
(I am co-author with Dr. Xavier Amatriain)
12
II.1 Introduction (1/2)
• Explicit feedback: scarcity (people are not especially eager to rate)
• Implicit feedback: is less scarce, but (Hu et al., 2008)
– There’s no negative feedback
– Noisy
– Preference vs. confidence
– Lack of evaluation metrics
… and if you watch a TV program just once or twice?
… but explicit feedback is also noisy (Amatriain et al., 2009)
… we aim to map the I.F. to preference (our main goal)
… if we can map I.F. and E.F., we can have a comparable evaluation
13
II.1 Introduction (2/2)
• Which variables better account for the number of times a user listens to online albums?
• Is it possible to map implicit behavior to explicit preference (ratings)?
• Study with Last.fm users:
– Part I: demographics and online music consumption
– Part II: Rating 100 albums collected from their last.fm user profile
14
II.2 About last.fm
15
II.2.1 Survey Screenshots
• Requirements: 18 y.o., scrobblings > 5000
16
II.2.2 Survey Part I
• Pre-req: 18 years old & 5,000 min playcount (scrobblings)
• # Users: 151 users started, 127 completed, 114 after filtering outliers.
• 82% were male and 18% were female.
• From 23 different countries; the main ones were Spain (25 users), U.S. (15 users), and UK (16 users).
• 80% used the internet 20 or more hours per week. 50% of users listened to music for over 20 hours per week.
• 9% did not attend music concerts. 30% went to 11 or more concerts a year.
• 35% said that they only read music magazines or blogs sometimes, but 20% did it every week.
• 50% of our subjects admitted rating music online never or seldom.
• 45% of our subjects said they bought 1 to 10 physical records a year. However, a non-negligible 18% said they did not buy any.
• 35% of our subjects reported never buying music online; 8% said they did it once a month or more.
• 14% preferred to listen to single tracks while over 45% preferred listening to full albums. The other 40% reported listening to music either way.
17
II.2.3 Survey Part II
• For item (album) sampling, we accounted for
– Implicit Feedback (IF): playcount for a user on a given item. Changed to scale [1-3], 3 means being more listened to.
– Global Popularity (GP): global playcount for all users on a given item. Changed to scale [1-3], 3 means being more listened to.
– Recentness (R): time elapsed since user played a given item. Changed to scale [1-3], 3 means being listened to more recently.
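The slide does not say how the raw values were mapped to the [1-3] scale. One plausible reading is a split into rank tertiles, sketched below; the function name and the sample numbers are hypothetical, and the actual study may have used different cut points.

```python
def to_three_levels(values):
    """Map raw values to levels 1..3 by rank tertile (ties share the lowest rank)."""
    ordered = sorted(values)
    n = len(ordered)
    levels = []
    for value in values:
        rank = ordered.index(value)  # 0-based position of the first occurrence
        levels.append(1 + (3 * rank) // n)
    return levels


# Assumed sample: one user's playcounts for six albums.
playcounts = [2, 150, 40, 7, 90, 300]
if_levels = to_three_levels(playcounts)

# Recentness: fewer days elapsed must give a higher level, so negate first.
days_since_last_play = [1, 400, 30]
recentness_levels = to_three_levels([-d for d in days_since_last_play])
```

The same helper would serve for GP by passing global playcounts instead of per-user playcounts.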
18
II.3 General Analysis
• Initial assumption: Rating and IF (# playcount) must be strongly correlated.
19
II.3.1 Distribution of ratings
Average rating:
- Considering 0s: 3.206316
- Not considering 0s: 3.611144
20
II.3.2 Implicit Feedback
21
II.3.3 Recentness
22
II.3.4 Global Popularity
23
Effect of Track or CD
24
II.3.5 General Analysis - Findings
• We “see” strong positive correlation between ratings and implicit feedback
• We “see” some level of positive correlation between ratings and recentness
• We don’t expect a significant relation between ratings and global popularity.
• On demographic data: only the preference for listening to single tracks vs. full albums shows a significant effect (using ANOVA)
25
II.4 Regression Analysis
• Including Recentness increases R2 by more than 10% [ 1 -> 2]
• Including GP increases R2, but not much compared to RE + IF [ 1 -> 3]
• Not including GP, but including the interaction between IF and RE, increases the variance of the DV explained by the regression model. [ 2 -> 4 ]
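The bracketed numbers refer to a table of models that is not reproduced in this transcript. Read from the bullets alone, the four linear models might look as follows. Taking model 1 to be IF-only is an assumption, and r denotes the rating (the DV).

```latex
\begin{align*}
\text{Model 1:}\quad r &= \beta_0 + \beta_1\,\mathit{IF} + \varepsilon \\
\text{Model 2:}\quad r &= \beta_0 + \beta_1\,\mathit{IF} + \beta_2\,\mathit{RE} + \varepsilon \\
\text{Model 3:}\quad r &= \beta_0 + \beta_1\,\mathit{IF} + \beta_2\,\mathit{RE} + \beta_3\,\mathit{GP} + \varepsilon \\
\text{Model 4:}\quad r &= \beta_0 + \beta_1\,\mathit{IF} + \beta_2\,\mathit{RE} + \beta_3\,(\mathit{IF}\cdot\mathit{RE}) + \varepsilon
\end{align*}
```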
26
II.4.1 Regression Analysis
• We tested the conclusions of the regression analysis by predicting the score, using RMSE and 10-fold cross-validation
• Results of regression analysis are supported.
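The evaluation procedure can be sketched as follows: fit a linear model on nine folds, predict the held-out fold, and report RMSE over all held-out predictions. This is a minimal illustration, not the study's code. The data are synthetic, and the helper names, fold assignment and coefficients are assumptions.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def cv_rmse(rows, y, folds=10, seed=0):
    """RMSE of held-out predictions, pooled over `folds` cross-validation folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    squared_error = 0.0
    for f in range(folds):
        test = set(idx[f::folds])
        train = [i for i in idx if i not in test]
        beta = fit_ols([rows[i] for i in train], [y[i] for i in train])
        for i in test:
            prediction = sum(b * v for b, v in zip(beta, rows[i]))
            squared_error += (prediction - y[i]) ** 2
    return math.sqrt(squared_error / len(y))


# Synthetic, illustrative data only: IF and RE levels in 1..3.
rng = random.Random(1)
data = [(rng.randint(1, 3), rng.randint(1, 3)) for _ in range(200)]
ratings = [1.0 + 0.8 * i + 0.4 * r + rng.gauss(0, 0.3) for i, r in data]

rmse_if = cv_rmse([[1.0, i] for i, _ in data], ratings)
rmse_if_re = cv_rmse([[1.0, i, r] for i, r in data], ratings)
```

Comparing `rmse_if` with `rmse_if_re` mirrors the comparison on the slide: a predictor that helps should lower the cross-validated error, not just raise the in-sample R2.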
27
II.4.2 Regression Analysis
• Including track or CD
• Including this variable, which seemed to have an effect in the general analysis, helped to improve the accuracy of the model
28
II.5 Conclusions
• Using a linear model, implicit feedback (playcount) and recentness can help to predict explicit feedback (ratings)
• Global popularity doesn’t show a significant improvement in the prediction task (discussion)
• Our model can help to relate implicit and explicit feedback, helping to evaluate and compare explicit and implicit recommender systems.
• Ongoing Work?
30
Survey part I results
31
Graphics comparing % of ratings given 2 variables