2017
SAS Analytics Day
Price Recommendation Engine for Airbnb
Praneeth Guggilla, Student, Oklahoma State University
Snigdha Gutha, Student, Oklahoma State University
Objective
• Understand the factors that influence the occupancy rate of a property.
• Analyze how price affects occupancy rate.
Data Extraction
• Extracted information related to host, price and availability of the New York listings.
• Extracted 614,128 customer reviews and ratings of all New York listings.
*Source: http://www.airbnb.com/
Data Cleaning
• Using latitude and longitude, we classified all locations into five major neighborhoods (Manhattan, Bronx, Staten Island, Queens, and Brooklyn).
• Appropriate transformations were applied to reduce skewness and kurtosis.
Source: http://www.insideairbnb.com/
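The two cleaning steps above could be sketched in Python as follows. This is an illustration only, not the authors' workflow: it assigns each listing to the neighborhood whose approximate centre is nearest (real boundaries are irregular, so a careful version would use boundary polygons), and it applies a log transform, one common choice for right-skewed variables such as price. The centre coordinates, function names, and sample prices are all assumptions.

```python
import math

# Approximate centres (latitude, longitude); illustrative assumptions,
# not values taken from the presentation.
BOROUGH_CENTRES = {
    "Manhattan": (40.7831, -73.9712),
    "Brooklyn": (40.6782, -73.9442),
    "Queens": (40.7282, -73.7949),
    "Bronx": (40.8448, -73.8648),
    "Staten Island": (40.5795, -74.1502),
}

def nearest_borough(lat, lon):
    """Return the name whose centre is closest to (lat, lon)."""
    def squared_distance(centre):
        # Scale longitude by cos(latitude) so both axes are comparable.
        dlat = lat - centre[0]
        dlon = (lon - centre[1]) * math.cos(math.radians(lat))
        return dlat * dlat + dlon * dlon
    return min(BOROUGH_CENTRES, key=lambda name: squared_distance(BOROUGH_CENTRES[name]))

def skewness(values):
    """Population skewness: third central moment over the variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values):
    """log(1 + x) transform, which pulls in a long right tail."""
    return [math.log1p(v) for v in values]

# Made-up nightly prices with a long right tail.
sample_prices = [40, 55, 60, 75, 80, 95, 120, 150, 400, 1200]
```

Comparing `skewness(sample_prices)` with `skewness(log_transform(sample_prices))` shows whether the transform helped for a given variable; other transforms (square root, Box-Cox) may suit other variables better.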
Percentage of Low Occupancy Listings
• High Occupancy Listings = 12,531
• Low Occupancy Listings = 11,328
• Overall listings = 23,859
• Total Variables = 65
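The share implied by these counts can be checked with a short calculation. A minimal sketch; the variable names are illustrative.

```python
high_occupancy = 12_531
low_occupancy = 11_328
overall = 23_859

# The two groups should account for every listing.
assert high_occupancy + low_occupancy == overall

low_share = 100 * low_occupancy / overall
high_share = 100 * high_occupancy / overall
print(f"Low occupancy: {low_share:.2f}%")    # prints "Low occupancy: 47.48%"
print(f"High occupancy: {high_share:.2f}%")  # prints "High occupancy: 52.52%"
```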
Text Mining
*Reference: Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS by Dr. Goutam Chakraborty, Murali Pagolu, and Satish Garla
Variable Selection
• Variable clustering is used to select the continuous variables.
• All the categorical variables and text topics are used.
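Variable clustering groups correlated continuous variables so that one representative per group can be kept. The sketch below is a simplified greedy stand-in written in Python, not the procedure the authors used. The 0.7 correlation threshold, the function names, and the toy columns are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cluster_variables(columns, threshold=0.7):
    """Greedily group variables whose |correlation| with a cluster's
    first member is at least `threshold`."""
    clusters = []  # each cluster is a list of variable names
    for name, values in columns.items():
        for cluster in clusters:
            representative = columns[cluster[0]]
            if abs(pearson(values, representative)) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no similar cluster found; start a new one
    return clusters

# Toy data with hypothetical column names.
columns = {
    "bedrooms": [1, 2, 3, 4, 5, 6],
    "beds": [1, 2, 4, 4, 6, 6],
    "review_score": [9, 7, 10, 8, 9, 7],
}
clusters = cluster_variables(columns)
selected = [cluster[0] for cluster in clusters]  # one representative per cluster
```

Here `bedrooms` and `beds` are strongly correlated and fall into one cluster, so only one of them would be carried into the model alongside `review_score`.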
Regression Model
Occupancy Model
• Significant predictors – Price and room type
• Important predictors – Security deposit, cleanliness review, neighborhood location, and extra person fee
Price Model
• Significant predictors – Security deposit and room type
• Important predictors – Property type and neighborhood
Occupancy Model – Misclassification rate
• Text topics not included: 39.84%
• Text topics included: 38.6%
Price Model – Adjusted R Square
• Text topics not included: 68.74%
• Text topics included: 69.78%
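For reference, the two comparison metrics can be computed as below. This is a generic sketch of the standard definitions, not the authors' code. The function names and the small example inputs are hypothetical, and the presentation does not state the sample size or number of predictors behind its figures.

```python
def misclassification_rate(actual, predicted):
    """Share of cases where the predicted class differs from the actual class."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-square: penalises R-square for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Made-up labels: 1 = high occupancy, 0 = low occupancy.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
rate = misclassification_rate(actual, predicted)  # 2 wrong out of 8

# Differences between the reported figures, in percentage points.
occupancy_gain = round(39.84 - 38.6, 2)
price_gain = round(69.78 - 68.74, 2)
```

On the reported figures, including text topics lowers the misclassification rate by 1.24 percentage points and raises adjusted R-square by 1.04 percentage points.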
Insights & Conclusions
Insights
• Manhattan - the security deposit for high and low occupancy rate listings differed by 25%
• Bronx - the average price for high and low occupancy rate listings differed by $18
• Shared apartment service had a 12% higher occupancy rate than other room types
• Flexible pricing was more effective than strict and moderate pricing policies
Conclusions & Future Scope
• Price is a major determining factor.
• Cleanliness and reviews by other customers are other driving factors.
• Perform sentiment analysis on the text reviews
• Build an optimization model to come up with optimal prices.
Contact
Name: Praneeth Guggilla
Organization: Oklahoma State University
Contact No: 405-780-5330
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/praneethguggilla
Name: Snigdha Gutha
Organization: Oklahoma State University
Contact No: 405-780-5330
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/snigdhagutha