2017 SAS Analytics Day

Price Recommendation Engine for Airbnb

Praneeth Guggilla, Student, Oklahoma State University

Snigdha Gutha, Student, Oklahoma State University



Objective

• Understand the factors that influence the occupancy rate of a property.

• Analyze how price determines the occupancy rate.

Data Extraction

• Extracted information related to host, price and availability of the New York listings.

• Extracted 614,128 customer reviews and ratings of all New York listings.

*Source: http://www.airbnb.com/

Data Cleaning

• Using latitude and longitude, we classified all locations into five major neighborhoods (Manhattan, Bronx, Staten Island, Queens, and Brooklyn).

• Appropriate transformations were applied to reduce skewness and kurtosis.
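
The slides do not say which transformations were applied. As a minimal sketch, assuming a log transform on a right-skewed variable such as nightly price, the effect on skewness and kurtosis can be checked as follows. The price values are hypothetical and are not taken from the Airbnb data.

```python
import math

def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Population skewness: third central moment over variance^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment over variance^2, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical nightly prices with a long right tail (illustrative only).
prices = [40, 55, 60, 75, 80, 95, 110, 150, 300, 1200]

# Assumed transformation: log(1 + x), a common choice for right-skewed prices.
log_prices = [math.log1p(p) for p in prices]

print(f"skewness: raw={skewness(prices):.2f}, log={skewness(log_prices):.2f}")
print(f"excess kurtosis: raw={excess_kurtosis(prices):.2f}, "
      f"log={excess_kurtosis(log_prices):.2f}")
```

On this toy sample the log transform pulls in the single very expensive listing, so both statistics move closer to zero.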

Source: http://www.insideairbnb.com/
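
The slides do not describe how coordinates were mapped to the five areas. One crude way to sketch the idea is to assign each listing to the nearest reference point. The reference coordinates below are approximate and hypothetical stand-ins; a production pipeline would more likely test each point against official boundary polygons.

```python
import math

# Approximate reference points (latitude, longitude) for each area.
# These are illustrative assumptions, not values from the presentation.
AREA_CENTERS = {
    "Manhattan": (40.7831, -73.9712),
    "Bronx": (40.8448, -73.8648),
    "Staten Island": (40.5795, -74.1502),
    "Queens": (40.7282, -73.7949),
    "Brooklyn": (40.6782, -73.9442),
}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def nearest_area(lat, lon):
    """Return the area whose reference point is closest to (lat, lon)."""
    return min(
        AREA_CENTERS,
        key=lambda name: haversine_km(lat, lon, *AREA_CENTERS[name]),
    )

# Usage: a listing near Times Square.
print(nearest_area(40.7580, -73.9855))
```

Nearest-point assignment misclassifies listings close to a boundary, which is why it is only a sketch of the step.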

Percentage of Low Occupancy Listings

• High occupancy listings = 12,531

• Low occupancy listings = 11,328

• Overall listings = 23,859

• Total variables = 65
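
The share of each class follows directly from the counts on this slide. A short check of the arithmetic, using only the reported figures:

```python
# Counts reported on the slide.
high_occupancy = 12_531
low_occupancy = 11_328
overall = 23_859

# The two classes should add up to the overall listing count.
assert high_occupancy + low_occupancy == overall

low_share = low_occupancy / overall
high_share = high_occupancy / overall

print(f"Low occupancy share:  {low_share:.2%}")
print(f"High occupancy share: {high_share:.2%}")
```

The classes are close to balanced (roughly 47.5% low versus 52.5% high), which gives a useful baseline when reading the misclassification rates of the occupancy model.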

Text Mining

*Reference: Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS by Dr. Goutam Chakraborty, Murali Pagolu, and Satish Garla

Amenities Text Topics

Modeling Approach

Note: Refer to the paper for the detailed modeling approach.

Variable Selection

• Variable clustering is used to select the continuous variables.

• All the categorical variables and text topics are used.
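
Variable clustering groups correlated continuous variables and keeps one representative per group. The sketch below is a deliberately simplified stand-in, not the SAS procedure the authors may have used: it greedily groups variables whose absolute Pearson correlation with a cluster's first member exceeds a threshold. The column names, values, and the 0.8 threshold are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def cluster_variables(columns, threshold=0.8):
    """Greedy grouping: join the first cluster whose seed correlates strongly."""
    clusters = []  # each cluster is a list of names; element 0 is the seed
    for name, values in columns.items():
        for cluster in clusters:
            if abs(pearson(values, columns[cluster[0]])) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no strong match: start a new cluster
    return clusters

# Hypothetical continuous variables for six listings.
columns = {
    "bedrooms": [1, 2, 3, 4, 5, 6],
    "beds": [1, 2, 4, 4, 6, 7],
    "accommodates": [2, 4, 6, 8, 10, 13],
    "minimum_nights": [3, 1, 7, 2, 1, 4],
}

clusters = cluster_variables(columns)
selected = [cluster[0] for cluster in clusters]  # one representative each
print(clusters)
print(selected)
```

Here the three size-related variables collapse into one cluster, so only one of them would be passed to the models alongside `minimum_nights`.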

Regression Model

Occupancy Model

• Significant predictors – Price and room type

• Important predictors – Security deposit, cleanliness review, neighborhood location, and extra person fee

Price Model

• Significant predictors – Security deposit and room type

• Important predictors – Property type and neighborhood

Occupancy Model: Misclassification rate

• Text topics not included: 39.84%

• Text topics included: 38.6%
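
The misclassification rate is the fraction of listings whose predicted occupancy class differs from the actual class. A minimal sketch with hypothetical labels, followed by the gain implied by the two reported rates:

```python
def misclassification_rate(actual, predicted):
    """Fraction of observations whose predicted class differs from the actual one."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

# Hypothetical labels for five listings (illustrative only).
actual = ["high", "low", "high", "high", "low"]
predicted = ["high", "high", "high", "low", "low"]
toy_rate = misclassification_rate(actual, predicted)  # 2 of 5 are wrong

# Rates reported on the slide, in percent.
rate_without_topics = 39.84
rate_with_topics = 38.6
improvement_points = rate_without_topics - rate_with_topics

print(f"Toy misclassification rate: {toy_rate:.0%}")
print(f"Reported improvement: {improvement_points:.2f} percentage points")
```

Adding the text topics lowers the reported rate by 1.24 percentage points.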

Price Model: Adjusted R-square

• Text topics not included: 68.74%

• Text topics included: 69.78%
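
Adjusted R-square penalizes R-square for the number of predictors, which matters here because including text topics adds variables to the model. A sketch of the standard formula follows; the sample size and predictor count in the usage line are hypothetical, since the slides do not report them for the price model.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_observations - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    penalty = (n_observations - 1) / (n_observations - n_predictors - 1)
    return 1.0 - (1.0 - r_squared) * penalty

# Hypothetical example: R^2 = 0.70 with 101 observations and 10 predictors.
toy_adjusted = adjusted_r_squared(0.70, 101, 10)

# Values reported on the slide, in percent.
adj_without_topics = 68.74
adj_with_topics = 69.78
gain_points = adj_with_topics - adj_without_topics

print(f"Toy adjusted R-square: {toy_adjusted:.4f}")
print(f"Reported gain: {gain_points:.2f} percentage points")
```

Because the reported metric is already adjusted, the 1.04-point gain is net of the penalty for the extra text-topic variables.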

Insights & Conclusions

Insights

• Manhattan - security deposit for high and low occupancy rate listings differed by 25%

• Bronx - average price for high and low occupancy rate listings differed by $18

• Shared apartment service had 12% higher occupancy rate than other room types

• Flexible pricing was more effective than strict and moderate pricing policy

Conclusions & Future Scope

• Price is the major determining factor.

• Cleanliness and reviews by other customers are other driving factors.

• Perform sentiment analysis on the text reviews

• Build an optimization model to come up with optimal prices.


Contact

Name: Praneeth Guggilla
Organization: Oklahoma State University
Contact No: 405-780-5330
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/praneethguggilla

Name: Snigdha Gutha
Organization: Oklahoma State University
Contact No: 405-780-5330
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/snigdhagutha