find my ride (wide)

FIND MY RIDE IMPROVING YOUR BIKE SHARE EXPERIENCE DIANE IVY

Upload: diane-ivy

Post on 12-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

FIND MY RIDE: IMPROVING YOUR BIKE SHARE EXPERIENCE

DIANE IVY

▸ Bike shares are awesome

▸ Flexible and convenient

▸ But only if a bike is available when you need it


▸ Can get current status from Hubway website or 3rd party apps.

▸ Often bikes will be gone by the time you get to a station.

▸ Can we help people find the best station to go to?


Bike Data

▸ 2 years of station data

▸ 96 bike share stations

▸ Number of bikes available every minute at each station

Weather Data

▸ Hourly historical weather data from NOAA
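Since the bike data is per minute and the weather data is hourly, the two sources have to be aligned before modeling. One possible way to do that is sketched below, assuming each bike record is matched to the weather reading for the hour it falls in. The record layout (`time`, `station`, `bikes`, `temp`, `precip`), the function names, and the sample values are hypothetical and are not taken from the slides.

```python
from datetime import datetime


def hour_key(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def attach_weather(bike_rows, weather_by_hour):
    """Add the matching hourly weather reading to each per-minute bike record.

    bike_rows: iterable of dicts with "time", "station", "bikes" (assumed schema)
    weather_by_hour: dict mapping hour-start datetime -> {"temp", "precip"}
    Records with no weather reading for their hour are skipped.
    """
    joined = []
    for row in bike_rows:
        weather = weather_by_hour.get(hour_key(row["time"]))
        if weather is None:
            continue
        joined.append({**row, **weather})
    return joined


# Made-up sample records, for illustration only
bike_rows = [
    {"time": datetime(2016, 7, 5, 8, 17), "station": "A", "bikes": 3},
    {"time": datetime(2016, 7, 5, 9, 2), "station": "A", "bikes": 0},
]
weather_by_hour = {
    datetime(2016, 7, 5, 8): {"temp": 71.0, "precip": 0.0},
}
joined = attach_weather(bike_rows, weather_by_hour)
```

Only the 8:17 record survives the join here, because the sample weather table has no reading for the 9 o'clock hour.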

[Figure: number of bikes (0 to 10) by time of day (12am, 8am, 5pm), weekday vs. weekend. Two panels: Residential Station and Business Station.]

INPUT: start location and time

▸ Google Maps API: calculate walk time to nearby station

▸ Forecast.io: get weather forecast

MODEL: random forest classifier at each station to predict # bikes

FEATURES: month, day of week, time of day (minute), precipitation, temperature, holidays

FORECAST: 0, 1-2, 2+ bike availability

OUTPUT: rank stations based on walking distance and # of bikes
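The OUTPUT step can be illustrated with a small sketch. The slides do not say how walking distance and predicted availability are combined, so the rule used here (best availability class first, shorter walk as the tie-breaker) is an assumption, as are the `Candidate` type, the `rank_stations` function, and the sample stations.

```python
from dataclasses import dataclass

# Availability classes from the slide, ordered worst to best
CLASS_ORDER = {"0": 0, "1-2": 1, "2+": 2}


@dataclass
class Candidate:
    station: str
    walk_minutes: float   # e.g. from a walking-directions lookup
    predicted_class: str  # one of "0", "1-2", "2+"


def rank_stations(candidates):
    """Sort by best predicted class, then by shortest walk (assumed rule)."""
    return sorted(
        candidates,
        key=lambda c: (-CLASS_ORDER[c.predicted_class], c.walk_minutes),
    )


# Made-up candidates, for illustration only
candidates = [
    Candidate("Station A", 3.0, "0"),
    Candidate("Station B", 6.0, "2+"),
    Candidate("Station C", 4.0, "2+"),
    Candidate("Station D", 2.0, "1-2"),
]
ranked = [c.station for c in rank_stations(candidates)]
```

A weighted score that trades walk time against availability would be an equally plausible design; the lexicographic sort is simply the easiest to read.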

[Figure: pie chart of the data split. Training 54.8%, Testing 22.6%, Validation 22.6%.]


▸ Randomly selected days to train, test, and validate on

▸ Tuned random forest classifier on 55% of the data (17/31 days per month) with different hyperparameters

▸ Selected best model based on its score when predicting the test data (7/31 days)

▸ Final results based on validation data (7/31 days)

▸ Scored the model on recall for the 0-bikes class
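The day-level split described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name, the fixed seed, and the choice to send any days beyond 17 + 7 to validation (so shorter months get fewer validation days) are all assumptions.

```python
import calendar
import random


def split_days(year, month, n_train=17, n_test=7, seed=0):
    """Randomly assign the days of one month to train, test, and validation.

    Days left over after n_train + n_test go to validation, so a
    31-day month yields a 17/7/7 split.
    """
    n_days = calendar.monthrange(year, month)[1]
    days = list(range(1, n_days + 1))
    random.Random(seed).shuffle(days)
    train = sorted(days[:n_train])
    test = sorted(days[n_train:n_train + n_test])
    validation = sorted(days[n_train + n_test:])
    return train, test, validation


train, test, validation = split_days(2016, 7)

# Share of a 31-day month in each part, in percent
fractions = [round(100 * len(part) / 31, 1) for part in (train, test, validation)]
```

For a 31-day month the shares come out to 54.8%, 22.6%, and 22.6%, matching the percentages on the slide. Splitting by whole days rather than by individual minutes keeps adjacent, highly correlated observations from landing on both sides of the split.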

Confusion matrix (rows: Predicted, columns: Actual, color scale 0.0 to 1.0)

             0 Bikes   1-2 Bikes   2+ Bikes
0 Bikes       0.83       0.05        0.12
1-2 Bikes     0.11       0.30        0.59
2+ Bikes      0.05       0.11        0.84
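Recall for the 0-bikes class, the score the model was judged on, is the share of actual 0-bike cases that the model predicted as 0 bikes. The sketch below computes per-class recall from a matrix of raw counts. The counts are hypothetical and are not the results from the slides, and the sketch assumes rows hold the actual classes, which may differ from the orientation of the figure.

```python
def recall_per_class(confusion, labels):
    """Recall for each class from a count matrix whose rows are actual classes.

    recall = correct predictions for the class / all actual cases of the class
    """
    recalls = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        recalls[label] = confusion[i][i] / row_total if row_total else 0.0
    return recalls


labels = ["0", "1-2", "2+"]

# Hypothetical counts, for illustration only (rows: actual, columns: predicted)
confusion = [
    [80, 10, 10],
    [20, 30, 50],
    [5, 15, 80],
]
recalls = recall_per_class(confusion, labels)
```

Prioritizing recall on the 0-bikes class fits the use case: missing an empty station sends a rider on a wasted walk, while a false alarm only costs a slightly longer route.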

Available bikes on Tuesday at 6:20pm

What if you left earlier?

Diane Ivy

[Figure: feature importance (scale 0.0 to 0.4) for month, temp, time (min), day, precip, holidays.]

[Figure: confusion matrix heatmap of predicted vs. actual number of bikes, 0 through 15, color scale 0.0 to 1.0.]
