week 4 presentation 3 insight

Post on 18-Aug-2015

Category: Data & Analytics

Can you run faster?

Alexis Yelton

Runners want to run faster

What goal should you set for a half marathon time?

How can I improve on that time?

Data from Strava.com

Pace

Time series, demographic, and aggregated running data on 10,000 runners. 1,000 with half-marathon times and other features.

Features:

Distance past month

Time past month

Pace past month

Distance past 6 months

Gender

Weight range

Age range

Number of rest days/wk

Number of long days/wk

Sdev pace
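As a rough illustration of how features like these could be aggregated from a time series of runs, here is a minimal sketch. The log layout (date, distance in km, duration in minutes), the 28-day window, the `LONG_RUN_KM` cutoff, and the function name `monthly_features` are all assumptions made for this example; they are not Strava's schema or the method used in the talk.

```python
from datetime import date, timedelta
from statistics import stdev

LONG_RUN_KM = 15.0  # hypothetical cutoff for what counts as a "long day"


def monthly_features(runs, end, days=28):
    """Aggregate one runner's log over the `days` ending at `end` (inclusive).

    `runs` is a list of (date, distance_km, duration_min) tuples.
    """
    start = end - timedelta(days=days - 1)
    window = [r for r in runs if start <= r[0] <= end]
    if len(window) < 2:
        return None  # a pace spread needs at least two runs
    weeks = days / 7
    total_km = sum(km for _, km, _ in window)
    total_min = sum(minutes for _, _, minutes in window)
    run_days = {day for day, _, _ in window}
    long_days = {day for day, km, _ in window if km >= LONG_RUN_KM}
    per_run_pace = [minutes / km for _, km, minutes in window]
    return {
        "distance_km": total_km,
        "time_min": total_min,
        "pace_min_per_km": total_min / total_km,  # distance-weighted pace
        "sdev_pace": stdev(per_run_pace),          # spread of per-run pace
        "rest_days_per_wk": (days - len(run_days)) / weeks,
        "long_days_per_wk": len(long_days) / weeks,
    }


# Made-up log for a single runner, for illustration only.
runs = [
    (date(2015, 7, 1), 10.0, 55.0),
    (date(2015, 7, 4), 5.0, 26.0),
    (date(2015, 7, 8), 16.0, 92.0),
    (date(2015, 7, 15), 8.0, 44.0),
]
features = monthly_features(runs, end=date(2015, 7, 21))
print(features)
```

The 6-month distance feature would follow the same pattern with a longer window.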

Analysis

Benchmarking with a linear model, then regularized (lasso, ridge) and nonlinear (random forest) regression modeling, scored with 3-fold cross-validation:

Model                        r2      RMSE
Linear model (benchmark)     0.49    10 min
1. Lasso regression          0.48    10 min
2. Ridge regression          0.48    10 min
3. Random forest regression  0.66    8.3 min

Validation: 179 runners      0.79    6.2 min
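The scoring procedure behind numbers like these can be sketched in a few lines. The example below is an illustration under stated assumptions, not the analysis from the talk: it uses a one-variable least-squares line in place of the models above, assigns folds by index, pools the out-of-fold predictions before scoring, and runs on made-up data. The helper names `fit_line`, `r2_rmse`, and `cross_validate` are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    n = len(y_true)
    my = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, sqrt(ss_res / n)


def cross_validate(xs, ys, k=3):
    """k-fold CV: each point is predicted by a model that never saw it."""
    preds = [None] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return r2_rmse(ys, preds)


# Made-up data: pace past month (min/km) vs. half-marathon time (min).
pace = [4.6, 5.0, 5.3, 5.5, 5.9, 6.2, 6.4, 6.8, 7.1]
half_time = [98.0, 104.0, 115.0, 113.0, 126.0, 129.0, 140.0, 141.0, 155.0]
r2, rmse = cross_validate(pace, half_time, k=3)
print(f"3-fold CV: r2={r2:.2f}, RMSE={rmse:.1f} min")
```

A held-out validation set, like the 179 runners above, would be scored by fitting on all training runners and calling `r2_rmse` once on the held-out predictions.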

The higher validation score (r2 of 0.79 vs. 0.66 in cross-validation) seems to be related to a different distribution in the test set, possibly because of the importance of outliers.

Your average pace over the past month is the most important feature by far.

Results

Variable importance (increase in node purity):

Pace past month

Distance past month

Distance past 6 months

Elevation past month

Rest days

SD pace

Weight

Long days

Age

Gender
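For readers unfamiliar with the metric: in a regression tree, the increase in node purity from a split is the drop in the sum of squared errors once the node is divided in two. A random forest adds these drops up over every split made on a variable and averages across trees. The sketch below shows only the single-split quantity, on made-up numbers; the function names `sse` and `best_split_gain` are hypothetical and this is not the code used in the talk.

```python
def sse(ys):
    """Sum of squared deviations from the mean (node impurity)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split_gain(xs, ys):
    """Largest drop in SSE achievable with one threshold split on xs."""
    parent = sse(ys)
    pairs = sorted(zip(xs, ys))
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        best = max(best, parent - sse(left) - sse(right))
    return best


# Made-up example: half-marathon times (min) for six runners.
half_time = [100.0, 105.0, 110.0, 130.0, 135.0, 140.0]
pace = [4.7, 5.0, 5.2, 6.1, 6.4, 6.6]  # tracks the target closely
rest_days = [2, 4, 3, 3, 2, 4]         # unrelated to the target here

print("pace gain:     ", best_split_gain(pace, half_time))
print("rest-days gain:", best_split_gain(rest_days, half_time))
```

In this toy data a split on pace separates fast from slow runners cleanly, so it removes far more squared error than any split on rest days.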

About me: Alexis Yelton, MIT postdoc

Chitinase in marine cyanobacteria

[Figure: chitinase activity]

My first half marathon: 1:56:30

Personal best: 1:47:56
