tracking data from gps-enabled devices in r with package ...ucakhfr/talks/ati2015.pdf ·

Tracking data from GPS-enabled devices in R with package ’trackeR’ Hannah Frick, Ioannis Kosmidis http://www.ucl.ac.uk/~ucakhfr/


Tracking data from GPS-enabled devices in R with package ’trackeR’

Hannah Frick, Ioannis Kosmidis

http://www.ucl.ac.uk/~ucakhfr/


Outline

Introduction

Package structure and capabilities

Training distribution profiles

Real data example

Summary & Outlook


Introduction

Aim: Access and analyse tracking data from GPS-enabled devices.

Options include:

Plenty of commercial software from manufacturers of devices (Catapult, Garmin, TomTom, Polar, ...) or apps (endomondo, Runtastic, Nike+, ...) for different audiences.

Open-source software such as Golden Cheetah which provides a lot of options for analysis.

R system for statistical computing: open-source with a vast collection of add-on packages for a large variety of tasks.

However, relatively few packages with a focus on sports and/or GPS tracking.

Package cycleRtools provides specialised cycling analysis in R.

We aim to provide the infrastructure to handle training data from running and cycling to leverage the capabilities of R.


Package trackeR

Computational infrastructure: R with add-on packages, mainly zoo for ordered observations as well as ggplot2 and ggmap for visualisations.

Available from GitHub via

R> # install.packages("devtools")
R> devtools::install_github("hfrick/trackeR")

Provides functionality for 4 main areas:

read data

process data

summarise & analyse training sessions

visualise training sessions


Read data

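Input: tcx file -> readTCX -> data frame -> trackeRdata -> trackeRdata object

Input: db3 file -> readDB3 -> data frame -> trackeRdata -> trackeRdata object

readContainer: reads and processes a file in one step.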


Process data

All observations:

Basic cleaning, e.g., no negative speeds or distances.

Split observations into training sessions: Any observations further apart than a specified threshold are considered to belong to different training sessions. Default is 2 hours.

Per session:

Distances as recorded by the GPS device can be corrected for elevation if necessary.

Imputation of speeds: Speed 0 is imputed for time periods when the device is paused.
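The session-splitting rule can be illustrated with a minimal base R sketch. This is not trackeR's implementation: the function name `split_sessions`, its argument `threshold_hours` and the toy time stamps are assumptions made for illustration only.

```r
# Sketch: assign a session id to time-ordered observations (base R only).
# `split_sessions` and `threshold_hours` are hypothetical names, not trackeR's API.
split_sessions <- function(time, threshold_hours = 2) {
  stopifnot(inherits(time, "POSIXct"), all(diff(as.numeric(time)) >= 0))
  # Time difference between consecutive observations, in hours
  gaps <- as.numeric(diff(time), units = "hours")
  # A new session starts whenever the gap exceeds the threshold
  cumsum(c(TRUE, gaps > threshold_hours))
}

# Toy data: three records, a break of almost three hours, then two more records
time <- as.POSIXct("2013-06-01 17:00:00", tz = "GMT") +
  c(0, 60, 120, 3 * 3600, 3 * 3600 + 60)
session_id <- split_sessions(time)
print(session_id)
```

With the toy data, the first three observations fall into one session and the last two into a second one.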


Imputation of speeds

Either take speed measurements as provided or calculate from cumulative distances:

$$\text{Speed}_j = \frac{\text{Distance}_{j+1} - \text{Distance}_j}{\text{Time}_{j+1} - \text{Time}_j}. \qquad (1)$$

Remove observations with negative or missing speeds.

Check time stamps of observations for gaps of more than lgap seconds.

In these gaps, impute m observations with 0 speed, latest position for latitude, longitude and altitude, and missing values for anything else.

Add m similar observations at the start and end of the session, assuming the athlete does not immediately start or stop moving.

Update distances based on imputed speeds via (1).
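A rough base R sketch of two of these steps, speeds from cumulative distances via (1) and zero-speed imputation in gaps, may help to fix ideas. It is not trackeR's code. The function names are hypothetical, the placement of the imputed time stamps (equally spaced, lskip seconds inside the gap) is an assumption, and the padding at session start and end as well as the distance update are left out.

```r
# Sketch of the imputation steps in base R; not trackeR's implementation.
# Times are in seconds, cumulative distances in metres (toy data).
speed_from_distance <- function(time, distance) {
  # Equation (1): forward difference; the last observation has no successor
  c(diff(distance) / diff(time), NA_real_)
}

impute_zero_speeds <- function(time, speed, lgap = 30, lskip = 5, m = 11) {
  out_time <- time
  out_speed <- speed
  # Indices j for which the gap to observation j + 1 exceeds lgap seconds
  gap_starts <- which(diff(time) > lgap)
  for (j in gap_starts) {
    # ASSUMPTION: m equally spaced time stamps, lskip seconds inside the gap
    new_time <- seq(time[j] + lskip, time[j + 1] - lskip, length.out = m)
    out_time <- c(out_time, new_time)
    out_speed <- c(out_speed, rep(0, m))
  }
  ord <- order(out_time)
  data.frame(time = out_time[ord], speed = out_speed[ord])
}

time <- c(0, 1, 2, 62, 63)
distance <- c(0, 3, 6, 6, 9)
speed <- speed_from_distance(time, distance)
imputed <- impute_zero_speeds(time, speed)
print(imputed)
```

The default values lgap = 30, lskip = 5 and m = 11 are taken from the `trackeRdata()` call shown in the slides; everything else is illustrative.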

Imputation of speeds

[Figure: timeline of speed imputation for a gap with T_{j+1} − T_j > lgap. Labels: observed speeds S_j and S_{j+1} at times T_j and T_{j+1}; offsets lskip next to T_j and T_{j+1}; imputed speeds 0 at times T_{j*}, T_{j*} + h, T_{j*} + 2h, ..., T_{j*+1}.]


Example for reading and processing data

Load package and read tcx file:

R> library("trackeR")
R> filepath <- system.file("extdata", "2013-06-04-174137.TCX",
+    package = "trackeR")
R> df <- readTCX(file = filepath, timezone = "GMT")

Process data into trackeRdata object:

R> run <- trackeRdata(df,
+    ## basic information
+    units = NULL, cycling = FALSE,
+    ## separate sessions
+    sessionThreshold = 2,
+    ## elevation correction
+    correctDistances = FALSE, country = NULL, mask = TRUE,
+    ## impute speeds
+    fromDistances = TRUE, lgap = 30, lskip = 5, m = 11)


Example for reading and processing data

Read and process data from a single file in one step:

R> run <- readContainer(file = filepath, type = "tcx",
+    units = NULL, cycling = FALSE, sessionThreshold = 2)

Read and process data from all files in a directory in one step:

R> runs <- readDirectory(directory = "~/path/to/directory/",
+    aggregate = TRUE,
+    speedunit = list(tcx = "m_per_s", db3 = "km_per_h"),
+    distanceunit = list(tcx = "m", db3 = "km"),
+    verbose = TRUE)


Analysis & Visualisation

Visualise sessions:
Plot profiles for speed, pace, elevation level, heart rate, etc.
Plot route taken during a session on various maps.

Summarise sessions:
Common summary statistics such as total distance, duration, time moving, averages of speed, pace, power, cadence and heart rate, etc.
Time spent training in specific heart rate or speed zones.

Visualise summaries of sessions.
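As an illustration of the "time spent in zones" summary, a minimal base R sketch follows. The function `time_in_zones`, the zone limits and the toy data are assumptions for illustration and do not reflect trackeR's interface; each interval between consecutive records is attributed to the value observed at its end.

```r
# Sketch: time spent in heart rate zones (base R; hypothetical names, toy data).
time_in_zones <- function(time, value, breaks) {
  # Length of each interval between consecutive records
  dt <- diff(time)
  # Zone of the value observed at the end of each interval; zones are [lower, upper)
  zone <- cut(value[-1], breaks = breaks, right = FALSE)
  # Sum interval lengths per zone; zones without observations get 0
  vapply(split(dt, zone), sum, numeric(1))
}

time <- c(0, 10, 20, 30, 40)              # seconds
heart_rate <- c(95, 110, 135, 150, 125)   # bpm
zones <- time_in_zones(time, heart_rate, breaks = c(60, 100, 130, 160))
print(zones)
```

For the toy data, 20 seconds fall into each of the two upper zones and none into the lowest one.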

Visualise sessions

[Figure: heart.rate [bpm] and pace [min_per_km] against time for sessions 5 (2013-06-05) and 6 (2013-06-06).]

Visualise sessions

[Figure: altitude [m] and speed [m_per_s] against time for sessions 5 (2013-06-05) and 6 (2013-06-06).]

Visualise sessions

[Figure (two slides): speed [m_per_s] against time, 05:30 to 06:00.]

Visualise sessions

[Figure: route plotted by longitude (2.14 to 2.18) and latitude (41.35 to 41.39), coloured by speed.]


Summary of session data

R> summary(runs, session = 1)

*** Session 1 ***

Session times: 2013-06-01 17:32:15 - 2013-06-01 18:37:56
Distance: 14130.7 m
Duration: 1.09 hours
Moving time: 1.07 hours
Average speed: 3.59 m_per_s
Average speed moving: 3.66 m_per_s
Average pace (per 1 km): 4:38 min:sec
Average pace moving (per 1 km): 4:33 min:sec
Average cadence: 88.66 steps_per_min
Average cadence moving: 88.81 steps_per_min
Average power: NA W
Average power moving: NA W
Average heart rate: 141.11 bpm
Average heart rate moving: 141.12 bpm
Average heart rate resting: 135.3 bpm
Work to rest ratio: 48.26
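As a plausibility check, some of the printed figures can be reproduced by hand from the session times and the total distance. The base R sketch below is not how trackeR computes its summary; the helper `pace_per_km` is hypothetical, and truncating (rather than rounding) the seconds of the pace is an assumption chosen because it matches the printed values.

```r
# Sketch: reproduce duration, average speed and pace from the printed summary.
start <- as.POSIXct("2013-06-01 17:32:15", tz = "GMT")
end   <- as.POSIXct("2013-06-01 18:37:56", tz = "GMT")
distance_m <- 14130.7

duration_s <- as.numeric(difftime(end, start, units = "secs"))
duration_h <- duration_s / 3600
avg_speed <- distance_m / duration_s   # metres per second

# Pace in min:sec per km. ASSUMPTION: seconds are truncated, not rounded.
pace_per_km <- function(speed_m_per_s) {
  secs <- 1000 / speed_m_per_s
  sprintf("%d:%02d", as.integer(secs %/% 60), as.integer(floor(secs %% 60)))
}

print(round(duration_h, 2))     # 1.09 hours
print(round(avg_speed, 2))      # 3.59 m_per_s
print(pace_per_km(avg_speed))   # "4:38"
```

The session lasts 3941 seconds, so 14130.7 m / 3941 s gives roughly 3.59 m_per_s, and 1000 m at that speed takes about 278.9 seconds, i.e., 4:38 per km.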


Aggregation of high-frequency data

Records are made at high frequency, e.g., 1 or 5 Hz, generating a fairly large amount of data.

Some summary is needed to describe training sessions in a comparable way.

Scalar summaries: time spent above x% of maximum aerobic speed, set of quantiles, etc.

However, data are potentially noisy and the appropriate degree of smoothing is often not straightforward.

Training distribution profiles (Kosmidis & Passfield, 2015) extend the idea of “time spent above”.


Training distribution profile

The distribution profile is defined as the curve $\{v, \Pi_u(v) \mid v \ge 0\}$ for each session $u$ lasting $t_u$ seconds, with

$$\Pi_u(v) = \int_0^{t_u} I(v_u(t) > v)\, dt$$

being a function of the variable $v$ under consideration (e.g., speed or heart rate) and $I(\cdot)$ denoting the indicator function. This describes the time spent training above a certain threshold.

An observed version of $\Pi_u(v)$ can be calculated as

$$P_u(v) = \sum_{j=2}^{n_u} (T_{u,j} - T_{u,j-1})\, I(V_{u,j} > v)$$

which can be conveniently smoothed respecting the positivity and monotonicity of $\Pi_u(v)$.
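The observed profile can be evaluated on a grid of thresholds with a few lines of base R. This is a minimal sketch of the sum above, not the package's `distributionProfile()`; the function name `distribution_profile` and the toy data are assumptions, and no smoothing is applied.

```r
# Sketch: observed distribution profile P_u(v) for one session (base R).
# `distribution_profile` is a hypothetical helper, not part of trackeR.
distribution_profile <- function(time, value, grid) {
  dt <- diff(time)    # T_{u,j} - T_{u,j-1} for j = 2, ..., n_u
  v_j <- value[-1]    # V_{u,j} for j = 2, ..., n_u
  # For each threshold v, add up the interval lengths with V_{u,j} > v
  vapply(grid, function(v) sum(dt[v_j > v]), numeric(1))
}

time <- c(0, 1, 2, 3, 4, 5)    # seconds (toy data)
speed <- c(0, 2, 3, 4, 3, 1)   # m_per_s
grid <- 0:4                    # thresholds v
profile <- distribution_profile(time, speed, grid)
print(profile)
```

For the toy data the profile is 5, 4, 3, 1, 0 seconds for thresholds 0 to 4, which illustrates that the observed profile is non-negative and non-increasing in v.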

Speed profile

[Figure: speed [m_per_s] against time, 09:00 to 10:00.]

Distribution profile (unsmoothed)

[Figure: time spent above threshold against speed.]

Distribution profile (smoothed)

[Figure: time spent above threshold against speed.]

Concentration profile

[Figure: dtime against speed.]


Real data example

27 training sessions by one runner in June 2013.

Data available from http://www.ucl.ac.uk/~ucakhfr/data/runs_ATI.rda.

Brief summary of the data.

Calculation of distribution and concentration profiles.

Exploratory analysis of speed concentration profiles via functional principal component analysis.


Real data example

Load and summarise data:

R> library("trackeR")
R> load("runs.rda")
R> ## summarise sessions
R> runsSummary <- summary(runs)
R> plot(runsSummary, group = c("total", "moving"),
+    what = c("avgSpeed", "distance", "duration", "avgHeartRate"))

Calculate distribution and concentration profiles:

R> dpRuns <- distributionProfile(runs)
R> dpRunsS <- smoother(dpRuns)
R> cpRuns <- concentrationProfile(dpRunsS)
R> plot(cpRuns, multiple = TRUE, smooth = FALSE)

Real data example

[Figure: session summaries by date (Jun 03 to Jul 01) in four panels: avgHeartRate [bpm], avgSpeed [m_per_s], distance [m], duration [min]; legend "type": total, moving.]

Real data example

[Figure: concentration profiles (dtime) for heart.rate [bpm] and speed [m_per_s], one curve per session; legend: Session3 to Session25.]


Functional principal components analysis

Goal: explore the structure of variability in the data and describe characteristic features of the profiles.

Tool: functional principal components analysis (PCA).

Idea: Find a weight function such that it captures most of the variability. This is called the first principal component (PC) or first harmonic.

The second PC is chosen to capture most of the remaining variability, and so on.

Do the components capture interpretable concepts?
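The idea can be approximated with ordinary PCA on curves evaluated on a common grid. The base R sketch below uses `prcomp()` on simulated toy curves as a simple stand-in for functional PCA, which typically works on a basis expansion of the curves instead; it is not the analysis from the slides, and the curves, their shape and all variable names are made up for illustration.

```r
# Sketch: PCA on discretised curves as a stand-in for functional PCA (base R).
# The curves are simulated toy data, not the runner's concentration profiles.
set.seed(1)
grid <- seq(0, 12, length.out = 50)   # common grid, e.g., speed thresholds
n_sessions <- 20

# Each toy curve is a bump whose height and location vary between sessions
curves <- t(sapply(seq_len(n_sessions), function(i) {
  height <- rnorm(1, mean = 100, sd = 20)
  location <- rnorm(1, mean = 4, sd = 0.5)
  height * dnorm(grid, mean = location, sd = 1)
}))

pca <- prcomp(curves, center = TRUE, scale. = FALSE)
share <- pca$sdev^2 / sum(pca$sdev^2)   # share of variance per component
harmonic1 <- pca$rotation[, 1]          # weights of the first PC on the grid
scores <- pca$x[, 1:2]                  # scores of each session on PC 1 and 2

print(round(100 * share[1:4], 1))
```

The per-session scores can then be plotted against session summaries (such as moving duration or average moving speed) to check whether a component corresponds to an interpretable concept.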

Real data example

[Figure: share of variance captured [%] by principal components 1 to 4.]

Real data example

[Figure: PCA function 1 (percentage of variability 62.7); Harmonic 1 against argvals (0 to 12).]

Real data example

[Figure: PCA function 2 (percentage of variability 24.8); Harmonic 2 against argvals (0 to 12).]

Real data example

[Figure: scatter plot of speed_pc1 against durationMoving.]

Real data example

[Figure: scatter plot of speed_pc2 against avgSpeedMoving.]


Summary & Outlook

Package trackeR provides basic infrastructure in R to read, process, summarise and analyse running and cycling data from GPS-enabled devices.

Provides an implementation of training distribution profiles (Kosmidis & Passfield, 2015) as an aggregation of session data to functional objects for further analysis.

Further work: Extend infrastructure
Reading capabilities for more formats, e.g., gpx and fit.
More analytic tools for cycling data, e.g., W’, power distribution profile.
Suggestions welcome!

Further work: Functional data analysis for training distribution profiles.


References

Golden Cheetah. http://www.goldencheetah.org/

Mackie J (2015). cycleRtools: Tools for Cycling Data Analysis. R package version 1.0.4. https://github.com/jmackie4/cycleRtools

Kosmidis I, Passfield L (2015). “Linking the Performance of Endurance Runners to Training and Physiological Effects via Multi-Resolution Elastic Net.” ArXiv e-print arXiv:1506.01388.