Exploring Spatial and Temporal Heterogeneity of Environmental Noise in Toronto
Camila Casquilho-Resende, Nick Fishbane, Seong-Hwan Jun, and Yunlong Nie
Department of Statistics, The University of British Columbia
Introduction
Objective
The objective is to analyze the variation of environmental noise across temporal and spatial dimensions. In the
process, we identify site characteristics that contribute to the noise level in Toronto.
Data
Noise was measured at 10 sites throughout Toronto at 30-minute intervals, for a week or more at each site. The
time series were recorded over different periods between mid-June and early September, start on different days of
the week, and differ in amplitude and overall noise level, as the time series from three of the ten sites show. The
map shows the site locations of this case study, collected in three phases: Cycle1 (n = 241), Cycle1.rep (n = 100),
and Cycle2 (n = 213).
[Figure: LEQ time series at Site 6, Site 3, and Site 9; x-axis: time (labels on midnight), y-axis: LEQ (30 to 80).]
[Figure: Map of site locations; x-axis: longitude, y-axis: latitude; legends: LEQ (50 to 90) and Data (cycle1, cycle1.rep, cycle2).]
Methods Overview
- Temporal variation was modeled using a linear mixed-effects (LME) model with harmonic and polynomial components
- Importance of site characteristics assessed with random forest and Dirichlet process mixture models
- Spatial variation of adjusted noise (LEQ) explored using residual kriging by assuming a Gaussian process with
  Matérn covariance function
- Aforementioned methods reapplied on test data
Temporal Model
We treat the sites as independent, since we observe no relationship between the distance separating two sites and
the correlation of their noise levels. For each site we consider only the “time point” of each observation, which
we define as the time of week (of which there are 48 × 7 = 336). The LME is as follows:
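\[ \mathrm{LEQ}_{it} = b_i + d_1(t) + d_2(t)\,\mathbf{1}[t \in W] + w(t) + \varepsilon_{it}, \qquad b_i \sim N(0, \sigma^2) \]
\[ d_j(t) = \beta_{0j} + \beta_{1j}\sin\left(\frac{2\pi t}{24}\right) + \beta_{2j}\cos\left(\frac{2\pi t}{24}\right) + \beta_{3j}\sin\left(\frac{2\pi t}{12}\right) + \beta_{4j}\sin\left(\frac{2\pi t}{8}\right) + \beta_{5j}\cos\left(\frac{2\pi t}{8}\right), \qquad j = 1, 2 \]
\[ w(t) = \alpha_1 t + \alpha_2 t^2 + \alpha_3 t^3 \]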
The components (periods) of dj(t) were chosen based on a spectral analysis of each site's time series individually.
The combination of components and the order of the polynomial were then determined by significance and AIC.
Finally, the baseline reference for the time point (t = 0 at Sunday midnight) was chosen by AIC. The resulting
trend comes from the fixed effects, which take the form d1(t)+w(t) for weekdays and d1(t)+d2(t)+w(t)
for weekends.
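The fixed-effect trend described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, W is assumed to be Saturday and Sunday, and t is assumed to be measured in hours since Sunday midnight so that the periods 24, 12, and 8 are in hours. The coefficients would come from the fitted LME; none are supplied here.

```python
import math
from datetime import datetime


def time_point(dt):
    """Half-hour index within the week: 0 at Sunday 00:00, 335 at Saturday 23:30."""
    days_since_sunday = (dt.weekday() + 1) % 7  # datetime.weekday(): Monday = 0
    return days_since_sunday * 48 + dt.hour * 2 + dt.minute // 30


def is_weekend(tp):
    """Indicator 1[t in W], assuming W is Saturday and Sunday."""
    return tp // 48 in (0, 6)


def harmonic(t_hours, beta):
    """d_j(t) with periods of 24, 12 and 8 hours; beta = [b0, b1, ..., b5]."""
    w24, w12, w8 = (2 * math.pi * t_hours / p for p in (24, 12, 8))
    return (beta[0]
            + beta[1] * math.sin(w24) + beta[2] * math.cos(w24)
            + beta[3] * math.sin(w12)
            + beta[4] * math.sin(w8) + beta[5] * math.cos(w8))


def fixed_effect(tp, beta1, beta2, alpha):
    """d1(t) + w(t) on weekdays, d1(t) + d2(t) + w(t) on weekends."""
    t = tp / 2.0  # assumption: t is in hours since Sunday midnight
    trend = harmonic(t, beta1)
    trend += sum(a * t ** (k + 1) for k, a in enumerate(alpha))  # cubic w(t)
    if is_weekend(tp):
        trend += harmonic(t, beta2)
    return trend
```

Evaluating `fixed_effect` over all 336 time points with fitted coefficients would trace out a weekly curve of the kind described in the next section.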
Describing temporal variations
The isolated harmonic functions d1(t) and d1(t) + d2(t) are shown below as well as the overall weekly trend.
We see that for weekdays, noise is fairly constant throughout work hours, rising and falling at both rush hours,
with a peak at 3:30 PM; on weekends noise levels rise later in the day, and higher at that, peaking at 6:00 PM.
[Figure: Fitted LEQ by time of day (0:00 to 22:00), in two panels: Weekend and Weekday.]
[Figure: Boxplots of residual LEQ by SiteID, in increasing order of median residual magnitude (6, 8, 5, 3, 7, 4, 1, 10, 2, 9); legend: Mean LEQ (50 to 70).]
The week-long trend shown below illustrates that Wednesday is the loudest weekday while Saturday is the loudest
day of the week. The model approximates the observed levels reasonably well, as seen in the residual boxplots.
[Figure: Fitted LEQ over the time of week, from Sun 0:00 to Sat 23:30; legend: Weekday (Sun to Sat).]
Site Characteristics
Site characteristics can obscure the spatial pattern in the LEQ noise process, hence the need to identify the
important ones.
Random Forest Regression
Random forest regression implicitly models interaction effects and can identify important variables.
The variable importance plot shows that traffic count is the most important variable, followed by industrial
land use (IND100M) and population density (POP500M).
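Tree-based importance of the kind plotted here accumulates, over the splits that use a variable, the improvement in the split criterion. The sketch below computes that quantity for a single split, assuming the criterion is the reduction in the sum of squared errors. It is not a full random forest (no bootstrapping, no random variable subsets, no averaging over trees), and the data are made up for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_improvement(x, y):
    """Largest reduction in SSE obtained by splitting y on a threshold of x."""
    parent = sse(y)
    best = 0.0
    for threshold in sorted(set(x))[:-1]:  # candidate cut points
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        best = max(best, parent - sse(left) - sse(right))
    return best


# Toy data, invented for illustration (not values from the study)
traffic = [100, 200, 300, 400, 500, 600]
rec_area = [5, 1, 4, 2, 6, 3]
leq = [50.0, 51.0, 52.0, 60.0, 61.0, 62.0]

improvement = {
    "Total.Traffic": best_split_improvement(traffic, leq),
    "REC100M": best_split_improvement(rec_area, leq),
}
# Variables ordered from most to least important on this toy data
ranking = sorted(improvement, key=improvement.get, reverse=True)
```

In the toy data, LEQ jumps once traffic exceeds 300, so a split on traffic removes most of the variance, while no threshold on the second variable separates the quiet sites from the loud ones.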
[Figure: Variable importance plot; x-axis: improvement in split criterion (0 to 6000); variables: REC100M, OPEN100M, COM100M, DIST_EXP, POP500M, IND100M, Total.Traffic; legend: Data (Cycle1, Cycle2).]
[Figure: Total.Traffic, IND100M, COM100M, and REC100M by cluster (HIGH, MEDIUM, LOW).]
Dirichlet Process Mixture Models
We wish to cluster the sites based on their noise levels and identify the characteristics of each cluster, but the
number of clusters K is unknown. An appropriate method in this setting is the Dirichlet process mixture
model.
We obtained three clusters, which we label HIGH, MEDIUM, and LOW noise. Traffic count and IND100M are high
for the HIGH and MEDIUM noise sites, whereas recreational land use (REC100M) is high for the LOW
noise sites.
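To illustrate why a Dirichlet process mixture does not need K fixed in advance, the sketch below draws a partition from the Chinese restaurant process, which is the prior over clusterings that a Dirichlet process induces. This is only a draw from the prior. A full mixture model would combine these weights with the likelihood of each site's noise data (for example within a Gibbs sampler), which is not shown. The concentration parameter `alpha` and the function name are illustrative assumptions.

```python
import random


def crp_partition(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


rng = random.Random(0)
labels, counts = crp_partition(n=10, alpha=1.0, rng=rng)
n_clusters = len(counts)  # number of clusters is an outcome, not an input
```

Larger values of `alpha` favour more clusters, and the number of clusters is inferred alongside the assignments rather than specified beforehand.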
Describing spatial variations
Question: Describe the spatial variations; are there locations in Toronto experiencing higher noise levels?
It is assumed that the data Z(s) = (Z(s1), . . . , Z(sn)) are a realization of a spatial process {Z(s) : s ∈ R^2}.
For every site location si, the residuals of the following regression model, denoted R(si), retain the spatial
variability of the process that is left after part of it has been removed by the external information used in the
modelling.
\[ Z(s_i) = \mu(s_i) + \varepsilon(s_i), \qquad \varepsilon(s_i) \sim N(0, \psi^2) \]
\[ \mu(s_i) = X(s_i)^\top \beta \]
These residuals are then kriged using ordinary kriging, assuming a Gaussian process with a Matérn covariance
model with κ = 1.5, as described below.
\[ R(s) \sim N(\mu_0, \tau^2 I + \sigma^2 \Sigma(\phi)) \]
The objective of the kriging procedure is to obtain predictions at unsampled locations using information available
elsewhere in the study domain.
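A minimal sketch of ordinary kriging with a Matérn covariance at κ = 1.5 follows. It assumes the parameterization in which the κ = 1.5 correlation has the closed form (1 + d/φ) exp(−d/φ), treats coordinates as planar, and takes σ², φ, and τ² as given rather than estimating them. The function names and the small linear solver are illustrative and are not taken from the original analysis.

```python
import math


def matern15(d, sigma2, phi):
    """Matern covariance with kappa = 1.5: sigma2 * (1 + d/phi) * exp(-d/phi)."""
    u = d / phi
    return sigma2 * (1.0 + u) * math.exp(-u)


def distance(a, b):
    """Euclidean distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(sites, resid, target, sigma2, phi, tau2):
    """Predict the residual at `target`; returns (prediction, weights)."""
    n = len(sites)
    # Covariance among observations: tau2 * I + sigma2 * Sigma(phi)
    k = [[matern15(distance(sites[i], sites[j]), sigma2, phi)
          + (tau2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Extra row and column enforce the constraint sum(weights) = 1
    a = [k[i] + [1.0] for i in range(n)] + [[1.0] * n + [0.0]]
    b = [matern15(distance(s, target), sigma2, phi) for s in sites] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * r for w, r in zip(weights, resid)), weights


# Toy residuals on a unit square, invented for illustration
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
resid = [1.0, -2.0, 0.5, 3.0]
centre_pred, centre_w = ordinary_kriging(sites, resid, (0.5, 0.5), 1.0, 0.5, 0.0)
```

Because the weights are constrained to sum to one, the unknown constant mean µ0 never has to be estimated separately, which is what distinguishes ordinary kriging from simple kriging.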
Describing spatial variations
Question: Are the spatial variations of Cycle1.rep similar to Cycle1?
- If Cycle1 and Cycle1.rep follow the same spatial pattern in terms of “unexplained noise”, then the kriging
  prediction at the Cycle1.rep sites, obtained from residual kriging on Cycle1, should be a fair prediction of the
  noise after adjusting for the predictors.
- We observe good agreement between the two sets of residuals, so we augment Cycle1 with Cycle1.rep to provide
  calibrated predictions across Toronto. These are mapped in the first contour plot to show areas of high residual
  noise, such as downtown and near Pearson airport, as expected.
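One way to quantify agreement between kriged predictions and observed residuals is sketched below, using root mean squared error and Pearson correlation. The choice of these two summaries is an assumption, since the text does not state how agreement was assessed, and the numbers are invented for illustration.

```python
import math


def rmse(pred, obs):
    """Root mean squared difference between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up numbers for illustration only (not results from the study)
kriged_from_cycle1 = [1.0, -2.0, 0.5, 3.0]   # predictions at Cycle1.rep sites
observed_rep_resid = [1.5, -1.5, 0.0, 2.5]   # regression residuals at those sites

error = rmse(kriged_from_cycle1, observed_rep_resid)
corr = pearson(kriged_from_cycle1, observed_rep_resid)
```

A small error together with a high correlation would support pooling the two data sets, as is done above.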
[Figure: Contour maps of residual noise (LEQ) over longitude and latitude, in two panels: Cycle 1 (scale −15 to 10) and Cycle 2 (scale −10 to 20).]
Describing spatial variations
Question: Cycles 1 and 2 similarity
- The same methods described above for Cycle1 (a linear model with chosen covariates, followed by residual kriging)
  were applied to Cycle2 to obtain a contour plot of residual noise in Toronto.
- As in Cycle 1, a large proportion of the LEQ variation between sites in Cycle 2 can be explained by site
  characteristics. Random forest shows nearly identical results for the importance of site characteristics, so the
  cause of spatial noise variation is comparable.
- When employing residual kriging, we obtain the contour map shown above, which is dissimilar to that obtained
  with Cycle 1. This is likely due to the complementarity of the site placements coupled with the low
  long-distance prediction capability of this particular spatial model.