Master’s Thesis
Estimating the impact of a marketing campaign on the adoption of mobile banking
Using marketing mix modelling
Valérie Vermaas
Student number: 10360301
Date of final version: January 15, 2018
Master’s programme: Econometrics
Specialisation: Financial Econometrics
Supervisor: Prof. dr. C. G. H. Diks
Second reader: Prof. dr. A. Rapp
Faculty of Economics and Business


Requirements thesis MSc in Econometrics.
1. The thesis should have the nature of a scientific paper. Consequently the thesis is divided up into a number of sections and contains references. An outline can be something like (this is an example for an empirical thesis; for a theoretical thesis have a look at a relevant paper from the literature):
(a) Front page (requirements see below)
(b) Statement of originality (compulsory, separate page)
(c) Introduction
(i) References (compulsory)
If preferred you can change the number and order of the sections (but the order you use should be logical) and the heading of the sections. You have a free choice how to list your references but be consistent. References in the text should contain the names of the authors and the year of publication, e.g. Heckman and McFadden (2013). In the case of three or more authors: list all names and the year of publication in the first reference, and use the first name, et al. and the year of publication for the other references. Provide page numbers.
2. As a guideline, the thesis usually contains 25-40 pages using a normal page format. All that actually matters is that your supervisor agrees with your thesis.
3. The front page should contain:
(a) The logo of the UvA, a reference to the Amsterdam School of Economics and the Faculty as in the heading of this document. This combination is provided on Blackboard (in MSc Econometrics Theses & Presentations).
(b) The title of the thesis
(c) Your name and student number
(d) Date of submission final version
(e) MSc in Econometrics
Statement of Originality
This document is written by Student Valerie Vermaas who declares to take full responsibility for
the contents of this document. I declare that the text and the work presented in this document
is original and that no sources other than those mentioned in the text and its references have
been used in creating it. The Faculty of Economics and Business is responsible solely for the
supervision of completion of the work, not for the contents.
Contents
1.3 Return on Marketing Investment
2 Data and preliminary analysis
2.1 Data description
2.2 Preliminary Analysis
3.3 Model evaluation
4.2 The differences between the card services
4.3 Generalized Additive Models
5 Conclusion
Chapter 1
Introduction
Every year, billions are spent on marketing campaigns to convince customers to buy products
or use services and simultaneously create brand awareness. Implicitly or explicitly, every firm
conducts a strategy on how this budget is allocated over TV, radio and different on- and offline
strategies. But determining what the perfect strategy is to make this allocation is a timely and
complex problem. The Marketing Science Institute has even acknowledged the complexity and
importance of this topic as they have appointed ‘improving multi-touch attribution, marketing
mix, and ROI models – across all media, digital and non-digital’ as one of their top research
priorities (MSI, 2016).
ABN AMRO is a large Dutch bank that realizes it could potentially have much more impact
if it based its decisions on marketing campaigns more on data. A way to gain more
understanding of this problem, and thus be able to optimize this strategy, is to use
Marketing Mix Modelling (Mhitarean, 2017): a variety of models suitable for explaining
the variation of a dependent variable such as sales or users of a service. In addition, as Danaher
and Rust (1996) describe, the results could potentially be used to estimate and optimize the
Return On Investment (ROI) of the bank's marketing efforts.
While the bank is interested in the effect of media on all of their campaigns, this thesis focuses
on mobile banking. One of the core values of ABN AMRO is innovation and an important part
of this consists of convincing customers to innovate with them; one of the ways to do so is to
adopt mobile banking. According to Shaikh and Karjaluoto (2015), in 2011 96% of the world’s
population had a mobile phone subscription while only 8.6% had a mobile banking account, in
spite of the many advantages of mobile banking. For example, when a customer loses his debit card he
could contact the call-center, go through a long choice-menu, wait until an employee is available,
go through some safety-checks and explain to the employee that he lost his card, wants to block
it and receive a new one. Alternatively, he could open the mobile application, press a few
buttons and within a minute the whole process is handled. Obviously the second option is
desirable for both the bank and the customer since contacting a call-center costs a lot of money
and time. Therefore there is an incentive to use marketing campaigns to move customers to
solve a problem by using the mobile application instead of contacting the call-center.
Marketing Mix Modelling is a complex problem. While extensive research has been done
on how to model the effectiveness of advertising on sales, little research has been conducted on
the effect of marketing on the adoption of mobile banking although this is of interest since it
is beneficial to both the bank and the customer. Moreover, most research has been done on
monthly or weekly data instead of daily data. The objective of this thesis is to use and compare
different econometric models to see which of those describes the amount of usage of cost-saving
functions (e.g. applying for a new debit card) in the online environment best and obtain insights
on the marketing dynamics. In this chapter the most relevant literature on these matters will
be discussed and we will go more into detail with respect to the practical application.
1.1 Mobile banking adoption
Mobile banking can be advantageous for both banks and customers alike. Wessels and Drennan
(2010) give examples of these advantages for customers within the mobile value settings
identified by Anckar and D’Incau (2002). These settings consist of (i) critical needs and
arrangements like a forgotten bill payment, (ii) spontaneous needs and arrangements like an
impulse purchase of an item that requires the transfer of funds, (iii) efficiency needs and
ambitions, which consist of for example increasing productivity during ‘dead times’ like
commuting, and (iv) mobility-related needs, e.g. no access to a computer.
Furthermore Wessels and Drennan (2010) did a study on the key motivators of the customer
acceptance of mobile banking and found that perceived usefulness was the most significant
motivator, thus marketing may be needed to show customers how mobile banking can fit into
their lifestyle and how useful mobile banking can be.
However, from a bank’s point of view, just as important as the better customer experience
that comes with mobile banking is the fact that mobile banking is simply cost-saving. Chapter
2 describes the different self-services within the mobile application which, since they are
processed online instead of through the call-center or an office visit, are cost-saving.
1.2 Marketing Mix Modelling
Various models will be considered in this thesis to capture all the different dynamic effects of the
marketing variables. Hanssens et al. (2002) describe different functional forms for decreasing,
constant and increasing returns to scale. They also mention the possibility of using ‘S-shaped’
response curves, “nicely convex-concave” functions as first proposed by Ginsberg (1974).
Generalized Additive Models (GAMs) could also be used to model the effect of marketing
effort, as shown by Bhattacharya (2012).
Besides the correct functional form of the marketing variables, the time effect also has to
be considered. A TV advertisement seen today has an effect not only today but also, to a
smaller extent, on the days that follow. To capture the evolving character of
marketing exposure, a carry-over effect can be included with the AdStock model of Broadbent
(1984). The AdStock model is based on the distributed lag models formulated by Koyck (1954)
and Jorgenson (1966) and will be explained in Chapter 3.
Next to the modelling of the effects of scale and time of the single marketing channels,
the cross-channel effects between different marketing channels have to be taken into account
as mentioned by Dinner et al. (2014). While advertising used to be, for example, only on TV,
it is found that advertising on multiple channels simultaneously can strengthen the message.
Naik and Raman (2003) describe these effects as synergies. Nowadays there are even more
ways to implement multichannel strategies, with the new possibilities of SEA strategies and
advertising on social media (Leeflang et al., 2015).
Moreover, in the research of Naik et al. (2005) the shares of competing brands are taken into
account. In Chapter 2 we will describe how the effect of competing marketing efforts (those of
competing banks) will be taken into account in our modelling. Also in this chapter all other
external factors that may influence the amount of self-services will be discussed.
1.3 Return on Marketing Investment
For a bank it is very useful to understand which factors have influence on the adoption of mobile
banking, what the carry-over effect is of the various marketing efforts and what the synergies
are between the various marketing channels. As discussed, the results could additionally be
used to estimate and optimize the ROI. The ROI of marketing, which Farris et al. (2002)
describe as the Marketing Return on Investment (MROI, also referred to as ROMI), is defined as
\[
\mathrm{MROI} = \frac{\text{Incremental financial value generated by marketing} - \text{Cost of marketing}}{\text{Cost of marketing}}. \tag{1.1}
\]
This allows companies to calculate how beneficial their marketing effort is. The MROI might
turn out negative, but companies should take into account that short-term marketing also has
a positive long-term effect. As Driver and Foxall (1986) describe, short-term advertising also
has a long-term effect since, among other things, the people exposed to an advertisement are
not necessarily existing customers, so the impact of advertising can be delayed.
Furthermore, Clark et al. (2009) describe short-term advertising as one of the most important
factors in creating brand awareness, and Macdonald and Sharp (2000) find in turn that brand
awareness leads to better brand performance.
More concretely, it is found that when the long-term effect is also considered, the marketing
effect is doubled (Lodish and Lubetkin, 1992) or even tripled or quadrupled (Dyson,
2008). Moreover, Naik et al. (2000) conducted a meta-analysis of 113 case studies and found
that advertising sometimes has a long-term effect that lasts longer than a year, which will not
be captured by AdStock.
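As a small illustration of Eq. (1.1), the sketch below computes the MROI for a made-up campaign, once with only the short-term value and once with a doubled value to mimic the long-term effect reported by Lodish and Lubetkin (1992). The function name `mroi` and all euro amounts are hypothetical and are not taken from the bank's data.

```python
def mroi(incremental_value: float, marketing_cost: float) -> float:
    """Marketing return on investment as defined in Eq. (1.1)."""
    if marketing_cost <= 0:
        raise ValueError("marketing_cost must be positive")
    return (incremental_value - marketing_cost) / marketing_cost


# Hypothetical campaign: 100,000 euro spent, 80,000 euro of short-term value.
short_term = mroi(80_000, 100_000)

# Assumption for illustration: the long-term effect doubles the value generated,
# which turns the MROI of the same campaign from negative to positive.
with_long_term = mroi(2 * 80_000, 100_000)

print(f"short-term MROI: {short_term:.2f}, including long-term: {with_long_term:.2f}")
```

The example only shows the sign change; how large the long-term multiplier is for mobile banking campaigns is not something this sketch can say.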
Chapter 2
Data and preliminary analysis
The methods used in this thesis are applied to observational data from ABN AMRO, enriched
with contextual data. The data provided by the bank are obtained from different sources and
can be divided into three categories. First, we have click data from the mobile application
and internet banking. Second, we have access to data on the Above The Line (ATL) media
campaigns from the bank's Marketing and Communication Dashboard. Third, there are data on
the Below The Line (BTL) efforts. The data from these three sources are supplemented with
contextual data.
For this research, data from September 2016 until November 2017 are taken into account.
Unlike for example mortgages, which have been sold by the bank for decades, card services
have only been available online for this short period. For this reason we cannot use data
from a longer period, and we therefore decided to model at a daily level instead of a weekly
level, even though most studies use weekly data. Moreover, all data are available at
a daily level. This provides us with 414 days and an equal number of data points.
In this section both the bank data and the contextual data are described, together with
their relevant transformations, and a preliminary analysis is carried out. At the end
of the data description an overview of all variables used is presented together with the
expected signs of their coefficients in our model.
2.1 Data description
Click data
We have access to click data from the mobile application and internet banking. From this dataset
we can extract when a customer has reached the ‘Thank you’ page and thus has finished the
process of requesting a service in the mobile application or internet banking. In some cases
customers are sent from the mobile application to a section of the website and it is difficult to
extract whether the service is requested within internet banking or the mobile application.
Since the aim of the bank is to reduce store visits and calls to the call center and both internet
banking and the mobile application achieve this goal, we treat them as equal.
Card service       | Total online requests | % online of total requests
-------------------|-----------------------|---------------------------
New card           | 458 K                 | 57.6%
Replace card       | 770 K                 | 28.4%
Block card         | 448 K                 | 36.3%
Deblock card       | 154 K                 | 40.5%
Change daylimit    | 1.32 M                | 89.4%
Change geolocation | 665 K                 | 92.3%
Table 2.1: Order size of the different card services
The main focus lies on six card services: requesting a new card, replacing a card, blocking a
card, deblocking a card, changing the daily spending limit and changing the geographic
location, i.e. enabling the card to be used outside Europe. In most instances a request for
one of these services is processed within one store visit or phone call, so the services can
be considered equally cost-saving when processed online instead; they are therefore aggregated
for the first part of this thesis. Moreover, we aggregate the services requested online over
all customers per day. Table 2.1 gives the order size of the different card services requested
online, together with the average daily share of online requests in the total requests for
that service over the given period.
ATL campaigns
ATL campaigns operate Above The Line: mass media is used to promote brands and reach out
to the target customers. For these campaigns, it is not possible to trace their effect back to
an individual customer. The bank uses conventional forms of media such as commercials on TV
and Radio as well as print, and modern forms of media such as exposure on social media, Search
Engine Advertising (SEA) and online banner advertising on websites. We purposely do not use all
the different kinds of media expenditures in our model since some are performed on such a low
scale that they won’t contribute significantly to the model.
We have knowledge of, and thus are able to use the marketing budget spent on all ATL
channels for our model. However, marketing agents base their decisions on the number of
impressions made by their marketing efforts. Moreover, it is difficult to decide which costs
should be taken into account. For example, do we divide creation costs of a campaign equally
over the days and the channels, and what if a campaign is reused? Also, when the marketing
euros spent are used in the model and the marketing costs drop due to lower prices in the TV
or radio market, the model will interpret this as a lower level of advertising, while in fact this remains
constant. Moreover, the quality of the commercial or advertisement can have an effect on the
effectiveness of the campaign, but this is left out of scope since this is difficult to measure.
Category | Variable | Unit        | Total exposure | Number of days
---------|----------|-------------|----------------|---------------
ATL      | TV       | GRP         | 1.12 K         | 35
ATL      | Radio    | GRP         | 1.85 K         | 35
ATL      | Online   | Impressions | 655 B          | 414
ATL      | Social   | Impressions | 3.72 B         | 414
ATL      | SEA      | Impressions | 281 M          | 405
Table 2.2: Order size of the ATL variables

Category | Variable | Unit         | Total exposure | Number of days
---------|----------|--------------|----------------|---------------
BTL      | BM       | Amount sent  | 3.24 M         | 62
BTL      | EM       | Amount sent  | 194 K          | 28
BTL      | DM       | Amount sent  | 160            | 13
BTL      | POLS     | Amount shown | 10.1 M         | 411
Table 2.3: Order size of the BTL variables

For TV and Radio exposure, cumulative Gross Rating Points (GRPs) are taken into account.
According to Farrelly et al. (2005), GRPs measure the total volume of delivery of a media
campaign to a target audience: the percentage of the target audience that is reached by the
campaign times the frequency of exposure. In 2017, the Dutch population aged 18 or older
consisted of 13,339,900 persons, which makes one GRP equal to 133,399 contacts who
have watched one commercial. Furthermore, as in Suarez and Estevez (2016), the GRPs are
homogenised to a single spot length of 20 seconds, where three
seen ads of 20 seconds are equivalent to two seen ads of 30 seconds.
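The GRP arithmetic described above can be sketched in a few lines. The population figure is the one quoted in the text; the helper names and the linear rescaling by spot length are assumptions, chosen only because they reproduce the stated equivalence of three 20-second ads to two 30-second ads.

```python
ADULT_POPULATION = 13_339_900  # Dutch population aged 18 or older, as quoted in the text


def contacts_per_grp(population: int = ADULT_POPULATION) -> float:
    """One GRP corresponds to 1% of the target audience seeing one commercial."""
    return population / 100


def grps(reach_pct: float, frequency: float) -> float:
    """GRPs = percentage of the target audience reached times the frequency of exposure."""
    return reach_pct * frequency


def to_20s_equivalent(grp: float, spot_seconds: float) -> float:
    """Rescale GRPs to a 20-second format, assuming the effect scales linearly with length."""
    return grp * spot_seconds / 20


print(contacts_per_grp())        # contacts represented by a single GRP
print(grps(40, 3))               # hypothetical campaign: 40% reach, seen 3 times on average
print(to_20s_equivalent(2, 30))  # two 30-second ads expressed in 20-second ads
```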
For the channels Social, SEA and Online the exposure is measured in impressions. Table
2.2 gives an overview of the order size of each ATL channel and the number of days of
exposure. More GRPs are spent on Radio than on TV because GRPs
on TV are more expensive. Moreover, there are far more impressions Online than
on Social and SEA.
BTL campaigns
In contrast to ATL, BTL campaigns can be aimed at individual customers. For this purpose
ABN AMRO uses BankMail (messages that appear in a mailbox in the mobile application), email
and direct mail (mail sent to the home address). From now on, these channels will be
referred to as BM, EM and DM respectively. Another channel used by the bank is POLS:
banners that customers see when they are logged in to internet banking, referring to a
subject that is relevant to them.
A BM is available for reading for 60 days in the mobile app. Since not all users read their
BM on a daily basis, the number of BMs available in the mobile application is used in our
model. EMs are sent in large batches: for example, thousands of EMs are sent on one day
while none are sent in the following weeks. The methodology describes how this property
should be modelled. Table 2.3 shows the order size of the BTL variables. As we can see, there
is a considerable difference in order size; the exposure of DM in particular is very small.
Therefore DM is left out of the model.
Category   | Variable       | Unit         | Source | Expected sign | Abbreviation
-----------|----------------|--------------|--------|---------------|-------------
Click data | Services used  | Amount       | Bank   | Dep. var.     | SERV
ATL        | TV             | GRP          | Bank   | +             | TV
ATL        | Radio          | GRP          | Bank   | +             | RAD
ATL        | Online         | Impressions  | Bank   | +             | ONL
ATL        | Social         | Impressions  | Bank   | +             | SOC
ATL        | SEA            | Impressions  | Bank   | +             | SEA
BTL        | BM             | Amount sent  | Bank   | +             | BM
BTL        | EM             | Amount sent  | Bank   | +             | EM
BTL        | POLS           | Amount shown | Bank   | +             | POLS
Contextual | School holiday | Dummy        |        | +/-           | SHOL
Contextual | Holiday        | Dummy        |        | +/-           | HOL
Table 2.4: Overview of the variables used in the model
Contextual data
As one would expect, advertising expenditures are not the only driver of the number of
service requests processed online. Seasonality, trust in the bank and competitors may also
have an influence and have to be taken into account. To do so, we complement the data
provided by the bank with contextual data.
Firstly, we expect that the number of services requested is not at the same level on a
business day as on a weekend day. To control for this effect we include dummy variables for
six weekdays; the seventh is left out as the reference category. For the same reason we use
dummies for school holidays and holidays, as we expect that, for example, on Christmas Day
few people will be occupied with their debit card. Next to seasonality effects, we have to
take competitors into account. We expect that competitors' advertising for mobile banking
will have a positive effect on the use of mobile banking at the bank considered in this
thesis: we do not expect customers to change banks because of this advertising, but we do
expect them to consider the possibilities of mobile banking in general. We allow for these
effects using the estimated media budget spent on mobile banking by the
competitors ING and Rabobank, as provided by The Nielsen Company.
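To make the dummy coding concrete, the following minimal sketch builds six weekday indicators and a holiday indicator per calendar day. Leaving out Sunday as the reference day, the column names and the example dates are assumptions made for illustration; the actual holiday calendar used in the thesis is not reproduced here.

```python
from datetime import date, timedelta

# date.weekday() returns Monday=0 ... Sunday=6; Sunday is the omitted reference day here.
DAY_NAMES = ["MON", "TUE", "WED", "THU", "FRI", "SAT"]


def weekday_dummies(day: date) -> dict:
    """Six 0/1 indicators; all zeros means the omitted day (Sunday)."""
    return {name: int(day.weekday() == i) for i, name in enumerate(DAY_NAMES)}


def design_rows(start: date, n_days: int, holidays: set) -> list:
    """One row of dummy regressors per calendar day, starting at `start`."""
    rows = []
    for offset in range(n_days):
        day = start + timedelta(days=offset)
        row = {"date": day, **weekday_dummies(day)}
        row["HOL"] = int(day in holidays)  # holiday dummy
        rows.append(row)
    return rows


# Hypothetical example: three days around Christmas 2016.
rows = design_rows(
    date(2016, 12, 24), 3, holidays={date(2016, 12, 25), date(2016, 12, 26)}
)
for row in rows:
    print(row)
```

Omitting one day avoids perfect collinearity with the constant; the coefficients of the six dummies are then read relative to the omitted day.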
2.2 Preliminary Analysis
Descriptive Statistics
Fig. 2.1 shows the graphs of the number of different card services as described in the previous
section. We observe that Deblock card and Block card have very similar time series and that
Change geolocation has a peak around July, which can be explained by the fact that this is
holiday season and many people change their geolocation to use their debit cards outside of
Europe.
The time series of (a) and (b) look peculiar. On some days few or no requests are made,
while on the next day the level is very high in comparison to the other days. Upon further
investigation it was found that, due to a database error, requests for a new card and a card
replacement are not correctly registered on weekends and holidays and are falsely allocated
to the following day. To correct for this error, the assumption is made that services (a) and (b)
follow the same Saturday-Sunday-Monday (or all successive days with a holiday) allocation as
services (c)-(f). Using this assumption, the requests of services (a) and (b) are reallocated on
the days where the database error occurred, which is shown in Fig. 2.2. In Fig. 2.3 the six
different time series of the card services are shown in one graph.
Figure 2.1: Graphs of the different card services. Panels: (a) New card, (b) Replace card, (c) Block card, (d) Deblock card, (e) Change daylimit, (f) Change geolocation.
Figure 2.2: The reallocated services. Panels: (a) New card, (b) Replace card.
In Table 2.5 the descriptive statistics are given for all variables described in the previous
section. We observe that, apart from Online, every media variable has at least one day
without exposure in the given period. The graphs of the time series of all variables used in
this thesis are given in Fig. 2.4. To make the output easier to interpret, the impressions
on Social, Online and SEA are included in the model in millions and BM, EM and POLS are
included in thousands.
Figure 2.4: Graphs of the time series used in this thesis. Panels: (d) Online, (e) Social, (f) SEA, (g) Bank Mail, (h) Email, (i) POLS, (j) ING, (k) Rabobank.

Statistic | N   | Mean     | St. Dev. | Min     | Max
----------|-----|----------|----------|---------|---------
SERV      | 414 | 6.19 K   | 1.59 K   | 2.48 K  | 10.36 K
TV        | 414 | 2.69     | 9.66     | 0       | 56
RAD       | 414 | 44.75    | 154.36   | 0       | 799
ONL       | 414 | 158.24 M | 91.63 M  | 30.72 K | 527.35 K
SOC       | 414 | 8.98 M   | 18.92 M  | 0       | 107.12 M
SEA       | 414 | 67.76 M  | 116.17 M | 0       | 546.92 M
BM        | 414 | 428.38 K | 452.39 K | 0       | 1.33 M
EM        | 414 | 464.31   | 5.18 K   | 0       | 70.75 K
POLS      | 414 | 24.40 K  | 22.79 K  | 0       | 137.10 K
ING       | 414 | 150.75 K | 416.21 K | 0       | 1.82 M
RABO      | 414 | 44.31 K  | 347.51 K | 0       | 3.24 M
Table 2.5: Descriptive Statistics
Correlation between variables
A heat map of the correlation between the variables is given in Fig. 2.5. The red tiles signify
negative correlation whilst the blue tiles signify positive correlation. The darker the colour, the
stronger the correlation. From the heat map we can deduce that several media variables are
positively correlated, especially TV and Social, ING and Radio, and ING and Social. However,
the largest correlation is still smaller than 0.7, so we do not expect these correlations to be problematic.
The positive correlation also makes sense if we look at Fig. 2.4, since we see that the TV, Radio,
ING and Social campaign are all approximately simultaneous.
Figure 2.5: Heat map of the correlations between the variables
Chapter 3
Methodology
In the previous section we discussed that marketing mix modelling is a timely and complex
problem. In this chapter the methodology will be described of how the difficulties that are
encountered within this subject will be addressed. An appropriate functional form of the model
has to be chosen, we have to account for seasonality and the dynamic structure of the
marketing efforts has to be considered. Moreover, we will describe on which criteria the choice
for the final model will be based and which tests will be used to validate this model.
3.1 Choice of model
Stationarity
The first step of our modelling approach is to test whether the considered time series is
stationary, a necessary condition to obtain valid test results. We test for a unit root with the
augmented Dickey-Fuller (ADF) unit root test. Following the approach in Heij et al. (2004),
we use the augmented Dickey-Fuller test equation
\[
\Delta y_t = \alpha + \beta t + \rho y_{t-1} + \rho_1 \Delta y_{t-1} + \cdots + \rho_{p-1} \Delta y_{t-p+1} + \varepsilon_t \tag{3.1}
\]
to test the following hypotheses:
\[
H_0: \rho = 0 \text{ and } \beta = 0 \quad \text{(stochastic trend)},
\]
\[
H_1: \rho < 0 \text{ and } \beta \neq 0 \quad \text{(deterministic trend)},
\]
using a t-test on $\rho$ in EViews with automated lag selection based on the Bayesian Information
Criterion (BIC) as proposed by Schwarz (1978), given by
\[
\mathrm{BIC} = \log(N)\,k - 2\log(L),
\]
where $L$ is the maximized likelihood value, $k$ the number of parameters estimated by the
model and $N$ the number of observations.
We have to observe the time series to decide whether it appears to have a trend and a
constant. When it has, we have to include those in the test and otherwise we have to leave
them out. The test will be performed at the 5% significance level.
Model selection
As described in the data section, we have a substantial number of variables that might explain
the number of requests of card services. We do want to explain as much as possible, but we
also don’t want to overparametrize the model. Therefore we choose the model with the lowest
BIC to select the optimal number of parameters and the optimal functional form.
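As a sketch of how this selection rule works, the snippet below evaluates the BIC formula given earlier for two candidate specifications and picks the one with the lowest value. The log-likelihoods and parameter counts are invented for illustration and are not results from the thesis.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC = log(N) * k - 2 * log(L); lower values are preferred."""
    return math.log(n_obs) * n_params - 2.0 * log_likelihood


# Hypothetical fits on N = 414 daily observations.
candidates = {
    "six weekday dummies": {"loglik": -310.0, "k": 17},
    "business-day dummy": {"loglik": -318.0, "k": 12},
}
scores = {name: bic(m["loglik"], m["k"], 414) for name, m in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: BIC = {score:.2f}")
print("selected:", best)
```

In this made-up case the richer model fits slightly better, but the penalty of log(414) per extra parameter outweighs the gain in likelihood.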
From the descriptive statistics in the data chapter, we can deduce that there might be an
autonomous growth in the number of services requested online. Therefore we will test
whether we should include a trend in the model using the BIC described earlier. Furthermore,
we can see a weekly pattern in the time series, so we expect that we have to include dummy
variables to describe this effect. It will be investigated whether it is better to include a
dummy variable for each of Monday to Saturday, to include only an indicator of whether it is
a business day, to group the days in a different way, or not to include a day variable at all.
Finally, we will investigate whether it is better to incorporate the media spend of the
competitors separately or together, and whether it is better to include the net spend or
merely a dummy variable indicating whether or not a media campaign was running that day.
Linear or log-linear
As described in the literature section, both linear and non-linear functional forms are used in
marketing mix modelling. While a linear model is simple and intuitive, it does not allow for a
non-linear return to scale of marketing efforts.
A linear model,
\[
y_t = \beta_0 + \beta_1 x_{1t} + \ldots + \beta_k x_{kt} + \varepsilon_t, \tag{3.2}
\]
implies that each unit increase of $x_{it}$ (for example one extra GRP) results in the same
increase of $y_t$ regardless of the level of exposure, which is often not realistic. We expect
that an exponential model,
\[
y_t = e^{\beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \ldots + \varepsilon_t}, \tag{3.3}
\]
might provide a better fit. It can again be estimated using OLS after taking logarithms on
the left- and right-hand side,
\[
\log(y_t) = \beta_0 + \beta_1 x_{1t} + \ldots + \beta_k x_{kt} + \varepsilon_t, \tag{3.4}
\]
which is well-known as the log-linear model, the coefficients of which can be interpreted by
\[
\%\Delta y = 100 \cdot \left( e^{\beta_i} - 1 \right) \approx 100\,\beta_i, \quad \text{for } \beta_i \text{ small and } i = 1, 2, \ldots
\]
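The quality of the approximation in the last expression can be checked numerically. The sketch below compares the exact percentage effect with the linear approximation for a few coefficient values; the values of beta are arbitrary and chosen only to show where the approximation starts to break down.

```python
import math


def pct_effect(beta: float) -> float:
    """Exact percentage change in y for a one-unit increase in x in the log-linear model."""
    return 100.0 * (math.exp(beta) - 1.0)


for beta in (0.01, 0.05, 0.5):
    exact = pct_effect(beta)
    approx = 100.0 * beta
    print(f"beta={beta}: exact {exact:.2f}%, approximation {approx:.2f}%")
```

For small coefficients the two numbers are nearly identical, while for a coefficient as large as 0.5 the exact effect is clearly larger than the approximation.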
As proposed by Mhitarean (2017) we can use the Box-Cox transformation
\[
\frac{y_t^{\lambda} - 1}{\lambda} = \beta_0 + \beta_1 x_{1t} + \ldots + \beta_k x_{kt} + \varepsilon_t \tag{3.5}
\]
on the linear model to find the appropriate functional form. When $\lambda$ tends to 1, (3.5) goes to
\[
y_t = (1 + \beta_0) + \beta_1 x_{1t} + \ldots + \beta_k x_{kt} + \varepsilon_t, \tag{3.6}
\]
and the linear model is the best fit. When $\lambda \to 0$, (3.5) goes to (3.4) and thus the log-linear
model is the best fit. We test whether model (3.2) or (3.4) is optimal using the Likelihood
Ratio (LR) test. When neither 0 nor 1 lies in the 95% confidence interval of $\lambda$, other
functional forms should be investigated.
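The two limiting cases of the Box-Cox transformation can be verified with a short sketch. The function below implements the left-hand side of Eq. (3.5) for a single positive observation; the threshold `eps` used to switch to the logarithm is an implementation choice, not something specified in the text.

```python
import math


def box_cox(y: float, lam: float, eps: float = 1e-8) -> float:
    """Box-Cox transform (y**lam - 1) / lam, with the log limit as lam tends to 0."""
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if abs(lam) < eps:
        return math.log(y)
    return (y ** lam - 1.0) / lam


y = 5.0
print(box_cox(y, 1.0))   # lam = 1: y - 1, i.e. the linear model up to a shifted constant
print(box_cox(y, 1e-6))  # lam close to 0: approaches log(y)
print(math.log(y))
```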
The Ramsey RESET test
Also, it should be tested whether quadratic or cubic effects should be incorporated in the
model. This can be tested using the Ramsey RESET test. As in Tsay (2005), we first obtain the
least squares estimate $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k)'$ to
compute the fit $\hat{y}_t = X_t'\hat{\beta}$ with $X_t = (1, x_{1t}, \ldots, x_{kt})'$, the
residuals $\hat{\varepsilon}_t = y_t - \hat{y}_t$ and the sum of squared residuals
\[
SSR_0 = \sum_{t=k+1}^{T} \hat{\varepsilon}_t^{\,2}
\]
with $T = 414$, the sample size. Next, the linear regression
\[
\hat{\varepsilon}_t = X_t'\alpha_1 + M_t'\alpha_2 + \upsilon_t \tag{3.7}
\]
is done with $M_t = (\hat{y}_t^{\,2}, \hat{y}_t^{\,3})'$, which gives the least squares residuals
\[
\hat{\upsilon}_t = \hat{\varepsilon}_t - X_t'\hat{\alpha}_1 - M_t'\hat{\alpha}_2
\]
and the sum of squared residuals $SSR_1 = \sum_{t=k+1}^{T} \hat{\upsilon}_t^{\,2}$. Our null
hypothesis is that $\alpha_1$ and $\alpha_2$ of Eq. 3.7 are equal to zero. This can be tested by
the F-statistic of Eq. 3.7 given by
\[
F = \frac{(SSR_0 - SSR_1)/g}{SSR_1/(T - k - g)} \quad \text{with } g = s + k + 1, \tag{3.8}
\]
with $s = 2$ as we control for quadratic and cubic effects. When the null hypothesis is rejected,
this indicates that a quadratic or cubic term should be included.
AdStock
As mentioned in Chapter 1, advertising at time $t$ has a direct effect at time $t$, but also a
(smaller) effect at time $t+1, t+2, \ldots$, and this carry-over effect can be included with the
AdStock model of Broadbent (1984). Lags of the advertising expenditure of a marketing channel
will be included in the model using an AdStock variable
\[
\mathrm{AdStock}_{kt} = f(x_{kt}) = x_{kt} + \lambda_k x_{k,t-1} + \lambda_k^2 x_{k,t-2} + \ldots \tag{3.9}
\]
with $x_{kt}$ the advertising expenditure of channel $k$ at time $t$ and $\lambda_k \in [0, 1)$ the
decay factor or retention rate of channel $k$, which can be rewritten as
\[
\mathrm{AdStock}_{kt} = \lambda_k \mathrm{AdStock}_{k,t-1} + x_{kt}. \tag{3.10}
\]
Besides AdStock variables to include the carry-over effects of the marketing exposure, we want
to include synergy AdStock variables in our model. As described in the literature section, there
are synergies between different marketing channels and we want to capture those in our model.
We can do so by defining, similar to equation 3.10, the synergy effect between channels $k$ and $k'$ as
\[
\mathrm{AdStock}_{k:k',t} = \lambda_{k:k'} \mathrm{AdStock}_{k:k',t-1} + x_{kt}\, x_{k't}. \tag{3.11}
\]
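The recursion in Eq. (3.10), its equivalence to the geometric sum in Eq. (3.9), and the synergy variable of Eq. (3.11) can be sketched as follows. The exposure series and the retention rate of 0.5 are made up, and starting the stock at zero before the first observation is an assumption of this sketch.

```python
def adstock(x, lam):
    """Recursive AdStock of Eq. (3.10), starting from a zero stock."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("the retention rate must lie in [0, 1)")
    stock, out = 0.0, []
    for value in x:
        stock = lam * stock + value
        out.append(stock)
    return out


def adstock_direct(x, lam):
    """Geometric sum of Eq. (3.9), truncated at the start of the sample."""
    return [sum(lam ** j * x[t - j] for j in range(t + 1)) for t in range(len(x))]


def synergy_adstock(x_k, x_k2, lam):
    """Eq. (3.11): AdStock of the product of two channels' exposures."""
    return adstock([a * b for a, b in zip(x_k, x_k2)], lam)


tv = [10.0, 0.0, 0.0, 5.0, 0.0]     # hypothetical daily GRPs
radio = [2.0, 0.0, 0.0, 4.0, 0.0]   # hypothetical daily GRPs

print(adstock(tv, 0.5))
print(adstock_direct(tv, 0.5))
print(synergy_adstock(tv, radio, 0.5))
```

The recursive form needs only the previous stock, which is why it is convenient when the retention rate has to be re-evaluated many times during estimation.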
The value of $\lambda$ of course depends on the channel of the marketing effort. Most research
on AdStock models is done on weekly instead of daily data. In a meta-analysis of 114 papers,
Assmus et al. (1984) found an average $\lambda$ of 0.46 with a standard deviation of $\sigma = 0.30$.
Since we use daily data, we expect the retention rate to be closer to one than for weekly data,
as this has to be the case to get the same half-life, $\log(0.5)/\log(\lambda)$. However, Abe
(1991) found that for cranberry drinks the daily retention rate was equal to 0.909, which implies
a half-life of 7.26 days, which is relatively short. For SEA and interactions with SEA we do not
include an AdStock variable, since one is only exposed to this channel when actively searching
for mobile banking, so the effect is expected to be direct. The same holds for POLS: customers
only see a POLS message when they are already in the online banking environment, so we do
not include AdStock for POLS.
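The half-life argument above can be made concrete with a short calculation. The sketch below computes the half-life for a given retention rate and converts a weekly retention rate into the daily rate with the same half-life; the conversion by taking the seventh root is an assumption that follows from requiring equal decay over seven days.

```python
import math


def half_life(lam: float) -> float:
    """Number of periods after which the carry-over effect has halved."""
    if not 0.0 < lam < 1.0:
        raise ValueError("the retention rate must lie in (0, 1)")
    return math.log(0.5) / math.log(lam)


def daily_from_weekly(lam_weekly: float) -> float:
    """Daily retention rate that gives the same decay over seven days."""
    return lam_weekly ** (1.0 / 7.0)


print(half_life(0.909))                      # daily rate from Abe (1991), in days
print(daily_from_weekly(0.46))               # daily equivalent of the weekly average
print(half_life(daily_from_weekly(0.46)))    # its half-life in days
```

The daily equivalent of a weekly rate is always closer to one than the weekly rate itself, which is the point made in the text.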
Non-linear estimation
Since with the use of the AdStock model for media efforts our model became nonlinear, we have
to use a nonlinear optimization method to solve for
minS(β) = min T∑ t=1
(yt − yt)2 s.t. λk ∈ [0, 1). (3.12)
We choose not to use the Gauss-Newton method, since we want to set bounds on λ and this is
not possible with the Gauss-Newton method. Therefore we estimate $\beta = (\beta_1, \dots, \beta_k, \lambda_1, \dots, \lambda_k)^T$
with the Levenberg-Marquardt method, which is a combination of the Gauss-Newton and the
gradient descent method. This is an iterative method for which starting values have to be given
for β. We use λ = 0.9 and the estimates of the model without AdStock as an initial guess for
β. Gavin (2017) notes that the parameter updates of the Levenberg-Marquardt algorithm vary
between the gradient descent update and the Gauss-Newton update,
$$\left[ J^T W J + \gamma I \right] \theta_{lm} = J^T W (y - \hat{y}), \tag{3.13}$$
where small values of the algorithmic parameter γ result in a Gauss-Newton update and large
values of γ in a gradient descent update. In each iterative step the value of γ is adjusted. If
an iteration leads to a worse approximation, $S(\beta + \theta_{lm}) > S(\beta)$, γ is increased. Otherwise, when
the approximation improves, γ is decreased. To improve convergence, Marquardt (1963) stated
that the values of γ have to be normalized to the values of $J^T W J$ such that
$$\left[ J^T W J + \gamma \, \mathrm{diag}(J^T W J) \right] \theta_{lm} = J^T W (y - \hat{y}). \tag{3.14}$$
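The update rule of equation 3.14 can be sketched in plain Python for a deliberately small model, $\hat{y}_t = b_0 + b_1 \mathrm{AdStock}_t(\lambda)$. Everything below is an illustrative assumption rather than the routine used in the thesis: the weights are $W = I$, the Jacobian is approximated by forward differences, the bound on λ is enforced by simply clipping each trial value (a simplification that is not part of the classical algorithm), and the data are synthetic and noise-free.

```python
def adstock(spend, decay):
    stock, previous = [], 0.0
    for x in spend:
        previous = decay * previous + x
        stock.append(previous)
    return stock


def predict(params, spend):
    b0, b1, decay = params
    return [b0 + b1 * a for a in adstock(spend, decay)]


def sse(params, spend, y):
    return sum((yt - ft) ** 2 for yt, ft in zip(y, predict(params, spend)))


def solve(a, b):
    """Solve the small dense system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def levenberg_marquardt(spend, y, start, damping=1e-2, max_iter=200):
    params = list(start)
    current = sse(params, spend, y)
    k = len(params)
    for _ in range(max_iter):
        fitted = predict(params, spend)
        resid = [yt - ft for yt, ft in zip(y, fitted)]
        # Forward-difference approximation of the Jacobian, stored column by column.
        cols = []
        for j in range(k):
            shifted = params[:]
            shifted[j] += 1e-6
            moved = predict(shifted, spend)
            cols.append([(mv - f) / 1e-6 for mv, f in zip(moved, fitted)])
        jtj = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)] for i in range(k)]
        jtr = [sum(p * r for p, r in zip(cols[i], resid)) for i in range(k)]
        # Left-hand side of equation 3.14 with W = I: J'J + damping * diag(J'J).
        lhs = [[jtj[i][j] + (damping * jtj[i][i] if i == j else 0.0) for j in range(k)]
               for i in range(k)]
        step = solve(lhs, jtr)
        trial = [p + s for p, s in zip(params, step)]
        trial[2] = min(max(trial[2], 0.0), 0.999)  # assumption: clip to keep the decay in [0, 1)
        trial_sse = sse(trial, spend, y)
        if trial_sse < current:   # better fit: accept and move towards Gauss-Newton
            params, current, damping = trial, trial_sse, damping / 10.0
        else:                     # worse fit: reject and move towards gradient descent
            damping *= 10.0
        if max(abs(s) for s in step) < 1e-10:
            break
    return params


# Synthetic example: hypothetical spend, data generated with b0 = 2, b1 = 0.5, decay = 0.6.
spend = [10, 0, 0, 5, 8, 0, 0, 0, 12, 3, 0, 0, 7, 0, 0, 0, 0, 9, 4, 0, 0, 6, 0, 0, 11, 0, 0, 0, 2, 0]
y = predict((2.0, 0.5, 0.6), spend)
start = (0.0, 1.0, 0.9)
fit = levenberg_marquardt(spend, y, start)
print([round(p, 4) for p in fit])
```

The damping parameter plays the role of γ in the text: it is divided by ten after an accepted step and multiplied by ten after a rejected one.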
After we have estimated the parameters using the Levenberg-Marquardt method, we save
the AdStock rates which are significantly different from zero and use those rates to re-estimate
the log-linear model. We will evaluate whether our model has improved by comparing the BIC
values and thereafter use stepwise selection where we delete the least significant variable in
each iteration until either the BIC does not improve anymore or every variable is significant at
the 5% level. Subsequently we will determine whether there exist significant interaction effects
between the media variables and whether interaction AdStock effects should be included.
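The BIC comparison that drives the stepwise selection can be sketched as follows. This is a minimal illustration, assuming the usual definition of an information criterion as minus twice the log-likelihood plus a penalty per parameter; the log-likelihood values and parameter counts are hypothetical and only chosen to show that the BIC penalty of log(n) per parameter can prefer a smaller model than the AIC penalty of 2.

```python
import math


def information_criterion(log_likelihood, n_params, penalty):
    """Generic criterion: -2 * logL + penalty * number of parameters (lower is better)."""
    return -2.0 * log_likelihood + penalty * n_params


def aic(log_likelihood, n_params):
    return information_criterion(log_likelihood, n_params, 2.0)


def bic(log_likelihood, n_params, n_obs):
    return information_criterion(log_likelihood, n_params, math.log(n_obs))


# Hypothetical fits on n = 414 observations: a full model and a reduced model.
n_obs = 414
full = {"loglik": 300.0, "params": 18}
reduced = {"loglik": 294.0, "params": 15}

print(round(math.log(n_obs), 3))  # BIC penalty per parameter, about 6.026
print(aic(full["loglik"], full["params"]), aic(reduced["loglik"], reduced["params"]))
print(bic(full["loglik"], full["params"], n_obs), bic(reduced["loglik"], reduced["params"], n_obs))
```

With these made-up numbers the AIC favours the full model while the BIC favours the reduced one, which is why a BIC-based backward selection tends to drop more variables.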
3.2 Generalized Additive Models
We will also attempt to model the dynamic nature of the media variables with the use of
Generalized Additive Models (GAM). For this part of the modelling we follow the approach
and notation of Wood (2017). A generalized additive model in general has a structure which
resembles
$$g(\mu_i) = X_i^* \theta + f_1(x_{1i}) + f_2(x_{2i}) + f_3(x_{3i}, x_{4i}) + \dots \tag{3.15}$$
where $g$ is a link function,
$\mu_i \equiv E(Y_i)$ and $Y_i \sim$ some exponential family distribution.
Furthermore $Y_i$ indicates the response variable, in our case services or log(services), whichever
turns out to be the best fit in the linear models. $X_i^*$ is a row of the model matrix containing only
the strictly parametric model components, in our case the dummy variables that indicate weekdays,
holidays and so on. θ is the corresponding parameter vector and each $f_j$ represents a smooth
function of the covariates $x_k$ that are not modelled strictly parametrically, in our case the media
variables. It is also possible to include interaction effects, as in the last term of 3.15. In this
section a brief explanation of the GAM theory is given.
Thin plate regression splines
A smooth function $f_j$ can be estimated by choosing basis functions $b_{ij}(x)$. Then $f_j$ has a
representation like
$$f_j(x) = \sum_{i} b_{ij}(x) \beta_{ij}. \tag{3.16}$$
Various bases are available; in this thesis we use a thin plate regression spline basis to
find $f$, the default in the mgcv package of R, since it can represent smooths of more than one
predictor variable and thin plate regression splines are considered optimal. Thin plate regression
splines are based on thin plate splines, which estimate a smooth function $g(x)$ by minimizing
$$\| y - f \|^2 + \varphi J_{md}(f) \tag{3.17}$$
where $y$ is a vector of $n$ observations
$$y_i = g(x_i) + \varepsilon_i$$
and $f = (f(x_1), f(x_2), \dots, f(x_n))^T$. $J_{md}(f)$ is a function that puts a penalty on the so-called
‘wiggliness’ of $f$. We want to fit the data as well as possible, but at the same time prefer
a smooth function $f$. We define φ as the smoothing parameter that controls the tradeoff
between those two properties (note that the standard notation for the smoothing parameter is
λ, but we choose φ to avoid confusion since λ is already used for the decay factor). If φ → ∞, the
estimate is a straight line, while if φ = 0 there is no penalty.
While the thin plate spline f is a good smoother, thin plate splines have as many unknown
parameters as there are unique predictor combinations and are therefore computationally costly.
Therefore thin plate regression splines are introduced. They divide the components of the thin
plate splines into wiggly components δ, whose space is truncated, and zero-wiggliness components
α, which remain unchanged. They are estimated with the Lanczos algorithm, which has a lower
computational cost. For more background on thin plate regression splines and the Lanczos
algorithm see Wood (2017).
GCV function
We have to choose the optimal smoothing parameter φ to make a tradeoff between a penalty on
wiggliness and on bad fit. This is done by using the Generalized Cross Validation (GCV) score,
where the φ is chosen for which
$$V_g = \frac{n D(\hat{\beta})}{(n - \mathrm{tr}(A))^2} \tag{3.18}$$
is minimized, where $D(\hat{\beta})$ is the deviance and tr(A) the effective degrees of freedom of the
model. The GCV score is known to overfit on occasion; therefore Kim and Gu (2004) suggest
multiplying the effective degrees of freedom by γ = 1.4. This can largely correct the tendency to overfit
without compromising model fit and will thus be used in this thesis to estimate the optimal
value of φ.
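The effect of the factor γ on the GCV score of equation 3.18 can be illustrated with a small Python sketch. The function name, the two candidate fits and their deviances are hypothetical; the only ingredient taken from the text is that the effective degrees of freedom tr(A) are multiplied by γ.

```python
def gcv_score(n_obs, deviance, edf, gamma=1.0):
    """GCV score n * D / (n - gamma * edf)^2; gamma = 1 gives equation 3.18."""
    denominator = n_obs - gamma * edf
    if denominator <= 0:
        raise ValueError("gamma * edf must be smaller than the number of observations")
    return n_obs * deviance / denominator ** 2


# Two hypothetical candidate fits on n = 414 observations.
n_obs = 414
smooth_fit = {"deviance": 60.0, "edf": 10.0}   # stiffer smooth, fewer effective parameters
wiggly_fit = {"deviance": 43.0, "edf": 60.0}   # wigglier smooth, lower deviance

for gamma in (1.0, 1.4):
    scores = {
        "smooth": gcv_score(n_obs, smooth_fit["deviance"], smooth_fit["edf"], gamma),
        "wiggly": gcv_score(n_obs, wiggly_fit["deviance"], wiggly_fit["edf"], gamma),
    }
    print(gamma, min(scores, key=scores.get))
```

With these illustrative numbers the plain GCV score prefers the wigglier fit, while the score with γ = 1.4 prefers the smoother one, which is the direction of the correction described above.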
Model selection
The GAM method does not allow for parameter selection; therefore we start with the full model
and subsequently use stepwise selection to find the optimal number of parameters. Moreover,
we will include AdStock variables of all media variables except for SEA and POLS. Since with
GAM we cannot determine an optimal decay factor as we can with the Levenberg-Marquardt
method, we use the decay factor found for daily data by Abe (1991), λ = 0.909.
3.3 Model evaluation
In addition to the tests mentioned earlier, we perform the Jarque-Bera test for normality
and calculate the Variance Inflation Factors (VIFs) to validate the model. While the former is
well-known (for reference see Heij et al. (2004)), the latter is a less commonly used method and
will thus be discussed in this section.
Although in Chapter 2 we could conclude from the heat map that the correlation between
the variables was not worrisome, we plan on including AdStock transformations of the media
variables and interaction effects. To check whether or not those models suffer from severe
multicollinearity we calculate the VIF for each variable j as described in Fox (2008),
$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \tag{3.19}$$
where $R_j^2$ is the $R^2$ of the regression of $X_j$ on the other covariates,
$$X_1 = C + \alpha_2 X_2 + \dots + \alpha_k X_k + e, \tag{3.20}$$
in the example that j = 1. Since the estimated variance of $\hat{\beta}_j$ can be expressed as
$$\mathrm{Var}(\hat{\beta}_j) = \frac{1}{1 - R_j^2} \cdot \frac{\sigma^2}{(n - 1)\mathrm{Var}(X_j)},$$
with $\sigma^2$ the error variance, the VIF indicates the impact of collinearity on the precision of $\hat{\beta}_j$.
Thus with a VIF of four, the standard error of $\hat{\beta}_j$ is two times as big as it would be if $X_j$ were
uncorrelated with the other variables. As a ‘rule of thumb’, a VIF that exceeds ten indicates severe multicollinearity.
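The relation between $R_j^2$, the VIF and the inflation of the standard error can be checked with a few lines of Python. This is a minimal sketch: the helper names and the two short data series are hypothetical, and the $R^2$ is computed for the simplest case of a single other covariate, where it equals the squared correlation.

```python
import math


def vif(r_squared):
    """Variance inflation factor of equation 3.19."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r_squared_single(x, other):
    """R^2 of regressing x on one other covariate (with intercept): squared correlation."""
    mean_x = sum(x) / len(x)
    mean_o = sum(other) / len(other)
    cov = sum((a - mean_x) * (b - mean_o) for a, b in zip(x, other))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_o = sum((b - mean_o) ** 2 for b in other)
    return cov ** 2 / (ss_x * ss_o)


print(vif(0.75), math.sqrt(vif(0.75)))  # 4.0 2.0: a VIF of four doubles the standard error
print(round(1.0 - 1.0 / 10.0, 2))       # 0.9: the R^2 at which the VIF reaches ten

# Hypothetical covariates, only to show the computation end to end.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0]
print(round(vif(r_squared_single(x1, x2)), 3))
```

The first printed line corresponds to the statement in the text that a VIF of four doubles the standard error.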
Chapter 4
Results
In this chapter the methods of the previous chapter are applied to the dataset provided by
ABN-AMRO on the self-services within their online banking environment, which is described
in Chapter 2. First the behaviour of the aggregated card services will be investigated, next the
differences between the card services and lastly the generalized additive models.
As described in the previous chapter, the first step in our modelling approach is to check
whether there is a stochastic, deterministic or no trend in the time series. This is tested with
the augmented Dickey-Fuller (ADF) test. Table 4.1 reports the ADF statistic with corresponding
p-value, together with the lag length of the test and whether a constant and trend were included
in the test. We inspected the time series in Fig. 2.4 (a)-(f) to decide whether or not to include
a trend. New card does not appear to have a trend, but we are not sure about the trend of
Change geolocation. For this series we first tested for a unit root with a constant and a trend and
then, when the null hypothesis could not be rejected, we tested with a constant only, which is also
not rejected. This indicates that Change geolocation has a unit root. When we take the first
differences of Change geolocation the null hypothesis is rejected, thus we should model the first
differences for this series. All other series have a trend included in the test and turn out to have
a deterministic trend, as the null hypothesis of a stochastic trend is rejected. Finally, Change
daylimit is stationary.
Series                   ADF statistic   p-value     Lag length
Card Services            -4.09           0.0071      15
Change geolocation       -2.72           0.0719      7
Δ(Change geolocation)    -7.79           < 0.0001    6
Change daylimit          -6.30           < 0.0001    13
Block card               -10.88          < 0.0001    6
Deblock card             -8.57           < 0.0001    12
Request new card         -6.90           < 0.0001    14
Replace card             -16.06          < 0.0001    5
Table 4.1: The ADF-statistic for all card services
4.1 The aggregated card services
The linear and log-linear model
When modelling the aggregated card services, we first want to determine whether we should
include the media expenditure on mobile banking of both competitors separately or combined,
whether we want to use the net spend or a dummy variable indicating whether they had a campaign
on mobile banking running, and in which way we should incorporate the day-of-the-week effect.
Based on the BIC values, it is best to include the competitors separately, each with a dummy
variable indicating whether there was a campaign on mobile banking that day. Moreover, we can
conclude that it is better to incorporate a dummy for the days Monday to Saturday than only a
dummy to indicate whether or not it was a business day. However, Wednesday and Thursday were
not significantly different from Friday, so we group these days, which also gives a better BIC value.
Finally, using the same measure, it can be concluded that there is a significant trend and thus
autonomous growth. The results of the linear model that incorporates these decisions,
$$\begin{aligned}
\mathrm{SERV}_t ={}& \beta_0 + \beta_1 \mathrm{TREND}_t + \beta_2 \mathrm{TV}_t + \beta_3 \mathrm{RAD}_t + \beta_4 \mathrm{SOC}_t + \beta_5 \mathrm{ONL}_t + \beta_6 \mathrm{SEA}_t + \beta_7 \mathrm{RABO}_t \\
&+ \beta_8 \mathrm{ING}_t + \beta_9 \mathrm{SHOL}_t + \beta_{10} \mathrm{HOL}_t + \beta_{11} \mathrm{TUE}_t + \beta_{12} \mathrm{WEDTHUFRI}_t + \beta_{13} \mathrm{SAT}_t \\
&+ \beta_{14} \mathrm{SUN}_t + \beta_{15} \mathrm{BM}_t + \beta_{16} \mathrm{EM}_t + \beta_{17} \mathrm{POLS}_t + \varepsilon_t,
\end{aligned} \tag{4.1}$$
are shown in the first column of Table 4.2.
We conclude that all variables are significant except for Social, Online and EM, and that
POLS has a significant effect but with an unexpected sign. Next, we perform the LR test on
the Box-Cox transformation as described in 3.5. As the results of this test, displayed in Fig.
4.1, show that 0 lies in the 95% confidence interval of λ, we can conclude that the log-linear
model,
$$\begin{aligned}
\log(\mathrm{SERV}_t) ={}& \beta_0 + \beta_1 \mathrm{TREND}_t + \beta_2 \mathrm{TV}_t + \beta_3 \mathrm{RAD}_t + \beta_4 \mathrm{SOC}_t + \beta_5 \mathrm{ONL}_t + \beta_6 \mathrm{SEA}_t \\
&+ \beta_7 \mathrm{RABO}_t + \beta_8 \mathrm{ING}_t + \beta_9 \mathrm{SHOL}_t + \beta_{10} \mathrm{HOL}_t + \beta_{11} \mathrm{TUE}_t + \beta_{12} \mathrm{WEDTHUFRI}_t \\
&+ \beta_{13} \mathrm{SAT}_t + \beta_{14} \mathrm{SUN}_t + \beta_{15} \mathrm{BM}_t + \beta_{16} \mathrm{EM}_t + \beta_{17} \mathrm{POLS}_t + \varepsilon_t,
\end{aligned} \tag{4.2}$$
is a better fit. The results of this model are shown in the second column of Table 4.2. The signs
and significance of the variables are equal to those of the linear model, except for POLS, which
has lost its significance. Moreover, we can see that the BIC value of the log-linear model is
better than that of the linear model. Finally the Ramsey RESET test is performed
for both models; the results are shown in Table 4.2. This shows that the null hypothesis of
no nonlinearity is rejected for the linear model, but not for the log-linear model. From the
full log-linear model the optimal log-linear model is determined using the R function stepAIC with
a penalty of k = log(n) = log(414) per degree of freedom instead of the default k = 2, to get the
model with the best BIC.
The results are shown in Table 4.3. We can see that all media variables are omitted from
the model, except for TV, Radio, SEA and BM. Moreover it shows that media exposure of
Rabobank has a positive effect while media exposure of ING has a negative effect, which might
              Linear model                      Log-linear model
(Intercept)   5.23 · 10^3* (2.71 · 10^2)        8.53 · 10^0* (4.21 · 10^−2)
Trend         6.15 · 10^0* (4.29 · 10^−1)       1.03 · 10^−3* (6.68 · 10^−5)
TV            1.78 · 10^1* (5.52 · 10^0)        2.89 · 10^−3* (8.59 · 10^−4)
RAD           7.98 · 10^−1* (3.22 · 10^−1)      1.19 · 10^−4* (5.01 · 10^−5)
SOC           −1.69 · 10^0 (3.59 · 10^0)        −3.77 · 10^−4 (5.59 · 10^−4)
ONL           7.04 · 10^−3 (5.07 · 10^−1)       3.79 · 10^−5 (7.89 · 10^−5)
SEA           2.20 · 10^0* (4.01 · 10^−1)       3.18 · 10^−4* (6.24 · 10^−5)
RABO          3.34 · 10^2* (1.12 · 10^2)        4.70 · 10^−2* (1.74 · 10^−2)
ING           −6.42 · 10^2* (1.42 · 10^2)       −8.67 · 10^−2* (2.21 · 10^−2)
SHOL          2.72 · 10^2* (9.99 · 10^2)        4.12 · 10^−2* (1.55 · 10^−2)
HOL           −1.40 · 10^3* (2.40 · 10^2)       −2.59 · 10^−1* (3.73 · 10^−2)
TUE           3.87 · 10^2* (1.34 · 10^2)        5.94 · 10^−2* (2.09 · 10^−2)
WEDTHUFRI     5.37 · 10^2* (1.10 · 10^2)        8.38 · 10^−2* (1.72 · 10^−2)
SAT           −1.79 · 10^3* (1.38 · 10^2)       −3.13 · 10^−1* (2.14 · 10^−2)
SUN           −2.39 · 10^3* (1.37 · 10^2)       −4.61 · 10^−1* (2.14 · 10^−2)
BM            4.67 · 10^−1* (1.21 · 10^−1)      6.95 · 10^−5* (1.88 · 10^−5)
EM            −8.09 · 10^−1 (7.06 · 10^0)       −1.13 · 10^−3* (1.10 · 10^−3)
POLS          −5.52 · 10^0* (2.14 · 10^0)       −6.51 · 10^−3* (3.33 · 10^−3)
N             414                               414
R2            0.799                             0.835
Standard errors in parentheses; * indicates significance at p < 0.05
Table 4.2: The estimates of the linear and log-linear model
Figure 4.1: The LR test on the Box-Cox transformation
depend on the nature of the campaign. All media variables have a positive effect. Most card
services are requested digitally on Wednesday, Thursday and Friday, and the fewest on Sunday.
Furthermore, on a Holiday fewer card services are requested digitally, while on a School
Holiday more are requested.
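Because the dependent variable is log(SERV), a dummy coefficient β corresponds to a relative change of exp(β) − 1 in the number of card services compared with the baseline category. The Python sketch below applies this standard transformation to a few coefficients from the log-linear column of Table 4.2; the function name is an illustrative assumption and the transformation itself is not spelled out in the thesis.

```python
import math


def percent_effect(coefficient):
    """Relative effect, in percent, of a dummy variable in a log-linear model."""
    return 100.0 * (math.exp(coefficient) - 1.0)


# Coefficients taken from the log-linear column of Table 4.2.
log_linear_coefficients = {"SUN": -0.461, "SAT": -0.313, "HOL": -0.259, "WEDTHUFRI": 0.0838}

for name, beta in log_linear_coefficients.items():
    print(name, round(percent_effect(beta), 1))
```

Under this reading a Sunday lowers the number of digitally requested card services by roughly 37% relative to the baseline day, holding the other variables fixed.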
AdStock Estimation
As described in Chapter 3, AdStock variables are included for all media variables which are
included in the optimal log-linear model:
$$\begin{aligned}
\log(\mathrm{SERV}_t) ={}& \beta_0 + \beta_1 \mathrm{TREND}_t + \beta_2 \mathrm{AdStock}(\mathrm{TV}_t, \lambda_{TV}) + \beta_3 \mathrm{AdStock}(\mathrm{RAD}_t, \lambda_{RAD}) + \beta_4 \mathrm{SEA}_t \\
&+ \beta_5 \mathrm{RABO}_t + \beta_6 \mathrm{ING}_t + \beta_7 \mathrm{SHOL}_t + \beta_8 \mathrm{HOL}_t + \beta_9 \mathrm{TUE}_t + \beta_{10} \mathrm{WEDTHUFRI}_t \\
&+ \beta_{11} \mathrm{SAT}_t + \beta_{12} \mathrm{SUN}_t + \beta_{13} \mathrm{AdStock}(\mathrm{BM}_t, \lambda_{BM}) + \varepsilon_t.
\end{aligned} \tag{4.3}$$
This is estimated with the Levenberg-Marquardt method. Saving the AdStocks, the model is
re-estimated with OLS. Next, the least significant variable is removed from the model and the
model is estimated using the Levenberg-Marquardt method again. This procedure is repeated until
the BIC value no longer improves by removing variables. The results are shown in Table
4.3. All coefficients have the same sign as in the optimal log-linear model; Rabobank and BM
are removed in the optimal model with AdStock. TV has an optimal decay factor of 0.000, which
indicates that there is approximately no carry-over effect. Radio has a decay factor of 0.9454,
which indicates a half-life of 12.3 days.
Next, for the media variables that are included in the optimal log-linear model, both the
AdStock and the interaction AdStock are included:
$$\begin{aligned}
\log(\mathrm{SERV}_t) ={}& \beta_0 + \beta_1 \mathrm{TREND}_t + \beta_2 \mathrm{AdStock}(\mathrm{TV}_t, \lambda_{TV}) + \beta_3 \mathrm{AdStock}(\mathrm{RAD}_t, \lambda_{RAD}) + \beta_4 \mathrm{SEA}_t \\
&+ \beta_5 \mathrm{RABO}_t + \beta_6 \mathrm{ING}_t + \beta_7 \mathrm{SHOL}_t + \beta_8 \mathrm{HOL}_t + \beta_9 \mathrm{TUE}_t + \beta_{10} \mathrm{WEDTHUFRI}_t \\
&+ \beta_{11} \mathrm{SAT}_t + \beta_{12} \mathrm{SUN}_t + \beta_{13} \mathrm{AdStock}(\mathrm{BM}_t, \lambda_{BM}) + \beta_{14} \mathrm{AdStock}(\mathrm{TV}_t \cdot \mathrm{RAD}_t, \lambda_{TVRAD}) \\
&+ \beta_{15} \mathrm{AdStock}(\mathrm{TV}_t \cdot \mathrm{BM}_t, \lambda_{TVBM}) + \beta_{16} \mathrm{AdStock}(\mathrm{RAD}_t \cdot \mathrm{BM}_t, \lambda_{RADBM}) \\
&+ \beta_{17} \mathrm{SEA}_t \cdot \mathrm{TV}_t + \beta_{18} \mathrm{SEA}_t \cdot \mathrm{RAD}_t + \beta_{19} \mathrm{SEA}_t \cdot \mathrm{BM}_t + \varepsilon_t.
\end{aligned} \tag{4.4}$$
For SEA, interaction effects with the other media variables are included instead of the interaction
AdStock. The optimal model is determined with the same procedure as for the optimal log-
linear model with only AdStock. The results are shown in Table 4.3. The signs are equal to those of
the log-linear model and the log-linear model with AdStock, except for the AdStock variable of Radio,
which is negative in the model with interactions. Radio and BM have a positive synergy, just
like SEA and Radio, while the other synergies are negative. However, also because its BIC
value is worse than that of the model with AdStock only, we expect this model to be
over-parametrized. This can lead to poor estimates,
which can explain the negative value of the AdStock for Radio.
As described in the methodology, several tests will be performed to evaluate the models.
First, the variance inflation factor is given. In the absence of multicollinearity, this should be
              Log-linear                        Log-linear AdStock                Log-linear AdS and Int.
(Intercept)   8.52 · 10^0* (3.57 · 10^−2)       8.56 · 10^0* (2.73 · 10^−2)       8.47 · 10^0* (2.90 · 10^−2)
Trend         1.06 · 10^−3* (6.04 · 10^−2)      9.51 · 10^−4* (5.53 · 10^−2)      1.02 · 10^−3* (8.68 · 10^−5)
TV            2.61 · 10^−3* (7.19 · 10^−4)      -                                 -
RAD           1.24 · 10^−4* (4.51 · 10^−5)      -                                 -
SEA           3.14 · 10^−4* (6.05 · 10^−5)      3.77 · 10^−4* (5.47 · 10^−5)      3.49 · 10^−4* (9.33 · 10^−5)
RABO          3.71 · 10^−2* (1.69 · 10^−2)      -                                 3.62 · 10^−2 (1.67 · 10^−2)
ING           −8.55 · 10^−2* (2.18 · 10^−2)     −6.32 · 10^−2* (2.00 · 10^−2)     -
SHOL          4.06 · 10^−2* (1.44 · 10^−2)      3.08 · 10^−2* (1.51 · 10^−2)      -
HOL           −2.65 · 10^−1* (3.69 · 10^−2)     −2.51 · 10^−1* (3.53 · 10^−2)     −2.67 · 10^−1* (3.56 · 10^−2)
TUE           6.06 · 10^−2* (2.08 · 10^−2)      6.34 · 10^−2* (2.00 · 10^−2)      6.14 · 10^−2* (1.93 · 10^−2)
WEDTHUFRI     8.43 · 10^−2* (1.71 · 10^−2)      8.84 · 10^−2* (1.64 · 10^−2)      8.95 · 10^−2* (1.58 · 10^−2)
SAT           −3.06 · 10^−1* (2.10 · 10^−2)     −3.03 · 10^−1* (2.01 · 10^−2)     −3.04 · 10^−1* (1.95 · 10^−2)
SUN           −4.53 · 10^−1* (2.10 · 10^−2)     −4.52 · 10^−2* (2.01 · 10^−2)     −4.54 · 10^−2* (1.94 · 10^−2)
BM            5.17 · 10^−5* (1.56 · 10^−5)      -                                 2.85 · 10^−6* (2.25 · 10^−5)
AdSTV         -                                 1.58 · 10^−3* (5.91 · 10^−4)      1.35 · 10^−2* (2.75 · 10^−3)
AdSRAD        -                                 3.00 · 10^−5* (3.33 · 10^−6)      −2.10 · 10^−4* (4.29 · 10^−5)
AdSRADTV      -                                 -                                 −9.83 · 10^−8 (1.55 · 10^−7)
AdSRADBM      -                                 -                                 1.73 · 10^−7* (2.52 · 10^−8)
AdSTVBM       -                                 -                                 −9.81 · 10^−6* (2.36 · 10^−6)
SEA*RAD       -                                 -                                 4.53 · 10^−6 (6.78 · 10^−6)
SEA*TV        -                                 -                                 −2.20 · 10^−4* (1.10 · 10^−4)
SEA*BM        -                                 -                                 −4.23 · 10^−8 (1.10 · 10^−7)
λTV           -                                 0.0000                            0.0000
λRAD          -                                 0.9454                            0.7967
BIC           −552.2811                         −599.7158                         −590.4975
Standard errors in parentheses; a hyphen marks a term that is not part of the model
AdSα is an abbreviation for AdStock(α, λα); * indicates significance at p < 0.05
Table 4.3: The optimal models with and without AdStock and interaction
smaller than ten for all variables. From the results given in Table 4.4 we can conclude that
the VIFs exceed this boundary for the AdStock variables in the model with interaction. Thus
the multicollinearity in this model is severe and the estimates are less reliable. Moreover, the
Jarque-Bera test is performed on the residuals of the three optimal models and their qq-plots
are included. The Adjusted Jarque-Bera statistics in Table 4.5 indicate that for all models the
null hypothesis of normality is rejected. From their qq-plots in Fig. 4.2 we can also distinguish
Variable VIF Optimal VIF Optimal AdStock VIF Optimal AdStock interaction
Trend 1.680 1.524 4.048
Model AJB statistic p-value
Log-linear with AdStock and interaction 18.487 0.005
Table 4.5: The AJB statistics
(a) Log-linear (b) with AdStock (c) with AdStock and interaction
Figure 4.2: The qq-plots of the residuals of the optimal models
4.2 The differences between the card services
In this section the different card services will be modelled and the differences between the card
services will be reviewed. Are the same media variables significant, and do they exhibit the
same weekly pattern? Table 4.1 showed that Change daylimit, Block card, Deblock card and
Request card have a deterministic trend, thus a trend is included; Request new card is stationary;
and Change geolocation has a stochastic trend, thus the first differences are modelled.
Firstly, for every series the optimal day grouping is found. The results of the models with
all variables are shown in Table 4.6. The day-of-the-week effects are left out of the table for
readability. Subsequently, stepAIC is used again to determine the optimal models; the results
can be found in Table 4.7.
For the time series of the separate card services it could be investigated whether
it is better to use a log-linear functional form instead of a linear one, and interaction effects and
AdStock could be included. This might give a better fit, but the main objective of this thesis is
to model the total of card services, since they are all equally cost-saving. The reason to model
them separately is to see whether or not the media variables had an influence on all card services.
In Chapter 5 conclusions will be drawn from the results.
(Intercept) 2200.2829∗ 322.2906∗ 76.2672∗ 305.3080∗ 503.5507∗ 505.6731∗
(183.2910) (16.9314) (8.4518) (36.1962) (30.8149) (102.3529)
Trend 2.8162∗ 0.6669∗ 0.4446∗ 1.4629∗
(0.2904) (0.0268) (0.0134) (0.1714)
(3.7338) (0.3449) (0.1722) (0.9458) (0.8054) (2.2086)
RAD 0.2720 0.0083 −0.0077 0.0455 0.0678 −0.1175
(0.2178) (0.0200) (0.0100) (0.0551) (0.0469) (0.1291)
SOC 3.2294 −0.0800 −0.0230 0.0316 −1.4338∗ 0.0199
(2.4300) (0.2248) (0.1121) (0.6156) (0.5242) (1.4361)
ONL 0.2884 0.0195 −0.0054 −0.0378 0.0186 0.0179
(0.3431) (0.0316) (0.0158) (0.0834) (0.0710) (0.2029)
SEA 1.4931∗ 0.0653∗ 0.0521∗ 0.0761 0.1009 0.0826
(0.2712) (0.0251) (0.0125) (0.0654) (0.0556) (0.1605)
RABO 251.6827∗ −23.0756∗ −4.9439 −20.0777 −68.4580∗ 3.0007
(75.7393) (7.0050) (3.4925) (18.9646) (16.1364) (44.8146)
ING −333.1837∗ −11.8842 −5.4291 8.5594 −8.4679 7.0272
(96.2118) (8.8722) (4.4365) (24.1685) (20.5630) (56.9166)
SHOL −10.0219 7.5666 1.4369 −25.4498 51.6175∗ 18.9937
(67.5796) (6.2496) (3.1162) (16.9624) (14.4376) (40.0163)
HOL −593.5460∗ −23.4586 −20.9491∗ −108.5370∗ −395.2507∗ −215.1816∗
(162.3170) (15.0085) (7.4847) (41.2054) (35.0156) (95.8346)
BM 0.2089∗ 0.0293∗ 0.0114∗ −0.0126 −0.0517∗ 0.0634
(0.0817) (0.0075) (0.0038) (0.0207) (0.0177) (0.0484)
EM 1.4271 −0.0689 0.0818 −1.3443 −1.4420 −0.8278
(4.7789) (0.4410) (0.2204) (1.2116) (1.0312) (2.8285)
POLS 0.3218 −0.7583∗ −0.4449∗ −0.0415 0.1192 −0.9920
(1.4463) (0.1309) (0.0667) (0.3457) (0.2943) (0.8578)
N 414 414 414 414 414 414
R2 0.7197 0.8019 0.8667 0.5916 0.8775 0.3664
adj. R2 0.7077 0.7939 0.8610 0.5741 0.8726 0.3441
Resid. sd 492.3983 45.5420 22.7052 124.7319 106.2199 291.8329
BIC 6403.98 4427.817 3856.478 5267.036 5129.031 5955.895
Standard errors in parentheses ∗ indicates significance at p < 0.05
Table 4.6: The linear models of the different card services
(Intercept) 2297.2118∗ 307.5812∗ 63.8916∗ 1922.5529∗ 487.9578∗ 520.3316∗
(144.1702) (11.2458) (4.7336) (60.7113) (19.3768) (29.8830)
Trend 2.7726∗ 0.6743∗ 0.4510∗ 1.4470∗
(0.2462) (0.0244) (0.0121) (0.1216)
(71.6596) (6.2053) (33.4496) (14.3336)
(0.0630) (0.0060) (0.0027) (0.0355) (0.0125)
POLS −0.7506∗ −0.4114∗ −5.3328∗
adj. R2 0.7065 0.7942 0.8607 0.5621 0.8706 0.3455
Resid. sd 493.4272 45.5151 22.7332 224.0023 107.0359 291.5225
BIC 6370.781 4387.382 3817.571 5731.878 5095.421 5899.988
Standard errors in parentheses ∗ indicates significance at p < 0.05
Table 4.7: The optimal linear models for the different card services
4.3 Generalized Additive Models
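We first consider the GAM with all media and contextual variables. For all media variables
except SEA and POLS, AdStock is included with λ = 0.909, as described in Section 3. Again
we tested, based on the BIC value, whether it is better to group weekdays and the media spend
on mobile banking of the competitors. This results in the following equation:
$$\begin{aligned}
E(\log(\mathrm{SERV}_t)) ={}& \theta_0 + \theta_1 \mathrm{TREND}_t + \theta_2 \mathrm{COMP}_t + \theta_3 \mathrm{SHOL}_t + \theta_4 \mathrm{HOL}_t + \theta_5 \mathrm{SAT}_t + \theta_6 \mathrm{SUN}_t \\
&+ \theta_7 \mathrm{THU}_t + \theta_8 \mathrm{WEDFRI}_t + f_1(\mathrm{AdStock}(\mathrm{TV}_t, \lambda)) + f_2(\mathrm{AdStock}(\mathrm{RAD}_t, \lambda)) \\
&+ f_3(\mathrm{AdStock}(\mathrm{SOC}_t, \lambda)) + f_4(\mathrm{AdStock}(\mathrm{ONL}_t, \lambda)) + f_5(\mathrm{SEA}_t) \\
&+ f_6(\mathrm{AdStock}(\mathrm{BM}_t, \lambda)) + f_7(\mathrm{AdStock}(\mathrm{EM}_t, \lambda)) + f_8(\mathrm{POLS}_t) + \varepsilon_t,
\end{aligned} \tag{4.5}$$
with λ = 0.909.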
A. parametric coefficients Estimate Std. Error t-value p-value
(Intercept) 8.4986 0.0279 304.5362 < 0.0001
Trend 0.0012 0.0001 13.1227 < 0.0001
COMP 0.0274 0.0153 1.7972 0.0731
SHOL 0.0259 0.0182 1.4241 0.1553
HOL -0.2486 0.0403 -6.1771 < 0.0001
SAT -0.2985 0.0195 -15.3044 < 0.0001
SUN -0.4335 0.0214 -20.2949 < 0.0001
THU 0.0576 0.0157 3.6653 0.0003
TUE 0.0582 0.0155 3.7583 0.0002
WEDFRI 0.0799 0.0137 5.8454 < 0.0001
B. smooth terms edf Ref.df F-value p-value
s(AdStock(TV, 0.909)) 5.2232 6.3058 8.2511 < 0.0001
s(AdStock(RAD, 0.909)) 1.0003 1.0005 0.0767 0.7820
s(AdStock(SOC, 0.909)) 2.2170 2.8363 0.9841 0.2862
s(AdStock(ONL, 0.909)) 7.9135 8.6660 3.6586 0.0003
s(SEA) 3.1847 3.9998 11.5871 < 0.0001
s(AdStock(BM, 0.909)) 5.8773 6.9649 12.1111 < 0.0001
s(AdStock(EM, 0.909)) 3.8406 4.7890 1.6549 0.1206
s(POLS) 2.8846 3.6301 8.0898 < 0.0001
C. model properties
Deviance explained 88.8% BIC 6632.138
GCV 3.8513e+05 n 414
Table 4.8: Output of the full GAM
In Table 4.8 the model output is shown. This tells us that Competitors, School Holiday and
the smooth terms of Radio, Social and EM are not significant at the 5% level. Estimating the
reduced model
$$\begin{aligned}
E(\log(\mathrm{SERV}_t)) ={}& \theta_0 + \theta_1 \mathrm{TREND}_t + \theta_2 \mathrm{HOL}_t + \theta_3 \mathrm{SAT}_t + \theta_4 \mathrm{SUN}_t + \theta_5 \mathrm{THU}_t + \theta_6 \mathrm{WEDFRI}_t \\
&+ f_1(\mathrm{AdStock}(\mathrm{TV}_t, \lambda)) + f_2(\mathrm{AdStock}(\mathrm{ONL}_t, \lambda)) + f_4(\mathrm{SEA}_t) \\
&+ f_5(\mathrm{AdStock}(\mathrm{BM}_t, \lambda)) + f_6(\mathrm{POLS}_t) + \varepsilon_t
\end{aligned} \tag{4.6}$$
with λ = 0.909, we get the output as shown in Table 4.9. We can see that the adjusted R2, GCV
and BIC have improved, but the deviance explained is a little lower. In Fig. 4.3 (a)-(e) the
estimated partial regression functions of Model 4.6 are shown. The x-axis gives the media spend
and the y-axis the effect of the media variable. The solid line shows the estimated
effect and the grey area is the 95% confidence interval. The stripes, or so-called rug plots, at
the bottom of each plot show the values of the covariates of each smooth; if the stripes are
close together there are many observations in that area, if they are far apart there are few and
the 95% confidence interval gets wider. From the graphs of the smooth functions we can tell
A. parametric coefficients Estimate Std. Error t-value p-value
(Intercept) 8.5339 0.0212 403.0150 < 0.0001
Trend 0.0011 0.0001 14.0647 < 0.0001
HOL -0.2598 0.0404 -6.4347 < 0.0001
SAT -0.2936 0.0194 -15.0976 < 0.0001
SUN -0.4289 0.0214 -20.0024 < 0.0001
THU 0.0602 0.0158 3.8101 0.0002
TUE 0.0579 0.0157 3.6901 0.0003
WEDFRI 0.0828 0.0138 6.0165 < 0.0001
B. smooth terms edf Ref.df F-value p-value
s(AdStock(TV, 0.909)) 4.8320 5.8710 10.6305 < 0.0001
s(AdStock(ONL, 0.909)) 8.1989 8.8126 5.2275 < 0.0001
s(SEA) 2.8920 3.6321 15.6346 < 0.0001
s(AdStock(BM, 0.909)) 6.2377 7.3498 14.1622 < 0.0001
s(POLS) 2.9663 3.7279 7.9058 < 0.0001
C. model properties
Deviance explained 88.1% BIC 6602.143
GCV 3.8087e+05 n 414
Table 4.9: Output of the reduced GAM
that SEA is monotonically increasing: the more SEA impressions, the better. POLS is increasing as
well, but decreases after about 70,000 views. However, the confidence interval gets very wide at
that point since there are few observations in this area, so it may well be that
POLS is saturated at this point. TV shows a similar pattern, except that the effect decreases
slightly after a while. This might be due to the fact that the AdStock of TV is included instead
of the GRPs. Since the campaign lasted 35 days and a decay factor of λ = 0.909 gives
a half-life of 7.26 days, the AdStock of TV is higher after the campaign than on the first
few days of the campaign, and therefore a lower effect with a lower AdStock does not necessarily
mean that fewer GRPs spent on TV work better. For BM we see different levels of effect for
different numbers of BM visible. This might be attributable to the nature of BM: sometimes a
message is sent to a large number of customers, while on another occasion a more specific target
group is chosen. Some messages have a stronger ‘call to action’ than others. While there might
be a similar explanation for Online, the smooth is so wiggly that it looks overfitted.
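The remark that the AdStock of TV is higher just after the campaign than during its first days can be made concrete with a small simulation. This is a sketch under a strong simplifying assumption: the campaign is modelled as a hypothetical constant spend of one GRP unit per day for 35 days, which is not the actual spending pattern in the data.

```python
def adstock(spend, decay):
    """Geometric AdStock recursion A_t = decay * A_{t-1} + x_t."""
    stock, previous = [], 0.0
    for x in spend:
        previous = decay * previous + x
        stock.append(previous)
    return stock


campaign_days = 35
decay = 0.909

# Hypothetical campaign: one unit of GRPs per day, followed by two weeks without spend.
spend = [1.0] * campaign_days + [0.0] * 14
stock = adstock(spend, decay)

first_days = stock[:5]                     # AdStock during the first five campaign days
first_day_after = stock[campaign_days]     # AdStock on the first day without any spend

print([round(a, 2) for a in first_days])
print(round(stock[campaign_days - 1], 2), round(first_day_after, 2))
```

In this stylised setting the AdStock on the first day without any TV spend is still well above its level in the first days of the campaign, so a given AdStock value can correspond to very different levels of current spend.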
(a) s(AdStock(TV,0.909)) (b) s(SEA) (c) s(AdStock(Online,0.909))
(d) s(POLS) (e) s(AdStock(BM,0.909)) (f) te(RAD,TV )
Figure 4.3: Graphs of the smooth functions
Including Synergy
We have seen in the linear models that there is synergy between Radio and TV. As discussed
in Chapter 3, thin plate regression splines can represent smooths of more than one predictor
variable. We incorporated this interaction effect using a te() term, which produces a smooth of
Radio and TV from the tensor product of any basis available for use with s() (Wood, 2017).
The resulting model is
$$\begin{aligned}
E(\log(\mathrm{SERV}_t)) ={}& \theta_0 + \theta_1 \mathrm{TREND}_t + \theta_2 \mathrm{HOL}_t + \theta_3 \mathrm{SAT}_t + \theta_4 \mathrm{SUN}_t + \theta_5 \mathrm{THU}_t + \theta_6 \mathrm{WEDFRI}_t \\
&+ f_1(\mathrm{AdStock}(\mathrm{TV}_t, \lambda), \mathrm{AdStock}(\mathrm{Radio}_t, \lambda)) + f_2(\mathrm{AdStock}(\mathrm{ONL}_t, \lambda)) \\
&+ f_4(\mathrm{SEA}_t) + f_5(\mathrm{AdStock}(\mathrm{BM}_t, \lambda)) + f_6(\mathrm{POLS}_t) + \varepsilon_t
\end{aligned} \tag{4.7}$$
with λ = 0.909, of which the results are shown in Table 4.10. It shows that the GCV and
deviance explained are better than in model 4.6, but the BIC and adjusted R2 are worse. We
performed an F-test with the null hypothesis that model 4.7 is better than 4.6 and this null
hypothesis is rejected, see Table 4.11. Thus we cannot conclude that it is better to include
the interaction effect of Radio and TV instead of excluding Radio. In Fig. 4.3 (f) the tensor
product spline fit of TV and Radio is shown. The bold contours show the estimate of the smooth;
it shows that for a low level of Radio and TV we see interaction. However, it is quite hard to
interpret a smooth of two variables.
A. parametric coefficients Estimate Std. Error t-value p-value
(Intercept) 8.5299 0.0212 401.4509 < 0.0001
Trend 0.0012 0.0001 14.0744 < 0.0001
HOL -0.2336 0.0407 -5.7391 < 0.0001
SAT -0.2905 0.0192 -15.1357 < 0.0001
SUN -0.4284 0.0212 -20.2384 < 0.0001
THU 0.0592 0.0156 3.7953 0.0002
TUE 0.0562 0.0155 3.6361 0.0003
WEDFRI 0.0823 0.0136 6.0644 < 0.0001
B. smooth terms edf Ref.df F-value p-value
te(AdStock(TV, 0.909),AdStock(RADIO, 0.909)) 10.7335 11.5703 8.0671 < 0.0001
s(AdStock(ONL, 0.909)) 8.3190 8.8494 4.9825 < 0.0001
s(SEA) 3.1181 3.9094 12.3402 < 0.0001
s(AdStock(BM, 0.909)) 5.6469 6.7758 10.3926 < 0.0001
s(POLS) 2.8983 3.6464 6.7594 0.0001
C. model properties
Deviance explained 88.7% BIC 6617.268
GCV 3.8017e+05 n 414
Table 4.10: Output of the GAM with interaction
Resid. Df Resid. Dev Df Deviance F Pr(>F)
1 371.28 118884043
Table 4.11: F-test between model 4.7 and model 4.6
Grid Search
Two remarks have to be made about the outcome of the GAM. Firstly, we have few data points
and not every level of input is evenly covered, as the rug plots show. Secondly, since the
mgcv package cannot automatically select an optimal decay factor λ, we have chosen
λ = 0.909 for all media variables, which is likely to be sub-optimal. To improve the model we
have performed a grid search on the decay factors of Model 4.6, the script of which can be
found in Appendix 5. The optimal decay factors of TV, Online and BM are determined in that
order, since it is computationally too heavy to perform a grid search on three decay factors at
once.
The graphs of the BIC against the decay factor λ can be found in Fig. 4.4. They show that
the optimal decay factors for TV and BM are 0.89 and 0.91 respectively. The BIC for the
varying decay factor of Online keeps decreasing when the decay factor grows beyond 0.9. Since
we expect the decay factor of TV to be higher than the decay factor of Online, and a decay
factor larger than 0.95 seems very unlikely since it implies a half-life of longer than 13
days, we choose the local optimum of 0.80 for Online.
(a) TV (b) Online (c) BM
Figure 4.4: Graphs of the BIC with the different decay factors
In Table 4.12 the results are shown of the model estimated with the decay factors found in
the grid search. We see that the adjusted R2, the deviance explained, the GCV and the BIC
have all improved. In Fig. 4.5 (a)-(e) the estimated partial regression functions are shown. We
can see that the graphs have not changed dramatically, which is reassuring, since it would be
worrisome if a small model adjustment gave a very different outcome. However, the confidence
intervals are narrower, Online is a little less wiggly and TV decreases less. Thus we can conclude it is
worthwhile to do a grid search for the decay factor.
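The one-factor-at-a-time grid search described in this section can be sketched generically in Python. The thesis script itself is in its appendix and is not reproduced here; in the sketch below the function names are hypothetical and the scoring function is a toy stand-in for "refit the GAM with these decay factors and return its BIC", with an arbitrarily chosen minimum.

```python
def sequential_grid_search(score, start, grid, order):
    """Optimise one decay factor at a time, holding the other factors fixed."""
    best = dict(start)
    for name in order:
        candidates = []
        for value in grid:
            trial = dict(best)
            trial[name] = value
            candidates.append((score(trial), value))
        best[name] = min(candidates)[1]  # keep the value with the lowest score
    return best


def toy_score(decays):
    """Stand-in for the BIC of a refitted model; its minimum is chosen arbitrarily."""
    target = {"TV": 0.70, "ONL": 0.50, "BM": 0.60}
    return sum((decays[name] - target[name]) ** 2 for name in target)


grid = [i / 100 for i in range(100)]            # candidate decay factors 0.00, 0.01, ..., 0.99
start = {"TV": 0.909, "ONL": 0.909, "BM": 0.909}  # common starting value used in the text
result = sequential_grid_search(toy_score, start, grid, order=["TV", "ONL", "BM"])
print(result)
```

Searching the three factors in sequence needs 3 × 100 model fits on this grid instead of 100³ for a joint search, at the cost of possibly missing the joint optimum when the factors interact.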
A. parametric coefficients Estimate Std. Error t-value p-value
(Intercept) 8.5120 0.0204 416.9096 < 0.0001
Trend 0.0013 0.0001 16.3747 < 0.0001
HOL -0.2753 0.0401 -6.8667 < 0.0001
SAT -0.2932 0.0192 -15.2695 < 0.0001
SUN -0.4291 0.0212 -20.2316 < 0.0001
THU 0.0574 0.0157 3.6646 0.0003
TUE 0.0539 0.0156 3.4611 0.0006
WEDFRI 0.0796 0.0137 5.8048 < 0.0001
B. smooth terms edf Ref.df F-value p-value
s(AdStock(TV, 0.89)) 6.0349 7.1655 20.9543 < 0.0001
s(AdStock(ONL, 0.8)) 8.2411 8.8296 5.3324 < 0.0001
s(SEA) 2.8822 3.6168 23.3090 < 0.0001
s(AdStock(BM, 0.91)) 5.5614 6.6708 14.4444 < 0.0001
s(POLS) 3.2188 4.0033 6.5157 < 0.0001
C. model properties
Deviance explained 88.4% BIC 6596.706
GCV 3.738e+05 n 414
Table 4.12: Output of the GAM with Grid search AdStock
(a) s(AdStock(TV,0.89)) (b) s(SEA) (c) s(AdStock(Online,0.8))
(d) s(POLS) (e) s(AdStock(BM,0.91))
Chapter 5
Conclusion
In this thesis Marketing Mix Modelling is applied to learn what the effect of media exposure
and other contextual variables is on the self-services within the mobile banking application.
Extensive research has been devoted to modelling the effectiveness of advertising on sales,
but more research could be conducted on the effect of marketing on the adoption of mobile
banking, and little research has been done with daily data. The bank wants to know
which media contribute to the increase of self-services in order to calculate its return
on marketing investment. Furthermore we were interested in the decay factors of the media
variables, the differences between the separate self-services and which type of model gives us
the best insights. In this chapter a summary of the most important results is given and
further recommendations are made.
The log-linear models
The first step of our modelling approach was to perform an ADF test to see whether the
self-services are stationary or have a deterministic or a stochastic trend. As a result, we
modelled the series with a trend and, based on the BIC, found that Wednesday, Thursday and
Friday are not significantly different. We also found that the model fits best when it
includes whether or not the competitors ran a campaign on mobile banking, rather than their
actual media spend. Using a Likelihood Ratio test on the Box-Cox transformation and
considering the results of the Ramsey RESET test, we concluded that the log-linear model is
the best functional form.
From the full log-linear form we estimated the optimal log-linear model, the optimal
log-linear model with AdStock and the optimal log-linear model with AdStock and interactions,
using the approach discussed in Section 3. Based on the VIFs we conclude that the log-linear
model with AdStock and interactions suffers from high multicollinearity. Based on the BIC and
Adj. R2 we conclude that the log-linear model with AdStock has the best fit. TV, Radio
and SEA have a significant positive effect on the card services. We found a decay factor for
Radio of λ = 0.9454, which corresponds to a half-time of 12.3 days. Moreover, Rabobank has a
significant positive contribution, and on a Holiday the level of card services is significantly lower.
Furthermore the level of card services is the highest on Wednesday, Thursday and Friday and
the lowest on Sunday.
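The multicollinearity diagnosis above relies on variance inflation factors. The sketch below shows how a VIF can be computed by hand in R as 1 / (1 - R2) of the regression of one regressor on all others. The data are synthetic and purely illustrative: one 35-day flight in which two hypothetical channels are always on together, plus their interaction. None of the numbers come from the thesis data.

```r
set.seed(1)
n  <- 414
on <- as.numeric(seq_len(n) %in% 200:234)   # one 35-day flight (synthetic)
tv    <- on * runif(n, 50, 100)             # both channels are 'on' at the same time
radio <- on * runif(n, 20, 60)
sea   <- runif(n, 0, 10)                    # unrelated channel
X <- data.frame(tv, radio, sea, tv_radio = tv * radio)

# VIF of regressor v: regress v on all other regressors and use 1 / (1 - R^2)
vif <- sapply(names(X), function(v) {
  fit <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  1 / (1 - summary(fit)$r.squared)
})
print(round(vif, 1))
```

In such a design the VIFs of the two overlapping channels and of their interaction are much larger than that of the unrelated channel, which mirrors the difficulty of separating Radio from TV.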
We considered the separate card services and found that all except New card have a
significant trend. New card appears to be insensitive to media exposure. Moreover, Geolocation
has a unit root and thus the first differences were modelled. School holiday has a positive
effect on New card and on Geolocation; the latter makes sense since more people go on
vacation. Fewer services are done digitally on a Holiday in the categories Change daylimit,
Deblock card and New card. Moreover, in the optimal models TV has a positive effect on Change
daylimit and Replace card, while Radio only has an effect on Change geolocation. SEA has a
positive effect on Change daylimit, Block card and Deblock card, and BM has a positive effect
on Change daylimit, Block card, Deblock card and Geolocation but a negative effect on New
card, which seems odd. POLS also has negative effects on Block card, Deblock card and Change
geolocation, which is unexpected. Another interesting insight is that the positive effect of
Rabobank in the log-linear model of the combined self-services is due to its large positive
effect on Daylimit.
The Generalized Additive Models
To gain more insight into the dynamics of the media variables without having to specify a
functional form for every media variable in advance, we used Generalized Additive Models.
First we used a fixed decay factor of 0.909 for the media variables. We found that in the GAM
Competitors, School Holiday, Radio, Social and EM did not contribute significantly at the 5%
level, whereas in the log-linear models Radio and Competitors did have a significant effect.
SEA appears to be monotonically increasing (the more impressions the better), while POLS and
TV saturate at some point. We found different levels of effect for different volumes of BM;
this might be due to the type of BM sent and would be interesting to investigate. The smooth
of Online, however, seems to be overfitted.
In an attempt to improve the fit of the model, we performed a grid search for the best decay
factor of TV, BM and Online. Because of the computation time we were not able to do a grid
search on them simultaneously, but based on the BIC we found decay factors of 0.89, 0.91 and
0.80 respectively, which corresponds with our expectations. When one reads a BM, the message
will probably stay in mind longer than an advertisement seen on TV. A message displayed in an
Online banner will probably be forgotten even faster. With the optimal decay factors the fit
of the model is indeed better, but Online is still wiggly. A method other than thin plate
regression splines could be considered, but it is also possible that other factors should be
taken into consideration, such as the websites on which the banners were shown, or that clicks
should be used instead of views.
Since Radio was omitted from the optimal GAM while it was included in the optimal log-linear
models, we tried to fit a smooth of Radio and TV combined. The fit of this model is
slightly better, but an F-test showed that the improvement is not significant. Moreover the level
plot is more difficult to interpret and as the objective of this thesis is explaining the effects of
variables and not predicting, we prefer the model with single-variable smooths.
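A comparison of this kind can be sketched in R with the mgcv package, which is assumed here because the text mentions thin plate regression splines and GCV. The data are synthetic and the variable names are hypothetical. Because the two models are not strictly nested, the p-value of the F-test should be treated as approximate.

```r
library(mgcv)  # gam(), s(), anova method for fitted GAMs

set.seed(1)
n <- 414
tv    <- runif(n)
radio <- runif(n)
# Synthetic response with purely additive effects (illustration only)
y <- sin(3 * tv) + 0.5 * radio + rnorm(n, sd = 0.3)

m_single <- gam(y ~ s(tv) + s(radio))   # single-variable smooths
m_joint  <- gam(y ~ s(tv, radio))       # one combined bivariate smooth

cmp <- anova(m_single, m_joint, test = "F")   # approximate F-test
print(cmp)
print(BIC(m_single, m_joint))                 # information criterion for both fits
```

When the joint smooth does not improve the fit significantly, the single-variable smooths are preferable for interpretation, since each partial effect can be plotted on its own.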
Recommendations
From the results for both the combined and the separate card services we can conclude that
model specification is very important. While we want to get as much information as possible
out of the data and thus also include AdStock and interaction effects, we have a very limited
dataset of only 414 observations. The dataset is even more limited because there has been only
one so-called ‘flight’ of a media campaign on Radio and TV, lasting 35 days, in which Radio and
TV were almost always ‘on’ or ‘off’ together. This makes it very hard to determine to which
channel the increase in card services can be ascribed. Radio does contribute significantly in
the log-linear models but not in the GAM. Many more functional forms could be tested, but the
main advice to the bank is to repeat the modelling once a second media campaign has been
performed, preferably one in which Radio and TV have also been deployed separately. Another
practical recommendation to the bank is to improve its data quality. As mentioned in Section 2,
two self-services were not registered correctly in the weekends and on holidays. Assumptions
can be made, but these undermine the quality of the results.
Moreover we see that SEA, TV and Radio have a big effect, whereas Social and Online do not
appear to contribute as significantly. The advice would therefore be to invest less in those
two channels. As described, not every channel seems to have the same effect on every
self-service. However, one should not blindly act on those results: the number of
self-services is influenced not only by the number of GRPs or impressions a customer is
exposed to, but also by the nature of the media exposure. Is it focussed on a specific
self-service, or is it a broader promotion of mobile banking?
Finally, we recommend estimating both a log-linear model and a GAM rather than only one of
them. While the log-linear models are easier to interpret, they do not allow for the flexible
functional forms that a GAM does. From the GAM we conclude that there might be unobserved
factors that explain the shape of the smooths of BM and Online, which would be interesting to
investigate in further research.
Bibliography
Abe, M. (1991). A moving ellipsoid method for nonparametric regression and its application to
logit diagnostics using scanner data. Journal of Marketing Research, 28(1):339–346.
Anckar, B. and D’Incau, D. (2002). Value creation in mobile commerce: Findings from a con-
sumer survey. Journal of Information Technology Theory and Application (JITTA), 4(1):43–
64.
Assmus, G., Farley, J. U., and Lehmann, D. R. (1984). How advertising affects sales: Meta-
analysis of econometric results. Journal of Marketing Research, 21(1):65–74.
Bhattacharya, P. (2012). GAM in marketing mix modeling: Revisited. Technical report, Jigyasa
Analytics.
Broadbent, S. (1984). Modeling with adstock. Journal of the Market Research Society,
26(4):295–312.
Clark, C. R., Doraszelski, U., and Draganska, M. (2009). The effect of advertising on brand
awareness and perceived quality: An empirical investigation using panel data. Quantitative
Marketing and Economics, 7(2):207–236.
Danaher, P. J. and Rust, R. T. (1996). Determining the optimal return on investment for an
advertising campaign. European Journal of Operational Research, 95(3):511–521.
Dinner, I. M., Van Heerde, H. J., and Neslin, S. A. (2014). Driving online and offline sales:
The cross-channel effects of traditional, online display, and paid search advertising. Journal
of Marketing Research, 51(5):527–545.
Driver, J. C. and Foxall, G. R. (1986). Optimal advertising: Adstock and be-
yond. Technical report, Cranfield Institute of Technology, https://dspace.
lib.cranfield.ac.uk/bitstream/1826/2901/1/M
Dyson, P. (2008). Cutting adspend in a recession delays recovery. Technical report, World
Advertising Research Center, Oxon.
Farrelly, M. C., Davis, K. C., Haviland, M. L., Messeri, P., and Healton, C. G. (2005). Evi-
dence of a dose–response relationship between “truth” antismoking ads and youth smoking
prevalence. American Journal of Public Health, 95(3):425–431.
37
BIBLIOGRAPHY 38
Farris, P. W., Hanssens, D. M., Lenskold, J. D., and Reibstein, D. J. (2002). Marketing return
on investment: Seeking clarity for concept and measurement. Applied Marketing Analytics,
1(3):267–282.
Fox, J. (2008). Applied Regression Analysis and Generalized Linear Models. Sage, London.
Gavin, H. P. (2017). The levenberg-marquardt method for nonlinear least squares curve-fitting
problems. Technical report, Department of Civil and Environmental Engineering, Duke Uni-
versity, Durham.
Ginsberg, W. (1974). The multiplant firm with increasing returns to scale. Journal of Economic
Theory, 9(1):283–292.
Hanssens, D. M., Parsons, L. J., and Schultz, R. L. (2002). Market Response Models, Economet-
ric and Time Series Analysis, Second Edition. International Series in Quantitative Marketing.
Kluwer Academic Publishers, New York.
Heij, C., de Boer, P., Franses, P. H., Kloek, T., and van Dijk, H. K. (2004). Econometric
Methods with Applications in Business and Economics. Oxford University Press, Oxford.
Jorgenson, D. W. (1966). Rational distributed lag functions. Econometrica: Journal of the
Econometric Society, 34(1):135–149.
Kim, Y. J. and Gu, C. (2004). Smoothing spline gaussian regression: more scalable computation
via efficient approximation. Journal of the Royal Statistical Society, 66(Series B):337–357.
Koyck, L. M. (1954). Distributed Lags and Investment Analysis By L.M. Koyck. Contributions
to Economic Analysis, 4. North-Holland Publishing Company, Amsterdam.
Leeflang, P., Wieringa, J. E., Bijmolt, T. H. A., and Pauwels, K. H. (2015). Modeling Markets,
Analyzing Marketing Phenomena and Improving Marketing Decision Making. Springer, New
York.
Lodish, L. and Lubetkin, B. (1992). General truths? Nine key findings from IRI test data.
Technical report, Admap, Oxon.
Macdonald, E. K. and Sharp, B. M. (2000). Brand awareness effects on consumer decision
making for a common, repeat purchase product: A replication. Journal of Business Research,
48(1):5–15.
Marquardt, D. W. (1963). An algorithm for least-squares estimation of nonlinear parameters.
Journal of the Society for Industrial and Applied Mathematics, 11(2):431–441.
Mhitarean, E. (2017). Marketing mix modelling from the multiple regression perspective. Mas-
ter’s thesis, KTH Royal Institute of Technology, School of Engineering Sciences, Stockholm.
BIBLIOGRAPHY 39
Massachusetts.
Naik, P. A. and Raman, K. (2003). Understanding the impact of synergy in multimedia com-
munications. Journal of Marketing Research, 40(4):375–388.
Naik, P. A., Raman, K., and Winer, R. S. (2000). What do advertisements really do for brands?
International Journal of Advertising, 19(2):147–165.
Naik, P. A., Raman, K., and Winer, R. S. (2005). Planning marketing-mix strategies in the
presence of interaction effects. Marketing Science, 24(1):25–34.
Schwarz, G. E. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2):461–464.
Shaikh, A. A. and Karjaluoto, H. (2015). Mobile banking adoption: A literature review. Telem-
atics and Informatics, 32(1):129–142.
Suarez, M. M. and Estevez, M. (2016). Calculation of marketing ROI in marketing mix models,
from ROMI, to marketing-created value for shareholders, EVAM. Universia Business Review,
52(1):18–75.
Tsay, R. S. (2005). Analysis of financial time series. Wiley series in probability and statistics.
John Wiley and Sons, Inc., Hoboken, New Jersey.
Wessels, L. and Drennan, J. (2010). An investigation of consumer acceptance of m-banking.
International Journal of Bank Marketing, 28(7):547–568.
Wood, S. N. (2017). Generalized Additive Models: an introduction with R. Texts in statistical
science. CRC Press, Boca Raton, Florida.
Programs
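library(mgcv)  #assumed source of gam(); the package is not named in the listing
#Vary lambda from 0.5 to 0.95 in steps of 0.01
Ads <- seq(from = 0.5, to = 0.95, by = 0.01)
resultsTV <- matrix(nrow = length(Ads), ncol = 2)
for (i in seq_along(Ads)) {
  lambda <- Ads[i]
  #Fit the GAM with the TV AdStock computed at this lambda.
  #The right-hand side of the assignment is missing from the listing and is left as-is:
  #gamoptTV <-
  #Save the lambda and BIC for each iteration
  resultsTV[i, 1] <- lambda
  resultsTV[i, 2] <- BIC(gamoptTV)
}
#Plot the BIC against lambda
plot(resultsTV[, 1], resultsTV[, 2], type = "l", xlab = "lambda", ylab = "BIC")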
Program Listing .1: Grid search optimal decay factor for TV
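Because the model call in the listing is incomplete, the following self-contained sketch shows what such a grid search could look like end to end. Everything beyond the grid itself is an assumption made for illustration: the synthetic data, the geometric adstock helper, the variable names and the deliberately simple model formula. It is not the model of Table 4.12.

```r
library(mgcv)  # gam(), s()

set.seed(42)
n <- 414

# Hypothetical helper: geometric AdStock, A[t] = x[t] + lambda * A[t-1]
adstock <- function(x, lambda) {
  as.numeric(stats::filter(x, filter = lambda, method = "recursive"))
}

# Synthetic data: one 35-day TV flight and a response built with a decay factor of 0.89
tv  <- ifelse(seq_len(n) %in% 200:234, runif(n, 50, 100), 0)
y   <- 5000 + 40 * sqrt(adstock(tv, 0.89)) + rnorm(n, sd = 30)
dat <- data.frame(y = y, tv = tv, trend = seq_len(n))

# Vary lambda from 0.5 to 0.95 in steps of 0.01 and store the BIC of each fit
Ads <- seq(from = 0.5, to = 0.95, by = 0.01)
resultsTV <- matrix(nrow = length(Ads), ncol = 2,
                    dimnames = list(NULL, c("lambda", "BIC")))
for (i in seq_along(Ads)) {
  dat$tv_ad <- adstock(dat$tv, Ads[i])
  fit <- gam(y ~ trend + s(tv_ad), data = dat)
  resultsTV[i, ] <- c(Ads[i], BIC(fit))
}

# Decay factor with the lowest BIC on the grid
best <- resultsTV[which.min(resultsTV[, "BIC"]), "lambda"]
print(best)
plot(resultsTV[, "lambda"], resultsTV[, "BIC"], type = "l",
     xlab = "lambda", ylab = "BIC")
```

Each decay factor requires one model fit, so a search over a single channel with this grid costs 46 fits. Searching several channels jointly multiplies the number of fits, which is the computational burden mentioned in the conclusion.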