National Research University «Higher School of Economics»
“Quantitative methods of market forecasting” – Course syllabus
Bachelor’s program 38.03.05 “Business informatics”
The Government of the Russian Federation
The Federal State Autonomous Institution of Higher Education “National Research University – Higher School of Economics”
Faculty of Business and Management
School of Business Informatics
Department for Management of Information Systems and Digital Infrastructure
Quantitative methods of market forecasting (commodities and stock markets)
Bachelor’s program 38.03.05 “Business informatics”
Author: S.V. Petropavlovsky, associate professor
Approved at the meeting of the
Department for Management of Information Systems and Digital Infrastructure
«___»____________ 2017
Head of Department
_______________ / E.A. Isaev /
Approved by the Academic Council of Business Informatics
«___»____________ 2017
Chairman
_______________/ A.V. Dmitriev
Moscow, 2017
The document cannot be used by other HSE departments or by other universities and
educational institutions without permission from the course authors.
1. Applicability and Normative References
The program provides the contents of the course and describes the learning outcomes,
competences and practical skills obtained upon completion of the course. It also sets
prerequisites for taking the course and provides criteria for assessing students’ performance. The
program is designed for instructors teaching the course, teaching assistants and undergraduate
students following educational track 38.03.05 "Business Informatics", Bachelor’s level.
2. Course Objectives
The course provides the theoretical background of financial time series analysis and aims at
developing practical skills in acquiring, processing and interpreting financial data.
3. Course Description
"Quantitative methods of market forecasting" is an elective course taken in the 4th academic year
of the Bachelor’s program. The course focuses on basic as well as more advanced
econometric models explaining the structure of financial time series. We start with the
definition of asset returns and review their basic properties. Then we consider simple linear
models for asset returns such as the autoregressive AR(p) and moving average MA(q) models,
combined ARMA models, unit-root processes, exponential smoothing and the ARIMA(p,d,q)
models. For each model, the processes of identification, estimation, verification and forecasting
are described in detail and illustrated by examples. Some additional topics such as handling
seasonality and long-memory effects are also addressed in this section.
Significant attention is paid to non-linear time series in the context of volatility
modeling. The ARCH/GARCH processes are studied in detail, including various extensions of
these models. As an alternative to the ARCH/GARCH paradigm, we introduce and briefly
discuss stochastic volatility models.
Applications of volatility modeling such as option pricing, the term structure of interest
rates and portfolio management are demonstrated. The emphasis in the applications, however, is
on risk management, more specifically on risk measures such as value at risk
(VaR) and expected shortfall. The course gives a comprehensive description of computing,
interpreting and backtesting VaR. Among other approaches, we introduce
extreme value theory to compute VaR.
A considerable portion of the course is devoted to modeling high-frequency data.
Specifically, we introduce and discuss models for price changes (the ordered probit model and
some others), duration models (the ACD model), and the concept of realized volatility.
As a natural generalization of simple linear models, we discuss the multivariate time
series at the introductory level. We focus on the notion of the cointegrated time series which
provides a theoretical framework for algorithms of statistical arbitrage used in automated trading
systems, in particular, pairs trading.
In the last part of the course some non-econometric methods for classification and
prediction in the financial markets are analyzed. In particular, machine learning algorithms such
as regression trees, support vector machines, neural networks and multivariate adaptive
regression splines are applied for building a prototype of an on-line trading system.
Students are expected to implement the algorithms in R throughout the course, although other
languages may also be used, so a brief introduction to R is given at the very
beginning. The duration of the course is one module. The course is taught in English and is worth
4 credits.
4. Learning Outcomes and Competencies
At the end of the course, students should:
Be aware of:
- the need, basic concepts and applicability of models used in financial econometrics;
- the details of implementing the models of financial econometrics in R.
Be able:
- to download and pre-process market data;
- to select the model and identify it;
- to estimate the parameters of the model;
- to make forecasts with the help of the model;
- to interpret the results of the forecast and use them in the decision-making process.
Learn how to:
- search for, select and download market data for the subsequent prediction;
- process market data using modern software;
- make predictions regarding the risk and returns of assets;
- present the results of the analysis.
Pre-requisites:
Programming (R is a plus but not essential), mathematics (algebra and calculus), probability
theory and statistics. Good command of English.
Competencies:
Competencies | Code according to Federal standard/HSE | Descriptors: basic signs of mastering (indicators of achieving a result) | Ways and methods of teaching leading to formation and development of the competencies
Being able to explicate the scientific essence of problems in the professional field | СК-1 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
Being able to solve problems in the professional field on the basis of analysis and synthesis | СК-Б4 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
Being able to realize scientific and practical activities in international environment | СК-Б11 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
Being able to control and develop the content of an enterprise and Internet resources, to control the processes of creating and using information services | ПК-13 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
Consulting with respect to the rational choice of methods and tools for controlling the IT-infrastructure of an enterprise | ПК-24 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
Being able to use the relevant mathematical and technical tools for processing, analysis and systematization of data on the topic of research | ПК-22 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
Being able to prepare scientific reports and presentations | ПК-23 | Mastering and using | Lectures, practice in computer labs, preparation of class and home assignments
5. Role of the course in the curriculum
The course is part of the major (professional) block of disciplines. It is an elective course. The
course is based on a number of preceding disciplines:
- Calculus;
- Linear Algebra;
- Micro- and Macroeconomics, Financial Management;
- Probability Theory and Statistics;
- Modeling of processes and systems.
The concepts and methods provided by the current course may be helpful in studying the
subsequent courses such as:
- Analysis of business processes;
- Fractal analysis of market data;
- Semantic informational systems.
6. Course Structure and Contents
6.1. Course Structure
1st module
№ | Topic | In-class hours: Lectures | In-class hours: Practice in computer labs | In-class hours: Total | Self-study | Total
1 | Introduction to R | 2 | 2 | 4 | 5 | 9
2 | Properties of Asset Returns | 2 | 2 | 4 | 10 | 14
3 | Linear Models for Financial Time Series | 6 | 6 | 12 | 10 | 22
4 | Multivariate Linear Time Series | 4 | 4 | 8 | 10 | 18
5 | Volatility Models | 4 | 4 | 8 | 10 | 18
6 | Applications of Volatility Models | 4 | 4 | 8 | 5 | 13
7 | High Frequency Financial Data | 4 | 4 | 8 | 10 | 18
8 | Value at Risk | 2 | 2 | 4 | 10 | 14
9 | Machine Learning Algorithms in Finance | 4 | 4 | 8 | 10 | 18
  | Total | 32 | 32 | 64 | 80 | 144
6.2. Syllabus
Topic 1. Introduction to R.
Data objects in R, installing and using packages. Loading data from local files and on-line
databases. Plotting data in R. Advanced graphics. Time series objects. Overview of basic
statistics in R. Major programming constructs: conditional operators, loops, functions.
Reading:
1. Core Text: [2]
2. Further Reading: [5]
Topic 2. Properties of Asset Returns
The time value of money: future value, present value, multiple compounding periods, effective
annual rate. Asset return calculations. Portfolio returns. Average returns. Continuously
compounded returns.
Reading:
1. Core Text: [2]
2. Further Reading: [5]
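As an illustration of the return definitions listed in Topic 2, the following minimal sketch computes simple and continuously compounded (log) returns from a price series. It is written in Python, one of the tools listed in Section 10, although the course itself works mainly in R. The function names and the price values are hypothetical and serve only as an example.

```python
import math

def simple_returns(prices):
    """One-period simple returns R_t = P_t / P_{t-1} - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]

def log_returns(prices):
    """Continuously compounded returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

prices = [100.0, 102.0, 99.0, 105.0]  # hypothetical closing prices
R = simple_returns(prices)
r = log_returns(prices)

# Log returns are additive over time: their sum equals the log of the
# gross return over the whole period.
total_log = sum(r)
total_simple = prices[-1] / prices[0] - 1.0
```

The additivity of log returns (here `total_log` equals `log(1 + total_simple)`) is one reason they are convenient for time series modeling.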
Topic 3. Linear Models for Financial Time Series
Stationarity of a time series. Correlation and autocorrelation functions. White noise and linear
time series.
Autoregressive (AR) models. Properties of AR models. Identifying AR models in practice.
Goodness of fit. Forecasting under AR models.
Moving average (MA) models. Properties of MA models. Identifying MA models in practice.
Estimation. Forecasting using MA models.
Mixed ARMA models. Properties of the ARMA(1,1) models. General ARMA models.
Identifying ARMA models. Forecasting under the ARMA models. Different representations of
the ARMA model.
Unit-root non-stationarity. Random walks. Trend-stationary time series. General unit-root non-
stationary models. Unit-root test.
Exponential smoothing.
Seasonal models. Seasonal differencing. Multiplicative seasonal model. Seasonal dummy
variable.
Regression models with time series errors. Long-memory models.
Model comparison and averaging. In-sample and out-of-sample comparison.
Reading:
1. Core Text: [2]
2. Further Reading: [5]
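To make the forecasting step for AR models more concrete, here is a minimal Python sketch of multi-step forecasts for an AR(1) model r_t = phi0 + phi1 * r_{t-1} + a_t, together with the forecast error variances and approximate 95% intervals. The parameter values are hypothetical, not estimates from any data set, and normally distributed shocks are assumed for the intervals.

```python
import math

def ar1_forecast(phi0, phi1, last_value, steps):
    """Point forecasts for r_t = phi0 + phi1 * r_{t-1} + a_t."""
    forecasts = []
    current = last_value
    for _ in range(steps):
        current = phi0 + phi1 * current  # future shocks have zero expectation
        forecasts.append(current)
    return forecasts

def ar1_forecast_variance(phi1, sigma2, steps):
    """Variance of the l-step forecast error: sigma2 * sum_{j<l} phi1^(2j)."""
    return [sigma2 * sum(phi1 ** (2 * j) for j in range(l))
            for l in range(1, steps + 1)]

phi0, phi1, sigma2 = 0.01, 0.5, 0.04  # hypothetical parameter values
point = ar1_forecast(phi0, phi1, last_value=0.10, steps=4)
var = ar1_forecast_variance(phi1, sigma2, steps=4)

# Approximate 95% interval forecasts under normally distributed shocks.
intervals = [(f - 1.96 * math.sqrt(v), f + 1.96 * math.sqrt(v))
             for f, v in zip(point, var)]
long_run_mean = phi0 / (1.0 - phi1)  # forecasts revert to this value
```

As the horizon grows, the point forecasts approach `long_run_mean` and the error variance approaches the unconditional variance of the series, which is the mean-reverting behaviour of a stationary AR model.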
Topic 4. Multivariate Linear Time Series
Review of univariate analysis of stationary time series. AR(p) time series process. MA(q) time
series process. ARMA(p, q) time series process.
Multivariate analysis of stationary time series characteristics. Vector autoregressive model.
Specification, assumptions and estimation. Diagnostic tests, causality analysis. Forecasting.
Structural vector autoregressive model. Specification, assumptions and estimation. Forecast error
variance decomposition. Non-stationary time series. Unit root processes. Long-memory
processes.
Cointegration and common trends. Spurious regression. Concept of cointegration and error-
correction models. Systems of cointegrated variables. Granger's representation theorem.
Statistical inference for cointegrated systems. Statistical arbitrage. Formation of cointegration
pairs. Trading with cointegration pairs.
Reading:
1. Core Text: [3]
2. Further Reading: [5]
Topic 5. Volatility Models.
Characteristics of volatility. Structure of a model. Testing for ARCH effect.
The ARCH Model. Properties of the ARCH models. Building and using an ARCH model.
Examples.
The GARCH Model. Forecasting evaluation. A two-pass estimation method. The integrated
GARCH model. The GARCH-M model. The exponential GARCH model. The threshold
GARCH model. Asymmetric power ARCH model. Non-symmetric GARCH model.
The stochastic volatility model.
Reading:
1. Core Text: [2]
2. Further Reading: [5]
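The GARCH(1,1) recursion from Topic 5 can be sketched in a few lines of Python. The sketch assumes the standard form sigma_t^2 = omega + alpha * a_{t-1}^2 + beta * sigma_{t-1}^2 with alpha + beta < 1, and starts the recursion at the unconditional variance. The parameter values and the shock series are hypothetical; in practice the parameters are estimated, for example by maximum likelihood.

```python
def garch11_variance(shocks, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model for the given shocks."""
    # Start at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for a in shocks:
        sigma2.append(omega + alpha * a * a + beta * sigma2[-1])
    return sigma2  # the last element is the 1-step-ahead variance forecast

omega, alpha, beta = 0.00001, 0.08, 0.90  # hypothetical parameters
shocks = [0.01, -0.03, 0.002, 0.015]      # hypothetical mean-corrected returns
sigma2 = garch11_variance(shocks, omega, alpha, beta)
```

In this example the large shock of -0.03 raises the next conditional variance above the previous one, which is how the model reproduces volatility clustering.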
Topic 6. Applications of Volatility Models
GARCH volatility term structure. Option pricing and hedging. Time dependent correlations and
betas. Minimum variance portfolios.
Reading:
1. Core Text: [2]
2. Further Reading: [4,5]
Topic 7. High Frequency Financial Data
Nonsynchronous trading. Bid–ask spread of trading price. Empirical characteristics of high
frequency trading data.
Models for price changes. Ordered probit model, decomposition model.
Duration models. Diurnal component, the ACD model.
Realized volatility. Handling microstructure noises.
Reading:
1. Core Text: [2]
2. Further Reading: [5]
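The concept of realized volatility can be illustrated with a short Python sketch: the realized variance of a day is the sum of squared intraday log returns, and the realized volatility is its square root. The price values are hypothetical, and the sketch ignores microstructure noise, which the topic treats separately.

```python
import math

def realized_variance(intraday_prices):
    """Sum of squared intraday log returns."""
    returns = [math.log(b / a)
               for a, b in zip(intraday_prices, intraday_prices[1:])]
    return sum(v * v for v in returns)

prices = [100.0, 100.5, 100.2, 100.9, 100.4]  # hypothetical intraday prices
rv = realized_variance(prices)
rvol = math.sqrt(rv)  # realized volatility for the day
```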
Topic 8. Value at Risk
Risk measure and coherence: Value at Risk (VaR), expected shortfall. JP Morgan’s Riskmetrics,
multiple positions. Econometric approach. Quantile estimation, quantile and order statistics.
Quantile regression. Extreme value theory, application to asset returns. An extreme value
approach to VaR. Multiperiod VaR. Peaks over thresholds: statistical theory, mean excess
function, estimation. The stationary loss processes.
Reading:
1. Core Text: [2]
2. Further Reading: [5]
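As a simple instance of the quantile-estimation approach to VaR, the Python sketch below computes the empirical (historical) VaR and expected shortfall from a sample of returns. It uses one order-statistic convention for the empirical quantile among several in use; the function name, the return values and the 80% level are hypothetical and chosen only to keep the example small.

```python
def historical_var_es(returns, level=0.95):
    """Empirical VaR and expected shortfall of losses (loss = -return)."""
    losses = sorted(-r for r in returns)
    n = len(losses)
    # round() guards against floating-point error in level * n.
    k = min(int(round(level * n, 9)), n - 1)
    var = losses[k]
    tail = losses[k:]  # losses at or beyond the VaR
    es = sum(tail) / len(tail)
    return var, es

returns = [0.01, -0.02, 0.005, -0.05, 0.03,
           -0.01, 0.02, -0.03, 0.0, 0.015]  # hypothetical returns
var_80, es_80 = historical_var_es(returns, level=0.80)
```

Expected shortfall averages the losses in the tail, so it is never smaller than the VaR at the same level.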
Topic 9. Machine Learning Algorithms in Finance
Types of machine learning algorithms. The limits of machine learning.
Classification using Nearest Neighbors algorithm: measuring similarity with distance, choosing
an appropriate number of neighbors, preparing data for use with k-NN. Examples of k-NN
algorithm.
Probabilistic learning using Naive Bayes approach: the basic idea, the Laplace estimator,
numerical features of the Naive Bayes approach. Examples (filtering out spam, etc).
Classification using decision trees and rules. Divide and conquer algorithm. The 1R algorithm.
The RIPPER algorithm. Boosting the accuracy of decision trees, pruning the trees. Bagging
classification. Random forests. The Gini index. Advantages and disadvantages of trees.
Black box methods. Neural networks. Activation functions. Network topology. Training a model
on the data. Evaluating and improving model performance. Support vector machines.
Classification with hyperplanes (linearly and non-linearly separable data). Using kernels for non-
linear spaces.
Reading:
1. Core Text: [4]
2. Further Reading: [1-3]
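The nearest-neighbors classifier from Topic 9 can be sketched as follows in Python: distances are measured with the Euclidean metric and the class is chosen by majority vote among the k closest training points. The features, labels and function name are hypothetical, and the features are assumed to be already rescaled to a common range, which is the data preparation step mentioned in the topic.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (previous-day return, 5-day momentum), rescaled.
X = [(-0.8, -0.5), (-0.6, -0.9), (-0.7, -0.2),
     (0.5, 0.7), (0.9, 0.4), (0.6, 0.8)]
y = ["down", "down", "down", "up", "up", "up"]
label = knn_predict(X, y, query=(0.4, 0.5), k=3)
```

An odd value of k avoids ties in a two-class problem; choosing k is a trade-off between noise sensitivity (small k) and oversmoothing (large k).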
7. Reading List
A. Core Texts
1. J. Verzani, Using R for Introductory Statistics, Second Edition, Chapman & Hall/CRC The
R Series, Taylor & Francis, 2014. URL https://books.google.ru/books?id=O86uAwAAQBAJ
2. R. Tsay, An Introduction to Analysis of Financial Data with R, John Wiley & Sons, Inc.,
2013.
3. R. Tsay, Multivariate Time Series Analysis With R and Financial Applications, John Wiley &
Sons, Inc., 2014.
4. B. Lantz, Machine Learning with R, Second Edition, Packt Publishing, 2015.
B. Further Reading
1. J. R. Quinlan, Induction of decision trees, Machine Learning, 1(1): 81–106, 1986.
2. M. Anthony and P. Bartlett, Neural Network Learning: Theoretical Foundations,
Cambridge University Press, 1999.
3. D. Barber, Bayesian Reasoning and Machine Learning, Cambridge University
Press, 2012.
4. B. Pfaff, Analysis of Integrated and Cointegrated Time Series with R, Springer, 2008.
5. N. Chan, Time Series Applications to Finance with R and S-Plus, Second Edition, John Wiley
& Sons, Inc., Hoboken, New Jersey, 2010.
8. Assessment of student’s performance
Type of assessment | Means of assessment | 3 year, modules 1 2 3 4
Pre-exam test (last week of module 1) | Computer-based assignment | *
Final Exam | | *
8.1. Criteria of assessment
To successfully pass the pre-exam test, the students should be able to solve the problems that
were discussed in class. To pass the final exam, the students should demonstrate the knowledge
of basic concepts of financial econometrics and ability to implement them in practice.
8.2. Topics suggested for the pre-exam test
Computer-based assignments on Topics 1-9.
8.3. Sample concept questions for final exam
1. Basic properties of the AR models. Identification, estimation and forecasting under the AR
model.
2. Basic properties of the MA models. Identification, estimation and forecasting under the MA
model.
3. Basic properties of the ARMA models. Identification, estimation and forecasting under the
ARMA model.
4. Unit-root process.
5. Basic properties of the ARIMA models. Identification, estimation and forecasting under the
ARIMA model.
6. Handling seasonality under the linear models.
7. Handling long-memory effects under the linear models.
8. The idea of exponential smoothing.
9. Vector autoregressive models: specification, assumptions and estimation.
10. Vector autoregressive models: diagnostic tests, causality analysis.
11. Vector autoregressive models: forecasting.
12. Structural vector autoregressive models: specification, assumptions and estimation.
13. Structural vector autoregressive models: forecast error variance decomposition.
14. Unit root processes.
15. Cointegration and common trends.
16. Statistical arbitrage. Trading with cointegration pairs.
17. Properties of the ARCH model.
18. Properties of the GARCH model.
19. Extensions of the GARCH model.
20. Models for intraday price changes.
21. Duration models for high frequency trading.
22. Computation and backtesting of VaR.
23. Nearest Neighbors algorithm for classification and its use in financial modeling.
24. Probabilistic learning: naive Bayes approach.
25. Decision trees and their use in financial modeling.
26. Basic concepts of neural networks and their use in financial modeling.
27. Support vector machine algorithms.
The final exam is computer based and lasts for 90 minutes. The assignment consists of two
concept questions and two practical tasks.
8.4. Sample practice assignments for final exam
Problem 1.
1. Use the getSymbols command of the quantmod package to download data on some
commodity, such as oil, gold or wheat, from the Federal Reserve Economic Data site
(http://research.stlouisfed.org/fred2/).
2. Clearly state in your report what kind of data you are using (daily, monthly, etc.).
3. Check for missing data and remove the respective entries from the dataset, if any. You
may use the following script as an example:
    library(quantmod)
    # auto.assign = FALSE returns the series so that it can be stored as GOLD
    GOLD <- getSymbols("GOLDAMGBD228NLBM", src = "FRED", auto.assign = FALSE)
    idx <- which(is.na(GOLD))                # positions of missing observations
    if (length(idx) > 0) GOLD <- GOLD[-idx]  # drop them, if any
If you did find missing data, comment on this in your report.
4. Compute and plot the log price x_t and the log return r_t (in the same figure). Comment on
the two plots (how volatile the data are, volatility clustering, outliers, etc.).
5. Compute and plot the first 12 lags of the ACF of x_t. Comment on the plot. Based on the ACF,
is there a unit root in the x_t series? Why?
6. Consider the time series r_t. Perform the Ljung-Box test for m = 12. Draw a conclusion
and support it in statistical terms, i.e., in terms of the critical region or the p-value.
7. Use the command ar(rt, method = "mle", order.max = 20) to specify the order of an
AR model for r_t. State clearly the criterion you are using. Compare your selection with the
analysis of the partial ACF, using the command pacf(rt, lag.max = 12).
8. Build an AR model for r_t. Check the model by analyzing the ACF and the Ljung-Box statistics
of the residuals. Plot the time series of the residuals, their ACF and the p-values of the
Ljung-Box test. Is the model adequate? Why? Refine the model by eliminating all estimates with
absolute t-ratio less than 1.645 and check the new model as described above. Is the new model
adequate? Why? Write down the final model.
9. Does the model imply the existence of cycles? Why? If cycles are present, compute their
average length.
10. Use the fitted AR model to compute 1-step to 4-step ahead forecasts of r_t at the forecast
origin corresponding to the last date of the time series. Also, compute the corresponding
95% interval forecasts. Plot these results.
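For reference, the Ljung-Box statistic used in step 6 can be written out explicitly. The Python sketch below computes the sample ACF and the statistic Q(m) = n (n + 2) * sum over k = 1..m of rho_k^2 / (n - k), which under the null hypothesis of no autocorrelation is asymptotically chi-square distributed with m degrees of freedom. In R the test is available as Box.test; the function names and the artificial alternating series here are hypothetical and serve only to show the mechanics.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def ljung_box(x, m):
    """Ljung-Box statistic Q(m) = n (n + 2) * sum_k rho_k^2 / (n - k)."""
    n = len(x)
    rho = sample_acf(x, m)
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(rho, start=1))

# 95% quantile of the chi-square distribution with 12 degrees of freedom.
CHI2_95_DF12 = 21.026

# Artificial, strongly autocorrelated series: 1, -1, 1, -1, ...
x = [(-1) ** t for t in range(100)]
q = ljung_box(x, 12)
reject_no_autocorrelation = q > CHI2_95_DF12
```

For real return data the statistic is usually much closer to the critical value, and the decision should be reported together with the p-value as required in step 6.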
9. Grading
The final grade $O_{\text{fin}}$ is computed as
$$O_{\text{fin}} = 0.7\,O_{\text{accm}} + 0.3\,O_{\text{exam}},$$
where $O_{\text{accm}}$ is the grade accumulated over the module and $O_{\text{exam}}$ is the grade for the final
exam. The accumulated grade $O_{\text{accm}}$ is calculated as follows:
$$O_{\text{accm}} = 0.6\,O_{\text{HA}} + 0.4\,O_{\text{MT}},$$
where $O_{\text{HA}}$ and $O_{\text{MT}}$ are the grades for the home assignments and the pre-exam test,
respectively.
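A short worked example may help to read the grading rule. The Python sketch below assumes the weights are 0.6 and 0.4 for the home assignments and the pre-exam test, and 0.7 and 0.3 for the accumulated grade and the final exam. The grade values are hypothetical, and no rounding rule is applied because the syllabus does not specify one.

```python
def accumulated_grade(home_assignments, pre_exam_test):
    """O_accm = 0.6 * O_HA + 0.4 * O_MT."""
    return 0.6 * home_assignments + 0.4 * pre_exam_test

def final_grade(accumulated, exam):
    """O_fin = 0.7 * O_accm + 0.3 * O_exam."""
    return 0.7 * accumulated + 0.3 * exam

# Hypothetical grades: 8 for home assignments, 6 for the test, 9 for the exam.
o_accm = accumulated_grade(home_assignments=8, pre_exam_test=6)
o_fin = final_grade(o_accm, exam=9)
```

With these values the accumulated grade is 0.6 * 8 + 0.4 * 6 = 7.2 and the final grade is 0.7 * 7.2 + 0.3 * 9 = 7.74.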
10. Software and Technical Tools
R, RStudio, Python, Matlab, Microsoft Excel
11. Recommendations for instructors
In general, lectures should give insight into the concepts and ideas underlying the topic under
review. The theoretical core of the presentation should be preceded and followed by clear
examples. The lecture slides may contain pieces of code or pseudocode illustrating the
implementation of the algorithms in some programming language (presumably R). It is highly
recommended to provide students with the lecture slides before the lecture so that they can
familiarize themselves with the material in advance and prepare questions. The lecturer should
refer students to the recommended textbooks, reviews and papers for technical details as needed
throughout the presentation.
Practice classes play a key role in the course. The instructor should focus on the
implementation of data analysis algorithms on computers. Difficult tasks should be
discussed and worked out together with students. The tasks discussed should be close to
those of the home assignments so that students can solve similar problems on their own. Students
are expected to prepare a report on each home assignment and submit it to the instructor
electronically or on paper. Some requirements for these reports may be set, e.g.:
- The questions should be addressed in the same order they appear in the assignment. The
  text of the question must be retained and placed before each answer. The working
  language is English.
- The answer to a particular question may take the form of a plot, formula, etc., followed by a
  brief explanation and a conclusion. All conclusions must be justified numerically, i.e., by
  some computed quantities, plots, etc. The answers do not need to be lengthy but they
  must be convincing in the mathematical and statistical sense, i.e., in terms of some
  quantitative measures.
- Each student must use a unique data set. It is the student’s responsibility to make sure that
  no one else is using the same data. To facilitate the distribution of datasets among the
  students, the instructor can create an editable shared check-in list on Google Drive or
  some other cloud resource.
- The deadlines for the reports should be clearly specified.
- The instructor should notify the students about the penalties for late submission of the
  reports.
- The solutions should normally contain code in R or some other language.
It is good practice to suggest some datasets to the students for the home assignments. For example,
a large amount of market data can be found at Yahoo Finance, Google Finance, the Federal Reserve
Economic Data repository http://research.stlouisfed.org/fred2/ and so on. Other possible data
sources include the JSE archive http://ww2.amstat.org/publications/jse/jse_data_archive.htm, a
large repository at https://www.data.gov/ and a list of freely available sources at
http://guides.emich.edu/data/free-data. Most of these data can be downloaded into R
directly using the respective functions, which should be pointed out to students.
12. Recommendations for students
When completing homework assignments, first read the lecture slides and the recommended
textbook. Then think a little, try some problems, and then read and think some more. This
procedure should be iterated until the problem becomes clear. You should not spend much time
on pure reading with no practice but, at the same time, you should not tackle a problem without
understanding the underlying theory. Plan your timetable so that you do the homework shortly
after the lecture and/or practice class, while the basic ideas are still fresh in your mind.
Author of the program:
Associate professor Sergey V. Petropavlovsky