Post on 21-Dec-2015
MA and ARCH Time Series model inference using Minimum Message Length
By: Mony Sak - 13080512
Supervisors: Assoc. Prof. David Dowe, Dr Sid Ray
Contents
1. “The Problem”
2. Time Series Concepts
3. Minimum Message Length (MML)
4. MML applied to Time Series
5. My Project
6. Results
7. Conclusion & Future Work
1. “The Problem”
Which model fits the data best?
2. Time Series Concepts
What is a Time Series (TS)?
Observations over time
[Figure: observation value plotted against time]
What is a Time Series (TS)? Some examples:
1. Light curve of Beta Persei, also known as Algol, or the “demon star” [1]
2. Closing stock price of Apple Computer Inc. (AAPL) (1984-2005) [2]
3. Global temperature difference vs. years [3]
4. Average monthly bus ridership (weekdays) in Iowa City (1971-1982) [4]
Why study Time Series?
Explanation: A good model = good understanding of the underlying process generating that data
Prediction: Predict future observation values
Control: If we can predict future values, we are able to ‘control’ the time series to our benefit
Description: The best method of conveying information
Some TS models:
1. Autoregressive, order p = AR(p): Current observation value is a sum of weighted past observation values + random error [5]
2. Moving Average, order q = MA(q): Current observation value is a sum of weighted past error values + random error [5]
3. Autoregressive Conditional Heteroskedastic, order q = ARCH(q): Current variance value is a sum of weighted past squared error values [5]
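The three model definitions above can be sketched as small simulators. This is a minimal illustration, not the project's code: the function names, the zero start-up values, the Gaussian errors, and the constant term `alpha0` in the ARCH variance (the usual textbook form) are all assumptions made here.

```python
import math
import random


def simulate_ar(phi, n, rng, sigma=1.0):
    """AR(p): y_t = phi_1*y_{t-1} + ... + phi_p*y_{t-p} + e_t."""
    p = len(phi)
    y = [0.0] * p  # zero start-up values (an assumption)
    for _ in range(n):
        past = sum(phi[i] * y[-1 - i] for i in range(p))
        y.append(past + rng.gauss(0.0, sigma))
    return y[p:]


def simulate_ma(theta, n, rng, sigma=1.0):
    """MA(q): y_t = e_t + theta_1*e_{t-1} + ... + theta_q*e_{t-q}."""
    q = len(theta)
    e = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    return [e[t] + sum(theta[j] * e[t - 1 - j] for j in range(q))
            for t in range(q, n + q)]


def simulate_arch(alpha0, alpha, n, rng):
    """ARCH(q): e_t = sqrt(h_t)*z_t, h_t = alpha0 + sum_j alpha_j*e_{t-j}^2."""
    q = len(alpha)
    e = [0.0] * q  # zero start-up errors (an assumption)
    for _ in range(n):
        h = alpha0 + sum(alpha[j] * e[-1 - j] ** 2 for j in range(q))
        e.append(math.sqrt(h) * rng.gauss(0.0, 1.0))
    return e[q:]


rng = random.Random(0)
ar_series = simulate_ar([0.6, -0.2], 200, rng)
ma_series = simulate_ma([0.5], 200, rng)
arch_series = simulate_arch(0.2, [0.5], 200, rng)
```

Note the difference the slides point at: AR and MA shape the conditional mean of the series, while ARCH leaves the mean at zero and shapes the conditional variance.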
1 set of data… and many, many models
Partial solution to “The Problem”
The Model Selection Criterion (MSC)
Objective scoring of different models
An equation, based on parsimony
[Figure: a criterion (“i am a criterion!”) scoring candidate models, with scores 101.21 and 99.90]
Some popular Model Selection Criteria:
Akaike’s Information Criterion (AIC) [6]
Bayesian Information Criterion (BIC) [7]
…many more incl. HQ [8], RCL [9], MML [10]
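As a concrete picture of "an equation, based on parsimony", here is a small sketch of AIC and BIC in their standard forms (AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L, with k parameters and n observations). The fitted values in `fits` are hypothetical numbers invented for illustration, not results from the slides.

```python
import math


def aic(neg_log_lik, k):
    """Akaike's Information Criterion from a negative log-likelihood."""
    return 2.0 * neg_log_lik + 2.0 * k


def bic(neg_log_lik, k, n):
    """Bayesian Information Criterion; the penalty grows with sample size n."""
    return 2.0 * neg_log_lik + k * math.log(n)


def select(scores):
    """Return the candidate with the smallest score (smaller is better)."""
    return min(scores, key=scores.get)


# Hypothetical fits: name -> (negative log-likelihood, number of parameters)
fits = {"MA(1)": (152.0, 2), "MA(2)": (150.5, 3), "MA(3)": (150.3, 4)}
n = 100

aic_scores = {name: aic(nll, k) for name, (nll, k) in fits.items()}
bic_scores = {name: bic(nll, k, n) for name, (nll, k) in fits.items()}
best_aic = select(aic_scores)
best_bic = select(bic_scores)
```

With these made-up numbers the two criteria disagree: the heavier BIC penalty favours the smaller model, which is exactly why the choice of criterion matters.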
3. Minimum Message Length
What it is & History
Information-theoretic criterion for model selection and point estimation
Developed here at Monash University by Wallace & Boulton in 1968 [11]
Has been applied to mixture modelling (“snob”), decision tree/graph induction, generalized Bayesian networks, and more…
Theory
A “message” can be encoded in 2 parts:
Part 1: Model
Part 2: Data (given the Model in Part 1)
Combined Message Length = Part 1 + Part 2
We choose the model that yields the smallest Combined Message Length
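The two-part idea can be made concrete with a toy coin-flip example. This is only a sketch of a generic two-part code, not the MML87 approximation: the two model classes, the 1-bit cost for naming the class, and the coarse grid of bias values are assumptions chosen to keep the arithmetic visible.

```python
import math


def data_bits(flips, p):
    """Part 2: length in bits of the flips encoded under Bernoulli(p)."""
    return -sum(math.log2(p if f else 1.0 - p) for f in flips)


def two_part_lengths(flips, grid=16):
    """Combined message length (bits) for a fair-coin and a biased-coin model."""
    name_bits = 1.0  # one bit to say which of the two model classes is used
    lengths = {"fair": name_bits + data_bits(flips, 0.5)}
    # Biased coin: p is stated as k/grid for k = 1..grid-1,
    # which costs log2(grid - 1) bits in Part 1.
    part1 = name_bits + math.log2(grid - 1)
    part2 = min(data_bits(flips, k / grid) for k in range(1, grid))
    lengths["biased"] = part1 + part2
    return lengths


def choose(flips):
    """Pick the model with the smallest combined message length."""
    lengths = two_part_lengths(flips)
    return min(lengths, key=lengths.get)


mostly_heads = [1] * 9 + [0]
balanced = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
choice_mostly_heads = choose(mostly_heads)
choice_balanced = choose(balanced)
```

The biased model always encodes the data at least as cheaply in Part 2, but it has to pay for stating its parameter in Part 1; it only wins when the saving in Part 2 exceeds that cost.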
Theory (example)
[Figure: four two-part messages, each consisting of a model (model 1 to model 4) followed by data|model]
MML87 Approximation:
Developed by Wallace & Freeman in 1987 [13]
Part 1 (model):
Part 2 (data|model):
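The slide's own equations did not survive in this transcript. For orientation, the Wallace-Freeman approximation is usually stated in the following form; the notation is an assumption made here, with h(θ) the prior density, F(θ) the expected Fisher information matrix, f(x|θ) the likelihood, d the number of parameters, and κ_d a lattice quantisation constant.

```latex
\begin{align*}
I_1(\theta) &= -\log h(\theta) + \tfrac{1}{2}\log\lvert F(\theta)\rvert + \tfrac{d}{2}\log\kappa_d
  && \text{Part 1 (model)} \\
I_2(x \mid \theta) &= -\log f(x \mid \theta) + \tfrac{d}{2}
  && \text{Part 2 (data} \mid \text{model)} \\
I(\theta, x) &= I_1(\theta) + I_2(x \mid \theta)
  && \text{combined message length}
\end{align*}
```

The Fisher information term in Part 1 is the quantity that has to be derived, or approximated, separately for each model class.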
4. MML87-based MSC
Past Research
MML87-based MSC for:
AR model inference [10], Stock market simulation of AR traders [14]
ARMAX models [15]
…Results: MML does very well when compared to the other Model Selection Criteria
Motivation for my project
How well do MML-based MSCs perform with other models?
Results from Fitzgibbon, Dowe, Vahid (2004) [10]
5. My Project
How well does an MML-based MSC perform with:
Moving Average (MA) models?
Autoregressive Conditional Heteroskedastic (ARCH) models?
We need to derive 2 MSCs, 1 for each model
The Fisher Information matrix involves complex math, so we resort to approximations
MA is a conditional mean model, whereas ARCH is a conditional variance model - quite different
6. Results
Results (simulations): Moving Average (MA) models
(Results from Sak, Dowe, Ray (2005). Accepted for inclusion in proceedings of Advanced Computing in Financial Markets ‘05. Istanbul, Turkey. Dec 15-17, 2005.) [16]
7. Conclusion & Future Work
Future Work
Try other MML approximations such as MMLD [17]
Other Time Series models: Generalized ARCH (GARCH) [18], Generalized/Indexed AR (GAR) [19]
Other parameter estimation methods: Maximum Likelihood Estimation (MLE) is very, very slow!
Conclusion
MML-based MSC for MA models performs very well
MML-based MSC for ARCH models….
References (1 of 2)
1. J. Stebbins. The measurement of the light of stars with a selenium photometer, with an application to variations of Algol. The Astrophysical Journal, 32(3):185-214, 1910.
2. Data obtained from http://finance.yahoo.com/q?s=aapl
3. Data obtained from http://www.elmhurst.edu/~chm/vchembook/globalwarmA.html
4. Hyndman, R.J. (n.d.) Time Series Data Library, http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/. Accessed on 24 Oct., 2005.
5. J. D. Hamilton. Time Series Analysis. Princeton University Press, 1994.
6. H. Akaike. Information theory and an extension of the Maximum Likelihood principle. In Second International Symposium on Information Theory, pages 267-281, 1973. Petrov, B.N. and Csaki, F. (editors). Akademiai Kiado, Budapest.
7. G. Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461-464, 1978.
8. E.J. Hannan and B.G. Quinn. The determination of the order of an autoregression. Journal of the Royal Statistical Society, Series B (Methodological), 41(2):190-195, 1979.
9. H. Mitchell and D.M McKenzie. GARCH model selection criteria. Quantitative Finance, 3:262-284, 2003.
10. L.J. Fitzgibbon, D.L. Dowe, and F. Vahid. Minimum Message Length Autoregressive Model Order Selection. In M. Palanaswami, C. Chandra Sekhar, G. Kumar Venayagamoorthy, S. Mohan and M. K. Ghantasala (eds.), International Conference on Intelligent Sensing and Information Processing (ICISIP), pages 439-444, 2004. Chennai, India, 4-7 January 2004, (ISBN: 0-7803-8243-9, IEEE Catalogue Number: 04EX783), www.csse.monash.edu.au/~dld/Publications/2004/Fitzgibbon+Dowe+Vahid2004.ref.
Want a copy of these slides? Send requests to [email protected]
References (2 of 2)
11. C.S. Wallace and D.M. Boulton. An information measure for classification. Computer Journal, 11(2):185-194, 1968.
12. L.J. Fitzgibbon. Message from Monte Carlo: A Framework for Minimum Message Length Inference using Markov Chain Monte Carlo Methods. PhD thesis, Monash University, Clayton Campus. Wellington Rd, Clayton. Victoria 3800, Australia, 2004.
13. C.S. Wallace and P.R. Freeman. Estimation and inference by compact encoding. Journal of the Royal Statistical Society. Series B (Methodological), 49(3):240-265, 1987.
14. M. J. Collie, D. L. Dowe, and L. J. Fitzgibbon. Stock market simulation and inference technique, 2005. Accepted for inclusion in proceedings of the 5th international conference on Hybrid Intelligent Systems (HIS’05), Rio de Janeiro, Brazil, November 6-9, 2005.
15. [ Schmidt ]
16. M. Sak, D.L. Dowe, and S. Ray. Minimum Message Length Moving Average Time Series Data Mining. In Computational Intelligence: Methods and Applications. First International ICSC Symposium on Advanced Computing in Financial Markets (ACFM2005), 2005. Accepted for inclusion in proceedings of Advanced Computing in Financial Markets (ACFM2005), Istanbul, Turkey. Dec. 15-17, 2005.
17. E. Lam. Improved Approximations in MML. Honours Thesis, Monash University, School of Computer Science and Software Engineering (CSSE), Monash University, Clayton 3168, Australia, 2000.
18. T. Bollerslev. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31:307-27, 1986.
19. M.S. Peiris. Improving the Quality of Forecasting using Generalized AR Models: An Application to Statistical Quality Control. Statistical Methods, 5(2):156-171, 2003.
Negative Log Likelihood
Takes into account the estimated variance
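To show how the estimated variance enters this measure, here is a minimal sketch of a Gaussian negative log-likelihood computed from model residuals. The Gaussian error assumption and the function name are choices made here for illustration; the slide does not spell out the exact form used.

```python
import math


def gaussian_neg_log_lik(residuals, variance=None):
    """Negative log-likelihood of residuals under N(0, variance).

    If no variance is given, the maximum-likelihood estimate
    (the mean squared residual) is used.
    """
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    if variance is None:
        variance = sse / n
    return 0.5 * n * math.log(2.0 * math.pi * variance) + sse / (2.0 * variance)


example_nll = gaussian_neg_log_lik([1.0, -1.0])
```

A model that fits the data with small residuals but reports an inflated variance is penalised through the log-variance term, and one that understates the variance is penalised through the scaled sum of squares.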
6. Results
Empirical Comparison
1. Simulate data sets for 200 models for each model order (i.e. MA(1) - MA(8)) for a total of 1,600 MA data sets
2. Estimate model parameters using Maximum Likelihood (MLE)
3. Pass to each Model Selection Criterion (MSC) the same 1,600 data sets and parameter estimates (for each data set), and let them choose the model they think best represents the data
4. Assessment is on correct model order selection accuracy and negative log likelihood
5. Repeat experiment for ARCH models (again 1,600 data sets)
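The procedure above can be sketched at a much smaller scale. This is an illustrative stand-in, not the experiment from the slides: it compares only MA(0) against MA(1), uses a conditional Gaussian likelihood with a crude grid search in place of full MLE, and scores candidates with BIC because the MML87 criterion's formula is not given in this transcript. All function names and settings are assumptions.

```python
import math
import random


def simulate_ma1(theta, n, rng):
    """MA(1) data: y_t = e_t + theta * e_{t-1}, with e_t ~ N(0, 1)."""
    e_prev = rng.gauss(0.0, 1.0)
    ys = []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        ys.append(e + theta * e_prev)
        e_prev = e
    return ys


def neg_log_lik(ys, theta):
    """Conditional Gaussian negative log-likelihood, taking e_0 = 0 and
    plugging in the estimated variance (mean squared residual)."""
    e_prev, sse = 0.0, 0.0
    for y in ys:
        e = y - theta * e_prev
        sse += e * e
        e_prev = e
    n = len(ys)
    return 0.5 * n * (math.log(2.0 * math.pi * sse / n) + 1.0)


def fit(ys, order):
    """Minimised negative log-likelihood for MA(0) or MA(1)."""
    if order == 0:
        return neg_log_lik(ys, 0.0)
    grid = [k / 20.0 for k in range(-19, 20)]  # crude search over (-1, 1)
    return min(neg_log_lik(ys, th) for th in grid)


def bic(nll, k, n):
    """BIC used here as a stand-in criterion."""
    return 2.0 * nll + k * math.log(n)


def run_experiment(true_order, theta, trials, n, seed=0):
    """Fraction of simulated data sets for which the true order is chosen."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ys = simulate_ma1(theta if true_order == 1 else 0.0, n, rng)
        # k counts the MA coefficients plus the error variance.
        scores = {q: bic(fit(ys, q), q + 1, n) for q in (0, 1)}
        hits += min(scores, key=scores.get) == true_order
    return hits / trials


accuracy_ma1 = run_experiment(true_order=1, theta=0.8, trials=20, n=200)
accuracy_ma0 = run_experiment(true_order=0, theta=0.0, trials=20, n=200)
```

To compare several criteria as in step 3, each one would be given the same simulated data sets and fitted likelihoods, and the selection accuracy tallied per criterion.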