presentation machine learning

51
Slide 1 of 52 Economic and Financial Forecasting Using Machine Learning Methodologies Periklis Gogas Theophilos Papadimitriou

Upload: periklis-gogas

Post on 12-Jan-2017

120 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Presentation Machine Learning

Slide 1 of 52

Economic and Financial ForecastingUsing Machine Learning Methodologies

Periklis GogasTheophilos Papadimitriou

Page 2: Presentation Machine Learning

Slide 2 of 52

The framework• Research Grant “Thales” – awarded to T.

Papadimitriou• Started • Concludes November 2015• Dissemination of research results

Page 3: Presentation Machine Learning

Slide 3 of 52

Presentation Outline• Intuition for using Machine Learning in Forecasting• Forecasting applications:

• Exchange rate forecasting• Forecasting House Prices in the U.S.• Forecasting Bank Failures and stress testing in the U.S.• Forecasting Recession

Page 4: Presentation Machine Learning

Slide 4 of 52

Why Machine Learning?• Renewed interest in machine learning• Higher volumes and varieties of available data• Affordable data storage.• More importantly: computer processing is cheaper

and more powerful

Page 5: Presentation Machine Learning

Slide 5 of 52

Application 1: Forecasting Exchange Rates

High frequency

(daily)

•Driven by microeconomic factors: Markets, demand & supply

Lower frequency (monthly)

•Driven by Macroeconomics: e.g. Monetary Exchange Rate models

Journal of Forecasting, 2015, vol. 34 (7), pp. 560-573

• 2 exchange rate frequencies• Theory: different data generating processes• Is this confirmed?

Page 6: Presentation Machine Learning

Slide 6 of 52

MethodologiesMachine Learning

• ARCH• GARCH• EGARCH • AR• ARMA• ARIMA• AFRIMA• Random Walk

• Artificial Neural Networks

• Support Vector Regression

Page 7: Presentation Machine Learning

Slide 7 of 52

Overview: Hybrid Method

Page 8: Presentation Machine Learning

Slide 8 of 52

Decomposition: Ensemble Empirical Mode Decomposition

Page 9: Presentation Machine Learning

Slide 9 of 52

Variable Selection: Multivariate Adaptive Regression Splines (MARS)

knot

• Breaks dataset into subsamples• Identifying observations called knots• Assigns weights to variables• Selects the most informative

Page 10: Presentation Machine Learning

Slide 10 of 52

-ε0

ζ

Forecasting: Support Vector Regression• Fit error tolerance band• Higher flexibility than fitting a simple line• Defined according to subset of observations called support

vectors

Page 11: Presentation Machine Learning

Slide 11 of 52

The Kernel Projection

K(x1,x2)

• We fit a linear model• Real phenomena usually non-linear• Project to higher dimensions• Find a dimensional space where a linear error tolerance band is

defined• Re-project to original space and obtain non-linear error tolerance

band

Page 12: Presentation Machine Learning

Slide 12 of 52

Kernels Used

Linear

RBF

Polynomial

Sigmoid

2121 ),( xxxxK T

221),( 21xxexxK

dT rxxxxK )(),( 2121

)tanh(),( 2121 rxxxxK T

Page 13: Presentation Machine Learning

• Separate data• Train data: used to obtain the optimum model• Test data: never seen by the optimum model, used for out-of-

sample forecasting

Train data

Test data

Building the Model: Split the dataset into two parts

Page 14: Presentation Machine Learning

Overfitting and Cross-Validation

• The problem of overfitting:• High accuracy in a specific sample• Low generalization ability

• Solution: use k-part Cross-Validation• Example of a 3-part CV

Page 15: Presentation Machine Learning

Training Testing

Model evaluation

model

Training Testingmodel

Training Testingmodel

Initial

Dataset

Overfitting and Cross-Validation

Page 16: Presentation Machine Learning

Slide 16 of 52

Forecasting Exchange Rates

The data used:• 5 rates of various trading volumes:• 2 high volume rates USD/EUR, USD/JPY• 1 medium volume rate NOK/AUD• 2 low volume rates ZAR/ PHP and

NZD/BRL

Page 17: Presentation Machine Learning

Slide 17 of 52

CommoditiesCrude Oil

CottonLumberCocoaCoffee

Orange JuiceSugarCorn

WheatOats

Rough RiceSoybean Meal

Soybean OilSoybeans

Feeder CattleLean HogsLive Cattle

Pork BelliesIron Ore

MetalsGold

CopperPalladiumPlatinum

SilverAluminum

ZincNickelLeadTin

Stock IndicesDow JonesNasdaq 100

S & P 500DAX

CAC 40FTSE 100

Nikkei 225

Interest ratesT-bill 6 monthsT-bill 10 years

Spread MLP-EURIBOR 3MSpread MLR-Eonia

Spread FF-CPSpread FF-EFF

EONIAEURIBOR 1 WeekECB Interest rate

EURIBOR 1 MonthFED rate

Technical Analysis

variablesFive Day Index

Moving Average 3 dayMoving Average 5 day

Moving Average 10 dayMoving Average 30 day

MacroeconomicVars all countries

CPIProductivity

index GDP

Trade Balance Unemployment

Central Bank Discount rate

Long Term Interest Rates

Short Term Interest Rate

Aggregate money M1, M2

Public DebtDeficit/Surplus of

Government Budget

Exchange RatesJPY/EURJPY/USDUSD/GBPBRL/NZDNOK/AUDPHP/ZAREUR/GBPEUR/USD

USD Trade Weighted

IndicesMajor partners

Broad IndexOther Partners

Input Variables (192)

Page 18: Presentation Machine Learning

Slide 18 of 52

Forecasting Exchange RatesEEMD smoothed component vs USD/EUR series

Page 19: Presentation Machine Learning

Slide 19 of 52

EUR/USD USD/JPY AUD/NOK

-2%

0%

2%

4%

6%

8%

10%

12%

14%

16%

RWAR-SVREEMD-AR-SVREEMD-MARS-SVREEMD-AR-ANNEEMD-MARS-ANNARIMAGARCH

Mea

n Ab

solu

te P

erce

ntag

e Er

ror

Out-of-sample forecasting resultsDaily frequency

Autoregressive models, no-fundamentals

Page 20: Presentation Machine Learning

Slide 20 of 52ZND/BRL ZAR/PHP0%

1%

2%

3%

4%

5%

6%

RWAR-SVREEMD-AR-SVREEMD-MARS-SVREEMD-AR-ANNEEMD-MARS-ANNARIMAGARCH

Mea

n Ab

solu

te P

erce

ntag

e Er

ror

Autoregressive models, no-fundamentals

Out-of-sample forecasting resultsDaily frequency

Page 21: Presentation Machine Learning

Slide 21 of 52EUR/USD USD/JPY AUD/NOK0.0%

0.1%

0.2%

0.3%

0.4%

0.5%

0.6%

0.7%

0.8%

0.9%

1.0%

RWAR-SVREEMD-AR-SVREEMD-MARS-SVREEMD-AR-ANNEEMD-MARS-ANNARIMAGARCH

Mea

n Ab

solu

te P

erce

ntag

e Er

ror

Structural Models: Fundamentals Included

Out-of-sample forecasting resultsMonthly frequency

Page 22: Presentation Machine Learning

Slide 22 of 52ZND/BRL ZAR/PHP0.0%

0.1%

0.2%

0.3%

0.4%

0.5%

0.6%

0.7%

0.8%

0.9%

RWAR-SVREEMD-AR-SVREEMD-MARS-SVREEMD-AR-ANNEEMD-MARS-ANNARIMAGARCH

Mea

n Ab

solu

t Per

cent

age

Erro

r

Structural Models: Fundamentals Included

Out-of-sample forecasting resultsMonthly frequency

Page 23: Presentation Machine Learning

Slide 23 of 52

Conclusions Best model outperforms the RW model ML: outperforms all econometric methodologies Rejection even of the weak form of efficiency The model captures the different data generating

processes that drive exchange rates in short and long run

Page 24: Presentation Machine Learning

Slide 24 of 52

• Forecast direction: UP or DOWN• Frequency: Daily• Rates: USD/EUR, USD/JPY, USD/GBP and USD/AUD • Data span: January 2, 2013 to December 26, 2013 • Training 200• Test: 51

Application 2: Forecasting Exchange Rates Directionally

Algorithmic Finance, vol. 4 (1-2), pp. 69-79.

Page 25: Presentation Machine Learning

Slide 25 of 52

• Order Flow Analysis• Problem: data availability• Alternative: www.Stocktwits.com• Investor sentiment

Application 2: Forecasting Exchange Rates Directionally

Page 26: Presentation Machine Learning

Slide 26 of 52

Application 2: Forecasting Exchange Rates Directionally

Page 27: Presentation Machine Learning

Slide 27 of 52

Past values of the exchange rate

The volume of “Bearish” and “Bullish” posts per day

The volume of “Bearish”, “Bullish” and total posts per day

Past values of the exchange rate and the volumes of “Bullish” and “Bearish” posts per day

Past values of the exchange rate, volumes of “Bullish”, “Bearish” and total posts per day

Input Variables Sets

Page 28: Presentation Machine Learning

Slide 28 of 52

Application 2: MethodologiesMachine Learning• Artificial Neural Networks• Support Vector Machines• Knn Nearest Neighbors• Boosted Decision Trees

Econometrics• Logistic Regression• Naïve Bayes Classifier• Random Walk

Page 29: Presentation Machine Learning

Slide 29 of 52

Classification: Support Vector Machines

Page 30: Presentation Machine Learning

Φ

Φ-1

Projection to n+i dimensions

Page 31: Presentation Machine Learning

Slide 31 of 52

Support Vector Machines

Page 32: Presentation Machine Learning

Slide 32 of 52USD/EUR USD/JPY USD/GBP USD/AUD0%

10%

20%

30%

40%

50%

60%

70%

80%

RWSVMNaïve BayesKnnAdaboostLogitboostLogistic RegressionANN

Dire

ction

al A

ccur

acy

Forecasting Exchange Rates(Short term– microstructural approach - classification)

Page 33: Presentation Machine Learning

Slide 33 of 52

Conclusions

• The machine learning methodologies outperform the RW model

• Machine Learning techniques forecast more accurately that the econometric methodologies the out-of-sample direction

• Market hype expressed through the volume (total number) of posts improves the total forecasting ability

• Alternative informational path than the order flow analysis

Page 34: Presentation Machine Learning

Slide 34 of 52

Forecasts

• 1-10 years ahead

• 1890-2012• 80% - 20%

Input Variables

• Real GDP per capita• Long and short term interest

rate• Population number• Real asset value• Real construction cost• Unemployment / Inflation• Real oil Prices• Fiscal Policy Indicator

Methodologies

• EEMD – Elastic Net – SVR

• Bayesian AR/VAR

Application 3: Forecasting the Case & Schiller house price index

Economic Modelling, 2015, vol. 45, pp. 259-267.

Page 35: Presentation Machine Learning

Slide 35 of 52Mean Absolute Percentage

ErrorDirectional Accuracy

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

RWRW with driftBARBVARAR-SVREN-SVREEMD-AR-SVREEMD-EN-SVR

Optimum Model

Page 36: Presentation Machine Learning

Slide 36 of 52

1 2 3 4 5 6 7 8 9 100%

5%

10%

15%

20%

25%

RWEEMD-AR-SVR

Forecasting Horizon (Years)

Mea

n Ab

solu

te P

erce

ntag

e Er

ror

Comparison in Alternative Forecasting Horizons

EEMD-AR-SVR vs Random Walk

Page 37: Presentation Machine Learning

Slide 37 of 5219891991

19931995

19971999

20012003

20052007

20092011

95

115

135

155

175

195

True Values

1 period ahead

2 Periods Ahead

The EEMD-AR-SVR and the Collapse of the Housing Market – Actual vs Forecasted

ML model forecasts the sudden drop of house prices in 2006 that sparked the 2007 financial crisis.

Page 38: Presentation Machine Learning

Slide 38 of 52

Conclusions

• ΕΕΜD-SVR forecasts more accurately than all models used in literature the evolution of house prices in out-of-sample forecasting

• The proposed model forecasts almost 2 years ahead the actual 2006 sudden drop in house prices

Page 39: Presentation Machine Learning

Slide 39 of 52

Application 4: Forecasting bank failures and stress testing

• 1443 U.S. Banks • 962 solvent • 481 failed

Data from FDIC (Federal Deposit Insurance Corporation)

Period 2003-2013

The dependent variable• Financial position of a bank (solvent or insolvent)

The independent variables • 144 financial variables and ratios for each bank

Page 40: Presentation Machine Learning

Slide 40 of 52

Variable selection:Local-learning for high-dimensional data (Sun et al. in

2010)

Selected variables:• Tier 1 (core) risk-based capital/total assets year t-1• Provision for loan and lease losses/total interest

income year t-1• Loan loss allowance/total assets year t-1• Total interest expense/total interest income year t-1• Equity capital to assets year t-1

Page 41: Presentation Machine Learning

Slide 41 of 52

Forecasting bank failures and stress testing

t-1 t-2 t-350%55%60%65%70%75%80%85%90%95%

100% 96.67%89.90%

78.13%

• 3 forecasting windows

Page 42: Presentation Machine Learning

Slide 42 of 52

Optimum model: Year t-1

Real Solvent228

Real Failed115

Predicted Solvent 223 3Predicted Failed 5 112Accuracy 97.39% 97.81%

Forecasting accuracy

Solvent Misclassified as failed 5• 4 out of 5 received enforcement action from FDIC• 1 received financial help from the U.S. Treasury as part of the

Capital Purchase Program under the Troubled Asset Relief Program

Failed misclassified as solvent 3• 1 discovered with unreported losses and was closed• 2 small banks with total assets $44.5 & $26.3 million respectively

Page 43: Presentation Machine Learning

Slide 43 of 52

Solvency elasticity – Stress testing

Insolvent Bank Space

Solvent Bank Space

Bank 1

Bank 2

Bank 3

Variable 2

Variable 1

• Banks are mapped on feature space• Distance from the separating hyperplane provides the sensitivity of its

state• This property may be used as an auxiliary stress testing methodology

Page 44: Presentation Machine Learning

Slide 44 of 52

• The behavior of the yield curve is associated with the business cycles.

• Indicator of future economic activity.

Application 5: Yield Curve and Recession Forecasting

Page 45: Presentation Machine Learning

• Forecasting: GDP deviations from trend• positive > Inflationary gaps• negative > Unemployment gaps • Literature: Pairs of interest rates, other variables

Application 5: The Yield Curve and EU Recession Forecasting

International Finance, vol. 18 (2), pp. 207-226

Page 46: Presentation Machine Learning

• Data span: September 2009 - October 2013• Methodologies: SVM vs Probit vs ANN• Variables: Eurocoin and interest rates: 3 & 6

months and 1, 2, 3, 5, 7, 10, 20 years.• We tested:

• 27 interest rates in pairs• 27 interest rates in triplets• All interest rates

Application 5: The Yield Curve and EU Recession Forecasting

International Finance, vol. 18 (2), pp. 207-226

Page 47: Presentation Machine Learning

Application 5: The Yield Curve and EU Recession Forecasting

International Finance, vol. 18 (2), pp. 207-226

A

B

A

B

D

C

• Motivation for interest rate triplets: exploit the curvature• AB same slope• ACB and ADB different curvature (concave vs convex)

Page 48: Presentation Machine Learning

Methodology Kernel Combination Train accuracy (%)

Test accuracy (%)

Growth accuracy (%)

Recession Accuracy (%)

Pairs

Probit 2Υ-10Υ 82,00 65,00 57,00 78,00SVM RBF 2Y-20Y 73,68 78,26 93,00 56,00SVM RBF 3Y-20Y 72,63 78,26 93,00 56,00SVM Polynomial 3M-5Y 78,94 69,56 50,00 100,00SVM Polynomial 1Y-2Y 77,89 69,56 50,00 100,00ΝN   3M-5Y 79,59 65,21 42,85 100,00

Triplets

Probit 1Υ-5Υ-10Υ 74,00 61,00 43,00 89,00SVM RBF 1Y-3Y-20Y 73,68 91,30 86,00 100,00SVM RBF 3M-2Y-20Y 73,68 78,26 64,00 100,00SVM Polynomial 1Y-3Y-7Y 74,73 65,22 50,00 89,00SVM Polynomial 3M-5Y-20Y 71,57 65,22 50,00 89,00ΝN   6M-5Y-7Y 81,63 65,21 57,14 77,78

All interest rates

Probit 83,00 65,00 57,00 78,00SVM Linear 58,94 39,13 0,00 100,00SVM RBF 74,73 69,56 79,00 56,00SVM Polynomial 74,73 56,52 50,00 67,00NN     75,51 69,56 78,57 55,56

Forecasting EU recessions

Page 49: Presentation Machine Learning

Slide 53 of 52

Augmenting with Fundamentals

• Australian dollar/euro• Canadian dollar/euro• Czech corone-euro• Chinese yuan-euro • Dollar-euro• CPI• Oil price• Unemployment• Inflation• Μ1

• Μ2• Μ3• Industrial Index • PPI• External liabilities• External demand

Page 50: Presentation Machine Learning

Slide 54 of 52

1Y-3Y-20Y Exchange rate austr. $/ euro

Μ10

10

20

30

40

50

60

70

80

90

100

73.68

87.369091.3 91.3

95.65

Train Accuracy (%) Test accuracy (%)

Forecasting EU recessions

Page 51: Presentation Machine Learning

Slide 55 of 52

Thank you

AcknowledgmentsThis research has been co-financed by the European Union (European Social Fund – ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.