predict risk 副本

PREDICTING RISK FROM FINANCIAL REPORTS WITH REGRESSION. In Association for Computational Linguistics 2009. Human Language Technologies, pp. 272–280. Authors: Kogan, S., Levin, D., Routledge, B.R., Sagi, J.S., and Smith, N.A. Presenter: Chen QingZhi. Date: 20120315

Page 1: Predict risk   副本

PREDICTING RISK FROM FINANCIAL REPORTS WITH REGRESSION

In Association for Computational Linguistics 2009. Human Language Technologies, pp. 272–280

Authors: Kogan, S., Levin, D., Routledge, B.R., Sagi, J.S., and Smith, N.A.

Presenter: Chen QingZhi

Date: 20120315

Page 2: Predict risk   副本

OUTLINE

1. Introduction

2. Stock Return Volatility

3. Problem Formulation

4. Dataset

5. Baselines and Evaluation Method

6. Experiments

7. Conclusion

Page 3: Predict risk   副本

INTRODUCTION

Page 4: Predict risk   副本

INTRODUCTION

4. Our motivation and task:

1) In the real world, people read a company's financial reports and rely on human experience to predict the financial risk of investing in that company.

2) Given some financial reports, we try to automatically predict a continuous quantity known as stock return volatility, which is a measure of financial risk.

5. The output variable in this work is uncontroversial, and the resources (both text and volatility records) are easy to obtain.

Page 5: Predict risk   副本

STOCK RETURN VOLATILITY

Page 6: Predict risk   副本

STOCK RETURN VOLATILITY

3. Why volatility? We are trying to predict how stable a company's stock price will be over a future time period, in particular one year.

4. Volatility is easier to predict than stock performance, and it is not subject to any kind of human expertise or disagreement.
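As a rough illustration of what is being predicted, the sketch below computes return volatility as the sample standard deviation of daily returns over a window. The simple-return definition, the function names, and the price series are assumptions made for this example; they are not taken from the slides.

```python
import statistics


def daily_returns(prices):
    """Simple returns r_t = P_t / P_{t-1} - 1 for consecutive closing prices."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def return_volatility(prices):
    """Sample standard deviation of the daily returns over the window."""
    return statistics.stdev(daily_returns(prices))


# Hypothetical closing prices, for illustration only.
stable = [100.0, 100.5, 100.2, 100.8, 100.6, 101.0]
jumpy = [100.0, 108.0, 95.0, 104.0, 91.0, 102.0]

# A price that swings more from day to day has higher volatility.
print(return_volatility(stable) < return_volatility(jumpy))
```

Volatility measures only how much the price moves, not in which direction, which is why it can be treated as a measure of risk rather than of performance.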

Page 7: Predict risk   副本

PROBLEM FORMULATION

Page 8: Predict risk   副本

PROBLEM FORMULATION

3. Support vector regression (SVR) is a well-known method for training a regression model. In its objective, C is a regularization constant that controls how heavily training error is penalized.
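The objective that C belongs to is not preserved in this transcript. As a hedged reconstruction, the standard ε-insensitive SVR objective for a linear model f(d; w) = h(d)ᵀw is usually written as below. The symbols h (feature map), N (number of training documents), d_i (documents), v_i (target values) and ε (insensitivity margin) are assumptions for this sketch, not notation taken from the slide.

```latex
\min_{\mathbf{w}} \;\; \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2}
  \;+\; \frac{C}{N} \sum_{i=1}^{N}
  \max\bigl(0,\; \lvert v_i - f(d_i; \mathbf{w}) \rvert - \epsilon \bigr)
```

The first term keeps the weights small, and the second charges only for errors larger than ε, so a larger C puts more emphasis on fitting the training data.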

Page 9: Predict risk   副本

PROBLEM FORMULATION

In terms of a kernel function:

This formula is then used to solve for the weight vector w.
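The kernel formula itself is not preserved in this transcript. In the usual dual formulation, assumed here, the prediction is f(x) = Σ_i α_i K(x, x_i), and with a linear kernel the weight vector can be recovered as w = Σ_i α_i x_i. The sketch below checks that the two views agree; the function names, the α values and the vectors are hypothetical.

```python
def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return sum(a * b for a, b in zip(x, y))


def predict_dual(x, alphas, train_vectors, kernel=linear_kernel):
    """Dual-form prediction f(x) = sum_i alpha_i * K(x, x_i)."""
    return sum(a * kernel(x, xi) for a, xi in zip(alphas, train_vectors))


def primal_weights(alphas, train_vectors):
    """For the linear kernel only: w = sum_i alpha_i * x_i."""
    dim = len(train_vectors[0])
    return [sum(a * xi[j] for a, xi in zip(alphas, train_vectors))
            for j in range(dim)]


# Hypothetical training vectors and dual coefficients, for illustration only.
train_vectors = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
alphas = [0.5, -0.25]

w = primal_weights(alphas, train_vectors)
x = [2.0, 1.0, 1.0]

# Both routes give the same prediction for a linear kernel.
print(predict_dual(x, alphas, train_vectors), linear_kernel(w, x))
```

Having an explicit w matters for a linear model over word features, because each weight can then be read as the contribution of one term.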

Page 10: Predict risk   副本

DATASET

1. Form 10-K: mandated by the US Securities and Exchange Commission.

2. Subsection 7A: quantitative and qualitative disclosures about market risk. We filter out the other sections of the reports and keep this most important part.

3. For various reasons, not all of the documents pass the filter: bankruptcy, delisting …

Page 11: Predict risk   副本

DATASET

Table 1: Dimensions of the dataset used in this paper, after filtering and tokenization.

The near doubling in average document size during 2002–3 is possibly due to the passage of the Sarbanes-Oxley Act of 2002 in the wake of Enron’s accounting scandal (and numerous others).

Page 12: Predict risk   副本

DATASET

REPORT SAMPLE:

The following discussion and analysis of ABC’s consolidated financial condition and consolidated results of operation should be read in conjunction with ABC’s Consolidated Financial Statements and Notes thereto included elsewhere herein. This discussion contains certain forward-looking statements which involve risks and uncertainties. ABC’s actual results could differ materially from the results expressed in, or implied by, such statements. See “Regarding Forward-Looking Statements.”

Page 13: Predict risk   副本

DATASET

Page 14: Predict risk   副本

DATASET

Data preparation:

Tokenization was applied to the text, including punctuation removal, downcasing, collapsing all digit sequences, and heuristic removal of remnant markup.
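The preparation steps listed above can be sketched as a small tokenizer. The regular expressions are assumptions chosen for illustration, including the choice to treat a number with thousands separators as one digit sequence. The "#" placeholder follows the Table 3 caption, where "#" denotes any digit sequence.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")               # remnant markup such as <P>
ENTITY_RE = re.compile(r"&[a-zA-Z0-9#]+;")    # HTML entities such as &nbsp;
DIGITS_RE = re.compile(r"\d+(?:[,.]\d+)*")    # 2002, 1,250 and 3.5 each become one "#"
NON_TOKEN_RE = re.compile(r"[^a-z#\s]")       # anything left that is not a letter or "#"


def tokenize(text):
    """Strip markup, downcase, collapse digit sequences, drop punctuation."""
    text = TAG_RE.sub(" ", text)
    text = ENTITY_RE.sub(" ", text)
    text = text.lower()
    text = DIGITS_RE.sub("#", text)
    text = NON_TOKEN_RE.sub(" ", text)
    return text.split()


# Hypothetical report fragment, for illustration only.
print(tokenize("<P>Net loss was $1,250 in 2002; see Item&nbsp;7A.</P>"))
# → ['net', 'loss', 'was', '#', 'in', '#', 'see', 'item', '#a']
```

Collapsing digits keeps the vocabulary small: every amount and year maps to the same token, so the model learns from the presence of a number rather than from its exact value.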

Page 15: Predict risk   副本

BASELINES AND EVALUATION METHOD

Page 16: Predict risk   副本

BASELINES AND EVALUATION METHOD

Measurement is mean squared error:
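The formula on this slide is not preserved in the transcript. A minimal sketch of the measure follows, assuming the error is taken between the logarithms of the true and predicted volatilities; the log scale, the function name and all numbers below are assumptions for illustration.

```python
import math


def mse_log_volatility(true_vols, predicted_vols):
    """Mean squared error between log true and log predicted volatilities."""
    if len(true_vols) != len(predicted_vols) or not true_vols:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(math.log(v) - math.log(p)) ** 2
               for v, p in zip(true_vols, predicted_vols)]
    return sum(squared) / len(squared)


# Hypothetical volatilities, for illustration only.
true_vols = [0.020, 0.035, 0.050]
model_preds = [0.022, 0.030, 0.055]
baseline_preds = [0.030, 0.020, 0.080]

# Lower is better: the closer predictions yield the smaller error.
print(mse_log_volatility(true_vols, model_preds)
      < mse_log_volatility(true_vols, baseline_preds))
```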

Page 17: Predict risk   副本

EXPERIMENTS

Page 18: Predict risk   副本

EXPERIMENTS

Page 19: Predict risk   副本

EXPERIMENTS

Objective representation:
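The representation shown on this slide is not preserved in the transcript. The Table 3 caption names a LOG1P model with unigrams and bigrams, so the sketch below shows one plausible reading: each feature is log(1 + count) of a unigram or bigram in the document. That definition, together with the function names and the sample tokens, is an assumption made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token sequences, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log1p_features(tokens, use_bigrams=True):
    """Map each unigram (and optionally bigram) to log(1 + its count)."""
    terms = list(tokens)
    if use_bigrams:
        terms += ngrams(tokens, 2)
    return {term: math.log1p(count) for term, count in Counter(terms).items()}


# Hypothetical tokenized fragment, for illustration only.
tokens = ["net", "loss", "of", "#", "and", "net", "loss", "risk"]
features = log1p_features(tokens)
print(round(features["net loss"], 4), round(features["risk"], 4))
```

The logarithm damps very frequent terms, so a word that appears 100 times does not count 100 times as much as a word that appears once.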

Page 20: Predict risk   副本

EXPERIMENTS

Page 21: Predict risk   副本

RESULTS

Table 2: MSE (Eq. 6) of different models on test data predictions. Lower values are better. Boldface denotes improvements over the baseline, and ∗ denotes significance compared to the baseline under a permutation test (p < 0.05).
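The caption does not say how the permutation test was set up. One common variant, sketched here as an assumption, is a paired sign-flip test: for each test document, the model's and the baseline's squared errors are randomly swapped, and the p-value is the share of shuffles whose mean difference is at least as large as the observed one. The function name and the error values are hypothetical.

```python
import random


def paired_permutation_test(errors_a, errors_b, trials=10000, seed=0):
    """Two-sided paired sign-flip permutation test on per-item error differences."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = abs(sum(diffs)) / len(diffs)
    hits = 0
    for _ in range(trials):
        # Swapping the two systems on one item flips the sign of its difference.
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total) / len(diffs) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)


# Hypothetical per-document squared errors, for illustration only.
errors_baseline = [0.30, 0.25, 0.41, 0.22, 0.35, 0.28,
                   0.33, 0.27, 0.38, 0.24, 0.31, 0.29]
errors_model = [0.21, 0.20, 0.30, 0.18, 0.25, 0.22,
                0.24, 0.21, 0.29, 0.19, 0.23, 0.22]

p_value = paired_permutation_test(errors_model, errors_baseline)
print(p_value < 0.05)
```

The test makes no distributional assumption about the errors, which suits squared errors that are far from normally distributed.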

Page 22: Predict risk   副本

EFFECTS OF SARBANES-OXLEY

The Sarbanes-Oxley Act of 2002, which sought to reform financial reporting, had a clear effect on informativity.

Page 23: Predict risk   副本

RECENT DATA IS MORE IMPORTANT

Page 24: Predict risk   副本

INTERPRET WEIGHTS

Table 3: Most strongly-weighted terms in models learned from various time periods (LOG1P model with unigrams and bigrams). “#” denotes any digit sequence.

Page 25: Predict risk   副本

EXPERIMENTS

Page 26: Predict risk   副本
Page 27: Predict risk   副本

EXPERIMENTS

For example: “estimates”, which averages one occurrence per document even in the 1996–2000 period, experiences the same term frequency explosion and goes through a similar weight change, from strongly indicating high volatility to strongly indicating low volatility.

Page 28: Predict risk   副本

CONCLUSION

1. A testbed for NLP.

2. How much improvement does this paper present?

3. Interpretable weights can show how commendatory or derogatory terms have changed.

4. The approach can compare the information released by different texts.