quant sem 4

7/28/2019 QUANT SEM 4

QUANTITATIVE DATA ANALYSIS

HARSHAD BAJPAI

A statistic is a number summarizing a bunch of values.

Simple or univariate statistics summarize values of one variable.

Effect or outcome statistics summarize the relationship between values of two or more variables.

Simple statistics for numeric variables

Mean: the average

Standard deviation: the typical variation

Standard error of the mean: the typical variation in the mean with repeated sampling

Multiply by √(sample size) to convert to standard deviation.
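
The three statistics above can be sketched in a few lines of Python. The sample values are made up for illustration, and the sample standard deviation (n − 1 in the denominator) is assumed.

```python
import math
import statistics

# Hypothetical sample of five measurements (illustrative values only).
sample = [2.0, 4.0, 4.0, 5.0, 10.0]

n = len(sample)
mean = statistics.mean(sample)   # the average
sd = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(n)          # standard error of the mean

# Multiplying the standard error by sqrt(n) recovers the standard deviation.
sd_from_sem = sem * math.sqrt(n)

print(mean, sd, sem, sd_from_sem)
```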

Simple statistics for nominal variables

Frequencies, proportions, or odds.

Can also use these for ordinal variables.
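
As a rough sketch, frequencies, proportions and odds for a nominal variable could be computed as follows. The responses are hypothetical, and the odds of a category are taken as the count in that category divided by the count outside it.

```python
from collections import Counter

# Hypothetical nominal data: three "yes" answers and one "no".
responses = ["yes", "no", "yes", "yes"]

frequencies = Counter(responses)   # count per category
total = len(responses)
proportions = {category: count / total for category, count in frequencies.items()}

# Odds of "yes" = number of "yes" / number of "not yes".
odds_yes = frequencies["yes"] / (total - frequencies["yes"])

print(frequencies, proportions, odds_yes)
```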

Effect statistics are the ones that depict the relationship between two or more variables, or how one affects the other, e.g. correlation coefficient, regression, etc.

Test statistics are used to test hypotheses.

Common descriptive statistics

Count (frequencies)

Percentage

Mean

Mode

Median

Range

Standard deviation

Variance

Ranking
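
The descriptive statistics listed above can be illustrated with Python's standard library. The data are hypothetical, "ranking" is taken here simply as ordering the values from largest to smallest, and the sample (n − 1) versions of variance and standard deviation are assumed.

```python
import statistics
from collections import Counter

# Hypothetical data set (illustrative values only).
data = [2, 4, 4, 5, 10]

count = len(data)                                  # count
frequencies = Counter(data)                        # frequency of each value
percentage_of_4 = 100 * frequencies[4] / count     # percentage of values equal to 4
mean = statistics.mean(data)
mode = statistics.mode(data)
median = statistics.median(data)
value_range = max(data) - min(data)                # range
variance = statistics.variance(data)               # sample variance
sd = statistics.stdev(data)                        # sample standard deviation
ranking = sorted(data, reverse=True)               # values ranked from largest to smallest

print(count, percentage_of_4, mean, mode, median, value_range, variance, sd, ranking)
```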

Model: numeric vs numeric, e.g. body fat vs sum of skinfolds

Model or test: linear regression

Effect statistics:

slope and intercept = parameters

correlation coefficient or variance explained (= 100 × correlation²) = measures of goodness of fit

Other statistics:

typical or standard error of the estimate = residual error = best measure of validity (with criterion variable on the Y axis)
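
A minimal sketch of these regression statistics, computed by ordinary least squares. The x and y values are illustrative numbers, not real skinfold or body-fat measurements, and the standard error of the estimate is assumed to use n − 2 degrees of freedom.

```python
import math

# Illustrative paired data (x = predictor, y = criterion); not real measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = sxy / sxx                       # parameter
intercept = mean_y - slope * mean_x     # parameter
r = sxy / math.sqrt(sxx * syy)          # correlation coefficient
variance_explained = 100 * r ** 2       # percent of variance explained

# Standard error of the estimate: typical size of the residuals.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
see = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(slope, intercept, r, variance_explained, see)
```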

Correlation

Correlation is a statistical tool for studying the relationship between two variables.

Correlation analysis involves various methods and techniques used for studying and measuring the extent of the relationship between the two variables.

Two variables are said to be correlated if a change in one of the variables is accompanied by a change in the other variable.

Types of Correlation

There are two important types of correlation.

They are

(1) Positive and Negative correlation and

(2) Linear and Non Linear correlation.

Positive and Negative Correlation

If the values of the two variables deviate in the same direction, i.e. if an increase (or decrease) in the values of one variable is accompanied, on average, by a corresponding increase (or decrease) in the values of the other variable, the correlation is said to be positive.

Examples

Some examples of series with positive correlation are:

(i) Heights and weights;

(ii) Household income and expenditure;

(iii) Price and supply of commodities;

(iv) Amount of rainfall and yield of crops.

Negative correlation

Correlation between two variables is said to be negative or inverse if the variables deviate in opposite directions, that is, if an increase (or decrease) in the values of one variable is accompanied, on average, by a corresponding decrease (or increase) in the values of the other variable.

Examples

Some examples of series with negative correlation are:

(i) Volume and pressure of a perfect gas;

(ii) Current and resistance (keeping the voltage constant);

(iii) Price and demand of goods.

The relationship between two variables is said to be non-linear if, corresponding to a unit change in one variable, the other variable does not change at a constant rate but changes at a varying rate. In such cases, if the data are plotted on a graph sheet, we will not get a straight line.

For example, one may have a relation of the form y = a + bx + cx², or a more general polynomial.
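
To make the "constant rate" idea concrete, the sketch below compares the change in y per unit change in x for a straight line and for the quadratic form above. The coefficients a, b and c are arbitrary illustrative choices.

```python
def linear(x, a=1.0, b=2.0):
    """Straight line y = a + b*x."""
    return a + b * x


def quadratic(x, a=1.0, b=2.0, c=0.5):
    """Non-linear relation y = a + b*x + c*x**2."""
    return a + b * x + c * x ** 2


xs = [0, 1, 2, 3, 4]

# Change in y for each unit step in x.
linear_steps = [linear(x + 1) - linear(x) for x in xs]
quadratic_steps = [quadratic(x + 1) - quadratic(x) for x in xs]

print(linear_steps)      # the same change at every step
print(quadratic_steps)   # the change differs from step to step
```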

The Coefficient of Correlation

One of the most widely used statistics is the coefficient of correlation r, which measures the degree of association between the paired values of two related variables in the data set. It takes values from +1 to −1. If two sets of data have r = +1, they are said to be perfectly positively correlated; if r = −1, they are said to be perfectly negatively correlated; and if r = 0, they are uncorrelated.
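
The three reference values of r can be checked with a small sketch. It assumes r is the Pearson product-moment coefficient (the slides do not name the formula), and the data series are made up so that each case is hit exactly.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]

r_pos = pearson_r(x, [2, 4, 6, 8, 10])    # y rises in step with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])    # y falls in step with x
r_zero = pearson_r(x, [2, 1, 0, 1, 2])    # symmetric V shape: no linear association

print(r_pos, r_neg, r_zero)
```

Note that the last series is clearly related to x, just not linearly, so r = 0 here means only that there is no linear association.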

variance

In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. In this context, value a is considered more "extreme" than b if a is less likely to occur under the null. One often "rejects the null hypothesis" when the p-value is less than the significance level α (Greek alpha), which is often 0.05 or 0.01. When the null hypothesis is rejected, the result is said to be statistically significant.
Null hypothesis H0: μ ≤ 24

Alternative hypothesis Ha: μ > 24

Similarly for other combinations.

Errors

Type I error: rejecting H0 when H0 is true

Type II error: accepting H0 when Ha is true

Level of significance

Level of significance is the probability of making a Type I error.

Denoted by the symbol α.

The person doing the hypothesis testing specifies the value of α.
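
One way to see that α is the Type I error probability is a small simulation: draw many samples for which H0 is actually true, test each one at level α, and count how often H0 is rejected. Everything below is an assumption made for illustration: the population values (μ0 = 24, σ = 4), the sample size, the number of trials, and the use of an upper-tail z test with known σ.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05                     # chosen level of significance
mu0, sigma, n = 24.0, 4.0, 25    # hypothetical population under H0, and sample size
trials = 2000
std_normal = NormalDist()        # standard normal distribution

rejections = 0
for _ in range(trials):
    # H0 is true here: the data really come from a population with mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    p_value = 1 - std_normal.cdf(z)   # upper-tail p-value
    if p_value < alpha:
        rejections += 1               # rejecting a true H0 is a Type I error

type_i_rate = rejections / trials
print(type_i_rate)  # should land close to alpha
```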

Value of alpha

If the cost of making a Type I error is high, lower values of α are preferred.

If the cost is low, higher values are preferred.

For example, for critical components in automobiles, the value of α should be low, so that such errors are not tolerated.

Standard error

Standard error is the standard deviation of the sample mean (not of the sample itself).

SE = σ / √n

SE is the σ of the mean (X̄); it is called SE to signify how much the mean varies from sample to sample.
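
The formula can be checked empirically: if many samples of size n are drawn from the same population, the standard deviation of their means should come out close to σ / √n. The population values, sample size and number of samples below are hypothetical choices for the sketch.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma, n = 50.0, 10.0, 25    # hypothetical population and sample size
num_samples = 2000

theoretical_se = sigma / math.sqrt(n)   # SE = sigma / sqrt(n)

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(num_samples)
]

# The spread of the sample means is what the standard error describes.
empirical_se = statistics.stdev(sample_means)

print(theoretical_se, empirical_se)
```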

One-tailed test

Lower-tailed test: left

Upper-tailed test: right

Z = standard normal variate / test statistic

Z = (x̄ − μ) / σx̄, where σx̄ is the standard error of the mean

Z = −1 means the value of x̄ is one standard error below the mean.

Z = −2 means the value of x̄ is two standard errors below the mean.
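
A short sketch of the z calculation, using SE = σ / √n as the denominator. The function name z_statistic and the numbers passed to it are hypothetical, chosen so that the standard error is exactly 2.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors by which the sample mean differs from mu0."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical values: sigma = 10 and n = 25 give a standard error of 2.
z_one_below = z_statistic(22.0, 24.0, 10.0, 25)   # mean is one SE below mu0
z_two_below = z_statistic(20.0, 24.0, 10.0, 25)   # mean is two SEs below mu0

print(z_one_below, z_two_below)
```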

Criterion

P-value: it is a probability, computed using z, that provides the criterion for accepting or rejecting H0.

Two approaches

1) p-value approach

2) critical value approach
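
The two approaches lead to the same decision, which can be sketched for an upper-tail z test. The observed z and the level α are hypothetical. The critical value is taken here as the z value that leaves an area α in the upper tail; the slides do not spell this out, so treat it as an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal distribution

alpha = 0.05   # level of significance (assumed)
z = 2.0        # hypothetical observed test statistic, upper-tail test

# 1) p-value approach: reject H0 if the upper-tail probability is below alpha.
p_value = 1 - std_normal.cdf(z)
reject_by_p_value = p_value < alpha

# 2) critical value approach: reject H0 if z lies beyond the cut-off
#    that leaves an area of alpha in the upper tail.
z_critical = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z > z_critical

print(p_value, z_critical, reject_by_p_value, reject_by_critical_value)
```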

Criteria

Reject H0 if p-value < α (level of significance)

P-value is called the observed level of significance

P-value calculation depends upon the type of the test

Lower tail test

For a lower tail test, the p-value is the probability of obtaining a value of the test statistic at least as small as that provided by the sample.
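
For a z test statistic, this lower-tail probability is the area under the standard normal curve to the left of the observed z. A minimal sketch, with a hypothetical observed z and α, and a helper name (lower_tail_p_value) introduced only for illustration:

```python
from statistics import NormalDist


def lower_tail_p_value(z):
    """Probability of a standard normal value at least as small as z."""
    return NormalDist().cdf(z)


alpha = 0.05                         # level of significance (assumed)
p_value = lower_tail_p_value(-2.0)   # hypothetical observed z of -2
reject_h0 = p_value < alpha          # reject H0 if p-value < alpha

print(p_value, reject_h0)
```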
