quant sem 4
7/28/2019 QUANT SEM 4
QUANTITATIVE DATA ANALYSIS
HARSHAD BAJPAI
-
A statistic is a number summarizing a bunch of values.
Simple or univariate statistics summarize values of one variable.
Effect or outcome statistics summarize the relationship between values of two or more variables.
Simple statistics for numeric variables
Mean: the average
Standard deviation: the typical variation
Standard error of the mean: the typical variation in the mean with repeated sampling
Multiply by √(sample size) to convert to standard deviation.
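The three simple statistics above can be sketched in a few lines of Python. The sample values are hypothetical and chosen only for illustration; the sketch assumes the sample standard deviation (n - 1 denominator) is the one intended.

```python
import math
import statistics

# Hypothetical sample values, used only for illustration.
values = [12.0, 15.0, 11.0, 14.0, 13.0]

n = len(values)
mean = statistics.mean(values)   # the average
sd = statistics.stdev(values)    # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(n)          # standard error of the mean

# Multiplying the standard error by sqrt(sample size) recovers the standard deviation.
sd_from_sem = sem * math.sqrt(n)

print(f"mean={mean:.3f} sd={sd:.3f} sem={sem:.3f}")
```

The standard error is smaller than the standard deviation by a factor of √n, which is why the conversion back multiplies by √(sample size).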
-
Simple statistics for nominal variables
Frequencies, proportions, or odds.
Can also use these for ordinal variables.
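As a rough illustration of frequencies, proportions, and odds for a nominal variable, the sketch below uses a made-up list of yes/no responses. The data and variable names are assumptions, not part of the original slides.

```python
from collections import Counter

# Hypothetical nominal data (yes/no survey answers).
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]

counts = Counter(responses)                       # frequencies
n = len(responses)
proportions = {k: c / n for k, c in counts.items()}

# Odds of a category = p / (1 - p), where p is its proportion.
odds = {k: p / (1 - p) for k, p in proportions.items()}

for category in counts:
    print(category, counts[category], proportions[category], round(odds[category], 3))
```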
Effect statistics are the ones that depict the relationship between two or more variables, or how one affects the other. Examples: correlation coefficient, regression, etc.
Test statistics are used to test hypotheses.
-
Common descriptive statistics
Count (frequencies)
Percentage
Mean
Mode
Median
Range
Standard deviation
Variance
Ranking
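Most of the descriptive statistics listed above are available in Python's standard library. The following is a minimal sketch on hypothetical scores; the ranking convention (1 = largest value, ties share the best rank) is an assumption made for the example.

```python
import statistics

# Hypothetical scores, used only for illustration.
scores = [4, 8, 6, 8, 5, 9, 8]

count = len(scores)                               # count (frequency)
pct_eights = 100 * scores.count(8) / count        # percentage of scores equal to 8
mean = statistics.mean(scores)
mode = statistics.mode(scores)                    # most frequent value
median = statistics.median(scores)                # middle value of the sorted data
value_range = max(scores) - min(scores)           # largest minus smallest
sd = statistics.stdev(scores)                     # sample standard deviation
variance = statistics.variance(scores)            # sample variance (sd squared)

# Ranking: 1 = largest score; tied scores share the best rank.
ordered = sorted(scores, reverse=True)
ranks = [ordered.index(s) + 1 for s in scores]

print(count, pct_eights, mean, mode, median, value_range, sd, variance, ranks)
```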
-
Model: numeric vs numeric, e.g. body fat vs sum of skinfolds
Model or test: linear regression
Effect statistics:
slope and intercept = parameters
correlation coefficient or variance explained (= 100 × correlation²) = measures of goodness of fit
Other statistics:
typical or standard error of the estimate = residual error = best measure of validity (with criterion variable on the Y axis)
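The effect statistics named above can be computed directly from the least-squares formulas. This is a sketch only: the skinfold and body-fat numbers are invented, and the standard error of the estimate is assumed to use n - 2 degrees of freedom.

```python
import math

# Hypothetical paired data: x = sum of skinfolds (mm), y = body fat (%).
x = [20.0, 30.0, 40.0, 50.0, 60.0]
y = [10.0, 14.0, 15.0, 21.0, 22.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = sxy / sxx                        # parameter: change in y per unit x
intercept = mean_y - slope * mean_x      # parameter: predicted y at x = 0
r = sxy / math.sqrt(sxx * syy)           # correlation coefficient
variance_explained = 100 * r ** 2        # percent of variance explained

# Standard error of the estimate: typical size of the residuals.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
see = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(slope, intercept, r, variance_explained, see)
```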
-
Correlation
Correlation is a statistical tool which studies the relationship between two variables.
Correlation analysis involves various methods and techniques used for studying and measuring the extent of the relationship between the two variables.
Two variables are said to be in correlation if a change in one of the variables is accompanied, on average, by a change in the other variable.
-
Types of Correlation
There are two important types of correlation.
They are
(1) Positive and Negative correlation and
(2) Linear and Non Linear correlation.
-
Positive and Negative Correlation
If the values of the two variables deviate in the same direction,
i.e. if an increase (or decrease) in the values of one variable results, on average, in a corresponding increase (or decrease) in the values of the other variable, the correlation is said to be positive.
-
Examples
Some examples of series of positive
correlation are:-
(i) Heights and weights;
(ii) Household income and expenditure;
(iii) Price and supply of commodities;
(iv) Amount of rainfall and yield of crops.
-
Negative correlation
Correlation between two variables is said to be negative or inverse if the variables deviate in opposite directions. That is, if an increase (or decrease) in the values of one variable results, on average, in a corresponding decrease (or increase) in the values of the other variable.
-
Examples
Some examples of series of negative
correlation are:
(i) Volume and pressure of a perfect gas;
(ii) Current and resistance [keeping the voltage constant];
(iii) Price and demand of goods.
-
The relationship between two variables is said to be non-linear if, corresponding to a unit change in one variable, the other variable does not change at a constant rate but changes at a fluctuating rate. In such cases, if the data is plotted on a graph sheet we will not get a straight line.
For example, one may have a relation of the form y = a + bx + cx² or a more general polynomial.
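To make the "constant rate" versus "fluctuating rate" distinction concrete, the sketch below compares the change in y per unit change in x for a straight line and for the quadratic form y = a + bx + cx². The coefficient values are hypothetical.

```python
# Hypothetical coefficients, used only for illustration.
a, b, c = 1.0, 2.0, 0.5


def linear(x):
    """Straight-line relation y = a + b*x."""
    return a + b * x


def quadratic(x):
    """Non-linear relation y = a + b*x + c*x**2."""
    return a + b * x + c * x ** 2


xs = range(0, 5)

# Change in y for each unit step in x.
linear_steps = [linear(x + 1) - linear(x) for x in xs]
quadratic_steps = [quadratic(x + 1) - quadratic(x) for x in xs]

print("linear:", linear_steps)        # the same step every time
print("quadratic:", quadratic_steps)  # the step size keeps changing
```

For the line every unit step in x adds b to y, while for the quadratic the step depends on where x currently is, so the plotted points bend away from a straight line.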
-
The Coefficient of Correlation
One of the most widely used statistics is the coefficient of correlation r, which measures the degree of association between the values of two related variables given in the data set. It takes values from +1 to −1. If two sets of data have r = +1, they are said to be perfectly correlated positively; if r = −1 they are said to be perfectly correlated negatively; and if r = 0 they are uncorrelated.
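A minimal sketch of the coefficient of correlation, assuming the usual Pearson product-moment formula is the one meant here. The function name `pearson_r` and the three small data sets are made up to show the three reference values of r.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises in step with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls in step with x
r_zero = pearson_r(x, [2, 1, 0, 1, 2])   # symmetric V shape

print(r_pos, r_neg, r_zero)
```

The third data set has r close to 0 even though y clearly depends on x, which is a reminder that r measures only linear association.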
-
variance
-
In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. In this context, value a is considered more "extreme" than b if a is less likely to occur under the null. One often "rejects the null hypothesis" when the p-value is less than the significance level α (Greek alpha), which is often 0.05 or 0.01. When the null hypothesis is rejected, the result is said to be statistically significant.
-
Null hypothesis H0: μ ≤ 24
Alternative hypothesis Ha: μ > 24
Similarly, other combinations are possible.
-
Errors
Type I error
Rejecting H0 when H0 is true
Type II error
Accepting H0 when Ha is true
-
Level of significance
Level of significance is the probability of
making a Type-I error.
Denoted by the symbol α.
The person doing the hypothesis testing
specifies the value of α.
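One way to see that the level of significance is the probability of a Type I error is a small simulation: draw many samples for which H0 is true, test each at a chosen α, and count how often H0 is wrongly rejected. Everything below is an assumption made for the sketch: an upper-tail z-test with known standard deviation, a hypothesized mean of 24, and arbitrary values for the standard deviation, sample size, and number of trials.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05                     # chosen level of significance
mu0, sigma, n = 24.0, 4.0, 25    # hypothetical population under H0, sample size
trials = 2000

# Upper-tail critical value: reject H0 when z exceeds it.
z_critical = NormalDist().inv_cdf(1 - alpha)

rejections = 0
for _ in range(trials):
    # H0 is true here: the data really come from a population with mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    if z > z_critical:
        rejections += 1          # a Type I error

type_i_rate = rejections / trials
print(f"observed Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```

The observed rate should land near α, with some run-to-run scatter because the number of trials is finite.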
-
Value of alpha (α)
If the cost of making a Type I error is high, lower values of α are preferred.
If the cost is low, higher values are preferred.
For example, in the case of critical components in automobiles, the value of α should be low, to ensure that such errors are not entertained.
-
Standard error
Standard error is the standard deviation of the sample mean across repeated samples.
S.E. = σ / √n
SE is the σ of the mean (x̄); it is called SE to signify how much the mean varies.
-
One tailed test
Lower tail test: left
Upper tail test: right
Z = standard normal variate / test statistic
Z = (x̄ − μ) / σx̄, where σx̄ is the standard error of the mean
Z = −1 means the value of x̄ is one standard error below the mean.
Z = −2 means the value of x̄ is two standard errors below the mean.
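A short worked example of the test statistic, assuming a z-test with known population standard deviation. The numbers (hypothesized mean 24, standard deviation 4, sample of 16, sample mean 23) are hypothetical and picked so that the result is easy to read.

```python
import math

# Hypothetical inputs, used only for illustration.
mu0 = 24.0          # hypothesized population mean
sigma = 4.0         # population standard deviation, assumed known
n = 16              # sample size
sample_mean = 23.0  # observed sample mean

standard_error = sigma / math.sqrt(n)      # standard error of the mean
z = (sample_mean - mu0) / standard_error   # test statistic

print(f"standard error = {standard_error}, z = {z}")
```

Here the standard error is 1, so a sample mean of 23 sits exactly one standard error below the hypothesized mean, giving Z = −1.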
-
Criterion
P-value: it is a probability, computed using z, on which the acceptance and rejection criterion is based.
-
Two approaches
1) p-value approach
2) critical value approach
-
Criteria
Reject H0 if p-value < α (the level of significance).
The p-value is called the observed level of significance.
The p-value calculation depends upon the type of the test.
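The p-value approach and the critical value approach lead to the same decision. The sketch below illustrates this for an upper-tail z-test; the observed test statistic z = 2.0 and α = 0.05 are hypothetical values chosen for the example.

```python
from statistics import NormalDist

alpha = 0.05   # level of significance
z = 2.0        # hypothetical observed test statistic (upper-tail test)

std_normal = NormalDist()  # standard normal distribution

# 1) p-value approach: probability of a test statistic at least as large as z.
p_value = 1 - std_normal.cdf(z)
reject_by_p_value = p_value < alpha

# 2) critical value approach: compare z with the cut-off that leaves
#    an area of alpha in the upper tail.
z_critical = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z > z_critical

print(f"p-value = {p_value:.4f}, z critical = {z_critical:.3f}")
print(reject_by_p_value, reject_by_critical_value)
```

Both comparisons express the same condition in different units: one in probability, the other in units of the test statistic.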
-
Lower tail test
For a lower tail test, the p-value is the probability of obtaining a value of the test statistic at least as small as that provided by the sample.
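For a z test statistic, the lower-tail p-value is the area under the standard normal curve to the left of the observed z. A minimal sketch follows, with hypothetical helper names and an example value of z = −1; the upper-tail version is included only for contrast.

```python
from statistics import NormalDist


def lower_tail_p_value(z):
    """P(Z <= z): probability of a test statistic at least as small as z."""
    return NormalDist().cdf(z)


def upper_tail_p_value(z):
    """P(Z >= z): probability of a test statistic at least as large as z."""
    return 1 - NormalDist().cdf(z)


z = -1.0  # hypothetical observed test statistic
p_lower = lower_tail_p_value(z)
p_upper = upper_tail_p_value(z)

print(f"lower-tail p-value = {p_lower:.4f}")
print(f"upper-tail p-value = {p_upper:.4f}")
```

With z = −1 the lower-tail p-value is about 0.16, so H0 would not be rejected at α = 0.05.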