summary of stats
-
7/27/2019 summary of stats
-
CORRELATION
A statistical technique used to analyse the strength and direction of the relationship between two quantitative variables is called correlation analysis.
Two variables are said to be correlated if a change in one variable is accompanied by a change in the other variable.
E.g.: 1) Frequency of smoking and lung damage,
2) Sales revenue and expenses incurred on advertising.
-
Importance of correlation
If variables are linearly related to each other, then correlation helps in estimating one from the other.
Advertisement and sales
Prices and demand
We use regression analysis to find the value of one variable from the other.
-
TYPES OF CORRELATION
POSITIVE AND NEGATIVE
LINEAR AND NON-LINEAR
SIMPLE, PARTIAL AND MULTIPLE
-
POSITIVE CORRELATION AND NEGATIVE CORRELATION
POSITIVE CORRELATION
If the variables vary in the same direction, the correlation is said to be POSITIVE.
If one variable increases, the other also increases; if one variable decreases, the other also decreases.
NEGATIVE CORRELATION
If the variables vary in opposite directions, the correlation is said to be NEGATIVE.
If one variable increases, the other decreases; if one decreases, the other increases.
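The direction of a relationship can be checked numerically. The sketch below is only an illustration: the function name and the sample figures are hypothetical, and it uses the sign of the sum of products of deviations from the means as a simple indicator of direction.

```python
def correlation_direction(xs, ys):
    """Return 'positive', 'negative', or 'none' from the sign of the co-variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means: positive when the
    # variables tend to move together, negative when they move oppositely.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_variation > 0:
        return "positive"
    if co_variation < 0:
        return "negative"
    return "none"


# Hypothetical data: advertising spend vs. sales, and price vs. demand.
ad_spend = [10, 20, 30, 40, 50]
sales = [15, 25, 32, 48, 60]
price = [5, 6, 7, 8, 9]
demand = [100, 90, 75, 60, 40]

print(correlation_direction(ad_spend, sales))
print(correlation_direction(price, demand))
```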
-
LINEAR CORRELATION AND NON-LINEAR CORRELATION
LINEAR CORRELATION
If the change in one variable tends to bear a constant ratio to the change in the other variable, then the correlation is said to be LINEAR.
NON-LINEAR CORRELATION
If the change in one variable does not bear a consistent ratio to the change in the other variable, then the correlation is said to be NON-LINEAR.
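The "constant ratio" idea can be made concrete by comparing successive changes. This is a minimal sketch with hypothetical data (y = 2x + 1 for the linear case and y = x^2 for the non-linear case); the helper name is an assumption made for illustration.

```python
def change_ratios(xs, ys):
    """Ratio of the change in y to the change in x between successive points."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]


xs = [1, 2, 3, 4, 5]
linear_ys = [3, 5, 7, 9, 11]      # hypothetical: y = 2x + 1
nonlinear_ys = [1, 4, 9, 16, 25]  # hypothetical: y = x squared

print(change_ratios(xs, linear_ys))     # the same ratio at every step
print(change_ratios(xs, nonlinear_ys))  # the ratio keeps changing
```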
-
SIMPLE, PARTIAL AND MULTIPLE CORRELATION
When only two variables are involved, it is simple correlation.
When three or more variables are involved, we can compute either partial or multiple correlation.
-
Methods of correlation
Graphic
1. Scatter diagram
Algebraic
1. Karl Pearson's method
2. Rank method
-
Scatter Diagram
A scatter diagram is a graph or chart which helps to determine whether there is a relationship between two variables by examining the graph of the observed data.
A scatter diagram can give us two types of information:
Patterns that indicate that the variables are related.
If the variables are related, what kind of line, or estimating equation, describes this relationship.
-
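KARL PEARSON'S COEFFICIENT OF CORRELATION
Karl Pearson's coefficient of correlation is denoted by r.
The coefficient of correlation r measures the degree of linear relationship between two variables, say x and y.
$$ r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N \sum d_x^2 - \left(\sum d_x\right)^2}\,\sqrt{N \sum d_y^2 - \left(\sum d_y\right)^2}} $$
where N is the number of paired observations and d_x, d_y are the deviations of x and y from their assumed means.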
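A minimal sketch of the calculation, assuming dx and dy are deviations of x and y from arbitrary assumed means (the function name, parameter names, and data are hypothetical). The result does not depend on which assumed means are chosen.

```python
import math


def pearson_r(xs, ys, assumed_x=0.0, assumed_y=0.0):
    """Karl Pearson's r computed from deviations about assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx = sum(dx)
    sum_dy = sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_default = pearson_r(xs, ys)                            # assumed means of 0
r_shifted = pearson_r(xs, ys, assumed_x=3, assumed_y=4)  # different assumed means
print(round(r_default, 4), round(r_shifted, 4))          # both about 0.7746
```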
-
Interpretation of Correlation Coefficient (r)
The value of the correlation coefficient r ranges from -1 to +1.
If r = +1, then the correlation between the two variables is said to be perfect and positive.
If r = -1, then the correlation between the two variables is said to be perfect and negative.
If r = 0, then there exists no linear correlation between the variables.
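The three cases can be expressed as a small helper. This is only a sketch: the function name is hypothetical, and the labels returned for values strictly between -1 and +1 are an illustrative assumption rather than part of the slides.

```python
import math


def interpret_r(r):
    """Describe a correlation coefficient in words."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if math.isclose(r, 1.0):
        return "perfect positive correlation"
    if math.isclose(r, -1.0):
        return "perfect negative correlation"
    if r == 0:
        return "no linear correlation"
    # Intermediate values: these labels are an assumption for illustration.
    return "positive correlation" if r > 0 else "negative correlation"


print(interpret_r(1.0))
print(interpret_r(-0.4))
```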
-
REGRESSION
The statistical technique that expresses the relationship between two or more variables in the form of an equation, in order to estimate the value of one variable based on the given value of another variable, is called regression analysis.
E.g.: Profit of a firm estimated from its sales.
-
Difference between dependent variable
and independent variable
Independent Variable
1. The known variable is called
the independent variable.
2. What we typically call X.
3. Variable that is controlled or
manipulated.
4. It is plotted on horizontal axis.
5. An input variable.
Dependent Variable
1. The variable we are trying to
predict is the dependent
variable.
2. What we typically call Y.
3. Variable that cannot be
controlled or manipulated.
4. It is plotted on vertical axis.
5. An output variable.
-
Difference between Regression
and Correlation
Regression
A statistical method used to describe the nature of the relationship.
In linear regression analysis, one variable is considered the dependent variable and the other the independent variable.
Correlation
A statistical method used to determine whether a relationship between two or more variables exists.
In correlation analysis, we examine the degree of association between two variables.
-
Advantages of Regression Analysis
It helps in developing a regression equation
by which the value of a dependent variable
can be estimated given a value of an
independent variable.
It helps to determine standard error of
estimate to measure the variability or spread
of values of a dependent variable with
respect to the regression line.
-
Estimation using the Regression Line
The equation for a straight line where the dependent variable Y is determined by the independent variable X is:
Y = a + bX
where
a = y-intercept
b = slope of the line
Y = value of dependent variable
X = value of independent variable
-
THE METHOD OF LEAST SQUARES
It is a method of fitting a line that minimizes the sum of the squared errors between the estimated points on the line and the actual points that were used to draw it.
In this method, Y represents the individual values of the observed points measured along the Y-axis, and Ŷ (Y-hat) symbolizes the individual values of the estimated points.
The estimated line is:
Ŷ = a + bX
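A minimal sketch of fitting the estimated line. The slides do not show the formulas for a and b, so this assumes the standard closed-form least-squares solution: b is the sum of products of deviations divided by the sum of squared deviations of X, and a is the mean of Y minus b times the mean of X. Function names and data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y-hat = a + b*X that minimizes squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx_dev = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy_dev / sum_xx_dev  # slope
    a = mean_y - b * mean_x      # y-intercept
    return a, b


def estimate(a, b, x):
    """Estimate the dependent variable for a given value of X."""
    return a + b * x


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(round(a, 2), round(b, 2))      # intercept about 2.2, slope about 0.6
print(round(estimate(a, b, 6), 2))   # estimate for X = 6, about 5.8
```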
-
COEFFICIENT OF DETERMINATION
A convenient way of interpreting the value of the correlation coefficient is to use its square, which is called the Coefficient of Determination.
The Coefficient of Determination is r^2:
$$ r^2 = 1 - \frac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2} $$
where Ŷ is the value estimated from the regression line and Ȳ is the mean of the observed Y values.
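A short sketch of the calculation. The observed values are hypothetical, and the estimated values are assumed to come from an illustrative fitted line Ŷ = 2.2 + 0.6X at X = 1 to 5.

```python
def coefficient_of_determination(ys, y_hats):
    """r squared = 1 - (unexplained variation / total variation)."""
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total


# Hypothetical observed values and estimates from Y-hat = 2.2 + 0.6X.
ys = [2, 4, 5, 4, 5]
y_hats = [2.8, 3.4, 4.0, 4.6, 5.2]

r_squared = coefficient_of_determination(ys, y_hats)
print(round(r_squared, 2))  # about 0.6
```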
-
STANDARD ERROR OF ESTIMATE
The standard error of estimate measures the variability, or scatter, of the observed values around the regression line.
It is given by:
$$ S_e = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}} $$
If Se = 0, the estimating equation is expected to be a perfect estimator of the dependent variable.
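The formula can be sketched as follows. The data are hypothetical, with the estimates assumed to come from an illustrative line Ŷ = 2.2 + 0.6X.

```python
import math


def standard_error_of_estimate(ys, y_hats):
    """Se = sqrt(sum of squared errors / (n - 2))."""
    n = len(ys)
    if n <= 2:
        raise ValueError("at least three observations are needed")
    squared_errors = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    return math.sqrt(squared_errors / (n - 2))


# Hypothetical observed values and estimates from Y-hat = 2.2 + 0.6X.
ys = [2, 4, 5, 4, 5]
y_hats = [2.8, 3.4, 4.0, 4.6, 5.2]

se = standard_error_of_estimate(ys, y_hats)
print(round(se, 4))  # about 0.8944
```

When every observed value lies exactly on the line, the function returns 0, matching the perfect-estimator case.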
-
WHAT DOES TIME-SERIES MEAN?
A time series is a sequence of data points, typically measured at successive points in time spaced at uniform intervals.
A time series is a set of measurements of a variable that are ordered through time.
Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data.
-
DIFFERENCE WITH REGRESSION ANALYSIS
Time Series Analysis
Time series forecasting is the use of a model to predict future values based on previously observed values.
It shows or suggests periodicity in the data, such as seasonal and cyclical effects.
Regression Analysis
Regression analysis is often employed to test theories that the current value of one time series affects the current value of another time series.
Regression analysis cannot explain seasonal and cyclical effects.
-
COMPONENTS OF TIME SERIES
SECULAR TREND
CYCLICAL VARIATIONS
SEASONAL VARIATIONS
IRREGULAR VARIATIONS
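To see how the four components can combine, here is a rough sketch that builds a synthetic monthly series. It assumes an additive combination (trend + cyclical + seasonal + irregular), which the slides do not state, and every number in it is hypothetical.

```python
import math


def synthetic_series(n_months):
    """Build a monthly series as trend + cyclical + seasonal + irregular."""
    series = []
    for t in range(n_months):
        trend = 100 + 2 * t                               # steady long-term rise
        cyclical = 10 * math.sin(2 * math.pi * t / 60)    # assumed 5-year cycle
        seasonal = 15 * math.sin(2 * math.pi * t / 12)    # repeats every 12 months
        irregular = -30.0 if t == 30 else 0.0             # one-off shock in month 30
        series.append(trend + cyclical + seasonal + irregular)
    return series


series = synthetic_series(72)  # six years of hypothetical monthly data
print(len(series), round(series[0], 1), round(series[30], 1))
```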
-
[Figure: Upward trend of sales of laptops in Ranchi; x-axis: years (2000 to 2007), y-axis: units (2,000 to 10,000)]
-
[Figure: Declining trend of using landline phones in India; x-axis: years (2000 to 2011), y-axis: units in thousands (30 to 180)]
-
CYCLICAL VARIATION
Cyclical variations are long-term movements that represent consistently recurring rises and declines in activity.
Timing is the most important factor affecting cyclical variations.
For example, the business cycle consists of recurring up and down movements of business activity.
-
[Figure: Cyclical variation (business cycle); x-axis: time, y-axis: economic activities; phases labelled prosperity or boom, depression, prosperity]
-
SEASONAL VARIATION
Seasonal variations are those periodic movements in business activity which occur regularly every year.
Since these variations repeat over a period of twelve months, they can be predicted fairly accurately.
Seasonal variations are caused by climate and weather conditions, customs, festivals and habits.
For example, sales of cold drinks are higher in the summer than in any other season.
-
[Figure: Sales of cold drinks; x-axis: years (2000 to 2006), y-axis: units (10,000 to 20,000)]
-
IRREGULAR VARIATION
Irregular variations refer to variations in business activity which do not repeat in a definite pattern.
In this type of variation, the pattern of the variable is unpredictable.
Irregular variations are caused by unpredictable factors such as natural disasters (earthquakes, floods) and wars. These are unpredictable and no one has control over them.
For example, production of cars fell sharply after the earthquake in Japan in March 2011.
-
[Figure: Production of cars in Japan; x-axis: years (2005 to 2011), y-axis: units (100,000 to 350,000)]
-
Thank you