session pdf

Upload: mahi-prince

Post on 02-Jun-2018

TRANSCRIPT

  • 8/10/2019 Session PDF

    1/41

    Agenda for Coming Sessions

    Analytics with Probabilistic Decision Making Model
    Introduction to Logistic Regression
    Decision Theory
    Non Linear Models
    Business Analytics and Application
    Sentiment Analysis and Opinion Mining
    Online Business Channel and Web Analytics
    Analytics in Marketing
    Introduction to Markov Analysis
    Markov Decision Process
    Poisson Process Models

    Agenda for Coming Sessions (continued)

    Product Development
    Introduction to Life Cycle Cost
    Total Cost of Ownership
    Analytics in Operations
    Introduction to Six Sigma for Problem Solving
    Analytics in Finance
    Brownian Process
    Asset Performance Measure

    Case Study

    7i technology is a medium-sized consulting firm in San Francisco that specializes in developing forecasts of product demand, sales, consumption, or other information for its clients. To a lesser degree, it has also developed ongoing models for internal use by client companies. When contacted by a potential client, 7i technology usually establishes a basic agreement with the firm's top management that sets out the general goals of the end product, the primary contact personnel in both firms, and an outline of the project overall (including any necessary time constraints for intermediate and final completion and a rough price estimate for the contract). Following this step, a team of 7i personnel is assembled to determine the most appropriate forecasting technique and to develop a more detailed work program to be used as the basis for final contract negotiations. This team, which varies in size according to the scope of the project and the client's needs, performs the tasks established in the work program in conjunction with any personnel from the client firm who are included in the team.

    Case Study (continued)

    Recently, 7i has been contacted by a rapidly growing multinational firm that manufactures and sells Android-based tablets for enterprise and retail use. Honeycomb has been aggressive in global and regional markets and is in the process of defining a new strategy to increase its present market share. The problem the company presently faces is forecasting demand, so that it can offer a competitive price and increase its market share.

    As a Business Analyst of 7i, you must decide between different forecasting approaches: the linear trend equation and the naive forecast. The linear trend equation is Yi = 12 + 2x, and it was developed using data from periods 1 through 10. Based on the data from periods 11 through 20, calculate the MPE (Mean Percentage Error) and MAPE (Mean Absolute Percentage Error) for each method.

    Based on the values of MPE and MAPE, comment on which of the two methods has the greater overall accuracy. Compare the two methods in terms of forecast bias.

    Data

    [Table: periods 11 through 20 with their observed values; the values are not legible in the source.]

    Since the MAPE values for the two methods are approximately equal, the overall accuracy of the two methods is about the same. Both methods are predicting approximately 11% away from actual.

    The MPE is negative for the linear trend equation: the trend equation is overestimating sales by 7.91%.

    On the other hand, the MPE is positive for the naive forecasting method: it is underestimating sales by 7.66%.
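A minimal sketch of how MPE and MAPE could be computed for two competing forecasts, using Error = Actual - Forecast. The function names and the actual values below are hypothetical (they are not the case-study data); only the trend equation 12 + 2x comes from the case, and treating x as the period number is an assumption.

```python
def mpe(actual, forecast):
    """Mean percentage error: mean of (A - F) / A, in percent (keeps the sign)."""
    terms = [(a - f) / a for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


def mape(actual, forecast):
    """Mean absolute percentage error: mean of |A - F| / |A|, in percent."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical actual values (NOT the case-study data).
periods = [11, 12, 13, 14, 15]
actual = [32.0, 38.0, 36.0, 42.0, 40.0]
previous_actual = 30.0  # hypothetical actual for period 10

# Linear trend forecast from the case: 12 + 2x, assuming x is the period number.
trend_forecast = [12 + 2 * x for x in periods]

# Naive forecast: each period's forecast is the previous period's actual.
naive_forecast = [previous_actual] + actual[:-1]

for name, forecast in [("trend", trend_forecast), ("naive", naive_forecast)]:
    print(name,
          "MPE:", round(mpe(actual, forecast), 2),
          "MAPE:", round(mape(actual, forecast), 2))
```

A negative MPE indicates forecasts that run too high on average, and a positive MPE indicates forecasts that run too low. MAPE ignores the sign and measures only the size of the miss, so it can never be smaller than the absolute value of MPE.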

    Accuracy and Control

    Forecast Errors

    Forecast error is the difference between the value that occurs and the value that was predicted for a given time period.

    Error = Actual - Forecast

    Positive errors result when the forecast is too low, and negative errors result when the forecast is too high.

    Reasons for Forecasting Errors

    The model may be inadequate due to (a) the omission of an important variable, (b) a change or shift in the variable that the model cannot deal with (e.g., sudden appearance of a trend or cycle), or (c) the appearance of a new variable (e.g., a new competitor).

    Irregular variations may occur due to severe weather or other natural phenomena, temporary shortages or breakdowns, catastrophes, or similar events.

    The forecasting technique may be used incorrectly, or the results may be misinterpreted.

    There are random variations in the data. Randomness is the inherent variation that remains in the data after all causes of variation have been accounted for.

    Types of Forecasting Accuracy

    Mean Absolute Deviation (MAD)
    Definition: measures the average forecast error over a number of periods, without regard to the sign of the error; for computation, all errors are treated as positive.

    Mean Squared Error (MSE)
    Definition: the average squared error experienced over a number of periods.

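    Formula

    $$ MAD = \frac{\sum |e|}{n} = \frac{\sum |A - F|}{n} $$

    $$ MSE = \frac{\sum e^2}{n - 1} = \frac{\sum (A - F)^2}{n - 1} $$

    Here A is the actual value, F is the forecast, e = A - F is the error, and n is the number of periods.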

    Continued

    The MSE is computed like a variance: n - 1 is used in its denominator instead of n for essentially the same reason that n - 1 is used to compute a sample standard deviation.

    Difference Between the Two Measures

    Because the errors are squared, the MSE penalizes large errors more than the MAD measure.
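The difference between the two measures can be sketched on a small series containing one large miss. The data and function names are hypothetical; the MSE uses n - 1 in the denominator, as described in this section.

```python
def mad(actual, forecast):
    """Mean absolute deviation: sum of |A - F| divided by n."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(abs(e) for e in errors) / len(errors)


def mse(actual, forecast):
    """Mean squared error with n - 1 in the denominator."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / (len(errors) - 1)


# Hypothetical data: four small errors (+/-1) and one large error (+10).
actual = [10, 12, 14, 16, 30]
forecast = [11, 11, 15, 15, 20]

mad_value = mad(actual, forecast)  # (1 + 1 + 1 + 1 + 10) / 5 = 2.8
mse_value = mse(actual, forecast)  # (1 + 1 + 1 + 1 + 100) / 4 = 26.0

print("MAD:", mad_value)
print("MSE:", mse_value)
```

The single error of 10 contributes 10 of the 14 units in the MAD numerator but 100 of the 104 units in the MSE numerator, which is why the MSE reacts more strongly to occasional large misses.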

    Monitoring and Controlling the Forecast

    Is it time to reexamine the validity of the forecasting technique being used?

    There are two types of errors:
    The first is random errors, which are inherent and cannot be removed from the model.
    The second is non-random errors, which can be eliminated.

    How to eliminate such errors?
    Modifying the technique
    Improving data collection

    Forecast Error

    Mean Forecast Error (MFE)

    $$ MFE = \frac{\sum (A - F)}{n} $$
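A short sketch, on hypothetical numbers, of why the mean forecast error is read as a bias measure rather than an accuracy measure: signed errors can cancel, so a forecast can have an MFE of zero and still miss in individual periods. The function names are illustrative only.

```python
def mfe(actual, forecast):
    """Mean forecast error: sum of (A - F) divided by n (keeps the sign)."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)


def mad(actual, forecast):
    """Mean absolute deviation, shown for comparison."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


actual = [20, 22, 24]            # hypothetical actual values

low_forecast = [18, 21, 21]      # always below actual: errors 2, 1, 3
mixed_forecast = [22, 20, 24]    # errors -2, 2, 0 cancel out

print("low forecast   MFE:", mfe(actual, low_forecast))    # 2.0
print("mixed forecast MFE:", mfe(actual, mixed_forecast))  # 0.0
print("mixed forecast MAD:", mad(actual, mixed_forecast))
```

A positive MFE signals forecasts that are too low on average, consistent with Error = Actual - Forecast.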

    The response variable, Y, is categorical

    Analytics in Decision Making

    Logistic Model

    Introduction

    Quiz

    Difference Between Linear and Nonlinear Regression Models

    Linear Regression Model
    $$ Y_i = \beta_1 + \beta_2 X_i + u_i $$

    Exponential Regression Model
    $$ Y_i = \beta_1 e^{\beta_2 X_i} + u_i $$
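One way to see the difference between the two forms is to compare their systematic parts (ignoring the error term u): the linear model changes by a constant amount per unit of X, while the exponential model changes by a constant factor. The coefficient values and function names below are hypothetical.

```python
import math


def linear_mean(x, b1, b2):
    """Systematic part of the linear model: b1 + b2 * x."""
    return b1 + b2 * x


def exponential_mean(x, b1, b2):
    """Systematic part of the exponential model: b1 * exp(b2 * x)."""
    return b1 * math.exp(b2 * x)


b1, b2 = 2.0, 0.5          # hypothetical coefficients
xs = [0, 1, 2, 3]

linear = [linear_mean(x, b1, b2) for x in xs]
exponential = [exponential_mean(x, b1, b2) for x in xs]

# Differences between consecutive linear values are all equal to b2.
linear_steps = [b - a for a, b in zip(linear, linear[1:])]

# Ratios between consecutive exponential values are all equal to exp(b2).
growth_factors = [b / a for a, b in zip(exponential, exponential[1:])]

print("linear steps:  ", linear_steps)
print("growth factors:", growth_factors)
```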

    Business Problems in Marketing/Retail

    What is the probability of success from having Chetan Bhagat endorse Huawei Technologies' products?
    Which channel of delivery is more effective?
    What is the impact of the price label on the buyer's decision?

    Business Problems in Banking and Finance

    How to distinguish between good and bad credit risks?
    How to identify the most profitable customers?
    How will customers react, in terms of their investments in mutual funds, during bad market situations?

    Definition

    Logistic regression, also known as logit analysis, is a statistical model used to predict the probability of occurrence of an event.

    Logistic regression differs from multiple regression, however, in that it predicts the probability of an event occurring (i.e., the probability of an observation being in the group). Although probability values are metric measures, there are fundamental differences between the two.

    What This Model Explains

    Logistic regression models how the probability, P, of an event may be affected by one or more explanatory variables.

    Classification

    Classifying customers into categories by their buying habits.
    A telecom operator classifying its customers in terms of usage.

    Challenger launch temperature vs

    damage data

    Equation

    $$ P_i = \frac{1}{1 + e^{-(\beta_1 + \beta_2 X_i)}} $$

    $$ Z_i = \beta_1 + \beta_2 X_i $$

    $$ P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}} $$
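A minimal sketch of the logistic function in its two algebraically equivalent forms, 1 / (1 + e^(-Z)) and e^Z / (1 + e^Z), with Z = b1 + b2 * X. The coefficients and X values are hypothetical, chosen only to show that the output always stays strictly between 0 and 1.

```python
import math


def logistic(z):
    """Logistic function in the form 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_alt(z):
    """The same function written as e^z / (1 + e^z)."""
    ez = math.exp(z)
    return ez / (1.0 + ez)


def probability(x, b1, b2):
    """P for a single explanatory variable: Z = b1 + b2 * x."""
    return logistic(b1 + b2 * x)


b1, b2 = -3.0, 0.5                      # hypothetical coefficients
xs = [0, 2, 4, 6, 8, 10, 12]            # hypothetical X values
probabilities = [probability(x, b1, b2) for x in xs]

for x, p in zip(xs, probabilities):
    print(x, round(p, 3))
```

With a positive slope coefficient the probabilities rise monotonically with X, and at Z = 0 (here X = 6) the probability is exactly 0.5.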

    Representation of Binary Dependent Variable

    Logistic regression represents the two groups of interest as a binary variable with values of 0 and 1.

    Which group is assigned the value 1 is not important, but the coefficients are interpreted in terms of this coding.

    The two outcomes are treated as success and failure.

    Use of Logistic Curve: Sigmoid or S-Shaped

    [Figure: logistic (sigmoid) curve]

    Explanation

    A binary variable takes only the values 0 and 1, so the predicted probability must lie between 0 and 1.

    To define the relationship between the independent and dependent variables in logistic regression, we use the logistic curve.

    Unique Nature of the Dependent Variable

    The binary nature of the dependent variable (0 or 1) has properties that violate basic assumptions of multiple regression.

    The error term of a discrete variable follows the binomial distribution rather than the normal distribution.

    Logit Function

    The logit function is a logarithmic transformation of the logistic function (it is its inverse). It is defined as the natural logarithm of the odds.

    The logit of a variable \pi (with value between 0 and 1) is given by

    $$ \mathrm{Logit}(\pi) = \ln\left(\frac{\pi}{1 - \pi}\right) = \beta_0 + \beta_1 X $$

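    Logistic Transformation

    The logistic regression model is given by

    $$ \pi_i = \frac{e^{\beta_0 + \beta_1 X_i}}{1 + e^{\beta_0 + \beta_1 X_i}} $$

    $$ 1 - \pi_i = \frac{1}{1 + e^{\beta_0 + \beta_1 X_i}} $$

    $$ \ln\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \beta_1 X_i $$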
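The transformation can be checked numerically: applying the log-odds (logit) to the logistic probability recovers the linear predictor b0 + b1 * X. This is a sketch with hypothetical coefficients and helper names, not a fitted model.

```python
import math


def logistic_probability(x, b0, b1):
    """pi = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def logit(p):
    """Natural logarithm of the odds."""
    return math.log(odds(p))


b0, b1 = -1.0, 0.8                     # hypothetical coefficients
xs = [-2.0, 0.0, 1.0, 3.0]             # hypothetical X values

for x in xs:
    p = logistic_probability(x, b0, b1)
    print(x, round(p, 4), round(logit(p), 4), b0 + b1 * x)
```

The last two printed columns agree up to rounding, so the logit is linear in X even though the probability itself follows an S-shaped curve.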

    More robust

    Error terms need not be normal.
    No requirement for equal variance of the error terms.
    No requirement for a linear relationship between dependent and independent variables.

    In standard regression, the error term is assumed to follow a normal distribution, whereas in logistic regression it does not.

    In binary logistic regression, the error for a given observation can take only two values, so it cannot be normally distributed.

    Estimation of Parameters

    No closed-form solution exists for estimating the regression parameters of logistic regression.

    Estimation of parameters in logistic regression is carried out using the Maximum Likelihood Estimation (MLE) technique.
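Because there is no closed-form solution, the likelihood has to be maximized iteratively. The sketch below uses plain gradient ascent on the log-likelihood for a single explanatory variable; this is one simple illustrative choice (statistical packages typically use Newton-type methods), and the data, learning rate, and iteration count are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    """Sum over observations of y*ln(p) + (1 - y)*ln(1 - p)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logistic(xs, ys, learning_rate=0.1, iterations=5000):
    """Gradient ascent on the average log-likelihood, starting from (0, 0)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        grad_b0 = sum(residuals) / n
        grad_b1 = sum(r * x for r, x in zip(residuals, xs)) / n
        b0 += learning_rate * grad_b0
        b1 += learning_rate * grad_b1
    return b0, b1


# Hypothetical data; the two classes overlap, so a finite maximum exists.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 1, 0, 0, 1, 1, 0, 1]

b0_hat, b1_hat = fit_logistic(xs, ys)
print("b0:", round(b0_hat, 3), "b1:", round(b1_hat, 3))
print("log-likelihood:", round(log_likelihood(b0_hat, b1_hat, xs, ys), 3))
```

At the maximum the gradient is zero, which for the intercept means the fitted probabilities sum to the number of observed events.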

    Maximum Likelihood Estimator

    (MLE)

    MLE is a statistical method for estimating the model parameters of a function.

    For a given data set, the MLE chooses the values of the model parameters that make the data more likely than any other parameter values would.

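    E.g. Exponential Distribution

    Let x_1, x_2, ..., x_n be sample observations that follow an exponential distribution with parameter \lambda. That is:

    $$ f(x, \lambda) = \lambda e^{-\lambda x}, \quad x \ge 0 $$

    The likelihood function is given by (assuming independence):

    $$ L(x_1, \ldots, x_n; \lambda) = \prod_{i=1}^{n} f(x_i, \lambda) = \lambda^n e^{-\lambda \sum_{i=1}^{n} x_i} $$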
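A sketch of how this example can be carried through, assuming the rate form of the density, f(x, lambda) = lambda * e^(-lambda * x); the parameter symbol is not legible on the slide, so lambda is an assumption. Under that form the log-likelihood is n*ln(lambda) - lambda*sum(x), and setting its derivative n/lambda - sum(x) to zero gives the estimate lambda = n / sum(x). The sample values below are hypothetical.

```python
import math


def log_likelihood(lam, xs):
    """Log-likelihood of an exponential sample: n*ln(lam) - lam*sum(x)."""
    return len(xs) * math.log(lam) - lam * sum(xs)


def mle_rate(xs):
    """Closed-form maximum likelihood estimate: n / sum(x)."""
    return len(xs) / sum(xs)


# Hypothetical sample observations.
xs = [0.5, 1.2, 0.3, 2.0, 1.0]

lam_hat = mle_rate(xs)
print("estimated rate:", lam_hat)

# The log-likelihood at the estimate is at least as high as at nearby values.
for lam in [0.5 * lam_hat, 0.8 * lam_hat, lam_hat, 1.2 * lam_hat, 2.0 * lam_hat]:
    print(round(lam, 3), round(log_likelihood(lam, xs), 4))
```

Unlike logistic regression, this case has a closed-form maximizer, so no iterative search is needed; the loop only confirms that the likelihood peaks at the estimate.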