
CS 147: Computer Systems Performance Analysis

Advanced Regression Techniques

2015-06-15

1 / 31


Overview

- Curvilinear Regression
  - Common Transformations
  - General Transformations
- Handling Outliers
- Common Mistakes

2 / 31


Curvilinear Regression

- Linear regression assumes a linear relationship between predictor and response
- What if it isn't linear?
- You need to fit some other type of function to the relationship

3 / 31


When To Use Curvilinear Regression

- Easiest to tell by sight
  - Make a scatter plot
  - If plot looks non-linear, try curvilinear regression
- Or if non-linear relationship is suspected for other reasons
- Relationship should be convertible to a linear form

4 / 31


Types of Curvilinear Regression

- Many possible types, based on a variety of relationships:
  - y = ax^b
  - y = a + b/x
  - y = ab^x
  - Etc., ad infinitum

5 / 31


Transform Them to Linear Forms

- Apply logarithms, multiplication, division, whatever to produce something in linear form
  - I.e., y = a + b × something
  - Or a similar form
- If predictor appears in more than one transformed predictor variable, correlation is likely!

6 / 31


Sample Transformations

- For y = ae^(bx), take logarithm of y, do regression on log y = b0 + b1x, let b = b1, a = e^(b0)
- For y = a + b log x, take log of x before fitting parameters, let b = b1, a = b0
- For y = ax^b, take log of both x and y, let b = b1, a = e^(b0)

(The last of these fits is sketched after this slide.)

7 / 31
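A minimal sketch of the last transformation above (fitting y = ax^b by regressing log y on log x), assuming Python with NumPy; the data values are hypothetical illustrations, not measurements from the course.

```python
import numpy as np

# Hypothetical measurements (illustration only, not course data).
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([3.1, 8.7, 24.0, 70.5, 195.0])

# Transform to linear form: log y = b0 + b1 * log x.
lx, ly = np.log(x), np.log(y)

# Ordinary least squares on the transformed data.
b1, b0 = np.polyfit(lx, ly, 1)   # slope, intercept

# Map back to the curvilinear parameters: b = b1, a = e^(b0).
b = b1
a = np.exp(b0)
print(f"fitted model: y = {a:.3f} * x^{b:.3f}")
```

The least-squares fit happens entirely in the transformed (log) domain; the parameters a and b are recovered afterwards by inverting the transformation.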


Corrections to Jain p. 257(Early Editions)

Nonlinear            Linear
y = a + b/x          y = a + b(1/x)
y = 1/(a + bx)       (1/y) = a + bx
y = x/(a + bx)       (x/y) = a + bx
y = ab^x             ln y = ln a + x ln b
y = a + bx^n         y = a + b(x^n)

8 / 31
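As one hedged illustration of applying the corrected table, this sketch fits y = 1/(a + bx) by regressing 1/y on x (Python with NumPy assumed; the data are hypothetical).

```python
import numpy as np

# Hypothetical observations (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.50, 0.33, 0.25, 0.20, 0.17])

# Linearize per the table: (1/y) = a + b*x.
b, a = np.polyfit(x, 1.0 / y, 1)   # slope b, intercept a

# Predictions go back through the nonlinear form.
y_hat = 1.0 / (a + b * x)
print(f"a = {a:.3f}, b = {b:.3f}")
```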


General Transformations

- Use some function of response variable y in place of y itself
- Curvilinear regression is one example
- But techniques are more generally applicable

9 / 31


When To Transform?

- If known properties of measured system suggest it
- If data's range covers several orders of magnitude
- If homogeneous variance assumption of residuals (homoscedasticity) is violated

10 / 31


Transforming Due To (Lack of) Homoscedasticity

- If spread of scatter plot of residual vs. predicted response isn't homogeneous, then residuals are still functions of the predictor variables
- Transformation of response may solve the problem

11 / 31


What Transformation To Use?

- Compute standard deviation of residuals
- Plot as function of mean of observations
  - Assuming multiple experiments for a single set of predictor values
- Check for linearity: if linear, use a log transform
- If variance against mean of observations is linear, use square-root transform
- If standard deviation against mean squared is linear, use inverse (1/y) transform
- If standard deviation against mean to a power is linear, use power transform
- More covered in the book

(A sketch of this diagnostic appears after this slide.)

12 / 31
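A hedged sketch of the first steps above, assuming replicated measurements at each predictor setting; the replicate values are invented for illustration.

```python
import numpy as np

# Hypothetical replicated measurements: several runs at each predictor level
# (illustration only; the course data are not reproduced here).
replicates = {
    1: [10.2, 11.0, 9.7],
    2: [19.8, 21.5, 22.3],
    4: [41.0, 44.2, 38.9],
    8: [80.5, 90.1, 85.3],
}

means = np.array([np.mean(v) for v in replicates.values()])
stds  = np.array([np.std(v, ddof=1) for v in replicates.values()])

# If std vs. mean looks roughly linear, the slides suggest a log transform of y.
slope, intercept = np.polyfit(means, stds, 1)
print(f"std ~= {slope:.3f} * mean + {intercept:.3f}")
# In practice, plot (means, stds) and judge linearity by eye before deciding.
```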


General Transformation Principle

For some observed relation between standard deviation and mean, s = g(y):

    let h(y) = ∫ (1/g(y)) dy

transform to w = h(y) and regress on w

13 / 31


Example: Log Transformation

If standard deviation against mean is linear, then g(y) = ay

    So h(y) = ∫ (1/(ay)) dy = (1/a) ln y

14 / 31
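As a complementary worked case (not on the slides, but following the same principle): if the variance of the observations is linear in the mean, the integral gives the square-root transform mentioned two slides back.

```latex
% Assume variance linear in the mean: s^2 = a\,\bar{y}, so s = g(y) = \sqrt{a\,y}.
% Apply the general principle h(y) = \int \frac{1}{g(y)}\,dy:
h(y) = \int \frac{1}{\sqrt{a\,y}}\,dy = \frac{2}{\sqrt{a}}\,\sqrt{y}
% Constant factors do not affect the regression, so transform to w = \sqrt{y}.
```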


Confidence Intervals for Nonlinear Regressions

- For nonlinear fits using general (e.g., exponential) transformations:
  - Confidence intervals apply to transformed parameters
  - Not valid to perform inverse transformation before calculating intervals
  - Must express confidence intervals in transformed domain

(A sketch of this appears after this slide.)

15 / 31
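A hedged sketch of the point above, assuming Python with NumPy and SciPy: for y = ae^(bx), fit and state confidence intervals in the log domain (for b1 = b and b0 = ln a) rather than back-transforming first. The data are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical data for y = a * exp(b*x) (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.7, 7.2, 20.5, 54.0, 150.0, 400.0])

ly = np.log(y)                      # transform: ln y = b0 + b1 * x
n = len(x)
b1, b0 = np.polyfit(x, ly, 1)

# Standard errors of the transformed parameters.
resid = ly - (b0 + b1 * x)
mse = np.sum(resid**2) / (n - 2)
sxx = np.sum((x - x.mean())**2)
se_b1 = np.sqrt(mse / sxx)
se_b0 = np.sqrt(mse * (1.0 / n + x.mean()**2 / sxx))

t = stats.t.ppf(0.975, n - 2)       # 95% two-sided multiplier
print(f"b1 (= b):    {b1:.3f} +/- {t*se_b1:.3f}")
print(f"b0 (= ln a): {b0:.3f} +/- {t*se_b0:.3f}")
# Report the interval for b0 = ln a; do not back-transform before computing it.
```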


Outliers

- Atypical observations might be outliers
  - Measurements that are not truly characteristic
  - By chance, several standard deviations out
  - Or mistakes might have been made in measurement
- Which leads to a problem: do you include outliers in analysis or not?

16 / 31


Deciding How To Handle Outliers

1. Find them (by looking at scatter plot)
2. Check carefully for experimental error
3. Repeat experiments at predictor values for each outlier
4. Decide whether to include or omit outliers
   - Or do analysis both ways

(A simple residual-based screening aid for step 1 is sketched after this slide.)

Question: Is the last point in last lecture's example an outlier on the rating vs. year plot?

17 / 31
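For step 1, one possible screening aid (a sketch only; the slides recommend finding outliers by eye on the scatter plot): flag points whose residuals lie several standard deviations out, then re-examine them. The helper name and threshold are illustrative assumptions.

```python
import numpy as np

def flag_outlier_candidates(x, y, k=3.0):
    """Flag points whose regression residual is more than k standard
    deviations from zero -- candidates to re-examine, not automatic rejects."""
    b1, b0 = np.polyfit(x, y, 1)
    resid = y - (b0 + b1 * x)
    s = np.std(resid, ddof=2)        # residual std dev (2 fitted parameters)
    return np.where(np.abs(resid) > k * s)[0]

# Hypothetical usage (illustration only):
# candidate_indices = flag_outlier_candidates(year, rating)
```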


Rating vs. Year

[Figure: scatter plot of Rating (y-axis, 0 to 8) vs. Year (x-axis, 1940 to 1980).]

18 / 31


Common Mistakes in Regression

- Generally based on taking shortcuts
- Or not being careful
- Or not understanding some fundamental principle of statistics

19 / 31


Not Verifying Linearity

- Draw the scatter plot
- If it's not linear, check for curvilinear possibilities
- Misleading to use linear regression when relationship isn't linear

20 / 31


Relying on Results Without Visual Verification

- Always check scatter plot as part of regression
- Examine predicted line vs. actual points
- Particularly important if regression is done automatically

21 / 31


Some Nonlinear Examples

22 / 31


Attaching Importance to Parameter Values

- Numerical values of regression parameters depend on scale of predictor variables
- So just because a particular parameter's value seems "small" or "large," not necessarily an indication of importance
- E.g., converting seconds to microseconds doesn't change anything fundamental
  - But magnitude of associated parameter changes

23 / 31


Not Specifying Confidence Intervals

- Samples of observations are random
- Thus, regression yields parameters with random properties
- Without confidence interval, impossible to understand what a parameter really means

24 / 31


Not Calculating Coefficient of Determination

- Without R², difficult to determine how much of variance is explained by the regression
- Even if R² looks good, safest to also perform an F-test
- Not that much extra effort

(Both calculations are sketched after this slide.)

25 / 31
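A hedged sketch of both calculations for a simple one-predictor regression, assuming Python with NumPy and SciPy; the data are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical data (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])

n = len(x)
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean())**2)      # total sum of squares
sse = np.sum((y - y_hat)**2)         # error (residual) sum of squares
ssr = sst - sse                      # regression sum of squares

r2 = ssr / sst                       # coefficient of determination
F = (ssr / 1) / (sse / (n - 2))      # 1 regression df, n-2 error df
p = stats.f.sf(F, 1, n - 2)
print(f"R^2 = {r2:.3f}, F = {F:.2f}, p = {p:.4f}")
```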


Using Coefficient of Correlation Improperly

- Coefficient of determination is R²
- Coefficient of correlation is R
- R² gives percentage of variance explained by regression, not R
- E.g., if R is .5, R² is .25
  - And regression explains 25% of variance
  - Not 50%!

26 / 31


Using Highly Correlated Predictor Variables

- If two predictor variables are highly correlated, using both degrades regression
- E.g., likely to be correlation between an executable's on-disk and in-core sizes
  - So don't use both as predictors of run time
- Means you need to understand your predictor variables as well as possible

(A quick correlation check is sketched after this slide.)

27 / 31
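A hedged sketch of the quick check referenced above: compute the correlation between two candidate predictors (e.g., on-disk and in-core sizes) before deciding to use both. The sizes here are invented for illustration.

```python
import numpy as np

# Hypothetical candidate predictors (illustration only).
disk_size = np.array([120.0, 250.0, 310.0, 480.0, 640.0])   # e.g., KB on disk
core_size = np.array([150.0, 270.0, 330.0, 500.0, 700.0])   # e.g., KB resident

r = np.corrcoef(disk_size, core_size)[0, 1]
print(f"correlation between candidate predictors: {r:.3f}")
# A correlation near +/-1 suggests keeping only one of them in the regression.
```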


Using Regression Beyond Range of Observations

- Regression is based on observed behavior in a particular sample
- Most likely to predict accurately within range of that sample
- Far outside the range, who knows?
- E.g., regression on run time of executables smaller than size of main memory may not predict performance of executables that need VM activity

28 / 31


Measuring Too Little of the Range

- Converse of previous mistake
- Regression only predicts well near range of observations
- If you don't measure commonly used range, regression won't predict much
- E.g., if many programs are bigger than main memory, only measuring those that are smaller is a mistake

29 / 31


Using Too Many Predictor Variables

- Adding more predictors does not necessarily improve model!
- More likely to run into multicollinearity problems
- So what variables to choose?
  - It's an art
  - Subject of much of this course

30 / 31


Assuming a Good Predictor Is a Good Controller

- Often, a goal of regression is finding control variables
- But correlation isn't necessarily control
- Just because variable A is related to variable B, you may not be able to control values of B by varying A
- E.g., if number of hits on a Web page is correlated to server bandwidth, you might not boost hits by increasing bandwidth

31 / 31
