

Page 1

The Power and Limitations of Statistics in IS Research

Goal is to ask more questions about IS statistics rather than to blindly accept them…

These overheads were prepared and made available by Dr. Mary Lacity.

Page 2

The Power and Limitations of Statistics in IS Research

•On average, a company’s annual IT operating budget represents 5% of annual revenues.

•80% of IS projects are delivered late and over budget or fail to deliver requirements.

•The global IT outsourcing market is $120 billion annually.

•There is no discernible relationship between IT investment and productivity.

•6% of US and UK respondents outsource more than 80% of IT budget to third party suppliers.

Page 3

Statistical Concepts

Population parameters and how they are estimated:
Census
Sample
  Random sample
  Non-random sample

Statistical calculations:
Mean (average)
Mode
Median
Standard deviation

Statistical tests:
Statistical significance
Type I error: alpha value
Type II error: beta value
Correlation
t-test

Page 4

Population Census of IS Professionals

PARAMETER of interest: Sex (% of females)

M M M M M M M M M M M M M M M M M M M M
F F F F F F F F F F F F F F F F F F F F

CENSUS results:
Number of males: 20      Percentage of males: 50%
Number of females: 20    Percentage of females: 50%

Page 5

Sample of IS Professionals

Sample of 5 people drawn from the population:

M M M M M M M M M M M M M M M M M M M M
F F F F F F F F F F F F F F F F F F F F

Sample drawn: M M M F F

SAMPLE results:
Number of males: 3      Percentage of males: 60%
Number of females: 2    Percentage of females: 40%
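The census and sample slides can be replayed in a few lines of Python. This is only a sketch: the population of 20 males and 20 females comes from the overheads, while the random seed and the number of repeated draws are arbitrary assumptions made for illustration.

```python
import random

# Population from the census slide: 20 males and 20 females.
population = ["M"] * 20 + ["F"] * 20


def percent_female(people):
    """Return the percentage of 'F' entries in a list of people."""
    return 100.0 * people.count("F") / len(people)


# Fixed seed (an arbitrary, assumed choice) so the run is repeatable.
rng = random.Random(1)

# Draw ten simple random samples of 5 people, each without replacement.
sample_estimates = [percent_female(rng.sample(population, 5)) for _ in range(10)]

print("Population parameter (% female):", percent_female(population))
print("Estimates from samples of 5:", sample_estimates)
```

With only 5 people per sample, each estimate can only be 0%, 20%, 40%, 60%, 80%, or 100%, so a sample of 5 can never reproduce the true 50% exactly.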

Page 6

When sample statistics adequately approximate population parameters:

Population mean, population variance, population median

A sample statistic (such as the mean) will be close to a population parameter if:
** Sample size is large enough
** Measuring instrument is good
** Sample is random

Sample mean, sample variance, sample median

Page 7

IS Professor Salaries: Is the measuring instrument adequate; is the sample random?

PARAMETER of Interest: Average IS salary

Sample

$$$$$$ ?

On average, IS professors make $68,702

Page 8

IS Professor Salaries: Is the measuring instrument adequate; is the sample random?

How confident are you in this number?

$$$$$$ ? $68,702

http://www.pitt.edu/galletta/1998sals.html

Page 9

IS Professor Salaries: Is the measuring instrument adequate; is the sample random?

How confident are you in this number?

$$$$$$ ? Average: $76,369

http://www.pitt.edu/galletta/1999sals.html

Look at the 1999 survey so far… what can we learn from actually looking at the data!!!!!

Page 10

1999 IS Professor Salary

Salary range        Frequency
40,000                  1
50,000-55,000           4
55,001-60,000           2
60,001-65,000           4
65,001-70,000          11
70,001-75,000          17
75,001-80,000          13
80,001-85,000           8
85,001-90,000           7
90,001-95,000           4
95,001-100,000          2
150,000                 1
Total                  74

Mean = $76,369

Median = $75,000 (half the salaries are above this number, half below this number)

Mode = $75,000 (most frequent salary cited)
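As a rough cross-check of the reported mean, the frequency table can be summarized from its bins alone. The sketch below assumes every salary sits at the midpoint of its bin, which the overheads do not say, so it only approximates the $76,369 computed from the individual responses.

```python
# (low, high, count) for each salary bin in the 1999 frequency table.
# The single-valued rows (40,000 and 150,000) are treated as zero-width bins.
bins = [
    (40_000, 40_000, 1),
    (50_000, 55_000, 4),
    (55_001, 60_000, 2),
    (60_001, 65_000, 4),
    (65_001, 70_000, 11),
    (70_001, 75_000, 17),
    (75_001, 80_000, 13),
    (80_001, 85_000, 8),
    (85_001, 90_000, 7),
    (90_001, 95_000, 4),
    (95_001, 100_000, 2),
    (150_000, 150_000, 1),
]

total_responses = sum(count for _, _, count in bins)

# Midpoint assumption: every response in a bin is placed at the bin's center.
approx_mean = sum((low + high) / 2 * count for low, high, count in bins) / total_responses

# The modal bin is the one with the highest frequency.
modal_bin = max(bins, key=lambda b: b[2])

print("Responses:", total_responses)
print("Approximate mean from midpoints:", round(approx_mean))
print("Modal bin:", modal_bin[:2])
```

The midpoint estimate lands near $75,500, in the same neighborhood as the reported mean, and the modal bin (70,001-75,000) agrees with the reported mode of $75,000.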

Page 11

1999 IS Professor Salary

[Figure: histogram of the 1999 salary data; x-axis: salary ranges from 40,000 to 150,000; y-axis: frequency, 0 to 18]

Mean, Mode, and Median are nearly the same because the distribution approximates the normal distribution.

Page 12

When are mean, median, and mode different?

Salaries by Huff, p. 33 (population is not normal)

Salary      Number of employees
$45,000      1
$15,000      1
$10,000      2
$5,700       1
$5,000       3
$3,700       4
$3,000       1
$2,000      12

Mean: $5,700
Median: $3,000
Mode: $2,000
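The three averages on this slide can be recomputed from the Huff salary counts. This is a small sketch using Python's standard `statistics` module; the pairing of head counts with salaries follows the order in which they appear on the slide.

```python
from statistics import mean, median, mode

# (salary, number of employees) pairs from the Huff example on this slide.
salary_counts = [
    (45_000, 1),
    (15_000, 1),
    (10_000, 2),
    (5_700, 1),
    (5_000, 3),
    (3_700, 4),
    (3_000, 1),
    (2_000, 12),
]

# Expand the counts into one entry per employee (25 in total).
salaries = [salary for salary, count in salary_counts for _ in range(count)]

print("Employees:", len(salaries))
print("Mean:", mean(salaries))
print("Median:", median(salaries))
print("Mode:", mode(salaries))
```

One $45,000 salary is enough to pull the mean to $5,700, although 12 of the 25 employees earn $2,000. In a skewed population like this, the mean alone says little about the typical salary.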

Page 13

Standard Deviation

For a normal distribution, about 68% of the data lies within 1 standard deviation of the mean.

[Figure: normal curve centered on the mean]

Page 14

Standard Deviation

For a normal distribution, about 95% of the data lies within 2 standard deviations of the mean.

[Figure: normal curve centered on the mean]
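The 68% and 95% figures hold for a normal distribution and can be checked directly. The sketch below uses `NormalDist` from Python's standard library; the helper name `share_within` is introduced here only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2):
    print(f"Within {k} standard deviation(s): {share_within(k):.1%}")
```

These percentages do not carry over to skewed data such as the Huff salaries, which is one reason to look at the shape of the distribution first.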

Page 15

Standard Deviation: Does it get bigger or smaller as sample size increases?

mean

Page 16

Standard Deviation: Does it get bigger or smaller as sample size increases?

[Figure: sampling distributions of the mean for small, medium, and large n, all centered on the mean]

As sample size n increases, the sampling distribution of the sample mean clusters more tightly around the population mean. Also, the sampling distribution gets closer and closer to the normal curve as n increases. What is this called?

Page 17

Central Limit Theorem

[Figure: two panels, the population distribution and the sampling distribution if n is large]
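A small simulation can make the theorem concrete. This sketch assumes a skewed (exponential) population with mean 1 and standard deviation 1; the population, the seed, and the sample sizes are all illustrative choices, not taken from the overheads.

```python
import random
from statistics import mean, pstdev

# Fixed seed (an assumed, arbitrary choice) so the run is repeatable.
rng = random.Random(42)


def sample_mean_spread(n, trials=2000):
    """Standard deviation of the sample mean across many samples of size n."""
    sample_means = [
        mean([rng.expovariate(1.0) for _ in range(n)])  # one sample of size n
        for _ in range(trials)
    ]
    return pstdev(sample_means)


# Spread of the sample mean for small, medium, and large n.
spread = {n: sample_mean_spread(n) for n in (5, 25, 100)}

for n, s in spread.items():
    # Theory predicts sigma / sqrt(n), with sigma = 1 for this population.
    print(f"n={n:3d}  spread of sample mean={s:.3f}  theory={1 / n ** 0.5:.3f}")
```

The spread of the sample mean shrinks roughly as 1 over the square root of n, even though the population itself is far from normal.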

Page 18

Type I and Type II Errors

Assume this is the real population mean and standard deviation.

When we take a sample, we get a sample mean and a sample standard deviation (from which the sampling error is estimated).

Page 19

Type I and Type II Errors

[Figure: distribution of the actual population (which we usually don’t know) overlaid with the distributions of several samples]

Page 20

Type I and Type II Errors

Our null hypothesis is: There is no difference between the population mean and the sample mean.

                                         In reality, population mean    In reality, population mean
                                         does equal sample mean         doesn’t equal sample mean

Sample selected indicates sample mean
is different than population mean        Type I error                   No error

Sample selected indicates sample mean
is same as population mean               No error                       Type II error

Page 21

Type I and Type II Errors

Type I error (alpha): Probability of rejecting the null hypothesis when indeed the null was true

Type II error (beta): Probability of accepting (failing to reject) the null hypothesis when indeed the null was false

Page 22

Type I and Type II Errors

Type I error: Probability of rejecting the null hypothesis when indeed the null was true

In this picture, the sample mean is very close to the population mean, so we would get a t statistic that is small (and a p value that is large), which indicates: don’t reject the null hypothesis.

Page 23

Type I and Type II Errors

Type I error: Probability of rejecting the null hypothesis when indeed the null was true

In this picture, the sample mean is far away from the population mean.

If we select a Type I error of .05, then we would reject the null hypothesis if the sample mean was greater than the critical mean identified by the Type I error selected.

[Figure: distribution with the critical value marked]

Page 24

Type I and Type II Errors

Type I error: Probability of rejecting the null hypothesis when indeed the null was true

Thus, we have about a 5% chance of drawing a sample which indicates reject when we should not have rejected the null hypothesis.

[Figure: distribution with the critical value marked]
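The 5% figure can be checked by simulation. In this sketch the null hypothesis is true by construction, and we count how often a one-tailed test at the .05 level still rejects it. The population mean, standard deviation, sample size, seed, and the use of a z-test with a known standard deviation are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05                     # selected Type I error probability
MU, SIGMA, N = 100.0, 15.0, 30   # hypothetical population and sample size

# One-tailed critical value from the standard normal table (about 1.645).
z_crit = NormalDist().inv_cdf(1 - ALPHA)

# Sample means above this value lead us to reject the null hypothesis.
critical_mean = MU + z_crit * SIGMA / N ** 0.5

rng = random.Random(7)  # fixed seed (arbitrary) so the run is repeatable
trials = 4000
rejections = 0
for _ in range(trials):
    # The null is true here: every sample really comes from the population.
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    if mean(sample) > critical_mean:
        rejections += 1  # a Type I error

type_one_rate = rejections / trials
print(f"Critical mean: {critical_mean:.2f}")
print(f"Share of samples wrongly rejecting the null: {type_one_rate:.3f}")
```

The share of wrong rejections should come out close to the selected .05, which is what choosing a Type I error level means.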

Page 25

Type I and Type II Errors

Type II error: Probability of accepting the null hypothesis when indeed the null was false

In this picture, assume we really sampled the wrong population. By chance, we might draw a sample that tells us we sampled the correct population when indeed we did not.

[Figure: two overlapping distributions with the critical value and the Type II probability marked]

Page 26

When sample statistics adequately approximate population parameters: Sample size

Desired sample size:

$$ n = \frac{z^2 \, \sigma^2}{E^2} $$

where $z$ is the value from the standard normal table for the confidence level selected, $\sigma^2$ is the population variance, and $E$ is the acceptable error.

(Population variance: how are we supposed to know this????)

Page 27

When sample statistics adequately approximate population parameters: Sample size: An example

Desired sample size:

$$ n = \frac{z^2 \, \sigma^2}{E^2} $$

where $z$ is the value from the standard normal table for the confidence level selected, $\sigma^2$ is the population variance, and $E$ is the acceptable error value.

Assume we want to take a sample of IS professor salaries and assume we know the standard deviation is $12,000. If we will accept a plus or minus $3,000 error, how large should the sample be?

$$ n = \frac{(1.96)^2 \times (12{,}000)^2}{(3{,}000)^2} $$

n = ????
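The arithmetic for this example can be worked through in a few lines. The function name `required_sample_size` is introduced here for illustration, and rounding up to the next whole person is an assumption about how a fractional result should be handled.

```python
import math


def required_sample_size(z, std_dev, acceptable_error):
    """Sample size n = z^2 * sigma^2 / E^2, rounded up to a whole person."""
    return math.ceil((z ** 2) * (std_dev ** 2) / (acceptable_error ** 2))


# 1.96 is the standard normal value for 95% confidence, as on the slide.
n = required_sample_size(z=1.96, std_dev=12_000, acceptable_error=3_000)
print(n)  # 3.8416 * 16 = 61.4656, rounded up to 62
```

Halving the acceptable error to $1,500 would quadruple the required sample, because the error term is squared in the denominator.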

Page 28

World-wide subscriptions to Cellular Phones in Millions

[Figure: bar chart; y-axis: number of subscribers in millions, 0 to 80; x-axis: Finland, Norway, Sweden, Hong Kong, Israel, Italy, Denmark, Singapore, Portugal, Australia, Japan, South Korea, Austria, US; bar labels in extracted order: 2.9 2.1 4.1 2.9 2.1 20.5 1.9 13.1 5.9 39 14 2.3 69.8]

Source: Gartner Group DataQuest as reported in World Almanac

Page 29

The semi-attached figure: Which country has highest cell phone adoption rate?

World-wide subscriptions to Cellular Phones

[Figure: bar chart; y-axis: percentage of population, 0 to 60; x-axis: Finland, Norway, Sweden, Hong Kong, Israel, Italy, Denmark, Singapore, Portugal, Australia, Japan, South Korea, Austria, US; bar labels in extracted order: 57 48 46 43 37 36 35 32 31 31 31 20 29 26]

Source: Gartner Group DataQuest as reported in World Almanac

Page 30

The semi-attached figure: Which Internet Stock should I invest in?

Most visited websites August 1999

[Figure: bar chart; y-axis: unique visits in millions, 0 to 35; x-axis: yahoo, aol, msn, geocities, netscape, go, microsoft, lycos, excite, hotmail, passport, angelfire, amazon; bar labels in extracted order: 33 29 28 21 20 18 18 15 14 14 14 12 12]

Matrix Media as reported in World Almanac

Page 31

The One Dimensional Picture

Excite Msn

Msn.com had twice as many visitors as Excite.com

Page 32

So where did this statistic come from???

On average, a company’s annual IT budget represents 5% of annual revenues

It was a generally quoted statistic I heard over and over again. One example includes:

Minoli, Analyzing Outsourcing, Re-engineering Information And Communication Systems, McGraw Hill, 1994.

Data collected by author, but not much detail is given. My confidence comes from the fact that his results are similar to many other results from studies I’ve seen.

Page 33

So where did this statistic come from???

80% of IS projects are delivered late and over budget or fail to deliver requirements.

It was a generally quoted statistic I heard over and over again. Some more formal studies found:

AUTHOR          # of Projects   FINDINGS
Lehman 1979     57              46% overdue; 59% over budget
Gladden 1982    ???             75% systems not used or not completed
Johnson 1995    365             31% projects cancelled; 53% cost over-run; 12% delivered on time and to budget
Phan (1995)     143             25% do not meet requirements

Page 34

So where did this statistic come from???

The global IT outsourcing market is $120 billion annually

This statistic was reported by International Data Corporation on http://www.outsourcing.com last year. However, the site no longer exists.

I found the following quote on: http://www.infoserver.com/

Company: PR Newswire
Date of Post: 08-Aug-99
Type of Article: Market Trends
Article Title: IDC Reports Worldwide Outsourcing Spending Approached $100 Billion in 1998 and Will Surge to Over $151 Billion by 2003
Summary: Worldwide outsourcing services ...

Page 35

So where did this statistic come from???

There is no discernible relation between IT investment and productivity.

Attempts to correlate investments in information technology to productivity have found no correlation or a negative correlation:

A study of 60 manufacturing firms during the period of 1974-1984 failed to show a significant positive relationship between IT expense and productivity.

A study of 58 mutual savings banks found no relationship between organizational performance and IT expense.

An evaluation by the US Department of Commerce for the years 1950-1986 shows a negative correlation between information technology and productivity.

Page 36

So where did this statistic come from???

A research report by the Gartner Group revealed that firms that invested in office automation systems had exactly the same level of productivity in 1987 as they did in 1967.

Japan and Europe have much higher office and service sector productivity than the US even though they have not computerized nearly as quickly as the US.

Peter Drucker observed that the number of office workers and clerical staff grows in proportion to investments in information technology.

There is no discernible relation between IT investment and productivity.

Page 37

So where did this statistic come from???

There is no discernible relation between IT investment and productivity.

How can the paradox be correct?
The paradox runs counter to intuition.
We see the effects on productivity every day: automated tellers, laser checkouts, fax machines, word processors, travel reservation systems.

1. Macroeconomic studies have no internal validity because the information technology/productivity paradox merely captures a correlation, not a causal relationship.

Perhaps productivity would have suffered a major decline without investments in IT.

Page 38

So where did this statistic come from???

There is no discernible relation between IT investment and productivity.

2. Macroeconomics considers worker productivity, not net benefits to society.

For example, automated tellers may not correlate with higher banking productivity, but society as a whole benefits from convenient, 24-hour banking.

3. IT is like R&D: many projects will fail, but you only need a few successes to gain a big payoff.

Page 39

So where did this statistic come from???

There is no discernible relation between IT investment and productivity.

4. Quinn & Baily outline flaws with macroeconomic numbers:

Industry productivity only captures 42% of service sector employment

30% of the productivity figures equate output and input--which will be constant!

Example: Input is budget, Output assumes an equivalent $ value for input. For example, if the police department’s budget is $5 million, it assumes they produced $5 million worth of law enforcement.

Page 40

So where did this statistic come from???

•6% of US and UK respondents outsource more than 80% of IT budget to third party suppliers.

This statistic came from a survey that Leslie Willcocks and I administered to the following sample:

For the US survey, 500 names of CIOs were obtained from a list maintained by Dun & Bradstreet Information Services. Only 38 people returned the survey.

For the UK survey, a list of 100 CIOs was compiled from various sources including the Financial Times top 100 list and members of the Oxford Institute of Information Management. 63 surveys were returned from the UK.
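The response rates implied by these counts are worth computing before trusting the 6% figure. The sketch below uses only the mailing and return counts stated on this slide; the variable and function names are illustrative.

```python
# Surveys sent and returned, as stated on the slide.
us_sent, us_returned = 500, 38
uk_sent, uk_returned = 100, 63


def response_rate(returned, sent):
    """Fraction of surveyed people who returned the survey."""
    return returned / sent


us_rate = response_rate(us_returned, us_sent)
uk_rate = response_rate(uk_returned, uk_sent)

print(f"US response rate: {us_rate:.1%}")
print(f"UK response rate: {uk_rate:.1%}")
```

A US response rate under 8% raises the same question asked of the salary survey: do the people who answered resemble the people who did not?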

Page 41

So where did this statistic come from???

How confident are we in this 6% number? Other surveys (which will have their own biases and limitations) found a similarly low number of total outsourcing; most companies pursue selective sourcing:

In a survey of 300 IT managers in the US, on average less than 10% of the IT budget was outsourced (Caldwell, 1996a)

A survey of 110 Fortune 500 companies found that 76% spent less than 20% of the IT budget on outsourcing, and 96% spent less than 40% (Collins and Millen, 1995)

A survey of 365 US companies found that 65% outsourced one or more IT activities, but only 12 outsourced IT completely (Dekleva, 1994)

Page 42

Statistical Significance: a few surprises

Using the same dataset, US and UK respondents to outsourcing surveys, let’s look at the average company size:

However, there is no statistical difference at p=.025 between US and UK revenues! How can this be, given that US revenues are roughly 8 times larger?

[Figure: bar chart of average annual revenues converted to $US millions, n = 113 respondents; Scandinavia 261, United States 10,995, United Kingdom 1,311; y-axis 0 to 12,000]

US: $10,995,000,000
UK: $ 1,311,000,000

Page 43

Look at the standard deviation!

                      US revenues in $US     UK revenues in $US
Minimum               $30 million            $1 million
Maximum               $168,800 million       $12,000 million
Average               $10,995 million        $1,311 million
Standard deviation    $29,158 million        $2,728 million

“Despite differences in means, a one-tailed t-test assuming heteroscedasticity at p=.025 level indicates that US and UK revenues are not statistically different. This finding is explained by the large standard deviation.”
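To see why the large standard deviation matters, the summary figures can be fed into an unequal-variance (Welch) t statistic. This is a sketch under a stated assumption: it takes the group sizes to be the 38 US and 63 UK returned surveys, but the overheads do not say how many respondents actually reported revenue, so the result need not match the author's calculation.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic, standard error, and approximate degrees of freedom."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group A's mean
    var_b = sd_b ** 2 / n_b  # squared standard error of group B's mean
    std_error = math.sqrt(var_a + var_b)
    t_stat = (mean_a - mean_b) / std_error
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t_stat, std_error, dof


# Revenues in $US millions from the table; n_a and n_b are ASSUMED group sizes.
t_stat, std_error, dof = welch_t(10_995, 29_158, 38, 1_311, 2_728, 63)

print(f"Difference in means: {10_995 - 1_311} million")
print(f"Standard error of the difference: {std_error:.0f} million")
print(f"t statistic: {t_stat:.2f} with about {dof:.0f} degrees of freedom")
```

Under these assumed group sizes, the difference of almost $9,700 million is only about two standard errors wide, because the US standard deviation is nearly three times the US mean. A t statistic that close to the borderline makes the conclusion sensitive to the significance level chosen.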

Page 44

[Figure: frequency histogram of revenues in $US for US and UK respondents; x-axis: revenue bins from $0.00 to $169.00; y-axis: frequency, 0 to 8; series: US Frequency, UK Frequency]

Page 45

Gotcha!!!!

The key is the level of significance for the probability of a type I error.

Type I error = probability that we reject the null hypothesis when indeed the null is true.

With a t-test, we are testing the null hypothesis that the US and UK revenues are not different.

At a selected p=.025, we are saying that we want the probability of rejecting the null hypothesis if indeed the null is true to be .025.

Page 46

Gotcha!!!!

In reality, the calculated p value was .03

Thus, if our selected p value is .025, we only reject the null hypothesis if the calculated p value is less than .025.

Because .03 is greater than .025, I cannot conclude that US and UK revenues are different at the .025 level.

What do we conclude if the selected probability of a type I error is .05, the more usual probability selected?
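The decision rule on this slide fits in a single comparison. The sketch below applies it to the calculated p value of .03 at both significance levels; the function name `decide` is introduced here for illustration.

```python
def decide(calculated_p, selected_alpha):
    """Reject the null only if the calculated p value is below the selected level."""
    if calculated_p < selected_alpha:
        return "reject the null"
    return "fail to reject the null"


calculated_p = 0.03  # calculated p value reported on the slide

for alpha in (0.025, 0.05):
    print(f"selected level {alpha}: {decide(calculated_p, alpha)}")
```

The same data are "not statistically different" at .025 and "statistically different" at .05, so the significance level belongs next to any claim of significance.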

Page 47

Conclusions

“How to talk back to a statistic”, Huff, 1982, pp. 122-142

Who says so?
How does he know?
Did somebody change the subject?
Does it make sense?