
L643: Evaluation of Information Systems

Week 13: March, 2008

2

Data Collection

1. Zipcar

2. Evergreen

3. Quadrem

4. Unicoop

5. LibraryThing

6. Fluevog Shoes & Boots

3

Ten Ways to a Great Figure (Salkind, 2007, p. 83)

Keep it simple

4


Keep it simple

5


Keep it simple

6


Label everything so nothing is left to the misunderstanding of the audience

A chart alone should convey what you want to say

7

Graphic Presentation

A line chart to show a trend in the data at equal intervals

8


A pie chart to show the proportion of an item that makes up a series of data points [usually for nominal (e.g., level of computer experience) and ordinal (e.g., age 18-34, 35-44, 45-54, 55-64, above 64) variables]

9


Time series charts to show how variables change over time

10

Influence of Research(ers)

The Hawthorne Effect

Individuals alter their behavior because they know they are being studied

See more info at:

http://www.envisionsoftware.com/articles/Hawthorne_Effect.html

http://www.psy.gla.ac.uk/~steve/hawth.html

11

Descriptive Statistics

Why do we need statistics?

2 ways to summarize or describe a set of data:

According to how the individual pieces of information cluster together (measures of central tendency)

According to how individual cases spread apart (measures of dispersion)

12

Measures of Central Tendency

3 most common measures of central tendency:

Mean (average)

Median (midpoint)

Mode (most frequent value(s))

13

Central Tendency (Salkind, 2000)

Mean: the arithmetic average of all scores

Median: the point that divides the distribution of scores in half

Mode: the most frequently occurring score(s)


14


Mean

Is a very accurate measure of central tendency when scores are fairly evenly distributed

Is statistically the most important measure of central tendency (cf. the t-test; the analysis of variance)

15


Mean

The sum of the individual values for each variable divided by the number of cases

$$\bar{X} = \frac{\text{Sum of scores}}{\text{Number of scores}} = \frac{\sum X}{N}$$

16


Mean

The mean tells you the balance point, or the average, of the set of values

With a normal distribution, there is likely to be the same number of scores above and below the mean
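As a small worked illustration of the mean as "sum of scores divided by number of scores" and as a balance point, here is a minimal Python sketch. The scores and variable names are assumptions chosen for this example.

```python
# Illustrative sample scores (assumed for this example)
scores = [7, 6, 3, 3, 1]

# Mean = sum of scores / number of scores
mean = sum(scores) / len(scores)

# Balance point: the deviations above and below the mean cancel out
deviations = [x - mean for x in scores]

print(mean)             # 4.0
print(sum(deviations))  # 0.0
```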

17


Median

The midpoint of a set of ordered numbers

To find the median, arrange the numbers from smallest to largest

It’s useful for distributions that are positively or negatively skewed

18


Normal distribution

19


Normal distribution Skewed distribution

Positively skewed—with a few very high scores

Negatively skewed—with a few very low scores

Note: in positively skewed distributions, the mean is likely to be misleadingly high

In negatively skewed distributions, the mean is likely to be misleadingly low
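The effect of skew on the mean can be sketched with Python's standard `statistics` module. The scores below are hypothetical: a positively skewed set with one very high value.

```python
from statistics import mean, median

# Hypothetical positively skewed scores: one very high value (40)
scores = [2, 3, 3, 4, 4, 5, 40]

skewed_mean = mean(scores)      # pulled upward by the extreme score
skewed_median = median(scores)  # midpoint of the ordered scores

print(round(skewed_mean, 2))  # 8.71
print(skewed_median)          # 4
```

Six of the seven scores fall below the mean here, which is why the median is the safer summary for a set like this.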

20

Curves

21


Mode

The most frequent score(s) in a distribution

Why use the mode?

It’s not so useful with a normal distribution

It is useful for categorical data

22

Curves

23

Curves

24

Measurement Scales

Nominal (categorical or qualitative) scale

E.g., what type of car do you have?

Cf. Salkind, chapter 2

Mode

25


In summary

For categorical data, use only the mode as the measure of central tendency

Use the median when you have extreme scores

Use the mean when you have no extreme scores and no categorical data
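The summary rule can be sketched as a small helper. The function `central_tendency`, its flags, and the sample data are hypothetical and exist only for this illustration; deciding whether a data set has "extreme scores" is left to the analyst.

```python
from statistics import mean, median, multimode

def central_tendency(values, categorical=False, extreme_scores=False):
    """Pick a measure following the summary rule (hypothetical helper)."""
    if categorical:
        return multimode(values)  # list of the most frequent value(s)
    if extreme_scores:
        return median(values)
    return mean(values)

# Hypothetical answers to "what type of car do you have?"
cars = ["sedan", "SUV", "sedan", "truck"]

print(central_tendency(cars, categorical=True))                 # ['sedan']
print(central_tendency([2, 3, 3, 4, 40], extreme_scores=True))  # 3
print(central_tendency([3, 4, 4, 5, 4]))                        # the mean, equal to 4
```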

26

Measures of Dispersion

Variability

Three sets of scores, each with a mean of 4:

7, 6, 3, 3, 1

3, 4, 4, 5, 4

4, 4, 4, 4, 4

27


The most common measures of scatter, or dispersion, are:

Range

Standard deviation

Variance

28


Range

It is calculated by subtracting the lowest score (minimum) from the highest score (maximum) in a distribution of values

It is not at all sensitive to the distribution of scores between min and max

What’s the range of each of the following sets?

7, 6, 3, 3, 1

3, 4, 4, 5, 4
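The range calculation for the two sets in the question can be sketched as follows. The helper name `score_range` is an assumption of this example.

```python
def score_range(scores):
    """Range = highest score (maximum) - lowest score (minimum)."""
    return max(scores) - min(scores)

set_a = [7, 6, 3, 3, 1]
set_b = [3, 4, 4, 5, 4]

print(score_range(set_a))  # 6
print(score_range(set_b))  # 2
```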

29


Standard deviation

It is the average distance from the mean

1. The mean is subtracted from each score

2. The difference is squared to eliminate any negative values and to give additional weight to extreme cases

3. These squared differences are added together & divided by the number of scores minus 1

4. The square root of the result is taken

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$
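The steps can be traced in a short Python sketch. It assumes the N - 1 denominator that appears in the formula, plus a final square root, and checks the result against `statistics.stdev`, which uses the same sample formula. The scores are illustrative.

```python
import math
from statistics import stdev

scores = [7, 6, 3, 3, 1]  # illustrative scores with a mean of 4
n = len(scores)
mean = sum(scores) / n

# Steps 1 and 2: difference between each score and the mean, squared
squared_diffs = [(x - mean) ** 2 for x in scores]

# Step 3: add the squared differences and divide by N - 1
# (assumption: the N - 1 denominator shown in the formula)
mean_square = sum(squared_diffs) / (n - 1)

# Final step: take the square root to return to the original units
s = math.sqrt(mean_square)

print(round(s, 3))                     # 2.449
print(math.isclose(s, stdev(scores)))  # True
```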

30


Standard deviation

The larger the standard deviation, the more spread out the values are, and the more different they are from one another

Unlike the range, the SD is sensitive to every score in a distribution of scores

If the standard deviation = 0, there is no variability in the set of scores, and they are identical in value, which rarely happens.
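To illustrate the contrast with the range, the sketch below compares hypothetical sets of scores. The first two share the same minimum and maximum, so their range is the same, yet their standard deviations differ. The third set is identical throughout, so its SD is 0.

```python
from statistics import stdev

# Hypothetical sets: the first two share min = 1 and max = 7
clustered = [1, 4, 4, 4, 7]
spread_out = [1, 1, 4, 7, 7]
identical = [4, 4, 4, 4, 4]

for name, scores in [("clustered", clustered),
                     ("spread_out", spread_out),
                     ("identical", identical)]:
    score_range = max(scores) - min(scores)
    # Print the range next to the SD (about 2.12, 3, and 0 respectively)
    print(name, score_range, round(stdev(scores), 3))
```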

31


To calculate the variance

The variance is simply the standard deviation squared, i.e., $s^2$.

$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$
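The relationship between the two measures can be checked with the standard `statistics` module, whose `variance` and `stdev` functions both use the N - 1 denominator. The scores are illustrative.

```python
import math
from statistics import stdev, variance

scores = [7, 6, 3, 3, 1]  # illustrative scores with a mean of 4

s = stdev(scores)      # standard deviation, in the original units
s2 = variance(scores)  # variance, in squared units

# Sum of squared differences is 24; divided by N - 1 = 4 gives 6
print(s2 == 24 / 4)              # True
print(math.isclose(s ** 2, s2))  # True
```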

32

Standard Deviation vs. Variance

Both are measures of variability, dispersion, or spread

SD is stated in the original units from which it is derived

Variance is in units that are squared

33

Relationships

Relationships are important to examine because they help in:

answering research questions, e.g., about relationships between independent variables and dependent variables

suggesting new hypotheses and/or questions

34

Correlation

| Variable X | Variable Y | Type of Correlation | Value | Example |
|---|---|---|---|---|
| X increases in value | Y increases in value | Direct (positive) | .00 to +1.00 | The more memory a machine has, the faster the machine becomes |
| X decreases in value | Y decreases in value | Direct (positive) | .00 to +1.00 | The fewer the links to a website, the lower its ranking on Google |
| X increases in value | Y decreases in value | Indirect (negative) | -1.00 to .00 | The more time you spend on an IS, the lower the productivity |
| X decreases in value | Y increases in value | Indirect (negative) | -1.00 to .00 | The less time spent on training, the more mistakes in data entry |

35

Correlation

$$r_{XY} = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[N \sum X^2 - (\sum X)^2\right]\left[N \sum Y^2 - (\sum Y)^2\right]}}$$

Size of $r_{XY}$ and strength of the relationship:

.0 to .2: weak or no relationship

.2 to .4: weak relationship

.4 to .6: moderate relationship

.6 to .8: strong relationship

.8 to 1.0: very strong relationship

Note: The association between 2 or more variables has nothing to do with causality (e.g., ice cream & crime rate)
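The correlation coefficient can be computed directly from the raw-score form of Pearson's r, that is, (N ΣXY - ΣX ΣY) divided by the square root of [N ΣX² - (ΣX)²][N ΣY² - (ΣY)²]. The sketch below assumes this is the formula the slide shows. The function name `pearson_r` and the paired scores are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Hypothetical paired scores for two variables X and Y
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(round(pearson_r(x, y), 3))  # 0.775
```

A value near .775 indicates a strong positive association in these made-up data, and by itself it says nothing about whether X causes Y.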

36

Groups of Correlations

The correlation matrix

| | Info quality | User satisfaction | Attitude | Productivity |
|---|---|---|---|---|
| Info quality | --- | .574 | -.08 | .291 |
| User satisfaction | .574 | --- | -.149 | .199 |
| Attitude | -.08 | -.149 | --- | -.169 |
| Productivity | .291 | .199 | -.169 | --- |
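One way to read a correlation matrix is to look for the pair with the largest absolute correlation and label its strength. The sketch below uses the values from the matrix above. The `strength` helper applies the usual .2-wide bands from "weak or no relationship" to "very strong relationship", and how it treats values exactly on a band boundary is an assumption of this example.

```python
# Correlations from the matrix above (each pair listed once)
correlations = {
    ("Info quality", "User satisfaction"): 0.574,
    ("Info quality", "Attitude"): -0.08,
    ("Info quality", "Productivity"): 0.291,
    ("User satisfaction", "Attitude"): -0.149,
    ("User satisfaction", "Productivity"): 0.199,
    ("Attitude", "Productivity"): -0.169,
}

def strength(r):
    """Label the size of r (boundary handling is an assumption)."""
    size = abs(r)
    if size < 0.2:
        return "weak or no relationship"
    if size < 0.4:
        return "weak relationship"
    if size < 0.6:
        return "moderate relationship"
    if size < 0.8:
        return "strong relationship"
    return "very strong relationship"

# Pair with the largest absolute correlation
strongest = max(correlations, key=lambda pair: abs(correlations[pair]))

print(strongest, strength(correlations[strongest]))
# ('Info quality', 'User satisfaction') moderate relationship
```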

37

Summary of Descriptive Statistics

Descriptive statistics are summaries of distributions of measures or scores

These summaries are useful because of the large and complex nature of different quantitative studies, such as surveys, content analyses, or experiments
