
Post on 21-Jan-2016


1

HRIR 8011

• “Statistics is a collection of procedures and principles for gaining and processing information in order to make decisions when faced with uncertainty.” (Utts, p. 3)

• Objective of HRIR 8011: learning to use information to make good (not lousy) decisions, which requires

  • Collecting information (data)

  • Analyzing data

  • Interpreting the results of the analyses

2

Consider…

• Employees who are dissatisfied with their job are more likely to vote for a union than employees who are satisfied (HRIR 8071)

• Structured interviews are better than open-ended interviews when selecting new employees (HRIR 8031)

• An HR manager asks: what is the market rate of pay?

• An HR manager asks: what can I do to reduce absenteeism?

• If low-paid workers are absent more, do you raise wages?

3

The Focus of HRIR 8011

• Our focus…the procedures and principles of using information correctly

• When Professor Tubre says that you should use a cognitive ability test, question it! How do we know we should use it?

• What information is this conclusion based on?

• How were the data collected? Does that seem applicable to my situation?

• How were the data analyzed? Was that appropriate? What did they miss?

• Are the conclusions justified based on the data and the results?

4

Index Numbers

• Index value = 100 × (current value / base period value)

• Price Index Example: if the current cost is $3,300 and the base period cost is $2,400, then

• Price Index = 100 × (3,300 / 2,400) = 137.5

• Interpretation: the current period is 37.5 percent higher than the base period
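
The index calculation can be sketched in a few lines of Python. This is only an illustration of the formula on this slide. The function name `index_value` and the variable names are made up for the example, and the figures are the $3,300 and $2,400 from the price index example.

```python
def index_value(current_value, base_value):
    """Return 100 x (current value / base period value)."""
    if base_value == 0:
        raise ValueError("base period value must be nonzero")
    return 100 * current_value / base_value


# Figures from the slide's price index example.
price_index = index_value(3300, 2400)

# Percent by which the current period exceeds the base period.
pct_above_base = price_index - 100

print(f"Price index: {price_index}")
print(f"Current period is {pct_above_base} percent above the base period")
```

An index below 100 would mean the current value is lower than the base period value.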

5

Time Series

[Figure: time series plot; vertical axis from 5.0 to 8.5, horizontal axis from Year 1 to Year 3]

6

Measurement

7

Measurement

• Validity

• Reliability

• Bias

8

Seven Measurement Pitfalls

• Deliberate bias

• Unintentional bias

• Desire to please

• Asking the uninformed

• Unnecessary complexity

• Ordering of questions

• Confidentiality and anonymity

• Source: Jessica M. Utts, Seeing Through Statistics, 2nd ed. (Pacific Grove, CA: Duxbury, 1999), p. 32.

9

Cumulative Frequency

• Recall Eggs R Us

            Race |   Freq.   Percent    Cumul.
-----------------+----------------------------
African American |      87     15.10     15.10
  Asian American |       6      1.04     16.15
        Hispanic |      25      4.34     20.49
           white |     458     79.51    100.00
-----------------+----------------------------
           Total |     576    100.00
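
As a rough sketch, the Percent and Cumul. columns can be rebuilt from the frequency counts alone. The counts below are the Eggs R Us figures from the table. The variable names are assumptions made for the example, and the cumulative percents are computed from running counts rather than by adding the rounded percents.

```python
from itertools import accumulate

# Frequency counts taken from the Eggs R Us table.
counts = {
    "African American": 87,
    "Asian American": 6,
    "Hispanic": 25,
    "white": 458,
}

total = sum(counts.values())

# Percent of the total in each category.
percents = [100 * c / total for c in counts.values()]

# Running totals of the counts, converted to percents.
running_counts = list(accumulate(counts.values()))
cumulative_percents = [100 * rc / total for rc in running_counts]

for race, pct, cum in zip(counts, percents, cumulative_percents):
    print(f"{race:>16} | {counts[race]:>5} {pct:>8.2f} {cum:>8.2f}")
print(f"{'Total':>16} | {total:>5} {100:>8.2f}")
```

Working from the running counts explains why the second cumulative entry is 16.15 even though the rounded percents 15.10 and 1.04 add to 16.14.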

10

Percentiles

• The pth percentile of a sample is the value for which at most p% of the measurements are less than that value and at most (100-p)% of the measurements are greater than that value

• Median

• Quartiles

• Deciles
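
The percentile definition can be checked directly in code. The sketch below is illustrative only. It uses the eight-value data set from the cdf example in these slides, a helper named `satisfies_definition` that is invented for the example, and the quartile method of Python's standard `statistics` module, which is one of several interpolation conventions. Other software may report slightly different quartiles.

```python
import statistics

# Data set borrowed from the cdf example in these slides.
data = [1, 4, 6, 4, 10, 9, 3, 5]


def satisfies_definition(data, value, p):
    """Check the slide's definition of the pth percentile:
    at most p% of the measurements are less than `value` and
    at most (100 - p)% are greater than `value`."""
    n = len(data)
    pct_below = 100 * sum(x < value for x in data) / n
    pct_above = 100 * sum(x > value for x in data) / n
    return pct_below <= p and pct_above <= 100 - p


# The median is the 50th percentile.
median = statistics.median(data)

# Quartiles are the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4)

print("median:", median)
print("quartiles:", q1, q2, q3)
print("median satisfies definition:", satisfies_definition(data, median, 50))
```

Deciles follow the same pattern with the data split into ten parts instead of four.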

11

Box Plot

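[Figure: annotated box plot. The whiskers end at the smallest and largest values; the box runs from the lower quartile to the upper quartile, with the median marked inside it. The box covers the middle half of the data.]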

12

Box Plot

[Figure: two box plots, one for Age (axis from 0 to 40) and one for Tenure (axis from 0 to 10)]

13

A Box Plot in Labor

Source: Alan B. Krueger and Alexandre Mas, “Strikes, Scabs, and Tread Separations: Labor Strife and the Production of Defective Bridgestone/Firestone Tires,” Journal of Political Economy 112 (April 2004), pp. 252-289 at 274.

14

A Simple cdf Example

Consider the simple data set:

1, 4, 6, 4, 10, 9, 3, 5

This yields the following relative and cumulative frequencies:

 x    Relative Frequency    Cumulative Frequency
 1         0.125                  0.125
 3         0.125                  0.250
 4         0.250                  0.500
 5         0.125                  0.625
 6         0.125                  0.750
 9         0.125                  0.875
10         0.125                  1.000
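
The relative and cumulative frequencies for this data set can be reproduced with a short Python sketch. The variable names are assumptions made for the example. The data are the eight values from the slide.

```python
from collections import Counter
from itertools import accumulate

# The simple data set from the slide.
data = [1, 4, 6, 4, 10, 9, 3, 5]
n = len(data)

# Count how often each distinct value occurs.
counts = Counter(data)
xs = sorted(counts)

# Relative frequency: share of the n observations equal to x.
relative = [counts[x] / n for x in xs]

# Cumulative frequency: running sum of the relative frequencies.
cumulative = list(accumulate(relative))

print(f"{'x':>2}  {'Relative':>8}  {'Cumulative':>10}")
for x, rel, cum in zip(xs, relative, cumulative):
    print(f"{x:>2}  {rel:>8.3f}  {cum:>10.3f}")
```

The value 4 appears twice in the data, which is why its relative frequency is 0.250 rather than 0.125.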

15

A Simple cdf Example

[Figure: cdf plot built up step by step; x-axis from 0 to 10, y-axis from 0.000 to 1.000 in steps of 0.125]

1. To make the cdf, start at zero and move to the right along the x-axis until you come to the first value of x (that is, x=1)

16

A Simple cdf Example


2. The value x=1 accounts for 0.125 of the cumulative frequency so the cdf jumps up to 0.125

17

A Simple cdf Example


3. Now continue to the right until you get to the next value (x=3) at which point the cdf jumps up another 0.125 to 0.250.

18

A Simple cdf Example


4. At x=4, note that the relative frequency is 0.25 (recall that there were two occurrences of 4 in the data set) so the cdf jumps 0.25 to 0.50.

19

A Simple cdf Example


5. Continuing for the remaining x values yields the completed cdf.
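
The five steps describe a step function: the cdf stays flat between data values and jumps at each one. A minimal sketch of that function, assuming the same eight-value data set and a helper named `ecdf` invented for the example, might look like this:

```python
from bisect import bisect_right

# The simple data set from the cdf example.
data = [1, 4, 6, 4, 10, 9, 3, 5]
sorted_data = sorted(data)


def ecdf(x):
    """Fraction of observations less than or equal to x."""
    # bisect_right counts how many sorted values are <= x.
    return bisect_right(sorted_data, x) / len(sorted_data)


# Walk along the x-axis from 0 to 10, as in the slides.
for x in range(0, 11):
    print(f"x = {x:>2}  cdf = {ecdf(x):.3f}")
```

The printed values stay at 0.000 until x = 1, jump by 0.125 at each distinct value, and jump by 0.250 at x = 4 because that value occurs twice.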

20

Birth of a Distribution

< 5 5 to 9 > 9

3 Bins

21

Birth of a Distribution

<2 2-4 4-6 6-8 8-10 10-12 >12

7 Bins

22

Birth of a Distribution

15 Bins

23

Birth of a Distribution

33 Bins

24

Birth of a Distribution

1000 Bins
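
The effect of increasing the number of bins can be imitated with simulated data. The sketch below is not the data behind these slides. It assumes a normally distributed sample (mean 10, standard deviation 3, both chosen arbitrarily) and uses equal-width bins across the observed range, whereas the 3-bin and 7-bin slides use open-ended end bins. The helper `bin_counts` is invented for the example.

```python
import random


def bin_counts(data, n_bins):
    """Count observations in n_bins equal-width bins spanning the data range."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in data:
        # The largest value would land one past the end, so clamp it.
        i = min(int((value - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical sample; the seed only makes the run repeatable.
rng = random.Random(8011)
sample = [rng.gauss(10, 3) for _ in range(10_000)]

for n_bins in (3, 7, 15, 33, 1000):
    counts = bin_counts(sample, n_bins)
    print(f"{n_bins:>4} bins: tallest bin holds {max(counts)} of {len(sample)} observations")
```

As the bins get narrower, each one holds a smaller share of the data and the outline of the histogram approaches a smooth curve.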

25

Different Distributions

26

Even More Distributions

27

Symmetrical Distributions

28

Symmetrical Distribution

29

Positively Skewed

30

Negatively Skewed

31

Symmetrical Distribution

Bell-shaped, symmetrical distribution

Will be very important for

statistical inference

32

Additional Variance Example

[Figure: two histograms, each on a horizontal axis from 11 to 17: "Cyberland (1st 10 obs)" and "Contrived Sample"]

Cyberland (1st 10 obs): X̄ = 13.9, spread statistic = 1.97

Contrived Sample: X̄ = 13.9, spread statistic = 0.30
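
The point of the comparison is that two samples can share a mean and still differ in spread. The sketch below uses two hypothetical ten-observation samples, not the Cyberland data, chosen so that both have a mean of 13.9. It is not meant to reproduce the 1.97 and 0.30 shown on the slide.

```python
import statistics

# Hypothetical samples (NOT the Cyberland data); both have mean 13.9.
spread_out = [11, 12, 13, 14, 14, 14, 15, 15, 15, 16]
bunched_up = [13, 14, 14, 14, 14, 14, 14, 14, 14, 14]

for name, sample in (("spread out", spread_out), ("bunched up", bunched_up)):
    mean = statistics.mean(sample)
    # Sample variance: sum of squared deviations divided by n - 1.
    variance = statistics.variance(sample)
    std_dev = statistics.stdev(sample)
    print(f"{name}: mean = {mean:.1f}, variance = {variance:.2f}, std dev = {std_dev:.2f}")
```

The means match, but the variance of the first sample is far larger because its observations sit farther from 13.9.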
