Post on 26-Dec-2015
Data Presentation
Descriptive Statistics
• Descriptive statistics provide procedures to organize data we have collected from studies, summarize sample findings, and present these summaries in ways that can be easily communicated to others.
Descriptive Statistics
• The goal of descriptive statistics is to summarize a collection of data in a clear and understandable way.
– What is the pattern of scores over the range of possible values?
– Where, on the scale of possible scores, is a point that best represents the set of scores?
– Do the scores cluster about their central point or do they spread out around it?
Display
• Graphs often make it easier to see certain characteristics and trends in a set of data.
– Graphs for quantitative data.
• Histogram
• Frequency Polygon
• Stem and Leaf Display
– Graphs for qualitative data.
• Bar Chart
• Pie Chart
Data
• Classifications or Scales
– Nominal - groups subjects into mutually exclusive categories; numerals represent category labels only (sex, nationality, blood type, clinical diagnosis)
– Ordinal - gives a quantitative order to the variable; numbers indicate rank order of observations (manual muscle test, functional status, pain)
– Interval - equal units of measurement between each division, but no true zero, so it cannot represent absolute quantity (calendar years, IQ, temperature in °C or °F)
– Ratio - interval scale with an absolute zero (distance, age, time, weight, strength, blood pressure)
Scales of Measurement
• Discrete variable:
– Consists of separate, indivisible categories; no values between neighboring categories.
• e.g., students in a class; psychiatric disorders
• Continuous variable:
– Divisible into an infinite number of fractional parts.
• e.g., height, weight, time.
Scores on continuous variables are actually intervals – therefore, they may have boundaries called real limits.
Scales of Measurement
1. nominal: Set of categories, but no quantitative distinctions between categories.
example: professions
Scales of Measurement
2. ordinal: Categories ranked in terms of magnitude.
example: ranking participants in a race
Scales of Measurement
3. interval: Ordered categories with equal intervals between them; however, ratios of magnitudes are not meaningful.
example: IQ scores
Is a person with an IQ of 200 twice as intelligent as a person with an IQ of 100?
Scales of Measurement
4. ratio: An interval scale with the additional feature of an absolute zero (ratios are meaningful).
example: time measurements
Is two hours twice as long as one hour?
Graduate students’ anxiety scores
51 50 50 50 51 48 46 48 46 47
50 48 49 46 50 47 47 47 49 49
45 46 46 47 46 46 47 47 44 45
46 46 48 47 46 45 48 48 48 47
47 44 49 47 48 47 49 47 45 48
49 48 48 49 45 49 47 45 47 44
48 46 46 48 48 48 47 47 46 47
44 45 44 46 49 46 47 46 45 47
47 49 43 47 46 45 45 47 48 48
47 48 43 48 46 46 48 45 46 47
• First you must list your scores in order.
• Next, record the number of times each score occurs.
Anxiety Scores
Check each time these numbers occur.
X       Freq (f)   Relative Freq.
51      2          .02
50      5          .05
49      10         .10
48      20         .20
47      25         .25
46      20         .20
45      11         .11
44      5          .05
43      2          .02
Total   100        1.00
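The two steps above (order the scores, then count each one) can be sketched in Python. This is a minimal illustration, assuming the 100 anxiety scores are typed in as a list; the variable names are placeholders, not part of the original slides.

```python
from collections import Counter

# The 100 graduate students' anxiety scores from the example.
scores = [
    51, 50, 50, 50, 51, 48, 46, 48, 46, 47,
    50, 48, 49, 46, 50, 47, 47, 47, 49, 49,
    45, 46, 46, 47, 46, 46, 47, 47, 44, 45,
    46, 46, 48, 47, 46, 45, 48, 48, 48, 47,
    47, 44, 49, 47, 48, 47, 49, 47, 45, 48,
    49, 48, 48, 49, 45, 49, 47, 45, 47, 44,
    48, 46, 46, 48, 48, 48, 47, 47, 46, 47,
    44, 45, 44, 46, 49, 46, 47, 46, 45, 47,
    47, 49, 43, 47, 46, 45, 45, 47, 48, 48,
    47, 48, 43, 48, 46, 46, 48, 45, 46, 47,
]

counts = Counter(scores)  # score -> number of times it occurs
n = len(scores)

# One row per score, highest score first, as in the table:
# (score, frequency, relative frequency).
table = [(x, counts[x], counts[x] / n) for x in sorted(counts, reverse=True)]

print(" X    f   rel. f")
for x, f, rel in table:
    print(f"{x:>2} {f:>4}   {rel:.2f}")
print(f"Total {n}")
```

Sorting the distinct scores in descending order reproduces the slide's layout; the relative frequency is simply each count divided by N.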
Cumulative Frequency Distribution
X    freq   Cumulative Freq
51   2      100
50   5      98
49   10     93
48   20     83
47   25     63
46   20     38
45   11     18
44   5      7
43   2      2
Records all subjects who obtained a particular score or lower.
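As a rough sketch, the cumulative column can be built as a running total over the frequencies, starting from the lowest score. The dictionary below restates the anxiety-score frequencies; the names `freq` and `cum_freq` are illustrative.

```python
from itertools import accumulate

# Frequency of each anxiety score (score -> count).
freq = {43: 2, 44: 5, 45: 11, 46: 20, 47: 25, 48: 20, 49: 10, 50: 5, 51: 2}

# Running total from the lowest score upward: each entry counts everyone
# who obtained that score or lower.
xs = sorted(freq)
cum_freq = dict(zip(xs, accumulate(freq[x] for x in xs)))

for x in reversed(xs):  # highest score at the top, as in the table
    print(x, freq[x], cum_freq[x])
```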
The Percentile Rank
• This is a measure of relative standing, i.e., it tells us where a particular score falls in relation to the rest of the data set.
• In fact, it tells us what percentage of scores in a data set fall at or below a particular score.
• For a score at the pth percentile, p% of the scores fall at or below that score.
• E.g., on his first stats test, a student scored at the 70th percentile. This means 70% of the class scored the same as or lower than that student.
• You need a cumulative frequency distribution when calculating percentile rank.
E.g., the following scores are received on a stats exam marked out of 20.
X    freq   Cumulative freq
20   1      15
19   2      14
16   2      12
14   1      10
12   4      9
11   2      5
10   3      3
What is the percentile rank of the score 14?
• Step 1 - Count the number of scores at and below the score you’re looking at.
• In this case, 10 scores fall at or below 14.
• Step 2 - Divide this number by N and multiply by 100 to get the percentile rank.
• In this case, 10/15 × 100 = 66.7%, or about 67%.
• The score falls at the 67th percentile.
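The two steps can be written as a short function. This is only a sketch of the "at or below" rule used in this example; the function name and the dictionary layout are assumptions made for illustration.

```python
# Exam scores out of 20 (score -> frequency), N = 15.
exam = {20: 1, 19: 2, 16: 2, 14: 1, 12: 4, 11: 2, 10: 3}

def percentile_rank(freq, score):
    """Percentage of scores that fall at or below `score`."""
    n = sum(freq.values())
    # Step 1: count the scores at and below the score of interest.
    at_or_below = sum(f for x, f in freq.items() if x <= score)
    # Step 2: divide by N and multiply by 100.
    return 100 * at_or_below / n

print(round(percentile_rank(exam, 14)))  # 10/15 * 100, rounded to 67
```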
• When the score in question is obtained more than once, a couple of steps must be added.
E.g., Graduate Students’ Anxiety Scores
X    freq   Cumulative Freq
51   2      100
50   5      98
49   10     93
48   20     83
47   25     63
46   20     38
45   11     18
44   5      7
43   2      2
What is the percentile rank of the score of 46?
• Step 1 - Count the number of scores below the score you’re looking at.
• In this case, 18 scores fall below 46.
• Step 2 - Divide the number of scores the same as the one you’re looking at by 2.
• In this case, 20 people scored 46.
• 20/2 = 10.
• Step 3 - Add this number to the total from step 1.
• 10 + 18 = 28.
• Step 4 - Divide this number by N and multiply by 100 to get the percentile rank.
• 28/100 x 100 = 28%.
• The score 46 falls at the 28th percentile.
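The four steps for a repeated score can be sketched the same way. The function below counts the scores strictly below the target and adds half of the scores tied with it; its name is hypothetical.

```python
# Anxiety-score frequencies (score -> count), N = 100.
freq = {43: 2, 44: 5, 45: 11, 46: 20, 47: 25, 48: 20, 49: 10, 50: 5, 51: 2}

def percentile_rank_with_ties(freq, score):
    """Percentile rank counting half of the scores tied with `score`."""
    n = sum(freq.values())
    below = sum(f for x, f in freq.items() if x < score)  # Step 1
    half_ties = freq.get(score, 0) / 2                    # Step 2
    return 100 * (below + half_ties) / n                  # Steps 3 and 4

print(percentile_rank_with_ties(freq, 46))  # (18 + 10) / 100 * 100 = 28.0
```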
The Percentile Rank (backwards)
• We know how to find the percentile rank that corresponds to a score, but what if we want to do the reverse?
• What if we want to find the score that corresponds to a certain percentile rank?
E.g., the following scores are obtained on an exam marked out of 20.
X    freq   Cumulative freq
20   1      15
19   2      14
16   2      12
14   1      10
12   4      9
11   2      5
10   3      3
What score is at the 75th percentile?
• Step 1- Multiply the decimal form of the percentile rank by N.
• 0.75 × 15 = 11.25, which rounds up to 12.
• This tells you that you are looking for the 12th score in the cumulative frequency distribution.
• Step 2- Locate this score on the cumulative frequency distribution.
• The score at the 75th percentile is 16.
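A minimal sketch of the reverse lookup follows. It assumes that a fractional position is rounded up to the next whole score position (0.75 × 15 = 11.25 becomes 12); that rounding rule and the function name are assumptions, since other textbooks interpolate instead.

```python
import math

# Exam scores out of 20 (score -> frequency), N = 15.
exam = {20: 1, 19: 2, 16: 2, 14: 1, 12: 4, 11: 2, 10: 3}

def score_at_percentile(freq, p):
    """Return the score at the p-th percentile (0 < p <= 100)."""
    n = sum(freq.values())
    # Step 1: position of the score we are looking for (rounded up; assumption).
    target = math.ceil(p / 100 * n)
    # Step 2: walk up the cumulative frequencies until that position is reached.
    running = 0
    for x in sorted(freq):
        running += freq[x]
        if running >= target:
            return x
    raise ValueError("p must be between 0 and 100")

print(score_at_percentile(exam, 75))  # the 12th score from the bottom is 16
```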
Frequency Distributions
• Simply a way of organizing and making sense of a data set. It’s difficult to get a sense of what the scores are really like when you just look at a data set.
• E.g., A hundred graduate students are given a test to measure their anxiety. They receive the following scores. Scores can range from 40 (low) to 55 (high).
What is the pattern of scores?
– Create a Frequency Distribution
• Frequency distributions organize raw data or observations that have been collected.
• Ungrouped Data
– Listing all possible scores that occur in a distribution and then indicating how often each score occurs.
• Grouped Data
– Combining all possible scores into classes and then indicating how often each score occurs within each class.
– Easier to see patterns in the data, but lose information about individual scores.
An Example: Grouped Frequency Distribution
• Find the lowest and highest score (order scores from lowest to highest).
– 891 is highest score.
– 52 is lowest score.
• Find the range by subtracting the lowest score from the highest score.
– 891 - 52 = 839
• Divide range by 10.
– 839/10 = 83.9
• Round off to the nearest convenient width.
– 100
• Determine the score at which the lowest interval should begin (a multiple of the class width).
– 0
472 303 280 417 400 257 205 384 282
264 317 76 643 480 136 250 100 732
317 264 384 750 402 422 373 325 313
749 791 196 891 283 52 186 693
Las Vegas Hotel Rates
An Example: Grouped Frequency Distribution
• Record the limits of all class intervals, placing the interval containing the highest score value at the top.
• Count up the number of scores in each interval.
Hotel Rates   Frequency
800-899       1
700-799       4
600-699       2
500-599       0
400-499       6
300-399       8
200-299       8
100-199       4
0-99          2
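The grouping procedure can be sketched in Python as follows. The class width of 100 and the starting point of 0 are taken from the example; the variable names are illustrative only.

```python
# Las Vegas hotel rates from the example (35 values).
rates = [
    472, 303, 280, 417, 400, 257, 205, 384, 282,
    264, 317, 76, 643, 480, 136, 250, 100, 732,
    317, 264, 384, 750, 402, 422, 373, 325, 313,
    749, 791, 196, 891, 283, 52, 186, 693,
]

width = 100  # 839 / 10 = 83.9, rounded to a convenient width

# Lowest interval starts at a multiple of the width at or below the minimum.
low = (min(rates) // width) * width
high = (max(rates) // width) * width

# Create every interval up front so that empty ones (500-599) still appear.
counts = {start: 0 for start in range(low, high + width, width)}
for r in rates:
    counts[(r // width) * width] += 1

for start in sorted(counts, reverse=True):  # highest interval at the top
    print(f"{start}-{start + width - 1}: {counts[start]}")
```

Pre-filling the dictionary keeps the intervals continuous, which matches the guideline that empty intervals should still be listed.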
Frequency Table Guidelines
• Intervals should not overlap, so no score can belong to more than one interval.
• Make all intervals the same width.
• Make the intervals continuous throughout the distribution (even if an interval is empty).
• Place the interval with the highest score at the top.
• For most work, use 10 class intervals.
• Choose a convenient interval width.
• When possible, make the lower score limit a multiple of the interval width.
An Example: Grouped Frequency Distribution
• Proportion (Relative Frequency)
– Divide frequency of each class by total frequency.
– Used when you want to compare the frequencies of one distribution with another when the total number of data points is different.
Hotel Rates   Frequency   Proportion
800-899       1           .03
700-799       4           .11
600-699       2           .06
500-599       0           0
400-499       6           .17
300-399       8           .23
200-299       8           .23
100-199       4           .11
0-99          2           .06
N = 35
An Example: Grouped Frequency Distribution
• Percentage
– Proportion × 100
Hotel Rates   Frequency   Proportion   Percent
800-899       1           .03          3
700-799       4           .11          11
600-699       2           .06          6
500-599       0           0            0
400-499       6           .17          17
300-399       8           .23          23
200-299       8           .23          23
100-199       4           .11          11
0-99          2           .06          6
An Example: Grouped Frequency Distribution
• Cumulative Frequency
– Shows total number of observations in each class and all lower classes.
Hotel Rates   Frequency   Proportion   Percent   Cumulative Frequency
800-899       1           .03          3         35
700-799       4           .11          11        34
600-699       2           .06          6         30
500-599       0           0            0         28
400-499       6           .17          17        28
300-399       8           .23          23        22
200-299       8           .23          23        14
100-199       4           .11          11        6
0-99          2           .06          6         2
An Example: Grouped Frequency Distribution
• Cumulative Proportion (Cumulative Relative Frequency)
– Divide Cumulative Frequency by Total Frequency
• Percentile Rank
– Cumulative Proportion × 100
Hotel Rates   Frequency   Proportion   Percent   Cumulative Frequency   Cumulative Proportion   Percentile Rank
800-899       1           .03          3         35                     1                       100
700-799       4           .11          11        34                     .97                     97
600-699       2           .06          6         30                     .86                     86
500-599       0           0            0         28                     .80                     80
400-499       6           .17          17        28                     .80                     80
300-399       8           .23          23        22                     .63                     63
200-299       8           .23          23        14                     .40                     40
100-199       4           .11          11        6                      .17                     17
0-99          2           .06          6         2                      .06                     6
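All of the derived columns follow from the frequency column alone. The sketch below rebuilds them for the hotel-rate table; rounding to two decimals and to whole percentages is an assumption chosen to match the figures shown, and the names are illustrative.

```python
from itertools import accumulate

# Class intervals from lowest to highest, with their frequencies.
intervals = ["0-99", "100-199", "200-299", "300-399", "400-499",
             "500-599", "600-699", "700-799", "800-899"]
freq = [2, 4, 8, 8, 6, 0, 2, 4, 1]

n = sum(freq)                    # total frequency, N = 35
cum_freq = list(accumulate(freq))  # each class plus all lower classes

rows = []
for label, f, cf in zip(intervals, freq, cum_freq):
    rows.append((
        label,
        f,
        round(f / n, 2),      # proportion (relative frequency)
        round(100 * f / n),   # percent
        cf,                   # cumulative frequency
        round(cf / n, 2),     # cumulative proportion
        round(100 * cf / n),  # percentile rank of the interval's upper end
    ))

for row in reversed(rows):  # highest interval at the top
    print(*row)
```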
Table 1: Examination scores for 80 students
72 49 81 52 31 38 81 58 68 73 43 56 45 54 40 81 60 52 52 38 79 83 63 58 59 71 89 73 77 60 65 60 69 88 75 59 52 75 70 93 90 62 91 61 53 83 32 49 39 57 39 28 67 74 61 42 39 76 68 65 58 49 72 29 70 56 48 60 36 79 72 65 40 49 37 63 72 58 62 46
Stem-and-leaf display
2 | 8 9
3 | 1 8 8 2 9 9 9 6 7
4 | 9 3 5 0 9 2 9 8 0 9 6
5 | 2 8 6 4 2 2 8 9 9 2 3 7 8 6 8
6 | 8 0 3 0 5 0 9 2 1 7 1 8 5 0 5 3 2
7 | 2 3 9 1 3 7 5 5 0 4 6 2 0 9 2 2
8 | 1 1 1 3 9 8 3
9 | 3 0 1
Stem-and-leaf display
2 | 8 9
3 | 1 2 6 7 8 8 9 9 9
4 | 0 0 2 3 5 6 8 9 9 9 9
5 | 2 2 2 2 3 4 6 6 7 8 8 8 8 9 9
6 | 0 0 0 0 1 1 2 2 3 3 5 5 5 7 8 8 9
7 | 0 0 1 2 2 2 2 3 3 4 5 5 6 7 9 9
8 | 1 1 1 3 3 8 9
9 | 0 1 3
Stem-and-leaf display
Break each number into its tens and units digits.
Tally together values which share the tens digit.
The tens digits will then be aligned vertically, with the units digits displayed to the side.
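The construction can be sketched in a few lines of Python, using the 80 examination scores from Table 1. Sorting the scores first yields the ordered version of the display; the variable names are illustrative.

```python
from collections import defaultdict

# Table 1: examination scores for 80 students.
scores = [
    72, 49, 81, 52, 31, 38, 81, 58, 68, 73,
    43, 56, 45, 54, 40, 81, 60, 52, 52, 38,
    79, 83, 63, 58, 59, 71, 89, 73, 77, 60,
    65, 60, 69, 88, 75, 59, 52, 75, 70, 93,
    90, 62, 91, 61, 53, 83, 32, 49, 39, 57,
    39, 28, 67, 74, 61, 42, 39, 76, 68, 65,
    58, 49, 72, 29, 70, 56, 48, 60, 36, 79,
    72, 65, 40, 49, 37, 63, 72, 58, 62, 46,
]

# Break each score into its tens digit (stem) and units digit (leaf),
# and collect the leaves that share a stem.
stems = defaultdict(list)
for s in sorted(scores):
    stems[s // 10].append(s % 10)

# Stems aligned vertically, leaves listed to the side.
for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```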
Frequency distribution of categorical data
Response Frequency
Cry 25
Express anger 15
Withdraw 5
Play with another toy 5
Total 50
Table 2: Responses of young boys to removal of toy
Organizing data? Isn’t this table the original raw data?
Comparing distributions
Response Male Female
Cry 25 56
Express anger 15 6
Withdraw 5 8
Play with another toy 5 30
Total 50 100
Table 4: Response to removal of toy by gender of child
More girls withdraw?
Comparing distributions
Response Male Female
Cry 50% 56%
Express anger 30% 6%
Withdraw 10% 8%
Play with another toy 10% 30%
Total 100% 100%
Percentage distribution
Comparing distributions
• Comparing distributions is a common procedure.
• If the total numbers of cases are equal, the frequency distributions can be compared directly.
• In general, the totals differ, so we use percentage distributions to make comparisons.
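To illustrate why percentages are preferred when group sizes differ, the sketch below converts the toy-removal counts for boys (N = 50) and girls (N = 100) into percentage distributions. The helper name is hypothetical.

```python
# Response counts from the toy-removal example.
male = {"Cry": 25, "Express anger": 15, "Withdraw": 5, "Play with another toy": 5}
female = {"Cry": 56, "Express anger": 6, "Withdraw": 8, "Play with another toy": 30}

def percentage_distribution(freq):
    """Convert raw counts to percentages of the group total."""
    total = sum(freq.values())
    return {response: 100 * f / total for response, f in freq.items()}

male_pct = percentage_distribution(male)
female_pct = percentage_distribution(female)

for response in male:
    print(f"{response:<22} {male_pct[response]:>5.0f}% {female_pct[response]:>5.0f}%")
```

In raw counts more girls than boys withdraw (8 versus 5), but as percentages the boys withdraw slightly more often (10% versus 8%).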
Grouped distribution
• Grouped frequency/percentage distributions present raw (unprocessed) data in a more readily usable form.
• The price for this is the loss of some information.
• The trade-off is usually worthwhile.