Tables and Non Parametric Tests
Lecture 5
Compare Means Menu
Compare Means Output
Group Statistics
              Group   N    Mean   Std. Deviation   Std. Error Mean
Observation   1       10   2,40   1,275            ,403
              2       10   2,85   1,247            ,394
Independent Samples Test (Observation)
Levene's Test for Equality of Variances (equal variances assumed): F = ,004, Sig. = ,949
t-test for Equality of Means
                                                                                                95% Confidence Interval of the Difference
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Equal variances assumed       -,793   18       ,438              -,447             ,564                    -1,632   ,738
Equal variances not assumed   -,793   17,991   ,438              -,447             ,564                    -1,632   ,738
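As a cross-check on the "Equal variances assumed" row, the pooled t statistic can be recomputed from the Group Statistics alone. The sketch below is a plain-Python illustration, not SPSS's own routine; the function name is made up for this example, and the critical value 2.101 is the tabulated two-sided 5% point of Student's t with 18 degrees of freedom. Because the printed means are rounded, the result agrees with the output only up to rounding.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic (equal variances assumed) from group summaries."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff / se_diff, df, diff, se_diff

# Group 1 and Group 2 summaries as printed in the Group Statistics table.
t, df, diff, se = pooled_t_from_summary(2.40, 1.275, 10, 2.85, 1.247, 10)

# Two-sided 5% critical value of Student's t with 18 df (table value).
T_CRIT_18 = 2.101
ci = (diff - T_CRIT_18 * se, diff + T_CRIT_18 * se)

print(f"t = {t:.3f}, df = {df}, SE of difference = {se:.3f}")
print(f"95% CI for the difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval contains 0, which matches the non-significant result (Sig. = ,438) in the output.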
”Service package”
”Important package”
Non-normal Data
[Figure: Normal Q-Q Plot of Not Normal (x-axis: Observed Value, y-axis: Expected Normal Value)]
[Figure: Normal Q-Q Plot of Not Normal Either (x-axis: Observed Value, y-axis: Expected Normal Value)]
Log-normal data
Transform Data
Compare the means of the transformed (normal) data
Binomial data
Really non-normal data
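For the log-normal case, "transform, then compare means" can be sketched as follows. The two samples are synthetic (drawn with a fixed seed purely for illustration; they are not the lecture's data), and the Welch-type t statistic is one reasonable choice for the comparison, not necessarily the one used in the lecture.

```python
import math
import random
import statistics

random.seed(1)
# Synthetic log-normal samples (assumed parameters, illustration only).
sample_a = [random.lognormvariate(1.0, 0.5) for _ in range(50)]
sample_b = [random.lognormvariate(1.3, 0.5) for _ in range(50)]

def welch_t(x, y):
    """t statistic for the difference in means without assuming equal variances."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se

# Taking logs turns log-normal data into normal data.
log_a = [math.log(v) for v in sample_a]
log_b = [math.log(v) for v in sample_b]

t_raw = welch_t(sample_a, sample_b)   # on the skewed original scale
t_log = welch_t(log_a, log_b)         # on the transformed (normal) scale
print(f"t on raw data: {t_raw:.2f}, t on log data: {t_log:.2f}")
```

A difference in means on the log scale corresponds to a ratio of medians on the original scale, which is worth keeping in mind when reporting the result.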
Binomial Data
Are the proportions of Turks in Aalborg and Århus the same?
          Non-Turks   Turks   % Turks
Aalborg   465         35      7.0%
Århus     358         42      10.5%
Are the proportions significantly different?
Compare 3.5% (= 10.5% – 7.0%) with a suitable SE.
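One common choice of "suitable SE" is the pooled standard error of a two-proportion z-test. The sketch below assumes that choice (the slide does not say which SE is meant) and uses only the counts from the table; the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Difference in proportions, pooled SE, z statistic and two-sided p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)               # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p2 - p1, se, z, p_value

# Turks in Aalborg: 35 of 500; Turks in Århus: 42 of 400.
diff, se, z, p_value = two_proportion_z(35, 500, 42, 400)
print(f"difference = {diff:.3f}, SE = {se:.4f}, z = {z:.3f}, p = {p_value:.3f}")
```

The difference of 3.5 percentage points is a little under two standard errors, so it falls just short of significance at the 5% level.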
Another Approach
          Observed              Expected
          Non-Turks   Turks     Non-Turks   Turks
Aalborg   465         35        457         43
Århus     358         42        366         34
In total 77 Turks in a sample of 900, i.e. 8.6%.
We expect 34 Turks in Århus (8.6% of 400).
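The expected counts follow from the row and column totals: each expected cell is (row total × column total) / grand total. A minimal sketch using the observed table:

```python
# Observed counts; columns are Non-Turks, Turks.
observed = [
    [465, 35],   # Aalborg
    [358, 42],   # Århus
]

row_totals = [sum(row) for row in observed]          # city sizes
col_totals = [sum(col) for col in zip(*observed)]    # Non-Turks, Turks overall
n = sum(row_totals)

# Expected count under "same proportion in both cities".
expected = [[r * c / n for c in col_totals] for r in row_totals]

for city, row in zip(("Aalborg", "Århus"), expected):
    print(city, [round(v, 1) for v in row])
```

Rounded to whole persons, these are the 457, 43, 366 and 34 shown in the Expected table.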
Same proportion in Aalborg and Århus?
          Observed              Expected
          Non-Turks   Turks     Non-Turks   Turks
Aalborg   465         35        457         43
Århus     358         42        366         34
Observed and expected should be close.
How to do it in SPSS
…or data could be organized in 900 rows
Cross-Tabs
City * Etnicity Crosstabulation
Count
                 Etnicity
City      Non-Turk   Turk   Total
Aalborg   465        35     500
Århus     358        42     400
Total     823        77     900
Tricks
City * Etnicity Crosstabulation
                                  Etnicity
City                       Non-Turk   Turk     Total
Aalborg   Count            465        35       500
          Expected Count   457,2      42,8     500,0
          % within City    93,0%      7,0%     100,0%
Århus     Count            358        42       400
          Expected Count   365,8      34,2     400,0
          % within City    89,5%      10,5%    100,0%
Total     Count            823        77       900
          Expected Count   823,0      77,0     900,0
          % within City    91,4%      8,6%     100,0%
Output
Chi-Square Tests
                               Value       df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             3,480 (b)   1    ,062
Continuity Correction (a)      3,047       1    ,081
Likelihood Ratio               3,454       1    ,063
Fisher's Exact Test                                                     ,072                   ,041
Linear-by-Linear Association   3,476       1    ,062
N of Valid Cases               900
a. Computed only for a 2x2 table.
b. 0 cells (,0%) have expected count less than 5. The minimum expected count is 34,22.
Expected values
Proportions
P-value
Test Statistic
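The first three rows of the Chi-Square Tests table can be reproduced by hand from the observed counts. The sketch below uses the textbook formulas for the Pearson statistic, the Yates continuity correction and the likelihood ratio statistic; it is an illustration of those formulas, not SPSS's code, and the p-value helper is only valid for 1 degree of freedom.

```python
import math

def chi2_sf_1df(x):
    """Upper tail probability of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2))

observed = [[465, 35], [358, 42]]                 # Aalborg, Århus
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
n = sum(rows)
expected = [[r * c / n for c in cols] for r in rows]

# Pair each observed cell with its expected count.
cells = [(o, e) for o_row, e_row in zip(observed, expected)
         for o, e in zip(o_row, e_row)]

pearson = sum((o - e) ** 2 / e for o, e in cells)
yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in cells)

for name, stat in (("Pearson", pearson), ("Continuity correction", yates),
                   ("Likelihood ratio", likelihood_ratio)):
    print(f"{name}: {stat:.3f}, p = {chi2_sf_1df(stat):.3f}")
```

All three p-values sit just above 0.05, so the conclusion does not depend on which of the three statistics is used.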
Binomial
One-Sample: Is proportion equal to 10%?
1. Calculate proportion and 95% CI
2. Is 10% in the CI?
…or use SPSS as I will show later
Two-Sample: Proportions in Aalborg and Århus are equal
K-Sample: Proportions in Aalborg, Randers, Vester Hjermislev and Århus are equal
Cross-Tabs handles two or more cities (categories)
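The one-sample recipe (compute the proportion and a 95% CI, then check whether 10% lies inside) can be sketched with a normal-approximation (Wald) interval. Using the overall sample of 77 Turks out of 900 as the example is an assumption made here for illustration; the slide does not say which sample the 10% question refers to.

```python
import math
from statistics import NormalDist

def wald_ci(successes, n, level=0.95):
    """Sample proportion with a normal-approximation confidence interval."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for 95%
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - half_width, p_hat + half_width

# Illustrative input: 77 Turks in the combined sample of 900.
p_hat, lower, upper = wald_ci(77, 900)
contains_10_percent = lower <= 0.10 <= upper

print(f"proportion = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("10% inside CI:", contains_10_percent)
```

If the hypothesised value lies inside the interval, the hypothesis is not rejected at the 5% level.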
Non-Normal Data
[Figure: Observations (axis from 1,00 to 5,00) shown against their Ranks (axis from 0 to 20)]
Statistics on Ranks
Rank
            Sample 1   Sample 2
            1          4
            2          6
            3          7
            5          10
            8          12
            9          13
            11         16
            14         17
            15         19
            18         20
Mean rank   8.6        12.4
Mean Ranks should be close if the two distributions are located similarly.
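The two mean ranks can be recomputed from the rank table, and the rank sums also give the Mann-Whitney U statistic. Reading the table as one column of ranks per sample is an interpretation of the slide layout; the names `ranks_1` and `ranks_2` are illustrative.

```python
# Ranks of the 20 pooled observations, one list per sample.
ranks_1 = [1, 2, 3, 5, 8, 9, 11, 14, 15, 18]
ranks_2 = [4, 6, 7, 10, 12, 13, 16, 17, 19, 20]

n1, n2 = len(ranks_1), len(ranks_2)
mean_rank_1 = sum(ranks_1) / n1
mean_rank_2 = sum(ranks_2) / n2

# Mann-Whitney U: rank sum minus the smallest rank sum the sample could have.
u1 = sum(ranks_1) - n1 * (n1 + 1) // 2
u2 = sum(ranks_2) - n2 * (n2 + 1) // 2

print(f"mean ranks: {mean_rank_1}, {mean_rank_2}")
print(f"U1 = {u1}, U2 = {u2}, U1 + U2 = {u1 + u2} (= n1 * n2)")
```

Under the null hypothesis both U values would be near n1 × n2 / 2 = 50; the further apart they are, the stronger the evidence for a location shift.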
How to do it in SPSS
Output
Descriptive Statistics
                                                                  Percentiles
              N     Mean     Std. Deviation   Minimum   Maximum   25th     50th (Median)   75th
Observation   210   3,2366   1,11176          1,10      4,50      2,0000   3,6000          4,2000
City          210   1,59     ,493             1         2         1,00     2,00            2,00
Ranks
              City      N     Mean Rank   Sum of Ranks
Observation   Aalborg   86    93,31       8025,00
              Århus     124   113,95      14130,00
              Total     210
Test Statistics (a)
                         Observation
Mann-Whitney U           4284,000
Wilcoxon W               8025,000
Z                        -2,426
Asymp. Sig. (2-tailed)   ,015
a. Grouping Variable: City
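The Test Statistics table can be cross-checked from the Ranks table using the large-sample normal approximation for U. The sketch below leaves out the tie correction (the raw data are not available here), so its Z is slightly smaller in magnitude than the -2,426 reported by SPSS.

```python
import math
from statistics import NormalDist

n1, n2 = 86, 124            # Aalborg, Århus
rank_sum_1 = 8025           # Sum of Ranks for Aalborg (Wilcoxon W)

u1 = rank_sum_1 - n1 * (n1 + 1) / 2            # Mann-Whitney U
u_mean = n1 * n2 / 2                           # expected U under H0
u_sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12) # SD of U, no tie correction

z = (u1 - u_mean) / u_sd
p_value = 2 * NormalDist().cdf(-abs(z))        # two-sided

print(f"U = {u1:.0f}, Z = {z:.3f}, p = {p_value:.3f}")
```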
Mann-Whitney Test
”Service package”
”Interesting package”
”Important package”
One-Sample (Symmetry or Location)
Kiama Blowhole Data
• Highly skewed distribution
• Average approx. 40 sec
• Rarely above 100 sec
Median equal to 40 sec?
Only above 100 sec in 1% of the eruptions?
Normally distributed?
Output
One-Sample Kolmogorov-Smirnov Test
                                             Time interval between Kiama Blowhole eruptions
N                                            64
Normal Parameters (a,b)    Mean              39,83
                           Std. Deviation    33,751
Most Extreme Differences   Absolute          ,173
                           Positive          ,173
                           Negative          -,165
Kolmogorov-Smirnov Z                         1,382
Asymp. Sig. (2-tailed)                       ,044
a. Test distribution is Normal.
b. Calculated from data.
One-Sample Kolmogorov-Smirnov Test 2
                                             Time interval between Kiama Blowhole eruptions
N                                            64
Uniform Parameters (a,b)   Minimum           7
                           Maximum           169
Most Extreme Differences   Absolute          ,464
                           Positive          ,464
                           Negative          -,016
Kolmogorov-Smirnov Z                         3,708
Asymp. Sig. (2-tailed)                       ,000
a. Test distribution is Uniform.
b. Calculated from data.
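The link between the largest absolute difference D, the Kolmogorov-Smirnov Z and the asymptotic significance in the two tables can be sketched as follows: Z = sqrt(N) × D, and the p-value comes from the limiting Kolmogorov distribution. This is the textbook series, written out by hand as an illustration; the function name is made up for this example.

```python
import math

def kolmogorov_sf(z, terms=100):
    """Asymptotic P(sqrt(N) * D > z) under H0 (limiting Kolmogorov distribution)."""
    if z <= 0:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * z * z)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))

n = 64
results = {}
for label, d in (("Normal", 0.173), ("Uniform", 0.464)):
    z = math.sqrt(n) * d                 # Kolmogorov-Smirnov Z
    results[label] = (z, kolmogorov_sf(z))
    print(f"{label}: Z = {z:.3f}, asymptotic p = {results[label][1]:.3f}")
```

The Z values differ from the output in the third decimal only because D is printed rounded to three decimals.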
Data are
Not Normal
Not Uniform
But QQ-plots are better!!
Location of median
Median equal to 40 sec?
Only above 100 sec in 1% of the eruptions?
Output
Descriptive Statistics
                                                                                                         Percentiles
                                                 N    Mean    Std. Deviation   Minimum   Maximum   25th    50th (Median)   75th
Time interval between Kiama Blowhole eruptions   64   39,83   33,751           7         169       14,25   28,00           60,00
NPar Tests
Binomial Test
                                                           Category   N    Observed Prop.   Test Prop.   Asymp. Sig. (2-tailed)
Time interval between Kiama Blowhole eruptions   Group 1   <= 40      41   ,64              ,50          ,033 (a)
                                                 Group 2   > 40       23   ,36
                                                 Total                64   1,00
a. Based on Z Approximation.
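Testing "median = 40 sec" reduces to a binomial test: if 40 were the median, each eruption interval would fall at or below 40 with probability 0.5. The sketch below computes both a Z approximation with continuity correction (an assumption about how the footnoted approximation works; it comes out close to the ,033 in the output) and an exact two-sided binomial p-value.

```python
import math
from statistics import NormalDist

n, at_or_below = 64, 41     # 41 of 64 intervals are <= 40 sec
p0 = 0.5                    # proportion expected if the median were 40 sec

# Z approximation with continuity correction.
z = (abs(at_or_below - n * p0) - 0.5) / math.sqrt(n * p0 * (1 - p0))
p_approx = 2 * (1 - NormalDist().cdf(z))

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact two-sided p-value; doubling the tail is valid because p0 = 0.5 is symmetric.
upper_tail = sum(binom_pmf(k, n, p0) for k in range(at_or_below, n + 1))
p_exact = min(1.0, 2 * upper_tail)

print(f"Z = {z:.3f}, approximate p = {p_approx:.3f}, exact p = {p_exact:.3f}")
```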
Median equal to 40 sec? NOPE!
Only above 100 sec in 1% of the eruptions?
Binomial Test
                                                           Category   N    Observed Prop.   Test Prop.   Asymp. Sig. (1-tailed)
Time interval between Kiama Blowhole eruptions   Group 1   <= 100     62   ,97              ,01          ,000 (a)
                                                 Group 2   > 100      2    ,03
                                                 Total                64   1,00
a. Based on Z Approximation.
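In the table above the test proportion ,01 is printed on the row for Group 1 (<= 100), whose observed proportion is ,97. As a separate cross-check of the question as posed, the exact probability of seeing at least 2 intervals above 100 sec among 64, if the true rate were 1%, can be computed directly. This is a hand-written exact one-sided binomial tail, not a reproduction of the SPSS procedure, and it need not match the ,000 shown.

```python
import math

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

n, above_100 = 64, 2        # 2 of 64 intervals exceed 100 sec
p0 = 0.01                   # hypothesised proportion above 100 sec

p_value = binom_upper_tail(above_100, n, p0)
print(f"P(at least {above_100} of {n} above 100 sec | rate {p0}) = {p_value:.3f}")
```

With so small an expected count (64 × 0.01 = 0.64), an exact calculation is generally preferable to a Z approximation.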
K samples test
Output
Kruskal-Wallis Test
Ranks
         City      N     Mean Rank
Number   Aalborg   86    85,87
         Århus     124   119,11
         Randers   115   268,00
         Total     325
Test Statistics (a,b)
              Number
Chi-Square    229,298
df            2
Asymp. Sig.   ,000
a. Kruskal Wallis Test
b. Grouping Variable: City
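The Kruskal-Wallis statistic can be approximated from the Ranks table alone with H = 12 / (N(N+1)) × Σ nᵢ (R̄ᵢ − (N+1)/2)². The sketch below omits the tie correction (the raw data are not available here), so it lands slightly below the 229,298 in the output; the p-value shortcut exp(−H/2) is valid only for 2 degrees of freedom.

```python
import math

# (group size, mean rank) per city, from the Ranks table.
groups = {
    "Aalborg": (86, 85.87),
    "Århus": (124, 119.11),
    "Randers": (115, 268.00),
}

n_total = sum(n for n, _ in groups.values())
overall_mean_rank = (n_total + 1) / 2

# Kruskal-Wallis H without tie correction.
h = 12 / (n_total * (n_total + 1)) * sum(
    n * (mean_rank - overall_mean_rank) ** 2 for n, mean_rank in groups.values()
)
df = len(groups) - 1

# Chi-square upper tail; this closed form holds for df = 2 only.
p_value = math.exp(-h / 2)

print(f"H = {h:.1f}, df = {df}, p = {p_value:.3g}")
```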
”Important package”
”Service package”
Overview (normal samples)
One sample
Two samples (paired)
K samples
Two samples (unpaired)
Overview (binomial samples)
One sample
Two samples (paired)
K samples
Two samples (unpaired)
Overview (non normal samples)
One sample
Two samples (paired)
K samples
Two samples (unpaired)