Comparing Dependent Correlations for Ordinal Data
A Thesis Submitted to the Faculty of Graduate Studies and Research
In Partial Fulfillment of the Requirements for the Degree of Master of Science
in Statistics
University of Regina
by Yun Gao
Regina, Saskatchewan
April 2015
Copyright 2015: Yun Gao
UNIVERSITY OF REGINA
FACULTY OF GRADUATE STUDIES AND RESEARCH
SUPERVISORY AND EXAMINING COMMITTEE
Yun Gao, candidate for the degree of Master of Science in Statistics, has presented a thesis titled, Comparing Dependent Correlations for Ordinal Data, in an oral examination held on April 1, 2015. The following committee members have found the thesis acceptable in form and content, and that the candidate demonstrated satisfactory knowledge of the subject material.
External Examiner: Dr. Jingtao Yao, Department of Computer Science
Supervisor: Dr. Dianliang Deng, Department of Mathematics & Statistics
Committee Member: Dr. Yang Y. Zhao, Department of Mathematics & Statistics
Committee Member: Dr. Andrei Volodin, Department of Mathematics & Statistics
Chair of Defense: Dr. Satish Sharma, Faculty of Engineering & Applied Science
ABSTRACT
Methods for analyzing ordinal categorical data have matured over recent decades of development in statistical inference. They include logistic regression models, odds ratios, and inferential methods based on chi-squared tests of independence and conditional independence. Building on this work, this thesis presents an analysis of the equality of dependent correlations for a longitudinal ordinal variable. Eight test statistics for comparing dependent correlations are presented: Olkin’s Z, Dunn and Clark’s Z, Steiger’s Z, Meng’s Z, Hittner’s Z, Hotelling’s t, Williams’s t and Williams’s modified t per Hendrickson. Simulation studies indicate that the choice of the relatively optimal test statistic, in terms of empirical level and statistical power, depends not only on the sample size but also on the magnitude of the correlations and the effect size. In addition, this thesis suggests modifications for some of the test statistics when they perform unsatisfactorily with ordinal variables. The thesis also briefly discusses applying the relatively efficient test statistics to test the equality of correlation coefficients in real medical data, with good results.
ACKNOWLEDGEMENTS
Grateful acknowledgment is made to my supervisor, Dr. Dianliang Deng, who gave me considerable help through his suggestions, comments and criticism. His encouragement and unwavering support have sustained me through frustration and depression. Without his pushing me ahead, the completion of this thesis would have been impossible. My heartfelt thanks also go to Professor Yang Zhao, Professor Andrei Volodin and the many other professors who have helped me, both for their help in the making of this thesis and for their enlightening lectures, from which I have benefited a great deal.
This research was partially supported by the Faculty of Graduate Studies and Research and by teaching assistantships and research assistantships provided by the Department of Mathematics and Statistics.
DEDICATION
This thesis is dedicated to my parents, for their loving consideration and great confidence in me all through these years. It is also dedicated to my friend, Chenxu Sun, and my fellow classmates, who gave me their help and time in listening to me.
Contents
Abstract
Acknowledgements
Dedication
1 Introduction and Motivation
2 Literature Review
2.1 Ordinal Data
2.1.1 Measurement of Ordinal Data
2.1.2 The Classification of Ordinal Data
2.1.3 The Difference Between Nominal and Ordinal Data
2.2 Models for Ordinal Responses
2.2.1 Cumulative Logits Models
2.2.2 Adjacent-Categories Logits Models
2.2.3 Continuation Ratio Models
2.2.4 Polytomous Logistic Models
2.2.5 Stereotype Models
2.2.6 Other Models
3 Statistics of Measuring Dependent Correlations in Contingency Table
3.1 Correlation with Ordinal Variables in Contingency Tables
3.1.1 Ordinal Probabilities and Scores
3.1.2 Contingency Tables
3.1.3 Correlations with Ordinal Scores in Contingency Table
3.2 Test Statistics
3.2.1 Test Statistics with Student Distribution
3.2.2 Test Statistics with Standard Normal Distribution
3.2.3 Exact Inference of Test Statistics
3.3 Modified Test Statistics
4 Simulation Study
4.1 Data Generation
4.2 Hypotheses and the Criteria of Test Statistics
4.2.1 Hypotheses
4.2.2 The Criteria of Test Statistics
4.3 Results of Simulation
4.3.1 Empirical Level
4.3.2 Statistical Power
4.4 Discussion
5 Applications
5.1 Data Sources and Features
5.1.1 Data Sources
5.1.2 Features
5.1.3 Modification of the real data in contingency tables
5.2 Test Results and Conclusion of the Medical Data
6 Conclusion and Future Work
6.1 Future Work
6.1.1 Modification of Test Statistics by using Bootstrap Method
6.1.2 Testing the Equality of a Set of Correlated Correlations by using Chi-square Statistics
6.2 Conclusion
Bibliography
List of Tables
3.1 An example of contingency table
3.2 Ordinal Variable Y through Times in Stratum k
4.1 Empirical levels of eight statistics for nominal level α = 0.01 with Σ1
4.2 Empirical levels of eight statistics for nominal level α = 0.05 with Σ1
4.3 Empirical levels of eight statistics for nominal level α = 0.10 with Σ1
4.4 Empirical levels of Zm and modified Zm with Σ1
4.5 Empirical levels of eight statistics for nominal level α = 0.01 with Σ2
4.6 Empirical power of all eight statistics for nominal level α = 0.01
5.1 Deep Pain Sensation (X) at baseline time t0 and visit time t1 without modification of data
5.2 Deep Pain Sensation (X) at baseline time t1 and visit time t2 without modification of data
5.3 Deep Pain Sensation (X) at visit time t2 and visit time t3 without modification of data
5.4 Deep Pain Sensation (X) at baseline time t0 and visit time t1
5.5 Deep Pain Sensation (X) at visit time t1 and visit time t2
5.6 Deep Pain Sensation (X) at baseline time t2 and visit time t3
5.7 Pain Intensity (Y) at baseline time t0 and visit time t1
5.8 Pain Intensity (Y) at baseline time t1 and visit time t2
5.9 Pain Intensity (Y) at baseline time t2 and visit time t3
5.10 Lack of Energy (A) at baseline time t0 and visit time t1
5.11 Lack of Energy (A) at baseline time t1 and visit time t2
5.12 Lack of Energy (A) at baseline time t2 and visit time t3
5.13 Nausea (B) at baseline time t0 and visit time t1
5.14 Nausea (B) at baseline time t1 and visit time t2
5.15 Nausea (B) at baseline time t2 and visit time t3
5.16 Joint Pain/Muscle Cramps (C) at baseline time t0 and visit time t1
5.17 Joint Pain/Muscle Cramps (C) at baseline time t1 and visit time t2
5.18 Joint Pain/Muscle Cramps (C) at baseline time t2 and visit time t3
5.19 The correlations coefficients for ordinal variables
5.20 Result of deep pain sensation
5.21 Result of pain intensity
5.22 Result of lack of energy
5.23 Result of nausea
5.24 Result of joint pain/muscle cramps
List of Figures
4.1 Empirical levels at α = 0.01
4.2 Empirical levels at α = 0.05
4.3 Empirical levels at α = 0.10
4.4 Empirical levels of Meng’s Z at α = 0.01
4.5 Empirical level of modified Meng’s Z at α = 0.01
Chapter 1
Introduction and Motivation
A real data set from a project at a cancer research center in the United States is used to illustrate the applicability and practicality of the proposed approaches. The data set contains demographic information on the patients, such as age, gender, race, history of several types of disease and medication records. It also contains data recorded at the cancer center, such as pain scale, changes in physiology and psychology, chemotherapy, concomitant medication records and neuromeres treatment. The variables are of ordinal, nominal, string, interval, continuous and ratio type, and were recorded at multiple times. We are interested in the ordinal variables, such as deep pain sensation, pain intensity and nausea, which have at most 11 categories. Each contingency table is constructed from one ordinal variable observed at two adjacent visit times. The objective of this thesis is to test the homogeneity of a set of correlations across the contingency tables for one ordinal variable. In medical terms, the research tests whether the strength of the treatment effect on patients, as reflected in the specified ordinal variables, is consistent over time, rather than judging whether individual patients received an effective or ineffective treatment. Results of this kind should have greater utility for applied researchers.
In order to compare the correlations of a common ordinal variable between adjacent visit times, we first arrange the data into contingency tables. For instance, in the first contingency table the row variable is an ordinal variable at visit time $t_1$ and the column variable is the same variable at visit time $t_2$. In the second contingency table the row variable is the same variable at visit time $t_2$ and the column variable is that variable at visit time $t_3$. Proceeding in this way, we obtain several contingency tables and compare the correlations computed from them. Note that the number of observations in a sample must be fixed, so we omit subjects who drop out of the project over time. This thesis presents several test statistics that were developed for comparing dependent correlations with a sample from a continuous multivariate normal distribution. They are Olkin’s Z, Dunn and Clark’s Z, Steiger’s Z, Meng’s Z, Hittner’s Z, Hotelling’s t, Williams’s t and Williams’s modified t per Hendrickson. To use them with ordinal variables, ordered scores can be assigned to the categories. The statistical tests therefore involve correlation coefficients computed from the category scores.
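The construction of the contingency tables can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names `contingency_table` and `correlation_from_table`, the toy data, the use of NaN to mark drop-out, and the equally spaced integer scores are hypothetical choices and are not taken from the thesis data or software.

```python
import numpy as np

def contingency_table(row_values, col_values, n_categories):
    """Cross-tabulate one ordinal variable at two adjacent visits.

    Categories are assumed to be coded 0, 1, ..., n_categories - 1.
    """
    table = np.zeros((n_categories, n_categories), dtype=int)
    for r, c in zip(row_values, col_values):
        table[r, c] += 1
    return table

def correlation_from_table(table, row_scores=None, col_scores=None):
    """Pearson correlation of the category scores, weighted by cell counts."""
    table = np.asarray(table, dtype=float)
    n_rows, n_cols = table.shape
    # Assumption: equally spaced scores 0, 1, 2, ... unless others are supplied.
    u = np.arange(n_rows, dtype=float) if row_scores is None else np.asarray(row_scores, dtype=float)
    v = np.arange(n_cols, dtype=float) if col_scores is None else np.asarray(col_scores, dtype=float)
    p = table / table.sum()                 # joint cell proportions
    p_row, p_col = p.sum(axis=1), p.sum(axis=0)
    mean_u, mean_v = p_row @ u, p_col @ v
    cov = u @ p @ v - mean_u * mean_v
    var_u = p_row @ u**2 - mean_u**2
    var_v = p_col @ v**2 - mean_v**2
    return cov / np.sqrt(var_u * var_v)

# Hypothetical toy data: one row per patient, columns are visits t1, t2, t3.
# NaN marks a patient who dropped out; such patients are omitted.
visits = np.array([
    [0, 1, 1],
    [2, 2, 3],
    [1, 1, 2],
    [3, 3, 3],
    [0, 0, 1],
    [2, np.nan, np.nan],
], dtype=float)
complete = visits[~np.isnan(visits).any(axis=1)].astype(int)

table_12 = contingency_table(complete[:, 0], complete[:, 1], n_categories=4)
table_23 = contingency_table(complete[:, 1], complete[:, 2], n_categories=4)
r12 = correlation_from_table(table_12)
r23 = correlation_from_table(table_23)
print(table_12, r12, r23)
```

Because both tables share the visit $t_2$ and are built from the same patients, `r12` and `r23` are dependent, which is why tests for dependent correlations are needed to compare them.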
In this thesis, we identify through simulation which test statistics are appropriate for ordinal contingency tables. One group, consisting of Dunn and Clark’s Z, Hotelling’s t and Williams’s modified t per Hendrickson, performs relatively well for certain combinations of sample size and correlation coefficient. In other cases, which are discussed in the simulation study, another group, consisting of Steiger’s Z, Meng’s Z, Hittner’s Z and Williams’s t, is more appropriate. In addition, even though these test statistics work in many cases, they can be modified to perform better in more situations.
The structure of the thesis is as follows. In Chapter 2, we present the measurement and classification of ordinal data and compare ordinal and nominal variables. The classical models for the analysis of ordinal responses, such as logistic regression models and ordinal association structure models, are also reviewed in that chapter. In Chapter 3, we discuss several test statistics for comparing two or more correlations using a sample of ordinal variables, and propose a simple modification of some of them. In Chapter 4, we generate simulated samples from a multivariate normal distribution and categorize the simulated variables into ordinal data. Using the simulation results, we evaluate the test statistics and select those that are relatively appropriate for ordinal variables. The application to real medical data is presented in Chapter 5. Chapter 6 gives the conclusion and discusses possible future work.
Chapter 2
Literature Review
2.1 Ordinal Data
2.1.1 Measurement of Ordinal Data
The traditional theory of measurement is mainly used in quantitative research. For example, length, capacity and weight are variables with units of measurement (such as meter, liter and gram). An ordinal variable has no standard unit, so this notion of measurement does not apply to it. Stevens (1951) redefined the theory of measurement so that data for ordinal variables can be obtained in a meaningful way.
According to the Dictionary of Statistics and Methodology (Vogt, 1993), the ordinal scale of measurement is “... a scale of measurement that ranks subjects (puts them in an order) on some variable, the difference between ranks does not need to be equal as they are in an interval scale”. In general, an ordinal variable $A$ is assumed to satisfy (a) $A \in \{a_1, \ldots, a_I\}$, where $a_i \in \mathbb{R}$, $i = 1, 2, \ldots, I$, and $I$ is the number of mutually exclusive and exhaustive categories; and (b) the categories satisfy $a_1 < a_2 < \cdots < a_I$. This measurement scale is widely used in social, market and financial surveys and in many other social research activities, for example to record income range, educational level and attitudes.
This definition raises two questions. The first is whether an ordering exists at all. It is hard to prove the existence of an ordering theoretically, but we need not dwell on this point, because its existence is commonly accepted in practice. What we do need to consider is the standard that defines the ordering when an ordered variable is measured. Take a survey of informants’ daily smoking as an example: 1 = seldom; 2 = normally; 3 = usually, where each category stands for a range of daily smoking amounts. If no such standard is provided, each informant may apply his or her own standard for the amount of smoking, so the survey results become ambiguous and the data are not comparable. The second question arises when the classification has no objective standard as a reference, so that measurement must rely on subjective judgment. In practice, ordered measurement usually involves some subjective elements, whose impact on the accuracy of the classification cannot be quantified. This is, to some extent, also the reason why ordered variables are analyzed less often in statistical analysis.
2.1.2 The Classification of Ordinal Data
An ordered variable may satisfy certain assumptions. For example, if a latent variable exists and the ordered variable is a classification of it, some parametric methods can be applied to this variable. On this basis, an ordered variable may arise in three ways: from a measurable latent variable, from an unmeasurable latent variable, or with no latent variable at all. A measurable latent variable may have known or unknown thresholds. When no latent variable exists, there are again two cases: the classification either has an objective standard or has none. In summary, ordered variables may be divided into five types (J. Kampen and M. Swyngedouw, 2000):
1. The classification is based on a measurable continuous latent variable with known thresholds. For example, personal monthly income may be classified as: less than 1200, 1200 to 3000, 3000 to 6000, more than 6000.
2. The classification is based on a measurable continuous latent variable with unknown thresholds. For example, personal monthly income may be classified as: low, middle and high.
3. The classification is based on an unmeasurable continuous latent variable with unknown thresholds. For example, personal intelligence may be classified as: low, middle and high.
4. No latent variable exists, and the ordered variable is classified according to a certain rule with a standard as reference. For example, the drug effect in clinical trials may be classified as: invalid, better, effective, significant.
5. No latent variable exists, and the ordered variable is classified according to a certain rule without any standard as reference. For example, personal opinions on a certain statement may be classified as: disagree, neutral, agree.
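The first type, a measurable latent variable with known thresholds, can be illustrated with a short Python sketch. The income values are made up for illustration, and assigning a value that falls exactly on a cut point to the higher category is an assumption, since the category labels in the list leave the boundaries ambiguous.

```python
import numpy as np

# Known thresholds from the monthly income example (type 1).
thresholds = np.array([1200, 3000, 6000])
labels = ["less than 1200", "1200 to 3000", "3000 to 6000", "more than 6000"]

# Hypothetical observed values of the latent continuous variable.
income = np.array([500, 1200, 2999, 4500, 7000])

# np.digitize returns k such that thresholds[k-1] <= value < thresholds[k],
# so the ordinal codes are 0, 1, 2, 3 in increasing order of income.
codes = np.digitize(income, thresholds)
categories = [labels[k] for k in codes]
print(codes, categories)
```

For types 2 and 3 the same mechanism is assumed to operate, but the thresholds are unknown and cannot be written down as in this sketch.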
Of the five types, only the first has a classification standard that is fully objective, that is, completely independent of informants and investigators. How objective the standard of the fourth type is depends on the corrections that investigators apply in the study. The standards of the second and fifth types are difficult to control: in practice, investigators can only hope that informants apply the intended classification standard of the ordered variable. The third type has no appropriate standard and is determined entirely by investigators.
2.1.3 The Difference Between Nominal and Ordinal Data
Qualitative data have two kinds of measurement scale (Stevens, 1951). Qualitative data are called nominal if there is no natural ordering among the categories, such as sex (male, female), travel mode (train, car, plane, ship, bicycle and walking) or religious belief (Buddhism, Christianity, Islam and others). For nominal data, the order in which the categories are listed is irrelevant, and the ordering of categories is not considered in statistical analysis.
However, many categorical data have categories whose order is fixed in advance. Such data are called ordered (ordinal) data; examples are product quality (good, normal, bad), level of education (primary school and below, secondary school, high school or technical secondary school, college or above) and physical condition (very good, good, normal, poor and very poor). The order of the categories is definite, but the distance between categories is not specified. For example, a person in good physical condition has a better constitution than a person in normal physical condition, but no explicit, precise quantity describes how much better. The statistical analysis of ordered data should therefore take advantage of the ordering of the categories. The type of data depends on the measurement scale, that is, on the method of classification. For example, education recorded as public versus private yields nominal data, whereas education recorded as primary school and below, secondary school, high school or technical secondary school, college or above yields ordered data. The scale of measurement also determines which statistical methods are applicable. The ordinal scale is a higher level of measurement than the nominal scale, and a method designed for a higher level of measurement generally cannot be applied to data measured at a lower level (Agresti, 2002). A statistical method for ordered data therefore cannot be applied to nominal data, because the categories of nominal data have no order and do not satisfy the conditions of the method.
2.2 Models for Ordinal Responses
In practical work, we often encounter multivariate response data of the following kind. The response variable $Y$ is graded by level and takes the values $1, 2, \ldots, c$, representing $c$ ordered categories, while the explanatory variables $X$ may be discrete, continuous, or a mixture of both. The “order” discussed in this chapter refers to the response variable $Y$. Data of this kind are currently analyzed mainly with linear models and multiple discriminant models. These models place strict demands on the variables and always rely on assumptions about normality and the covariance matrix. In medicine, many data sets cannot satisfy these assumptions, because the explanatory variables generally consist of qualitative data and quantitative data with unknown distributions, and their variances may differ considerably. For instance, when studying the same risk factor for a certain disease, the variance in the diseased group may be markedly larger than in the normal group, so it is improper to handle the problem with traditional methods. In addition, assigning scores to $Y$ in a linear model is somewhat arbitrary, and multiple discriminant models are hard to interpret.
To address these issues, several models for analyzing ordered categorical data have been proposed in recent years. These methods are free of the limitations of the ordinary linear model and make full use of the ordering information. We first give a brief introduction to several models for ordinal responses, and then describe their applications and remaining issues.
2.2.1 Cumulative Logits Models
We first present the cumulative logits model with explanatory variables, which is one of the most widely used models for ordinal responses. For subject $i$, let $y_i$ be the outcome category of the response variable and $x_i$ a column vector of the values of the explanatory variables. The cumulative logits model was originally proposed by Walker and Duncan [29] and was called the proportional odds model by McCullagh [16]. The model has the form
$$\operatorname{logit}[P(Y_i \le j \mid x_i)] = \alpha_j + \beta' x_i = \alpha_j + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots, \tag{2.1}$$
where $j = 1, 2, \ldots, c-1$ and $\beta$, a column vector of parameters, describes the effects of the explanatory variables.
In model (2.1), the logit for cumulative probability $j$ has its own intercept $\alpha_j$. For each fixed value of $x$, $P(Y \le j \mid x)$ increases in $j$ and the logit is an increasing function of this probability, so the $\{\alpha_j\}$ are increasing in $j$.
Model (2.1) can be re-expressed in the equivalent forms
$$P(Y \le j \mid x) = \frac{\exp(\alpha_j + \beta' x)}{1 + \exp(\alpha_j + \beta' x)}, \quad j = 1, \ldots, c-1, \tag{2.2}$$
or
$$\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)} = \exp(\alpha_j + \beta' x). \tag{2.3}$$
If we consider two populations characterized by explanatory variables $x_1$ and $x_2$, the general model with multiple explanatory variables satisfies
$$\operatorname{logit}[P(Y \le j \mid x_1)] - \operatorname{logit}[P(Y \le j \mid x_2)] = \log \frac{P(Y \le j \mid x_1)/P(Y > j \mid x_1)}{P(Y \le j \mid x_2)/P(Y > j \mid x_2)} = \beta'(x_1 - x_2).$$
The log cumulative odds ratio is clearly independent of the category index $j$. That is, the ratio of the cumulative odds for the two populations is the same for every cut point $j$; it is affected only by the explanatory variables $x_1$ and $x_2$ and the parameter vector $\beta$.
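The proportional odds property can be checked numerically. The following Python sketch is illustrative only: the intercepts, effects and covariate values are hypothetical numbers chosen for the example, and the function names are not taken from any particular library.

```python
import numpy as np

def cumulative_probs(alpha, beta, x):
    """P(Y <= j | x) for j = 1, ..., c-1 under the cumulative logits model (2.2)."""
    eta = np.asarray(alpha, dtype=float) + np.dot(beta, x)
    return 1.0 / (1.0 + np.exp(-eta))   # same as exp(eta) / (1 + exp(eta))

def logit(p):
    """Log odds of a probability."""
    return np.log(p / (1.0 - p))

# Hypothetical parameters for c = 4 categories and two explanatory variables.
alpha = np.array([-1.0, 0.2, 1.5])      # increasing intercepts alpha_1 < alpha_2 < alpha_3
beta = np.array([0.8, -0.5])
x1 = np.array([1.0, 2.0])               # first population
x2 = np.array([0.0, 1.0])               # second population

# Log cumulative odds ratio at each cut point j = 1, 2, 3.
log_odds_ratio = logit(cumulative_probs(alpha, beta, x1)) - logit(cumulative_probs(alpha, beta, x2))
expected = np.dot(beta, x1 - x2)        # beta'(x1 - x2), free of j
print(log_odds_ratio, expected)
```

All three entries of `log_odds_ratio` coincide with `expected`, which is the sense in which the odds are proportional across cut points.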
In 1980, McCullagh discussed this model under the name proportional odds model. He observed that the choice of categories in a practical problem involves a certain degree of subjectivity, and hoped that the results of the analysis would not change with the particular choice of categories. The cumulative logits model addresses this problem to some extent.
However, before applying this model, one must first check the proportional odds assumption; if it does not hold, another model is needed. Peterson and Harrell (1990) proposed the unrestricted partial proportional odds model to relax this assumption. Bender and Grouven (1998) also discussed replacing the cumulative logits regression model with binary logits regression models in this situation. However, Torra and Domingo-Ferrer (2006) pointed out that handling an ordered variable with several binary logits regression models makes the analysis less efficient, because it does not use the important ordering information among the categories.
2.2.2 Adjacent-Categories Logits Models
In the next two sections, we introduce alternative logits models based on adjacent-categories logits and continuation-ratio logits. The interpretations of these two models refer to individual categories rather than cumulative probabilities.
The adjacent-categories logits with multinomial probabilities $P(Y = j) = \pi_j$ are
$$\operatorname{logit}[P(Y = j \mid Y = j \text{ or } Y = j+1)] = \log \frac{\pi_j}{\pi_{j+1}}, \quad j = 1, \ldots, c-1. \tag{2.4}$$
Thus, the general adjacent-categories logit model with a set of explanatory variables $x$ has the form
$$\log \frac{\pi_j(x)}{\pi_{j+1}(x)} = \log \frac{P(Y = j \mid x)}{P(Y = j+1 \mid x)} = \alpha_j + \beta_j' x, \quad j = 1, \ldots, c-1. \tag{2.5}$$
In this model, the parameter $\beta_1$ contains the regression coefficients for the log odds of $\pi_1$ relative to $\pi_2$, the parameter $\beta_2$ contains the regression coefficients for the log odds of $\pi_2$ relative to $\pi_3$, and so on. The effects in this model are therefore expressed as local odds ratios, because the model uses pairs of adjacent categories $j$ and $j+1$ rather than cumulative odds. The effects depend on the distance between categories, so the model recognizes the ordering of the response scale.
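As a small numerical illustration of the adjacent-categories logits (2.4), the following Python sketch computes them from a probability vector and then recovers the probabilities from the logits. The probabilities are hypothetical, and the function names are introduced here only for illustration.

```python
import numpy as np

def adjacent_category_logits(pi):
    """log(pi_j / pi_{j+1}) for j = 1, ..., c-1, as in (2.4)."""
    pi = np.asarray(pi, dtype=float)
    return np.log(pi[:-1] / pi[1:])

def probs_from_adjacent_logits(logits):
    """Invert (2.4): log(pi_j / pi_c) is the sum of the logits from j to c-1."""
    logits = np.asarray(logits, dtype=float)
    log_ratio_to_last = np.append(np.cumsum(logits[::-1])[::-1], 0.0)
    weights = np.exp(log_ratio_to_last)
    return weights / weights.sum()       # normalise so the probabilities sum to 1

pi = np.array([0.1, 0.2, 0.3, 0.4])      # hypothetical probabilities, c = 4
logits = adjacent_category_logits(pi)
recovered = probs_from_adjacent_logits(logits)
print(logits, recovered)
```

The round trip shows that the $c - 1$ adjacent-categories logits determine the full set of category probabilities, so a model for these logits is a model for the whole response distribution.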
2.2.3 Continuation Ratio Models
After the adjacent-categories logits models, we next introduce another logits model, which uses continuation-ratio logits. There are two types of continuation-ratio logits. One set forms the log odds for each category relative to the higher categories,
$$\log \frac{\pi_j}{\pi_{j+1} + \cdots + \pi_c}, \quad j = 1, \ldots, c-1, \tag{2.6}$$
which is useful when a sequential mechanism determines the response outcome, in the sense that an observation must potentially occur in category $j$ before it can occur in a higher category (Tutz, 1991).
An alternative set of continuation-ratio logits forms the log odds for each category relative to the lower categories,
$$\log \frac{\pi_{j+1}}{\pi_1 + \cdots + \pi_j}, \quad j = 1, \ldots, c-1, \tag{2.7}$$
which is useful if the sequential mechanism works in the reverse direction.
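The two forms of continuation-ratio logits are not equivalent. For example, with $c = 4$ categories, the first set of sequential continuation-ratio logits is
$$\log \frac{\pi_1}{\pi_2 + \pi_3 + \pi_4}, \quad \log \frac{\pi_2}{\pi_3 + \pi_4}, \quad \log \frac{\pi_3}{\pi_4}, \tag{2.8}$$
whereas the second set is
$$\log \frac{\pi_2}{\pi_1}, \quad \log \frac{\pi_3}{\pi_1 + \pi_2}, \quad \log \frac{\pi_4}{\pi_1 + \pi_2 + \pi_3}. \tag{2.9}$$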
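The difference between the two sets can be seen numerically. The following Python sketch evaluates (2.6) and (2.7) for $c = 4$; the probability vector is hypothetical and the function names are introduced only for this illustration.

```python
import numpy as np

def continuation_ratio_logits_up(pi):
    """log(pi_j / (pi_{j+1} + ... + pi_c)) for j = 1, ..., c-1, as in (2.6)."""
    pi = np.asarray(pi, dtype=float)
    tail = np.cumsum(pi[::-1])[::-1]     # tail[j] = pi_j + ... + pi_c (0-based j)
    return np.log(pi[:-1] / tail[1:])

def continuation_ratio_logits_down(pi):
    """log(pi_{j+1} / (pi_1 + ... + pi_j)) for j = 1, ..., c-1, as in (2.7)."""
    pi = np.asarray(pi, dtype=float)
    head = np.cumsum(pi)                 # head[j] = pi_1 + ... + pi_{j+1} (0-based j)
    return np.log(pi[1:] / head[:-1])

pi = np.array([0.1, 0.2, 0.3, 0.4])      # hypothetical probabilities, c = 4
up = continuation_ratio_logits_up(pi)     # the three logits in (2.8)
down = continuation_ratio_logits_down(pi) # the three logits in (2.9)
print(up)
print(down)
```

For this probability vector the two sets of logits differ in every component, in line with the statement that the two forms are not equivalent.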
In this section, we consider only the first type of continuation-ratio logit. For responses on duration or development scales, a subject passes through the categories in order before the response outcome is determined. Continuation-ratio logit models using these sequential logits have the form
$$\operatorname{logit}[\omega_j(x)] = \log \frac{P(Y = j \mid x)}{P(Y > j \mid x)} = \alpha_j + \beta_j' x, \quad j = 1, \ldots, c-1, \tag{2.10}$$
where $\omega_j(x) = P(Y = j \mid Y \ge j, x)$.
This model was proposed by Cox in 1972 in his study of survival life tables, and Thompson discussed it again in 1977. The left side of the model is the log of the hazard function $h(t) = f(t)/[1 - F(t)]$ used in survival time analysis, where $f(t) = P(Y = j \mid x)$ plays the role of the probability density function and $F(t) = P(Y \le j \mid x)$ that of the cumulative distribution function. Survival times are often measured in discrete categories, with the response grouped into a set of categories, for example the number of years that patients survive after receiving a particular medical treatment (< 1 year, 1 to 5 years, 5 to 10 years, > 10 years). Because survival time is strictly ordered, the continuation-ratio logits model is sometimes applied to grouped survival data. Fienberg and McCullagh have also introduced and used this model.
2.2.4 Polytomous Logistic Models
Anderson (1972) proposed the polytomous logistic model, which is based on the logistic model for binary responses of Cox (1970). The model is
$$\log \frac{\pi_j(x)}{\pi_c(x)} = \alpha_j + \beta_j' x, \quad j = 1, \ldots, c-1. \tag{2.11}$$
In terms of the response probabilities, the polytomous logistic model has the following representation:
$$\pi_j(x) = P(Y = j \mid x) = \frac{\exp(\alpha_j + \beta_j' x)}{\sum_{k=1}^{c} \exp(\alpha_k + \beta_k' x)}, \quad j = 1, 2, \ldots, c, \tag{2.12}$$
where $\alpha_c = 0$ and $\beta_c = 0$. The parameter vectors $\beta = (\beta_1, \beta_2, \ldots, \beta_{c-1})'$ need to be estimated, together with the $(c-1)$ intercept parameters $\alpha_j$. The basic assumption of this model is that the log odds for any pair of response categories is a linear function of the explanatory variables $x$.
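The response probabilities (2.12) can be evaluated with a few lines of Python. This is a minimal sketch under stated assumptions: the parameter values and the covariate vector are hypothetical, and `polytomous_probs` is a name introduced here for illustration only. The check at the end uses the log odds of each category relative to the last category $c$, which follow from (2.12) with $\alpha_c = 0$ and $\beta_c = 0$.

```python
import numpy as np

def polytomous_probs(alpha, beta, x):
    """Response probabilities (2.12) with alpha_c = 0 and beta_c = 0.

    alpha has shape (c-1,), beta has shape (c-1, p) and x has shape (p,).
    """
    linear = np.asarray(alpha, dtype=float) + np.asarray(beta, dtype=float) @ np.asarray(x, dtype=float)
    eta = np.append(linear, 0.0)          # linear predictor of category c is zero
    eta = eta - eta.max()                 # stabilise the exponentials; ratios are unchanged
    weights = np.exp(eta)
    return weights / weights.sum()

# Hypothetical parameters for c = 4 categories and p = 2 explanatory variables.
alpha = np.array([0.5, -0.2, 0.1])
beta = np.array([[1.0, -0.5],
                 [0.3, 0.2],
                 [-0.4, 0.6]])
x = np.array([0.7, 1.5])

probs = polytomous_probs(alpha, beta, x)
log_odds_vs_last = np.log(probs[:-1] / probs[-1])
print(probs, log_odds_vs_last)
```

The vector `log_odds_vs_last` reproduces `alpha + beta @ x`, so each log odds relative to category $c$ is linear in $x$, as the model assumes.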
Strictly speaking, this model does not belong to the class of ordered response models, but it can still be used with ordinal responses and is the basis of the stereotype model. We handle ordinal data by examining the values $\beta_{i1}, \beta_{i2}, \ldots, \beta_{i(c-1)}$ for the $i$th variable, and in this way analyze the relationship between the response $Y$ and the explanatory variable $x_i$.
2.2.5 Stereotype Models
The stereotype model is based on the polytomous logistic model described above. When $Y$ is a categorical ordinal response, let the parameter vector $\beta_j'$ in the polytomous logistic model be $\phi_j \beta'$; the model then becomes
$$\pi_j(x) = P(Y = j \mid x) = \frac{\exp(\alpha_j + \phi_j \beta' x)}{\sum_{k=1}^{c} \exp(\alpha_k + \phi_k \beta' x)}, \quad j = 1, 2, \ldots, c. \tag{2.13}$$
In addition to the parameter vector $\beta$ and the intercepts $\alpha_j$, $(c-2)$ parameters $\phi_j$ need to be estimated: under the constraints $\phi_1 = 1$ and $\phi_c = 0$, the remaining $\phi_2, \phi_3, \ldots, \phi_{c-1}$ are estimated. Anderson imposed an additional order constraint on the $\phi_j$, namely $\phi_1 \ge \phi_2 \ge \cdots \ge \phi_c$. In practical applications, however, this constraint can be dropped and the ordering read off from the estimated $\phi_j$.
2.2.6 Other Models
Probit and log-log are other ordered regression models. Probit model is similar to
accumulative odds model. When Log-log model is asymmetric long-tailed distribution
in response variable Y, in fitting it may show some advantage, but these models are
less simple and easy to explain than the above-mentioned models, so the application
is relatively few.
Because ordinal data violate the assumptions of normality and homogeneity of variance
that underlie the traditional analysis of variance, Piepho and Kalka (2003) proposed to
analyze ordered variables with a threshold model containing random
and fixed effects. At the same time, they pointed out that although the threshold
model is very flexible, it increases the number of parameters to be
estimated, increases the danger of overfitting, and is unsuitable for
small samples.
Within the general linear model, Gautam and Kimeldorf (1996) proposed to treat the F test
statistic as a function of the scores assigned to an ordered variable, and to obtain the best
assignment of scores to the ordered variable by optimizing this test statistic.
In addition, Kottas and Muller (2003) and Agresti and Hitchcock (2005) studied
the application of parametric and nonparametric Bayesian models to the analysis of ordered
variables; Jaime S. Cardoso (2006) proposed applying neural network
methods to classify and analyze ordered data.
Chapter 3
Statistics of Measuring Dependent Correlations in Contingency Table
One of our objectives is to test the equality of correlations among ordinal variables,
where the test statistics are based on two-way contingency tables. The ordinal variables
of interest are measured repeatedly over time on the same individuals, so we test the
equality of the correlations of one ordinal variable between adjacent times. In this chapter,
we first arrange an ordinal variable observed over time into several contingency tables, and
then introduce several test statistics for testing the homogeneity of the correlations of
one ordinal variable between adjacent times. In the last section of this chapter, we
modify the test statistics introduced in the preceding section.
3.1 Correlation with Ordinal Variables in Contingency Tables
3.1.1 Ordinal Probabilities and Scores
1. Probabilities for ordered categories
We have already used the basic idea of probabilities for an ordinal response Y in the ordinal
models of the last chapter. Suppose that Y is an ordinal response, and let c denote the
number of categories of Y, n the sample size, n1, n2, ..., nc the frequencies in the
categories with n = Σj nj, and {pj = nj/n} the sample
proportions. Let πj = P(Y = j) denote the probability of a response in category j;
then the cumulative probability for the ordinal response Y is
$$F_j = P(Y \le j) = \pi_1 + \pi_2 + \cdots + \pi_j, \quad j = 1, 2, \ldots, c,$$
and the cumulative probabilities reflect the ordering of the categories, that is,
$$0 < F_1 < F_2 < \cdots < F_c = 1.$$
2. Scores for an ordinal variable
In this part we summarize how to utilize the ordinal nature of the categorical scale.
One simple way is to use the cumulative probabilities to identify the median response,
that is, to find the smallest category j such that Fj ≥ 0.50. With this method, however,
the median can move from one category to the next when the probabilities change only
slightly.
The second way to utilize the ordinal nature of the categorical scale is to assign
ordered scores. For an ordinal variable Y having c categories, we assign scores
satisfying v1 ≤ v2 ≤ ... ≤ vc, which is the same order as the categories. With
this method, the observations can be summarized with the ordinary measures
for quantitative data, and the ordinal scores are treated as an interval scale. The choice
of scores is not unique. For example, when an ordinal variable Y has c = 5 categories,
comparing the correlations for two groups using the scores (1,2,3,4,5)
yields the same conclusion as using the scores (2,4,6,8,10), (0,5,10,15,20) or any other set
of linearly transformed scores, but we can reach a different conclusion by using
scores such as (1,2,5,7,9) or (0,3,8,10,11). Therefore, the key point in assigning scores
is the choice of the relative distances between pairs of adjacent categories.
Last but not least, a third way of selecting scores is to let the data themselves
determine the scores. Bross (1958) introduced the average cumulative proportion
scores and called them ridits. In terms of the sample proportions {pj}, the average
cumulative proportion in category j is
$$a_j = \sum_{k=1}^{j-1} p_k + \frac{1}{2} p_j, \quad j = 1, 2, \ldots, c,$$
and in terms of the sample cumulative proportions Fj = p1 + p2 + ... + pj, the ridit
scores are
$$a_j = \frac{F_{j-1} + F_j}{2},$$
with F0 = 0. The ridits have the same ordering as the categories, that is, a1 ≤ a2 ≤
... ≤ ac.
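The median category and the ridit scores defined above can be computed directly from the category counts. The following is a minimal sketch; the function names and the counts are hypothetical and serve only to illustrate the formulas.

```python
def cumulative(props):
    """Cumulative proportions F_1, ..., F_c."""
    out, running = [], 0.0
    for p in props:
        running += p
        out.append(running)
    return out

def median_category(props):
    """Smallest category j (1-based) with F_j >= 0.50."""
    for j, F in enumerate(cumulative(props), start=1):
        if F >= 0.5:
            return j
    return len(props)

def ridits(props):
    """Ridit scores a_j = (F_{j-1} + F_j) / 2 with F_0 = 0."""
    F = [0.0] + cumulative(props)
    return [(F[j - 1] + F[j]) / 2 for j in range(1, len(F))]

# Hypothetical counts for c = 5 ordered categories.
counts = [10, 20, 40, 20, 10]
n = sum(counts)
props = [count / n for count in counts]
print(median_category(props))
print([round(a, 3) for a in ridits(props)])
```

For these counts the cumulative proportions are 0.1, 0.3, 0.7, 0.9 and 1, so the median category is 3 and the ridits are 0.05, 0.2, 0.5, 0.8 and 0.95.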
Table 3.1: An example of contingency table
                       Handedness
Gender     Right-handed   Left-handed   Total
Males      n11 = 43       n12 = 9       n1+ = 52
Females    n21 = 44       n22 = 4       n2+ = 48
Total      n+1 = 87       n+2 = 13      n = 100
3.1.2 Contingency Tables
1. A 2× 2 contingency table
A two-way table, also called a contingency table, is a useful tool for examining
relationships between categorical variables. The entries in the cells of a two-way table
can be frequency counts or relative frequencies. Table 3.1 cross-classifies n = 100
respondents to a General Social Survey by their gender and by their handedness.
The table illustrates the cell count notation for these data. For example, n11 = 43,
and the corresponding sample joint proportion is p11 = 43/100 = 0.43. Note that the
categories of the two factors in this contingency table are not ordered. In addition,
each observation is recorded only once in this study.
2. Sorting an ordinal variable over time into several strata of two-way contingency tables
Given the nature of the ordinal variables in the real medical data used in this thesis,
we consider one factor that is categorized by order and recorded
at several visit times. We examine the relationships between adjacent times for
the ordinal variable. Each pair of adjacent visit times forms a stratum, and it is assumed
that the responses are dependent from stratum to stratum. For an ordinal variable Y with
c categories, the variable Y is collected from a sample of n observations at
times t1, t2, ..., ts. Let Y (tk) denote the row variable in stratum k, and Y (tk+1) the
column variable in stratum k, k = 1, 2, ..., s − 1. Let nij(k) denote the number of
observations classified in the i-th category of the row variable and the
j-th category of the column variable in stratum k, i, j = 1, 2, ..., c, and k = 1, 2, ..., s − 1.
Table 3.2 shows the ordinal variable Y over time in stratum k of the two-way contingency
tables.
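The arrangement of a longitudinal ordinal variable into strata can be sketched as follows. This is only an illustration: the function name `adjacent_tables` and the four records are hypothetical, and each record is assumed to hold the categories (coded 1, ..., c) of one subject at times t1, ..., ts.

```python
def adjacent_tables(records, c):
    """Build the s - 1 two-way tables of counts n_ij(k).

    tables[k - 1][i - 1][j - 1] is the number of subjects in category i
    at time t_k and category j at time t_(k+1).
    """
    s = len(records[0])
    tables = [[[0] * c for _ in range(c)] for _ in range(s - 1)]
    for y in records:
        for k in range(s - 1):
            tables[k][y[k] - 1][y[k + 1] - 1] += 1
    return tables

# Hypothetical data: 4 subjects, s = 3 visit times, c = 3 categories.
records = [(1, 2, 2), (3, 3, 2), (2, 2, 1), (1, 1, 1)]
tables = adjacent_tables(records, c=3)
for k, table in enumerate(tables, start=1):
    print("stratum", k, table)
```

Every subject contributes once to each of the s − 1 tables, which is why the strata are dependent.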
3.1.3 Correlations with Ordinal Scores in Contingency Table
After sorting the ordinal variable into contingency tables, we define the correlation
coefficient for ordinal variables. With regard to Table 3.2, let u1 ≤ u2 ≤ ... ≤ ui ≤ ... ≤ uc,
i = 1, 2, ..., c, denote the scores for the rows, and let v1 ≤ v2 ≤ ... ≤ vj ≤ ... ≤ vc,
j = 1, 2, ..., c, denote the scores for the columns. Since we are interested in one ordinal
variable over time, the scores for the rows equal the scores for the columns, that is,
ui = vj when i = j. Let pi. = ni./n be the sample proportion in row i, p.j = n.j/n
the sample proportion in column j, and pij = nij/n the sample proportion in cell (i, j).
Table 3.2: Ordinal Variable Y through Times in Stratum k
                                   Y (tk+1)
Y (tk)         1        2        ···   j        ···   c        Row Totals
1              n11(k)   n12(k)   ···   n1j(k)   ···   n1c(k)   n1.(k)
2              n21(k)   n22(k)   ···   n2j(k)   ···   n2c(k)   n2.(k)
...            ...      ...            ...            ...      ...
i              ni1(k)   ni2(k)   ···   nij(k)   ···   nic(k)   ni.(k)
...            ...      ...            ...            ...      ...
c              nc1(k)   nc2(k)   ···   ncj(k)   ···   ncc(k)   nc.(k)
Column Total   n.1(k)   n.2(k)   ···   n.j(k)   ···   n.c(k)   n
For simplicity of notation, we omit the stratum index k in the
notation above, writing, for example, {pi.(k)} = {pi.}. The sample correlation is then given
by
$$r = \frac{\sum_{i,j=1}^{c} (u_i - \bar{u})(v_j - \bar{v})\, p_{ij}}{\sqrt{\left[\sum_{i=1}^{c} (u_i - \bar{u})^2 p_{i\cdot}\right]\left[\sum_{j=1}^{c} (v_j - \bar{v})^2 p_{\cdot j}\right]}}$$
where ū = Σi ui pi. is the sample mean of the row scores, v̄ = Σj vj p.j is the sample
mean of the column scores, and Σi,j (ui − ū)(vj − v̄) pij weights the cross-products of
the deviation scores by their relative frequencies.
The correlation falls between −1 and +1. The larger the correlation r for stratum
k is in absolute value, the stronger the relationship between the ordinal variable
Y (tk+1) and the variable Y (tk), k = 1, 2, ..., s − 1. To test the
equality of correlations between adjacent times, we have (s − 1) strata in total. That
is, let rk,k+1 denote the sample correlation between the ordinal variables Y (tk) and Y (tk+1),
k = 1, 2, ..., s − 1. There are thus (s − 1) correlations to compare, and the null
hypothesis is that the corresponding population correlations are equal, ρ12 = ρ23 = ... = ρs−1,s.
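The score-based sample correlation of Section 3.1.3 can be computed from a c × c table of counts as in the sketch below. The function name `score_correlation` and the 3 × 3 table are hypothetical; the same scores are used for rows and columns, as in the text. The example also illustrates the earlier remark on scores: a linear transformation of the scores leaves r unchanged, while unequally spaced scores generally give a different value.

```python
import math

def score_correlation(table, scores):
    """Sample correlation r of a c x c table using row scores = column scores."""
    c = len(scores)
    n = sum(sum(row) for row in table)
    p = [[table[i][j] / n for j in range(c)] for i in range(c)]
    p_row = [sum(p[i]) for i in range(c)]
    p_col = [sum(p[i][j] for i in range(c)) for j in range(c)]
    u_bar = sum(scores[i] * p_row[i] for i in range(c))
    v_bar = sum(scores[j] * p_col[j] for j in range(c))
    cov = sum((scores[i] - u_bar) * (scores[j] - v_bar) * p[i][j]
              for i in range(c) for j in range(c))
    var_u = sum((scores[i] - u_bar) ** 2 * p_row[i] for i in range(c))
    var_v = sum((scores[j] - v_bar) ** 2 * p_col[j] for j in range(c))
    return cov / math.sqrt(var_u * var_v)

# Hypothetical counts n_ij for one stratum with c = 3 categories.
table = [[8, 3, 1],
         [2, 9, 4],
         [1, 3, 9]]
r_equal = score_correlation(table, [1, 2, 3])
r_linear = score_correlation(table, [2, 4, 6])   # linear transform of (1, 2, 3)
r_uneven = score_correlation(table, [1, 2, 5])   # unequal spacing
print(round(r_equal, 4), round(r_linear, 4), round(r_uneven, 4))
```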
3.2 Test Statistics
In this section, we introduce eight statistics for testing the equality of two dependent
correlations in a common sample. Suppose X = (X1, X2, X3) has a trivariate distribution
with symmetric covariance matrix Σ, where σij = ρijσiσj is the ij-th element
of Σ, with ρii = 1 (i = 1, 2, 3) and σi = σj = 1. Our hypothesis test is
H0 : ρ12 = ρ23;
Ha : ρ12 ≠ ρ23.
Let r12 be the sample correlation between variables X1 and X2, r23
the sample correlation between variables X2 and X3, r13 the sample
correlation between variables X1 and X3, and n the
sample size in the following test statistics.
3.2.1 Test Statistics with Student Distribution
We first present three test statistics which are compared with the 1 − α/2 point of the
Student t distribution with n − 3 degrees of freedom. The three t test statistics
should be appropriate for small to moderate sample sizes.
1. Hotelling’s t Test Statistic
In 1940, Hotelling suggested a test statistic based on the difference r12 − r23
divided by an estimate of the asymptotic standard deviation of r12 − r23. The test
statistic t is given by
$$t = (r_{12} - r_{23}) \sqrt{\frac{(n-3)(1+r_{13})}{2|R|}}$$
with df = n − 3,
where |R| is the determinant of the sample correlation matrix,
$$|R| = 1 + 2 r_{12} r_{23} r_{13} - r_{12}^2 - r_{23}^2 - r_{13}^2.$$
Because the nuisance parameter ρ13 appears in the covariance matrix Σ,
Hotelling’s estimate of the asymptotic standard deviation takes this nuisance
parameter into account.
2. Williams’ t Test Statistic
Williams’ test statistic, proposed in 1959, is a modification of Hotelling’s t
statistic. It also depends on a standardized version of r12 − r23 and differs
from Hotelling’s t only in the denominator. The test statistic is given by
$$t = (r_{12} - r_{23}) \sqrt{\frac{(n-1)(1+r_{13})}{2\left(\frac{n-1}{n-3}\right)|R| + \bar{r}^2 (1-r_{13})^3}} = \frac{(r_{12} - r_{23})\sqrt{(n-3)(1+r_{13})}}{\sqrt{2|R| + \frac{(r_{12}-r_{23})^2 (1-r_{13})^3 (n-3)}{4(n-1)}}}$$
with df = n − 3, where
$$\bar{r} = \frac{r_{12} + r_{23}}{2}$$
and
$$|R| = 1 + 2 r_{12} r_{23} r_{13} - r_{12}^2 - r_{23}^2 - r_{13}^2.$$
From the second expression for Williams’ t, we can see that Williams’ t only adds the term
(r12 − r23)²(1 − r13)³(n − 3)/(4(n − 1)) to the denominator under the square root. That is, the
difference r12 − r23, the nuisance correlation r13, and the sample size n all affect the
conclusion reached with this test statistic.
3. Hendrickson’s t Test Statistic
Williams’ modified t per Hendrickson is also a modification of Hotelling’s t by
Williams, and is written as
$$t = \frac{(r_{12} - r_{23})\sqrt{(n-3)(1+r_{13})}}{\sqrt{2|R| + \frac{(r_{12}-r_{23})^2 (1-r_{13})^3}{4(n-1)}}}$$
with df = n − 3,
where
$$|R| = 1 + 2 r_{12} r_{23} r_{13} - r_{12}^2 - r_{23}^2 - r_{13}^2.$$
This test statistic drops the factor (n − 3) from the modification term in Williams’ t test
statistic. As a result, this second modification of Hotelling’s t accounts more accurately for
the effects of the sample size, the difference r12 − r23 and the nuisance correlation r13 on
the statistic. From the equation, we can see that the effect of this modification is small if
one or more of the following three conditions hold: n is large; r12 − r23 is small; r13 is
close to one.
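The three t statistics of this subsection can be evaluated as in the following minimal sketch. The function names and the input values are hypothetical; Williams’ t is coded from its first displayed form (with the average correlation), and Hendrickson’s version from its displayed formula. Each result would be compared with the 1 − α/2 point of the Student t distribution with n − 3 degrees of freedom, which is not computed here.

```python
import math

def det_R(r12, r23, r13):
    """Determinant |R| of the 3 x 3 sample correlation matrix."""
    return 1 + 2 * r12 * r23 * r13 - r12**2 - r23**2 - r13**2

def hotelling_t(r12, r23, r13, n):
    return (r12 - r23) * math.sqrt(
        (n - 3) * (1 + r13) / (2 * det_R(r12, r23, r13)))

def williams_t(r12, r23, r13, n):
    # First displayed form, with the average correlation r_bar.
    r_bar = (r12 + r23) / 2
    denom = (2 * ((n - 1) / (n - 3)) * det_R(r12, r23, r13)
             + r_bar**2 * (1 - r13) ** 3)
    return (r12 - r23) * math.sqrt((n - 1) * (1 + r13) / denom)

def hendrickson_t(r12, r23, r13, n):
    # Extra term added to 2|R| under the square root.
    extra = (r12 - r23) ** 2 * (1 - r13) ** 3 / (4 * (n - 1))
    return (r12 - r23) * math.sqrt(
        (n - 3) * (1 + r13) / (2 * det_R(r12, r23, r13) + extra))

# Hypothetical sample correlations and sample size.
r12, r23, r13, n = 0.5, 0.3, 0.2, 60
for name, stat in (("Hotelling", hotelling_t),
                   ("Williams", williams_t),
                   ("Hendrickson", hendrickson_t)):
    print(name, round(stat(r12, r23, r13, n), 4))
```

Because both modifications add a nonnegative term to the denominator, their absolute values never exceed that of Hotelling’s t for the same inputs.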
3.2.2 Test Statistics with Standard Normal Distribution
The next five test statistics are compared with the 1 − α/2
point of the standard normal distribution.
1. Olkin’s Z Test Statistic
The first statistic with a standard normal reference distribution for testing the equality of
correlations was proposed by Pearson and Filon (1898). Olkin (1967) later rewrote the
test statistic as
$$z = \frac{(r_{12} - r_{23})\sqrt{n}}{\sqrt{(1 - r_{12}^2)^2 + (1 - r_{23}^2)^2 - 2r_{13}^3 - (2r_{13} - r_{12}r_{23})(1 - r_{13}^2 - r_{12}^2 - r_{23}^2)}} = \frac{(r_{12} - r_{23})\sqrt{n}}{\sqrt{(1 - r_{12}^2)^2 + (1 - r_{23}^2)^2 - 2k}}$$
where
$$k = r_{13}(1 - r_{12}^2 - r_{23}^2) - \frac{1}{2}(r_{12}r_{23})(1 - r_{12}^2 - r_{23}^2 - r_{13}^2),$$
which has an asymptotic standard normal distribution for large sample sizes. The
first expression is Olkin’s Z test statistic, and the second is Pearson and Filon’s
Z; that is, Olkin’s Z = Pearson and Filon’s Z. Note that Pearson and Filon’s
Z can also test the difference between two correlations with no variable in common
within the same sample, that is, H0 : ρ12 = ρ34.
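The equality of the two expressions can be checked numerically. The sketch below codes both forms; the function names and the input values are hypothetical.

```python
import math

def olkin_z(r12, r23, r13, n):
    """First expression (Olkin's form)."""
    var_term = ((1 - r12**2) ** 2 + (1 - r23**2) ** 2 - 2 * r13**3
                - (2 * r13 - r12 * r23) * (1 - r13**2 - r12**2 - r23**2))
    return (r12 - r23) * math.sqrt(n) / math.sqrt(var_term)

def pearson_filon_z(r12, r23, r13, n):
    """Second expression (Pearson and Filon's form, written with k)."""
    k = (r13 * (1 - r12**2 - r23**2)
         - 0.5 * (r12 * r23) * (1 - r12**2 - r23**2 - r13**2))
    var_term = (1 - r12**2) ** 2 + (1 - r23**2) ** 2 - 2 * k
    return (r12 - r23) * math.sqrt(n) / math.sqrt(var_term)

# Hypothetical sample correlations and sample size.
z_olkin = olkin_z(0.5, 0.3, 0.2, 60)
z_pf = pearson_filon_z(0.5, 0.3, 0.2, 60)
print(round(z_olkin, 6), round(z_pf, 6))
```

Expanding the product in the first expression gives 2r13³ + (2r13 − r12r23)(1 − r13² − r12² − r23²) = 2k, so the two functions agree up to rounding error.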
Using Fisher’s r-to-Z transformation
The following test statistics make use of Fisher’s r-to-Z transform. When
the sample is not large and the population correlations take
extreme values, the empirical levels of statistics based on the untransformed correlations
can move far away from the nominal
levels. The Fisher r-to-Z transform
$$Z = \frac{1}{2}\left(\ln(1 + r) - \ln(1 - r)\right)$$
helps to eliminate this problem because it transforms a sample correlation into a variable
that is close to normally distributed, even with small to moderate sample sizes and
extreme sample correlations.
2. Dunn and Clark’s Z Test Statistic
The first test statistic using Fisher’s r-to-Z transform is Dunn and Clark’s Z, which is
calculated as
$$z = \frac{(Z_{12} - Z_{23})\sqrt{n - 3}}{\sqrt{2 - 2c}}$$
where
$$c = \frac{r_{13}(1 - r_{12}^2 - r_{23}^2) - \frac{1}{2} r_{12} r_{23}(1 - r_{13}^2 - r_{12}^2 - r_{23}^2)}{(1 - r_{12}^2)(1 - r_{23}^2)}.$$
To obtain the asymptotic variance of Z12 − Z23, Dunn and Clark used the expression
for the asymptotic correlation; here c denotes the asymptotic correlation of Z12
and Z23.
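Fisher’s transform and Dunn and Clark’s Z can be sketched as follows. The function names and inputs are hypothetical; the two-sided p-value uses the standard normal identity 2(1 − Φ(|z|)) = erfc(|z|/√2).

```python
import math

def fisher_z(r):
    """Fisher's r-to-Z transform."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))

def dunn_clark_z(r12, r23, r13, n):
    # c: asymptotic correlation of Z12 and Z23.
    num = (r13 * (1 - r12**2 - r23**2)
           - 0.5 * r12 * r23 * (1 - r13**2 - r12**2 - r23**2))
    c = num / ((1 - r12**2) * (1 - r23**2))
    return ((fisher_z(r12) - fisher_z(r23)) * math.sqrt(n - 3)
            / math.sqrt(2 - 2 * c))

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical sample correlations and sample size.
z = dunn_clark_z(0.5, 0.3, 0.2, 60)
print(round(z, 4), round(two_sided_p(z), 4))
```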
3. Steiger’s Z Test Statistic
This test was proposed by Steiger (1980) and is a modification of Dunn and Clark’s
Z. The test statistic is defined as
$$z = \frac{(Z_{12} - Z_{23})\sqrt{n - 3}}{\sqrt{2 - 2c}}$$
where
$$\bar{r} = \frac{r_{12} + r_{23}}{2}$$
and
$$c = \frac{r_{13}(1 - 2\bar{r}^2) - \frac{1}{2}\bar{r}^2(1 - 2\bar{r}^2 - r_{13}^2)}{(1 - \bar{r}^2)^2}.$$
In an effort to further improve the control of the empirical level, Steiger uses the arithmetic
average of the correlations r12 and r23 in place of the individual
correlations r12 and r23 in the equation. Steiger’s method can be modified further by
replacing the arithmetic average of r12 and r23 with the backtransformed
average of the Fisher Z values, which gives Hittner, May, and Silver’s Z test statistic.
4. Meng, Rosenthal, and Rubin’s Z Test Statistic
Meng’s Z is asymptotically equivalent to Dunn and Clark’s test statistic but has a
simpler, easy-to-use form. Meng’s Z is given by
$$z = (Z_{12} - Z_{23}) \sqrt{\frac{n - 3}{2(1 - r_{13})h}}$$
where
$$h = \frac{1 - f\,\overline{r^2}}{1 - \overline{r^2}}, \qquad f = \frac{1 - r_{13}}{2(1 - \overline{r^2})} \ \text{(which must be} \le 1\text{)}, \qquad \overline{r^2} = \frac{r_{12}^2 + r_{23}^2}{2}.$$
The bound on f follows from the constraint that the covariance matrix of the correlation
coefficients must be nonnegative definite, so f is set equal to 1 if (1 − r13)/(2(1 − r²)) > 1.
In addition, from the equation for Meng’s Z we can obtain a (1 −
α)100% confidence interval for the difference between the Fisher-transformed correlations:
$$L, U = (Z_{12} - Z_{23}) \pm z_{\alpha/2} \sqrt{\frac{2(1 - r_{13})h}{n - 3}}$$
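Meng’s Z and the associated interval can be sketched as below. The function names and inputs are hypothetical; the interval is for the difference Z12 − Z23 on the Fisher scale, and the normal quantile is taken from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist

def meng_parts(r12, r23, r13, n):
    """Return (difference of Fisher Z values, its standard error, h)."""
    r2_bar = (r12**2 + r23**2) / 2               # mean of squared correlations
    f = min((1 - r13) / (2 * (1 - r2_bar)), 1.0)  # f is capped at 1
    h = (1 - f * r2_bar) / (1 - r2_bar)
    diff = math.atanh(r12) - math.atanh(r23)      # Z12 - Z23
    se = math.sqrt(2 * (1 - r13) * h / (n - 3))
    return diff, se, h

def meng_z(r12, r23, r13, n):
    diff, se, _ = meng_parts(r12, r23, r13, n)
    return diff / se

def meng_interval(r12, r23, r13, n, alpha=0.05):
    diff, se, _ = meng_parts(r12, r23, r13, n)
    half = NormalDist().inv_cdf(1 - alpha / 2) * se
    return diff - half, diff + half

# Hypothetical sample correlations and sample size.
z = meng_z(0.5, 0.3, 0.2, 60)
lower, upper = meng_interval(0.5, 0.3, 0.2, 60)
print(round(z, 4), round(lower, 4), round(upper, 4))
```

If the interval contains zero, the two-sided test based on Meng’s Z does not reject H0 at the same level α.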
5. Hittner, May, and Silver’s Z Test Statistic
Silver and Dunlap (1987) first proposed backtransforming the averaged
Fisher r-to-Z values, and the method was applied to the comparison of overlapping
correlations by Hittner et al. (2003). Hittner’s Z is based on Steiger’s Z, and is
calculated as
$$z = \frac{(Z_{12} - Z_{23})\sqrt{n - 3}}{\sqrt{2 - 2c}}$$
where
$$c = \frac{r_{13}(1 - 2\bar{r}_z^2) - \frac{1}{2}\bar{r}_z^2(1 - 2\bar{r}_z^2 - r_{13}^2)}{(1 - \bar{r}_z^2)^2}, \qquad \bar{r}_z = \frac{\exp(2\bar{Z}) - 1}{\exp(2\bar{Z}) + 1}$$
and
$$\bar{Z} = \frac{Z_{12} + Z_{23}}{2}.$$
Silver and Dunlap showed that the backtransformed average Z is generally less biased
than the average r used in Steiger’s Z when the sample size is very small, and that the average
Z becomes increasingly less biased for large values of the average r.[13]
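Steiger’s Z and Hittner’s Z differ only in the pooled correlation that enters c, which the following sketch makes explicit. The function names and inputs are hypothetical; `math.tanh` is used for the backtransformation, since tanh(Z) = (exp(2Z) − 1)/(exp(2Z) + 1).

```python
import math

def pooled_c(r_pooled, r13):
    """Asymptotic correlation c computed from a pooled correlation."""
    num = (r13 * (1 - 2 * r_pooled**2)
           - 0.5 * r_pooled**2 * (1 - 2 * r_pooled**2 - r13**2))
    return num / (1 - r_pooled**2) ** 2

def z_with_pooled(r12, r23, r13, n, r_pooled):
    diff = math.atanh(r12) - math.atanh(r23)   # Z12 - Z23
    return diff * math.sqrt(n - 3) / math.sqrt(2 - 2 * pooled_c(r_pooled, r13))

def steiger_z(r12, r23, r13, n):
    # Pooled value: arithmetic mean of the two correlations.
    return z_with_pooled(r12, r23, r13, n, (r12 + r23) / 2)

def backtransformed_mean(r12, r23):
    # Average on the Fisher Z scale, then transform back to the r scale.
    return math.tanh((math.atanh(r12) + math.atanh(r23)) / 2)

def hittner_z(r12, r23, r13, n):
    return z_with_pooled(r12, r23, r13, n, backtransformed_mean(r12, r23))

# Hypothetical sample correlations and sample size.
print(round(steiger_z(0.5, 0.3, 0.2, 60), 4),
      round(hittner_z(0.5, 0.3, 0.2, 60), 4))
```

For moderate correlations the two pooled values are close, so the two statistics differ only slightly.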
3.2.3 Exact Inference of Test Statistics
All of the test statistics introduced above were proposed in past studies.
However, they were used to compare the strength of association between a variable
X1 and each of two potential predictor variables, X2 and X3; that is, the statistics were
used to test whether ρ12 = ρ13, with ρ23 as the nuisance parameter. In this thesis,
by using scores in the correlation coefficients and transforming the formulas of the test
statistics, the tests are applied to comparing ρ12 and ρ23 for three dependent ordinal
variables. Moreover, the test statistics in past studies were used to
compare correlations between dependent continuous variables, whereas we examine
whether they can be used to compare correlations between dependent ordinal variables
and which of them are more appropriate to apply. Based on these two points, we
simulate samples to test whether the transformed test statistics presented in
this chapter can be used with real medical data.
3.3 Modified Test Statistics
Suppose that the test statistics θ, when used to test the correlations of an
ordinal variable between adjacent times, do not accurately follow the standard normal
distribution. We then need to modify the test statistics, that is, to re-normalize
them as
$$\theta_{modified} = \frac{\theta - E(\theta)}{\sqrt{Var(\theta)}}$$
For the test statistics introduced in the last section, E(θ) and Var(θ)
should be functions of the sample size and the correlation coefficients; we denote
them by f1(n, r12, r23, r13) and f2(n, r12, r23, r13) respectively. For simplicity
of notation, we write them as f1 and f2. Then
$$\theta_{modified} = \frac{\theta - f_1}{\sqrt{f_2}} = \frac{1}{\sqrt{f_2}}\theta - \frac{f_1}{\sqrt{f_2}} = \theta + \theta f_3 = \theta + B$$
Therefore, our objective is to find the function f3(n, r12, r23, r13) from both
a theoretical and an empirical point of view. Because the test statistics listed above are
complicated functions of the sample size and three correlations, it is
difficult to find E(θ) and Var(θ). Therefore, we modify the test statistics
empirically, as shown in the simulation study in Chapter 4.
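One simple way to picture the re-normalization is to replace E(θ) and Var(θ) by the mean and variance of values of θ simulated under the null hypothesis. The sketch below illustrates only this idea and is not the empirical modification developed in Chapter 4; the function name `renormalize` and the listed values are hypothetical.

```python
import math
from statistics import fmean, pvariance

def renormalize(theta, null_draws):
    """Standardize theta with the empirical mean and variance of null draws."""
    mean = fmean(null_draws)
    var = pvariance(null_draws, mu=mean)
    return (theta - mean) / math.sqrt(var)

# Hypothetical values of a test statistic simulated under H0.
null_draws = [-1.3, 0.2, 0.9, 1.8, -0.4, 0.6, -0.9, 1.1]
adjusted = [renormalize(t, null_draws) for t in null_draws]
print(round(fmean(adjusted), 6), round(pvariance(adjusted), 6))
```

By construction the adjusted values have mean 0 and variance 1, which is the property the modified statistic is meant to have.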
Chapter 4
Simulation Study
4.1 Data Generation
A limited study is conducted to evaluate the eight test statistics for comparing dependent
correlations of ordinal longitudinal variables based on a common sample.
For the purpose of the simulation, we generate samples of ordinal random variables that
meet the following conditions.
• Samples are generated with U = (U1, U2, U3) following the continuous multivariate
normal distribution N(0, Σ), where
$$\Sigma = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{12} & 1 & \rho_{23} \\ \rho_{13} & \rho_{23} & 1 \end{pmatrix} \tag{4.1}$$
• The ordinal variables are obtained by categorizing U. Each variable consists of 5 categories;
• The sample size is 30, 60, 90, or 120;
• Each configuration is replicated 10,000 times;
• The appropriate two-tailed tests are performed at nominal levels of 0.01, 0.05
and 0.10.
The conditions are described in detail below.
First, we generated a sample of U = (U1, U2, U3) having the continuous multivariate
normal distribution N(0, Σ). Under the null hypothesis, we used 29
parameter configurations, which adequately cover the possibilities for ρ12 = ρ23 for which
the covariance matrix is positive definite. We summarize the 29 parameter
configurations by the following two patterns of Σ:
$$\Sigma_1 = \begin{pmatrix} 1 & \rho & \rho^2 \\ \rho & 1 & \rho \\ \rho^2 & \rho & 1 \end{pmatrix}, \qquad \Sigma_2 = \begin{pmatrix} 1 & \rho_1 & \rho_2 \\ \rho_1 & 1 & \rho_1 \\ \rho_2 & \rho_1 & 1 \end{pmatrix}$$
In Σ1, we choose the correlation coefficients ρ = 0, 0.1, 0.2, ..., 0.9. The purpose
of Σ1 is to measure the effect of ρ12 and ρ23 in Σ on the results. In addition, in
order to measure the role of ρ13 in Σ, we set the second pattern of Σ, i.e. Σ2, where
ρ1 = 0, 0.1, 0.3, 0.5, 0.7 and ρ2 = 0.1, 0.3, 0.5, 0.7. Note that ρ13 is a
nuisance parameter that must be handled. We take ρ12 and ρ23 to be nonnegative because
some early runs that included configurations where ρ12 or both parameters were
negative gave results consistent with the case where all parameters are
positive. Therefore, those configurations were not used in subsequent runs.
For the power analysis, we want to examine the extent to which
different degrees of discrepancy between ρ12 and ρ23 (hereinafter referred to as effect
sizes) affect the results. We therefore try 16 parameter configurations under the alternative
hypothesis. Both the effect size and the nuisance parameter ρ13 in Σ
take the values 0.1, 0.15, 0.3 and 0.5.
Secondly, we categorize the dependent variables Umn into 5 levels, where m =
1, 2, 3 and n = 1, 2, 3, ..., N, with N the sample size, to obtain the ordinal variable
Xmn. We assume that an observation is equally likely to fall in each ordinal category,
so each category has probability 1/5 under the standard normal
distribution, and the cutpoints are the 20th, 40th, 60th and 80th percentiles of that
distribution. So if Umn falls in the interval from −∞ to −0.84162, it is assigned to category 1,
and so on. Thus, we get
$$X_{mn} = \begin{cases} 1, & \text{if } U_{mn} \in (-\infty, -0.84162]; \\ 2, & \text{if } U_{mn} \in (-0.84162, -0.25335]; \\ 3, & \text{if } U_{mn} \in (-0.25335, 0.25335]; \\ 4, & \text{if } U_{mn} \in (0.25335, 0.84162]; \\ 5, & \text{if } U_{mn} \in (0.84162, +\infty); \end{cases}$$
where m = 1, 2, 3, and n = 1, 2, 3, ..., N, and N is the sample size.
In this way, we obtain a sample that contains N observations. Next, we sort the
sample into contingency tables. For instance, if X1n = r and X2n = s, we add one to the count of
the cell in row r and column s of the contingency table used for the correlation coefficient r12.
We perform all of the simulations using the R language.
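The data generation steps above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the R code used for the thesis: the function names are hypothetical, the trivariate normal sample is produced with a hand-written Cholesky factor, the pattern Σ1 is coded with ρ13 = ρ², and the seed and ρ = 0.5 are arbitrary.

```python
import math
import random

# Quintile cutpoints of the standard normal distribution (from the text).
CUTS = (-0.84162, -0.25335, 0.25335, 0.84162)

def cholesky(S):
    """Lower-triangular L with L L' = S, for a positive definite matrix S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L

def categorize(u):
    """Category 1..5: one plus the number of cutpoints strictly below u."""
    return 1 + sum(u > cut for cut in CUTS)

def simulate_ordinal(sigma, n, rng):
    """n observations (X1, X2, X3) obtained by categorizing U ~ N(0, sigma)."""
    L = cholesky(sigma)
    sample = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        u = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(3)]
        sample.append(tuple(categorize(v) for v in u))
    return sample

rho = 0.5  # arbitrary illustrative value
sigma1 = [[1.0, rho, rho**2], [rho, 1.0, rho], [rho**2, rho, 1.0]]
sample = simulate_ordinal(sigma1, n=30, rng=random.Random(2015))

# Contingency table for the pair (X1, X2), used for r12.
table12 = [[0] * 5 for _ in range(5)]
for x1, x2, _ in sample:
    table12[x1 - 1][x2 - 1] += 1
```

A second table built in the same way from (X2, X3) gives r23, so that each replication yields the correlations needed by the test statistics.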
4.2 Hypotheses and the Criteria of Test Statistics
4.2.1 Hypotheses
The hypotheses for simulation tests are
H0 : ρ12 = ρ23;
Ha : ρ12 ≠ ρ23.
4.2.2 The Criteria of Test Statistics
Given the 10,000 replications, the empirical levels of the test statistics should be near the
nominal levels. That is, at the 0.01, 0.05 and 0.10 levels used here,
a test ought to reject H0 approximately 100, 500 and 1000
times respectively at any of the null parameter values. The closer the empirical level is
to the nominal level, the more appropriate the test statistic. However, the
choice of test statistic depends not only on the sample size but also on the magnitude
of the correlations. In addition, under the alternative hypotheses, the optimal
test statistic is the one that yields the highest empirical power.
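The criterion can be made concrete with a short sketch. The function names and the four p-values are hypothetical; rejecting when the p-value is at most α is an assumption of the sketch. The Monte Carlo standard error √(α(1 − α)/R) for R replications, which follows from the binomial distribution of the rejection count, indicates how close to the nominal level an empirical level can be expected to fall.

```python
import math

def empirical_level(p_values, alpha):
    """Proportion of replications in which H0 is rejected at level alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)

def monte_carlo_se(alpha, replications):
    """Standard error of an empirical level when the true level is alpha."""
    return math.sqrt(alpha * (1 - alpha) / replications)

replications = 10_000
expected = {a: round(a * replications) for a in (0.01, 0.05, 0.10)}
print(expected)                                   # expected rejection counts
print(round(monte_carlo_se(0.05, replications), 5))

# Hypothetical p-values from four replications.
print(empirical_level([0.001, 0.2, 0.04, 0.5], alpha=0.05))
```

With 10,000 replications the standard error at α = 0.05 is about 0.0022, so empirical levels more than a few thousandths away from 0.05 are unlikely to be due to simulation noise alone.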
4.3 Results of Simulation
The simulation results, obtained under the notation and data generation rules described
above, are given below.
4.3.1 Empirical Level
1. Results with Σ1
We first evaluate the simulation results based on the data generated with Σ1. Tables
4.1, 4.2 and 4.3 show the empirical levels of all eight statistics for ρ = 0, 0.1, ..., 0.9
and n = 30, 60, 90, 120 based on 10,000 replications at nominal levels α = 0.01, 0.05 and
0.10 respectively. In the tables and figures below, Zo = Olkin’s Z, Zd = Dunn and Clark’s Z,
Zs = Steiger’s Z, Zm = Meng’s Z, Zh = Hittner’s Z, Th = Hotelling’s t, Tw = Williams’ t, and
Tm = Williams’ modified t per Hendrickson. The tables show considerable variability in
empirical levels across the eight statistics for the various sample sizes and magnitudes of
the correlation coefficient ρ. Nevertheless, some trends emerge.
First, across all sample sizes and nominal levels, Olkin’s Z has a substantially greater
empirical level than the other seven test statistics when ρ is
small to moderate (i.e., 0 to 0.4). For medium and large ρ
(i.e., 0.5 to 0.9), the discrepancy between Olkin’s Z and the other seven test statistics
becomes less pronounced, especially for the large sample sizes (i.e., n = 90 and 120).
In addition, across all sample sizes, for small and moderate ρ (i.e.,
0 to 0.4) the remaining seven statistics (Dunn and
Clark’s Z, Steiger’s Z, Meng’s Z, Hittner’s Z, Hotelling’s t, Williams’ t, and Williams’
modified t per Hendrickson) yield empirical levels that hover around the nominal
levels. However, there are some instances in which the empirical level is somewhat
liberal. For example, in Table 4.2, for the correlations
ρ = 0.3 and 0.4, the empirical levels of Dunn and Clark’s Z, Hotelling’s t and
Williams’ modified t per Hendrickson exceed 0.05 for all sample sizes and range
from 0.054 to 0.067. By comparison, the
remaining four of the seven statistics are comparatively conservative with respect to
control of the empirical level in this situation. Dunn and Clark’s Z, Hotelling’s t and
Williams’ modified t per Hendrickson have more appropriate empirical levels
when ρ ranges from 0 to 0.2, whereas Steiger’s Z, Meng’s Z,
Hittner’s Z and Williams’ t perform well at ρ = 0.3 and 0.4. On the other hand, the
seven test statistics control their empirical
levels less effectively for medium and large ρ (i.e., 0.5 to 0.9), where the empirical levels
exceed the nominal levels, and the
empirical levels of these seven test statistics trend upward as ρ increases
from 0 to 0.9 across all sample sizes.
We plotted the empirical levels to make it easier to distinguish and compare the eight
test statistics. Figures 4.1, 4.2 and 4.3 show the empirical levels of all eight test
statistics at the nominal levels α = 0.01, 0.05 and 0.10 respectively for n = 30, 60, 90,
and 120.
The plots make the trends discussed above easy to see. For
instance, the empirical level of Olkin’s Z tends to decrease or to remain reasonably stable
as ρ goes from 0 to 0.9 for the small and medium sample sizes (i.e., N = 30 and 60).
However, it increases for the large sample sizes (i.e., N = 90 and 120).
In addition, the empirical levels of the other seven test statistics trend upward overall as the
correlation ρ increases. Moreover, the empirical levels for the seven test statistics are
Table 4.1: Empirical levels of eight statistics for nominal level α = 0.01 with Σ1
N     ρ     Zo      Zd      Zs      Zm      Zh      Th      Tw      Tm
30    0     0.0218  0.0110  0.0089  0.0073  0.0073  0.0096  0.0092  0.0094
      0.1   0.0197  0.0098  0.0075  0.0059  0.0059  0.0092  0.0086  0.0091
      0.2   0.0228  0.0119  0.0100  0.0082  0.0082  0.0119  0.0106  0.0119
      0.3   0.0226  0.0131  0.0101  0.0090  0.0090  0.0141  0.0113  0.0140
      0.4   0.0202  0.0143  0.0125  0.0109  0.0109  0.0158  0.0127  0.0157
      0.5   0.0162  0.0152  0.0141  0.0125  0.0125  0.0165  0.0134  0.0165
      0.6   0.0136  0.0165  0.0155  0.0142  0.0142  0.0198  0.0144  0.0198
      0.7   0.0100  0.0193  0.0176  0.0166  0.0166  0.0245  0.0159  0.0245
      0.8   0.0054  0.0235  0.0225  0.0218  0.0219  0.0274  0.0205  0.0274
      0.9   0.0016  0.0288  0.0273  0.0264  0.0264  0.0306  0.0229  0.0306
60    0     0.0158  0.0108  0.0098  0.0091  0.0091  0.0103  0.0101  0.0103
      0.1   0.0153  0.0111  0.0101  0.0088  0.0088  0.0108  0.0105  0.0108
      0.2   0.0148  0.0105  0.0096  0.0088  0.0088  0.0106  0.0102  0.0105
      0.3   0.0167  0.0128  0.0122  0.0113  0.0113  0.0145  0.0126  0.0143
      0.4   0.0171  0.0141  0.0133  0.0125  0.0125  0.0160  0.0134  0.0160
      0.5   0.0150  0.0144  0.0133  0.0128  0.0128  0.0180  0.0130  0.0180
      0.6   0.0145  0.0161  0.0152  0.0143  0.0143  0.0204  0.0145  0.0204
      0.7   0.0140  0.0179  0.0173  0.0169  0.0169  0.0238  0.0166  0.0238
      0.8   0.0118  0.0214  0.0210  0.0207  0.0207  0.0270  0.0197  0.0270
      0.9   0.0112  0.0268  0.0260  0.0253  0.0253  0.0302  0.0238  0.0302
90    0     0.0161  0.0106  0.0093  0.0089  0.0089  0.0101  0.0100  0.0101
      0.1   0.0125  0.0101  0.0094  0.0088  0.0088  0.0098  0.0096  0.0098
      0.2   0.0147  0.0119  0.0113  0.0105  0.0105  0.0120  0.0116  0.0120
      0.3   0.0156  0.0126  0.0118  0.0116  0.0116  0.0139  0.0120  0.0139
      0.4   0.0149  0.0134  0.0125  0.0118  0.0118  0.0158  0.0125  0.0158
      0.5   0.0160  0.0156  0.0151  0.0143  0.0143  0.0198  0.0150  0.0198
      0.6   0.0162  0.0170  0.0164  0.0159  0.0159  0.0215  0.0162  0.0215
      0.7   0.0181  0.0217  0.0211  0.0205  0.0205  0.0286  0.0202  0.0286
      0.8   0.0151  0.0230  0.0226  0.0223  0.0223  0.0280  0.0220  0.0280
      0.9   0.0136  0.0228  0.0225  0.0222  0.0223  0.0256  0.0215  0.0256
120   0     0.0122  0.0100  0.0096  0.0091  0.0091  0.0099  0.0098  0.0099
      0.1   0.0124  0.0098  0.0095  0.0092  0.0092  0.0098  0.0097  0.0098
      0.2   0.0131  0.0108  0.0104  0.0101  0.0101  0.0110  0.0107  0.0110
      0.3   0.0145  0.0120  0.0119  0.0115  0.0115  0.0136  0.0119  0.0136
      0.4   0.0136  0.0120  0.0119  0.0114  0.0114  0.0145  0.0119  0.0145
      0.5   0.0144  0.0136  0.0131  0.0125  0.0125  0.0170  0.0129  0.0170
      0.6   0.0132  0.0142  0.0137  0.0134  0.0134  0.0182  0.0134  0.0182
      0.7   0.0188  0.0218  0.0215  0.0209  0.0209  0.0286  0.0209  0.0286
      0.8   0.0177  0.0226  0.0226  0.0223  0.0223  0.0281  0.0220  0.0281
      0.9   0.0158  0.0234  0.0231  0.0231  0.0232  0.0277  0.0225  0.0277
Table 4.2: Empirical levels of eight statistics for nominal level α = 0.05 with Σ1
N     ρ     Zo      Zd      Zs      Zm      Zh      Th      Tw      Tm
30    0     0.0751  0.0527  0.0498  0.0461  0.0461  0.0506  0.0500  0.0504
      0.1   0.0755  0.0520  0.0485  0.0443  0.0443  0.0509  0.0490  0.0509
      0.2   0.0796  0.0561  0.0519  0.0470  0.0470  0.0559  0.0522  0.0558
      0.3   0.0713  0.0535  0.0510  0.0476  0.0476  0.0562  0.0505  0.0560
      0.4   0.0740  0.0595  0.0569  0.0535  0.0535  0.0640  0.0560  0.0640
      0.5   0.0766  0.0677  0.0641  0.0616  0.0616  0.0747  0.0629  0.0746
      0.6   0.0701  0.0680  0.0657  0.0637  0.0637  0.0784  0.0641  0.0784
      0.7   0.0632  0.0704  0.0683  0.0673  0.0673  0.0828  0.0658  0.0828
      0.8   0.0615  0.0811  0.0795  0.0788  0.0789  0.0933  0.0755  0.0933
      0.9   0.0509  0.0913  0.0902  0.0898  0.0901  0.0981  0.0853  0.0981
60    0     0.0599  0.0484  0.0458  0.0437  0.0437  0.0471  0.0459  0.0470
      0.1   0.0619  0.0502  0.0481  0.0462  0.0462  0.0493  0.0484  0.0493
      0.2   0.0682  0.0582  0.0562  0.0539  0.0539  0.0589  0.0565  0.0588
      0.3   0.0632  0.0546  0.0532  0.0503  0.0503  0.0579  0.0529  0.0579
      0.4   0.0646  0.0568  0.0553  0.0538  0.0538  0.0631  0.0548  0.0631
      0.5   0.0684  0.0642  0.0625  0.0613  0.0613  0.0713  0.0621  0.0713
      0.6   0.0658  0.0657  0.0643  0.0634  0.0634  0.0772  0.0637  0.0772
      0.7   0.0655  0.0702  0.0692  0.0684  0.0684  0.0848  0.0681  0.0848
      0.8   0.0660  0.0741  0.0734  0.0731  0.0731  0.0873  0.0716  0.0872
      0.9   0.0695  0.0884  0.0879  0.0874  0.0874  0.0970  0.0844  0.0970
90    0     0.0591  0.0514  0.0502  0.0482  0.0482  0.0507  0.0503  0.0507
      0.1   0.0551  0.0491  0.0483  0.0469  0.0469  0.0490  0.0484  0.0489
      0.2   0.0555  0.0508  0.0498  0.0490  0.0490  0.0520  0.0500  0.0520
      0.3   0.0627  0.0567  0.0555  0.0537  0.0537  0.0605  0.0556  0.0605
      0.4   0.0649  0.0591  0.0577  0.0568  0.0568  0.0667  0.0575  0.0667
      0.5   0.0670  0.0637  0.0629  0.0622  0.0622  0.0723  0.0626  0.0723
      0.6   0.0706  0.0699  0.0694  0.0687  0.0687  0.0820  0.0686  0.0820
      0.7   0.0741  0.0775  0.0768  0.0757  0.0757  0.0913  0.0754  0.0913
      0.8   0.0723  0.0808  0.0801  0.0801  0.0802  0.0943  0.0785  0.0943
      0.9   0.0701  0.0829  0.0826  0.0825  0.0825  0.0942  0.0810  0.0942
120   0     0.0551  0.0495  0.0485  0.0478  0.0478  0.0489  0.0487  0.0489
      0.1   0.0548  0.0499  0.0489  0.0481  0.0481  0.0495  0.0490  0.0495
      0.2   0.0549  0.0498  0.0491  0.0481  0.0481  0.0508  0.0491  0.0508
      0.3   0.0599  0.0549  0.0545  0.0534  0.0534  0.0590  0.0544  0.0590
      0.4   0.0603  0.0558  0.0553  0.0544  0.0545  0.0632  0.0550  0.0632
      0.5   0.0589  0.0565  0.0560  0.0554  0.0555  0.0681  0.0557  0.0681
      0.6   0.0631  0.0628  0.0622  0.0619  0.0619  0.0746  0.0619  0.0746
      0.7   0.0745  0.0764  0.0759  0.0755  0.0755  0.0907  0.0754  0.0907
      0.8   0.0765  0.0811  0.0808  0.0807  0.0807  0.0941  0.0797  0.0941
      0.9   0.0753  0.0818  0.0815  0.0815  0.0818  0.0922  0.0802  0.0922
Table 4.3: Empirical levels of eight statistics for nominal level α = 0.10 with Σ1
N     ρ     Zo      Zd      Zs      Zm      Zh      Th      Tw      Tm
30    0     0.1277  0.1020  0.0985  0.0940  0.0940  0.0998  0.0980  0.0994
      0.1   0.1308  0.1047  0.1008  0.0966  0.0967  0.1031  0.1003  0.1028
      0.2   0.1370  0.1096  0.1053  0.1023  0.1023  0.1098  0.1052  0.1096
      0.3   0.1274  0.1032  0.0996  0.0954  0.0954  0.1063  0.0983  0.1062
      0.4   0.1322  0.1118  0.1087  0.1055  0.1055  0.1178  0.1072  0.1177
      0.5   0.1369  0.1220  0.1192  0.1164  0.1164  0.1334  0.1170  0.1334
      0.6   0.1338  0.1267  0.1246  0.1230  0.1230  0.1403  0.1226  0.1401
      0.7   0.1289  0.1300  0.1280  0.1269  0.1269  0.1466  0.1248  0.1466
      0.8   0.1284  0.1423  0.1410  0.1399  0.1399  0.1591  0.1357  0.1591
      0.9   0.1263  0.1561  0.1555  0.1552  0.1555  0.1649  0.1503  0.1649
60    0     0.1152  0.1007  0.0987  0.0972  0.0972  0.0995  0.0986  0.0995
      0.1   0.1177  0.1025  0.1010  0.0988  0.0988  0.1022  0.1001  0.1021
      0.2   0.1277  0.1126  0.1109  0.1092  0.1092  0.1137  0.1106  0.1137
      0.3   0.1182  0.1072  0.1056  0.1043  0.1043  0.1102  0.1054  0.1102
      0.4   0.1220  0.1122  0.1106  0.1084  0.1084  0.1216  0.1096  0.1216
      0.5   0.1218  0.1145  0.1134  0.1128  0.1128  0.1265  0.1127  0.1265
      0.6   0.1253  0.1212  0.1204  0.1189  0.1189  0.1374  0.1194  0.1374
      0.7   0.1263  0.1269  0.1258  0.1254  0.1254  0.1441  0.1242  0.1441
      0.8   0.1271  0.1329  0.1318  0.1312  0.1312  0.1511  0.1300  0.1511
      0.9   0.1349  0.1487  0.1483  0.1480  0.1480  0.1616  0.1445  0.1616
90    0     0.1095  0.1004  0.0990  0.0979  0.0979  0.0992  0.0987  0.0991
      0.1   0.1070  0.0989  0.0978  0.0964  0.0964  0.0984  0.0977  0.0984
      0.2   0.1062  0.0991  0.0985  0.0973  0.0973  0.1002  0.0983  0.1002
      0.3   0.1149  0.1073  0.1067  0.1058  0.1058  0.1121  0.1064  0.1121
      0.4   0.1213  0.1141  0.1133  0.1126  0.1126  0.1239  0.1131  0.1239
      0.5   0.1237  0.1176  0.1168  0.1157  0.1157  0.1311  0.1161  0.1311
      0.6   0.1287  0.1256  0.1246  0.1239  0.1239  0.1432  0.1240  0.1432
      0.7   0.1353  0.1362  0.1355  0.1351  0.1352  0.1519  0.1345  0.1519
      0.8   0.1369  0.1412  0.1406  0.1404  0.1405  0.1580  0.1395  0.1580
      0.9   0.1358  0.1437  0.1434  0.1432  0.1432  0.1574  0.1424  0.1574
120   0     0.1087  0.1007  0.0997  0.0991  0.0991  0.0998  0.0997  0.0998
      0.1   0.1073  0.1012  0.1006  0.0994  0.0994  0.1016  0.1005  0.1016
      0.2   0.1086  0.1018  0.1008  0.0992  0.0992  0.1046  0.1007  0.1046
      0.3   0.1121  0.1061  0.1045  0.1038  0.1038  0.1113  0.1043  0.1113
      0.4   0.1165  0.1119  0.1112  0.1106  0.1106  0.1206  0.1110  0.1206
      0.5   0.1189  0.1163  0.1158  0.1153  0.1153  0.1286  0.1155  0.1286
      0.6   0.1206  0.1177  0.1169  0.1161  0.1161  0.1353  0.1161  0.1353
      0.7   0.1383  0.1389  0.1383  0.1382  0.1382  0.1554  0.1377  0.1554
      0.8   0.1430  0.1468  0.1465  0.1462  0.1462  0.1663  0.1457  0.1663
      0.9   0.1397  0.1471  0.1470  0.1469  0.1471  0.1603  0.1452  0.1603
far from the nominal levels when the correlation ρ is greater than 0.5. Moreover,
Hotelling’s t and Williams’s modified t per Hendrickson control the empirical levels
relatively poorly in that situation. Therefore, the test statistics need some
modification as the correlation ρ increases.
[Figure: four panels, (a) N = 30, (b) N = 60, (c) N = 90, (d) N = 120; x-axis: correlation coefficient; y-axis: empirical size rates at α = 0.01; legend: Zo, Zd, Zs, Zm, Zh, th, tw, tm]
Figure 4.1: Empirical levels at α = 0.01
[Figure: four panels, (a) N = 30, (b) N = 60, (c) N = 90, (d) N = 120; x-axis: correlation coefficient; y-axis: empirical size rates at α = 0.05; legend: Zo, Zd, Zs, Zm, Zh, th, tw, tm]
Figure 4.2: Empirical levels at α = 0.05
[Figure: four panels, (a) N = 30, (b) N = 60, (c) N = 90, (d) N = 120; x-axis: correlation coefficient; y-axis: empirical size rates at α = 0.10; legend: Zo, Zd, Zs, Zm, Zh, th, tw, tm]
Figure 4.3: Empirical levels at α = 0.10
With regard to the simulation results for the first pattern, Σ1, and following the
idea of modification described in Section 3.3, we try to modify Meng’s Z test
statistic. Figures 4.1 to 4.3 show that the trend in the empirical size of
Meng’s Z is consistent across sample sizes. For instance, at nominal level
α = 0.01, Figure 4.4 shows the empirical levels of Zm and the fitted regression lines
at sample sizes N = 30, 60, 90 and 120. The slopes of these regression lines are
broadly similar across sample sizes. From this point of view, we modify Zm so that
it decreases as the correlation ρ increases; ideally, the regression line for the
modified Zm is parallel and close to the nominal level line.
This means that we can modify Meng’s Z using the idea in Section 3.3, where the
function f3 is a function of ρ in pattern Σ1 and does not depend on the sample size:
\[ Z_m^{\text{modified}} = Z_m + Z_m f_3 = Z_m + Z_m \, a(\rho - \rho_0), \]
where ρ0 is the average of the points at which the regression lines intersect
the line of the nominal level, and a is an empirical value satisfying a < 1
so that the regression line for the modified Zm is closer to the nominal level line.
Based on calculation and several simulation trials, we obtain a relatively
ideal modified version of Meng’s Z with a = 0.1625 and ρ0 = 0.2. Table 4.4 compares
the empirical levels of Zm and the modified Zm based on data generated with pattern Σ1.
After the adjustment, the empirical levels of the modified Zm hover around the
nominal levels and are clearly better than the empirical levels of Zm, especially
when the correlation coefficient ρ is moderate to large (i.e., 0.5 to 0.9).
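The role of the regression line and its crossing point with the nominal level can be illustrated with a small sketch. It fits an ordinary least-squares line to the empirical levels of Zm at N = 30 and α = 0.01 (the values reported in Table 4.4) and solves for the ρ at which the fitted line meets the nominal level. The function names `fit_line` and `crossing_point` are illustrative only, and this is not claimed to be the exact procedure behind ρ0 = 0.2, which the text describes as an average over several settings tuned by simulation.

```python
# Minimal sketch: least-squares line through empirical levels of Zm and the
# value of rho at which that line crosses the nominal level.
# Helper names are hypothetical; data are the N = 30, alpha = 0.01 Zm values
# from Table 4.4.

def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def crossing_point(intercept, slope, nominal):
    """Value of rho where the fitted line equals the nominal level."""
    return (nominal - intercept) / slope

rhos = [i / 10 for i in range(10)]  # 0.0, 0.1, ..., 0.9
zm_levels_n30 = [0.0073, 0.0059, 0.0082, 0.0090, 0.0109,
                 0.0125, 0.0142, 0.0166, 0.0218, 0.0264]

intercept, slope = fit_line(rhos, zm_levels_n30)
rho_cross = crossing_point(intercept, slope, nominal=0.01)
print(f"slope = {slope:.4f}, crossing at rho = {rho_cross:.3f}")
```

Repeating the fit for each sample size and nominal level, and averaging the crossing points, would give one plausible way to arrive at a single ρ0.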
[Figure: four panels, (a) N = 30, (b) N = 60, (c) N = 90, (d) N = 120; x-axis: correlation coefficient; y-axis: empirical sizes of Zm]
Figure 4.4: Empirical levels of Meng’s Z at α = 0.01
Figure 4.5 shows the empirical size of the modified Zm together with the regression
lines. Comparing the regression lines of the empirical levels for the modified Zm
(see Figure 4.5) with those for the original Zm (see Figure 4.4), the lines for the
modified Zm are more nearly parallel and closer to the nominal level lines at all four
sample sizes. This shows that the method of modification described in Section 3.3
effectively re-normalizes the test statistic when the data are generated with the
pattern Σ1. However, the method still has limitations because it cannot be applied
to general cases. The test statistics should instead be modified by estimating their
expected values and variances. We suggest that researchers who work on the same
issue consider this direction.
Table 4.4: Empirical levels of Zm and Zm modified with Σ1
          α = 0.01             α = 0.05             α = 0.10
N    ρ    Zm      Zm modified  Zm      Zm modified  Zm      Zm modified
30   0    0.0073  0.0094       0.0461  0.0533       0.0940  0.1067
30   0.1  0.0059  0.0091       0.0443  0.0498       0.0966  0.1034
30   0.2  0.0082  0.0090       0.0470  0.0471       0.1023  0.0966
30   0.3  0.0090  0.0089       0.0476  0.0494       0.0954  0.0981
30   0.4  0.0109  0.0089       0.0535  0.0494       0.1055  0.0969
30   0.5  0.0125  0.0100       0.0616  0.0468       0.1164  0.0989
30   0.6  0.0142  0.0095       0.0637  0.0454       0.1230  0.1013
30   0.7  0.0166  0.0095       0.0673  0.0487       0.1269  0.0998
30   0.8  0.0218  0.0102       0.0788  0.0517       0.1399  0.1028
30   0.9  0.0264  0.0103       0.0898  0.0504       0.1552  0.0998
60   0    0.0091  0.0102       0.0437  0.0552       0.0972  0.1097
60   0.1  0.0088  0.0096       0.0462  0.0480       0.0988  0.0996
60   0.2  0.0088  0.0100       0.0539  0.0491       0.1092  0.1005
60   0.3  0.0113  0.0101       0.0503  0.0480       0.1043  0.0959
60   0.4  0.0125  0.0095       0.0538  0.0494       0.1084  0.0971
60   0.5  0.0128  0.0097       0.0613  0.0467       0.1128  0.0961
60   0.6  0.0143  0.0109       0.0634  0.0470       0.1189  0.0967
60   0.7  0.0169  0.0097       0.0684  0.0506       0.1254  0.0994
60   0.8  0.0207  0.0107       0.0731  0.0502       0.1312  0.1051
60   0.9  0.0253  0.0103       0.0874  0.0499       0.1480  0.1021
90   0    0.0089  0.0105       0.0482  0.0546       0.0979  0.1047
90   0.1  0.0088  0.0102       0.0469  0.0511       0.0964  0.1021
90   0.2  0.0105  0.0094       0.0490  0.0512       0.0973  0.1023
90   0.3  0.0116  0.0092       0.0537  0.0509       0.1058  0.1009
90   0.4  0.0118  0.0091       0.0568  0.0493       0.1126  0.1011
90   0.5  0.0143  0.0080       0.0622  0.0501       0.1157  0.1002
90   0.6  0.0159  0.0097       0.0687  0.0499       0.1239  0.0986
90   0.7  0.0205  0.0101       0.0757  0.0489       0.1351  0.1011
90   0.8  0.0223  0.0096       0.0801  0.0501       0.1404  0.0991
90   0.9  0.0222  0.0089       0.0825  0.0491       0.1432  0.0990
120  0    0.0091  0.0095       0.0478  0.0558       0.0991  0.1111
120  0.1  0.0092  0.0102       0.0481  0.0507       0.0994  0.1004
120  0.2  0.0101  0.0089       0.0481  0.0474       0.0992  0.0992
120  0.3  0.0115  0.0092       0.0534  0.0501       0.1038  0.1012
120  0.4  0.0114  0.0090       0.0544  0.0474       0.1106  0.0989
120  0.5  0.0125  0.0081       0.0554  0.0492       0.1153  0.0991
120  0.6  0.0134  0.0096       0.0619  0.0482       0.1161  0.0993
120  0.7  0.0209  0.0107       0.0755  0.0494       0.1382  0.1018
120  0.8  0.0223  0.0102       0.0807  0.0501       0.1462  0.1009
120  0.9  0.0231  0.0091       0.0815  0.0474       0.1469  0.1008
[Figure: four panels, (a) N = 30, (b) N = 60, (c) N = 90, (d) N = 120; x-axis: correlation coefficient; y-axis: empirical sizes of modified Zm]
Figure 4.5: Empirical level of modified Meng’s Z at α = 0.01
2. Result with Σ2
Table 4.5 presents the results of the simulation study based on data generated
with Σ2, consistent with H0 : ρ12 = ρ23, for sample sizes 30, 60, 90 and 120
at nominal level α = 0.01. We generated samples with Σ2 in order to compare the eight
test statistics as the correlation coefficient ρ13 in Σ changes.
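As a rough illustration of the two patterns, the sketch below builds 3 × 3 correlation matrices and checks that they are valid (positive definite). It assumes, as a reading of the text and of the column headers of Table 4.5 rather than a definition taken from this section, that Σ1 sets ρ12 = ρ23 = ρ with ρ13 = ρ², and that Σ2 sets ρ12 = ρ23 = ρ1 with ρ13 = ρ2 varied freely. All function names are hypothetical.

```python
# Minimal sketch under the stated assumptions about the two patterns.

def corr_matrix(r12, r23, r13):
    """3 x 3 correlation matrix for variables 1, 2, 3."""
    return [[1.0, r12, r13],
            [r12, 1.0, r23],
            [r13, r23, 1.0]]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_positive_definite(m):
    """Sylvester's criterion: all leading principal minors are positive."""
    minor2 = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] > 0 and minor2 > 0 and det3(m) > 0

def sigma1(rho):
    """Assumed pattern 1: rho12 = rho23 = rho, rho13 = rho ** 2."""
    return corr_matrix(rho, rho, rho ** 2)

def sigma2(rho1, rho2):
    """Assumed pattern 2: rho12 = rho23 = rho1, rho13 = rho2."""
    return corr_matrix(rho1, rho1, rho2)

# Grid of (rho1, rho2) values used in Table 4.5.
grid = [(r1, r2) for r1 in (0.0, 0.1, 0.3, 0.5, 0.7)
        for r2 in (0.1, 0.3, 0.5, 0.7)]
all_valid = all(is_positive_definite(sigma2(r1, r2)) for r1, r2 in grid)
print("all Table 4.5 settings give valid correlation matrices:", all_valid)
```

Under this reading the determinant of Σ1 equals (1 − ρ²)², so Σ1 is valid for every |ρ| < 1, whereas Σ2 restricts which pairs (ρ1, ρ2) are admissible.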
From the empirical levels in Table 4.5, we find that the empirical level of each
test statistic changes with the magnitude of ρ13. As mentioned in the results
for Σ1, seven of the eight test statistics, Dunn and Clark’s Z, Steiger’s Z,
Meng’s Z, Hittner’s Z, Hotelling’s t, Williams’s t and Williams’s modified t per
Hendrickson, are conservative in terms of empirical level when the correlation
coefficient ρ1 is small to moderate. Table 4.5 shows that the magnitude of ρ2 has
little effect on the empirical levels of the test statistics when ρ1 is small
(i.e., 0 to 0.3). As ρ2 increases, the empirical levels move further below the
nominal level, so the change in ρ2 does not help. This means that when researchers
choose a test statistic for real data, they need not pay much attention to the
correlation ρ2 when ρ1 is small.
However, when ρ1 is medium to large (i.e., 0.5 and 0.7), the magnitude of ρ2
plays a role in controlling the empirical levels. The effect of ρ2 is more pronounced
when ρ1 equals 0.5. For example, consider the small sample size with ρ = 0.5
under Σ1 (i.e., n = 30, ρ12 = ρ23 = 0.5, and ρ13 = 0.5² = 0.25): the empirical levels
for Steiger’s Z, Meng’s Z, Hittner’s Z and Williams’s t shown in Table 4.1 are 0.0141,
0.0125, 0.0125 and 0.0134 respectively, all of which exceed the nominal level
α = 0.01. In the same situation but with the sample generated under Σ2, the same
four test statistics show relatively optimal empirical levels ranging from 0.0099
to 0.0111, except at ρ2 = 0.5. On the other hand, although the effect of ρ2 is less
obvious when ρ1 is large (i.e., ρ1 = 0.7), the test statistics still perform better
as ρ2 changes. Comparing Table 4.1 with Table 4.5 for Steiger’s Z, Meng’s Z,
Hittner’s Z and Williams’s t, their empirical levels move closer to the nominal level
as ρ13 in Σ increases.
Even though the effect of ρ13 in Σ is not consistent across all situations, we can
modify the test statistics by following the method above. To deal with
the inflated empirical levels as ρ12 = ρ23 increases, we can modify the test statistics
by multiplying them by an empirical function of the correlation coefficients, similar
to the one discussed in the previous part.
Table 4.5: Empirical levels of eight statistics for nominal level α = 0.01 with Σ2
ρ1   ρ2   Zo      Zd      Zs      Zm      Zh      Th      Tw      Tm
N = 30
0    0.1  0.0183  0.0093  0.0069  0.0057  0.0057  0.0089  0.0085  0.0089
0    0.3  0.0156  0.0079  0.0064  0.0053  0.0053  0.0080  0.0078  0.0080
0    0.5  0.0138  0.0065  0.0054  0.0040  0.0040  0.0070  0.0068  0.0070
0    0.7  0.0114  0.0062  0.0053  0.0043  0.0043  0.0086  0.0086  0.0086
0.1  0.1  0.0237  0.0115  0.0091  0.0079  0.0079  0.0114  0.0106  0.0113
0.1  0.3  0.0188  0.0096  0.0081  0.0066  0.0066  0.0098  0.0094  0.0097
0.1  0.5  0.0164  0.0091  0.0074  0.0061  0.0061  0.0096  0.0094  0.0096
0.1  0.7  0.0131  0.0072  0.0064  0.0051  0.0051  0.0090  0.0088  0.0089
0.3  0.1  0.0204  0.0128  0.0103  0.0086  0.0086  0.0131  0.0109  0.0128
0.3  0.3  0.0194  0.0130  0.0112  0.0098  0.0098  0.0134  0.0124  0.0134
0.3  0.5  0.0154  0.0098  0.0084  0.0070  0.0070  0.0109  0.0102  0.0108
0.3  0.7  0.0121  0.0090  0.0074  0.0064  0.0064  0.0104  0.0101  0.0104
0.5  0.1  0.0158  0.0121  0.0110  0.0104  0.0104  0.0161  0.0103  0.0161
0.5  0.3  0.0138  0.0122  0.0104  0.0099  0.0099  0.0139  0.0105  0.0139
0.5  0.5  0.0154  0.0098  0.0084  0.0070  0.0070  0.0109  0.0102  0.0108
0.5  0.7  0.0112  0.0134  0.0114  0.0100  0.0100  0.0139  0.0103  0.0139
0.7  0.1  0.0113  0.0187  0.0178  0.0187  0.0187  0.0617  0.0149  0.0616
0.7  0.3  0.0108  0.0195  0.0183  0.0180  0.0180  0.0335  0.0162  0.0333
0.7  0.5  0.0092  0.0193  0.0175  0.0162  0.0162  0.0249  0.0155  0.0248
0.7  0.7  0.0078  0.0205  0.0180  0.0165  0.0165  0.0211  0.0186  0.0211
N = 60
0    0.1  0.0136  0.0093  0.0084  0.0069  0.0069  0.0090  0.0089  0.0090
0    0.3  0.0136  0.0089  0.0079  0.0069  0.0069  0.0089  0.0088  0.0089
0    0.5  0.0131  0.0092  0.0077  0.0074  0.0074  0.0097  0.0096  0.0097
0    0.7  0.0107  0.0076  0.0072  0.0069  0.0069  0.0082  0.0082  0.0082
0.1  0.1  0.0161  0.0110  0.0102  0.0090  0.0090  0.0109  0.0107  0.0109
0.1  0.3  0.0144  0.0093  0.0081  0.0072  0.0072  0.0092  0.0091  0.0092
0.1  0.5  0.0128  0.0094  0.0086  0.0077  0.0077  0.0094  0.0094  0.0094
0.1  0.7  0.0115  0.0088  0.0080  0.0076  0.0076  0.0100  0.0099  0.0100
0.3  0.1  0.0174  0.0131  0.0122  0.0114  0.0114  0.0141  0.0126  0.0141
0.3  0.3  0.0155  0.0123  0.0112  0.0105  0.0105  0.0129  0.0120  0.0129
0.3  0.5  0.0137  0.0107  0.0103  0.0097  0.0097  0.0113  0.0107  0.0113
0.3  0.7  0.0120  0.0104  0.0093  0.0088  0.0088  0.0113  0.0110  0.0113
0.5  0.1  0.0138  0.0124  0.0119  0.0114  0.0114  0.0179  0.0117  0.0179
0.5  0.3  0.0126  0.0119  0.0110  0.0102  0.0102  0.0150  0.0109  0.0150
0.5  0.5  0.0149  0.0146  0.0139  0.0131  0.0131  0.0158  0.0141  0.0158
0.5  0.7  0.0136  0.0142  0.0134  0.0126  0.0126  0.0148  0.0142  0.0148
0.7  0.1  0.0132  0.0169  0.0166  0.0168  0.0168  0.0637  0.0145  0.0637
0.7  0.3  0.0134  0.0175  0.0167  0.0165  0.0165  0.0356  0.0156  0.0356
0.7  0.5  0.0124  0.0163  0.0159  0.0156  0.0156  0.0224  0.0154  0.0224
0.7  0.7  0.0130  0.0190  0.0177  0.0168  0.0168  0.0203  0.0177  0.0203
N = 90
0    0.1  0.0124  0.0092  0.0088  0.0085  0.0085  0.0091  0.0090  0.0091
0    0.3  0.0117  0.0090  0.0084  0.0075  0.0075  0.0089  0.0089  0.0089
0    0.5  0.0109  0.0088  0.0085  0.0081  0.0081  0.0089  0.0089  0.0089
0    0.7  0.0112  0.0100  0.0099  0.0093  0.0093  0.0104  0.0104  0.0104
0.1  0.1  0.0140  0.0102  0.0094  0.0090  0.0090  0.0099  0.0097  0.0099
0.1  0.3  0.0143  0.0117  0.0107  0.0093  0.0093  0.0116  0.0116  0.0116
0.1  0.5  0.0130  0.0102  0.0097  0.0093  0.0093  0.0107  0.0104  0.0107
0.1  0.7  0.0110  0.0095  0.0090  0.0085  0.0085  0.0101  0.0101  0.0101
0.3  0.1  0.0172  0.0141  0.0135  0.0127  0.0127  0.0157  0.0135  0.0157
0.3  0.3  0.0135  0.0116  0.0112  0.0110  0.0110  0.0122  0.0114  0.0122
0.3  0.5  0.0123  0.0109  0.0105  0.0103  0.0103  0.0114  0.0110  0.0114
0.3  0.7  0.0141  0.0128  0.0119  0.0115  0.0115  0.0135  0.0133  0.0135
0.5  0.1  0.0155  0.0147  0.0140  0.0136  0.0136  0.0197  0.0138  0.0197
0.5  0.3  0.0134  0.0133  0.0131  0.0127  0.0127  0.0159  0.0130  0.0159
0.5  0.5  0.0148  0.0148  0.0142  0.0136  0.0136  0.0163  0.0143  0.0163
0.5  0.7  0.0124  0.0127  0.0119  0.0116  0.0116  0.0134  0.0127  0.0134
0.7  0.1  0.0153  0.0180  0.0174  0.0177  0.0177  0.0657  0.0164  0.0657
0.7  0.3  0.0159  0.0180  0.0179  0.0178  0.0178  0.0344  0.0172  0.0343
0.7  0.5  0.0157  0.0179  0.0174  0.0172  0.0172  0.0233  0.0171  0.0233
0.7  0.7  0.0111  0.0155  0.0150  0.0142  0.0142  0.0164  0.0153  0.0164
N = 120
0    0.1  0.0120  0.0099  0.0092  0.0088  0.0088  0.0095  0.0095  0.0095
0    0.3  0.0107  0.0096  0.0090  0.0085  0.0085  0.0096  0.0095  0.0096
0    0.5  0.0125  0.0102  0.0099  0.0098  0.0098  0.0106  0.0105  0.0106
0    0.7  0.0123  0.0113  0.0110  0.0109  0.0109  0.0114  0.0114  0.0114
0.1  0.1  0.0140  0.0119  0.0112  0.0103  0.0103  0.0119  0.0117  0.0119
0.1  0.3  0.0129  0.0114  0.0109  0.0098  0.0098  0.0114  0.0114  0.0114
0.1  0.5  0.0118  0.0095  0.0090  0.0088  0.0088  0.0098  0.0098  0.0098
0.1  0.7  0.0113  0.0101  0.0100  0.0097  0.0097  0.0105  0.0105  0.0105
0.3  0.1  0.0153  0.0132  0.0126  0.0122  0.0122  0.0142  0.0128  0.0142
0.3  0.3  0.0137  0.0125  0.0122  0.0114  0.0114  0.0131  0.0125  0.0131
0.3  0.5  0.0134  0.0119  0.0113  0.0111  0.0111  0.0123  0.0118  0.0123
0.3  0.7  0.0135  0.0127  0.0124  0.0121  0.0121  0.0131  0.0130  0.0131
0.5  0.1  0.0149  0.0142  0.0141  0.0139  0.0139  0.0201  0.0140  0.0201
0.5  0.3  0.0140  0.0135  0.0132  0.0129  0.0129  0.0165  0.0132  0.0165
0.5  0.5  0.0120  0.0120  0.0115  0.0112  0.0112  0.0128  0.0119  0.0128
0.5  0.7  0.0127  0.0130  0.0122  0.0118  0.0118  0.0133  0.0129  0.0133
0.7  0.1  0.0145  0.0162  0.0159  0.0160  0.0160  0.0615  0.0156  0.0615
0.7  0.3  0.0156  0.0181  0.0179  0.0178  0.0178  0.0315  0.0173  0.0315
0.7  0.5  0.0148  0.0172  0.0166  0.0165  0.0165  0.0226  0.0165  0.0226
0.7  0.7  0.0142  0.0170  0.0162  0.0160  0.0160  0.0186  0.0162  0.0186
4.3.2 Statistical Power
The simulated power estimates at nominal level α = 0.05 for all eight test statistics are
presented in Table 4.6. Here ES denotes the effect size, that is, the magnitude
of the difference between the two correlations ρ12 and ρ23. The correlations used to
generate each effect size are as follows: for ES = 0.1, ρ12 = 0.5, ρ23 = 0.4; for ES =
0.15, ρ12 = 0.55, ρ23 = 0.4; for ES = 0.3, ρ12 = 0.5, ρ23 = 0.2; for ES = 0.5, ρ12 = 0.7,
ρ23 = 0.2. As the results indicate, we obtain acceptable levels of power (approximately
0.8 and higher) for all the moderate and large sample sizes (i.e., n = 60, 90, 120) with
an effect size of 0.5, regardless of the value of ρ13. For the smallest sample size,
acceptable levels of power occur only at an effect size of 0.5 with ρ13 = 0.5.
In addition, there are some trends in the estimated power across the statistical tests.
First, Olkin’s Z shows a general tendency across all situations: it yields
the highest empirical power in all cases, but this is accompanied by the inflated
empirical level noted earlier. An optimal test statistic
should yield an empirical level close to the nominal level together with high power
estimates. We cannot recommend that applied researchers use Olkin’s Z because its
high power comes at the cost of an inflated empirical level.
Among the other seven test statistics, Dunn and Clark’s
Z, Hotelling’s t and Williams’s modified t per Hendrickson are essentially similar in
terms of empirical power. Likewise, Steiger’s Z, Meng’s Z, Hittner’s Z
and Williams’s t yield nearly equivalent empirical power. The
empirical powers of the first three test statistics are higher than those of the
other four in all cases. In addition, the power simulation reveals a similar pattern
across the different sample sizes: the empirical
power of the statistical tests increases as the sample size becomes large. We should
therefore take the sample size into account when modifying the test statistics.
Table 4.6: Empirical power of all eight statistics for nominal level α = 0.01
ES    ρ13   Zo      Zd      Zs      Zm      Zh      Th      Tw      Tm
N = 30
0.1   0.1   0.0949  0.0795  0.0760  0.0735  0.0735  0.0899  0.0743  0.0899
0.1   0.15  0.0915  0.0763  0.0727  0.0706  0.0706  0.0845  0.0714  0.0844
0.1   0.3   0.0985  0.0855  0.0817  0.0788  0.0788  0.0904  0.0816  0.0902
0.1   0.5   0.0981  0.0886  0.0845  0.0809  0.0809  0.0908  0.0865  0.0907
0.15  0.1   0.1263  0.1116  0.1080  0.1054  0.1054  0.1263  0.1049  0.1262
0.15  0.15  0.1268  0.1102  0.1053  0.1020  0.1020  0.1232  0.1029  0.1231
0.15  0.3   0.1325  0.1199  0.1155  0.1115  0.1115  0.1257  0.1148  0.1256
0.15  0.5   0.1488  0.1354  0.1293  0.1242  0.1242  0.1392  0.1313  0.1391
0.3   0.1   0.2568  0.2211  0.2152  0.2088  0.2088  0.2288  0.2141  0.2287
0.3   0.15  0.2675  0.2323  0.2242  0.2164  0.2164  0.2391  0.2240  0.2390
0.3   0.3   0.3024  0.2675  0.2573  0.2492  0.2492  0.2711  0.2607  0.2710
0.3   0.5   0.3630  0.3263  0.3174  0.3091  0.3091  0.3282  0.3224  0.3282
0.5   0.1   0.6600  0.6259  0.6190  0.6125  0.6125  0.6509  0.6149  0.6503
0.5   0.15  0.6581  0.6290  0.6213  0.6135  0.6136  0.6489  0.6180  0.6484
0.5   0.3   0.7396  0.7169  0.7082  0.6997  0.6997  0.7265  0.7086  0.7265
0.5   0.5   0.8469  0.8374  0.8320  0.8268  0.8268  0.8397  0.8335  0.8397
N = 60
0.1   0.1   0.1156  0.1049  0.1033  0.1007  0.1007  0.1199  0.1016  0.1199
0.1   0.15  0.1131  0.1041  0.1023  0.1003  0.1003  0.1160  0.1018  0.1159
0.1   0.3   0.1215  0.1134  0.1102  0.1076  0.1076  0.1207  0.1099  0.1207
0.1   0.5   0.1352  0.1294  0.1256  0.1237  0.1237  0.1327  0.1268  0.1327
0.15  0.1   0.1817  0.1703  0.1683  0.1660  0.1660  0.1926  0.1670  0.1926
0.15  0.15  0.1844  0.1747  0.1719  0.1695  0.1695  0.1928  0.1701  0.1928
0.15  0.3   0.2051  0.1950  0.1910  0.1878  0.1878  0.2084  0.1900  0.2084
0.15  0.5   0.2387  0.2284  0.2249  0.2209  0.2209  0.2342  0.2262  0.2342
0.3   0.1   0.4287  0.4062  0.4008  0.3965  0.3965  0.4183  0.4001  0.4182
0.3   0.15  0.4441  0.4215  0.4169  0.4111  0.4111  0.4321  0.4169  0.4321
0.3   0.3   0.5025  0.4811  0.4765  0.4708  0.4708  0.4872  0.4776  0.4872
0.3   0.5   0.6212  0.6009  0.5955  0.5912  0.5912  0.6043  0.5985  0.6043
0.5   0.1   0.9001  0.8927  0.8908  0.8892  0.8892  0.9050  0.8902  0.9050
0.5   0.15  0.9144  0.9085  0.9069  0.9054  0.9054  0.9183  0.9060  0.9183
0.5   0.3   0.9562  0.9537  0.9529  0.9520  0.9520  0.9570  0.9527  0.9570
0.5   0.5   0.9891  0.9889  0.9888  0.9887  0.9887  0.9891  0.9888  0.9891
N = 90
0.1   0.1   0.1268  0.1193  0.1179  0.1164  0.1164  0.1379  0.1173  0.1379
0.1   0.15  0.1367  0.1302  0.1287  0.1270  0.1270  0.1457  0.1279  0.1457
0.1   0.3   0.1460  0.1404  0.1390  0.1374  0.1374  0.1496  0.1390  0.1496
0.1   0.5   0.1671  0.1620  0.1598  0.1581  0.1581  0.1670  0.1605  0.1670
0.15  0.1   0.2311  0.2222  0.2205  0.2183  0.2183  0.2498  0.2189  0.2497
0.15  0.15  0.2351  0.2269  0.2252  0.2239  0.2239  0.2503  0.2245  0.2503
0.15  0.3   0.2540  0.2468  0.2442  0.2421  0.2421  0.2606  0.2440  0.2606
0.15  0.5   0.3351  0.3269  0.3243  0.3219  0.3219  0.3337  0.3248  0.3337
0.3   0.1   0.5763  0.5594  0.5567  0.5530  0.5530  0.5756  0.5562  0.5756
0.3   0.15  0.5932  0.5771  0.5736  0.5704  0.5704  0.5900  0.5734  0.5900
0.3   0.3   0.6555  0.6416  0.6390  0.6361  0.6361  0.6487  0.6395  0.6487
0.3   0.5   0.7815  0.7716  0.7676  0.7648  0.7648  0.7742  0.7699  0.7742
0.5   0.1   0.9802  0.9792  0.9789  0.9786  0.9786  0.9823  0.9786  0.9823
0.5   0.15  0.9831  0.9819  0.9814  0.9808  0.9809  0.9848  0.9813  0.9848
0.5   0.3   0.9934  0.9932  0.9931  0.9931  0.9932  0.9936  0.9931  0.9936
0.5   0.5   0.9991  0.9991  0.9991  0.9991  0.9992  0.9991  0.9991  0.9991
N = 120
0.1   0.1   0.1502  0.1455  0.1445  0.1429  0.1429  0.1626  0.1436  0.1625
0.1   0.15  0.1593  0.1530  0.1520  0.1512  0.1512  0.1711  0.1515  0.1711
0.1   0.3   0.1722  0.1664  0.1654  0.1647  0.1647  0.1764  0.1653  0.1764
0.1   0.5   0.2030  0.1978  0.1960  0.1948  0.1948  0.2027  0.1962  0.2027
0.15  0.1   0.2813  0.2745  0.2728  0.2713  0.2713  0.3041  0.2719  0.3041
0.15  0.15  0.2874  0.2796  0.2778  0.2762  0.2762  0.3061  0.2767  0.3061
0.15  0.3   0.3292  0.3231  0.3212  0.3191  0.3191  0.3396  0.3207  0.3396
0.15  0.5   0.3973  0.3910  0.3888  0.3873  0.3873  0.3995  0.3894  0.3995
0.3   0.1   0.6898  0.6793  0.6773  0.6745  0.6745  0.6925  0.6771  0.6925
0.3   0.15  0.7134  0.7037  0.7017  0.6998  0.6998  0.7148  0.7017  0.7148
0.3   0.3   0.7790  0.7705  0.7688  0.7671  0.7671  0.7767  0.7693  0.7767
0.3   0.5   0.8829  0.8769  0.8755  0.8743  0.8743  0.8796  0.8764  0.8796
0.5   0.1   0.9951  0.9948  0.9947  0.9947  0.9947  0.9958  0.9947  0.9958
0.5   0.15  0.9963  0.9960  0.9958  0.9958  0.9959  0.9970  0.9958  0.9970
0.5   0.3   0.9991  0.9990  0.9990  0.9990  0.9990  0.9991  0.9990  0.9991
0.5   0.5   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
4.4 Discussion
To sum up this chapter, the simulation results support the assertion that
Olkin’s Z has a higher (inflated) empirical level and the
highest empirical power among the test statistics, especially when ρ12 and ρ23 are
small to moderate, so we cannot recommend applying Olkin’s Z to medical data.
In addition, based on the simulation results above, we advise researchers
interested in this topic to choose the test statistic according to the situation
at hand. When correlations are small to moderate (i.e., 0 to 0.4), the other seven
test statistics, Dunn and Clark’s Z, Steiger’s Z, Meng’s Z, Hittner’s Z, Hotelling’s t,
Williams’s t and Williams’s modified t per Hendrickson, perform well and are
appropriate for real medical data. When the sample size is small, Steiger’s Z,
Meng’s Z, Hittner’s Z and Williams’s t are relatively optimal. More specifically,
when the correlation is small (i.e., 0 to 0.2), Dunn and Clark’s Z, Hotelling’s t and
Williams’s modified t per Hendrickson perform well; when the correlation is moderate
(i.e., 0.3 to 0.4), Steiger’s Z, Meng’s Z, Hittner’s Z and Williams’s t are more
appropriate.
Moreover, we do not recommend the other seven test statistics when the correlation
coefficients are large (i.e., 0.5 to 0.9) unless they are modified first. The method
of modification described in Chapter 3 was applied to data generated with pattern Σ1
and effectively re-normalized the test statistic, even though the method cannot be
used in general cases and is complicated to apply to real medical data. In
particular, the simulation revealed a starting point for modification: ρ13 plays a
role in Σ, and changing ρ13 can improve the tests so that the empirical
levels hover around the nominal levels. Researchers can extend this idea of
modification when they deal with the same issues for ordinal variables. In the next
chapter, we therefore apply all the test statistics except Olkin’s Z to the real
medical data and compare their results.
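For readers who want to try one of the recommended statistics, the following sketch computes Williams’s t for two dependent correlations that share a variable. It uses the textbook form of the statistic as presented by Steiger (1980), referred to a t distribution with N − 3 degrees of freedom; the thesis’s own definition (in an earlier chapter) is not reproduced here and may differ in notation. The function name `williams_t` and the example inputs are illustrative only.

```python
import math

def williams_t(r12, r23, r13, n):
    """Williams's t for H0: rho12 = rho23 (variable 2 is shared).

    Textbook form (Steiger, 1980); returns (t statistic, degrees of freedom).
    r13 is the correlation between the two non-shared variables.
    """
    # Determinant of the 3 x 3 sample correlation matrix.
    det_r = 1 - r12**2 - r23**2 - r13**2 + 2 * r12 * r23 * r13
    r_bar = (r12 + r23) / 2
    numerator = (n - 1) * (1 + r13)
    denominator = 2 * (n - 1) / (n - 3) * det_r + r_bar**2 * (1 - r13) ** 3
    t_value = (r12 - r23) * math.sqrt(numerator / denominator)
    return t_value, n - 3

# Illustrative inputs matching the ES = 0.5 setting (rho12 = 0.7, rho23 = 0.2)
# with rho13 = 0.1 and N = 60; these are population values used as stand-ins
# for sample correlations.
t_value, df = williams_t(0.7, 0.2, 0.1, 60)
print(f"t = {t_value:.3f} on {df} degrees of freedom")
```

The statistic would then be compared with the critical value of the t distribution on N − 3 degrees of freedom at the chosen nominal level.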
Chapter 5
Applications
5.1 Data Sources and Features
5.1.1 Data Sources
As mentioned in the introduction, the medical data come from a cancer patient
study conducted by a medical institution in the United States. The data include the
patients’ pain scale data, history of certain disease treatments, chemotherapy and
medication treatments at multiple time points, concomitant medication records,
neuromeres treatment, and individual demographics, etc. Patient information was
recorded several times; the first visit is a baseline time without any medical
treatment in the medical institution, and the other visits are study visit times at
which the patients had received chemotherapy or medical treatment. However, some
patients dropped out or missed a few visits due to death or moving. The types of
variables in the medical data are ordinal, nominal, string, interval, continuous and
ratio. In order to test the equality of two or more correlations with a single
ordinal longitudinal variable based on a common sample, we are concerned with the
correlations between three pairs of adjacent visit times for each of five ordinal
variables. The five ordinal variables in the common sample are deep pain sensation
(X), pain intensity (Y), lack of energy (A), nausea (B), and joint pain/muscle
cramps (C). The first two are neuropathic pain data, recorded as ordinal variables
ranging from 0 to 10. The remaining three show the patient’s performance status
during the medical treatment. They are nominal variables, but we treat them as
ordinal, ranging from 0 to 4, where 0 = not at all; 1 = a little bit; 2 = somewhat;
3 = quite a bit; 4 = very much.
5.1.2 Features
We first consider deep pain sensation at baseline time t0, visit time t1, visit time t2
and visit time t3 to explain the features of the ordinal data set. After arranging the
data into three contingency tables, that is, Table 5.1: X(t0) vs X(t1), Table 5.2:
X(t1) vs X(t2) and Table 5.3: X(t2) vs X(t3), several features of the data appear.
Table 5.1: Deep Pain Sensation (X) at baseline time t0 and visit time t1 without modification of data
X(t0) X(t1) 0 1 2 3 4 5 6 7 8 9 10 Total
0 14 4 1 2 21
1 1 1 1 3
2 1 1 2
3 0
4 1 1
5 1 1
6 1 1
7 1 1
8 0
9 0
10 0
Total 16 5 2 0 0 3 2 1 1 0 0 30
Note: The empty grids are filled with “0”.
Table 5.2: Deep Pain Sensation (X) at visit time t1 and visit time t2 without modification of data
X(t1) X(t2) 0 1 2 3 4 5 6 7 8 9 10 Total
0 10 2 1 1 1 15
1 3 1 4
2 1 1 2
3 0
4 0
5 1 2 3
6 1 1
7 1 1
8 1 1
9 0
10 0
Total 14 2 0 1 4 1 1 2 1 1 0 27
Note: The empty grids are filled with “0”.
Table 5.3: Deep Pain Sensation (X) at visit time t2 and visit time t3 without modification of data
X(t2) X(t3) 0 1 2 3 4 5 6 7 8 9 10 Total
0 8 1 1 10
1 1 1 2
2 0
3 1 1
4 1 1 1 1 4
5 1 1
6 1 1 2
7 1 1
8 0
9 1 1
10 0
Total 11 0 1 2 1 1 3 2 1 0 0 22
Note: The empty grids are filled with “0”.
First of all, deep pain sensation in this sample is an ordinal variable. Its scale
is divided into 11 ordered categories from 0 to 10. Even though the deep pain scale
cannot describe the difference between categories numerically, the difference does
exist. Secondly, the data are extremely sparse. Since the variable has 11 ordinal
categories, each contingency table has 121 cells in total, but only 30 patients are
studied in this project. Most of the cells therefore contain no observations, so the
ordinal table is extremely sparse. Last but not least, the total number of
observations differs between contingency tables because patients dropped out of the
project over time. The first table (Table 5.1) shows X at the baseline time t0 and
visit time t1 with 30 patients. Table 5.2 displays X at visit time t1 and visit time
t2 with 27 patients. In the third contingency table (Table 5.3), only 22 patients
are left.
For now, given these features of the data, we cannot analyze the correlations of
the ordinal variable on a common sample. Besides, the extremely sparse contingency
tables will affect the results. That is, we need to modify the data.
5.1.3 Modification of the real data in contingency tables
The data in the contingency tables are modified in the following steps. We again take
the ordinal variable deep pain sensation as an example. To begin with, in
order to solve the problem of the extremely sparse ordinal contingency table, we reset
the ordinal categories of the variable to 5 levels. We combine pain scales 0 and 1 as
0, 2 and 3 as 1, 4 and 5 as 2, 6 and 7 as 3, and 8, 9 and 10 as 4. The three categories
8, 9 and 10 are combined into one category because only a few observations fall in
that range. In addition, the objective of this chapter is to compare the correlations
of one ordinal variable between adjacent visit times (i.e., t0, t1, t2 and t3). This
means that the correlations must all be based on the same sample of specified size N.
On re-examining the data, we found 22 valid observations
that were recorded at all four visit times. After modifying
the data, we obtain the following three new contingency tables (Tables 5.4, 5.5 and 5.6)
for deep pain sensation.
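The recoding step can be written compactly. The sketch below assumes the top category is 4, so that the collapsed levels run from 0 to 4 as in Tables 5.4 to 5.6; the function names and the example scores are hypothetical and are not taken from the medical data.

```python
from collections import Counter

def collapse_pain_scale(score):
    """Map an 11-point pain score (0..10) to 5 levels (0..4).

    0,1 -> 0; 2,3 -> 1; 4,5 -> 2; 6,7 -> 3; 8,9,10 -> 4 (assumed top level).
    """
    if not 0 <= score <= 10:
        raise ValueError("pain score must be between 0 and 10")
    return min(score // 2, 4)

def cross_tabulate(first_visit, second_visit):
    """Count (level at first visit, level at second visit) pairs."""
    pairs = zip(map(collapse_pain_scale, first_visit),
                map(collapse_pain_scale, second_visit))
    return Counter(pairs)

# Hypothetical scores for four patients at two adjacent visits.
table = cross_tabulate([0, 1, 5, 9], [0, 3, 4, 10])
print(sorted(table.items()))
```

Only patients observed at every visit would be passed to such a function, which is why the modified tables all have the same total.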
Following the same idea of modification, the contingency tables
(Table 5.7 to Table 5.18) for the remaining four ordinal variables, pain intensity (Y),
lack of energy (A), nausea (B), and joint pain/muscle cramps (C), are shown below.
Since the three ordinal variables lack of energy, nausea, and joint pain/muscle cramps
are already recorded on 5-level scales, we do not modify their categories.
Table 5.4: Deep Pain Sensation (X) at baseline time t0 and visit time t1
X(t0) X(t1) 0 1 2 3 4 Total
0 14 2 2 18
1 1 1
2 1 1 2
3 1 1
4 0
Total 15 2 3 1 1 22
Table 5.5: Deep Pain Sensation (X) at visit time t1 and visit time t2
X(t1) X(t2) 0 1 2 3 4 Total
0 11 1 2 1 15
1 1 1 2
2 1 2 3
3 1 1
4 1 1
Total 12 1 5 3 1 22
Table 5.6: Deep Pain Sensation (X) at visit time t2 and visit time t3
X(t2) X(t3) 0 1 2 3 4 Total
0 9 1 2 12
1 1 1
2 1 2 1 1 5
3 1 1 1 3
4 1 1
Total 11 3 2 5 1 22
Note: The empty grids are filled with “0”.
Table 5.7: Pain Intensity (Y) at baseline time t0 and visit time t1
Y(t0) Y(t1) 0 1 2 3 4 Total
0 14 2 1 1 1 19
1 1 1 2
2 0
3 1 1
4 0
Total 15 2 1 3 1 22
Note: The empty grids are filled with “0”.
Table 5.8: Pain Intensity (Y) at visit time t1 and visit time t2
Y(t1) Y(t2) 0 1 2 3 4 Total
0 10 1 2 1 1 15
1 1 1 2
2 1 1
3 1 1 1 3
4 1 1
Total 12 3 4 2 1 22
Table 5.9: Pain Intensity (Y) at visit time t2 and visit time t3
Y(t2) Y(t3) 0 1 2 3 4 Total
0 10 2 12
1 1 1 1 3
2 1 2 1 4
3 1 1 2
4 1 1
Total 12 5 2 2 1 22
Table 5.10: Lack of Energy (A) at baseline time t0 and visit time t1
A(t0) A(t1) 0 1 2 3 4 Total
0 2 3 3 1 9
1 3 5 8
2 1 1 1 3
3 2 2
4 0
Total 3 6 9 4 0 22
Table 5.11: Lack of Energy (A) at visit time t1 and visit time t2
A(t1) A(t2) 0 1 2 3 4 Total
0 1 1 1 3
1 4 1 1 6
2 3 2 2 2 9
3 1 2 1 4
4 0
Total 2 8 6 4 2 22
Table 5.12: Lack of Energy (A) at visit time t2 and visit time t3
A(t2) A(t3) 0 1 2 3 4 Total
0 1 1 2
1 1 3 3 1 8
2 1 3 1 1 6
3 1 2 1 4
4 1 1 2
Total 4 3 10 3 2 22
Note: The empty grids are filled with “0”.
Table 5.13: Nausea (B) at baseline time t0 and visit time t1
B(t0)B(t1) 0 1 2 3 4 Total
0 10 3 2 1 161 3 2 52 1 13 04 0
Total 10 6 5 0 1 22
Table 5.14: Nausea (B) at visit time t1 and visit time t2
B(t1) \ B(t2) 0 1 2 3 4 Total
0 9 1 101 2 2 2 62 1 2 1 1 53 04 1 1
Total 12 5 1 3 1 22
Table 5.15: Nausea (B) at visit time t2 and visit time t3
B(t2) \ B(t3) 0 1 2 3 4 Total
0 8 1 2 1 121 2 3 52 1 13 2 1 34 1 1
Total 10 7 3 1 1 22
Note: The empty grids are filled with “0”.
Table 5.16: Joint Pain/Muscle Cramps (C) at baseline time t0 and visit time t1
C(t0) \ C(t1) 0 1 2 3 4 Total
0 11 1 1 1 141 2 1 2 1 62 1 1 23 04 0
Total 14 2 3 3 0 22
Table 5.17: Joint Pain/Muscle Cramps (C) at visit time t1 and visit time t2
C(t1) \ C(t2) 0 1 2 3 4 Total
0 11 1 2 141 2 22 1 1 1 33 1 2 34 0
Total 12 2 6 2 0 22
Table 5.18: Joint Pain/Muscle Cramps (C) at visit time t2 and visit time t3
C(t2) \ C(t3) 0 1 2 3 4 Total
0 11 1 121 1 1 22 3 2 1 63 1 1 24 0
Total 15 3 2 2 0 22
Note: The empty grids are filled with “0”.
Therefore, as the tables above show, all five ordinal variables are now on a common
scale of 5 levels, from 0 to 4. Each table has 25 cells in total, and 22 observations
are studied in this project. In this way we improve the contingency tables and avoid the
potential problems that might arise from extremely sparse tables.
5.2 Test Results and Conclusion of the Medical Data
In the simulation study, we compared the eight test statistics that can be used to
test the equality of correlations based on a common ordinal longitudinal variable.
As discussed in the previous chapter, seven of the eight statistical tests perform
well whether the sample size is small or large.
The hypotheses for the medical data are
H0: r12 = r23 = r34;
Ha: at least one pair of rij differs, where i, j = 1, 2, 3, 4 and i ≠ j.
Because each test statistic compares only two correlations at a time, we cannot test
these hypotheses directly, so we test the adjacent pairs separately, that is,
H0: r12 = r23 vs Ha: r12 ≠ r23
H0: r23 = r34 vs Ha: r23 ≠ r34
where
r12: the correlation coefficient between an ordinal variable at baseline time t0 and
the same variable at visit time t1;
r23: the correlation coefficient between the ordinal variable at visit time t1 and
the same variable at visit time t2;
r34: the correlation coefficient between the ordinal variable at visit time t2 and
the same variable at visit time t3.
After processing the data sets and applying the appropriate test statistics, we obtain
the correlation coefficients for the five ordinal variables in Table 5.19, and the test
results in Table 5.20 to Table 5.24.
Table 5.19: Correlation coefficients for the ordinal variables
Ordinal variables           r12      r13       r23      r24      r34
Deep pain sensation         0.7460   0.3044    0.4437   0.1530   0.4970
Pain intensity              0.4326   0.2647    0.1679   0.5207   0.6886
Lack of energy              0.4253   -0.059    0.2868   0.5730   0.3636
Nausea                      0.3650   -0.1562   0.6983   0.5572   0.5015
Joint pain/muscle cramps    0.3865   0.3142    0.6914   0.6956   0.6045
Table 5.20: Results for deep pain sensation
Hypothesis tests    Test statistic    α = 0.01   α = 0.05   α = 0.10
H0: r12 = r23       Zd = 1.6183       NR         NR         NR
Ha: r12 ≠ r23       Zs = 1.6083       NR         NR         NR
                    Zm = 1.5992       NR         NR         NR
                    Zh = 1.5976       NR         NR         NR
                    Th = 1.7851       NR         NR         R
                    Tw = 1.6635       NR         NR         NR
                    Tm = 1.7847       NR         NR         R
H0: r23 = r34       Zd = −0.2159      NR         NR         NR
Ha: r23 ≠ r34       Zs = −0.2159      NR         NR         NR
                    Zm = −0.2159      NR         NR         NR
                    Zh = −0.2159      NR         NR         NR
                    Th = −0.2278      NR         NR         NR
                    Tw = −0.2171      NR         NR         NR
                    Tm = −0.2278      NR         NR         NR
Note for Table 5.20 to Table 5.24: 1. “R” denotes rejecting the null hypothesis;
“NR” denotes not rejecting the null hypothesis. 2. The critical values are
Z0.005 = ±2.575, Z0.025 = ±1.96, Z0.05 = ±1.645, t(19),0.01 = ±2.861,
t(19),0.05 = ±2.093 and t(19),0.10 = ±1.729, where the t subscripts give the
two-sided level α.
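As a cross-check on Table 5.20, the two t-type statistics can be recomputed directly from the deep pain sensation row of Table 5.19. The sketch below assumes the usual textbook forms of Hotelling's t and Williams' t with n − 3 = 19 degrees of freedom (the thesis defines its statistics in Chapter 3, so details may differ); the function names and argument order are illustrative only.

```python
import math

def det_r(r_jk, r_jh, r_kh):
    """Determinant of the 3x3 correlation matrix of variables j, k, h."""
    return 1 - r_jk**2 - r_jh**2 - r_kh**2 + 2 * r_jk * r_jh * r_kh

def hotelling_t(r_jk, r_jh, r_kh, n):
    """Hotelling's t for H0: r_jk = r_jh (assumed standard form, df = n - 3)."""
    return (r_jk - r_jh) * math.sqrt(
        (n - 3) * (1 + r_kh) / (2 * det_r(r_jk, r_jh, r_kh))
    )

def williams_t(r_jk, r_jh, r_kh, n):
    """Williams' t for H0: r_jk = r_jh (assumed standard form, df = n - 3)."""
    r_bar = (r_jk + r_jh) / 2
    denom = (2 * (n - 1) / (n - 3) * det_r(r_jk, r_jh, r_kh)
             + r_bar**2 * (1 - r_kh) ** 3)
    return (r_jk - r_jh) * math.sqrt((n - 1) * (1 + r_kh) / denom)

# Deep pain sensation row of Table 5.19. The two correlations share time t1,
# so r_jk = r12, r_jh = r23, and the remaining correlation is r_kh = r13.
r12, r23, r13, n = 0.7460, 0.4437, 0.3044, 22

# Two-sided t(19) critical values quoted in the table note.
t_crit = {0.01: 2.861, 0.05: 2.093, 0.10: 1.729}

th = hotelling_t(r12, r23, r13, n)
tw = williams_t(r12, r23, r13, n)
decisions = {
    name: {alpha: ("R" if abs(t) > cv else "NR") for alpha, cv in t_crit.items()}
    for name, t in (("Th", th), ("Tw", tw))
}
print(round(th, 3), round(tw, 3), decisions)
```

Under these assumed formulas the recomputed values agree with Th = 1.7851 and Tw = 1.6635 in Table 5.20 to within about 0.001, the small gap being attributable to the four-decimal rounding of the inputs.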
Table 5.21: Results for pain intensity
Hypothesis tests    Test statistic    α = 0.01   α = 0.05   α = 0.10
H0: r12 = r23       Zd = 1.1395       NR         NR         NR
Ha: r12 ≠ r23       Zs = 1.1320       NR         NR         NR
                    Zm = 1.1248       NR         NR         NR
                    Zh = 1.1305       NR         NR         NR
                    Th = 1.1690       NR         NR         NR
                    Tw = 1.1616       NR         NR         NR
                    Tm = 1.1689       NR         NR         NR
H0: r23 = r34       Zd = −2.5906      R          R          R
Ha: r23 ≠ r34       Zs = −2.5202      NR         R          R
                    Zm = −2.4574      NR         R          R
                    Zh = −2.4871      NR         R          R
                    Th = −2.9015      R          R          R
                    Tw = −2.8433      NR         R          R
                    Tm = −2.9004      R          R          R
Table 5.22: Results for lack of energy
Hypothesis tests    Test statistic    α = 0.01   α = 0.05   α = 0.10
H0: r12 = r23       Zd = 0.4637       NR         NR         NR
Ha: r12 ≠ r23       Zs = 0.4633       NR         NR         NR
                    Zm = 0.4629       NR         NR         NR
                    Zh = 0.4632       NR         NR         NR
                    Th = 0.4883       NR         NR         NR
                    Tw = 0.4667       NR         NR         NR
                    Tm = 0.4882       NR         NR         NR
H0: r23 = r34       Zd = −0.3882      NR         NR         NR
Ha: r23 ≠ r34       Zs = −0.3879      NR         NR         NR
                    Zm = −0.3877      NR         NR         NR
                    Zh = −0.3878      NR         NR         NR
                    Th = −0.3905      NR         NR         NR
                    Tw = −0.3893      NR         NR         NR
                    Tm = −0.3905      NR         NR         NR
Table 5.23: Results for nausea
Hypothesis tests    Test statistic    α = 0.01   α = 0.05   α = 0.10
H0: r12 = r23       Zd = −1.3347      NR         NR         NR
Ha: r12 ≠ r23       Zs = −1.3296      NR         NR         NR
                    Zm = −1.3319      NR         NR         NR
                    Zh = −1.3314      NR         NR         NR
                    Th = −1.7988      NR         NR         R
                    Tw = −1.3723      NR         NR         NR
                    Tm = −1.7955      NR         NR         R
H0: r23 = r34       Zd = 1.2459       NR         NR         NR
Ha: r23 ≠ r34       Zs = 1.2384       NR         NR         NR
                    Zm = 1.2327       NR         NR         NR
                    Zh = 1.2318       NR         NR         NR
                    Th = 1.2969       NR         NR         NR
                    Tw = 1.2708       NR         NR         NR
                    Tm = 1.2969       NR         NR         NR
Table 5.24: Results for joint pain/muscle cramps
Hypothesis tests    Test statistic    α = 0.01   α = 0.05   α = 0.10
H0: r12 = r23       Zd = −1.5092      NR         NR         NR
Ha: r12 ≠ r23       Zs = −1.4982      NR         NR         NR
                    Zm = −1.4877      NR         NR         NR
                    Zh = −1.4890      NR         NR         NR
                    Th = −1.6213      NR         NR         NR
                    Tw = −1.5487      NR         NR         NR
                    Tm = −1.6210      NR         NR         NR
H0: r23 = r34       Zd = 0.6805       NR         NR         NR
Ha: r23 ≠ r34       Zs = 0.6790       NR         NR         NR
                    Zm = 0.6782       NR         NR         NR
                    Zh = 0.6778       NR         NR         NR
                    Th = 0.6923       NR         NR         NR
                    Tw = 0.6851       NR         NR         NR
                    Tm = 0.6923       NR         NR         NR
From Table 5.20 and Table 5.23, we can conclude that the correlations for two of
the ordinal variables (i.e., deep pain sensation and nausea) are consistent over time.
The values of the seven test statistics lie in the non-rejection region, either
−Zcv < Ztest < Zcv or −t(19),cv < Ttest < t(19),cv, at significance levels α = 0.01, 0.05
and 0.10, except that at α = 0.10 we reject H0: r12 = r23 by using Hotelling’s t and
Williams’ modified t per Hendrickson. This indicates that the relationship between two
adjacent visit times, for both deep pain sensation and nausea, remains consistent
during chemotherapy and medical treatment in most cases. It also suggests that the
level of each of these two ordinal variables at one visit time depends on its level at
the previous visit time, and that the strength of this dependence does not change over
the course of therapy. This does not mean that the patients received an ineffective
treatment; rather, the effect of the treatment, as reflected in the levels of deep pain
sensation and nausea, stays consistent over time.
The test results for the other two ordinal variables, lack of energy and joint
pain/muscle cramps, are displayed in Table 5.22 and Table 5.24. They provide stronger
evidence of the homogeneity of the correlated correlations, because none of the test
statistics falls in the rejection region at any of the three levels. In Table 5.22
in particular, the values of the test statistics are far from the critical values. The
results for these two ordinal variables carry the same clinical and medical
interpretation as described in the previous paragraph.
However, the test of equality of correlations for pain intensity (Table 5.21) leads
to a different conclusion. We do not reject the null hypothesis r12 = r23 with any of
the seven test statistics at any of the three nominal levels, while we reject the null
hypothesis r23 = r34 with all seven tests at nominal levels α = 0.05 and α = 0.10.
At the nominal level α = 0.01, H0: r23 = r34 is rejected by Dunn and Clark’s Z,
Hotelling’s t and Williams’ modified t per Hendrickson, but not by the other four tests.
Because all the test statistics for r23 = r34 are negative, the relationship between
pain intensity at visit time t2 and pain intensity at visit time t3 is stronger than the
relationship between t1 and t2; the estimate of r34 in Table 5.19 also exceeds that of
r12. This indicates that pain intensity at visit time t3 depends much more on pain
intensity at visit time t2 than is the case for the other adjacent times. The difference
might be caused by the medical treatment, or even by the psychological impact on the
patients of the procedure, the duration and the effect of the treatment.
To sum up, the correlations between adjacent visit times for deep pain sensation (X),
lack of energy (A), nausea (B) and joint pain/muscle cramps (C) are consistent,
that is, r12 = r23 = r34. However, the correlations r23 and r34 for pain intensity
differ at the nominal levels α = 0.05 and 0.10; that is, we reject the null hypothesis
and conclude that r12 = r23 ≠ r34. It is interesting that the two pain indices, deep
pain sensation and pain intensity, lead to different conclusions. This need not cast
doubt on the method: such inconsistency and variation may exist between different
pain measures, or it might be caused by data collection and sorting. Researchers may
be interested in this difference and investigate it further.
Chapter 6
Conclusion and Future Work
6.1 Future Work
Possible future work involves modifying the test statistics by using the bootstrap
method, both in the simulation study and in real data, and testing the equality of a
set of correlated correlations simultaneously by using chi-square statistics.
6.1.1 Modification of Test Statistics by using Bootstrap Method
1. Simulation and resampling
According to the simulation study, some test statistics are relatively appropriate
but none is optimal, so the test statistics need to be modified. One idea is to use
the bootstrap method, which is based on resampling the data: samples redrawn from the
original data may be used as a substitute for the population when the population
distribution is unknown. We would use the bootstrap method in the simulation study to
modify the test statistics described in Chapter 3, so that the empirical levels are
closer to the nominal levels.
The steps for obtaining a bootstrap modification θmodified of a test statistic θ
are as follows:
1. Follow the simulation method in Chapter 4 to generate a sample U = (U1, U2, U3)
and categorize it into 5 levels, giving Xmn, where m = 1, 2, 3 and n is the sample
size, which needs to be set. The generated sample is the original sample; compute
θoriginal from it.
2. Draw a resample (bootstrap sample) of size n with replacement from the original
sample. Let k denote the number of such bootstrap samples, usually k ≥ 1000.
Compute θb,i, the estimate of θ obtained from the ith bootstrap sample. The mean
and variance of θb,i over the k bootstrap samples are
\[ E(\theta_{b,i}) = \frac{1}{k} \sum_{i=1}^{k} \theta_{b,i} \]
and
\[ \mathrm{Var}(\theta_{b,i}) = \frac{1}{k} \sum_{i=1}^{k} \bigl(\theta_{b,i} - E(\theta_{b,i})\bigr)^{2} \]
3. Obtain the bootstrap modification θmodified as
\[ \theta_{\mathrm{modified}} = \frac{\theta_{\mathrm{original}} - E(\theta_{b,i})}{\sqrt{\mathrm{Var}(\theta_{b,i})}} \]
4. Repeat steps 1, 2 and 3 10,000 times to obtain the empirical bootstrap
distribution of θmodified, then compare this distribution with the normal
distribution.
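Step 1 depends on the simulation design of Chapter 4, which is not restated here. As a rough sketch only, the code below assumes a trivariate normal U with a chosen correlation matrix and cuts each component at the standard normal quintiles to obtain 5 levels. Both choices, along with the function name and the example correlations, are assumptions for illustration rather than the design actually used in the thesis.

```python
import bisect
import math
import random

# Assumed cut points: standard normal quintiles, giving 5 roughly equally
# likely categories coded 0 to 4.
CUTS = [-0.8416, -0.2533, 0.2533, 0.8416]

def generate_ordinal_sample(n, r12, r13, r23, rng):
    """Draw n trivariate normal vectors with the given latent correlations
    and categorize every component into 5 ordinal levels (0-4)."""
    # Cholesky factor of the 3x3 correlation matrix, written out by hand.
    l21, l31 = r12, r13
    l22 = math.sqrt(1 - l21**2)
    l32 = (r23 - l31 * l21) / l22
    l33 = math.sqrt(1 - l31**2 - l32**2)
    rows = []
    for _ in range(n):
        e1, e2, e3 = (rng.gauss(0, 1) for _ in range(3))
        u = (e1, l21 * e1 + l22 * e2, l31 * e1 + l32 * e2 + l33 * e3)
        # bisect_right counts how many cut points lie at or below u_m.
        rows.append(tuple(bisect.bisect_right(CUTS, um) for um in u))
    return rows

# Hypothetical settings: n = 22 and latent correlations 0.5, 0.3, 0.5.
sample = generate_ordinal_sample(22, 0.5, 0.3, 0.5, random.Random(0))
```

Each row of `sample` then plays the role of one observation (X1, X2, X3) from which θoriginal would be computed in step 1.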
2. Bootstrap in real data
Modifying the test statistics by using the bootstrap method in real data is similar to
the procedure in the simulation study, except that the original sample is now known.
The steps for obtaining a bootstrap modification θmodified of a test statistic are
therefore simpler:
1. Compute θoriginal from the original data.
2. Draw a resample (bootstrap sample) of size n with replacement from the original
sample, and compute θb from the bootstrap sample.
3. Repeat step 2 k times, k ≥ 1000, to obtain the bootstrap estimates of the mean
and variance,
\[ E = \frac{1}{k} \sum_{i=1}^{k} \theta_{b,i} \]
\[ \mathrm{Var} = \frac{1}{k} \sum_{i=1}^{k} (\theta_{b,i} - E)^{2} \]
4. Obtain the bootstrap modification of the test statistic as
\[ \theta_{\mathrm{modified}} = \frac{\theta_{\mathrm{original}} - E}{\sqrt{\mathrm{Var}}} \]
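The four steps above can be sketched as follows. This is only an illustration under stated assumptions: Hotelling's t, in its usual textbook form, stands in for the test statistic θ; the 22 rows of ordinal scores are synthetic; resamples in which the statistic is undefined (for example a constant column) are redrawn; and all function names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # ZeroDivisionError if a column is constant

def hotelling_t(rows):
    """Hotelling's t for H0: r12 = r23 from rows of three repeated scores
    (assumed standard form; stands in for the statistic theta)."""
    x1, x2, x3 = zip(*rows)
    r12, r23, r13 = pearson(x1, x2), pearson(x2, x3), pearson(x1, x3)
    det = 1 - r12**2 - r23**2 - r13**2 + 2 * r12 * r23 * r13
    return (r12 - r23) * math.sqrt((len(rows) - 3) * (1 + r13) / (2 * det))

def bootstrap_modified(rows, statistic, k=1000, seed=0):
    """Steps 1-4: standardize theta_original by the bootstrap mean and variance."""
    rng = random.Random(seed)
    theta_original = statistic(rows)                 # step 1
    thetas = []
    while len(thetas) < k:                           # steps 2 and 3
        resample = rng.choices(rows, k=len(rows))    # size n, with replacement
        try:
            thetas.append(statistic(resample))
        except (ZeroDivisionError, ValueError):
            continue  # assumption: redraw resamples where theta is undefined
    mean = sum(thetas) / k
    var = sum((t - mean) ** 2 for t in thetas) / k   # divisor k, as in the text
    return (theta_original - mean) / math.sqrt(var)  # step 4

# Synthetic stand-in for the 22 patients: three repeated scores on levels 0-4.
demo_rng = random.Random(1)
demo = []
for _ in range(22):
    a = demo_rng.randint(0, 4)
    b = min(4, max(0, a + demo_rng.randint(-1, 1)))
    c = min(4, max(0, b + demo_rng.randint(-2, 2)))
    demo.append((a, b, c))

theta_modified = bootstrap_modified(demo, hotelling_t, k=1000, seed=0)
```

Passing the statistic as a function keeps the resampling loop independent of which of the Chapter 3 statistics is being modified.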
6.1.2 Testing the Equality of a Set of Correlated Correlations by using Chi-square Statistics
The objective of this thesis is to test the heterogeneity of the correlations of an
ordinal variable between adjacent visit times t1, t2, ..., ts. If there are more than
three visit times, there are more than two correlations between adjacent visit times,
and the null hypothesis is r1 = r2 = ... = rk, where k = s − 1 is the number of
adjacent-time correlations. We divided this null hypothesis into several parts,
H0(1): r1 = r2; H0(2): r2 = r3; ...; H0(k − 1): rk−1 = rk, because the test statistics
introduced in Chapter 3 can only test the equality of two correlations. However, Meng,
Rosenthal and Rubin proposed a χ2 test of the heterogeneity of a set of correlations,
obtained as a simple extension of their Z test statistic, which is called Meng’s Z
in Chapter 3. The hypotheses are then
H0: r1 = r2 = ... = ri = ... = rk, i = 1, 2, ..., k, k = s − 1
Ha: at least one pair ri ≠ ri+1.
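The χ2 test statistic is
\[ \chi^{2} = \frac{(N-3)\sum_{i=1}^{k}\left(Z_{r_i} - \bar{Z}_{r}\right)^{2}}{(1 - r_x)\,h}, \quad \text{with } df = k - 1, \]
where
\[ h = \frac{1 - f\,\overline{r^{2}}}{1 - \overline{r^{2}}} = 1 + \frac{\overline{r^{2}}}{1 - \overline{r^{2}}}\,(1 - f), \]
\[ f = \frac{1 - r_x}{2\,(1 - \overline{r^{2}})}, \quad \text{which must be} \le 1, \]
\[ \overline{r^{2}} = \frac{1}{k}\sum_{i=1}^{k} r_i^{2}. \]
In this equation, $Z_{r_i} = \frac{1}{2}\ln\frac{1+r_i}{1-r_i}$ is the Fisher r-to-Z transform,
$\bar{Z}_{r}$ is the mean of the $Z_{r_i}$, and $r_x$ is the median intercorrelation among
the variables being tested for heterogeneity. The χ2 test statistic follows a chi-square
distribution with k − 1 degrees of freedom, where k is the number of correlations to
be compared.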
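A minimal sketch of this χ2 statistic is given below, mainly to make the roles of f, h and rx concrete. The function name is hypothetical, capping f at 1 is one assumed way of enforcing the condition f ≤ 1, and which correlation to use as rx for adjacent visit times is also an assumption (here r13, as in the pairwise comparison). For k = 2 the statistic reduces algebraically to the square of the pairwise Z with the same h, so it can be checked against Zm = 1.5992 from Table 5.20.

```python
import math

def meng_chi2(rs, rx, n):
    """Heterogeneity chi-square for k correlated correlations.

    rs: the k correlations being compared
    rx: median intercorrelation among the variables being compared
    n:  sample size
    Returns (chi2, df) with df = k - 1.
    """
    k = len(rs)
    z = [math.atanh(r) for r in rs]      # Fisher r-to-Z: 0.5 * ln((1 + r) / (1 - r))
    z_bar = sum(z) / k
    r2_bar = sum(r * r for r in rs) / k  # mean of the squared correlations
    f = min((1 - rx) / (2 * (1 - r2_bar)), 1.0)  # assumption: cap f at 1
    h = (1 - f * r2_bar) / (1 - r2_bar)
    chi2 = (n - 3) * sum((zi - z_bar) ** 2 for zi in z) / ((1 - rx) * h)
    return chi2, k - 1

# Deep pain sensation, r12 vs r23 with r13 as the intercorrelation (Table 5.19).
chi2, df = meng_chi2([0.7460, 0.4437], rx=0.3044, n=22)
z_equiv = math.sqrt(chi2)  # for k = 2 this should be close to |Zm|
```

With more than two correlations the same call applies, for example `meng_chi2([r12, r23, r34], rx, n)`, and the result would be compared with a chi-square critical value on k − 1 degrees of freedom.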
6.2 Conclusion
In this thesis, we first introduced ordinal data, including its measurement,
classification, distinguishing features and advantages. We then reviewed several
models for ordinal responses, how to apply them and their restrictions in different
situations. In Chapter 3, we focused on test statistics for comparing dependent
correlations in contingency tables. Before that, we needed to arrange an ordinal
variable by time in contingency tables and compute the correlation coefficients from
scores. In some cases, we also needed to reorganize the contingency tables because of
sparse tables and observations dropping out over time. Next, we presented eight test
statistics for testing the equality of two or more dependent correlations in a common
sample. In addition, we proposed a modification of the test statistics in the last part
of that chapter, but some issues remain in applying it to real data.
In Chapter 4, we evaluated the eight test statistics by simulation. In terms of
empirical level and empirical power, the simulation results indicated that the choice
of the optimal test statistic depends not only on the sample size but also on the
magnitude of the correlations. After summarizing the simulation results and considering
the conditions of the real medical data, we chose seven test statistics, Dunn and
Clark’s Z, Steiger’s Z, Meng’s Z, Hittner’s Z, Hotelling’s t, Williams’ t and Williams’
modified t per Hendrickson, to apply to the real medical data. We paid particular
attention to Steiger’s Z, Meng’s Z, Hittner’s Z and Williams’ t, which are relatively
optimal because of the small sample in the medical data. In the last part of Chapter 5,
we summarized the results of the application to real data and analyzed the clinical
significance of the study. Unfortunately, the modification of the test statistics
presented in Chapter 3 did not work well with real data. We therefore proposed in
Chapter 6 a modification based on the bootstrap method, for both the simulation
evaluation and real data. In addition, we considered testing the equality of a set of
correlated correlations by using chi-square statistics, and we suggest it as a direction
for researchers working on the same problem.
Bibliography
[1] Agresti, A. (2007). An Introduction to Categorical Data Analysis. New York:
John Wiley and Sons.
[2] Agresti, A. (2010). Analysis of Ordinal Categorical Data (2nd ed). New York:
John Wiley and Sons.
[3] Ananth, C. V., and Kleinbaum, D. G. (1997). Regression models for ordinal
responses: a review of methods and applications. International Journal of
Epidemiology, 26, 1323-1333.
[4] Bender, R. and Grouven, U. (1998). Using binary logistic regression models for
ordinal data with non-proportional odds. Journal of Clinical Epidemiology, 51,
809-816.
[5] Dunn, O. J. and Clark, V. A. (1969). Correlation coefficients measured on the
same individuals. Journal of the American Statistical Association, 64, 366-377.
[6] Dunn, O. J. and Clark, V. A. (1971). Comparison of tests of the equality of
dependent correlation coefficients. Journal of the American Statistical Association,
66, 904-908.
[7] Efron, B. and Tibshirani, R. J. (1994). An Introduction to the Bootstrap.
Chapman and Hall/CRC.
[8] Fahrmeir, L. and Tutz, G. (2001). Multivariate Statistical Modelling Based on
Generalized Linear Models (2nd ed). New York: Springer.
[9] Fisher, R. A. (1921). On the probable error of a coefficient of correlation deduced
from a small sample. Metron, 1, 1-32.
[10] Gautam, S. and Kimeldorf, G. (1999). Some results on the maximal correlation
in 2 × k contingency tables. The American Statistician, 53, 336-341.
[11] Hendrickson, G. F., Stanley, J. C., and Hills, J. R. (1970). Olkin’s new formula
for significance of r13 vs. r23 compared with Hotelling’s method. American
Educational Research Journal, 7, 189-195.
[12] Higgins, J. J. (2004). Introduction to Modern Nonparametric Statistics.
California: Thomson Learning.
[13] Hittner, J. B., May, K., and Silver, N. C. (2003). A Monte Carlo evaluation of
tests for comparing dependent correlations. The Journal of General Psychology,
130, 149-168.
[14] Hotelling, H. (1940). The selection of variates for use in prediction, with some
comments on the general problem of nuisance parameters. Annals of Mathematical
Statistics, 11, 271-283.
[15] Kampen, J. and Swyngedouw, M. (2000). The Ordinal Controversy Revisited.
Quality and Quantity, 34, 87-102.
[16] McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models (2nd ed).
London: Chapman and Hall.
[17] Meng, X. L., Rosenthal, R., and Rubin, D. B. (1992). Comparing correlated
correlation coefficients. Psychological Bulletin, 111, 172-175.
[18] Neill, J. J., and Dunn, O. J. (1975). Equality of dependent correlation coefficients.
Biometrics, 31, 531-543.
[19] Olkin, I. (1967). Correlations revisited. In J. C. Stanley (Ed.), Improving
experimental design and statistical analysis (pp. 102-128). Chicago, IL: Rand McNally.
[20] Pearson, K., and Filon, L. N. G. (1898). Mathematical contributions to the theory
of evolution: IV. On the probable errors of frequency constants and on the
influence of random selection on variation and correlation. Philosophical
Transactions of the Royal Society of London, Series A, 191, 229-311.
[21] Peterson, B. and Harrell, F. E. (1990). Partial proportional odds models for
ordinal response variables. Journal of the Royal Statistical Society. Series C, 39,
205-217.
[22] Piepho, H. and Kalka, E. (2003). Threshold models with fixed and random effects
for ordered categorical data. Food Quality and Preference, 14, 343-357.
[23] Silver, N. C. and Dunlap, W. P. (1987). Averaging correlation coefficients: Should
Fisher’s Z transformation be used? Journal of Applied Psychology, 72, 146-148.
[24] Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix.
Psychological Bulletin, 87, 245-251.
[25] Stevens, S. S. (1951). Mathematics, measurement, and psychophysics. New York:
Wiley.
[26] Torra, V., Domingo-Ferrer, J., Mateo-Sanz, J. M. and Ng, M. (2006). Regression
for ordinal variables without underlying continuous variables. Information
Sciences, 176, 465-474.
[27] Tutz, G. (1991). Sequential models in categorical regression. Computational
Statistics & Data Analysis, 11, 275-295.
[28] Vogt, W. P. (1993). Dictionary of Statistics and Methodology. London: Sage.
[29] Walker, S. H. and Duncan, D. B. (1967). Estimation of the probability of an
event as a function of several independent variables. Biometrika, 54, 167-179.
[30] Williams, E. J. (1959). The comparison of regression variables. Journal of the
Royal Statistical Society, Series B, 21, 396-399.