ESTIMATION OF CORRELATIONS BETWEEN
TRUNCATED CONTINUOUS AND POLYTOMOUS VARIABLES
by
Wai-chung LUI
A Thesis
submitted to
(Division of Statistics)
the Graduate School
The Chinese University of Hong Kong
in Partial Fulfillment
of the Requirements for the Degree of
Master of Philosophy
(M. Phil.)
June, 1994
THE CHINESE UNIVERSITY OF HONG KONG
GRADUATE SCHOOL
The undersigned certify that we have read a thesis, entitled
"Estimation of Correlations between Truncated Continuous and Polytomous
Variables" submitted to the Graduate School by Lui Wai Chung
in partial fulfillment of the requirements for the degree of Master of
Philosophy in Statistics. We recommend that it be accepted.
Dr. S. Y. Lee
Supervisor
Dr. W. Y. Poon
Supervisor
Dr. K. H. Li
Dr. S. Y. Cheung
Prof. P. C. Chang
External Examiner
DECLARATION
No portion of the work referred to in this thesis has been
submitted in support of an application for another degree or
qualification of this or any other university or other institute of
learning.
ACKNOWLEDGEMENT
I would like to express my deep gratitude to my supervisors, Dr.
Sik-yum Lee and Dr. Wai-yin Poon, for their supervision and
encouragement during the course of this research program. It is also a
pleasure to extend my gratitude to all the staff of the Department of
Statistics, especially to Mr. Michael Leung and Dr. P. S. Chan for their
kind assistance.
ABSTRACT
In this thesis, a method of estimating correlations in a
model with truncated continuous and polytomous variables is developed.
First, the maximum likelihood method is used for estimation with one
truncated continuous variable and one polytomous variable. The model is
then extended to several polytomous variables. To avoid the heavy
computation required to obtain the maximum likelihood estimates, the
Partition Maximum Likelihood method is proposed. The asymptotic
properties of the estimates are also studied. Finally, the
computational aspects are described and a simulation study is conducted
to investigate the performance of the estimates.
CONTENTS
                                                                Page
Chapter 1 Introduction                                             1
Chapter 2 Estimation of the model with one truncated
          continuous variable and one polytomous variable          6
    § 2.1 The model
    § 2.2 Likelihood function of the model
    § 2.3 Derivatives of F(θ)
    § 2.4 Asymptotic properties of the model
Chapter 3 Estimation of the model with one truncated
          continuous variable and several polytomous variables    22
    § 3.1 The model
    § 3.2 Partition Maximum Likelihood (PML) estimation
    § 3.3 Asymptotic properties of the PML estimates
Chapter 4 Optimization procedures and Simulation study            43
    § 4.1 Optimization procedures
    § 4.2 Simulation study
Chapter 5 Summary and Conclusion                                  54
Tables                                                            56
References                                                        76
Chapter 1. Introduction
Most of the current statistical literature on sampling
concerns unrestricted samples. In most real-life situations, however,
researchers are likely to find that their samples are 'truncated'.
Suppose we have a random sample of size n. An observation is recorded
only if its value is greater and/or smaller than a pre-assigned value,
and it is missing otherwise. Some authors use the term 'truncated' to
describe this kind of data, while others use the term 'censored'
instead. In this thesis, we use the term 'truncated'. An example can be
found in a life test, where the experimenter decides to stop the
experiment before all of the units on test have failed. Truncation is a
non-ignorable missing-data pattern, so it must be treated carefully.
Moreover, truncation is a common topic, since applications
can be found in quality control, life testing, biometrics, economics,
business, agronomy, manufacturing, engineering, medical and biological
sciences, management sciences, social sciences, and most areas of the
physical sciences.
Truncated samples with unknown sample size were first
encountered quite early in the development of modern statistics by Sir
Francis Galton (1897). His objective was to test the suitability of
trotting records, provided by the Wallace Year Book, Vols 8-12
(1892-1896), a publication of the American Trotting Association.
Afterwards, Karl Pearson (1902), Pearson and Lee (1908) and R. A. Fisher
(1931, 1936) gave more theoretical analyses of truncated samples
with unknown sample size. Later on, Stevens (1937) considered the
estimation of the mean and standard deviation of a normal distribution
from a truncated sample with known sample size. He applied the results
to the truncated time-mortality curve. Cohen (1950) used the method of
maximum likelihood to estimate the parameters of normal populations from
singly and doubly truncated samples with known sample size. In four later
papers, Cohen (1955, 1957, 1959, 1961) extended the results given in his
1950 paper. More examples can be found in Hald (1949), Gupta (1952),
Harter and Moore (1966), Schneider (1986) and Cohen (1991).
When continuous latent variables are only observable in
categorical form, the observed variables are called polytomous
variables. In many applications, particularly in the behavioral and
social sciences, investigators frequently encounter dichotomous or
polytomous data. For instance, in behavioral studies, a subject is often
asked to answer a question on a scale such as
approve strongly, approve, don't know, disapprove, disapprove strongly.
This is an example in which a continuous variable underlies a polytomous
observed variable. When analyzing such a variable, some statisticians
assign an integer value to each category and proceed with the analysis
as if the data had been measured on an interval scale with the desired
distribution. Many statistical methods seem to be robust against such
deviations from the distributional assumption; however, this practice
may sometimes lead to erroneous results. Olsson (1979a) showed that,
because the estimates of correlation are biased, the application of
factor analysis to this kind of discrete data leads to erroneous
conclusions. Hence, as expected, applications of principal component
analysis, multiple correlation, canonical correlation analysis and
structural equation models may also lead to incorrect results, because
these statistical methods also depend largely on the estimation of
correlations. It is therefore necessary to develop a method for
estimating the "true" polychoric correlation coefficients, which are
more reliable.
Assuming the normality of the underlying distribution, Pearson
(1901) introduced the tetrachoric correlation coefficient to estimate
the true correlation from a 2x2 contingency table. Lancaster and Hamdan
(1964) extended it to the polychoric case. Martinson and Hamdan (1971)
developed a two-step maximum likelihood method to estimate the
polychoric correlation coefficients. In their method, the thresholds
are first estimated from the cumulative marginal proportions, and the
polychoric correlation is then estimated with the thresholds fixed at
their estimates. Olsson (1979b) developed the full maximum likelihood
approach to estimate the correlation and thresholds. Lee and Poon
(1986) extended the model to the p-dimensional contingency table and
used generalized least squares estimation. Statistical methods for
analyzing polytomous data based on different assumptions have also been
developed. Examples are Lee, Poon & Bentler (1990, 1992), Poon and Lee
(1992), and Poon and Leung (1993). The analysis of polytomous data with
missing data was considered by Lee & Chiu (1990). Furthermore, Lee & Tang
(1992) studied the estimation of polychoric and polyserial correlations
with incomplete data.
The main purpose of this thesis is to develop a method to
estimate the parameters in a model with a truncated continuous variable
as well as polytomous variables. These parameters include the
correlations among the variables and the thresholds of the polytomous
variables. In Chapter 2, we treat the model with one truncated
continuous and one polytomous variable, and the method of maximum
likelihood is proposed. Asymptotic properties in this model are also
given. In Chapter 3, an extended model with one truncated continuous
and several polytomous variables is considered. To avoid the heavy
computation required to evaluate multivariate distribution
functions, Partition Maximum Likelihood (PML) estimation is used
(see Poon and Lee 1987). The idea is to divide the r-dimensional model
into r(r-1)/2 sub-models to obtain Partition Maximum Likelihood
estimates. Statistical properties of the Partition Maximum Likelihood
estimates are also established. Chapter 4 describes the computational
aspects of the estimators. To compute the estimates, the
modified Davidon-Fletcher-Powell (DFP) algorithm and the Fisher
scoring algorithm, both iterative optimization methods, are used.
The latter part of Chapter 4 describes a simulation to study the
performance of the estimates.
Chapter 2. Estimation of the model with one truncated continuous
variable and one polytomous variable.
2.1 The model
Let X, Y be random variables. Assume that $(X, Y)'$ is distributed
as bivariate normal with mean vector $\mu$ and correlation matrix
$P = (\rho_{ij})$, $i, j = 1, 2$. Since our main concern is to estimate
the correlation in the model, without loss of generality and for
simplicity, $\mu$ is assumed to be a zero vector in what follows. Also,
note that $\rho_{11} = \rho_{22} = 1$, and let $\rho_{12} = \rho$, which
is the parameter of interest.
Suppose that the random variable X is continuous and is right
truncated at a known truncation point, say c. That is, we only observe
those X-values which are less than or equal to c, and miss those
X-values which are greater than c.
Moreover, suppose that the random variable Y cannot be
observed directly. We can only observe it through a polytomous random
variable Z, which is defined as
\[
Z = k \quad \text{if } \alpha_k \le Y < \alpha_{k+1}, \tag{2.1}
\]
for $k = 1, \ldots, h$, with $\alpha_1 = -\infty$ and $\alpha_{h+1} = +\infty$.
We call $\{\alpha_1, \alpha_2, \ldots, \alpha_{h+1}\}$ the thresholds of Z.
Note that these thresholds are also unknown, except $\alpha_1$ and $\alpha_{h+1}$.
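As a small illustration of definition (2.1), the sketch below maps a latent value y to its category k. The function name `polytomize` and the cut points are hypothetical and chosen only for illustration; only the finite thresholds are stored, since the first and last thresholds are fixed at minus and plus infinity.

```python
from bisect import bisect_right

def polytomize(y, inner_thresholds):
    """Return the category k in 1..h for a latent value y.

    inner_thresholds holds the finite cut points alpha_2 < ... < alpha_h;
    alpha_1 = -inf and alpha_{h+1} = +inf are implicit.
    """
    # bisect_right counts the cut points <= y, so alpha_k <= y < alpha_{k+1}
    # is mapped to k (k = 1 when y lies below every finite cut point).
    return bisect_right(inner_thresholds, y) + 1

cuts = [-0.5, 0.5]  # hypothetical alpha_2, alpha_3, so h = 3
labels = [polytomize(y, cuts) for y in (-1.2, -0.5, 0.0, 0.5, 2.0)]
```

A value exactly equal to a threshold falls in the upper category, which matches the half-open intervals used in (2.1).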
Let $\theta$ be the parameter vector in this model. The
dimension of $\theta$ is h, and it is given by
\[
\theta = (\rho, \alpha_2, \ldots, \alpha_h)'. \tag{2.2}
\]
Now, suppose we have a random sample from (X, Z) with
sample size n. Also assume that m of these n vectors have an observed
X-value. Without loss of generality, let the last (n-m) X's,
$X_{m+1}, \ldots, X_n$, be truncated, and denote $X_{mis} = (X_{m+1}, \ldots, X_n)'$.
Also let $Z_{mis} = (Z_{m+1}, \ldots, Z_n)'$ be the corresponding observed
polytomous data. Similarly, denote $X_{obs} = (X_1, X_2, \ldots, X_m)'$ and
$Z_{obs} = (Z_1, \ldots, Z_m)'$.
Note that the numbers of observed and missing X-values, m and n-m
respectively, are known after the sample is drawn. Moreover, $X_{obs}$,
$Z_{obs}$ and $Z_{mis}$ are the observations which are available, while
$X_{mis}$ contains the truncated X-values, which are greater than c. Denote
$X = (X_{obs}', X_{mis}')'$ and $Z = (Z_{obs}', Z_{mis}')'$.
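The sampling scheme described above can be sketched as follows. This is only an illustrative simulation under the stated model (standard bivariate normal, right truncation of X at c); the function name `simulate_sample` and all numerical settings are assumptions, and the observations are not reordered so that the truncated ones come last.

```python
import random

def simulate_sample(n, rho, cuts, c, seed=0):
    """Draw n pairs (X, Z); X is replaced by None when X > c (truncated)."""
    rng = random.Random(seed)
    s = (1.0 - rho * rho) ** 0.5
    sample = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Y | X ~ N(rho * X, 1 - rho^2), so (X, Y) is standard bivariate normal.
        y = rho * x + s * rng.gauss(0.0, 1.0)
        # Category k satisfies alpha_k <= y < alpha_{k+1}.
        z = 1 + sum(1 for a in cuts if a <= y)
        sample.append((x if x <= c else None, z))
    return sample

# Hypothetical settings: rho = 0.5, h = 3 categories, truncation point c = 1.
sample = simulate_sample(n=1000, rho=0.5, cuts=[-0.5, 0.5], c=1.0)
observed = [(x, z) for x, z in sample if x is not None]   # (X_obs, Z_obs)
missing_z = [z for x, z in sample if x is None]           # Z_mis
m = len(observed)
```

Here `m` plays the role of the number of observed X-values and `len(missing_z)` the role of n-m.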
2.2 Likelihood function of the model
In this section, we derive the likelihood function of
the model. Let $L(\theta \mid X, Z)$ be the likelihood function. It
can be expressed as follows:
\[
\begin{aligned}
L(\theta \mid X, Z)
&\propto f(X_{obs}, X_{mis}, Z_{obs}, Z_{mis} \mid \theta) \\
&= f(X_{obs}, Z_{obs} \mid \theta) \cdot f(X_{mis}, Z_{mis} \mid \theta) \\
&= f(Z_{obs} \mid X_{obs}, \theta) \cdot f(X_{obs} \mid \theta) \cdot f(X_{mis}, Z_{mis} \mid \theta).
\end{aligned} \tag{2.3}
\]
We have decomposed the likelihood function in (2.3) into three
parts. We discuss these three parts separately below.
Part I.
Consider the first term $f(Z_{obs} \mid X_{obs}, \theta)$ in the likelihood
function. Let $n_k$ be the number of observations in $Z_{obs} = (Z_1, \ldots, Z_m)'$
that are equal to k (k = 1, ..., h), so that $n_1 + n_2 + \cdots + n_h = m$.
Also let $X_k^{(i)}$ denote the i-th element among the $n_k$ observations such that
the corresponding polytomous variable Z is in the k-th category (that is,
Z = k), and let $Z_k^{(i)}$ denote that polytomous variable. Then, by the
independence of the observations,
\[
\begin{aligned}
f(Z_{obs} \mid X_{obs}, \theta)
&= f(Z_1, \ldots, Z_m \mid X_1, \ldots, X_m, \theta) \\
&= \prod_{j=1}^{m} f(Z_j \mid X_j, \theta) \\
&= \prod_{k=1}^{h} \prod_{i=1}^{n_k} \Pr\bigl(Z_k^{(i)} = k \mid X_k^{(i)}\bigr).
\end{aligned} \tag{2.4}
\]
By our assumption, $(X, Y)'$ has a bivariate normal
distribution. By standard properties of the normal distribution, the
conditional distribution of Y given X, say $Y \mid X$, is given by
$Y \mid X \sim N(\rho X, 1 - \rho^2)$, where '$\sim$' denotes 'is distributed as'.
Therefore, for any $i = 1, \ldots, n_k$ and $k = 1, \ldots, h$,
\[
\begin{aligned}
\Pr\bigl(Z_k^{(i)} = k \mid X_k^{(i)}\bigr)
&= \Pr\bigl(\alpha_k \le Y < \alpha_{k+1} \mid X_k^{(i)}\bigr) \\
&= \Phi\!\left(\frac{\alpha_{k+1} - \rho X_k^{(i)}}{(1-\rho^2)^{1/2}}\right)
 - \Phi\!\left(\frac{\alpha_k - \rho X_k^{(i)}}{(1-\rho^2)^{1/2}}\right) \\
&= \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \\
&= \begin{cases}
\Phi(\alpha^*_{2,i}) & \text{if } k = 1, \\
\Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) & \text{if } 1 < k < h, \\
1 - \Phi(\alpha^*_{h,i}) & \text{if } k = h,
\end{cases}
\end{aligned} \tag{2.5}
\]
where
\[
\alpha^*_{k,i} = \frac{\alpha_k - \rho X_k^{(i)}}{(1-\rho^2)^{1/2}}
\quad \text{and} \quad
\Phi(x) = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{x} \exp\!\left(-\frac{t^2}{2}\right) dt,
\]
which is the distribution function of the standard normal N(0,1). Also
note that $\Phi(\alpha^*_{1,i}) = 0$ and $\Phi(\alpha^*_{h+1,i}) = 1$. So, we get
\[
f(Z_{obs} \mid X_{obs}, \theta)
= \prod_{k=1}^{h} \prod_{i=1}^{n_k} \bigl[ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \bigr]. \tag{2.6}
\]
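The conditional category probabilities entering (2.5) and (2.6) can be evaluated with nothing more than the error function. The following is a minimal sketch; the names `Phi` and `category_probs` and the numerical inputs are illustrative assumptions, not part of the thesis.

```python
import math

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def category_probs(x, rho, cuts):
    """Pr(Z = k | X = x) for k = 1..h, given the finite cut points alpha_2..alpha_h."""
    s = math.sqrt(1.0 - rho * rho)
    # Phi(alpha*_1) = 0 and Phi(alpha*_{h+1}) = 1 bracket the finite terms.
    cdf = [0.0] + [Phi((a - rho * x) / s) for a in cuts] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]

# Hypothetical values: x = 0.3, rho = 0.5, thresholds -0.5 and 0.5 (h = 3).
probs = category_probs(x=0.3, rho=0.5, cuts=[-0.5, 0.5])
```

The product in (2.6) is then the product of the appropriate entry of `category_probs` over all observations with an observed X-value.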
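Part II.
For the second term $f(X_{obs} \mid \theta)$, by the independence of the
observations, it can be seen that
\[
\begin{aligned}
f(X_{obs} \mid \theta)
&= \prod_{i=1}^{m} f(x_i \mid \theta) \\
&= \prod_{i=1}^{m} (2\pi)^{-1/2} \exp\!\left\{ -\tfrac{1}{2} X_i^2 \right\} \\
&= (2\pi)^{-m/2} \exp\!\left\{ -\tfrac{1}{2} \sum_{i=1}^{m} X_i^2 \right\}.
\end{aligned} \tag{2.7}
\]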
Part III.
For the third term $f(X_{mis}, Z_{mis} \mid \theta)$, let $n^*_k$ be the number
of observations in $Z_{mis} = (Z_{m+1}, \ldots, Z_n)'$ that are equal to k
(k = 1, ..., h), so that $n^*_1 + n^*_2 + \cdots + n^*_h = n - m$.
Consider that for any $i = 1, \ldots, n^*_k$ and for any $k = 1, \ldots, h$,
\[
\begin{aligned}
\Pr\{ Z_i = k \text{ and } X_i > c \}
&= \Pr\{ \alpha_k \le Y_i < \alpha_{k+1} \text{ and } X_i > c \} \\
&= \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y; \rho) \, dx \, dy,
\end{aligned} \tag{2.8}
\]
where
\[
\phi_2(x, y; \rho) = \frac{1}{2\pi (1-\rho^2)^{1/2}}
\exp\!\left\{ -\frac{x^2 - 2\rho x y + y^2}{2(1-\rho^2)} \right\}
\]
denotes the bivariate normal density function. For simplicity, we
denote it by $\phi_2(x, y)$ in what follows. Therefore, we have
\[
\begin{aligned}
f(X_{mis}, Z_{mis} \mid \theta)
&= \prod_{j=m+1}^{n} f(x_j, z_j \mid \theta) \\
&= \prod_{k=1}^{h} \prod_{i=1}^{n^*_k} \Pr\{ Z_i = k \text{ and } X_i > c \} \\
&= \prod_{k=1}^{h} \prod_{i=1}^{n^*_k} \left[ \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy \right] \\
&= \prod_{k=1}^{h} \left[ \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy \right]^{n^*_k}.
\end{aligned} \tag{2.9}
\]
Combining the results in (2.6), (2.7) and (2.9), we finally
obtain the likelihood function of this model, which is given by
\[
\begin{aligned}
L(\theta \mid X, Z)
&= L(\rho, \alpha_2, \ldots, \alpha_h \mid X, Z) \\
&\propto \prod_{k=1}^{h} \prod_{i=1}^{n_k} \bigl[ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \bigr]
\times (2\pi)^{-m/2} \exp\!\left\{ -\tfrac{1}{2} \sum_{i=1}^{m} X_i^2 \right\} \\
&\qquad \times \prod_{k=1}^{h} \left[ \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy \right]^{n^*_k}.
\end{aligned} \tag{2.10}
\]
In order to find the maximum likelihood estimate, we
maximize the likelihood function $L(\theta \mid X, Z)$ expressed in (2.10).
Equivalently, ignoring terms that do not depend on $\theta$, we
minimize the negative log-likelihood function $F(\theta)$, which
equals $-\log L(\theta)$ up to an additive constant:
\[
F(\theta) = - \sum_{k=1}^{h} \sum_{i=1}^{n_k} \log \bigl[ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \bigr]
- \sum_{k=1}^{h} n^*_k \log \left[ \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy \right], \tag{2.11}
\]
where log represents the natural logarithm throughout.
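To express the second term further, notice that
\[
\int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy
= \Phi_2(+\infty, \alpha_{k+1}; \rho) - \Phi_2(+\infty, \alpha_k; \rho)
- \Phi_2(c, \alpha_{k+1}; \rho) + \Phi_2(c, \alpha_k; \rho), \tag{2.12}
\]
where
\[
\Phi_2(x, y; \rho) = \frac{1}{2\pi (1-\rho^2)^{1/2}}
\int_{-\infty}^{x} \int_{-\infty}^{y}
\exp\!\left\{ -\frac{u^2 - 2\rho u v + v^2}{2(1-\rho^2)} \right\} dv \, du
\]
denotes the distribution function of the bivariate normal. For simplicity,
we denote it by $\Phi_2(x, y)$ in what follows. So, $F(\theta)$ can be expressed as
\[
F(\theta) = - \sum_{k=1}^{h} \sum_{i=1}^{n_k} \log \bigl[ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \bigr]
- \sum_{k=1}^{h} n^*_k \log \bigl[ \Phi_2(+\infty, \alpha_{k+1}) - \Phi_2(+\infty, \alpha_k)
- \Phi_2(c, \alpha_{k+1}) + \Phi_2(c, \alpha_k) \bigr]. \tag{2.13}
\]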
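To make the objective function concrete, the sketch below evaluates the negative log-likelihood on simulated data. It is not the thesis's implementation: instead of evaluating the bivariate distribution function, the probability of a category together with X > c is obtained by integrating the standard normal density times the conditional category probability over x with Simpson's rule, which gives the same quantity as (2.12). All names (`cond_probs`, `truncated_probs`, `neg_log_lik`) and settings are assumptions.

```python
import math
import random

def Phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def cond_probs(x, rho, cuts):
    """Pr(Z = k | X = x), k = 1..h, as in (2.5)."""
    s = math.sqrt(1.0 - rho * rho)
    cdf = [0.0] + [Phi((a - rho * x) / s) for a in cuts] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]

def truncated_probs(rho, cuts, c, steps=400):
    """Pr(Z = k and X > c), k = 1..h, by Simpson's rule over x in [c, hi]."""
    hi = max(c, 0.0) + 9.0            # the normal tail beyond hi is negligible
    width = (hi - c) / steps          # steps must be even for Simpson's rule
    totals = [0.0] * (len(cuts) + 1)
    for j in range(steps + 1):
        x = c + j * width
        w = 1 if j in (0, steps) else (4 if j % 2 else 2)
        for k, p in enumerate(cond_probs(x, rho, cuts)):
            totals[k] += w * phi(x) * p
    return [t * width / 3.0 for t in totals]

def neg_log_lik(rho, cuts, c, observed, missing_z):
    """Negative log-likelihood of the form (2.13), constants dropped."""
    tail = truncated_probs(rho, cuts, c)
    f = 0.0
    for x, z in observed:             # observations with X <= c
        f -= math.log(cond_probs(x, rho, cuts)[z - 1])
    for z in missing_z:               # observations with X truncated
        f -= math.log(tail[z - 1])
    return f

# Simulated data under hypothetical true values rho = 0.5, c = 1.
rng = random.Random(1)
true_rho, cuts, c = 0.5, [-0.5, 0.5], 1.0
observed, missing_z = [], []
for _ in range(2000):
    x = rng.gauss(0.0, 1.0)
    y = true_rho * x + math.sqrt(1.0 - true_rho ** 2) * rng.gauss(0.0, 1.0)
    z = 1 + sum(1 for a in cuts if a <= y)
    if x <= c:
        observed.append((x, z))
    else:
        missing_z.append(z)

f_true = neg_log_lik(true_rho, cuts, c, observed, missing_z)
f_wrong = neg_log_lik(-true_rho, cuts, c, observed, missing_z)
```

On data generated this way the objective at the true correlation should be smaller than at a correlation of the wrong sign, which is the behaviour a minimizer relies on.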
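2.3 Derivatives of F(θ).
To find the maximum likelihood estimate $\hat\theta$ of $\theta$, we are required to
minimize the negative log-likelihood function $F(\theta)$ in (2.13), which was
derived in the previous section. Since the optimum solution
cannot be obtained algebraically in closed form, the modified
Davidon-Fletcher-Powell (DFP) algorithm, which will be discussed in
Chapter 4, is used. Because this algorithm needs the first partial
derivatives of $F(\theta)$, we compute them as follows.
In order to find $\partial F(\rho, \alpha_2, \ldots, \alpha_h) / \partial \rho$, we first calculate
$\partial \Phi(\alpha^*_{k,i}) / \partial \rho$. By Johnson and Kotz (1972),
\[
\frac{\partial \Phi_2(u, v; \rho)}{\partial \rho} = \phi_2(u, v; \rho).
\]
Moreover, we can see that
\[
\begin{aligned}
\frac{\partial \Phi(\alpha^*_{k,i})}{\partial \rho}
&= \phi(\alpha^*_{k,i}) \cdot \frac{\partial \alpha^*_{k,i}}{\partial \rho} \\
&= \phi(\alpha^*_{k,i}) \cdot \frac{\partial}{\partial \rho}
\left( \frac{\alpha_k - \rho X_k^{(i)}}{(1-\rho^2)^{1/2}} \right) \\
&= \phi(\alpha^*_{k,i}) \cdot
\frac{ -X_k^{(i)} (1-\rho^2)^{1/2} + (\alpha_k - \rho X_k^{(i)}) \, \rho \, (1-\rho^2)^{-1/2} }{ 1-\rho^2 } \\
&= \phi(\alpha^*_{k,i}) \cdot \frac{\rho \alpha_k - X_k^{(i)}}{(1-\rho^2)^{3/2}},
\end{aligned} \tag{2.14}
\]
where $\phi$ denotes the standard normal density function.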
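The closed form in (2.14) for the derivative of the standardized threshold with respect to the correlation can be checked numerically. The sketch below compares it with a central finite difference at arbitrary, hypothetical values; the function names are illustrative only.

```python
import math

def alpha_star(alpha, rho, x):
    """Standardized threshold (alpha - rho * x) / (1 - rho^2)^(1/2)."""
    return (alpha - rho * x) / math.sqrt(1.0 - rho * rho)

def d_alpha_star_d_rho(alpha, rho, x):
    """Closed form from (2.14): (rho * alpha - x) / (1 - rho^2)^(3/2)."""
    return (rho * alpha - x) / (1.0 - rho * rho) ** 1.5

# Hypothetical evaluation point and step size for the finite difference.
alpha, rho, x, eps = 0.7, 0.4, -1.3, 1e-6
numeric = (alpha_star(alpha, rho + eps, x) - alpha_star(alpha, rho - eps, x)) / (2 * eps)
analytic = d_alpha_star_d_rho(alpha, rho, x)
```

A check of this kind is a cheap safeguard before supplying analytic gradients to an iterative optimizer.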
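According to (2.12) and (2.13), $\partial F(\rho, \alpha_2, \ldots, \alpha_h) / \partial \rho$ is
given by
\[
\begin{aligned}
\frac{\partial F(\rho, \alpha_2, \ldots, \alpha_h)}{\partial \rho}
&= - \sum_{k=1}^{h} \sum_{i=1}^{n_k}
\bigl[ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \bigr]^{-1}
\left[ \frac{\partial \Phi(\alpha^*_{k+1,i})}{\partial \rho} - \frac{\partial \Phi(\alpha^*_{k,i})}{\partial \rho} \right] \\
&\quad - \sum_{k=1}^{h} n^*_k
\left[ \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy \right]^{-1}
\frac{\partial}{\partial \rho} \int_{\alpha_k}^{\alpha_{k+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy \\
&= - \sum_{k=1}^{h} \sum_{i=1}^{n_k}
\frac{ (1-\rho^2)^{-3/2} }{ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) }
\Bigl[ \phi(\alpha^*_{k+1,i}) \, (\rho \alpha_{k+1} - X_k^{(i)})
 - \phi(\alpha^*_{k,i}) \, (\rho \alpha_k - X_k^{(i)}) \Bigr] \\
&\quad - \sum_{k=1}^{h} n^*_k \,
\frac{ \phi_2(+\infty, \alpha_{k+1}) - \phi_2(+\infty, \alpha_k) - \phi_2(c, \alpha_{k+1}) + \phi_2(c, \alpha_k) }
     { \Phi_2(+\infty, \alpha_{k+1}) - \Phi_2(+\infty, \alpha_k) - \Phi_2(c, \alpha_{k+1}) + \Phi_2(c, \alpha_k) }.
\end{aligned} \tag{2.15}
\]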
As remarks, for the term k=1 in the summation, since $\Phi(\alpha^*_{k,i}) = 0$
in $F(\theta)$, the term $\phi(\alpha^*_{k,i}) \cdot (\rho \alpha_k - X_k^{(i)})$ in the
derivative vanishes. Similarly, for the term k=h in the summation, since
$\Phi(\alpha^*_{k+1,i}) = 1$ in $F(\theta)$, the term
$\phi(\alpha^*_{k+1,i}) \cdot (\rho \alpha_{k+1} - X_k^{(i)})$ vanishes in the derivative.
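Next, we find the partial derivatives of $F(\theta)$
with respect to the unknown thresholds. For any $t = 2, \ldots, h$, consider
that
\[
\begin{aligned}
\frac{\partial \Phi(\alpha^*_{t,i})}{\partial \alpha_t}
&= \phi(\alpha^*_{t,i}) \cdot \frac{\partial \alpha^*_{t,i}}{\partial \alpha_t} \\
&= \phi(\alpha^*_{t,i}) \cdot \frac{\partial}{\partial \alpha_t}
\left( \frac{\alpha_t - \rho X_t^{(i)}}{(1-\rho^2)^{1/2}} \right) \\
&= \phi(\alpha^*_{t,i}) \cdot (1-\rho^2)^{-1/2}.
\end{aligned} \tag{2.16}
\]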
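Moreover, by referring to Johnson and Kotz (1972), we know
that
\[
\frac{\partial \Phi_2(u, v; \rho)}{\partial v}
= \phi(v) \cdot \Phi\!\left( \frac{u - \rho v}{(1-\rho^2)^{1/2}} \right),
\]
and if $u = +\infty$, then
\[
\begin{aligned}
\frac{\partial \Phi_2(+\infty, v; \rho)}{\partial v}
&= \phi(v) \cdot \Phi\!\left( \frac{+\infty - \rho v}{(1-\rho^2)^{1/2}} \right) \\
&= \phi(v) \cdot \Phi(+\infty) \\
&= \phi(v).
\end{aligned} \tag{2.17}
\]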
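Therefore, we have
\[
\begin{aligned}
\frac{\partial}{\partial \alpha_t} \int_{\alpha_t}^{\alpha_{t+1}} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy
&= \frac{\partial}{\partial \alpha_t} \bigl[ \Phi_2(+\infty, \alpha_{t+1}) - \Phi_2(+\infty, \alpha_t)
 - \Phi_2(c, \alpha_{t+1}) + \Phi_2(c, \alpha_t) \bigr] \\
&= - \phi(\alpha_t) \cdot \left[ 1 - \Phi\!\left( \frac{c - \rho \alpha_t}{(1-\rho^2)^{1/2}} \right) \right]
\end{aligned} \tag{2.18}
\]
and
\[
\begin{aligned}
\frac{\partial}{\partial \alpha_t} \int_{\alpha_{t-1}}^{\alpha_t} \int_{c}^{+\infty} \phi_2(x, y) \, dx \, dy
&= \frac{\partial}{\partial \alpha_t} \bigl[ \Phi_2(+\infty, \alpha_t) - \Phi_2(+\infty, \alpha_{t-1})
 - \Phi_2(c, \alpha_t) + \Phi_2(c, \alpha_{t-1}) \bigr] \\
&= \phi(\alpha_t) \cdot \left[ 1 - \Phi\!\left( \frac{c - \rho \alpha_t}{(1-\rho^2)^{1/2}} \right) \right].
\end{aligned} \tag{2.19}
\]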
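Finally, for any $t = 2, \ldots, h$,
\[
\begin{aligned}
\frac{\partial F(\theta)}{\partial \alpha_t}
&= - \frac{\partial}{\partial \alpha_t} \Biggl\{
\sum_{k=1}^{h} \sum_{i=1}^{n_k} \log \bigl[ \Phi(\alpha^*_{k+1,i}) - \Phi(\alpha^*_{k,i}) \bigr] \\
&\qquad + \sum_{k=1}^{h} n^*_k \log \bigl[ \Phi_2(+\infty, \alpha_{k+1}) - \Phi_2(+\infty, \alpha_k)
 - \Phi_2(c, \alpha_{k+1}) + \Phi_2(c, \alpha_k) \bigr] \Biggr\} \\
&= - \frac{\partial}{\partial \alpha_t} \Biggl\{
\sum_{i=1}^{n_{t-1}} \log \bigl[ \Phi(\alpha^*_{t,i}) - \Phi(\alpha^*_{t-1,i}) \bigr]
+ \sum_{i=1}^{n_t} \log \bigl[ \Phi(\alpha^*_{t+1,i}) - \Phi(\alpha^*_{t,i}) \bigr] \\
&\qquad + n^*_{t-1} \log \bigl[ \Phi_2(+\infty, \alpha_t) - \Phi_2(+\infty, \alpha_{t-1})
 - \Phi_2(c, \alpha_t) + \Phi_2(c, \alpha_{t-1}) \bigr] \\
&\qquad + n^*_t \log \bigl[ \Phi_2(+\infty, \alpha_{t+1}) - \Phi_2(+\infty, \alpha_t)
 - \Phi_2(c, \alpha_{t+1}) + \Phi_2(c, \alpha_t) \bigr] \Biggr\} \\
&= - \sum_{i=1}^{n_{t-1}} \frac{ \phi(\alpha^*_{t,i}) \cdot (1-\rho^2)^{-1/2} }{ \Phi(\alpha^*_{t,i}) - \Phi(\alpha^*_{t-1,i}) }
 + \sum_{i=1}^{n_t} \frac{ \phi(\alpha^*_{t,i}) \cdot (1-\rho^2)^{-1/2} }{ \Phi(\alpha^*_{t+1,i}) - \Phi(\alpha^*_{t,i}) } \\
&\quad - n^*_{t-1} \,
\frac{ \phi(\alpha_t) \left[ 1 - \Phi\!\left( \dfrac{c - \rho \alpha_t}{(1-\rho^2)^{1/2}} \right) \right] }
     { \Phi_2(+\infty, \alpha_t) - \Phi_2(+\infty, \alpha_{t-1}) - \Phi_2(c, \alpha_t) + \Phi_2(c, \alpha_{t-1}) } \\
&\quad + n^*_t \,
\frac{ \phi(\alpha_t) \left[ 1 - \Phi\!\left( \dfrac{c - \rho \alpha_t}{(1-\rho^2)^{1/2}} \right) \right] }
     { \Phi_2(+\infty, \alpha_{t+1}) - \Phi_2(+\infty, \alpha_t) - \Phi_2(c, \alpha_{t+1}) + \Phi_2(c, \alpha_t) }.
\end{aligned} \tag{2.20}
\]
In the sums over $i = 1, \ldots, n_{t-1}$ the quantities $\alpha^*_{\cdot,i}$ are evaluated at
$X_{t-1}^{(i)}$, and in the sums over $i = 1, \ldots, n_t$ they are evaluated at $X_t^{(i)}$.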
As a remark on this equation, note that when t=2, $\Phi(\alpha^*_{t-1,i}) = 0$
and $\Phi_2(+\infty, \alpha_{t-1}) = 0$. Also, when t=h, $\Phi(\alpha^*_{t+1,i}) = 1$ and
$\Phi_2(+\infty, \alpha_{t+1}) = 1$.
At this stage, we have found the h partial derivatives of $F(\theta)$.
As mentioned before, the minimizer of the negative
log-likelihood function $F(\theta)$ cannot be obtained algebraically in closed
form, so an iterative procedure, which requires the first partial
derivatives, is used to obtain the maximum likelihood estimates. The
procedure that we use will be introduced in Chapter 4.
2.4 Asymptotic properties of the model.
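The maximum likelihood estimate (MLE) $\hat\theta$ of
$\theta = (\rho, \alpha_2, \ldots, \alpha_h)'$ is consistent. Moreover, if
$\theta_0$ is the true parameter value of $\theta$, then
under mild regularity conditions, it follows from well-known
asymptotic theory (see, e.g., Rao 1973) that the asymptotic distribution
of $n^{1/2}(\hat\theta - \theta_0)$ is multivariate normal with zero mean vector and
covariance matrix given by the inverse of the information matrix,
that is,
\[
I(\theta) = E\left[ \left( \frac{\partial F(\theta)}{\partial \theta} \right)
\left( \frac{\partial F(\theta)}{\partial \theta} \right)' \right], \tag{2.21}
\]
and the estimate of the information matrix, $\hat I(\hat\theta)$, is given by
\[
\hat I(\hat\theta) = \frac{1}{n} \left\{
\sum_{i=1}^{m} \left( \frac{\partial F(x_i, z_i)}{\partial \theta} \right)
\left( \frac{\partial F(x_i, z_i)}{\partial \theta} \right)'
+ \sum_{i=m+1}^{n} \left( \frac{\partial F(z_i)}{\partial \theta} \right)
\left( \frac{\partial F(z_i)}{\partial \theta} \right)' \right\}, \tag{2.22}
\]
where $F(x_i, z_i)$ and $F(z_i)$ denote the contributions to $F$ of a single
observation with an observed and a truncated X-value respectively, and the
derivatives are evaluated at $\hat\theta$.
The derivatives in (2.22) have been obtained in the preceding
sections in the course of finding the partial derivatives of $F$. Hence,
the asymptotic covariance matrix can be estimated, and the
estimated standard errors of $\hat\theta$ can also be obtained.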
Chapter 3. Estimation of the model with one truncated continuous
variable and several polytomous variables.
3.1 The model
In Chapter 2, we studied the model with one truncated
continuous variable and one polytomous variable. We now extend
the model to contain several polytomous variables.
Suppose $X, Y_1, Y_2, \ldots, Y_p$ are (p+1) standardized random
variables. We also assume that $(X, Y_1, Y_2, \ldots, Y_p)'$ has a
(p+1)-dimensional multivariate normal distribution with zero mean vector
and correlation matrix P. Denote P as
\[
P = \begin{pmatrix} 1 & \rho' \\ \rho & \Pi \end{pmatrix}, \tag{3.1}
\]
where $\rho = (\rho_1, \rho_2, \ldots, \rho_p)'$ is the vector of correlations
between X and $Y = (Y_1, \ldots, Y_p)'$, and $\Pi = (\rho_{ab})$ denotes the matrix
of correlations of Y, with $\rho_{ab}$ being the correlation of $Y_a$ and $Y_b$.
Similar to Chapter 2, we assume that the random variable X is
continuous and is right truncated at a known point c. Also, suppose that for
any $a = 1, \ldots, p$, $Y_a$ is latent and is observed through $Z_a$, where
\[
Z_a = k(a) \quad \text{if } \alpha_{a,k(a)} \le Y_a < \alpha_{a,k(a)+1}, \tag{3.2}
\]
for $k(a) = 1, \ldots, h(a)$, with $\alpha_{a,1} = -\infty$ and $\alpha_{a,h(a)+1} = +\infty$. Let
$\alpha_a = (\alpha_{a,2}, \ldots, \alpha_{a,h(a)})'$ be the vector of the unknown thresholds of $Z_a$.
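Let $\theta$ be the parameter vector in the model. Then
\[
\theta = (\rho_1, \ldots, \rho_p; \; \rho_{21}, \ldots, \rho_{p,p-1}; \;
\alpha_{1,2}, \ldots, \alpha_{1,h(1)}; \; \ldots; \; \alpha_{p,2}, \ldots, \alpha_{p,h(p)})', \tag{3.3}
\]
and its dimension is given by
\[
\dim(\theta) = p + \frac{p(p-1)}{2} + \sum_{a=1}^{p} \bigl( h(a) - 1 \bigr)
= \frac{p(p-1)}{2} + \sum_{a=1}^{p} h(a). \tag{3.4}
\]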
Now, suppose we have n independent and identically distributed
observations of the form $(X^{(i)}, Z_1^{(i)}, \ldots, Z_p^{(i)})$, and suppose that m
of these n observed vectors have observed X-values. Similar to
Chapter 2, we let the last (n-m) X's, $X^{(m+1)}, \ldots, X^{(n)}$, be missing by
truncation, and the remaining m X's, $X^{(1)}, \ldots, X^{(m)}$, are observed.
Let $n_{k(a)}$ represent the number of observations corresponding
to $Z_a = k(a)$, let $n_{k(a),k(b)}$ represent the number of observations
corresponding to $Z_a = k(a)$ and $Z_b = k(b)$, and let $n_k$ represent the
number of observations corresponding to $Z = k$, where $Z = (Z_1, \ldots, Z_p)'$
and $k = (k(1), \ldots, k(p))'$. Then, we have
\[
\sum_{k(a)=1}^{h(a)} n_{k(a)}
= \sum_{k(a)=1}^{h(a)} \sum_{k(b)=1}^{h(b)} n_{k(a),k(b)}
= \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} n_k = n. \tag{3.5}
\]
Furthermore, let $m_{k(a)}$ represent the number of observed X's such that
the corresponding variable $Z_a$ is equal to k(a). Denote these observed
X's by $X_{k(a)}^{(1)}, \ldots, X_{k(a)}^{(m_{k(a)})}$. And let $m^*_{k(a)}$ represent the number of
missing X's such that the corresponding variable $Z_a$ is equal to k(a).
Then,
\[
m_{k(a)} + m^*_{k(a)} = n_{k(a)}, \qquad
\sum_{k(a)=1}^{h(a)} m_{k(a)} = m, \qquad
\sum_{k(a)=1}^{h(a)} m^*_{k(a)} = n - m. \tag{3.6}
\]
Also, let $m_k$ represent the number of observed X's such that
the corresponding variables Z equal k. Denote these observations by
$X_k^{(1)}, \ldots, X_k^{(m_k)}$. And let $m^*_k$ represent the number of missing X's such
that the corresponding Z equal k. Then, we have
\[
m_k + m^*_k = n_k, \qquad
\sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} m_k = m, \qquad
\sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} m^*_k = n - m. \tag{3.7}
\]
3.2 Partition Maximum Likelihood (PML) estimation
To estimate the parameter vector
$\theta = (\rho_1, \ldots, \rho_p; \rho_{21}, \ldots, \rho_{p,p-1}; \alpha_{1,2}, \ldots, \alpha_{1,h(1)}; \ldots; \alpha_{p,2}, \ldots, \alpha_{p,h(p)})'$
in this model, we apply
the Partition Maximum Likelihood (PML) estimation method.
For every $a = 1, \ldots, p$, $\rho_a$ is estimated based on the random
sample from the truncated continuous-polytomous sub-model
corresponding to $(X, Z_a)$, which we have discussed in Chapter 2.
Let $\beta_a = (\alpha_a', \rho_a)'$. According to (2.13) in Chapter 2, the
negative log-likelihood function for this sub-model is given by
\[
\begin{aligned}
F_a(\beta_a)
&= - \sum_{k(a)=1}^{h(a)} \sum_{i=1}^{m_{k(a)}}
\log \bigl[ \Phi(\alpha^*_{k(a)+1,i}) - \Phi(\alpha^*_{k(a),i}) \bigr] \\
&\quad - \sum_{k(a)=1}^{h(a)} m^*_{k(a)} \log \bigl[
\Phi_2(+\infty, \alpha_{a,k(a)+1}) - \Phi_2(+\infty, \alpha_{a,k(a)})
- \Phi_2(c, \alpha_{a,k(a)+1}) + \Phi_2(c, \alpha_{a,k(a)}) \bigr],
\end{aligned} \tag{3.8}
\]
where
\[
\alpha^*_{u,v} = \frac{\alpha_{a,u} - \rho_a X_{k(a)}^{(v)}}{(1-\rho_a^2)^{1/2}}.
\]
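Moreover, by (2.15) and (2.20) in Chapter 2, the partial
derivatives of $F_a$ with respect to the parameters in $\beta_a$ are given as
follows:
\[
\begin{aligned}
\frac{\partial F_a(\beta_a)}{\partial \rho_a}
&= - \sum_{k(a)=1}^{h(a)} \sum_{i=1}^{m_{k(a)}}
\frac{ (1-\rho_a^2)^{-3/2} }{ \Phi(\alpha^*_{k(a)+1,i}) - \Phi(\alpha^*_{k(a),i}) }
\Bigl[ \phi(\alpha^*_{k(a)+1,i}) \, (\rho_a \alpha_{a,k(a)+1} - X_{k(a)}^{(i)})
 - \phi(\alpha^*_{k(a),i}) \, (\rho_a \alpha_{a,k(a)} - X_{k(a)}^{(i)}) \Bigr] \\
&\quad - \sum_{k(a)=1}^{h(a)} m^*_{k(a)} \,
\frac{ \phi_2(+\infty, \alpha_{a,k(a)+1}) - \phi_2(+\infty, \alpha_{a,k(a)})
 - \phi_2(c, \alpha_{a,k(a)+1}) + \phi_2(c, \alpha_{a,k(a)}) }
     { \Phi_2(+\infty, \alpha_{a,k(a)+1}) - \Phi_2(+\infty, \alpha_{a,k(a)})
 - \Phi_2(c, \alpha_{a,k(a)+1}) + \Phi_2(c, \alpha_{a,k(a)}) },
\end{aligned} \tag{3.9}
\]
\[
\begin{aligned}
\frac{\partial F_a(\beta_a)}{\partial \alpha_{a,l(a)}}
&= - \sum_{i=1}^{m_{l(a)-1}} \frac{ \phi(\alpha^*_{l(a),i}) \cdot (1-\rho_a^2)^{-1/2} }
 { \Phi(\alpha^*_{l(a),i}) - \Phi(\alpha^*_{l(a)-1,i}) }
 + \sum_{i=1}^{m_{l(a)}} \frac{ \phi(\alpha^*_{l(a),i}) \cdot (1-\rho_a^2)^{-1/2} }
 { \Phi(\alpha^*_{l(a)+1,i}) - \Phi(\alpha^*_{l(a),i}) } \\
&\quad - m^*_{l(a)-1} \,
\frac{ \phi(\alpha_{a,l(a)}) \left[ 1 - \Phi\!\left( \dfrac{c - \rho_a \alpha_{a,l(a)}}{(1-\rho_a^2)^{1/2}} \right) \right] }
     { \Phi_2(+\infty, \alpha_{a,l(a)}) - \Phi_2(+\infty, \alpha_{a,l(a)-1})
 - \Phi_2(c, \alpha_{a,l(a)}) + \Phi_2(c, \alpha_{a,l(a)-1}) } \\
&\quad + m^*_{l(a)} \,
\frac{ \phi(\alpha_{a,l(a)}) \left[ 1 - \Phi\!\left( \dfrac{c - \rho_a \alpha_{a,l(a)}}{(1-\rho_a^2)^{1/2}} \right) \right] }
     { \Phi_2(+\infty, \alpha_{a,l(a)+1}) - \Phi_2(+\infty, \alpha_{a,l(a)})
 - \Phi_2(c, \alpha_{a,l(a)+1}) + \Phi_2(c, \alpha_{a,l(a)}) },
\end{aligned} \tag{3.10}
\]
for any $l(a) = 2, \ldots, h(a)$.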
To estimate the polychoric correlation $\rho_{ab}$, for $a, b = 1, \ldots, p$
with $a > b$, the bivariate sub-model corresponding to $(Z_a, Z_b)$ is considered.
Let $\beta_{ab} = (\alpha_a', \alpha_b', \rho_{ab})'$. Suppose $d_{k(a),k(b)}$ denotes the
probability that $Z_a = k(a)$ and $Z_b = k(b)$. Then
\[
\begin{aligned}
d_{k(a),k(b)}
&= \Pr\bigl( Z_a = k(a) \text{ and } Z_b = k(b) \bigr) \\
&= \Pr\bigl( \alpha_{a,k(a)} \le Y_a < \alpha_{a,k(a)+1} \text{ and }
 \alpha_{b,k(b)} \le Y_b < \alpha_{b,k(b)+1} \bigr) \\
&= \Phi_2(\alpha_{a,k(a)+1}, \alpha_{b,k(b)+1}) - \Phi_2(\alpha_{a,k(a)}, \alpha_{b,k(b)+1}) \\
&\quad - \Phi_2(\alpha_{a,k(a)+1}, \alpha_{b,k(b)}) + \Phi_2(\alpha_{a,k(a)}, \alpha_{b,k(b)}).
\end{aligned} \tag{3.11}
\]
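The cell probabilities in (3.11) can be illustrated with a short numerical sketch. Rather than evaluating the bivariate normal distribution function at the four corners, each rectangle probability is computed here directly by a one-dimensional Simpson integration of the marginal density times the conditional interval probability, which yields the same quantity. The names `rect_prob` and `cell_probs`, the clipping of infinite limits at plus and minus 9, and the example thresholds are all assumptions.

```python
import math

def Phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def rect_prob(a_lo, a_hi, b_lo, b_hi, rho, steps=400):
    """Pr(a_lo <= Y_a < a_hi, b_lo <= Y_b < b_hi) for a standard bivariate normal."""
    lo, hi = max(a_lo, -9.0), min(a_hi, 9.0)   # clip infinite limits for Y_a
    s = math.sqrt(1.0 - rho * rho)
    width = (hi - lo) / steps                  # steps must be even
    total = 0.0
    for j in range(steps + 1):
        u = lo + j * width
        w = 1 if j in (0, steps) else (4 if j % 2 else 2)
        # Conditional on Y_a = u, Y_b ~ N(rho * u, 1 - rho^2).
        inner = Phi((b_hi - rho * u) / s) - Phi((b_lo - rho * u) / s)
        total += w * phi(u) * inner
    return total * width / 3.0

def cell_probs(cuts_a, cuts_b, rho):
    """Matrix of d_{k(a),k(b)} for finite thresholds cuts_a and cuts_b."""
    ta = [-math.inf] + list(cuts_a) + [math.inf]
    tb = [-math.inf] + list(cuts_b) + [math.inf]
    return [[rect_prob(ta[i], ta[i + 1], tb[j], tb[j + 1], rho)
             for j in range(len(tb) - 1)] for i in range(len(ta) - 1)]

# Hypothetical thresholds and correlation.
d = cell_probs([-0.5, 0.5], [0.0], rho=0.3)
total = sum(sum(row) for row in d)
d_indep = cell_probs([0.0], [0.0], rho=0.0)
```

With zero correlation and both thresholds at zero, each of the four cells should be close to one quarter, which gives a simple sanity check.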
Let $L_{ab}$ be the likelihood function in this sub-model. Then, by
the independence of the observations,
\[
L_{ab}(\beta_{ab}) \propto \prod_{k(a)=1}^{h(a)} \prod_{k(b)=1}^{h(b)}
\bigl( d_{k(a),k(b)} \bigr)^{n_{k(a),k(b)}}. \tag{3.12}
\]
The negative log-likelihood function in this sub-model is given by
$F_{ab}(\beta_{ab})$, where
\[
F_{ab}(\beta_{ab}) = - \sum_{k(a)=1}^{h(a)} \sum_{k(b)=1}^{h(b)}
n_{k(a),k(b)} \cdot \log \bigl( d_{k(a),k(b)} \bigr). \tag{3.13}
\]
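The partial derivatives of $F_{ab}(\beta_{ab})$ with respect to the
parameters in $\beta_{ab}$ are derived as follows.
To find $\partial F_{ab}(\beta_{ab}) / \partial \rho_{ab}$, we first recall from Johnson and Kotz
(1972) that $\partial \Phi_2(u, v; \rho) / \partial \rho = \phi_2(u, v)$. So
\[
\begin{aligned}
\frac{\partial F_{ab}(\beta_{ab})}{\partial \rho_{ab}}
&= - \sum_{k(a)=1}^{h(a)} \sum_{k(b)=1}^{h(b)}
\frac{ n_{k(a),k(b)} }{ d_{k(a),k(b)} } \cdot
\frac{ \partial d_{k(a),k(b)} }{ \partial \rho_{ab} } \\
&= - \sum_{k(a)=1}^{h(a)} \sum_{k(b)=1}^{h(b)}
\frac{ n_{k(a),k(b)} }{ d_{k(a),k(b)} } \cdot
\bigl[ \phi_2(\alpha_{a,k(a)+1}, \alpha_{b,k(b)+1}) - \phi_2(\alpha_{a,k(a)}, \alpha_{b,k(b)+1}) \\
&\qquad\qquad - \phi_2(\alpha_{a,k(a)+1}, \alpha_{b,k(b)}) + \phi_2(\alpha_{a,k(a)}, \alpha_{b,k(b)}) \bigr].
\end{aligned} \tag{3.14}
\]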
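To find the partial derivatives of $F_{ab}(\beta_{ab})$ with respect to
the thresholds, we first find $\partial d_{k(a),k(b)} / \partial \alpha_{a,l(a)}$ in three cases separately.
Case I: l(a) = k(a).
By Johnson and Kotz (1972),
\[
\frac{\partial \Phi_2(u, v; \rho)}{\partial u}
= \phi(u) \cdot \Phi\!\left( \frac{v - \rho u}{(1-\rho^2)^{1/2}} \right),
\]
so that
\[
\begin{aligned}
\frac{\partial d_{k(a),k(b)}}{\partial \alpha_{a,l(a)}}
&= - \frac{\partial \Phi_2(\alpha_{a,l(a)}, \alpha_{b,k(b)+1})}{\partial \alpha_{a,l(a)}}
 + \frac{\partial \Phi_2(\alpha_{a,l(a)}, \alpha_{b,k(b)})}{\partial \alpha_{a,l(a)}} \\
&= \phi(\alpha_{a,l(a)}) \cdot \left[
 - \Phi\!\left( \frac{\alpha_{b,k(b)+1} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right)
 + \Phi\!\left( \frac{\alpha_{b,k(b)} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right) \right].
\end{aligned} \tag{3.15}
\]
Case II: l(a) = k(a)+1.
\[
\begin{aligned}
\frac{\partial d_{k(a),k(b)}}{\partial \alpha_{a,l(a)}}
&= \frac{\partial \Phi_2(\alpha_{a,l(a)}, \alpha_{b,k(b)+1})}{\partial \alpha_{a,l(a)}}
 - \frac{\partial \Phi_2(\alpha_{a,l(a)}, \alpha_{b,k(b)})}{\partial \alpha_{a,l(a)}} \\
&= \phi(\alpha_{a,l(a)}) \cdot \left[
 \Phi\!\left( \frac{\alpha_{b,k(b)+1} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right)
 - \Phi\!\left( \frac{\alpha_{b,k(b)} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right) \right].
\end{aligned} \tag{3.16}
\]
Case III: l(a) ≠ k(a) and l(a) ≠ k(a)+1.
In this case,
\[
\frac{\partial d_{k(a),k(b)}}{\partial \alpha_{a,l(a)}} = 0. \tag{3.17}
\]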
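Finally,
\[
\begin{aligned}
\frac{\partial F_{ab}(\beta_{ab})}{\partial \alpha_{a,l(a)}}
&= - \frac{\partial}{\partial \alpha_{a,l(a)}} \left[ \sum_{k(a)=1}^{h(a)} \sum_{k(b)=1}^{h(b)}
 n_{k(a),k(b)} \cdot \log \bigl( d_{k(a),k(b)} \bigr) \right] \\
&= - \frac{\partial}{\partial \alpha_{a,l(a)}} \left[ \sum_{k(b)=1}^{h(b)}
 n_{l(a),k(b)} \cdot \log \bigl( d_{l(a),k(b)} \bigr)
 + \sum_{k(b)=1}^{h(b)} n_{l(a)-1,k(b)} \cdot \log \bigl( d_{l(a)-1,k(b)} \bigr) \right] \\
&= - \sum_{k(b)=1}^{h(b)} \frac{ n_{l(a),k(b)} }{ d_{l(a),k(b)} } \cdot
 \frac{ \partial d_{l(a),k(b)} }{ \partial \alpha_{a,l(a)} }
 - \sum_{k(b)=1}^{h(b)} \frac{ n_{l(a)-1,k(b)} }{ d_{l(a)-1,k(b)} } \cdot
 \frac{ \partial d_{l(a)-1,k(b)} }{ \partial \alpha_{a,l(a)} } \\
&= \sum_{k(b)=1}^{h(b)} \frac{ n_{l(a),k(b)} }{ d_{l(a),k(b)} } \cdot \phi(\alpha_{a,l(a)}) \cdot \left[
 \Phi\!\left( \frac{\alpha_{b,k(b)+1} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right)
 - \Phi\!\left( \frac{\alpha_{b,k(b)} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right) \right] \\
&\quad - \sum_{k(b)=1}^{h(b)} \frac{ n_{l(a)-1,k(b)} }{ d_{l(a)-1,k(b)} } \cdot \phi(\alpha_{a,l(a)}) \cdot \left[
 \Phi\!\left( \frac{\alpha_{b,k(b)+1} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right)
 - \Phi\!\left( \frac{\alpha_{b,k(b)} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right) \right] \\
&= - \sum_{k(b)=1}^{h(b)} \left[ \frac{ n_{l(a)-1,k(b)} }{ d_{l(a)-1,k(b)} }
 - \frac{ n_{l(a),k(b)} }{ d_{l(a),k(b)} } \right] \cdot \phi(\alpha_{a,l(a)}) \cdot \left[
 \Phi\!\left( \frac{\alpha_{b,k(b)+1} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right)
 - \Phi\!\left( \frac{\alpha_{b,k(b)} - \rho_{ab} \alpha_{a,l(a)}}{(1-\rho_{ab}^2)^{1/2}} \right) \right].
\end{aligned} \tag{3.18}
\]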
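Similarly, for any $b = 1, \ldots, p$ and $l(b) = 2, \ldots, h(b)$,
\[
\frac{\partial F_{ab}(\beta_{ab})}{\partial \alpha_{b,l(b)}}
= - \sum_{k(a)=1}^{h(a)} \left[ \frac{ n_{k(a),l(b)-1} }{ d_{k(a),l(b)-1} }
 - \frac{ n_{k(a),l(b)} }{ d_{k(a),l(b)} } \right] \cdot \phi(\alpha_{b,l(b)}) \cdot \left[
 \Phi\!\left( \frac{\alpha_{a,k(a)+1} - \rho_{ab} \alpha_{b,l(b)}}{(1-\rho_{ab}^2)^{1/2}} \right)
 - \Phi\!\left( \frac{\alpha_{a,k(a)} - \rho_{ab} \alpha_{b,l(b)}}{(1-\rho_{ab}^2)^{1/2}} \right) \right]. \tag{3.19}
\]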
Similar to Chapter 2, the minimizers of $F_a(\beta_a)$ and $F_{ab}(\beta_{ab})$
cannot be obtained algebraically in closed form. Hence the iterative
methods described in the next chapter will be used to obtain the
solution. Note that these iterative methods require the first partial
derivatives of the objective function, which is why we have derived them
above.
In Partition Maximum Likelihood estimation, we separate
the large (p+1)-dimensional model into $\binom{p+1}{2} = p(p+1)/2$ sub-models. In
order to obtain the estimates for these smaller sub-models, we only need to
compute single and double integrals instead of the complicated multiple
integrals which would occur if the full maximum likelihood estimation method
were used. So, a lot of computer time can be saved. However, the
threshold estimates are not unique: there are p sets of threshold
estimates for each a, based on p different sub-models. We use the mean
of these estimates as our final threshold estimates, since the
differences among them are very small in our experience with
simulation studies.
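The bookkeeping behind this partition can be sketched in a few lines. The example lists the sub-models for a given p and averages several estimates of one threshold vector; the function names, the labels "X" and "Z1", "Z2", ..., and the numerical threshold estimates are purely hypothetical.

```python
from itertools import combinations
from statistics import mean

def pml_submodels(p):
    """List the sub-models fitted by PML for one truncated X and Z_1..Z_p."""
    # p mixed sub-models (X, Z_a), one per polytomous variable.
    mixed = [("X", f"Z{a}") for a in range(1, p + 1)]
    # p(p-1)/2 polytomous pairs (Z_a, Z_b) with a > b.
    pairs = [(f"Z{a}", f"Z{b}") for b, a in combinations(range(1, p + 1), 2)]
    return mixed + pairs

def average_thresholds(estimates):
    """Average several estimates of the same threshold vector, element-wise."""
    return [mean(col) for col in zip(*estimates)]

subs = pml_submodels(3)
# With p = 3, each Z_a appears in 3 sub-models, giving 3 threshold estimates.
# The numbers below are made up for illustration only.
final = average_thresholds([[-0.52, 0.49], [-0.50, 0.51], [-0.48, 0.50]])
```

For p = 3 this gives three mixed sub-models and three polytomous pairs, six in total, in agreement with p(p+1)/2.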
3.3 Asymptotic properties of the PML estimates
For each $a = 1, \ldots, p$, let $\hat\beta_a = (\hat\alpha_a', \hat\rho_a)'$ be the vector that
minimizes the function $F_a(\beta_a)$. Also, for $a, b = 1, \ldots, p$, $a > b$, let
$\hat\beta_{ab} = (\hat\alpha_a', \hat\alpha_b', \hat\rho_{ab})'$ be the vector that minimizes $F_{ab}(\beta_{ab})$.
Moreover, denote $\hat\eta = (\hat\beta_1', \ldots, \hat\beta_p'; \hat\beta_{21}', \ldots, \hat\beta_{p,p-1}')'$ and
$\eta = (\beta_1', \ldots, \beta_p'; \beta_{21}', \ldots, \beta_{p,p-1}')'$. Then, by standard maximum
likelihood theory, under mild regularity conditions $\hat\eta$ is a
consistent estimator of $\eta_0$, where $\eta_0$ represents the vector of true
parameter values of $\eta$.
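According to the mean value theorem, for each $a = 1, \ldots, p$,
\[
\frac{\partial F_a(\hat\beta_a)}{\partial \beta_a}
= \frac{\partial F_a(\beta_{a0})}{\partial \beta_a}
+ \frac{\partial^2 F_a(\beta_a^*)}{\partial \beta_a \, \partial \beta_a'} \cdot (\hat\beta_a - \beta_{a0}), \tag{3.20}
\]
where $\beta_{a0}$ is the true value of the parameter vector $\beta_a$, and $\beta_a^*$ is a
vector that lies between $\hat\beta_a$ and $\beta_{a0}$.
Also, by the mean value theorem, for each $a, b = 1, \ldots, p$, $a > b$,
\[
\frac{\partial F_{ab}(\hat\beta_{ab})}{\partial \beta_{ab}}
= \frac{\partial F_{ab}(\beta_{ab0})}{\partial \beta_{ab}}
+ \frac{\partial^2 F_{ab}(\beta_{ab}^*)}{\partial \beta_{ab} \, \partial \beta_{ab}'} \cdot (\hat\beta_{ab} - \beta_{ab0}), \tag{3.21}
\]
where $\beta_{ab0}$ is the true value of the parameter vector $\beta_{ab}$, and $\beta_{ab}^*$ is a
vector that lies between $\hat\beta_{ab}$ and $\beta_{ab0}$.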
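Combining the equations of the form in (3.20) for $a = 1, \ldots, p$
and of the form in (3.21) for $a, b = 1, \ldots, p$; $a > b$, we obtain the
following matrix equation:
\[
\begin{pmatrix}
\vdots \\[2pt]
\dfrac{\partial F_a(\hat\beta_a)}{\partial \beta_a} \\[4pt]
\vdots \\[2pt]
\dfrac{\partial F_{ab}(\hat\beta_{ab})}{\partial \beta_{ab}} \\[4pt]
\vdots
\end{pmatrix}
=
\begin{pmatrix}
\vdots \\[2pt]
\dfrac{\partial F_a(\beta_{a0})}{\partial \beta_a} \\[4pt]
\vdots \\[2pt]
\dfrac{\partial F_{ab}(\beta_{ab0})}{\partial \beta_{ab}} \\[4pt]
\vdots
\end{pmatrix}
+
\begin{pmatrix}
\ddots & & & & 0 \\
 & \dfrac{\partial^2 F_a(\beta_a^*)}{\partial \beta_a \, \partial \beta_a'} & & & \\
 & & \ddots & & \\
 & & & \dfrac{\partial^2 F_{ab}(\beta_{ab}^*)}{\partial \beta_{ab} \, \partial \beta_{ab}'} & \\
0 & & & & \ddots
\end{pmatrix}
\begin{pmatrix}
\vdots \\
\hat\beta_a - \beta_{a0} \\
\vdots \\
\hat\beta_{ab} - \beta_{ab0} \\
\vdots
\end{pmatrix}. \tag{3.22}
\]
Denote
\[
\frac{\partial F(\eta)}{\partial \eta}
= \left( \frac{\partial F_1}{\partial \beta_1'}, \ldots, \frac{\partial F_p}{\partial \beta_p'}, \;
\frac{\partial F_{21}}{\partial \beta_{21}'}, \ldots, \frac{\partial F_{p,p-1}}{\partial \beta_{p,p-1}'} \right)',
\]
and let $H^*$ be the block diagonal matrix with diagonal blocks
\[
H_a(\beta_a^*) = n^{-1} \frac{\partial^2 F_a(\beta_a^*)}{\partial \beta_a \, \partial \beta_a'}
\quad (a = 1, \ldots, p)
\qquad \text{and} \qquad
H_{ab}(\beta_{ab}^*) = n^{-1} \frac{\partial^2 F_{ab}(\beta_{ab}^*)}{\partial \beta_{ab} \, \partial \beta_{ab}'}
\quad (a, b = 1, \ldots, p, \; a > b)
\]
respectively.
Hence, the following equation is established.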
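\[
\frac{\partial F(\hat\eta)}{\partial \eta}
= \frac{\partial F(\eta_0)}{\partial \eta} + n \cdot H^* \cdot (\hat\eta - \eta_0). \tag{3.23}
\]
By the definition of $\hat\eta$, we know that $\partial F(\hat\eta) / \partial \eta = 0$, and
hence
\[
n^{1/2} \cdot (\hat\eta - \eta_0) = - (H^*)^{-1} \cdot n^{-1/2} \cdot \frac{\partial F(\eta_0)}{\partial \eta}. \tag{3.24}
\]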
Now, for every $i = 1, \ldots, n_k$, $k(a) = 1, \ldots, h(a)$, $a = 1, \ldots, p$,
let $\Delta F(\eta, k, i)$ denote the augmented vector which has $p(p+1)/2$
sub-vectors containing partial derivatives from two categories. The
first $p$ sub-vectors are given by either

$$\frac{\partial}{\partial \beta_a} \log \left[ \Phi\!\left( \frac{\alpha_{a,k(a)+1} - \rho_a x_i}{(1 - \rho_a^2)^{1/2}} \right) - \Phi\!\left( \frac{\alpha_{a,k(a)} - \rho_a x_i}{(1 - \rho_a^2)^{1/2}} \right) \right]$$

if the sample's corresponding variable $X$ is observed, or

$$\frac{\partial}{\partial \beta_a} \log \left[ \Phi_2(+\infty, \alpha_{a,k(a)+1}) - \Phi_2(+\infty, \alpha_{a,k(a)}) - \Phi_2(c, \alpha_{a,k(a)+1}) + \Phi_2(c, \alpha_{a,k(a)}) \right]$$

if the sample's corresponding variable is missing, for any $a = 1, \ldots, p$.
Secondly, the subsequent $p(p-1)/2$ sub-vectors are given by

$$\frac{\partial}{\partial \beta_{ab}} \log \Pr\{ Z_a = k(a),\ Z_b = k(b) \}$$

where $a, b = 1, \ldots, p$, $a > b$.
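By defining this $\Delta F(\eta, k, i)$, we have

$$\frac{\partial F(\eta_0)}{\partial \eta} = - \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} \sum_{i=1}^{n_k} \Delta F(\eta_0, k, i). \tag{3.25}$$

Finally, we get

$$n^{1/2} \cdot (\hat{\eta} - \eta_0) = (H^*)^{-1} \cdot n^{-1/2} \cdot \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} \sum_{i=1}^{n_k} \Delta F(\eta_0, k, i). \tag{3.26}$$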
Now, since for $i = 1, \ldots, n_k$, $k(a) = 1, \ldots, h(a)$, $a = 1, \ldots, p$,
$(X_k^{(i)}, Z_k^{(i)})$ is a sequence of independently and identically distributed
random vectors, it can be shown by the central limit theorem that the
asymptotic distribution of

$$n^{-1/2} \cdot \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} \sum_{i=1}^{n_k} \Delta F(\eta_0, k, i)$$

is multivariate normal with mean zero and covariance matrix $\Omega$.
In addition, as $\hat{\beta}_a$, $\hat{\beta}_{ab}$ are consistent estimates of $\beta_{a0}$ and
$\beta_{ab0}$, the diagonal blocks in $H^*$ will converge in probability to the
corresponding matrices $H_a(\beta_{a0})$ and $H_{ab}(\beta_{ab0})$ respectively. That is,
$H^* \xrightarrow{\,p\,} H$ and hence $(H^*)^{-1} \xrightarrow{\,p\,} H^{-1}$.
Therefore, $n^{1/2} \cdot (\hat{\eta} - \eta_0)$ is asymptotically distributed as
multivariate normal with zero mean and covariance matrix

$$H^{-1} \, \Omega \, H^{-1}$$

where $H$ is a diagonal block matrix with diagonal blocks equal to
$H_a(\beta_{a0})$ for $a = 1, \ldots, p$ and $H_{ab}(\beta_{ab0})$ for $a, b = 1, \ldots, p$, $a > b$.
Since our main concern is the correlations, we let
$\sigma = (\rho_1, \ldots, \rho_p, \rho_{21}, \ldots, \rho_{p,p-1})'$ be the vector of the unknown correlations.
Suppose $J$ is a selection matrix such that $J\eta = \sigma$; then the Partition
Maximum Likelihood (PML) estimator $\hat{\sigma}$ of $\sigma$ is given by $\hat{\sigma} = J\hat{\eta}$, and the
asymptotic distribution of $n^{1/2} \cdot (\hat{\sigma} - \sigma_0)$ is hence multivariate normal with
mean zero and covariance matrix $J H^{-1} \Omega H^{-1} J'$.
To find an estimate of $H^{-1}$, we actually need to find the
estimates for the blocks $H_a(\beta_{a0})$ and $H_{ab}(\beta_{ab0})$. For the truncated
continuous - polytomous sub-models, $H_a(\beta_a)$ is estimated by $\hat{H}_a(\hat{\beta}_a)$, which
is given by

$$\hat{H}_a(\hat{\beta}_a) = n^{-1} \left[ \sum_{i=1}^{m} \left( \frac{\partial F_a(x_i, z_i)}{\partial \beta_a} \right) \left( \frac{\partial F_a(x_i, z_i)}{\partial \beta_a} \right)' + \sum_{i=m+1}^{n} \left( \frac{\partial F_a(z_i)}{\partial \beta_a} \right) \left( \frac{\partial F_a(z_i)}{\partial \beta_a} \right)' \right] \tag{3.27}$$

for $a = 1, \ldots, p$. For the polytomous - polytomous sub-models, $\hat{H}_{ab}(\hat{\beta}_{ab})$ can
be obtained as a by-product in the final stage of the iteration in
Fisher's Scoring optimization procedure. The details will be discussed
in the following chapter.
Since $(X_k^{(i)}, Z_k^{(i)})$ is a sequence of independently and identically
distributed random vectors, $i = 1, \ldots, n_k$, $k(a) = 1, \ldots, h(a)$,
$a = 1, \ldots, p$, the corresponding $\Delta F(\eta, k, i)$ is also a sequence of
i.i.d. random vectors. Hence, we use the usual sample estimate to
estimate $\Omega$ as follows.
Let $\overline{\Delta F}$ denote the mean value of $\Delta F(\hat{\eta}, k, i)$, that is

$$\overline{\Delta F} = \frac{1}{n} \left[ \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} \sum_{i=1}^{n_k} \Delta F(\hat{\eta}, k, i) \right]. \tag{3.28}$$

Then the estimate of $\Omega$ is given by

$$\hat{\Omega} = \frac{1}{n-1} \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} \sum_{i=1}^{n_k} \left( \Delta F(\hat{\eta}, k, i) - \overline{\Delta F} \right) \left( \Delta F(\hat{\eta}, k, i) - \overline{\Delta F} \right)'. \tag{3.29}$$

Furthermore, since $E[\Delta F(\eta_0, k, i)] = 0$, we can approximate $\Omega$ by

$$\hat{\Omega} = \frac{1}{n-1} \sum_{k(1)=1}^{h(1)} \cdots \sum_{k(p)=1}^{h(p)} \sum_{i=1}^{n_k} \Delta F(\hat{\eta}, k, i) \, \Delta F(\hat{\eta}, k, i)'. \tag{3.30}$$
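The covariance matrix of the correlation estimates has the sandwich form $J H^{-1} \Omega H^{-1} J'$. The following is a minimal Python sketch of that computation, not the thesis's FORTRAN implementation. The function names, the array layout (one stacked score vector per row) and the toy numbers are assumptions made for illustration only.

```python
import numpy as np

def sandwich_covariance(scores, H, J):
    """Estimate J H^{-1} Omega H^{-1} J' from per-observation score vectors.

    scores : (n, q) array, row i holds the stacked vector Delta F for sample i
    H      : (q, q) block-diagonal matrix built from the sub-model blocks
    J      : (r, q) selection matrix that picks out the correlation entries
    """
    n = scores.shape[0]
    # Uncentred outer-product estimate of Omega (scores have mean zero).
    omega = scores.T @ scores / (n - 1)
    H_inv = np.linalg.inv(H)
    return J @ H_inv @ omega @ H_inv.T @ J.T

def selection_matrix(indices, q):
    """Rows of the q x q identity matrix at the requested positions."""
    return np.eye(q)[list(indices)]

# Toy illustration with made-up numbers (hypothetical, not thesis data).
rng = np.random.default_rng(0)
toy_scores = rng.normal(size=(200, 4))
toy_H = np.diag([2.0, 1.5, 1.0, 0.5])
toy_J = selection_matrix([1, 3], q=4)
toy_cov = sandwich_covariance(toy_scores, toy_H, toy_J)
```

Since the result is the covariance of $n^{1/2}(\hat{\sigma} - \sigma_0)$, standard errors for the correlations would be obtained as the square roots of its diagonal divided by $n$.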
Chapter 4. Optimization procedures and Simulation study
4.1 Optimization procedures
As mentioned in Chapter 2 and Chapter 3, the maximum
likelihood estimates of the parameter vector for each bivariate
sub-model are obtained by minimizing the corresponding negative
log-likelihood function. However, in practice, the minimum of the
negative log-likelihood function cannot be obtained in closed form.
Hence, an iterative algorithm (see, e.g., Lee & Jennrich, 1979) should
be used. We describe below how the modified
Davidon-Fletcher-Powell (DFP) algorithm and the Fisher Scoring
algorithm are applied to the two different kinds of sub-models.
The procedure for minimizing (3.8) for the sub-model with a
truncated continuous - polytomous pair, which is based on the modified
Davidon-Fletcher-Powell (DFP) algorithm, has been implemented
in FORTRAN IV with double precision. The DFP algorithm, which is also
referred to as the variable metric method, is a line search algorithm (see,
e.g., Luenberger, 1973). Let $f(x)$ be the objective function; then the
basic steps of the algorithm are as follows.
Given a symmetric positive definite matrix $S_0$ and a starting point $x_0$,
then starting with $k = 0$,
Step 1 Set $d_k = -S_k g_k$, where $g_k$ is the gradient vector of
the objective function $f$ evaluated at $x_k$ and $S_k$ is a
symmetric positive definite matrix.
Step 2 Minimize $f(x_k + \alpha d_k)$ with respect to $\alpha \ge 0$ to obtain
(i) $\alpha_k$, (ii) $x_{k+1} = x_k + \alpha_k d_k$, (iii) $p_k = \alpha_k d_k$ and (iv) $g_{k+1}$.
Step 3 Set $q_k = g_{k+1} - g_k$ and

$$S_{k+1} = S_k + \frac{p_k p_k'}{p_k' q_k} - \frac{S_k q_k q_k' S_k}{q_k' S_k q_k}. \tag{4.1}$$

Update $k$ and return to Step 1.
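The three steps above can be sketched in Python as follows. This is an illustrative sketch only, not the thesis's FORTRAN IV program: the exact line minimisation of Step 2 is replaced by a simple backtracking search (an assumption), and the function name, tolerances and the quadratic test problem are all hypothetical.

```python
import numpy as np

def dfp_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimise f with the DFP update of the matrix S (sketch)."""
    x = np.asarray(x0, dtype=float)
    S = np.eye(x.size)                      # S_0: symmetric positive definite
    g = grad(x)
    for _ in range(max_iter):
        if np.sqrt(np.mean(g ** 2)) < tol:  # root mean square of the gradient
            break
        d = -S @ g                          # Step 1: search direction
        # Step 2: backtracking stand-in for the exact minimisation over alpha.
        alpha, fx = 1.0, f(x)
        while f(x + alpha * d) > fx + 1e-4 * alpha * (g @ d) and alpha > 1e-12:
            alpha *= 0.5
        p = alpha * d
        x_new = x + p
        g_new = grad(x_new)
        q = g_new - g                       # Step 3: change in gradient
        pq = p @ q
        if pq > 1e-14:                      # update only when p'q is positive
            Sq = S @ q
            S = S + np.outer(p, p) / pq - np.outer(Sq, Sq) / (q @ Sq)
        x, g = x_new, g_new
    return x, S

# Hypothetical test problem: f(x) = 0.5 x'Ax - b'x, minimised at A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_hat, S_hat = dfp_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                            lambda x: A @ x - b, [0.0, 0.0])
```

Skipping the update when $p_k' q_k$ is not positive is a common safeguard that keeps $S_k$ positive definite; it is a design choice of this sketch rather than something stated in the text.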
In 1970, Broyden, Goldfarb, Fletcher and Shanno suggested the
so-called BFGS formula. The global convergence of the BFGS method with
inexact line searches which satisfy the conditions suggested by
Goldstein (see, e.g., Fletcher, 1979) has been proved. The two
conditions suggested by Goldstein in the minimization procedure are
given as,

(i) $f_k - f_{k+1} \ge -\rho \, g_k' p_k$, for some $\rho \in (0, 1/2)$ (4.2)

which preserves the positive definite property of $S_k$ and hence the
function value decreases monotonically in every iteration. Due to the
advantages of the method, we replace Step 3 of the procedure by the BFGS
formula,

$$S_{k+1} = S_k + \left( 1 + \frac{q_k' S_k q_k}{p_k' q_k} \right) \frac{p_k p_k'}{p_k' q_k} - \frac{p_k q_k' S_k + S_k q_k p_k'}{p_k' q_k}. \tag{4.3}$$
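As a hedged illustration of the BFGS formula for the matrix $S_k$, the short Python sketch below implements the update and checks the secant property $S_{k+1} q_k = p_k$, which the formula satisfies by construction. The function name and the numerical vectors are hypothetical.

```python
import numpy as np

def bfgs_inverse_update(S, p, q):
    """BFGS update of the matrix S given step p and gradient change q."""
    pq = p @ q                      # must be positive to keep S positive definite
    Sq = S @ q
    return (S
            + (1.0 + (q @ Sq) / pq) * np.outer(p, p) / pq
            - (np.outer(p, Sq) + np.outer(Sq, p)) / pq)

# Made-up vectors for illustration; here p'q = 1.08 > 0.
S0 = np.eye(3)
p0 = np.array([1.0, 0.5, -0.2])
q0 = np.array([0.8, 0.6, 0.1])
S1 = bfgs_inverse_update(S0, p0, q0)
```

Because $S_k$ is symmetric, $p_k q_k' S_k$ equals the outer product of $p_k$ with $S_k q_k$, which is how the last term is coded.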
In the modified Davidon-Fletcher-Powell (DFP) algorithm, the
positive definite matrix $S_k$ is updated in each iteration. Although this
algorithm ensures that the objective function decreases over the
iterations, unlike the Fisher Scoring algorithm, the final $S_k$ in the
iterations does not provide a good estimate of the Hessian matrix
(see Lee & Jennrich, 1979).
The procedure for minimizing (3.13) for the sub-model with a
polytomous - polytomous pair has been developed by Poon and Lee (1987),
and a program based on the Fisher Scoring algorithm, written in FORTRAN IV
with double precision, has also been implemented. The basic step of the
Scoring algorithm at the $i$th step is given by

$$\Delta \beta_i = -\lambda \, I_i^{-1} g_i \tag{4.4}$$

where $\lambda$ is a step-size parameter which takes the first value in the
sequence $\{1, 1/2, 1/4, \ldots\}$ that reduces the function value, $g_i$ is the
gradient vector and $I_i$ is the information matrix at the $i$th
step, with $I_i = E(g_i g_i')$. Actually, we only need the first partial derivatives to
obtain the information matrix. The Fisher Scoring algorithm not only
produces the maximum likelihood estimate, but also an approximation of
its asymptotic covariance matrix and hence its standard errors in the
sub-models. So, the Fisher Scoring algorithm produces a consistent
estimate of $H_{ab}(\beta_{ab})$ for each polytomous - polytomous sub-model in
Chapter 3.
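To make the scoring step with step-halving concrete, here is a small Python sketch for a one-parameter problem. It is an assumption-laden illustration, not the Poon and Lee program: the parameter is scalar, so the information matrix reduces to a number, and the Bernoulli logit model and data counts are hypothetical.

```python
import math

def fisher_scoring_1d(nll, grad, info, theta0, eps=1e-6, max_iter=100):
    """Scalar Fisher scoring with step sizes 1, 1/2, 1/4, ... (sketch)."""
    theta = theta0
    for _ in range(max_iter):
        g = grad(theta)
        if abs(g) < eps:
            break
        step = -g / info(theta)          # -I^{-1} g for a scalar parameter
        lam = 1.0
        # Take the first step size in {1, 1/2, 1/4, ...} that lowers nll.
        while nll(theta + lam * step) >= nll(theta) and lam > 1e-10:
            lam *= 0.5
        theta += lam * step
    # The inverse information approximates the asymptotic variance.
    return theta, 1.0 / info(theta)

# Hypothetical data: 7 successes in 10 Bernoulli trials, theta = logit(p).
n_trials, n_success = 10, 7
prob = lambda t: 1.0 / (1.0 + math.exp(-t))
nll = lambda t: -(n_success * t - n_trials * math.log(1.0 + math.exp(t)))
grad = lambda t: n_trials * prob(t) - n_success
info = lambda t: n_trials * prob(t) * (1.0 - prob(t))
theta_hat, var_hat = fisher_scoring_1d(nll, grad, info, theta0=0.0)
```

The second return value shows why the scoring algorithm yields standard errors as a by-product: the information evaluated at the final iterate is already available when the iterations stop.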
As we can see, in both the DFP algorithm and the Fisher
Scoring algorithm, only the first partial derivatives with respect to
the parameters are needed in the iterations, and these derivatives have
been derived in previous chapters.
In general, both of the algorithms are robust to the starting
value of the parameter vector. However, a good starting value
reduces the time to convergence. Hence, we use the 'sample' correlation
based on the truncated or the polytomous data in each sub-model as
the starting value for the correlation parameter. For those
sub-models which involve the truncated continuous variable, we replace
the missing value by the truncation point value to calculate the
starting point. This approach uses all the data in the calculation of
the starting value. Although this starting point may be biased,
our experience in the simulation study shows that it is a good starting
value, since the procedure converges quickly to the solution. For the
starting values of the thresholds in the sub-models, we use the inverse
of the standard normal distribution function evaluated at the cumulative cell
proportions of the polytomous variable.
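The threshold starting values described above can be sketched in a few lines of Python using only the standard library. The function name and the example counts are hypothetical; the sketch assumes every cumulative proportion lies strictly between 0 and 1.

```python
from statistics import NormalDist

def starting_thresholds(counts):
    """Inverse standard normal CDF at the cumulative cell proportions.

    counts : observed frequencies of the ordered categories
    Returns one finite threshold per category boundary.
    """
    total = sum(counts)
    cumulative = 0
    thresholds = []
    for c in counts[:-1]:               # the last boundary is +infinity
        cumulative += c
        # inv_cdf requires a probability strictly between 0 and 1.
        thresholds.append(NormalDist().inv_cdf(cumulative / total))
    return thresholds

# Hypothetical counts with proportions 20%, 30% and 50%.
start = starting_thresholds([20, 30, 50])
```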
Furthermore, the program is said to have converged, and the
iterative procedure stops, if the root mean square of the gradient
vector is less than a pre-assigned small number, say $\epsilon$.
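A minimal sketch of this stopping rule in Python is given below. The function name is hypothetical, and the default tolerance of 0.0005 is the value the simulation study reports using.

```python
import math

def has_converged(gradient, eps=0.0005):
    """True when the root mean square of the gradient falls below eps."""
    rms = math.sqrt(sum(g * g for g in gradient) / len(gradient))
    return rms < eps
```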
4.2 Simulation study
Based on the algorithms discussed in the previous section, a
computer program written in FORTRAN IV with double precision has been
implemented to obtain the Partition Maximum Likelihood (PML) estimates
associated with the model discussed in Chapter 3.
To study the behaviour of the estimates, different situations,
including different sample sizes, different correlation matrices
and different truncation points of the continuous variable, are used in
the simulation study.
The study is based on simulated data drawn from a multivariate
normal distribution in which the dimensions of $X$ and $Y$ are one and three
respectively. The mean vector $\mu$ of the distribution is chosen to be the
zero vector, and the correlation matrix $P$ is taken as follows,
(I) Small correlations between variables:

$$P = \begin{bmatrix} 1.0 & & & \\ \rho_1 & 1.0 & & \\ \rho_2 & \rho_{12} & 1.0 & \\ \rho_3 & \rho_{13} & \rho_{23} & 1.0 \end{bmatrix} = \begin{bmatrix} 1.0 & & & \\ 0.1 & 1.0 & & \\ 0.1 & 0.1 & 1.0 & \\ 0.1 & 0.1 & 0.1 & 1.0 \end{bmatrix}$$

(II) Reasonably large correlations between variables:

$$P = \begin{bmatrix} 1.0 & & & \\ \rho_1 & 1.0 & & \\ \rho_2 & \rho_{12} & 1.0 & \\ \rho_3 & \rho_{13} & \rho_{23} & 1.0 \end{bmatrix} = \begin{bmatrix} 1.0 & & & \\ 0.5 & 1.0 & & \\ 0.5 & 0.5 & 1.0 & \\ 0.5 & 0.5 & 0.5 & 1.0 \end{bmatrix}$$
Moreover, the known truncation points of $X$ are taken as:
(A) c=1.2816, the 90th percentile of a standard normal
distribution, which gives about 10% of truncated data.
(B) c=0.0000, the 50th percentile of a standard normal
distribution, which gives about 50% of truncated data.
For the three polytomous variables $Z_1$, $Z_2$ and $Z_3$, we assume
each of them has three categories and we choose different kinds of
threshold values for them. For variable $Z_1$, we consider a symmetric
distribution with approximately equal amounts of data in the categories,
which means about one-third for each category. For the variables $Z_2$ and
$Z_3$, we consider asymmetric distributions which are skewed in opposite
directions. The ratios of the data in the categories of $Z_2$ are roughly 20%,
30% and 50%, while 50%, 30% and 20% are roughly the ratios for $Z_3$.
Finally, the threshold values of the polytomous variables are given as
follows:

$\alpha_1 = \{ \alpha_{1,1} = -\infty,\ \alpha_{1,2} = -0.5,\ \alpha_{1,3} = 0.5,\ \alpha_{1,4} = +\infty \}$
$\alpha_2 = \{ \alpha_{2,1} = -\infty,\ \alpha_{2,2} = -0.8,\ \alpha_{2,3} = 0.0,\ \alpha_{2,4} = +\infty \}$
$\alpha_3 = \{ \alpha_{3,1} = -\infty,\ \alpha_{3,2} = 0.0,\ \alpha_{3,3} = 0.8,\ \alpha_{3,4} = +\infty \}$.
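As a quick check on the stated category ratios, the following Python sketch computes the cell proportions implied by a set of finite thresholds under a standard normal variable. The function name is hypothetical; the threshold values are those used in the study.

```python
from statistics import NormalDist

def cell_proportions(inner_thresholds):
    """Category probabilities implied by the finite thresholds."""
    phi = NormalDist().cdf
    cuts = [0.0] + [phi(t) for t in inner_thresholds] + [1.0]
    return [hi - lo for lo, hi in zip(cuts, cuts[1:])]

props_z1 = cell_proportions([-0.5, 0.5])   # symmetric thresholds
props_z2 = cell_proportions([-0.8, 0.0])   # skewed one way
props_z3 = cell_proportions([0.0, 0.8])    # skewed the other way
```

Under these assumptions the proportions come out at about 31%, 38%, 31% for the symmetric case and about 21%, 29%, 50% (or the reverse) for the asymmetric cases, in line with the rough ratios quoted in the text.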
In addition, five different sample sizes are considered:
(1) n=50, (2) n=100, (3) n=200, (4) n=400 and (5) n=800.
With two correlation matrices, two truncation
points, one set of threshold vectors and five sample sizes,
there are twenty different combinations in total. For each combination,
50 replications are performed, where the multivariate normal variates are
generated by the subroutine DRNMVN of IMSL (1975) with the specified
mean vector and correlation matrix. The simulated continuous random
vector $Y = (Y_1, Y_2, Y_3)'$ is transformed to the polytomous vector
$Z = (Z_1, Z_2, Z_3)'$ based on the pre-assigned threshold values. Then the
parameters are estimated by our PML method. The convergence criterion $\epsilon$
is taken to be 0.0005. The simulated results concerning the
correlation and threshold estimates are reported in Table I.A.1
to Table II.B.5. (Note that Table I.A.1 refers to the simulation study
with correlation matrix (I), truncation point (A) and sample size (1), and
so on.)
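The data-generating step of one replication might be sketched in Python as below. This is not the IMSL-based program used in the thesis: NumPy's generator stands in for DRNMVN, values of $X$ above the truncation point are marked as missing (an assumption consistent with the 90th percentile giving about 10% truncated data), and all names are hypothetical.

```python
import numpy as np

def simulate_sample(n, rho, c, thresholds, seed=0):
    """Draw (X, Y) from an equicorrelated normal, truncate X, polytomize Y."""
    rng = np.random.default_rng(seed)
    dim = 1 + len(thresholds)
    corr = np.full((dim, dim), rho)
    np.fill_diagonal(corr, 1.0)
    draws = rng.multivariate_normal(np.zeros(dim), corr, size=n)
    x = draws[:, 0].copy()
    x[x > c] = np.nan                 # values beyond c are not observed
    # Category index 1, 2, 3 according to which thresholds each Y exceeds.
    z = np.column_stack([
        np.searchsorted(np.asarray(t), draws[:, j + 1]) + 1
        for j, t in enumerate(thresholds)
    ])
    return x, z

x_sim, z_sim = simulate_sample(
    n=20000, rho=0.5, c=1.2816,
    thresholds=[[-0.5, 0.5], [-0.8, 0.0], [0.0, 0.8]], seed=1)
```

A large `n` is used here only so that the truncation rate and category proportions are easy to check; the study itself uses sample sizes from 50 to 800.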
In each of the tables, the following statistics are reported.
(i) The mean values of the estimates:

$$\bar{\hat{\theta}}_i = \frac{1}{50} \sum_{k=1}^{50} \hat{\theta}_i^{(k)}$$

where $\hat{\theta}_i^{(k)}$ represents the $i$th element of the estimated
parameter vector in the $k$th replication.
(ii) The root mean square errors:

$$\mathrm{RMSE}_i = \left\{ \frac{1}{50} \sum_{k=1}^{50} \left( \hat{\theta}_i^{(k)} - \theta_i \right)^2 \right\}^{1/2}$$

where $\theta_i$ represents the $i$th element of the true parameter
vector.
(iii) The sample standard errors of the estimates:

$$\mathrm{S.D.}_i = \left\{ \frac{1}{49} \sum_{k=1}^{50} \left( \hat{\theta}_i^{(k)} - \bar{\hat{\theta}}_i \right)^2 \right\}^{1/2}.$$

(iv) The average of estimated standard errors of the estimates:

$$\mathrm{S.E.}_i = \frac{1}{50} \sum_{k=1}^{50} \left( \text{estimated standard error of } \hat{\theta}_i^{(k)} \right).$$

(v) The ratio of the sample standard errors to the average of
estimated standard errors of the estimates:

$$R_i = \frac{\mathrm{S.D.}_i}{\mathrm{S.E.}_i}.$$

We would expect that $\mathrm{S.D.}_i$ is close to $\mathrm{S.E.}_i$ and thus the
ratio $R_i$ would be close to one.
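The five reported statistics can be computed from the replication results with a short Python sketch such as the one below. The function name and the three-replication example are hypothetical, and the sample standard error is assumed to use the usual $n - 1$ divisor.

```python
import math
from statistics import mean, stdev

def summarize(estimates, std_errors, true_value):
    """EST., RMSE, S.D., S.E. and RATIO for one parameter."""
    est_mean = mean(estimates)
    rmse = math.sqrt(mean([(e - true_value) ** 2 for e in estimates]))
    sd = stdev(estimates)             # sample standard deviation (n - 1 divisor)
    se = mean(std_errors)             # average of the estimated standard errors
    return {"EST": est_mean, "RMSE": rmse, "SD": sd, "SE": se, "RATIO": sd / se}

# Made-up values for three replications (illustration only).
summary = summarize([0.1, 0.2, 0.3], [0.1, 0.1, 0.1], true_value=0.2)
```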
From the tables, the following phenomena are observed.
(1) The mean values of the estimates are very close to the true
parameter values, and the root mean square errors (RMSE), the sample
standard errors (S.D.) and the estimated standard errors (S.E.) are
reasonably small in all situations, especially when the sample size is
large.
(2) Comparing the tables with different sample sizes shows that, as
expected, when the sample size increases, the RMSE, S.D. and S.E. of all
the estimates decrease and the estimates are much more accurate in all
situations.
(3) Comparing the tables with different truncation points of
the continuous variable shows that when the truncation point
increases, which means less data are truncated, the RMSE, S.D. and S.E.
of the estimates of the polyserial
correlations are smaller due to the gain of information about the
continuous data. However, the estimates of the
polychoric correlations do not change, since under the Partition Maximum
Likelihood method they are not affected by the truncation point.
(4) Comparing the tables with different correlation matrices shows
that when the population correlations increase, the RMSE,
S.D. and S.E. of all the estimates decrease, since higher correlations
give more information between variables.
(5) In the 10% truncated case, the estimates of the polyserial
correlations are better than those of the polychoric correlations in
terms of their RMSE, S.D. and S.E. However, this phenomenon vanishes
in the 50% truncated case, which shows no significant difference between
the two types of correlation estimates.
(6) In all situations, the estimates of the correlations, either
polyserial or polychoric, are better than the estimates of the
thresholds.
(7) Within the estimates of the thresholds, we can see that those
estimates involving the polytomous variable with symmetric thresholds
(i.e. $Y_1$) are better than the other estimates involving the
polytomous variables with asymmetric thresholds (i.e. $Y_2$ or $Y_3$). That
means in all situations, $\rho_1$ has smaller RMSE, S.D. and S.E. than $\rho_2$ or
$\rho_3$. Also, $\rho_{12}$ and $\rho_{13}$ have smaller RMSE, S.D. and S.E. than $\rho_{23}$.
(8) In all situations, the Ratios fall into the range (0.8, 1.2).
This indicates that the estimates of the standard errors are acceptable.
Chapter 5. Summary and Conclusion
In this thesis, we develop a method for estimating the
correlations between truncated continuous and polytomous
variables. By using the method of Partition Maximum Likelihood (PML)
estimation proposed by Poon and Lee (1987), the (p+1)-dimensional model
is divided into p(p+1)/2 bivariate sub-models, which can be classified into
two different kinds. The first kind involves one truncated continuous
variable and one polytomous variable, while the second kind involves
two polytomous variables. The likelihood functions of these
sub-models have been found and the estimates of the parameters are
obtained through the modified Davidon-Fletcher-Powell (DFP) algorithm.
It follows from statistical theory that these maximum likelihood
estimates have nice asymptotic properties. The asymptotic results are
also reported. Based on the results of our simulation study, we observe
that the estimates are very accurate under various conditions, including
different correlation matrices, truncation proportions and sample sizes.
We can also see that the results are still good even when the
sample size is as small as 50 and the proportion of truncation is as large
as 50%.
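To illustrate the partition into p(p+1)/2 bivariate sub-models, the following Python sketch enumerates them for a given p. The labels "X" and "Z1", "Z2", ... and the function name are hypothetical conveniences.

```python
from itertools import combinations

def bivariate_submodels(p):
    """List the sub-models: p polyserial pairs, then p(p-1)/2 polychoric pairs."""
    polyserial = [("X", f"Z{a}") for a in range(1, p + 1)]
    # Pairs (Z_a, Z_b) with a > b, matching the indexing used for rho_ab.
    polychoric = [(f"Z{a}", f"Z{b}")
                  for b, a in combinations(range(1, p + 1), 2)]
    return polyserial + polychoric

submodels = bivariate_submodels(3)
```

For p = 3, as in the simulation study, this gives three truncated continuous - polytomous pairs and three polytomous - polytomous pairs.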
Certainly, this thesis gives only a starting point for the
problem; there are still many practical problems that need to
be studied. The simplest extension is to consider a continuous
variable with double truncation, that is, truncated on both sides. It is
believed that similar procedures can be applied and similar results will
be obtained. We can also consider extending the truncated
continuous variable to the multi-dimensional case. Based on ideas similar
to those in this thesis, we believe that new results on these topics can
be achieved in future.
Table I.A.1
( n = 50 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.137   0.148   0.145   0.164   0.884
ρ_2          0.100    0.095   0.177   0.179   0.168   1.067
ρ_3          0.100    0.093   0.171   0.173   0.164   1.055
ρ_12         0.100    0.052   0.167   0.161   0.180   0.895
ρ_13         0.100    0.142   0.182   0.178   0.175   1.018
ρ_23         0.100    0.081   0.181   0.182   0.180   1.008
α_1,2       -0.500   -0.516   0.189   0.190   0.187   1.017
α_1,3        0.500    0.530   0.209   0.209   0.188   1.109
α_2,2       -0.800   -0.782   0.186   0.188   0.200   0.938
α_2,3        0.000    0.023   0.193   0.193   0.178   1.087
α_3,2        0.000   -0.022   0.192   0.192   0.177   1.083
α_3,3        0.800    0.825   0.216   0.216   0.203   1.066
TRUE = TRUE PARAMETER VALUE
EST. = MEAN OF ESTIMATES
RMSE = ROOT MEAN SQUARE ERROR
S.D. = SAMPLE STANDARD ERROR
S.E. = MEAN OF ESTIMATED STANDARD ERROR
RATIO = RATIO OF S.D. TO S.E.
Table I.A.2
( n = 100 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.114   0.127   0.128   0.110   1.161
ρ_2          0.100    0.105   0.111   0.112   0.115   0.969
ρ_3          0.100    0.110   0.132   0.133   0.117   1.133
ρ_12         0.100    0.135   0.131   0.128   0.126   1.104
ρ_13         0.100    0.090   0.123   0.124   0.127   0.973
ρ_23         0.100    0.109   0.129   0.130   0.130   1.001
α_1,2       -0.500   -0.495   0.139   0.140   0.131   1.070
α_1,3        0.500    0.485   0.135   0.136   0.131   1.036
α_2,2       -0.800   -0.794   0.115   0.117   0.141   0.827
α_2,3        0.000   -0.011   0.131   0.132   0.126   1.051
α_3,2        0.000    0.002   0.145   0.147   0.125   1.172
α_3,3        0.800    0.836   0.154   0.151   0.143   1.051
Table I.A.3
( n = 200 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.102   0.078   0.079   0.079   0.999
ρ_2          0.100    0.096   0.090   0.090   0.081   1.109
ρ_3          0.100    0.095   0.073   0.074   0.081   0.905
ρ_12         0.100    0.113   0.083   0.083   0.089   0.923
ρ_13         0.100    0.109   0.078   0.078   0.090   0.868
ρ_23         0.100    0.103   0.102   0.103   0.093   1.108
α_1,2       -0.500   -0.491   0.084   0.085   0.093   0.916
α_1,3        0.500    0.515   0.113   0.113   0.093   1.216
α_2,2       -0.800   -0.803   0.114   0.115   0.100   1.151
α_2,3        0.000   -0.014   0.094   0.094   0.089   1.062
α_3,2        0.000   -0.014   0.086   0.086   0.088   0.971
α_3,3        0.800    0.781   0.100   0.099   0.099   0.999
Table I.A.4
( n = 400 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.112   0.049   0.048   0.055   0.865
ρ_2          0.100    0.100   0.053   0.054   0.057   0.941
ρ_3          0.100    0.087   0.055   0.054   0.057   0.934
ρ_12         0.100    0.093   0.058   0.058   0.064   0.918
ρ_13         0.100    0.105   0.076   0.076   0.063   1.201
ρ_23         0.100    0.095   0.064   0.065   0.065   0.986
α_1,2       -0.500   -0.499   0.067   0.068   0.065   1.035
α_1,3        0.500    0.506   0.072   0.072   0.066   1.101
α_2,2       -0.800   -0.781   0.074   0.073   0.070   1.039
α_2,3        0.000    0.019   0.070   0.068   0.063   1.085
α_3,2        0.000   -0.002   0.063   0.063   0.063   1.011
α_3,3        0.800    0.796   0.074   0.074   0.071   1.056
Table I.A.5
( n = 800 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.098   0.031   0.032   0.039   0.805
ρ_2          0.100    0.095   0.044   0.044   0.040   1.101
ρ_3          0.100    0.094   0.044   0.044   0.041   1.079
ρ_12         0.100    0.110   0.043   0.042   0.045   0.931
ρ_13         0.100    0.102   0.049   0.050   0.045   1.110
ρ_23         0.100    0.096   0.043   0.044   0.046   0.941
α_1,2       -0.500   -0.506   0.040   0.040   0.046   0.861
α_1,3        0.500    0.488   0.050   0.049   0.046   1.056
α_2,2       -0.800   -0.798   0.037   0.037   0.050   0.748
α_2,3        0.000    0.002   0.041   0.041   0.044   0.930
α_3,2        0.000    0.001   0.046   0.046   0.044   1.044
α_3,3        0.800    0.803   0.059   0.060   0.050   1.192
Table I.B.1
( n = 50 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.136   0.167   0.164   0.179   0.917
ρ_2          0.100    0.099   0.204   0.206   0.182   1.133
ρ_3          0.100    0.083   0.167   0.167   0.186   0.902
ρ_12         0.100    0.052   0.167   0.161   0.180   0.895
ρ_13         0.100    0.142   0.182   0.178   0.175   1.018
ρ_23         0.100    0.081   0.181   0.182   0.180   1.008
α_1,2       -0.500   -0.517   0.189   0.190   0.187   1.018
α_1,3        0.500    0.530   0.210   0.210   0.189   1.109
α_2,2       -0.800   -0.782   0.186   0.187   0.199   0.938
α_2,3        0.000    0.022   0.192   0.193   0.179   1.078
α_3,2        0.000   -0.022   0.191   0.192   0.178   1.079
α_3,3        0.800    0.826   0.217   0.218   0.204   1.069
Table I.B.2
( n = 100 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.117   0.141   0.142   0.120   1.179
ρ_2          0.100    0.107   0.118   0.119   0.127   0.936
ρ_3          0.100    0.112   0.151   0.152   0.127   1.197
ρ_12         0.100    0.135   0.131   0.128   0.126   1.014
ρ_13         0.100    0.090   0.123   0.124   0.127   0.973
ρ_23         0.100    0.109   0.129   0.130   0.130   1.001
α_1,2       -0.500   -0.494   0.139   0.141   0.131   1.074
α_1,3        0.500    0.485   0.136   0.137   0.132   1.037
α_2,2       -0.800   -0.794   0.116   0.117   0.141   0.830
α_2,3        0.000   -0.011   0.131   0.132   0.126   1.044
α_3,2        0.000    0.002   0.145   0.147   0.125   1.173
α_3,3        0.800    0.836   0.154   0.151   0.144   1.049
Table I.B.3
( n = 200 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.097   0.089   0.090   0.086   1.051
ρ_2          0.100    0.091   0.095   0.095   0.088   1.086
ρ_3          0.100    0.096   0.080   0.080   0.089   0.907
ρ_12         0.100    0.113   0.083   0.083   0.089   0.923
ρ_13         0.100    0.109   0.078   0.078   0.090   0.868
ρ_23         0.100    0.103   0.102   0.103   0.093   1.108
α_1,2       -0.500   -0.491   0.085   0.085   0.092   0.922
α_1,3        0.500    0.516   0.113   0.113   0.094   1.212
α_2,2       -0.800   -0.803   0.114   0.115   0.100   1.149
α_2,3        0.000   -0.014   0.094   0.094   0.089   1.057
α_3,2        0.000   -0.014   0.086   0.086   0.088   0.975
α_3,3        0.800    0.781   0.101   0.100   0.100   1.002
Table I.B.4
( n = 400 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.109   0.051   0.051   0.061   0.837
ρ_2          0.100    0.100   0.054   0.055   0.062   0.879
ρ_3          0.100    0.094   0.058   0.058   0.063   0.923
ρ_12         0.100    0.093   0.058   0.058   0.064   0.918
ρ_13         0.100    0.105   0.076   0.076   0.063   1.201
ρ_23         0.100    0.095   0.064   0.065   0.065   0.986
α_1,2       -0.500   -0.499   0.067   0.068   0.065   1.035
α_1,3        0.500    0.506   0.072   0.073   0.066   1.101
α_2,2       -0.800   -0.782   0.074   0.073   0.070   1.039
α_2,3        0.000    0.019   0.070   0.068   0.063   1.079
α_3,2        0.000   -0.002   0.063   0.063   0.062   1.015
α_3,3        0.800    0.796   0.074   0.075   0.071   1.060
Table I.B.5
( n = 800 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.100    0.099   0.038   0.038   0.043   0.884
ρ_2          0.100    0.099   0.049   0.049   0.044   1.109
ρ_3          0.100    0.097   0.047   0.048   0.045   1.069
ρ_12         0.100    0.110   0.043   0.042   0.045   0.931
ρ_13         0.100    0.102   0.049   0.050   0.045   1.110
ρ_23         0.100    0.096   0.043   0.044   0.046   0.941
α_1,2       -0.500   -0.506   0.040   0.040   0.046   0.861
α_1,3        0.500    0.488   0.050   0.049   0.046   1.057
α_2,2       -0.800   -0.798   0.037   0.037   0.050   0.750
α_2,3        0.000    0.002   0.041   0.041   0.045   0.930
α_3,2        0.000    0.001   0.046   0.046   0.044   1.047
α_3,3        0.800    0.803   0.059   0.060   0.050   1.192
Table II.A.1
( n = 50 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.537   0.108   0.102   0.121   0.843
ρ_2          0.500    0.508   0.115   0.115   0.128   0.900
ρ_3          0.500    0.519   0.137   0.137   0.127   1.079
ρ_12         0.500    0.467   0.145   0.142   0.146   0.979
ρ_13         0.500    0.524   0.134   0.133   0.139   0.960
ρ_23         0.500    0.496   0.145   0.146   0.146   1.002
α_1,2       -0.500   -0.514   0.178   0.179   0.178   1.003
α_1,3        0.500    0.531   0.190   0.190   0.183   1.040
α_2,2       -0.800   -0.797   0.197   0.199   0.194   1.024
α_2,3        0.000    0.061   0.188   0.180   0.172   1.048
α_3,2        0.000   -0.010   0.176   0.178   0.169   1.049
α_3,3        0.800    0.814   0.222   0.223   0.198   1.127
Table II.A.2
( n = 100 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.509   0.083   0.083   0.082   1.011
ρ_2          0.500    0.497   0.091   0.092   0.088   1.044
ρ_3          0.500    0.494   0.085   0.085   0.090   0.943
ρ_12         0.500    0.520   0.111   0.110   0.100   1.103
ρ_13         0.500    0.492   0.094   0.094   0.103   0.916
ρ_23         0.500    0.497   0.101   0.102   0.107   0.948
α_1,2       -0.500   -0.494   0.125   0.127   0.126   1.008
α_1,3        0.500    0.487   0.128   0.128   0.127   1.006
α_2,2       -0.800   -0.801   0.122   0.123   0.136   0.903
α_2,3        0.000   -0.013   0.111   0.111   0.121   0.920
α_3,2        0.000   -0.012   0.140   0.141   0.120   1.175
α_3,3        0.800    0.825   0.143   0.142   0.140   1.016
Table II.A.3
( n = 200 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.506   0.065   0.065   0.059   1.103
ρ_2          0.500    0.496   0.065   0.066   0.063   1.049
ρ_3          0.500    0.509   0.059   0.059   0.061   0.961
ρ_12         0.500    0.507   0.072   0.073   0.071   1.023
ρ_13         0.500    0.510   0.066   0.066   0.071   0.931
ρ_23         0.500    0.508   0.073   0.074   0.075   0.984
α_1,2       -0.500   -0.496   0.076   0.076   0.088   0.861
α_1,3        0.500    0.511   0.105   0.106   0.091   1.168
α_2,2       -0.800   -0.799   0.090   0.091   0.096   0.945
α_2,3        0.000   -0.017   0.093   0.092   0.085   1.081
α_3,2        0.000   -0.020   0.082   0.081   0.084   0.954
α_3,3        0.800    0.777   0.100   0.098   0.097   1.010
Table II.A.4
( n = 400 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.501   0.035   0.035   0.042   0.836
ρ_2          0.500    0.495   0.039   0.039   0.044   0.904
ρ_3          0.500    0.493   0.043   0.043   0.044   0.962
ρ_12         0.500    0.491   0.042   0.041   0.051   0.801
ρ_13         0.500    0.500   0.047   0.048   0.051   0.933
ρ_23         0.500    0.495   0.052   0.052   0.053   0.982
α_1,2       -0.500   -0.500   0.062   0.063   0.063   0.998
α_1,3        0.500    0.506   0.066   0.067   0.064   1.041
α_2,2       -0.800   -0.790   0.065   0.065   0.068   0.964
α_2,3        0.000    0.016   0.063   0.061   0.060   1.013
α_3,2        0.000   -0.001   0.062   0.063   0.060   1.049
α_3,3        0.800    0.795   0.072   0.072   0.069   1.047
Table II.A.5
( n = 800 )
( c = 1.2816 ) (10% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.497   0.029   0.029   0.030   0.964
ρ_2          0.500    0.496   0.031   0.030   0.031   0.999
ρ_3          0.500    0.498   0.034   0.035   0.031   1.121
ρ_12         0.500    0.509   0.032   0.031   0.036   0.853
ρ_13         0.500    0.504   0.039   0.039   0.036   1.085
ρ_23         0.500    0.505   0.033   0.033   0.037   0.892
α_1,2       -0.500   -0.505   0.037   0.037   0.044   0.840
α_1,3        0.500    0.490   0.047   0.046   0.045   1.025
α_2,2       -0.800   -0.796   0.043   0.044   0.048   0.912
α_2,3        0.000   -0.005   0.039   0.039   0.043   0.918
α_3,2        0.000   -0.007   0.044   0.044   0.042   1.049
α_3,3        0.800    0.801   0.051   0.052   0.049   1.062
Table II.B.1
( n = 50 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.520   0.131   0.131   0.137   0.952
ρ_2          0.500    0.507   0.123   0.124   0.141   0.882
ρ_3          0.500    0.498   0.145   0.147   0.149   0.984
ρ_12         0.500    0.467   0.145   0.142   0.146   0.979
ρ_13         0.500    0.524   0.134   0.133   0.139   0.960
ρ_23         0.500    0.496   0.145   0.146   0.146   1.002
α_1,2       -0.500   -0.513   0.180   0.182   0.179   1.016
α_1,3        0.500    0.532   0.198   0.198   0.189   1.046
α_2,2       -0.800   -0.797   0.199   0.201   0.194   1.037
α_2,3        0.000    0.061   0.193   0.185   0.178   1.036
α_3,2        0.000   -0.009   0.180   0.181   0.170   1.066
α_3,3        0.800    0.817   0.229   0.231   0.203   1.138
Table II.B.2
( n = 100 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.518   0.093   0.092   0.093   0.989
ρ_2          0.500    0.501   0.096   0.097   0.099   0.982
ρ_3          0.500    0.497   0.102   0.103   0.103   0.995
ρ_12         0.500    0.520   0.111   0.110   0.100   1.103
ρ_13         0.500    0.492   0.094   0.094   0.103   0.916
ρ_23         0.500    0.497   0.101   0.102   0.107   0.948
α_1,2       -0.500   -0.497   0.127   0.128   0.125   1.022
α_1,3        0.500    0.483   0.130   0.131   0.131   0.993
α_2,2       -0.800   -0.803   0.124   0.125   0.136   0.919
α_2,3        0.000   -0.016   0.113   0.113   0.125   0.902
α_3,2        0.000   -0.016   0.140   0.140   0.120   1.168
α_3,3        0.800    0.820   0.144   0.144   0.143   1.013
Table II.B.3
( n = 200 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.500   0.071   0.072   0.067   1.079
ρ_2          0.500    0.491   0.070   0.070   0.069   1.008
ρ_3          0.500    0.508   0.060   0.060   0.069   0.873
ρ_12         0.500    0.507   0.072   0.073   0.071   1.023
ρ_13         0.500    0.510   0.066   0.066   0.071   0.931
ρ_23         0.500    0.508   0.073   0.074   0.075   0.984
α_1,2       -0.500   -0.495   0.077   0.077   0.089   0.871
α_1,3        0.500    0.512   0.106   0.106   0.093   1.137
α_2,2       -0.800   -0.798   0.090   0.091   0.096   0.950
α_2,3        0.000   -0.016   0.094   0.094   0.089   1.060
α_3,2        0.000   -0.019   0.083   0.081   0.085   0.962
α_3,3        0.800    0.779   0.101   0.100   0.099   1.014
Table II.B.4
( n = 400 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.    RMSE    S.D.    S.E.   RATIO
ρ_1          0.500    0.499   0.039   0.039   0.047   0.824
ρ_2          0.500    0.493   0.046   0.046   0.048   0.949
ρ_3          0.500    0.500   0.048   0.048   0.050   0.969
ρ_12         0.500    0.491   0.042   0.041   0.051   0.801
ρ_13         0.500    0.500   0.047   0.048   0.051   0.933
ρ_23         0.500    0.495   0.052   0.052   0.053   0.982
α_1,2       -0.500   -0.500   0.062   0.063   0.063   1.005
α_1,3        0.500    0.506   0.067   0.068   0.066   1.029
α_2,2       -0.800   -0.791   0.065   0.065   0.068   0.962
α_2,3        0.000    0.016   0.062   0.061   0.063   0.971
α_3,2        0.000   -0.001   0.062   0.063   0.060   1.054
α_3,3        0.800    0.795   0.073   0.074   0.070   1.053
Table II.B.5
( n = 800 )
( c = 0.0000 ) (50% truncated)
Parameter     TRUE     EST.     RMSE     S.D.     S.E.    RATIO
ρ            0.500    0.500    0.030    0.031    0.033    0.914
ρ            0.500    0.498    0.032    0.032    0.034    0.928
ρ            0.500    0.499    0.039    0.039    0.035    1.110
ρ            0.500    0.509    0.032    0.031    0.036    0.853
ρ            0.500    0.504    0.039    0.039    0.036    1.085
ρ            0.500    0.505    0.033    0.033    0.037    0.892
α_{1,2}     -0.500   -0.506    0.037    0.037    0.044    0.835
α_{1,3}      0.500    0.488    0.047    0.046    0.046    1.000
α_{2,2}     -0.800   -0.797    0.043    0.044    0.048    0.911
α_{2,3}      0.000   -0.006    0.040    0.040    0.044    0.899
α_{3,2}      0.000   -0.008    0.045    0.045    0.042    1.066
α_{3,3}      0.800    0.799    0.053    0.053    0.050    1.078
TRUE = TRUE PARAMETER VALUE
EST. = MEAN OF ESTIMATES
RMSE = ROOT MEAN SQUARE ERROR
S.D. = SAMPLE STANDARD ERROR
S.E. = MEAN OF ESTIMATED STANDARD ERROR
RATIO = RATIO OF S.D. TO S.E.
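
The summary columns defined in the table legends can be illustrated with a short Python sketch. It is not taken from the original study: the function name summarize_replications, the toy input, and the use of the n - 1 divisor for S.D. are all assumptions made for illustration only.

```python
import math
import statistics


def summarize_replications(true_value, estimates, std_errors):
    """Return the six summary columns for one parameter.

    estimates  : one point estimate per simulated replication
    std_errors : the estimated standard error reported in each replication
    (Hypothetical helper; the divisor used for S.D. is an assumption.)
    """
    est = statistics.fmean(estimates)                    # EST.: mean of estimates
    rmse = math.sqrt(
        statistics.fmean([(e - true_value) ** 2 for e in estimates])
    )                                                    # RMSE: root mean square error
    sd = statistics.stdev(estimates)                     # S.D.: spread of the estimates (n - 1 divisor)
    se = statistics.fmean(std_errors)                    # S.E.: mean of estimated standard errors
    return {
        "TRUE": true_value, "EST.": est, "RMSE": rmse,
        "S.D.": sd, "S.E.": se, "RATIO": sd / se,
    }


# Toy input: three made-up replications, not data from the thesis.
row = summarize_replications(0.5, [0.4, 0.5, 0.6], [0.1, 0.1, 0.1])
print({name: round(value, 3) for name, value in row.items()})
```

A RATIO close to 1 means the estimated standard errors are, on average, about the same size as the empirical variability of the estimates across replications.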
References
Cohen, A. C. (1950)
Estimating the mean and variance of normal populations from singly
and doubly truncated samples. Ann. Math. Statist., 21, 557-569.
Cohen, A. C. (1955)
Censored samples from truncated normal distribution. Biometrika,
42, 516-519.
Cohen, A. C. (1957)
On the solution of estimating equations for truncated and censored
samples from normal populations. Biometrika, 44, 225-236.
Cohen, A. C. (1959)
Simplified estimators for the normal distribution when samples are
singly censored or truncated. Technometrics, 1, 217-237.
Cohen, A. C., Jr. (1961)
Tables for maximum-likelihood estimates; singly truncated and
singly censored samples. Technometrics, 3, 535-541.
Cohen, A. C. (1991)
Truncated and censored samples: theory and applications. New York:
Marcel Dekker, Inc.
Fisher, R. A. (1931)
Properties and applications of Hh functions. Introduction to
British A. A. S. Math. Tables, 1, xxvi-xxxv.
Fisher, R. A. (1936)
Statistical Methods for Research Workers, 6th ed., Oliver and Boyd.
Fletcher, R. (1979)
Practical Methods of Optimization, Vol. 1: Unconstrained
optimization. New York: John Wiley & Sons.
Galton, F. (1897)
An examination into the registered speeds of American trotting
horses with remarks on their value as hereditary data. Proc. Roy. Soc.
Lond., 62, 310-314.
Gupta, A. K. (1952)
Estimation of the mean and standard deviation of a normal
population from a censored sample. Biometrika, 39, 260-273.
Hald, A. (1949)
Maximum likelihood estimation of the parameters of a normal
distribution which is truncated at a known point.
Scandinavian Actuarial Journal, 32, 119-134.
Harter, H. L. and Moore, A. H. (1966)
Iterative maximum likelihood estimation of the parameters of normal
populations from singly and doubly censored samples. Biometrika, 53,
205-213.
IMSL Library (1975)
International Mathematical and Statistical Libraries (ed. 5).
Houston, Texas.
Johnson, N. L. & Kotz, S. (1972)
Distributions in statistics: Continuous multivariate
distributions. New York: John Wiley & Sons.
Lancaster, H. O. & Hamdan, M. A. (1964)
Estimation of the correlation coefficient in contingency tables
with possibly nonmetrical characters. Psychometrika, 29, 383-391.
Lee, S. Y. & Chiu, Y. M. (1990)
Analysis of multivariate polychoric correlation models with
incomplete data. British Journal of Mathematical and Statistical
Psychology, 43, 145-154.
Lee, S. Y. & Jennrich, R. I. (1979)
A study of algorithms for covariance structure analysis with
specific comparisons using factor analysis. Psychometrika, 44, 99-113.
Lee, S. Y. & Poon, W. Y. (1986)
Two-step estimation of multivariate polychoric correlation.
Communications in Statistics: Theory and Methods, 16, 307-320.
Lee, S. Y., Poon, W. Y. & Bentler, P. M. (1990)
A three-stage estimation procedure for structural equation models
with polytomous variables. Psychometrika, 55, 45-51.
Lee, S. Y., Poon, W. Y. & Bentler, P. M. (1992)
Structural equation models with continuous and polytomous
variables. Psychometrika, 57, 89-105.
Lee, S. Y. & Tang, M. L. (1992)
Analysis of structural equation models with incomplete polytomous
data. Communications in Statistics: Theory and Methods, 21(1), 213-232.
Luenberger, D. G. (1973)
Introduction to linear and nonlinear programming. Addison-Wesley
Pub. Co.
Martinson, E. O. & Hamdan, M. A. (1971)
Maximum likelihood and some other asymptotically efficient
estimators of correlation in two way contingency tables. Journal of
Statistical Computation and Simulation, 1, 45-54.
Olsson, U. (1979a)
On the robustness of factor analysis against crude classification
of the observations. Multivariate Behavioral Research, 14, 485-500.
Olsson, U. (1979b)
Maximum likelihood estimation of the polychoric correlation
coefficient. Psychometrika, 44, 443-460.
Pearson, K. (1901)
Mathematical contributions to the theory of evolution, VII: On the
correlation of characters not quantitatively measurable. Philosophical
Transactions of the Royal Society of London, Series A, 195, 1-47.
Pearson, K. (1902)
On the systematic fitting of frequency curves. Biometrika,
2, 2-7.
Pearson, K. & Lee, A. (1908)
On the generalized probable error in multiple normal correlation.
Biometrika, 6, 59-69.
Poon, W. Y. & Lee, S. Y. (1987)
Maximum likelihood estimation of multivariate polyserial and
polychoric correlation coefficients. Psychometrika,
52(3), 409-430.
Poon, W. Y. & Lee, S. Y. (1992)
Statistical analysis of continuous and polytomous variables in
several populations. British Journal of Mathematical and Statistical
Psychology, 45, 139-149.
Poon, W. Y. & Leung, Y. P. (1993)
Analysis of structural equation models with interval and polytomous
data. Statistics & Probability Letters, 17, 127-137.
Rao, C. R. (1973)
Linear statistical inference and its applications. New York: John
Wiley and Sons.
Schneider, H. (1986)
Truncated and censored samples from normal populations. New York:
John Wiley and Sons.
Stevens, W. L. (1937)
The truncated normal distribution. Appendix to paper by C. I. Bliss
on: The calculation of time mortality curve. Ann. Appl. Biol., 24,
815-852.
Wallace Year Book (1892-1896)
American Trotting Association, Vols. 8-12.