AD A072 595
North Carolina Univ at Chapel Hill, Inst of Statistics
A Study of Sequential Procedures for Estimating the Larger Mean
Aug 78, R. J. Carroll, AFOSR-75-2796
Unclassified


A Study of Sequential Procedures for Estimating the Larger Mean

Raymond J. Carroll

Institute of Statistics Mimeo Series

August 1978

Department of Statistics

Chapel Hill, North Carolina

Approved for public release; distribution unlimited.


REPORT DOCUMENTATION PAGE

Security classification of this page: UNCLASSIFIED

Title: A Study of Sequential Procedures for Estimating the Larger Mean

Type of report: Interim

Author: Raymond J. Carroll

Contract or grant number: AFOSR 75-2796

Performing organization name and address: University of North Carolina, Department of Statistics, Chapel Hill, North Carolina 27514

Controlling office name and address: Air Force Office of Scientific Research/NM, Bolling AFB, Washington, DC 20332

Report date: August 1978

Security class (of this report): UNCLASSIFIED

Distribution statement (of this report): Approved for public release; distribution unlimited.

Key words: Monte-Carlo, Sequential Analysis, Ranking and Selection, Estimating the Larger Mean, Elimination, Mean Square Error

Abstract: Blumenthal (1976) considered the problem of sequential estimation of the largest of k normal means when a bound is set on the acceptable mean square error. He showed that his procedure results in only a small savings in sample size when compared to a conservative fixed sample procedure for the case of known variance. Carroll (1978) criticized this procedure because it does not give the user the flexibility of sampling selectively from the k populations. Carroll (1978) defined a procedure which early in the experiment eliminates from further consideration those populations which are obviously not associated with the largest mean and hence provide little relevant information; his theoretical large-sample calculations indicate possible large savings in sample size with no corresponding increase in mean square error. In this paper we contrast the small sample behavior of the two approaches by means of a Monte-Carlo simulation study; both known and unknown variance are considered.


A Study of Sequential Procedures for Estimating the Larger Mean

by

Raymond J. Carroll*
University of North Carolina

Key Words and Phrases: Monte-Carlo, Sequential Analysis, Ranking and Selection, Estimating the Larger Mean, Elimination, Mean Square Error

* This work was supported by the Air Force Office of Scientific Research under Contract AFOSR-75-2796.


1. Introduction

Blumenthal (1976) considered the problem of sequential estimation of the largest of k normal means when a bound is set on the acceptable mean square error. He showed that his procedure results in only a small savings in sample size when compared to a conservative fixed sample procedure for the case of known variance. Carroll (1978) criticized this procedure because it does not give the user the flexibility of sampling selectively from the k populations. Carroll (1978) defined a procedure which early in the experiment eliminates from further consideration those populations which are obviously not associated with the largest mean and hence provide little relevant information; his theoretical large-sample calculations indicate possible large savings in sample size with no corresponding increase in mean square error. In this paper we contrast the small sample behavior of the two approaches by means of a Monte-Carlo simulation study; both known and unknown variance are considered.

2. Known Variance

We are dealing with independent identically distributed observations $X_{i1}, X_{i2}, \ldots$ from the $i$th population, $i = 1, 2$. These are assumed to be normally distributed with means $\mu_1$ and $\mu_2$ and common variance $\sigma^2$. The goal is to estimate the larger mean $\mu^* = \max(\mu_1, \mu_2)$ with a prespecified bound $r$ on the mean square error (MSE). The asymptotic theorems in Blumenthal (1976) and Carroll (1977) take place as $r \to 0$. If $\Delta = \max(\mu_1, \mu_2) - \min(\mu_1, \mu_2) = |\mu_1 - \mu_2|$, the mean square error for estimating $\mu^*$ by the larger sample mean based on $n$ observations can be written as

$$\mathrm{MSE} = (\sigma^2/n)\, H(n^{1/2} \Delta/\sigma).$$
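The function $H$ is not written out in this report. As a hedged illustration only: if the estimator is the larger of two sample means, each based on $n$ observations, then standardizing gives $H(\theta) = E[\max(Z_1, Z_2 - \theta)^2]$ for independent standard normals $Z_1, Z_2$, and the usual second-moment formula for the maximum of two normal variables gives a closed form. The Python sketch below evaluates that assumed form; the names `H` and `mse_larger_mean` are ours, not the paper's.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def H(theta):
    """Assumed form of H: E[max(Z1, Z2 - theta)^2], Z1 and Z2 iid N(0, 1)."""
    c = theta / math.sqrt(2.0)
    return (norm_cdf(c)
            + (1.0 + theta * theta) * norm_cdf(-c)
            - math.sqrt(2.0) * theta * norm_pdf(c))

def mse_larger_mean(n, delta, sigma=1.0):
    """MSE = (sigma^2 / n) * H(sqrt(n) * delta / sigma)."""
    return (sigma ** 2 / n) * H(math.sqrt(n) * delta / sigma)
```

Under this assumed form, $H(0) = 1$, $H(\theta)$ returns to 1 as $\theta$ grows, and $H$ dips to roughly 0.8 near $\theta \approx 0.9$, so the MSE never exceeds $\sigma^2/n$.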


In order to control the MSE at a prespecified level $r$ when $\sigma$ is known, Blumenthal (1976) defined the following stopping time.

Definition 2.1. After obtaining $m$ observations from each population, estimate $\Delta$ by $\hat\Delta_m = |\bar X_{1m} - \bar X_{2m}|$ and define

$$\tilde n(m) = \inf\{n > n_0 : (\sigma^2/n)\, H(n^{1/2} \hat\Delta_m/\sigma) < r\}$$

and define

$$N_B = \inf\{m > m_0 : \tilde n(m) < m\}.$$

Because for $k=2$ populations the risk is a decreasing function of the sample size, one can show that

$$N_B = \inf\{n > n_0 : nr/\sigma^2 > H(n^{1/2} \hat\Delta_n/\sigma)\}.$$
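In this reduced form the rule can be checked one sample size at a time. The following is a minimal Python sketch, not the author's program. It assumes the closed form $H(\theta) = E[\max(Z_1, Z_2 - \theta)^2]$ obtained by standardizing the larger of two sample means (the report does not write $H$ out) and a known $\sigma$. The names `blumenthal_stop`, `n0` and `n_max` are hypothetical, and the cap `n_max` is our addition so that the loop always terminates on finite data.

```python
import math

def H(theta):
    """Assumed form of H: E[max(Z1, Z2 - theta)^2], Z1 and Z2 iid N(0, 1)."""
    c = theta / math.sqrt(2.0)
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    phi_c = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)
    return Phi(c) + (1.0 + theta ** 2) * Phi(-c) - math.sqrt(2.0) * theta * phi_c

def blumenthal_stop(x1, x2, r, sigma=1.0, n0=1, n_max=None):
    """Smallest n > n0 with n*r/sigma^2 > H(sqrt(n) * |mean1 - mean2| / sigma).

    x1, x2 are the observation sequences from the two populations.
    If the rule never fires, the cap (n_max or the shorter length) is returned.
    """
    limit = min(len(x1), len(x2))
    if n_max is not None:
        limit = min(limit, n_max)
    s1 = s2 = 0.0
    for n in range(1, limit + 1):
        s1 += x1[n - 1]          # running sums give the sample means cheaply
        s2 += x2[n - 1]
        if n <= n0:
            continue
        delta_hat = abs(s1 - s2) / n
        if n * r / sigma ** 2 > H(math.sqrt(n) * delta_hat / sigma):
            return n
    return limit
```

Under the assumed $H$, which never exceeds 1, the rule fires no later than $n = [\sigma^2/r] + 1$. With $\sigma^2 = 1$ this gives at most $2 \times 51 = 102$ observations for $r = .02$ and $2 \times 101 = 202$ for $r = .01$, which agrees with the starred entries of Table 1.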

Carroll (1978) has shown that Blumenthal's procedure $N_B$ is inefficient in that it does not make use of all the information available in the data. In particular, it does not recognize cases when one population is obviously associated with the smaller mean. Carroll (1978) defined a procedure which attempts to recognize this situation and stop sampling (early in the experiment) from populations which provide little information about $\mu^*$. The idea is based on a technique of Swanepoel and Geertsema (1976) and can be described fully as follows. We take $\sigma^2 = 1$ throughout.

Step #1. Choose a small value $\alpha$, which is the probability of falsely eliminating the population associated with the larger mean. Letting $\Phi$ ($\phi$) be the standard normal distribution (density) function, define $b = b(\alpha)$ by

$$1 - \Phi(b) + b\phi(b) + \phi^2(b)/\Phi(b) = \alpha.$$

Step #2. Define a stopping rule

$$N_E = \inf\{n > n_0 : \hat\Delta_n > 2^{1/2} ((b^2 + \log n)/n)^{1/2}\},$$

where $\hat\Delta_n = |\bar X_{1n} - \bar X_{2n}|$ as in Definition 2.1.


Step #3. Define the stopping time $N(\alpha)$ as follows. For a given $r$, if $[\cdot]$ is the greatest integer function, we will take $N_B$ observations from each population if $N_B \le N_E$ (no elimination necessary). If $N_E < N_B$ (elimination necessary), we take

$N_E$ observations from the population with smaller mean,

$[1/r] + 2$ observations from the population with larger mean.

The total sample size is $N(\alpha)$. Note that $N(0.0) = 2N_B$, so Blumenthal's procedure can be read off from the case $\alpha = 0.0$. We chose $n_0 = [\min_x H(x)/r] - 1$.
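Steps #1 to #3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's program: `solve_b` finds $b(\alpha)$ by bisection (our choice of root finder), $\sigma = 1$ as in the text, and all function and argument names are hypothetical. `total_sample_size` only combines given values of $N_B$ and $N_E$ as described in Step #3.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def elimination_level(b):
    """Left side of Step #1: 1 - Phi(b) + b*phi(b) + phi(b)^2 / Phi(b)."""
    return 1.0 - norm_cdf(b) + b * norm_pdf(b) + norm_pdf(b) ** 2 / norm_cdf(b)

def solve_b(alpha, lo=0.0, hi=10.0, iters=200):
    """Bisection for b(alpha); the left side is decreasing in b for b > 0."""
    if not 0.0 < alpha < elimination_level(lo):
        raise ValueError("alpha must lie between 0 and about 0.818")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if elimination_level(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def eliminates(delta_hat, n, b):
    """Step #2 test at sample size n, with sigma = 1."""
    return delta_hat > math.sqrt(2.0 * (b * b + math.log(n)) / n)

def total_sample_size(n_b, n_e, r):
    """Step #3: total N(alpha); n_e=None means elimination never occurred."""
    if n_e is None or n_b <= n_e:
        return 2 * n_b
    # small tolerance guards the greatest-integer step against float rounding
    return n_e + math.floor(1.0 / r + 1e-9) + 2
```

For $\alpha = .05$ the root is $b$ of about 2.5. As a worked instance of Step #3, elimination at $N_E = 5$ with $r = .01$ gives a total of $5 + 100 + 2 = 107$ observations; for comparison, Table 1 reports an average of 107.3 at $\alpha = .05$, $\Delta = 2.00$, $r = .01$.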

In order to investigate the small sample performance of $N(\alpha)$, we conducted a Monte-Carlo experiment with 500 iterations and various choices of $\alpha$, $r$ and $\Delta$. In Tables 1-4 we record the following information.

(1) Average value of $N(\alpha)$.

(2) $N(\alpha) r$

(3) Bias

(4) Mean square error divided by $r$. This should be no more than 1 if we are to meet our goal of controlling MSE by the bound $r$.

The conclusion one can make from the information in Tables 1-4 is obvious; using elimination results in smaller (sometimes much smaller) sample sizes with no real increase in bias or mean square error.
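The four tabulated quantities are simple summaries over the Monte-Carlo replications. A minimal Python sketch, assuming each replication is recorded as a pair (total sample size, final estimate of $\mu^*$); the function name `summarize` and this record layout are hypothetical and are not taken from the report.

```python
def summarize(replications, mu_star, r):
    """Return the four table entries for one (alpha, r, Delta) cell.

    replications: list of (total_n, estimate) pairs, one per iteration.
    mu_star: the true larger mean used to generate the data.
    r: the prespecified bound on the mean square error.
    """
    m = len(replications)
    avg_n = sum(n for n, _ in replications) / m
    errors = [est - mu_star for _, est in replications]
    bias = sum(errors) / m
    mse = sum(e * e for e in errors) / m
    return {
        "avg_n": avg_n,               # (1) average value of N(alpha)
        "avg_n_times_r": avg_n * r,   # (2) N(alpha) * r
        "bias_x100": 100.0 * bias,    # (3) bias, scaled by 10^2 as in the tables
        "mse_over_r": mse / r,        # (4) should be no more than 1
    }
```

Quantity (2) is quantity (1) rescaled by $r$, which is why each entry of Table 2 equals the matching entry of Table 1 multiplied by $r$ (for example, $107.3 \times .01 \approx 1.07$).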

3. Unknown Variance

For the case that the variance is unknown, the stopping time $N_B$ changes only in that $\sigma^2$ is now estimated by

$$s^2 = (n-1)^{-1} \sum_{i=1}^{n} (X_{1i} - X_{2i} - \bar X_{1n} + \bar X_{2n})^2.$$


The stopping time $N_E$ is again suggested by Swanepoel and Geertsema (1976). For a given $\alpha$, we are going to take $n_0 > 5$. Define

$$t = .2(1 + a^2/4)^5, \qquad 1 - F_4(a) + a f_4(a) = \alpha,$$

where $F_4$ ($f_4$) is the distribution (density) function of a t distribution with four degrees of freedom. Define

$$h(\alpha, n) = ((tn)^{1/n} - 1)^{1/2}.$$

Then

$$N_E = \inf\{n > n_0 : |\bar X_{1n} - \bar X_{2n}| > h(\alpha, n)\, s\}.$$
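These quantities can be computed directly, since the t distribution with four degrees of freedom has a closed-form distribution function. The Python sketch below is illustrative only. It reads the displayed equations as "$a$ solves $1 - F_4(a) + a f_4(a) = \alpha$, and $t = .2(1 + a^2/4)^5$", it uses bisection (our choice of root finder), and it takes $s^2$ to be the sample variance of the paired differences. The names `solve_a`, `h`, `s_squared` and `eliminates_unknown_var` are hypothetical.

```python
import math

def t4_pdf(x):
    """Density of the t distribution with 4 degrees of freedom."""
    return 0.375 * (1.0 + x * x / 4.0) ** -2.5

def t4_cdf(x):
    """Distribution function of the t distribution with 4 degrees of freedom."""
    u = x / math.sqrt(1.0 + x * x / 4.0)
    return 0.5 + 0.375 * (u - u ** 3 / 12.0)

def solve_a(alpha, lo=0.0, hi=1000.0, iters=200):
    """Bisection for a in 1 - F4(a) + a*f4(a) = alpha (left side falls from 0.5)."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - t4_cdf(mid) + mid * t4_pdf(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h(alpha, n):
    """h(alpha, n) = ((t*n)^(1/n) - 1)^(1/2) with t = .2 * (1 + a^2/4)^5."""
    a = solve_a(alpha)
    t = 0.2 * (1.0 + a * a / 4.0) ** 5
    return math.sqrt((t * n) ** (1.0 / n) - 1.0)

def s_squared(x1, x2):
    """(n-1)^(-1) * sum_i (X_1i - X_2i - mean1 + mean2)^2 over n paired values."""
    n = len(x1)
    d = [u - v for u, v in zip(x1, x2)]
    d_bar = sum(d) / n
    return sum((di - d_bar) ** 2 for di in d) / (n - 1)

def eliminates_unknown_var(x1, x2, alpha):
    """Elimination test at n = len(x1): |mean1 - mean2| > h(alpha, n) * s."""
    n = len(x1)
    diff = abs(sum(x1) / n - sum(x2) / n)
    return diff > h(alpha, n) * math.sqrt(s_squared(x1, x2))
```

With this reading, $h(\alpha, 5) = a/2$ exactly, because $(5t)^{1/5} = 1 + a^2/4$; for $\alpha = .05$ the root is $a$ of about 3.5.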

The results of a Monte-Carlo experiment for this stopping time are given in Tables 5-8.

The conclusion is the same as the case of variance known. Using elimination decreases sample size without materially changing bias or mean square error.

Acknowledgement

The tables were compiled by Mr. Robert Smith whose help is gratefully acknowledged.


References

Blumenthal, S. (1976). Sequential estimation of the largest normal mean when the variance is known. Ann. Statist. 4, 1077-1087.

Carroll, R.J. (1978). On sequential estimation of the largest normal mean when the variance is known. Institute of Statistics Mimeo Series #1133, University of North Carolina at Chapel Hill. To appear in Sankhyā, Series A.

Swanepoel, J.W.H. and Geertsema, J.C. (1976). Sequential procedures with elimination for selecting the best of k normal populations. S. Afr. Statist. J. 10, 9-36.

Table 1

Average sample size when the variance is known.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10      18.5      19.5      19.0
           r = .05      28.1      35.4      37.2
           r = .02      57.7      70.2      91.4
           r = .01     107.3     120.4     183.8
α = .01    r = .10      18.4      19.7      19.0
           r = .05      28.9      37.4      37.4
           r = .02      58.8      76.6      92.7
           r = .01     108.7     127.3     188.4
α = .00    r = .10      21.1      19.9      19.0
           r = .05      41.9      40.2      37.5
           r = .02     102.0*    101.4      93.0
           r = .01     202.0*    202.0*    189.35

* denotes maximum possible sample size.

Table 2

Average sample size times r when the variance is known.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10      1.85      1.95      1.90
           r = .05      1.41      1.77      1.86
           r = .02      1.15      1.40      1.83
           r = .01      1.07      1.20      1.84
α = .01    r = .10      1.84      1.97      1.90
           r = .05      1.45      1.87      1.87
           r = .02      1.18      1.53      1.85
           r = .01      1.09      1.27      1.88
α = .00    r = .10      2.11      1.99      1.90
           r = .05      2.10      2.01      1.87
           r = .02      2.04      2.03      1.86
           r = .01      2.02      2.02      1.89

Table 3

Bias × 10² when the variance is known.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10       1.5        .6      10.6
           r = .05        .6        .4       5.0
           r = .02       -.3       -.3       1.1
           r = .01       -.1       -.5       -.3
α = .01    r = .10        .9       1.1      10.7
           r = .05        .6        .3       5.0
           r = .02       -.3       -.3       1.1
           r = .01       -.4       -.5       -.2
α = .00    r = .10        .8       1.?      10.7
           r = .05        .5        .6       5.1
           r = .02       -.3       -.3       1.1
           r = .01       -.5       -.5       -.2

Table 4

Mean square error divided by r when the variance is known.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10       .89       .96       .86
           r = .05       .88       .92       .76
           r = .02       .92       .92       .85
           r = .01      1.02      1.02       .95
α = .01    r = .10       .91       .99       .87
           r = .05       .88       .93       .78
           r = .02       .92       .93       .84
           r = .01      1.02      1.02       .94
α = .00    r = .10       .96      1.02       .87
           r = .05       .93       .97       .78
           r = .02       .93       .93       .85
           r = .01      1.01      1.01       .94

Table 5

Average sample size when the variance is unknown.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10      23.7      25.0      22.8
           r = .05      34.5      50.4      47.9
           r = .02      64.4      89.5     126.7
           r = .01     114.4     138.3     261.8
α = .01    r = .10      25.4      25.0      22.7
           r = .05      41.1      53.4      48.0
           r = .02      70.4     109.7     127.3
           r = .01     120.4     156.5     263.8
α = .00    r = .10      25.4      25.0      22.7
           r = .05      53.9      53.8      47.9
           r = .02      97.8     139.6     127.3
           r = .01     146.7     241.5     263.8

* indicates maximum possible sample size obtained.

Table 6

Average sample size times r when the variance is unknown.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10      2.37      2.50      2.28
           r = .05      1.72      2.52      2.40
           r = .02      1.29      1.79      2.53
           r = .01      1.14      1.38      2.62
α = .01    r = .10      2.54      2.50      2.27
           r = .05      2.06      2.67      2.40
           r = .02      1.41      2.19      2.55
           r = .01      1.20      1.56      2.64
α = .00    r = .10      2.54      2.50      2.27
           r = .05      2.69      2.69      2.40
           r = .02      1.96      2.79      2.55
           r = .01      1.47      2.42      2.64

Table 7

Bias × 10² when the variance is unknown.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10      -1.9      -2.9       6.4
           r = .05        .0        .1       2.7
           r = .02       -.6       -.1        .7
           r = .01       -.2       -.2       -.1
α = .01    r = .10      -1.9      -2.6       6.7
           r = .05        .2       -.4       2.7
           r = .02       -.6        .5        .7
           r = .01       -.2       -.2       -.1
α = .00    r = .10      -1.7      -2.3       6.9
           r = .05       -.3       -.4       2.9
           r = .02       -.4        .0        .7
           r = .01       -.2        .7       -.1

Table 8

Mean square error divided by r when the variance is unknown.

                      Δ=2.00    Δ=1.00     Δ=.20
α = .05    r = .10       .87       .87       .71
           r = .05       .81       .74       .55
           r = .02       .96       .88       .62
           r = .01       .93       .92       .64
α = .01    r = .10       .86       .89       .72
           r = .05       .78       .70       .55
           r = .02       .96       .81       .62
           r = .01       .93       .90       .63
α = .00    r = .10       .86       .89       .73
           r = .05       .72       .72       .58
           r = .02       .92       .66       .62
           r = .01       .93       .77       .63
