
This is to certify that the

dissertation entitled

Interval Estimation for the Difference of Two Binomial
Proportions in Non-adaptive and Adaptive Designs

presented by

Yichuan Xia

has been accepted towards fulﬁllment
of the requirements for

 

 

Co-Major professor
Connie Page

Co-Major professor
Vincent Melfi

Date July 12, 2002

MSU is an Affirmative Action/Equal Opportunity Institution

 

 


Interval Estimation for the Difference of Two Binomial Proportions in
Non-adaptive and Adaptive Designs
By

Yichuan Xia

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

Department of Statistics and Probability

2002

ABSTRACT
Interval Estimation for the Difference of Two Binomial Proportions in

Non-adaptive and Adaptive Designs
By

Yichuan Xia

When comparing two treatments with dichotomous responses, the difference
in proportions of successful responses of the two groups is often of primary interest.
Conﬁdence intervals are typically provided to estimate the treatment difference.
This interval estimation problem for both non-adaptive and adaptive designs is
studied in the dissertation.

Several methods of constructing confidence intervals for the difference of two
proportions are evaluated in non-adaptive designs. We begin by exploring the poor
performance of the most widely used confidence interval, the Wald interval. We
show that the poor behavior results mainly from its inappropriate center: the
coverage performance can be improved greatly by simply recentering the Wald interval.
We then derive a formula that gives a smooth approximation of the coverage
probability of the Wald interval. Setting the oscillation aside, this approximation
shows how far the coverage probability of the Wald interval falls below the nominal
level. Our analysis demonstrates that the Wald interval is rather anti-conservative
and often behaves much worse than people expect. As alternatives, the Wald
interval with continuity correction, two confidence intervals with adjusted centers (a
Bayesian interval derived from Beta priors and the Agresti-Coull interval, which adds
2 successes and 2 failures) and the profile likelihood based confidence interval are
evaluated. We compare both their coverage performance and expected lengths with
those of the standard Wald interval. To replace the Wald interval, intervals with
adjusted centers are recommended.

Adaptive designs are gaining more attention nowadays. For adaptive designs, the
validity of the confidence intervals discussed for non-adaptive designs is verified.
We evaluate the performance of those confidence intervals in two general categories
of adaptive designs: allocation adaptive designs and response adaptive designs. We
develop theorems concerning the connections between the coverage performance and
expected lengths of confidence intervals based on non-adaptive and allocation
adaptive designs. The theorems suggest that the Wald interval does not behave
satisfactorily and that the intervals with adjusted centers should be used in
allocation adaptive designs. Extensive simulation supports the same conclusion in
response adaptive designs.

ACKNOWLEDGMENTS

I would like to express my deepest gratitude to my dissertation advisors, Pro-
fessor Vincent Melﬁ and Professor Connie Page, for your constant guidance, gen-
erous support and extreme patience during the writing of this dissertation. Your
dedication and contribution to statistics have been my main source of inspiration
and encouragement during my research. I am extremely grateful you suggested
that I work in adaptive designs. I sincerely appreciate the time you put into our
weekly meetings and the help you offered whenever I needed it. Your willingness
to help at anytime encouraged me to keep going no matter what difﬁculties I met
during the research of this dissertation. I also thank you for all the time you spent
on proofreading my dissertation drafts. I understand that it was not comfortable
to read and correct a non-native speaker’s tedious and stiff writing on statistics. If
time could ﬂow backwards, I would not hesitate even a second to ask you to be my

thesis advisors again.

I would also like to thank Professor Roy Erickson and Professor Habib Salehi

for serving on my thesis committee. Your help is highly appreciated.

My special thanks go to Professor Roy Erickson and Professor Hira Koul who
taught me Theory of Probability and Theory of Statistics during my ﬁrst year of

graduate study. The two courses turned out to be very important in my research. I
also thank Professor Connie Page for training me to be a (good) statistical
consultant, which helped me a lot in my job search and will benefit my future career.
I also owe Professor Vincent Melfi many thanks for teaching me two courses, Modern
Statistics and Sequential Analysis, and for guiding me on the usage of LaTeX and
C/C++ when I was in trouble with them.

I will not forget to express my sincere gratitude to Professor Stapleton. Thank
you for being a true friend for so many years. Those scenarios when I got along
with you are so lovely. I will not forget your warm but ﬁrm encouragement when I
faced some trouble just before and after I came to the department; your suggestions
on how to be a (good) teaching assistant; your correction on my English errors in
emails; the jokes we played back and forth ......

I also thank the department for offering me the assistantship for four years.

Last but not least, I want to thank all the professors and friends who
ever helped me during my stay at Michigan State University. Life here became
easy, meaningful and colorful because of you.

List of Figures

2.1 Exact coverage probability of the nominal 95% Wald interval for
    p1 = 0.9, p2 = 0.1 and n1 = n2 = n = 6 to 100
2.2 Exact coverage probability of the nominal 95% Wald intervals for
    p1 = p2 = 0.5 and p1 = p2 = 0.9 with n1 = n2 = n = 20 to 100
2.3 Exact coverage probability of the nominal 95% Wald interval at n1 =
    n2 = 10 and n1 = 10, n2 = 100 with p2 = 0.9 and p1 = 0.8 to 0.999
    with jump size 0.001
2.4 Exact and SE approximate coverage probabilities of the nominal 95%
    Wald intervals for p1 = 0.9, p2 = 0.1 and n1 = n2 = n
2.5 Exact and SE approximate coverage probabilities of the nominal 95%
    Wald intervals for p1 = 0.8, p2 = 0.3 and n1 = n2 = n
2.6 Exact and SE approximate coverage probabilities of the nominal 95%
    Wald intervals for p1 = 0.8, p2 = 0.3 and n2 = 2n1
3.1 Exact coverage probability boxplots of some 95% nominal intervals
3.2 Comparison of exact coverage probabilities for p2 = 0.5, n1 = n2 =
    20 at 95% nominal level
3.3 Comparison of exact coverage probabilities for p1 = p2 = 0.01 to
    0.99, n1 = n2 = 20 at 95% nominal level
3.4 Comparison of exact coverage probabilities at p1 = 0.7, p2 = 0.5,
    n2 = 2n1 at 95% nominal level
3.5 Comparison of exact coverage probabilities at p1 = 0.9, p2 = 0.8,
    n2 = 2n1 at 95% nominal level
3.6 Comparison of approximate expected lengths of some confidence
    intervals for p2 = 0.5 and n1 = n2 = 25
3.7 Comparison of approximate expected lengths of some confidence
    intervals for p2 = 0.1 and n1 = 10, n2 = 20
3.8 Comparison of coverage probabilities for p1 = 0.21 to 0.99, p2 =
    p1 - 0.2, n1 = n2 = 20 at 95% nominal level
4.1 Coverage probability boxplots of some 95% nominal intervals upon
    RPW(1,1)
4.2 Expected length boxplots of some 95% nominal intervals under
    RPW(1,1)
4.3 Coverage probabilities of three 95% nominal intervals for n = 20 and
    pA = 0.5 under RPW(1,1)
4.4 Expected lengths of three 95% nominal intervals for n = 20 and
    pA = 0.5 upon RPW(1,1)
4.5 Coverage probabilities of three 95% nominal intervals for n = 20 and
    pA = 0.9 upon RPW(1,1)
4.6 Expected lengths of three 95% nominal intervals for n = 20 and
    pA = 0.9 upon RPW(1,1)
4.7 Coverage probabilities of three 95% nominal intervals for n = 10 to 100
    and pA = 0.7, pB = 0.4 upon RPW(1,1)
4.8 Expected lengths of three 95% nominal intervals for n = 10 to 100
    and pA = 0.7, pB = 0.4 upon RPW(1,1)

TABLE OF CONTENTS

1 Literature Review
  1.1 Introduction
  1.2 Some Confidence Intervals and Comparisons
  1.3 Application

2 Wald Interval Estimation for the Difference of two Binomial Proportions
  2.1 Introduction
  2.2 Coverage Properties of the Wald Confidence Interval
  2.3 A Reason for Inadequate Coverage
  2.4 A smoothing formula obtained by Edgeworth Expansion methods

3 Interval Estimation for the Difference of two Binomial Proportions
  3.1 Introduction
  3.2 Some Alternative Intervals
    3.2.1 The Wald interval with continuity correction
    3.2.2 Intervals with adjusted center
    3.2.3 The profile likelihood based intervals
  3.3 Comparison of Intervals with Explicit Forms
    3.3.1 Coverage Properties
    3.3.2 Expected Lengths
  3.4 Comparison of the Wald Interval and all Proposed Alternatives

4 Interval Estimation for the Difference of Two Binomial Proportions in Adaptive Designs
  4.1 Introduction
  4.2 Notation and Some Adaptive Designs
    4.2.1 Some Allocation-Adaptive Designs
    4.2.2 Some Response-Adaptive Designs
  4.3 The Confidence Intervals in Adaptive Designs
  4.4 Comparison of Confidence Intervals in Allocation Adaptive Designs
  4.5 Comparison of Confidence Intervals in Response Adaptive Designs
  4.6 Conclusion

Bibliography

Chapter 1

Literature Review

1.1 Introduction

In clinical trials and in industrial work, to compare a new treatment with a stan-
dard (control) treatment, the difference in probabilities of successful responses of
the two groups is often of primary interest. Confidence intervals are typically
provided to estimate the treatment difference. Many methods exist for
constructing confidence intervals for the difference of the two success probabilities.

The most widely used conﬁdence interval, the Wald interval, which is an
asymptotic conﬁdence interval computed based on a normal approximation, does
not behave satisfactorily. In this dissertation, the poor performance of the Wald
interval and the reason for the poor coverage performance are explored in Chapter
2. In Chapter 3, some selected methods for constructing conﬁdence intervals for
the difference of two treatment proportions are evaluated and compared with the

Wald interval. We restrict attention to non-adaptive designs in these two chapters.

Nowadays, adaptive designs, in which the allocation of the next subject to a
treatment depends on accumulating information, are more widely used. The interval
estimation problem is studied for adaptive designs in Chapter 4.

In this dissertation, we use three types of “coverage” probabilities: exact, ap-
proximate and nominal coverage probabilities. The exact coverage probability of a
conﬁdence interval is the actual coverage probability of that interval. The approxi-
mate coverage probability is an approximation of the coverage probability. We will
be using an Edgeworth expansion to derive the approximate coverage probability
of the Wald interval. The nominal coverage probability is its named conﬁdence
level. For example, a 95% conﬁdence interval has nominal coverage probability
0.95 though its exact coverage probability might be different from the claimed level.
Sometimes we don’t specify whether a coverage probability is exact, approximate
or nominal if it is obvious in context.

Before presenting our ﬁndings, it is useful to give a survey of related literature.

1.2 Some Conﬁdence Intervals and Comparisons

Though we are interested in conﬁdence intervals for the difference of two pro-
portions, it is worthwhile to mention two papers on conﬁdence intervals for one
proportion which have impacted our study.

Let us begin by introducing the Wald intervals for one proportion and for the
difference of two proportions.

Let $X$ denote the number of successes from $n$ Bernoulli trials with success
probability $p$, and let $\hat p = X/n$ denote the sample proportion. For two
independent treatments, let $X_1$, $X_2$ denote the numbers of successes from
treatment 1 and treatment 2 respectively, so that $X_i \sim \mathrm{Bin}(n_i, p_i)$
for $i = 1, 2$, with sample proportions $\hat p_i = X_i/n_i$. Let $z_\alpha$ denote
the $100(1-\alpha)$th percentile of the standard normal distribution.

1. The $100(1-\alpha)\%$ Wald confidence interval for $p$ is

\[ \hat p \pm z_{\alpha/2} \sqrt{\hat p (1 - \hat p)}\, n^{-1/2}. \]

2. The $100(1-\alpha)\%$ Wald confidence interval for $p_1 - p_2$ is

\[ \hat p_1 - \hat p_2 \pm z_{\alpha/2}
   \sqrt{\frac{\hat p_1 (1 - \hat p_1)}{n_1} + \frac{\hat p_2 (1 - \hat p_2)}{n_2}}. \]
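The two-sample Wald formula above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `wald_diff_interval` and its argument order are assumptions made here, not part of the dissertation, whose own computations were done in S-Plus.

```python
from math import sqrt
from statistics import NormalDist


def wald_diff_interval(x1, n1, x2, n2, conf=0.95):
    """Wald interval for p1 - p2 from x1 successes in n1 trials and x2 in n2.

    Hypothetical helper written for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    p1_hat, p2_hat = x1 / n1, x2 / n2
    half = z * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    centre = p1_hat - p2_hat
    return centre - half, centre + half


# 18 successes out of 20 versus 10 out of 20
lower, upper = wald_diff_interval(18, 20, 10, 20)
```

Note that the interval is always centred at the observed difference, and that it collapses to a single point whenever both sample proportions equal 0 or 1.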

One way to derive these conﬁdence intervals is to invert large sample Wald
tests, which evaluate standard errors at the maximum likelihood estimates. For
instance, the interval for $p$ is the set of $p_0$ values having P-value exceeding
$\alpha$ in testing

\[ H_0: p = p_0 \quad \text{versus} \quad H_a: p \neq p_0 \]

using the approximately normal Wald test statistic. The Wald intervals are
sometimes called the standard intervals. Although these two intervals are simple and
applied most often, a considerable literature shows that they behave poorly.
Brown et al. (2002) consider confidence intervals for one proportion. They
note a widespread misconception that the problems of the Wald interval
are serious only when $p$ is close to either boundary or when the sample size $n$ is
rather small. Brown et al. (2002) show that virtually all of the conventional wisdom
and popular prescriptions are misplaced, because the Wald interval has a pronounced
systematic bias due to its inappropriate center. They derive two-term Edgeworth
expansions as an analytical tool to compare and rank some selected intervals
with regard to their coverage probabilities. They also give two-term expansions
for the expected lengths of the Wald interval and some alternative intervals.

When deriving the two-term Edgeworth expansions for the coverage probabilities
of those intervals for $p$, Brown et al. (2002) express all the confidence intervals
in a unified form:

\[ \left\{ l \le \frac{n^{1/2} (\hat p - p)}{\sqrt{p (1 - p)}} \le u \right\}, \]

where $l$ and $u$ are not related to the sample proportion $\hat p$. Since the statistic
$n^{1/2} (\hat p - p) / \sqrt{p (1 - p)}$ has lattice structure, a direct application of a
theorem of Bhattacharya and Ranga Rao (1976) gives the desired Edgeworth expansions.
But this method does not apply in two treatment problems.

Brown et al. (2002) show that the Wilson confidence interval for $p$, due to
Wilson (1927), behaves much better than the standard interval. The Wilson interval
is based on inverting the test with standard error evaluated at the null hypothesis
value, which is the score test approach. Given level of significance $\alpha$, this
interval contains all $p_0$ values for which

\[ \frac{n^{1/2} \, |\hat p - p_0|}{\sqrt{p_0 (1 - p_0)}} < z_{\alpha/2} \]

and has the form

\[ \frac{X + z_{\alpha/2}^2 / 2}{n + z_{\alpha/2}^2}
   \pm \frac{n^{1/2} z_{\alpha/2}}{n + z_{\alpha/2}^2}
   \sqrt{\hat p (1 - \hat p) + \frac{z_{\alpha/2}^2}{4n}}. \tag{1.2.1} \]

This interval turns out to behave better than the Wald interval for $p$.
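As a quick illustration of (1.2.1), the following Python sketch computes the Wilson interval from its closed form. The function name `wilson_interval` is a hypothetical choice made for this example and does not come from the cited papers.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x, n, conf=0.95):
    """Wilson (score) interval for a single proportion, following (1.2.1).

    Hypothetical helper written for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    p_hat = x / n
    centre = (x + z * z / 2) / (n + z * z)
    half = sqrt(n) * z / (n + z * z) * sqrt(p_hat * (1 - p_hat) + z * z / (4 * n))
    return centre - half, centre + half


# 7 successes in 20 trials
lower, upper = wilson_interval(7, 20)
```

Both endpoints solve the score equation $n(\hat p - p_0)^2 = z_{\alpha/2}^2 \, p_0 (1 - p_0)$, which gives a simple way to check the closed form numerically.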

Some confidence intervals for $p_1 - p_2$ are motivated by the Wilson interval for
$p$.

Agresti and Coull (1998) noticed that (1.2.1) can be rewritten in the following
way:

\[ \hat p \left( \frac{n}{n + z_{\alpha/2}^2} \right)
   + \frac{1}{2} \left( \frac{z_{\alpha/2}^2}{n + z_{\alpha/2}^2} \right)
   \pm z_{\alpha/2} \sqrt{ \frac{1}{n + z_{\alpha/2}^2}
   \left[ \hat p (1 - \hat p) \frac{n}{n + z_{\alpha/2}^2}
   + \frac{1}{4} \, \frac{z_{\alpha/2}^2}{n + z_{\alpha/2}^2} \right] }. \]

Hence, the midpoint of the Wilson interval is a weighted average of $\hat p$ and $1/2$,
and it equals the sample proportion after adding $z_{\alpha/2}^2$ pseudo observations,
half of each type. The square of the coefficient of $z_{\alpha/2}$ in this formula is a
weighted average of the variance of a sample proportion when $p = \hat p$ and the
variance of a sample proportion when $p = 1/2$, using $n + z_{\alpha/2}^2$ in place of
the usual sample size $n$. Motivated by this decomposition of the Wilson interval,
Agresti and Caffo (2000) proposed adding 4 pseudo observations, one success and one
failure from each treatment, to obtain a confidence interval for $p_1 - p_2$.
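The "add one success and one failure to each treatment" recipe can be sketched as follows. The sketch assumes, as in Agresti and Caffo (2000), that the usual Wald formula is then applied to the augmented counts; the function name `agresti_caffo_interval` is a hypothetical label used only here.

```python
from math import sqrt
from statistics import NormalDist


def agresti_caffo_interval(x1, n1, x2, n2, conf=0.95):
    """Adjusted-center interval for p1 - p2: one pseudo success and one pseudo
    failure are added to each treatment before applying the Wald formula.

    Hypothetical helper written for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    m1, m2 = n1 + 2, n2 + 2                       # augmented sample sizes
    p1_t, p2_t = (x1 + 1) / m1, (x2 + 1) / m2     # augmented proportions
    half = z * sqrt(p1_t * (1 - p1_t) / m1 + p2_t * (1 - p2_t) / m2)
    centre = p1_t - p2_t
    return centre - half, centre + half


# Even when every trial succeeds in both groups the interval has positive width
lower, upper = agresti_caffo_interval(10, 10, 10, 10)
```

Because the augmented proportions can never equal 0 or 1, this interval does not degenerate to a single point the way the Wald interval can.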

Also motivated by the Wilson confidence interval for $p$, Newcombe (1998)
proposed a method that performs substantially better than the Wald interval. This
confidence interval results from the single-sample score intervals for $p_1$ and $p_2$.
Specifically, let $l_i < u_i$ be the roots for $p_i$ in

\[ \frac{n_i^{1/2} \, |\hat p_i - p_i|}{\sqrt{p_i (1 - p_i)}} = z_{\alpha/2} \]

for $i = 1, 2$. Newcombe's hybrid score $100(1-\alpha)\%$ interval is defined as

\[ \left( (\hat p_1 - \hat p_2) - z_{\alpha/2}
   \sqrt{\frac{l_1 (1 - l_1)}{n_1} + \frac{u_2 (1 - u_2)}{n_2}}, \;
   (\hat p_1 - \hat p_2) + z_{\alpha/2}
   \sqrt{\frac{u_1 (1 - u_1)}{n_1} + \frac{l_2 (1 - l_2)}{n_2}} \right). \]

Unlike many other confidence intervals, Newcombe's interval is not
symmetric around $\hat p_1 - \hat p_2$. Its margins of error differ from those of the
Wald interval.
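A minimal Python sketch of the hybrid score construction described above is given below. The helper names are hypothetical, and clamping the single-sample limits to [0, 1] is an implementation choice made here to guard against floating-point rounding; it is not part of Newcombe's definition.

```python
from math import sqrt
from statistics import NormalDist


def wilson_limits(x, n, z):
    """Roots l < u of the single-sample score equation, clamped to [0, 1]."""
    p_hat = x / n
    centre = (x + z * z / 2) / (n + z * z)
    half = sqrt(n) * z / (n + z * z) * sqrt(p_hat * (1 - p_hat) + z * z / (4 * n))
    return max(0.0, centre - half), min(1.0, centre + half)


def newcombe_interval(x1, n1, x2, n2, conf=0.95):
    """Hybrid score interval for p1 - p2 built from the two Wilson intervals.

    Hypothetical helper written for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    l1, u1 = wilson_limits(x1, n1, z)
    l2, u2 = wilson_limits(x2, n2, z)
    d = x1 / n1 - x2 / n2
    lower = d - z * sqrt(l1 * (1 - l1) / n1 + u2 * (1 - u2) / n2)
    upper = d + z * sqrt(u1 * (1 - u1) / n1 + l2 * (1 - l2) / n2)
    return lower, upper


# 18 of 20 versus 10 of 20: the lower margin is wider than the upper margin
lower, upper = newcombe_interval(18, 20, 10, 20)
```

The two margins use different combinations of the single-sample limits, which is why the interval is not symmetric around the observed difference.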

Newcombe (1998) evaluates eleven methods of constructing confidence intervals
for $p_1 - p_2$ through simulation. Some of those confidence intervals have
relatively complicated expressions compared to the intervals discussed so far, such as
the Wald interval and the Agresti-Coull interval. Newcombe (1998) suggests replacing
the Wald interval with the Newcombe hybrid score interval. The profile likelihood
method (introduced in detail later), involving

\[ \left\{ \Delta \in (-1, 1) : 2 \left( l(\hat\Delta, \hat p_2) - \tilde l(\Delta) \right)
   \le \chi^2_1(\alpha) \right\}, \]

where $\Delta = p_1 - p_2$, $\hat\Delta = \hat p_1 - \hat p_2$, $l$ denotes the
log-likelihood function of $(\Delta, p_2)$ and $\tilde l(\Delta) = \max_{p_2} l(\Delta, p_2)$,
is also considered in his paper. Newcombe (1998) shows that this interval has the best
coverage and location properties among all eleven confidence intervals, although it
displays an undesirable anomaly. Suppose $X_1$, $n_1$ and $\hat p_2$ are held constant,
while $n_2 \to \infty$ through values which keep $X_2$ integer valued.
One expects that a good method would produce a sequence of intervals, each nested
within its predecessor, tending asymptotically towards some corresponding interval
for the single proportion, shifted by the constant $\hat p_2$. Yet the profile likelihood
method gives a sequence of lower limits which increase up to a certain $n_2$, but
subsequently decrease, violating the above consideration.

Agresti and Caffo (2000) evaluate the Wald interval, the Agresti-Coull interval,
a Bayesian interval (considered in detail in Chapter 3) and Newcombe's hybrid score
interval. They find their exact coverage probabilities and mean expected lengths at
some specific pairs $(n_1, n_2)$ with $p_1$ and $p_2$ taking values from the unit square.
It is shown that the Agresti-Coull interval has better coverage performance than
Newcombe's hybrid score interval.

The above results all involve non-adaptive designs with constant sample sizes.
The sample sizes in adaptive designs are not constants but random variables. To
distinguish them from the non-adaptive case, we use $N_i(k)$ and $S_i(k)$ to denote the
sample size and the number of successes from treatment $i$ for $i = 1, 2$.

Wei et al. (1990) studied the interval estimation problem for $p_1 - p_2$ under a
specific adaptive design: the randomized play-the-winner design, which is due to Wei
and Durham (1978) and tends to assign more study subjects to the better treatment.
Wei et al. (1990) developed a network algorithm to find the joint distribution of
$(N_1(k), S_1(k) + S_2(k), S_1(k))$, through which exact confidence intervals for
$p_1 - p_2$ could be derived. The authors suggest using this method when the sample
size $n$ is small or moderate. Two disadvantages of this method limit its
wide application. First, though the network algorithm can be easily modified to
accommodate other adaptive designs, the computation of the joint distribution of
$(N_1(k), S_1(k) + S_2(k), S_1(k))$ is not very easy. Second, the exact confidence
interval does not have an explicit form. Wei et al. (1990) also evaluated the Wald
interval and the profile likelihood based interval for $p_1 - p_2$ with randomized
play-the-winner designs through simulation. They found that the Wald interval was
rather anti-conservative. The profile likelihood method was recommended in Wei et al.
(1990) for a moderate-sized or large sample design.

1.3 Application

All the conﬁdence intervals studied in this dissertation are based on asymptotic
theory. However, some conﬁdence intervals behave rather well even when sample
sizes are small or moderate. Therefore, one may use conﬁdence intervals that will
be suggested for a broad range of sample sizes.

Moreover, the simplicity of all the conﬁdence intervals except the proﬁle like-
lihood interval is an attractive feature from the point of view of applications.

People who wish to use adaptive designs have a wide variety of adaptive
allocation procedures at their disposal, and the corresponding asymptotic theory
has been reported for many adaptive designs. There are numerous references
on the interval estimation problem for p1 - p2, but most of them concentrate on
non-adaptive designs. We hope this dissertation will be useful for constructing good
confidence intervals for p1 - p2, especially in adaptive designs.

Chapter 2

Wald Interval Estimation for the
Difference of two Binomial

Proportions

2.1 Introduction

Interval estimation for a single binomial proportion and for the difference of two
binomial proportions is used extensively in practice and has been widely discussed
in the literature. It is well known that the standard Wald intervals behave poorly.
Brown et al. (2002) focused on interval estimation for one binomial proportion
and explored why the coverage probabilities of the Wald interval for one
binomial proportion are often far below the nominal level even when the sample
size is moderate or quite large. They evaluated the approximate coverage
probabilities and expected lengths of the Wald interval for one binomial proportion
and its candidate replacements. Inspired by their article, we study interval estimation
for the difference of two binomial proportions.

In Section 2.2, we focus on the poor performance of the Wald interval for the
difference of two binomial proportions by exhibiting its behavior through a few
examples. As will be shown, the Wald interval for the difference of two binomial
proportions, defined in (2.2.1), shares some properties with the Wald interval for one
binomial proportion addressed in Brown et al. (2002). For example, the discreteness
of the binomial distribution leads to oscillatory coverage probabilities, and the true
coverage probabilities often differ significantly from the nominal level even when
the two proportions are near 0.5 and sample sizes are moderate or large. We also
note that unbalanced sample sizes when the two proportions are close, among some
other issues, may have severe effects on the coverage probabilities. In Section 2.3, we
explore the reason for the poor performance of the Wald interval. Section 2.4 deals
with a smooth approximation of the coverage probability of the Wald interval
obtained by applying Edgeworth expansion methods.

2.2 Coverage Properties of the Wald Conﬁdence

Interval

Let $X_1$ and $X_2$ be two independent random variables, $X_i \sim \mathrm{Bin}(n_i, p_i)$,
where $p_i \in (0, 1)$ for $i = 1, 2$. Let $\hat p_i = X_i / n_i$. As mentioned in
Chapter 1, the $100(1-\alpha)\%$ Wald confidence interval for $p_1 - p_2$ is

\[ \hat p_1 - \hat p_2 \pm z_{\alpha/2}
   \sqrt{\frac{\hat p_1 (1 - \hat p_1)}{n_1} + \frac{\hat p_2 (1 - \hat p_2)}{n_2}},
   \tag{2.2.1} \]

where $z_{\alpha/2}$ denotes the $100(1 - \alpha/2)$th percentile of the standard normal
distribution. We will use $CI_W(n_1, n_2, X_1, X_2)$ to denote this interval and
$CP_W(n_1, n_2, p_1, p_2)$ to denote its exact coverage probability. Then

\begin{align*}
CP_W(n_1, n_2, p_1, p_2)
  &= P\{ p_1 - p_2 \in CI_W(n_1, n_2, X_1, X_2) \} \tag{2.2.2} \\
  &= \sum_{x_1 = 0}^{n_1} \sum_{x_2 = 0}^{n_2}
     \binom{n_1}{x_1} p_1^{x_1} (1 - p_1)^{n_1 - x_1}
     \binom{n_2}{x_2} p_2^{x_2} (1 - p_2)^{n_2 - x_2} \, I_{A_p}(x_1, x_2),
\end{align*}

where $A_p = \{ (x_1, x_2) \mid p_1 - p_2 \in CI_W(n_1, n_2, x_1, x_2) \}$. We will present
a few examples to show that the coverage probabilities of the Wald interval are
typically lower than its nominal level.
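The double sum in (2.2.2) translates directly into two nested loops. The Python sketch below is illustrative only (the dissertation's computations were done in S-Plus); the function names are hypothetical, and the interval is treated as open, matching the convention adopted later in this section.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P{X = k} for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def wald_exact_coverage(n1, n2, p1, p2, conf=0.95):
    """Exact coverage of the open Wald interval for p1 - p2 via the double sum.

    Hypothetical helper written for illustration only.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    delta = p1 - p2
    total = 0.0
    for x1 in range(n1 + 1):
        w1 = binom_pmf(x1, n1, p1)
        for x2 in range(n2 + 1):
            ph1, ph2 = x1 / n1, x2 / n2
            half = z * sqrt(ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2)
            # (x1, x2) belongs to A_p when delta lies strictly inside the interval
            if abs(ph1 - ph2 - delta) < half:
                total += w1 * binom_pmf(x2, n2, p2)
    return total


cp = wald_exact_coverage(10, 10, 0.9, 0.9)
```

The text reports $CP_W(10, 10, 0.9, 0.9) = 0.8282$ at the 95% level, which offers a reference value against which such a sketch can be compared.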

The probabilities reported in the following plots and tables, unless otherwise
specified, are the result of exact probability calculations produced in S-Plus. Instead
of using the algorithm given in equation (2.2.2), which contains two loops, we apply
a more efficient one:

\[ CP_W(n_1, n_2, p_1, p_2) = \sum_{i = 0}^{n_2}
   P\{ L_W(n_1, n_2, p_1, p_2, i) < X_1 < U_W(n_1, n_2, p_1, p_2, i) \} \, P\{ X_2 = i \},
   \tag{2.2.3} \]

where $L_W(n_1, n_2, p_1, p_2, i) < U_W(n_1, n_2, p_1, p_2, i)$ are the two proper real
roots of the following equation, viewed as a function of $X_1$:

\[ \left| \frac{X_1}{n_1} - \frac{i}{n_2} - (p_1 - p_2) \right|
   = z_{\alpha/2} \sqrt{ \frac{\frac{X_1}{n_1} \left( 1 - \frac{X_1}{n_1} \right)}{n_1}
   + \frac{\frac{i}{n_2} \left( 1 - \frac{i}{n_2} \right)}{n_2} }. \]

The algorithm given in (2.2.3) contains only one loop.
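The idea behind (2.2.3) can be sketched as follows, under stated assumptions. With $x = X_1/n_1$, squaring the coverage condition gives a quadratic inequality in $x$ with positive leading coefficient, so for each value $i$ of $X_2$ the covered values of $X_1$ lie strictly between its two roots. The Python sketch below uses that fact with a cumulative table for $X_1$; the names are hypothetical, and the direct re-check of the integers adjacent to the roots is a safeguard against rounding added here, not a step described in the text.

```python
from math import ceil, comb, floor, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def covered(x1, x2, n1, n2, delta, z):
    """True when delta lies strictly inside the Wald interval from (x1, x2)."""
    ph1, ph2 = x1 / n1, x2 / n2
    half = z * sqrt(ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2)
    return abs(ph1 - ph2 - delta) < half


def wald_coverage_one_loop(n1, n2, p1, p2, conf=0.95):
    """Exact Wald coverage with a single loop over the values of X2."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    delta = p1 - p2
    pmf1 = [binom_pmf(k, n1, p1) for k in range(n1 + 1)]
    cdf1 = [0.0]                      # cdf1[k] = P{X1 < k}
    for w in pmf1:
        cdf1.append(cdf1[-1] + w)
    total = 0.0
    for i in range(n2 + 1):
        # Coverage is equivalent to a*x^2 + b*x + d < 0 with x = X1/n1.
        c = i / n2 + delta
        v = (i / n2) * (1 - i / n2) / n2
        a = 1 + z * z / n1
        b = -(2 * c + z * z / n1)
        d = c * c - z * z * v
        disc = b * b - 4 * a * d
        if disc <= 0:
            continue                  # no value of X1 gives coverage
        lo = n1 * (-b - sqrt(disc)) / (2 * a)   # L_W on the X1 scale
        hi = n1 * (-b + sqrt(disc)) / (2 * a)   # U_W on the X1 scale
        k_lo, k_hi = max(0, ceil(lo) + 1), min(n1, floor(hi) - 1)
        prob = cdf1[k_hi + 1] - cdf1[k_lo] if k_hi >= k_lo else 0.0
        # Integers next to the roots are re-checked directly against rounding.
        for k in {ceil(lo) - 1, ceil(lo), floor(hi), floor(hi) + 1}:
            if 0 <= k <= n1 and not (k_lo <= k <= k_hi) and covered(k, i, n1, n2, delta, z):
                prob += pmf1[k]
        total += prob * binom_pmf(i, n2, p2)
    return total


cp = wald_coverage_one_loop(10, 100, 0.9, 0.9)
```

The saving comes from replacing the inner loop over $x_1$ by a difference of cumulative probabilities, which matters most when $n_1$ is large.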

Example 1. Figure 2.1 plots the exact coverage probabilities of the nominal 95%
standard Wald interval for p1 = 0.9, p2 = 0.1, n1 = n2 = n when n varies from 6 to
100. Two important features of the Wald interval are exhibited in the figure.

First, there is a very strong oscillation, which is due to the discreteness of
the binomial distribution. As a result, the coverage probability does not move
steadily closer to the nominal level as n grows, although the magnitude of the
oscillation tends to decrease. For example, at n = 14 the coverage probability is
0.942, but it is only 0.808 at n = 15. Even when n is as large as 67, the coverage
probability is only 0.914. When n = 100, the coverage probability is still
unsatisfactory at 0.927. Only for n ≥ 300 does the coverage probability fluctuate
above 0.94.

Second, the Wald interval is anti-conservative: the coverage probabilities at
most values of n are less than the nominal level. Among all the coverage
probabilities (for n = 6 up to n = 100), only three reach the nominal level. There are
50 coverage probabilities less than 0.93 and 31 less than 0.92.

Similar to the phenomenon pointed out in Brown et al. (2002) for the one-sample
interval, the oscillation of the coverage probability makes some
quadruples lucky and some unlucky. For instance, the quadruple (n1, n2, p1, p2) =
(53, 53, 0.9, 0.1) is lucky, with the exact coverage probability of the Wald interval
equal to 0.9501. But (n1, n2, p1, p2) = (54, 54, 0.9, 0.1) is unlucky, with exact
coverage probability 0.9132. Similarly, changing the proportions may result in
lucky or unlucky quadruples as well. Further, the lucky and unlucky quadruples
are not predictable: there is no obvious pattern for telling whether a
quadruple is lucky or not.

Figure 2.1: Exact coverage probability of the nominal 95% Wald interval for p1 =
0.9, p2 = 0.1 and n1 = n2 = n = 6 to 100

Example 2. Suppose the confidence level is 95%, n1 = n2 = n and n varies from 20
to 100. Compare the coverage probabilities in two cases. Case one, p1 = p2 = 0.9;
case two, p1 = p2 = 0.5. Conventional wisdom might suggest that the coverage
probabilities in case two would be higher than those in case one. But this is not true.
Figure 2.2 plots the coverage probabilities in the two cases. It is surprising to see
that CPW(n, n, 0.5, 0.5) is not obviously higher than CPW(n, n, 0.9, 0.9). When n varies
from 20 to 100, CPW(n, n, 0.9, 0.9) has less oscillation. All CPW(n, n, 0.9, 0.9) are
located between 0.940 and 0.949. The range of CPW(n, n, 0.5, 0.5) is [0.919, 0.953].
This example demonstrates that we cannot judge the coverage probabilities of the Wald
interval by whether p1 and p2 are close to the center or not. The relative positions of
p1 and p2 affect the coverage probability.

Figure 2.2: Exact coverage probability of the nominal 95% Wald intervals for p1 =
p2 = 0.5 and p1 = p2 = 0.9 with n1 = n2 = n = 20 to 100

Since there are four quantities n1, n2, p1 and p2 affecting the coverage
probability, considering only the magnitudes of the proportions is not enough. In fact,
not only the relative positions of p1 and p2 but also the relative sizes of n1 and n2 may
influence CPW(n1, n2, p1, p2) significantly. Moreover, the four quantities interact.

Example 3. Fix p1 = p2 = 0.9. Consider the coverage probabilities at n1 = n2 =
10 and at n1 = 10, n2 = 100 with nominal confidence level 95%. Which coverage
probability is greater? It is striking to see that CPW(10, 10, 0.9, 0.9) = 0.8282
and CPW(10, 100, 0.9, 0.9) = 0.6474. The larger sample size does not improve but
reduces the coverage probability in this special case. The big difference between
the two coverage probabilities cannot be explained by the phenomenon of
oscillation alone. This suggests that the drop in the coverage probability in the
second case is caused by unbalanced sample sizes. Figure 2.3 plots the coverage
probabilities for p1 varying between 0.8 and 0.999 with step size 0.001.

Figure 2.3: Exact coverage probability of the nominal 95% Wald interval at n1 =
n2 = 10 and n1 = 10, n2 = 100 with p2 = 0.9 and p1 = 0.8 to 0.999 with jump size
0.001

Tables 2.1 and 2.2 list some coverage probabilities under different confidence levels
for some p1, p2, n1 and n2. Observe how much the sample sizes can affect the
coverage probability. From Figure 2.3 and the two tables, we may conclude that a
larger sample size n_i in one group does not guarantee a better coverage probability.
The relative magnitudes of the two sample sizes (balanced or not) are another factor
that influences the coverage probability significantly.

Table 2.1: Exact coverage probability of the nominal 95% Wald interval

n1                          10    10    10    30    30    30   100   100   100
n2                          10    30   100    10    30   100    10    30   100
p1 = .9, p2 = .5          .911  .920  .804  .906  .934  .937  .896  .940  .947
p1 = .9, p2 = .1          .871  .849  .688  .849  .938  .908  .688  .908  .927
p1 = .8, p2 = .3          .894  .900  .877  .898  .940  .929  .893  .931  .939
p1 = .6, p2 = .4          .922  .913  .908  .913  .934  .941  .908  .941  .948
p1 = .9, p2 = [illegible] .949  .869  .647  .869  .945  .918  .647  .918  .948
p1 = .5, p2 = .5          .912  .917  .905  .917  .948  .939  .905  .939  .944

Table 2.2: Exact coverage probability of the nominal 99% Wald interval

n1                          10    10    10    30    30    30   100   100   100
n2                          10    30   100    10    30   100    10    30   100
p1 = .9, p2 = .5          .963  .963  .892  .968  .984  .978  .968  .982  .988
p1 = .9, p2 = .1          .878  .856  .754  .856  .946  .953  .754  .953  .982
p1 = .8, p2 = .3          .972  .948  .895  .954  .976  .979  .958  .981  .986
p1 = .6, p2 = .4          .974  .964  .954  .964  .982  .983  .954  .983  .988
p1 = .9, p2 = .8          .991  .973  .692  .973  .991  .956  .692  .956  .989
p1 = .5, p2 = .5          .958  .966  .967  .966  .986  .985  .967  .985  .987

It is obvious from above examples that the exact coverage probability of Wald
interval seldom achieves the nominal level. We will examine the reason theoretically

in next section.

At the end of this section, it is worthwhile to mention an issue that can cause a non-negligible loss of coverage probability for the Wald interval. Unlike many alternative intervals, the Wald interval is sensitive to whether a confidence interval is defined as open or closed. The next remark gives such an example. Neither Brown et al. (2002) nor Agresti and Coull (1998) specifically mentioned whether their confidence intervals were closed or not, but their results are consistent with open confidence intervals. In Wei et al. (1990), the authors specified open confidence intervals. In this report, we define a confidence interval to be open.

Remark 2.2.1. The shrinkage of the Wald interval to an empty set, $(a, a)$, at some realizations of $(n_1, \hat p_1; n_2, \hat p_2)$ can cause its poor coverage performance, especially when both sample sizes are small and both proportions approach the boundaries. Writing $q_i = 1 - p_i$, the coverage probability of the Wald interval is at most $1 - (p_1^{n_1} + q_1^{n_1})(p_2^{n_2} + q_2^{n_2})$ regardless of the nominal level. For example, when $p_1 = p_2 = 0.95$ and both sample sizes are 20, the coverage probability of the Wald interval is at most 0.872 regardless of the nominal level. A simpler and more instructive example is $n_1 = n_2 = 10$ and $p_1 = p_2 = 0.9$. If $X_1 = X_2 = 10$, then the confidence interval shrinks to $(0, 0)$. Although $p_1 - p_2 = 0$, since $0 \notin (0, 0)$, $(10, 10)$ is not a proper pair in the AP defined at the beginning of this section. Note that $P\{X_1 = X_2 = 10\} = 0.1216$, which makes $CP_W(10, 10, 0.9, 0.9)$ at most 0.8784.
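
The bound in Remark 2.2.1 can be checked numerically. The following Python sketch is an illustration only, not the code used to produce the tables: it enumerates all outcomes $(X_1, X_2)$, treats the Wald interval as open, and compares the exact coverage with the bound. The function names `wald_exact_coverage` and `coverage_upper_bound` are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(n, p):
    """Return [P(X = x) for x = 0..n] with X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def wald_exact_coverage(n1, n2, p1, p2, level=0.95):
    """Exact coverage of the open Wald interval, by full enumeration."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    pmf1, pmf2 = binom_pmf(n1, p1), binom_pmf(n2, p2)
    cover = 0.0
    for x1 in range(n1 + 1):
        ph1 = x1 / n1
        for x2 in range(n2 + 1):
            ph2 = x2 / n2
            half = z * sqrt(ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2)
            # Open interval: an empty interval (half == 0) never covers.
            if abs(ph1 - ph2 - (p1 - p2)) < half:
                cover += pmf1[x1] * pmf2[x2]
    return cover


def coverage_upper_bound(n1, n2, p1, p2):
    """Bound of Remark 2.2.1: 1 - (p1^n1 + q1^n1)(p2^n2 + q2^n2)."""
    q1, q2 = 1 - p1, 1 - p2
    return 1 - (p1**n1 + q1**n1) * (p2**n2 + q2**n2)


if __name__ == "__main__":
    print(coverage_upper_bound(10, 10, 0.9, 0.9))
    print(wald_exact_coverage(10, 10, 0.9, 0.9))
```

Because the bound does not involve the critical value, changing `level` changes the exact coverage but not the bound.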

2.3 A Reason for Inadequate Coverage

Similar to the reason for the inadequate coverage of the Wald interval for one binomial proportion explored in Brown et al. (2002), we will show that the poor performance of the Wald interval is due mainly to the fact that the Wald confidence interval is symmetric about a “wrong” center. Although $\hat p_1 - \hat p_2$ is the MLE and an unbiased estimator of $p_1 - p_2$, as the center of a confidence interval it causes a systematic loss of coverage from the nominal level. As we will see in the next chapter, by simply recentering the interval, one can improve the coverage performance significantly.

One way to derive the Wald interval is to invert the large-sample Wald test. The nominal $100(1-\alpha)\%$ Wald interval for $p_1 - p_2$ is the set of $\delta$ for which

\[ \frac{|\hat p_1 - \hat p_2 - \delta|}{\sqrt{\hat p_1(1-\hat p_1)/n_1 + \hat p_2(1-\hat p_2)/n_2}} < z_{\alpha/2}. \]

Hence, in deriving the Wald interval, the following consequence of the central limit theorem plays an important role:

\[ \frac{\hat p_1 - \hat p_2 - (p_1 - p_2)}{\sqrt{\hat p_1(1-\hat p_1)/n_1 + \hat p_2(1-\hat p_2)/n_2}} \xrightarrow{\ \mathcal{L}\ } N(0,1). \]

For simplicity, denote the left side above by $W_{n_1,n_2}$. Even for quite large values of $n_1$ and $n_2$, the actual distribution of $W_{n_1,n_2}$ can be far from the standard normal distribution for many $p_1$ and $p_2$, as we will show next. Thus the very premise on which the Wald interval is based is seriously flawed for moderate and even quite large values of $n_1$ and $n_2$.

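The bias of $W_{n_1,n_2}$, which is $EW_{n_1,n_2}$, from the mean of the standard normal distribution can be computed analytically by standard expansions. Denote $\omega_{n_i} = \hat p_i - p_i$ for $i = 1, 2$. Then simple algebra gives

\[ W_{n_1,n_2} = \frac{\omega_{n_1} - \omega_{n_2}}{\sqrt{\dfrac{p_1q_1 + (q_1-p_1)\omega_{n_1} - \omega_{n_1}^2}{n_1} + \dfrac{p_2q_2 + (q_2-p_2)\omega_{n_2} - \omega_{n_2}^2}{n_2}}}, \]

where $q_i = 1 - p_i$ for $i = 1, 2$. Let $u = \frac{p_1q_1}{n_1} + \frac{p_2q_2}{n_2}$. Denote the denominator by $b$; then $1/b$ can be expressed as

\[ u^{-1/2}\left(1 + \frac{(q_1-p_1)\omega_{n_1}}{n_1u} + \frac{(q_2-p_2)\omega_{n_2}}{n_2u} - \Big(\frac{\omega_{n_1}^2}{n_1u} + \frac{\omega_{n_2}^2}{n_2u}\Big)\right)^{-1/2} = u^{-1/2}(1+x)^{-1/2}, \]

where $x = \frac{(q_1-p_1)\omega_{n_1}}{n_1u} + \frac{(q_2-p_2)\omega_{n_2}}{n_2u} - \frac{\omega_{n_1}^2}{n_1u} - \frac{\omega_{n_2}^2}{n_2u}$. Since $\omega_{n_i} = O_p(n_i^{-1/2})$, a Taylor expansion yields

\[ W_{n_1,n_2} = (\omega_{n_1} - \omega_{n_2})\,u^{-1/2}\Big(1 - \frac{x}{2} + \frac{3x^2}{8} - \frac{5x^3}{16} + O_p\big((n_1 \wedge n_2)^{-3/2}\big)\Big). \]

The formulas for central moments of the binomial distribution then yield an approximation to the bias: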

p1—1/2 1 9 p2-1/2 1 9
EW,, .-.-.-——-—-— 1 ———--—1 --—-———— 1 -—-———--~1
""2 n1u1u1/2 ( + 111(2111 ) n2u2u1/2 + n2(2u2 )
+ 30p. - 1/2)-(p2—1/2))
2n1n2u1u2u1/2

_ 15021 -1/2)(p2 -1/2) (122 -1/2 _ 91—1/2)

2n1n2u1u2u3/2 n2 111

+ 0((n1 /\ n2)'3/2) (2.3.1)


where

From (2.3.1), it can be seen that when both $p_1$ and $p_2$ approach 1/2 for fixed sample sizes, the bias tends to decrease. Therefore, ignoring the oscillation effect, one can expect to increase the coverage probability by shifting the terms of the center of the Wald confidence interval from $\hat p_1$ and $\hat p_2$ toward 1/2. When the two proportions are close and the sample sizes are comparable, the effect from $p_1$ can counteract that from $p_2$. On the other hand, (2.3.1) also explains why the Wald confidence interval behaves poorly when the sample sizes are extremely unbalanced for some $p_1$ and $p_2$: the effect from $p_1$ cannot cancel most of the effect from $p_2$. In general, equation (2.3.1) can be used as a rule of thumb to explain how the interaction among $n_1$, $n_2$, $p_1$ and $p_2$ affects the bias and thus the coverage probability.
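
The qualitative behavior of the bias can also be examined without any expansion. The Python sketch below is an illustration under our own assumptions, not part of the derivation: it computes $EW_{n_1,n_2}$ exactly by enumeration, restricted to the outcomes where the estimated standard error is positive. The function name `wald_statistic_bias` is hypothetical.

```python
from math import comb, sqrt


def binom_pmf(n, p):
    """Return [P(X = x) for x = 0..n] with X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def wald_statistic_bias(n1, n2, p1, p2):
    """E[W] over the outcomes where the estimated variance is positive."""
    pmf1, pmf2 = binom_pmf(n1, p1), binom_pmf(n2, p2)
    total = 0.0
    for x1 in range(n1 + 1):
        ph1 = x1 / n1
        for x2 in range(n2 + 1):
            ph2 = x2 / n2
            var_hat = ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2
            if var_hat > 0:  # W is undefined when both samples are degenerate
                w = (ph1 - ph2 - (p1 - p2)) / sqrt(var_hat)
                total += pmf1[x1] * pmf2[x2] * w
    return total


if __name__ == "__main__":
    for p1 in (0.5, 0.6, 0.9):
        print(p1, wald_statistic_bias(20, 20, p1, 0.5))
```

With $n_1 = n_2 = 20$ and $p_2 = 0.5$, the computed bias vanishes at $p_1 = 0.5$ by symmetry and grows as $p_1$ moves away from 1/2.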

2.4 A smoothing formula obtained by Edgeworth Expansion methods

In this section, we will not justify Edgeworth expansions, but rather will use Edgeworth expansion techniques to derive a formula that works well in approximating coverage probabilities of the Wald interval in a variety of settings. See Bhattacharya and Ranga Rao (1976) and Hall (1992) for more details on Edgeworth expansions.

First, a theorem from Hall (1992) on Edgeworth expansions is presented. It gives general conditions under which the Edgeworth expansion is valid and will be used as a tool to derive the smooth approximation of the coverage probability of the Wald interval.

Theorem 2.4.1. (Hall, 1992, page 56) Let $X, X_1, X_2, \ldots$ be independent and identically distributed random column $d$-vectors with mean $\mu$, and put $\bar X = n^{-1}\sum_{i=1}^{n} X_i$. Assume the function $A : \mathbb{R}^d \to \mathbb{R}$ has $j + 2$ continuous derivatives in a neighborhood of $\mu = E(X)$, that $A(\mu) = 0$, that $E(\|X\|^{j+2}) < \infty$, and that the characteristic function $\chi$ of $X$ satisfies

\[ \limsup_{\|t\| \to \infty} |\chi(t)| < 1. \tag{2.4.1} \]

The above inequality is called Cramer's condition. Denote the asymptotic variance of $n^{1/2}A(\bar X)$ by $\sigma^2$. Suppose $\sigma > 0$. Then for $j \ge 0$,

\[ P\big(n^{1/2}A(\bar X)/\sigma \le x\big) = \Phi(x) + n^{-1/2}r_1(x)\phi(x) + n^{-1}r_2(x)\phi(x) + \cdots + n^{-j/2}r_j(x)\phi(x) + o(n^{-j/2}) \tag{2.4.2} \]

uniformly in $x$, where $r_j$ is a polynomial of degree at most $3j - 1$, odd for even $j$ and even for odd $j$, with coefficients depending on moments of $X$ up to order $j + 2$.

According to the arguments (pages 47, 48) in Hall (1992), the $r_j$ for $j = 1, 2$ in Theorem 2.4.1 are given by

\[ r_1(x) = -\Big\{k_{1,2} + \tfrac{1}{6}k_{3,1}(x^2 - 1)\Big\} \tag{2.4.3} \]

and

\[ r_2(x) = -x\Big\{\tfrac{1}{2}(k_{2,2} + k_{1,2}^2) + \tfrac{1}{24}(k_{4,1} + 4k_{1,2}k_{3,1})(x^2 - 3) + \tfrac{1}{72}k_{3,1}^2(x^4 - 10x^2 + 15)\Big\}, \tag{2.4.4} \]

where the $k$'s can be determined through the following expressions for the $\kappa$'s, which may be expanded in terms of the $k$'s as a power series in $n^{-1}$:

\[ \kappa_{j,n} = n^{-(j-2)/2}\big(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + \cdots\big), \quad j \ge 1. \tag{2.4.5} \]

Let $S_n = n^{1/2}(\hat\theta - \theta_0)/\hat\sigma$. The $\kappa$'s are defined by

\begin{align*}
\kappa_{1,n} &= E(S_n), \\
\kappa_{2,n} &= E(S_n^2) - (E(S_n))^2 = \mathrm{var}(S_n), \\
\kappa_{3,n} &= E(S_n^3) - 3E(S_n^2)E(S_n) + 2(E(S_n))^3 = E(S_n - ES_n)^3, \\
\kappa_{4,n} &= E(S_n^4) - 4E(S_n^3)E(S_n) - 3(E(S_n^2))^2 + 12E(S_n^2)(E(S_n))^2 - 6(E(S_n))^4 \\
&= E(S_n - ES_n)^4 - 3(\mathrm{var}(S_n))^2. \tag{2.4.6}
\end{align*}
To derive the smooth approximation of the coverage probability of the Wald interval, we define some notation. Let $\{Y_{1,j} : j = 1, 2, \ldots\}$ and $\{Y_{2,j} : j = 1, 2, \ldots\}$ be two independent sequences of independent Bernoulli random variables, $Y_{i,j} \sim \mathrm{Bernoulli}(p_i)$, and let $X_i = \sum_{j=1}^{n_i} Y_{i,j}$, where $i = 1, 2$. Let $\gamma$ and $\kappa$ stand for the skewness and kurtosis of $D = Y_{1,1} - Y_{2,1}$ respectively. Then

\begin{align*}
\gamma &= E(D - ED)^3 = E\big(Y_{1,1} - Y_{2,1} - (p_1 - p_2)\big)^3 \\
&= p_1q_1(q_1 - p_1) - p_2q_2(q_2 - p_2)
\end{align*}

and

\begin{align*}
\kappa &= E(D - ED)^4 - 3(\mathrm{var}(D))^2 \\
&= E\big(Y_{1,1} - Y_{2,1} - (p_1 - p_2)\big)^4 - 3\big(\mathrm{var}(Y_{1,1} - Y_{2,1})\big)^2 \\
&= p_1q_1(q_1 - p_1)^2 + p_2q_2(q_2 - p_2)^2 - 2p_1^2q_1^2 - 2p_2^2q_2^2.
\end{align*}
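
As a quick sanity check on the closed forms for $\gamma$ and $\kappa$, the following Python sketch compares them with the third central moment and the fourth cumulant of $D$ computed directly from its four possible outcomes. It is only an illustration; the function names are hypothetical.

```python
from itertools import product


def gamma_kappa_closed_form(p1, p2):
    """Closed-form gamma and kappa of D = Y1 - Y2 as given in the text."""
    q1, q2 = 1 - p1, 1 - p2
    gamma = p1 * q1 * (q1 - p1) - p2 * q2 * (q2 - p2)
    kappa = (p1 * q1 * (q1 - p1) ** 2 + p2 * q2 * (q2 - p2) ** 2
             - 2 * p1**2 * q1**2 - 2 * p2**2 * q2**2)
    return gamma, kappa


def gamma_kappa_by_enumeration(p1, p2):
    """Third central moment and fourth cumulant of D from its outcomes."""
    outcomes = []
    for y1, y2 in product((0, 1), repeat=2):
        prob = (p1 if y1 else 1 - p1) * (p2 if y2 else 1 - p2)
        outcomes.append((y1 - y2, prob))
    mean = sum(d * w for d, w in outcomes)
    m2, m3, m4 = (sum((d - mean) ** k * w for d, w in outcomes)
                  for k in (2, 3, 4))
    return m3, m4 - 3 * m2**2


if __name__ == "__main__":
    print(gamma_kappa_closed_form(0.9, 0.1))
    print(gamma_kappa_by_enumeration(0.9, 0.1))
```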


We do not have appropriate random variables to apply Theorem 2.4.1 directly, since Bernoulli random variables do not satisfy Cramer's condition. In general, absolutely continuous random variables satisfy Cramer's condition. Therefore, we need to smooth the Bernoulli random variables first. However, another problem arises after smoothing: the Wald test statistic, through which we may define the exact coverage probability of the Wald interval, is

\[ W_{n_1,n_2} = \frac{\hat p_1 - \hat p_2 - (p_1 - p_2)}{\sqrt{\hat p_1(1-\hat p_1)/n_1 + \hat p_2(1-\hat p_2)/n_2}} \]

on the set

\[ \Pi_{n_1,n_2} = \Big\{\frac{\hat p_1(1-\hat p_1)}{n_1} + \frac{\hat p_2(1-\hat p_2)}{n_2} > 0\Big\} \]

and has no definition on

\[ \bar\Pi_{n_1,n_2} = \Big\{\frac{\hat p_1(1-\hat p_1)}{n_1} + \frac{\hat p_2(1-\hat p_2)}{n_2} = 0\Big\}. \]

Consequently, we need to consider the smoothing random variables on $\Pi_{n_1,n_2}$. But Theorem 2.4.1 does not apply to random variables defined on a proper subset. Note that $P(\bar\Pi_{n_1,n_2}) = (p_1^{n_1} + q_1^{n_1})(p_2^{n_2} + q_2^{n_2})$, which decays geometrically in the sample sizes and is therefore of higher order than $O(n^{-3/2})$. Hence, the probability that $W_{n_1,n_2}$ has no definition can be absorbed in $O(n^{-3/2})$, and a smooth approximation of the coverage probability of the Wald interval will be given in an expression with error term $O(n^{-3/2})$.

What would happen if the Edgeworth expansions were theoretically valid on the subset $\Pi_{n_1,n_2}$? We will focus on $\Pi_{n_1,n_2}$ henceforth.

For simplicity, we consider the case when $n_1 = n_2 = n \ge 2$.

The procedure for deriving the smooth approximation contains four steps. First, we create two sequences of random variables and define a statistic $T_{n,n}$ by using those created random variables. We then show that the statistic $T_{n,n}$ can be used to approximate the exact coverage probability of the Wald interval. In step 2, we verify the validity of an Edgeworth expansion for $T_{n,n}$, assuming the expansions are valid on a subset. The Edgeworth expansion for $T_{n,n}$ is derived in step 3. Last, in step 4, the smoothing formula for the coverage probability of the Wald interval is obtained by applying the results of the first three steps.

Step 1. We first create two sequences of random variables to be used in the Edgeworth expansion and show that the exact coverage probability of the Wald interval can be approximated using a statistic defined through the created random variables.

Suppose $\{\xi_{i,j}\}$ and $\{\eta_{i,j}\}$ are two independent sequences of i.i.d. random variables for $j = 1, 2, \ldots$, both independent of $Y_{i,j}$ for $i = 1, 2$, with $\xi_{i,j} \sim U(-1/n^4, 1/n^4)$ and $\eta_{i,j} \sim U(1 - 1/n^4, 1 + 1/n^4)$. For $i = 1, 2$ and $j = 1, 2, \ldots$, define

\[ T_{i,j} = \xi_{i,j} I[Y_{i,j} = 0] + \eta_{i,j} I[Y_{i,j} = 1]. \tag{2.4.7} \]

Then

\[ Y_{i,j} - \frac{1}{n^4} < T_{i,j} < Y_{i,j} + \frac{1}{n^4}. \tag{2.4.8} \]

Put $\bar T_i = \sum_{j=1}^{n} T_{i,j}/n$. The following inequality holds by applying inequality (2.4.8):

\[ \hat p_i - \frac{1}{n^4} < \bar T_i < \hat p_i + \frac{1}{n^4}. \tag{2.4.9} \]
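Consider the quantity under the square root in $W_{n,n}$:

\begin{align*}
\frac{\hat p_1(1-\hat p_1)}{n} + \frac{\hat p_2(1-\hat p_2)}{n}
&< \frac{(\bar T_1 + 1/n^4)(1 - \bar T_1 + 1/n^4)}{n} + \frac{(\bar T_2 + 1/n^4)(1 - \bar T_2 + 1/n^4)}{n} \\
&< \frac{\bar T_1(1-\bar T_1)}{n} + \frac{\bar T_2(1-\bar T_2)}{n} + \frac{3}{n^5}. \tag{2.4.10}
\end{align*}

Similarly,

\[ \frac{\hat p_1(1-\hat p_1)}{n} + \frac{\hat p_2(1-\hat p_2)}{n} > \frac{\bar T_1(1-\bar T_1)}{n} + \frac{\bar T_2(1-\bar T_2)}{n} - \frac{3}{n^5}. \tag{2.4.11} \]

Note that $\bar T_1(1-\bar T_1)/n + \bar T_2(1-\bar T_2)/n - 3/n^5 \le 0$ if and only if $\hat p_1(1-\hat p_1)/n + \hat p_2(1-\hat p_2)/n = 0$, which is out of our consideration according to the previous analysis. Then, on $\Pi_{n,n}$, define

\[ T_{n,n} = \frac{\bar T_1 - \bar T_2 - (p_1 - p_2)}{\sqrt{\bar T_1(1-\bar T_1)/n + \bar T_2(1-\bar T_2)/n}}. \]

Therefore, applying inequalities (2.4.10) and (2.4.11), the following inequality chain holds:

\[ \frac{\bar T_1 - \bar T_2 - (p_1 - p_2) - 2/n^4}{\sqrt{\bar T_1(1-\bar T_1)/n + \bar T_2(1-\bar T_2)/n + 3/n^5}} < W_{n,n} < \frac{\bar T_1 - \bar T_2 - (p_1 - p_2) + 2/n^4}{\sqrt{\bar T_1(1-\bar T_1)/n + \bar T_2(1-\bar T_2)/n - 3/n^5}}. \]

Furthermore, a few lines of expansion and algebra yield

\[ T_{n,n}(1 - 2/n^3) - n^{-2} < W_{n,n} < T_{n,n}(1 + 2/n^3) + n^{-2}. \]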

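Therefore, the coverage probability of the Wald interval satisfies

\begin{align*}
P_W &= P\big(|W_{n,n}| \le z_{\alpha/2}\big) \\
&= P\big(W_{n,n} \le z_{\alpha/2}\big) - P\big(W_{n,n} < -z_{\alpha/2}\big) \\
&\le P\big(T_{n,n} \le (z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big) - P\big(T_{n,n} < -(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big). \tag{2.4.12}
\end{align*}

Similarly,

\begin{align*}
& P\big(|W_{n,n}| \le z_{\alpha/2}\big) \tag{2.4.13} \\
&\quad \ge P\big(T_{n,n} \le (z_{\alpha/2} - n^{-2})/(1 + 2/n^3)\big) - P\big(T_{n,n} < -(z_{\alpha/2} - n^{-2})/(1 - 2/n^3)\big). \tag{2.4.14}
\end{align*}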

Step 2. Verify the validity of performing an Edgeworth expansion for $T_{n,n}$.

Define a sequence of random vectors $Z_j = (T_{1,j}, T_{2,j})^{\mathsf T}$, where $j = 1, 2, \ldots$. Note that

\begin{align*}
ET_{i,j} &= E\big(\xi_{i,j} I[Y_{i,j} = 0] + \eta_{i,j} I[Y_{i,j} = 1]\big) \\
&= E(\xi_{i,j})(1 - p_i) + E(\eta_{i,j})p_i = p_i,
\end{align*}

so the $Z_j$'s are i.i.d. random vectors satisfying $\mu = E(Z_j) = (p_1, p_2)^{\mathsf T}$. Let $\bar Z = \frac{1}{n}\sum_{j=1}^{n} Z_j = (\bar T_1, \bar T_2)^{\mathsf T}$. For any vector $x = (x^{(1)}, x^{(2)}) \in (0,1)^2$, define

\[ A(x) = \frac{x^{(1)} - x^{(2)} - (p_1 - p_2)}{\sqrt{x^{(1)}(1 - x^{(1)}) + x^{(2)}(1 - x^{(2)})}}. \]

Then $A(x)$ is an infinitely differentiable function with $A(\bar Z) = n^{-1/2}T_{n,n}$ and $A(\mu) = 0$.

Note that the asymptotic variance of $n^{1/2}A(\bar Z)$ is 1. According to Theorem 2.4.1,

\begin{align*}
P\{n^{1/2}A(\bar Z) \le x\} &= P\{T_{n,n} \le x\} \\
&= \Phi(x) + n^{-1/2}r_1(x)\phi(x) + n^{-1}r_2(x)\phi(x) + \cdots + n^{-k/2}r_k(x)\phi(x) + o(n^{-k/2}).
\end{align*}

The above expansion is valid as an asymptotic series to $k$ terms if, for any positive integer $j$,

\[ E(\|Z_j\|^{k+2}) < \infty \tag{2.4.15} \]

and

\[ \limsup_{|t^{(1)}| + |t^{(2)}| \to \infty} \big|E\exp\big(it^{(1)}Z_j^{(1)} + it^{(2)}Z_j^{(2)}\big)\big| < 1. \tag{2.4.16} \]

By an argument (page 65) in Hall (1992), Cramer's condition (2.4.16) holds if the distribution of the random vector $Z_j$ has a non-degenerate, absolutely continuous component, which is satisfied in the current setting. The former inequality (2.4.15) is guaranteed by

\[ E(\|Z_j\|^{k+2}) = E\big(|T_{1,j}|^2 + |T_{2,j}|^2\big)^{(k+2)/2} \le 4^{(k+2)/2}. \]

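Step 3. Derive the two-term Edgeworth expansion of $n^{1/2}A(\bar Z)$.

For simplicity, use $S_n$ to denote $n^{1/2}A(\bar Z)$. Let $W_i = \bar T_i - p_i$ for $i = 1, 2$; then

\begin{align*}
E(W_i) &= 0, \qquad E(W_i^2) = \frac{p_iq_i}{n} + O(n^{-8}), \\
E(W_i^3) &= \frac{p_iq_i(q_i - p_i)}{n^2} + O(n^{-8}), \\
E(W_i^4) &= \frac{p_iq_i(p_i^3 + q_i^3) + 3(n-1)p_i^2q_i^2}{n^3} + O(n^{-8}) \sim O(n^{-2}), \\
E(W_i^5) &= \frac{10p_i^2q_i^2(q_i - p_i)}{n^3} + O(n^{-4}) + O(n^{-8}) \sim O(n^{-3}), \\
E(W_i^6) &= \frac{15p_i^3q_i^3}{n^3} + O(n^{-4}) + O(n^{-8}) \sim O(n^{-3}). \tag{2.4.17}
\end{align*}

Then with $S_n = n^{1/2}A(\bar Z)$ and $W_i = O_p(n^{-1/2})$, and writing $\tau = p_1q_1 + p_2q_2$,

\begin{align*}
S_n &= n^{1/2}(W_1 - W_2)\big((W_1 + p_1)(-W_1 + q_1) + (W_2 + p_2)(-W_2 + q_2)\big)^{-1/2} \\
&= n^{1/2}(W_1 - W_2)\big(p_1q_1 + p_2q_2 + (q_1 - p_1)W_1 + (q_2 - p_2)W_2 - (W_1^2 + W_2^2)\big)^{-1/2} \\
&= n^{1/2}(W_1 - W_2)\tau^{-1/2}\Big(1 + \frac{(q_1 - p_1)W_1}{\tau} + \frac{(q_2 - p_2)W_2}{\tau} - \frac{W_1^2 + W_2^2}{\tau}\Big)^{-1/2} \\
&= n^{1/2}\tau^{-1/2}(W_1 - W_2)\Big\{1 - \tfrac{1}{2}\tau^{-1}\sum_{i=1}^{2}(q_i - p_i)W_i + \tfrac{1}{2}\tau^{-1}\sum_{i=1}^{2}\big[1 + \tfrac{3}{4}\tau^{-1}(q_i - p_i)^2\big]W_i^2 \\
&\qquad + \tfrac{3}{4}\tau^{-2}(q_1 - p_1)(q_2 - p_2)W_1W_2\Big\} + O_p(n^{-3/2}).
\end{align*}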

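Therefore, applying the moment equations (2.4.17) and the independence of $W_1$ and $W_2$, we have

\begin{align*}
E(S_n) &= n^{1/2}\tau^{-1/2}E\Big((W_1 - W_2)\big(1 - \tfrac{1}{2}\tau^{-1}\sum_{i=1}^{2}(q_i - p_i)W_i\big)\Big) + O(n^{-1}) \\
&= -\tfrac{1}{2}n^{-1/2}\tau^{-3/2}\gamma + O(n^{-1}), \\
E(S_n^2) &= n\tau^{-1}E\Big\{(W_1 - W_2)^2\Big(1 - \tau^{-1}\sum_{i=1}^{2}(q_i - p_i)W_i + \tau^{-1}\sum_{i=1}^{2}\big(1 + \tau^{-1}(q_i - p_i)^2\big)W_i^2 \\
&\qquad + 2\tau^{-2}(q_1 - p_1)(q_2 - p_2)W_1W_2\Big)\Big\} + O(n^{-3/2}) \\
&= 1 + n^{-1} + 2n^{-1}\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) + 2n^{-1}\tau^{-3}\gamma^2 + O(n^{-3/2}), \\
E(S_n^3) &= n^{3/2}\tau^{-3/2}E\Big((W_1 - W_2)^3\big(1 - \tfrac{3}{2}\tau^{-1}\sum_{i=1}^{2}(q_i - p_i)W_i\big)\Big) + O(n^{-1}) \\
&= -\tfrac{7}{2}n^{-1/2}\tau^{-3/2}\gamma + O(n^{-1}),
\end{align*}

and

\begin{align*}
E(S_n^4) &= n^{2}\tau^{-2}E\Big\{(W_1 - W_2)^4\Big(1 - 2\tau^{-1}\sum_{i=1}^{2}(q_i - p_i)W_i + \tau^{-1}\sum_{i=1}^{2}\big(2 + 3\tau^{-1}(q_i - p_i)^2\big)W_i^2 \\
&\qquad + 6\tau^{-2}(q_1 - p_1)(q_2 - p_2)W_1W_2\Big)\Big\} + O(n^{-3/2}) \\
&= 3 + 6n^{-1} + 18n^{-1}\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) - 2n^{-1}\tau^{-2}\kappa + 28n^{-1}\tau^{-3}\gamma^2 + O(n^{-3/2}).
\end{align*}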

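Hence, by equations (2.4.6),

\begin{align*}
\kappa_{1,n} &= E(S_n) = -\tfrac{1}{2}n^{-1/2}\tau^{-3/2}\gamma + O(n^{-1}), \\
\kappa_{2,n} &= E(S_n^2) - (E(S_n))^2 \\
&= 1 + n^{-1} + 2n^{-1}\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) + \tfrac{7}{4}n^{-1}\tau^{-3}\gamma^2 + O(n^{-3/2}), \\
\kappa_{3,n} &= E(S_n^3) - 3E(S_n^2)E(S_n) + 2(E(S_n))^3 \\
&= -2n^{-1/2}\tau^{-3/2}\gamma + O(n^{-1}),
\end{align*}

and

\begin{align*}
\kappa_{4,n} &= E(S_n^4) - 4E(S_n^3)E(S_n) - 3(E(S_n^2))^2 + 12E(S_n^2)(E(S_n))^2 - 6(E(S_n))^4 \\
&= 6n^{-1}\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) - 2n^{-1}\tau^{-2}\kappa + 12n^{-1}\tau^{-3}\gamma^2 + O(n^{-3/2}).
\end{align*}

Therefore, in the notation of (2.4.5), the two-term Edgeworth expansion for $T_{n,n}$ is

\[ P(T_{n,n} < a) = \Phi(a) + n^{-1/2}r_1(a)\phi(a) + n^{-1}r_2(a)\phi(a) + O(n^{-3/2}), \]

where $r_1(a)$ and $r_2(a)$ are given in equations (2.4.3) and (2.4.4) with

\begin{align*}
k_{1,2} &= -\tfrac{1}{2}\tau^{-3/2}\gamma, \\
k_{2,2} &= 1 + 2\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) + \tfrac{7}{4}\tau^{-3}\gamma^2, \\
k_{3,1} &= -2\tau^{-3/2}\gamma, \\
k_{4,1} &= 6\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) - 2\tau^{-2}\kappa + 12\tau^{-3}\gamma^2,
\end{align*}

in which $\tau = p_1q_1 + p_2q_2$.

Step 4. Compute the smooth approximation of the coverage probability of the Wald interval.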

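By equation (2.4.12),

\begin{align*}
P_W \le{} & P\big(T_{n,n} \le (z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big) - P\big(T_{n,n} < -(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big) \\
={} & \Phi\big((z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big) - \Phi\big(-(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big) \\
& + n^{-1/2}r_1\big((z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big)\,\phi\big((z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big) \\
& - n^{-1/2}r_1\big(-(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big)\,\phi\big(-(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big) \\
& + n^{-1}r_2\big((z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big)\,\phi\big((z_{\alpha/2} + n^{-2})/(1 - 2/n^3)\big) \\
& - n^{-1}r_2\big(-(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big)\,\phi\big(-(z_{\alpha/2} + n^{-2})/(1 + 2/n^3)\big) \\
& + O(n^{-3/2}) \\
={} & (1 - \alpha) + 2n^{-1}r_2(z_{\alpha/2})\phi(z_{\alpha/2}) + O(n^{-3/2}). \tag{2.4.19}
\end{align*}

The cancellation is valid because all functions appearing in the two-term Edgeworth expansion of $S_n$ are continuous and the $n^{-2}$ terms can be absorbed in $O(n^{-3/2})$. That $r_1(x)$ and $\phi(x)$ are even functions and $r_2(x)$ is an odd function also guarantees the last two steps.

Similarly, it can be shown that

\[ P_W \ge (1 - \alpha) + 2n^{-1}r_2(z_{\alpha/2})\phi(z_{\alpha/2}) + O(n^{-3/2}). \tag{2.4.20} \]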

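Combining inequalities (2.4.19) and (2.4.20), we obtain the smoothing formula for the coverage probability of the Wald interval:

The coverage probability of the Wald interval is at most $1 - (p_1^{n_1} + q_1^{n_1})(p_2^{n_2} + q_2^{n_2})$ and can be expanded as

\[ P_W = (1 - \alpha) + 2n^{-1}r_w(z_{\alpha/2})\phi(z_{\alpha/2}) + O(n^{-3/2}), \tag{2.4.21} \]

where $r_w(z_{\alpha/2}) = r_2(z_{\alpha/2})$ in equation (2.4.4) with

\begin{align*}
k_{1,2} &= -\tfrac{1}{2}\tau^{-3/2}\gamma, \\
k_{2,2} &= 1 + 2\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) + \tfrac{7}{4}\tau^{-3}\gamma^2, \\
k_{3,1} &= -2\tau^{-3/2}\gamma, \\
k_{4,1} &= 6\tau^{-2}(p_1^2q_1^2 + p_2^2q_2^2) - 2\tau^{-2}\kappa + 12\tau^{-3}\gamma^2,
\end{align*}

in which $\tau = p_1q_1 + p_2q_2$, when $0 < \alpha < 1$ and $n_1 = n_2 = n \ge 2$.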
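
To make the smoothing formula concrete, the following Python sketch evaluates the right-hand side of (2.4.21) without the $O(n^{-3/2})$ term for $n_1 = n_2 = n$. It is an illustration under the definitions of $\gamma$, $\kappa$, $\tau$ and the $k$'s given above, not the program used for the figures; the function name `smooth_wald_coverage` is hypothetical.

```python
from statistics import NormalDist


def smooth_wald_coverage(n, p1, p2, level=0.95):
    """Two-term approximation (1 - alpha) + 2 r2(z) phi(z) / n, n1 = n2 = n."""
    q1, q2 = 1 - p1, 1 - p2
    tau = p1 * q1 + p2 * q2
    gamma = p1 * q1 * (q1 - p1) - p2 * q2 * (q2 - p2)
    kappa = (p1 * q1 * (q1 - p1) ** 2 + p2 * q2 * (q2 - p2) ** 2
             - 2 * p1**2 * q1**2 - 2 * p2**2 * q2**2)
    s = (p1 * q1) ** 2 + (p2 * q2) ** 2  # p1^2 q1^2 + p2^2 q2^2

    # Coefficients from Step 3 of the derivation.
    k12 = -0.5 * gamma / tau**1.5
    k22 = 1 + 2 * s / tau**2 + 1.75 * gamma**2 / tau**3
    k31 = -2 * gamma / tau**1.5
    k41 = 6 * s / tau**2 - 2 * kappa / tau**2 + 12 * gamma**2 / tau**3

    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - (1 - level) / 2)
    # r2 as in equation (2.4.4).
    r2 = -z * (0.5 * (k22 + k12**2)
               + (k41 + 4 * k12 * k31) * (z**2 - 3) / 24
               + k31**2 * (z**4 - 10 * z**2 + 15) / 72)
    return level + 2 * r2 * std_normal.pdf(z) / n


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, smooth_wald_coverage(n, 0.9, 0.1))
```

For $p_1 = 0.9$, $p_2 = 0.1$ the approximation stays below the nominal 95% level and approaches it at rate $1/n$.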

Remark 2.4.1. Neglecting the error term $O(n^{-3/2})$, one can see that the approximate coverage probability given by the smoothing formula is a smooth function of the sample size $n$ and the proportions; thus it does not inherit the oscillation of the exact coverage probability. Because we used random variables with absolutely continuous distribution functions instead of the discrete binomial random variables in deriving the expansion, the oscillation is likely caused by the discreteness of the Bernoulli distribution.

Remark 2.4.2. Compare the approximate coverage probabilities from the expansion with the exact coverage probabilities; see, for instance, Figures 2.4 and 2.5, in which the nominal levels are 95% and SE stands for the smooth expansion. We notice that the coverage probabilities approximated by the smoothing formula follow the trend of the exact coverage regardless of the oscillation. The approximate coverage given by the smoothing formula is lower than the nominal level. Our further study shows that at the 95% nominal level, the $r_w(z_{\alpha/2})$ term is always negative unless both proportions are either less than 0.028 or greater than 0.972. When the

Figure 2.4: Exact and SE approximate coverage probabilities of the nominal 95% Wald intervals for $p_1 = 0.9$, $p_2 = 0.1$ and $n_1 = n_2 = n$

[Plot omitted: coverage probability against n1 = n2 = n from 20 to 100; legend: SE approximation (dashed), exact coverage.]

nominal level is 90%, if $0.001 \le p_i \le 0.999$ for $i = 1$ or $i = 2$, then $r_w(z_{\alpha/2}) < 0$. When the two proportions take some extreme values, the $(p_1^{n_1} + q_1^{n_1})(p_2^{n_2} + q_2^{n_2})$ term may reach a non-negligible amount even for quite large sample sizes, say $0.999^{200} = 0.819$ and $0.99^{200} = 0.134$. Therefore, formula (2.4.21) explains why the Wald interval typically has lower exact coverage probabilities than the nominal level, and the formula gives the order of the negative deviation of the coverage probability of the Wald interval from the nominal level.

Remark 2.4.3. The smooth approximation of the exact coverage probability of the Wald interval at different sample sizes is still valid when $n_i = \pi_i n$ for $i = 1, 2$, where $\pi_1$ and $\pi_2$ are two relatively prime positive integers and $n$ is a positive integer. Figure 2.6 plots the exact coverage probabilities and their approximations

Figure 2.5: Exact and SE approximate coverage probabilities of the nominal 95% Wald intervals for $p_1 = 0.8$, $p_2 = 0.3$ and $n_1 = n_2 = n$

[Plot omitted: coverage probability (0.91 to 0.95) against n1 = n2 = n from 20 to 100; legend: SE approximation (dashed), exact coverage.]

given by the smoothing formula when $p_1 = 0.9$, $p_2 = 0.8$, $n_2 = 2n_1$ and $n_1$ varies from 20 to 100 at the 95% level. The smoothing formula approximates well in this case too.

Remark 2.4.4. For any discrete random variables that have finitely many possible values, a corresponding formula can be derived through the method applied above. For other discrete random variables, the constant 4 in defining $\xi$ and $\eta$ may take some other value.

According to our analysis, we conclude that the Wald interval behaves much worse than commonly expected and should be used with caution. Alternative intervals will be evaluated in the next chapter.

Figure 2.6: Exact and SE approximate coverage probabilities of the nominal 95% Wald intervals for $p_1 = 0.8$, $p_2 = 0.3$ and $n_2 = 2n_1$

[Plot omitted: coverage probability against n1 from 20 to 100; legend: SE approximation (dashed), exact coverage (dotted).]

Chapter 3

Interval Estimation for the Difference of two Binomial Proportions

3.1 Introduction

The poor performance of the Wald interval for the difference of two binomial proportions has been addressed in the last chapter. Consequently, quite a few methods for constructing alternative intervals have been suggested. Their performances differ significantly.

In Section 3.2, we present several interval estimation methods as candidates to replace the Wald interval, each with its motivation. The candidate intervals are classified into three groups: (1) The Wald interval with continuity correction. It has the same center as the original Wald interval. (2) Confidence intervals with adjusted centers. We select two such intervals. One is derived through a Bayesian approach with Beta prior distributions followed by a normal approximation; we call it the (approximate) Bayes interval in this report. The other is proposed by Agresti and Coull (1998); the main idea is to add four pseudo observations. Both intervals have different centers from the Wald interval. (3) The profile likelihood based confidence interval, which, unlike the other intervals, does not have an explicit form.

In Section 3.3, the performances of the above intervals with explicit forms, along with the Wald interval, are evaluated. We assess those intervals on two aspects. One is their coverage probabilities. All the coverage probabilities in this section are computed exactly rather than by simulation. The other is their expected lengths.

The profile likelihood based interval is taken into consideration in Section 3.4. We compare the coverage probabilities and lengths of all the alternative intervals, along with the Wald interval, through simulation.

According to our analysis, we recommend the intervals with adjusted centers as substitutes for the Wald interval.

We concentrate on intervals with the 95% nominal level in this chapter. The conclusions also hold for intervals with other nominal levels.

3.2 Some Alternative Intervals

The following are some alternatives to the Wald interval.

3.2.1 The Wald interval with continuity correction

There are a few intervals with different correction terms in this category. The most widely used one is

\[ \hat p_1 - \hat p_2 \pm \left( z_{\alpha/2}\sqrt{\frac{\hat p_1(1-\hat p_1)}{n_1} + \frac{\hat p_2(1-\hat p_2)}{n_2}} + \frac{1}{2n_1} + \frac{1}{2n_2} \right). \tag{3.2.1} \]

It results from inverting the Wald test: when computing the p-value, a continuity correction is applied to improve the accuracy of the central limit theorem approximation. This interval has the same center as the Wald interval and a greater margin of error.

3.2.2 Intervals with adjusted center

Approximate Bayes interval

The method is motivated by using Bayesian estimates instead of maximum likelihood estimates when deriving confidence intervals. For $i = 1, 2$, an independent conjugate Beta$(a, b)$ prior distribution results in the posterior distribution of $p_i$ being Beta$(a + X_i, b + n_i - X_i)$, with mean $\tilde p_i = (X_i + a)/(n_i + a + b)$ and variance $\tilde p_i(1 - \tilde p_i)/(n_i + a + b + 1)$. Using a normal approximation for the distribution of the difference of the posterior beta variates leads to the approximate Bayes interval

\[ \tilde p_1 - \tilde p_2 \pm z_{\alpha/2}\sqrt{\frac{\tilde p_1(1-\tilde p_1)}{n_1 + a + b + 1} + \frac{\tilde p_2(1-\tilde p_2)}{n_2 + a + b + 1}}. \]

In particular, if $a = b$, the estimators $\tilde p_1$ and $\tilde p_2$ are driven closer to 1/2 than $\hat p_1$ and $\hat p_2$ respectively, unless $\hat p_i = 1/2$. As suggested by Berry (1996, p. 291), the approximate Bayes interval in this report takes $a = b = 1$, which leads to

\[ \tilde p_1 - \tilde p_2 \pm z_{\alpha/2}\sqrt{\frac{\tilde p_1(1-\tilde p_1)}{n_1 + 3} + \frac{\tilde p_2(1-\tilde p_2)}{n_2 + 3}}, \tag{3.2.2} \]

where $\tilde p_i = (X_i + 1)/(n_i + 2)$ for $i = 1, 2$.

The Agresti-Coull interval
As mentioned in Chapter 1, motivated by the Wilson interval for one binomial proportion as shown in Wilson (1927), Agresti and Coull (1998) suggested an interval with $z_{0.025}^2 \approx 4$ pseudo observations, one success and one failure from each binomial population. Then the sample proportions are $\tilde p_i = (X_i + 1)/(n_i + 2)$ for $i = 1, 2$. Replacing $\hat p_i$ with $\tilde p_i$ and $n_i$ with $n_i + 2$ for $i = 1, 2$ in the ordinary Wald interval yields the Agresti-Coull interval

\[ CI_{AC} = \tilde p_1 - \tilde p_2 \pm z_{\alpha/2}\sqrt{\frac{\tilde p_1(1-\tilde p_1)}{n_1 + 2} + \frac{\tilde p_2(1-\tilde p_2)}{n_2 + 2}}. \tag{3.2.3} \]

The above two intervals have the same center. The approximate Bayes interval is a subset of the Agresti-Coull interval. Since they have a different center from the Wald interval (with or without continuity correction) but a similar form, we call them intervals with adjusted centers.
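
The four explicit intervals discussed so far can be written compactly in code. The following Python sketch is an illustration of formulas (3.2.1) to (3.2.3) and the ordinary Wald interval; the function names are hypothetical and the default $a = b = 1$ follows the choice made above.

```python
from math import sqrt
from statistics import NormalDist


def _z(level):
    """Upper alpha/2 point of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def wald(x1, n1, x2, n2, level=0.95):
    p1, p2 = x1 / n1, x2 / n2
    half = _z(level) * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - half, p1 - p2 + half


def wald_cc(x1, n1, x2, n2, level=0.95):
    """Wald interval widened by the continuity correction of (3.2.1)."""
    lo, hi = wald(x1, n1, x2, n2, level)
    cc = 1 / (2 * n1) + 1 / (2 * n2)
    return lo - cc, hi + cc


def approx_bayes(x1, n1, x2, n2, level=0.95, a=1.0, b=1.0):
    """Approximate Bayes interval; a = b = 1 gives (3.2.2)."""
    p1 = (x1 + a) / (n1 + a + b)
    p2 = (x2 + a) / (n2 + a + b)
    half = _z(level) * sqrt(p1 * (1 - p1) / (n1 + a + b + 1)
                            + p2 * (1 - p2) / (n2 + a + b + 1))
    return p1 - p2 - half, p1 - p2 + half


def agresti_coull(x1, n1, x2, n2, level=0.95):
    """Interval (3.2.3): one pseudo success and failure per sample."""
    p1, p2 = (x1 + 1) / (n1 + 2), (x2 + 1) / (n2 + 2)
    half = _z(level) * sqrt(p1 * (1 - p1) / (n1 + 2)
                            + p2 * (1 - p2) / (n2 + 2))
    return p1 - p2 - half, p1 - p2 + half


if __name__ == "__main__":
    for f in (wald, wald_cc, approx_bayes, agresti_coull):
        print(f.__name__, f(10, 10, 10, 10))
```

With $X_1 = X_2 = 10$ and $n_1 = n_2 = 10$, the Wald interval degenerates to $(0, 0)$ while the intervals with adjusted centers keep a positive length.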


3.2.3 The proﬁle likelihood based intervals

Unlike the other intervals discussed so far, profile likelihood based intervals do not have explicit forms. Suppose the log-likelihood function of $\theta = (\Delta, p_2)$ is $l(\Delta, p_2)$, where $\Delta = p_1 - p_2$ is the parameter of interest and $p_2$ is regarded as a nuisance parameter. Let $\tilde l(\Delta) = \max_{p_2} l(\Delta, p_2)$, which is called the profile likelihood for $\Delta$, where the range of $p_2$ for the maximization is $(0, 1 - \Delta)$ if $\Delta \ge 0$ and $(-\Delta, 1)$ otherwise. Then an approximate $100(1-\alpha)\%$ profile likelihood interval for $p_1 - p_2$ is

\[ \big\{\Delta \in (-1, 1) : 2\big(l(\hat\Delta, \hat p_2) - \tilde l(\Delta)\big) \le \chi_1^2(\alpha)\big\}, \tag{3.2.4} \]

where $\hat\Delta = \hat p_1 - \hat p_2$ and $\chi_1^2(\alpha)$ is the $100\alpha$ upper percentage point of $\chi_1^2$. This interval follows from the fact that for $\Delta = \Delta_0$, $2\big(l(\hat\Delta, \hat p_2) - \tilde l(\Delta_0)\big)$ is asymptotically chi-squared distributed with 1 degree of freedom, as shown in Cox and Hinkley (1974).
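
Since (3.2.4) has no closed form, it must be computed numerically. The Python sketch below shows one possible approach under our own assumptions: the inner maximization over $p_2$ uses a ternary search (the log-likelihood is concave in $p_2$ for fixed $\Delta$), the interval endpoints are found by scanning a grid of $\Delta$ values with a hypothetical `step`, and $\chi_1^2(\alpha)$ is obtained as $z_{\alpha/2}^2$. It is not necessarily the algorithm used for the simulations in this report.

```python
from math import log
from statistics import NormalDist


def _xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)


def loglik(p1, p2, x1, n1, x2, n2):
    """Two-sample binomial log-likelihood up to an additive constant."""
    return (_xlogy(x1, p1) + _xlogy(n1 - x1, 1 - p1)
            + _xlogy(x2, p2) + _xlogy(n2 - x2, 1 - p2))


def profile_loglik(delta, x1, n1, x2, n2, iters=100):
    """Maximise the log-likelihood over p2 for fixed delta = p1 - p2."""
    eps = 1e-12  # keeps p2 and p2 + delta strictly inside (0, 1)
    lo, hi = max(0.0, -delta) + eps, min(1.0, 1.0 - delta) - eps
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if (loglik(m1 + delta, m1, x1, n1, x2, n2)
                < loglik(m2 + delta, m2, x1, n1, x2, n2)):
            lo = m1
        else:
            hi = m2
    p2 = (lo + hi) / 2
    return loglik(p2 + delta, p2, x1, n1, x2, n2)


def profile_likelihood_interval(x1, n1, x2, n2, level=0.95, step=0.001):
    """Smallest and largest grid values of delta satisfying (3.2.4)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    cutoff = z * z  # upper-alpha point of chi-square with 1 df
    l_max = loglik(x1 / n1, x2 / n2, x1, n1, x2, n2)
    k = round(1 / step)
    accepted = [i * step for i in range(-k + 1, k)
                if 2 * (l_max - profile_loglik(i * step, x1, n1, x2, n2))
                <= cutoff]
    return min(accepted), max(accepted)


if __name__ == "__main__":
    print(profile_likelihood_interval(7, 10, 3, 10))
```

The grid scan limits the endpoint accuracy to `step`; a root finder on each side of $\hat\Delta$ would be a natural refinement.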

3.3 Comparison of Intervals with Explicit Forms

We address the comparisons on two aspects: coverage properties and expected lengths. For convenience, we define some unified notation: $CI_*$ represents interval $*$, and $CP_*$ and $EL_*$ refer to its coverage probability and expected length respectively, where $*$ may be W, WCC, B, AC or PLB, indicating the Wald interval, the Wald interval with continuity correction, the approximate Bayes interval, the Agresti-Coull interval and the profile likelihood based interval respectively.

Figure 3.1: Exact coverage probability boxplots of some 95% nominal intervals

[Boxplots omitted. Vertical axis: coverage probability. Summary statistics printed in the figure:]

Interval        Median   Mean
Wald            0.933    0.924
Wald_CC         0.966    0.964
Bayes           0.949    0.951
Agresti-Coull   0.953    0.956

3.3.1 Coverage Properties

Since the Wald interval with continuity correction contains the Wald interval, its coverage probability is never less than that of the Wald interval. Similarly, the coverage probability of the approximate Bayes interval cannot exceed that of the Agresti-Coull interval.

To explore the average performance of the four intervals for small to moderate sample sizes, we randomly sampled 10,000 values of $(n_1, p_1; n_2, p_2)$, taking $p_1$ and $p_2$ independently from $U(0, 1)$ and taking $n_1$ and $n_2$ independently from the uniform distribution over $\{10, 11, \ldots, 50\}$. We evaluated the exact coverage probabilities of the four intervals for each realization of $(n_1, p_1; n_2, p_2)$ at the 95% nominal level.
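
The evaluation just described can be sketched in Python as follows. This is an illustration only: the number of draws is reduced from 10,000 for speed, the seed and the function names are hypothetical, and the intervals are treated as open, consistent with Chapter 2.

```python
import random
from math import comb, sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # 95% nominal level


def interval(kind, x1, n1, x2, n2):
    """Return (lower, upper) for kind in {'W', 'WCC', 'B', 'AC'}."""
    if kind in ("W", "WCC"):
        p1, p2, m1, m2 = x1 / n1, x2 / n2, n1, n2
    else:  # adjusted centre: one pseudo success and failure per sample
        p1, p2 = (x1 + 1) / (n1 + 2), (x2 + 1) / (n2 + 2)
        m1, m2 = (n1 + 3, n2 + 3) if kind == "B" else (n1 + 2, n2 + 2)
    half = Z * sqrt(p1 * (1 - p1) / m1 + p2 * (1 - p2) / m2)
    if kind == "WCC":
        half += 1 / (2 * n1) + 1 / (2 * n2)
    return p1 - p2 - half, p1 - p2 + half


def exact_coverage(kind, n1, p1, n2, p2):
    """Exact coverage probability by enumerating all (x1, x2)."""
    pmf1 = [comb(n1, x) * p1**x * (1 - p1) ** (n1 - x) for x in range(n1 + 1)]
    pmf2 = [comb(n2, x) * p2**x * (1 - p2) ** (n2 - x) for x in range(n2 + 1)]
    total = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            lo, hi = interval(kind, x1, n1, x2, n2)
            if lo < p1 - p2 < hi:
                total += pmf1[x1] * pmf2[x2]
    return total


def average_coverage(kinds=("W", "WCC", "B", "AC"), draws=200, seed=0):
    """Mean exact coverage over random (n1, p1; n2, p2) configurations."""
    rng = random.Random(seed)
    sums = dict.fromkeys(kinds, 0.0)
    for _ in range(draws):
        n1, n2 = rng.randint(10, 50), rng.randint(10, 50)
        p1, p2 = rng.random(), rng.random()
        for kind in kinds:
            sums[kind] += exact_coverage(kind, n1, p1, n2, p2)
    return {kind: s / draws for kind, s in sums.items()}


if __name__ == "__main__":
    print(average_coverage())
```

Because every configuration is evaluated exactly, the only randomness is in the choice of $(n_1, p_1; n_2, p_2)$.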

Figure 3.1 shows the coverage probability boxplots of the four intervals. The means and medians of the coverage probabilities of the four intervals are listed in the figure as well. The performance of the Wald interval is very poor. Both the mean and the median of its coverage probabilities are much lower than the nominal level and than those of the other intervals. Its behavior is not stable either: it suffers from occasional very low coverage probabilities. For example, the minimum coverage probability of the Wald interval in our evaluation is only 0.170.

Contrary to the Wald interval, the coverage probabilities of the Wald interval with continuity correction tend to be much higher than the nominal level. It has 81.4% of its coverage probabilities greater than 0.96. Since it is just a simple widening of the Wald interval, it inherits some disadvantages of the Wald interval, such as unstable and occasionally very low coverage probabilities.

Compared with the Wald interval and its variant, the coverage probabilities of the approximate Bayes interval and the Agresti-Coull interval lie closer to the nominal level, which makes them more reliable. This can also be seen from the mean distances of the coverage probabilities of the four intervals from the nominal level, which are 0.026, 0.019, 0.07 and 0.07 for the Wald, Wald with continuity correction, Bayes and Agresti-Coull intervals respectively.

According to the analysis in Chapter 2, when both $p_1$ and $p_2$ get closer to 1/2, $EW_{n_1,n_2}$, the bias of the test statistic $W_{n_1,n_2}$ from the mean of the standard normal distribution, tends to decrease, where

\[ W_{n_1,n_2} = \frac{\hat p_1 - \hat p_2 - (p_1 - p_2)}{\sqrt{\hat p_1(1-\hat p_1)/n_1 + \hat p_2(1-\hat p_2)/n_2}} \]

is the statistic through which the Wald interval can be derived. This explains why the coverage behaviors of the intervals with adjusted centers are better than that of the Wald interval: $\tilde p_i$ is closer to 1/2 than $\hat p_i$ unless $\hat p_i = 1/2$.

Average performance over the unit square for $(p_1, p_2)$ and $\{10, 11, \ldots, 50\} \times \{10, 11, \ldots, 50\}$ for $(n_1, n_2)$ can mask certain behaviors of the four intervals in some regions. In particular, some pairs $(p_1, p_2)$ are of more interest, say those with $|p_1 - p_2|$ small. Similarly, some pairs $(n_1, n_2)$ may be more important; for example, proportional sampling may result in some favorable $(n_1, n_2)$. Hence, it is necessary to focus on some special and common cases.

Figure 3.2 shows how the coverage probabilities of the four intervals vary for $p_2 = 0.5$ and $p_1$ varying between 0.01 and 0.99 with step size 0.01 at $n_1 = n_2 = 20$ and nominal level 95%. In this case, the coverage probabilities of the Wald interval never achieve the nominal level. Its mean coverage is 0.933. The coverage of the Wald interval with continuity correction is always above 0.96, with mean 0.969. Most coverage probabilities of the Bayes interval and the Agresti-Coull interval fluctuate between 0.94 and 0.96. The mean coverage probabilities are 0.946 and 0.951 respectively.

Figure 3.3 demonstrates the behaviors of the four intervals when $p_1 = p_2$ vary between 0.01 and 0.99 with step size 0.01 at $n_1 = n_2 = 20$ and nominal level 95%. Again, the coverage probability of the Wald interval never reaches the nominal

Figure 3.2: Comparison of exact coverage probabilities for $p_2 = 0.5$, $n_1 = n_2 = 20$ at 95% nominal level

[Plot omitted: coverage probability against p1 from 0.0 to 1.0; curves for the Wald, Wald with correction, Bayes and Agresti-Coull intervals.]

Figure 3.3: Comparison of exact coverage probabilities for $p_1 = p_2 = 0.01$ to $0.99$, $n_1 = n_2 = 20$ at 95% nominal level

[Plot omitted: coverage probability (0.90 to 1.00) against p1 = p2.]

Figure 3.4: Comparison of exact coverage probabilities at $p_1 = 0.7$, $p_2 = 0.5$, $n_2 = 2n_1$ at 95% nominal level

[Plot omitted: coverage probability (0.90 to 0.98) against n1 from 10 to 50; curves for the Wald, Wald with correction, Bayes and Agresti-Coull intervals.]

level and the coverage of its variate remains above 0.96. It is striking to see that
all the coverage probabilities of the three alternative intervals converge to 1 and
the coverage of the Wald interval drops dramatically as p1 = p2 approach either
boundary. This is because the Wald interval suffers from shrinkage to an empty
set for some realizations of (nl, pl; 712, p2) while the other intervals do not. When
both sample sizes approach boundaries and sample sizes are small or moderate, the

chance that the Wald interval is empty might be non-negligible or quite large.

The coverage performances for p1 = 0.7, p2 = 0.5 and n2 = 2n1 are plotted in Figure 3.4. When the sample sizes are small, the coverage probabilities of the Wald interval are much lower than the nominal level, while the other three intervals have coverage probabilities much closer to the nominal level. As the sample sizes increase, all the coverage probabilities are driven to the nominal level. Though in this setting the exact coverage probabilities of the Wald interval with continuity correction and the intervals with adjusted centers are higher than the 95% nominal level, this does not persist for all p1 and p2 when n2 = 2n1. For example, if p1 = 0.9, p2 = 0.1, n1 = 12 and n2 = 24, the coverage probability of the Wald interval with continuity correction is only 0.915. More coverage probabilities for p1 = 0.9, p2 = 0.1 are plotted in Figure 3.5.

The discussion so far shows that the coverage of the Wald interval is too low and the coverage of the Wald interval with continuity correction is often too high. The intervals with adjusted centers have coverage probabilities around the nominal level.

3.3.2 Expected Lengths

In addition to coverage probability, parsimony in length is another important criterion for evaluating a confidence interval. We have shown that the coverage probabilities of the intervals with adjusted centers are much higher than those of the Wald interval in a frequentist sense, but this gain in coverage probability is not due to greater lengths. On the contrary, the intervals with adjusted centers often have smaller lengths than the Wald interval. For the Wald interval with continuity correction, however, the improvement in coverage probability is achieved entirely by widening the Wald interval.

Figure 3.5: Comparison of exact coverage probabilities at p1 = 0.9, p2 = 0.1, n2 = 2n1 at 95% nominal level
[Plot: coverage probability against n1 for the Wald, Wald with correction, Bayes and Agresti-Coull intervals.]

Theorem 3.3.1. Denote $\frac{1}{2z_{\alpha/2}}$ by $c$. Then

$$cEL_W = u^{1/2} - \frac{u^{-1/2}}{8}\left(\frac{4(p_1q_1 + r^2p_2q_2)}{n^2} + \frac{p_1q_1(q_1-p_1)^2 + r^3p_2q_2(q_2-p_2)^2}{n^2(p_1q_1 + rp_2q_2)}\right) + O(n^{-2}) \tag{3.3.1}$$

$$cEL_{WCC} = cEL_W + \frac{1+r}{2n} + O(n^{-2}) \tag{3.3.2}$$

$$cEL_B = cEL_W + u^{1/2}\,\frac{1 + r^2 - 7(p_1q_1 + r^2p_2q_2)}{2n(p_1q_1 + rp_2q_2)} + O(n^{-2}) \tag{3.3.3}$$

$$cEL_{AC} = cEL_W + u^{1/2}\,\frac{1 + r^2 - 6(p_1q_1 + r^2p_2q_2)}{2n(p_1q_1 + rp_2q_2)} + O(n^{-2}) \tag{3.3.4}$$

where $r = n_1/n_2$, $n = n_1$ and

$$u = \frac{p_1q_1}{n_1} + \frac{p_2q_2}{n_2} = \frac{1}{n}(p_1q_1 + rp_2q_2).$$

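Proof. Define $w_i = (\hat p_i - p_i)/p_i$, so that $\hat p_i = p_iw_i + p_i$. The length of the Wald interval, denoted by $L_W$, is

$$L_W = 2z_{\alpha/2}\left(\sum_{i=1}^{2}\frac{\hat p_i(1-\hat p_i)}{n_i}\right)^{1/2} = 2z_{\alpha/2}\left[u^{1/2} - \frac{1}{2}u^{-1/2}\sum_{i=1}^{2}\left(\frac{p_i^2}{n_i} + \frac{1}{4}u^{-1}\frac{p_i^2(q_i-p_i)^2}{n_i^2}\right)w_i^2 + f_W(w_1,w_2)\right] + O_p(n^{-2}) \tag{3.3.5}$$

where $f_W(w_1,w_2)$ contains only terms in $w_i$ and $w_1w_2$ and has mean 0. The second step is achieved by a multivariate Taylor expansion together with $w_i = O_p(n^{-1/2})$. Equation (3.3.1) follows from taking the expected value of equation (3.3.5) with respect to $w_1$ and $w_2$, using $E(w_i^2) = q_i/(n_ip_i)$.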

Since the length of the Wald interval with continuity correction is

$$L_{WCC} = L_W + 2z_{\alpha/2}\left(\frac{1}{2n_1} + \frac{1}{2n_2}\right),$$

some elementary algebra results in (3.3.2).
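The length, $L_B$, of the approximate Bayes interval is

$$L_B = 2z_{\alpha/2}\left(\sum_{i=1}^{2}\frac{\tilde p_i(1-\tilde p_i)}{n_i+3}\right)^{1/2} = 2z_{\alpha/2}\left[t^{1/2} - \frac{1}{2}t^{-1/2}\sum_{i=1}^{2}\left(\frac{1}{4}t^{-1}\frac{n_i^4p_i^2(q_i-p_i)^2}{(n_i+2)^4(n_i+3)^2} + \frac{n_i^2p_i^2}{(n_i+2)^2(n_i+3)}\right)w_i^2 + f_B(w_1,w_2)\right] + O_p(n^{-2}) \tag{3.3.6}$$

where $\tilde p_i = (n_i\hat p_i + 1)/(n_i + 2)$ and $f_B(w_1,w_2)$ contains only terms in $w_i$ and $w_1w_2$ and has mean 0. The second step is again achieved by a multivariate Taylor expansion together with $w_i = O_p(n^{-1/2})$. The quantity $t$ has the expression

$$t = \frac{\bar p_1(1-\bar p_1)}{n_1+3} + \frac{\bar p_2(1-\bar p_2)}{n_2+3} = \frac{p_1q_1}{n_1} + \frac{1-7p_1q_1}{n_1^2} + \frac{p_2q_2}{n_2} + \frac{1-7p_2q_2}{n_2^2} + O(n^{-3}) = u\left(1 + \frac{1 + r^2 - 7(p_1q_1 + r^2p_2q_2)}{n(p_1q_1 + rp_2q_2)}\right) + O(n^{-3}) \tag{3.3.7}$$

where $\bar p_i = (n_ip_i + 1)/(n_i + 2)$ is the value of $\tilde p_i$ at $\hat p_i = p_i$.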

Therefore, replacing $t$ with the expression given by (3.3.7) and taking the expectation of equation (3.3.6) with respect to $w_1$ and $w_2$ gives the desired result (3.3.3).

The proof of (3.3.4) is very similar to the proof for the approximate Bayes interval and is omitted. □
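The second-order approximations of Theorem 3.3.1 are easy to evaluate numerically. The sketch below is an illustration only: the function name is hypothetical, the formulas are (3.3.1), (3.3.3) and (3.3.4) as read here with the $O(n^{-2})$ remainder dropped, both proportions are assumed to lie strictly between 0 and 1, and the interval with continuity correction is left out.

```python
from math import sqrt

Z95 = 1.959963984540054  # 0.975 quantile of the standard normal


def approx_expected_lengths(n1, p1, n2, p2, z=Z95):
    """Approximate EL_W, EL_B and EL_AC, dropping the O(n^-2) remainder."""
    q1, q2 = 1 - p1, 1 - p2
    n, r = n1, n1 / n2
    a = p1 * q1 + r * p2 * q2        # equals n * u
    b = p1 * q1 + r**2 * p2 * q2
    u = a / n
    # c * EL_W as in (3.3.1), with c = 1 / (2 z)
    c_el_w = sqrt(u) - (1 / (8 * sqrt(u))) * (
        4 * b / n**2
        + (p1 * q1 * (q1 - p1) ** 2 + r**3 * p2 * q2 * (q2 - p2) ** 2) / (n**2 * a)
    )

    def shifted(k):
        # (3.3.3) uses k = 7 (Bayes), (3.3.4) uses k = 6 (Agresti-Coull)
        return c_el_w + sqrt(u) * (1 + r**2 - k * b) / (2 * n * a)

    return {"W": 2 * z * c_el_w, "B": 2 * z * shifted(7), "AC": 2 * z * shifted(6)}


if __name__ == "__main__":
    print(approx_expected_lengths(50, 0.5, 50, 0.5))
```

For n1 = n2 = 50 and p1 = p2 = 0.5 this gives approximately 0.388, 0.376 and 0.380, which can be compared with the simulated lengths .388, .377 and .381 reported for that configuration in Table 3.1.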

A direct consequence of Theorem 3.3.1 is the following comparison of the expected lengths of the intervals.

Corollary 3.3.1. Denote $r = n_1/n_2$ and $n_1 = n$. Then, up to an error of $O(n^{-2})$, $EL_W \ge EL_B$ if and only if

$$7(p_1q_1 + r^2p_2q_2) \ge 1 + r^2$$

and $EL_{WCC} \ge EL_B$ if and only if

$$1 + r \ge \frac{1 + r^2 - 7(p_1q_1 + r^2p_2q_2)}{\sqrt{n(p_1q_1 + rp_2q_2)}}.$$

 

Remark 3.3.1. The above corollary can be applied to the Agresti-Coull interval

after replacing the 7’s by 6’s.

Remark 3.3.2. Based on the corollary, when both p1 and p2 are in (0.173, 0.827),
the Bayes interval is shorter than the Wald interval and it is the shortest among the
four intervals if the 0(n'2) error is neglected. In addition, if p,- E G — % a?
is satisfied for either i = 1 or i = 2, the Bayes interval is shorter than the Wald interval with continuity correction if the $O(n^{-2})$ error is neglected. Therefore, when one sample size is not too small and the corresponding proportion is not too close to the boundaries, the approximate Bayes interval is shorter than the Wald interval with continuity correction.
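The range (0.173, 0.827) quoted in Remark 3.3.2 is the set of p with 7p(1 - p) > 1: if both proportions satisfy this, then $7(p_1q_1 + r^2p_2q_2) > 1 + r^2$ for every r, which is the first condition of the corollary. A small sketch (function names are hypothetical) that recovers these endpoints and checks the condition:

```python
from math import sqrt


def pq_interval(k):
    """Endpoints of the set of p with k * p * (1 - p) >= 1 (requires k > 4)."""
    root = sqrt(1 - 4 / k)
    return (1 - root) / 2, (1 + root) / 2


def wald_not_shorter(p1, p2, r, k=7):
    """First condition of Corollary 3.3.1; k = 7 for Bayes, k = 6 for Agresti-Coull."""
    b = p1 * (1 - p1) + r**2 * p2 * (1 - p2)
    return k * b >= 1 + r**2


if __name__ == "__main__":
    print([round(v, 3) for v in pq_interval(7)])   # Bayes interval
    print([round(v, 3) for v in pq_interval(6)])   # Agresti-Coull interval
    print(wald_not_shorter(0.5, 0.5, 1.0), wald_not_shorter(0.05, 0.05, 1.0))
```

With k = 6 in place of 7, as Remark 3.3.1 prescribes for the Agresti-Coull interval, the corresponding range is narrower, roughly (0.211, 0.789).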

Remark 3.3.3. The Wald interval with continuity correction is often much longer than the other three intervals. Figures 3.6 and 3.7 plot the approximate expected lengths of the four intervals under different conditions when the nominal confidence level is 95%. They demonstrate that the expected lengths of the intervals other than the Wald interval with continuity correction are comparable.

Figure 3.6: Comparison of approximate expected lengths of some confidence intervals for p2 = 0.5 and n1 = n2 = 25
[Plot: expected length against p1 for the Wald, Wald with correction, Bayes and Agresti-Coull intervals.]

Figure 3.7: Comparison of approximate expected lengths of some confidence intervals for p2 = 0.1 and n1 = 10, n2 = 20
[Plot: expected length against p1 for the Wald, Wald with correction, Bayes and Agresti-Coull intervals.]

Now we may conclude that the poor coverage performance of the Wald interval is not because it is short. On the contrary, the Wald interval is often longer than the intervals with adjusted centers while having lower coverage probability. The high coverage probability of the Wald interval with continuity correction is achieved by widening the Wald interval dramatically. Hence, as replacements for the Wald interval, the intervals with adjusted centers are much preferable.

3.4 Comparison of the Wald Interval and all Proposed Alternatives

The comparison is based on a simulation with 10000 iterations for each selected (n1, p1; n2, p2). The simulation results are summarized in Table 3.1, in which WCC and PLB denote the Wald interval with continuity correction and the profile likelihood based interval respectively. Since the Wald interval with continuity correction is not as good as the intervals with adjusted centers, we will not compare it with the other intervals.

Table 3.1: Comparison of Confidence intervals at 95% level

                  Coverage Probability             Length
n1  n2  p1  p2    Wald  WCC   Bayes AC    PLB      Wald  WCC   Bayes AC    PLB
10  10  .9  .1    .870  .878  .953  .953  .857     .458  .658  .556  .579  .467
        .9  .8    .874  .989  .976  .985  .921     .567  .767  .601  .626  .638
        .8  .3    .896  .973  .951  .953  .949     .707  .907  .671  .699  .708
        .8  .7    .917  .975  .956  .968  .913     .707  .907  .671  .699  .728
        .6  .4    .919  .972  .955  .961  .941     .812  1.01  .730  .760  .812
        .6  .5    .907  .966  .955  .957  .941     .821  1.02  .736  .766  .822
        .5  .5    .910  .955  .954  .954  .945     .830  1.03  .741  .771  .831
15  15  .9  .1    .809  .955  .962  .971  .930     .396  .529  .450  .463  .406
        .9  .8    .933  .979  .960  .970  .923     .479  .613  .496  .511  .527
        .8  .3    .932  .955  .937  .956  .940     .591  .724  .568  .584  .596
        .8  .7    .931  .976  .953  .958  .933     .591  .724  .568  .584  .614
        .6  .4    .932  .973  .956  .959  .949     .676  .810  .626  .644  .676
        .6  .5    .930  .974  .933  .933  .931     .684  .817  .631  .649  .684
        .5  .5    .939  .954  .951  .954  .951     .691  .824  .636  .654  .689
20  20  .9  .1    .916  .920  .956  .956  .967     .352  .452  .387  .396  .367
        .9  .8    .943  .972  .960  .971  .931     .423  .523  .433  .445  .463
        .8  .3    .938  .965  .954  .954  .931     .517  .617  .501  .512  .523
        .8  .7    .940  .970  .949  .951  .939     .517  .617  .501  .512  .544
        .6  .4    .942  .973  .938  .956  .942     .591  .691  .556  .569  .596
        .6  .5    .931  .963  .949  .955  .949     .598  .698  .561  .574  .601
        .5  .5    .918  .958  .954  .954  .954     .604  .704  .566  .578  .606
50  50  .9  .1    .932  .969  .949  .949  .953     .231  .271  .240  .242  .248
        .9  .8    .944  .967  .953  .957  .941     .273  .313  .276  .279  .294
        .8  .3    .945  .969  .951  .951  .946     .333  .373  .329  .332  .354
        .8  .7    .940  .966  .944  .948  .943     .333  .373  .329  .332  .355
        .6  .4    .941  .957  .944  .944  .944     .380  .420  .370  .374  .394
        .6  .5    .939  .966  .940  .940  .940     .384  .424  .374  .377  .396
        .5  .5    .939  .962  .939  .939  .939     .388  .428  .377  .381  .397
20  10  .9  .1    .856  .955  .956  .960  .930     .409  .559  .479  .495  .434
        .9  .8    .873  .945  .972  .972  .906     .520  .670  .530  .549  .557
        .8  .3    .908  .966  .943  .951  .937     .632  .782  .597  .618  .631
        .8  .7    .913  .966  .949  .949  .939     .632  .782  .597  .618  .644
        .6  .4    .921  .964  .946  .951  .944     .710  .860  .649  .671  .701
        .6  .5    .917  .966  .947  .947  .945     .720  .870  .655  .677  .710
        .5  .5    .926  .969  .944  .947  .944     .725  .875  .659  .682  .714
10  20  .9  .1    .856  .958  .953  .957  .932     .410  .560  .479  .495  .434
        .9  .8    .945  .983  .970  .984  .927     .475  .626  .516  .533  .540
        .8  .3    .911  .963  .946  .955  .924     .604  .754  .587  .607  .609
        .8  .7    .921  .966  .954  .963  .923     .604  .754  .587  .607  .629
        .6  .4    .921  .962  .945  .950  .941     .710  .860  .645  .671  .701
        .6  .5    .919  .962  .944  .948  .939     .715  .865  .653  .675  .706
        .5  .5    .925  .967  .941  .945  .941     .725  .875  .659  .681  .714

Figure 3.8: Comparison of coverage probabilities for p1 = 0.21 to 0.99, p2 = p1 - 0.2, n1 = n2 = 20 at 95% nominal level
[Plot: coverage probability against p2 for the Wald, Wald with correction, Bayes, Agresti-Coull and profile likelihood intervals.]

From Table 3.1, we can see that the profile likelihood based interval does improve upon the Wald interval in coverage probability in a frequentist sense. As listed in the table, its coverage probabilities are (much) higher than those of the Wald interval except at a few points (n1, p1; n2, p2). Hence, the coverage of this interval is more reliable than that of the Wald interval. This suggests that the profile likelihood interval might be a good alternative to the Wald interval.

However, for small or moderate balanced sample sizes, the coverage behavior of the profile likelihood based interval is questionable. Figure 3.8 plots the coverage probabilities of the five 95% nominal intervals at n1 = n2 = 20 and p1 = 0.21 to 0.99 with step size 0.01 and p2 = p1 - 0.2. In this specific case, though the lengths of the profile likelihood based interval are always greater than those of the Wald interval, its coverage behavior is even worse than that of the Wald interval.

In general, one disadvantage of the profile likelihood based interval is that, for balanced sample sizes, its lengths are greater than those of the Wald interval.

There is an "outlier" among the coverage probabilities of the profile likelihood based interval in the table. When n1 = n2 = 10 and p1 = 0.9, p2 = 0.1, the coverage probability of the profile likelihood based interval is only 0.857, while all the other coverage probabilities of this interval listed in the table are greater than 0.90. This is due to the discrete nature of the binomial distribution. Some quadruples (n1, p1, n2, p2) are lucky and some are unlucky. The quadruple (n1, p1, n2, p2) = (10, 0.9, 10, 0.1) is an unlucky one for the profile likelihood interval.

According to Table 3.1, the profile likelihood interval does not behave better than the intervals with adjusted centers. The coverage probabilities of the Bayes interval and the Agresti-Coull interval are seldom less than those of the profile likelihood interval and deviate less from the nominal level. For small or moderate sample sizes, when p1 and p2 are close, the coverage probabilities of the intervals with adjusted centers tend to be greater than or equal to those of the profile likelihood interval. This property makes the intervals with adjusted centers more attractive, because it is more common that the difference of the two proportions of interest is small or not very large.

In addition, except when p1 and p2 are close to the boundaries, the intervals with adjusted centers are always shorter than the corresponding profile likelihood based intervals. The other disadvantage of the profile likelihood based interval is that it does not have an explicit form.

Based on our evaluation, all the candidate intervals improve the coverage prob-
abilities greatly upon the Wald interval in a frequentist sense. The phenomenon
of over nominal coverage probability occurs quite often to the Wald interval with
continuity correction, which tends to have the largest expected length. The proﬁle
likelihood based interval has a better coverage performance but greater expected
length than the Wald interval when the binomial proportions are not close to the
boundaries, and the computation of this interval is complex. Moreover, our extensive simulation shows that the intervals with adjusted centers perform better than the other intervals.

With respect to the five confidence interval methods discussed for constructing approximate 100(1 - α)% two-sided intervals, we recommend the intervals with adjusted centers as substitutes for the Wald interval. Because of their stable coverage behavior, they have relatively reliable coverage performance even when n1, n2 are very small and p1, p2 are close to the boundaries. Their simple expressions make the computation easier. Moreover, their lengths are not longer than those of the other intervals in a frequentist sense. In particular, when the proportions are not very close to the boundaries, the lengths of the intervals with adjusted centers tend to be smaller than the others. As for the choice between the Bayes interval and the Agresti-Coull interval, it depends on one's preference. The former is shorter and a little less conservative.

Chapter 4

Interval Estimation for the Difference of Two Binomial Proportions in Adaptive Designs

4.1 Introduction

In clinical trials and in industrial work, adaptive designs, which use accumulating information to assign subjects to different treatments, are often highly desirable. Adaptive designs are applied with two possible aims. The first is to draw reliable statistical inferences for the benefit of future subjects, which can be thought of as a utilitarian goal. The second is to assign each subject to the treatment with better performance, which is the individualistic goal.

In this chapter, the approaches for constructing confidence intervals for non-adaptive designs will be applied to adaptive designs. A sequential adaptive model is considered in which two treatments are compared and the responses are binary: success or failure.

In section 4.2, notation and some existing adaptive designs will be introduced. The validity of extending non-adaptive methods to adaptive designs will be checked in section 4.3. As will be explained in more detail, adaptive designs are classified into two categories: allocation adaptive designs and response adaptive designs. In section 4.4, the connection between the coverage performance and expected lengths of a confidence interval derived from a non-adaptive design and its counterpart from an allocation adaptive design is stated and proved. In section 4.5, simulation results are given for response adaptive designs.

4.2 Notation and Some Adaptive Designs

The two populations to be compared are referred to as Population A and Population B, and $\{X_k : k \ge 1\}$ and $\{Y_k : k \ge 1\}$ denote the potential independent observations from populations A and B respectively. For each $k \ge 1$, exactly one of $(X_k, Y_k)$ is actually observed. It is assumed that $(X_1,Y_1), (X_2,Y_2), \ldots$ are i.i.d., where $X_1 \sim \text{Bernoulli}(p_A)$ and $Y_1 \sim \text{Bernoulli}(p_B)$. The total sample size $n$ is the total number of observations from populations A and B. For each $k \ge 1$, define $\delta_k$ to be 1 or 0 according to whether the $k$th subject is assigned to population A or B. The symbols $N_A(k)$ and $N_B(k)$ indicate the numbers of the first $k$ observations that are allocated to populations A and B through stage $k$. Then

$$N_A(k) = \sum_{i=1}^{k}\delta_i$$

and

$$N_B(k) = \sum_{i=1}^{k}(1-\delta_i) = k - N_A(k).$$

Further, define $S_A(k)$ and $S_B(k)$ to be the numbers of successes from populations A and B through stage $k$. Then

$$S_A(k) = \sum_{i=1}^{k}\delta_iX_i$$

and

$$S_B(k) = \sum_{i=1}^{k}(1-\delta_i)Y_i.$$

As stated in Geraldes (1999), most adaptive designs fit into one of two general categories: allocation adaptive designs and response adaptive designs. The former encompasses those approaches for which the allocation of each subject does not depend on the responses of previous subjects but only on the subject's covariate levels (when covariate information is taken into consideration) and the allocations and covariate levels of the previous subjects. The second category includes those approaches for which the allocation of each subject depends also on the responses of the previous subjects. Hence, the main difference between allocation adaptive designs and response adaptive designs is that $(X_1,Y_1), (X_2,Y_2), \ldots$ are independent of the $\delta$'s in allocation adaptive designs, whereas they are dependent in response adaptive designs.

There are a lot of adaptive designs in the literature, for example, the doubly
adaptive biased coin design proposed by Eisele (1994), the play-the-winner design
proposed by Smythe and Rosenberger (1995). Woodroofe (1982) considers the prob-
lem of sequentially allocating patients to treatments when covariate information is

present. We will introduce some adaptive designs in the next two subsections.

4.2.1 Some Allocation-Adaptive Designs

The Biased Coin Design, proposed by Efron (1971), allocates the next subject to one of the two populations, A or B, according to the following rule. Let $D_k$ denote the difference between $N_A(k)/k$ and $N_B(k)/k$, and let $p_0$ be a constant in $[0.5, 1]$. Then

$$P(\delta_{k+1} = 1) = \begin{cases} 1 - p_0, & \text{if } D_k > 0; \\ 1/2, & \text{if } D_k = 0; \\ p_0, & \text{if } D_k < 0. \end{cases}$$

This allocation policy tends to balance the numbers of observations from the two populations.

The Adaptive Biased Coin Design, proposed by Wei (1978), allocates subjects to A or B according to the following rule. Let $D_k$ denote the difference between $N_A(k)/k$ and $N_B(k)/k$, and let $h : [-1,1] \to [0,1]$ be a non-increasing function such that $h(x) = 1 - h(-x)$ for any $x \in [-1,1]$. Then $P(\delta_{k+1} = 1) = h(D_k)$. This allocation policy may force an extremely imbalanced experiment to be balanced very quickly.
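Both allocation rules can be simulated with one loop that feeds the current imbalance into a rule returning the probability of assigning the next subject to A. This is a minimal sketch with hypothetical function names; the imbalance before the first assignment is taken to be 0, and the linear choice h(x) = (1 - x)/2 is only one example of a function satisfying Wei's conditions, not necessarily the one intended in the text.

```python
import random


def efron_rule(p0):
    """Efron's biased coin: favor the population that is currently behind."""
    def prob_a(d):
        if d > 0:
            return 1 - p0
        if d < 0:
            return p0
        return 0.5
    return prob_a


def wei_linear_rule(d):
    """One admissible h for Wei's design: non-increasing with h(x) = 1 - h(-x)."""
    return (1 - d) / 2


def allocate(k, prob_a, rng):
    """Return the allocation indicators delta_1, ..., delta_k (1 = A, 0 = B)."""
    deltas = []
    n_a = 0
    for i in range(k):
        # D_i = N_A(i)/i - N_B(i)/i, set to 0 before any subject is assigned
        d = 0.0 if i == 0 else (2 * n_a - i) / i
        delta = 1 if rng.random() < prob_a(d) else 0
        deltas.append(delta)
        n_a += delta
    return deltas


if __name__ == "__main__":
    rng = random.Random(2002)
    print(sum(allocate(100, efron_rule(2 / 3), rng)))
    print(sum(allocate(100, wei_linear_rule, rng)))
```

With p0 = 1 the biased coin becomes deterministic whenever the design is unbalanced, so after an even number of subjects the two populations have exactly the same size.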


4.2.2 Some Response-Adaptive Designs

The Randomized Play-the-Winner Rule, proposed by Wei and Durham (1978), tends to allocate more subjects to the population with the higher success proportion. This rule can be described with an urn model. An urn has balls of two different types, marked A or B. We start with $\alpha$ balls of each type. When a subject enters the study, a ball is drawn at random and replaced. If it is of type A, the subject is assigned to A; otherwise the subject is assigned to B. If the observation of the subject is a success, then $\beta$ balls of the same type are added to the urn. Otherwise, $\beta$ balls of the other type are added to the urn. This rule is denoted by RPW($\alpha$, $\beta$).
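The urn description translates directly into a short simulation. The sketch below is illustrative only: the function name and return format are hypothetical, and responses are assumed to be observed immediately, before the next subject arrives.

```python
import random


def rpw(n, p_a, p_b, alpha=1, beta=1, rng=None):
    """Simulate RPW(alpha, beta) for n subjects; return (N_A, S_A, N_B, S_B)."""
    rng = rng or random.Random()
    balls = {"A": alpha, "B": alpha}
    counts = {"A": [0, 0], "B": [0, 0]}  # [subjects, successes] per population
    for _ in range(n):
        # draw a ball at random (with replacement) to choose the population
        draw = rng.random() * (balls["A"] + balls["B"])
        arm = "A" if draw < balls["A"] else "B"
        other = "B" if arm == "A" else "A"
        success = rng.random() < (p_a if arm == "A" else p_b)
        counts[arm][0] += 1
        counts[arm][1] += int(success)
        # success: add beta balls of the same type; failure: of the other type
        balls[arm if success else other] += beta
    return counts["A"][0], counts["A"][1], counts["B"][0], counts["B"][1]


if __name__ == "__main__":
    print(rpw(50, 0.7, 0.4, rng=random.Random(1978)))
```

When A always succeeds and B always fails, every response adds a ball of type A, so the allocation drifts quickly toward A; this is the mechanism by which the rule favors the better treatment.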

The Randomized Adaptive Design, proposed by Melfi and Page (1995), tends to allocate subjects to the two populations according to an optimal proportion. Suppose $\{U_k : k \ge 1\}$ is a sequence of i.i.d. random variables whose common distribution is $U(0,1)$ and which are independent of both $\{X_k : k \ge 1\}$ and $\{Y_k : k \ge 1\}$. To minimize the variance of the estimator $\hat p_A(k) - \hat p_B(k)$, the desired proportion is

$$\pi(p_A, p_B) = \frac{\sqrt{p_A(1-p_A)}}{\sqrt{p_A(1-p_A)} + \sqrt{p_B(1-p_B)}}.$$

Let $\hat\pi_k = \pi(\hat p_A(k), \hat p_B(k))$, where $\hat p_A(k)$ and $\hat p_B(k)$ are two estimators of the success probabilities $p_A$ and $p_B$. Then

$$\delta_{k+1} = I\{U_{k+1} < \hat\pi_k\}.$$
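As an illustration of this design, the sketch below assumes that the target proportion is the variance-minimizing (Neyman) allocation, proportional to the standard deviation sqrt(p(1 - p)) of each population, and plugs in the estimates (S + 1)/(N + 2) so that the proportion is defined from the first subject onward. Both choices, and all function names, are assumptions made for this example rather than details confirmed by the text.

```python
import random
from math import sqrt


def target_proportion(p_a, p_b):
    """Share of subjects for A that minimizes Var(p_A_hat - p_B_hat)."""
    s_a, s_b = sqrt(p_a * (1 - p_a)), sqrt(p_b * (1 - p_b))
    return s_a / (s_a + s_b)


def randomized_adaptive(n, p_a, p_b, rng):
    """Allocate n subjects sequentially; return (N_A, S_A, N_B, S_B)."""
    n_a = s_a = n_b = s_b = 0
    for _ in range(n):
        # plug-in estimates kept strictly inside (0, 1) (assumed choice)
        est_a = (s_a + 1) / (n_a + 2)
        est_b = (s_b + 1) / (n_b + 2)
        if rng.random() < target_proportion(est_a, est_b):  # U_{k+1} < pi_hat_k
            n_a += 1
            s_a += int(rng.random() < p_a)
        else:
            n_b += 1
            s_b += int(rng.random() < p_b)
    return n_a, s_a, n_b, s_b


if __name__ == "__main__":
    print(target_proportion(0.5, 0.1))
    print(randomized_adaptive(100, 0.5, 0.1, random.Random(1995)))
```

For pA = 0.5 and pB = 0.1 the target proportion is 0.5/(0.5 + 0.3) = 0.625, so the more variable population A receives the larger share of subjects.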


4.3 The Conﬁdence Intervals in Adaptive Designs

Because of the adaptive nature of the design, the distribution of $S_A(k)$ may no longer be binomial $\text{Bin}(N_A(k), p_A)$, and $S_A(k)$ and $S_B(k)$ are no longer independent. Therefore, the validity of constructing confidence intervals using the non-adaptive formulas needs to be verified.

The maximum likelihood estimators, at stage $k$, of the success probabilities $p_A$ and $p_B$ are

$$\hat p_A(k) = \frac{S_A(k)}{N_A(k)}$$

and

$$\hat p_B(k) = \frac{S_B(k)}{N_B(k)}.$$

Some statisticians have studied asymptotic properties in some adaptive design settings, such as Eisele and Woodroofe (1995), Bai et al. (2002), Rosenberger (1993) and Rosenberger et al. (1997).

Melfi et al. (2001) prove some theorems and apply them to show that

$$\left(\frac{N_A(k)^{1/2}(\hat p_A(k) - p_A)}{(p_Aq_A)^{1/2}}, \frac{N_B(k)^{1/2}(\hat p_B(k) - p_B)}{(p_Bq_B)^{1/2}}\right) \Rightarrow (Z_1, Z_2) \tag{4.3.1}$$

under a wide range of adaptive design rules, where $Z_1$ and $Z_2$ are independent standard normal random variables. Wei et al. (1990) proved the same result under the randomized play-the-winner rule using martingale techniques. Therefore, the adaptive versions of the Wald confidence interval and the Wald interval with continuity correction up to stage $k$ with nominal level $100(1-\alpha)\%$ are

$$CI_W(k) = \hat p_A(k) - \hat p_B(k) \pm z_{\alpha/2}\sqrt{\frac{\hat p_A(k)\hat q_A(k)}{N_A(k)} + \frac{\hat p_B(k)\hat q_B(k)}{N_B(k)}} \tag{4.3.2}$$

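and

$$CI_{WCC}(k) = \hat p_A(k) - \hat p_B(k) \pm \left(z_{\alpha/2}\sqrt{\frac{\hat p_A(k)\hat q_A(k)}{N_A(k)} + \frac{\hat p_B(k)\hat q_B(k)}{N_B(k)}} + z_{\alpha/2}\left(\frac{1}{2N_A(k)} + \frac{1}{2N_B(k)}\right)\right). \tag{4.3.3}$$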
When $\Delta = \Delta_0$ for $\Delta = p_A - p_B$, it follows from (4.3.1) and the arguments in Cox and Hinkley (1974, pages 322-323) that the variable $2\{l(\hat\Delta(k), \hat p_B(k)) - l(\Delta_0)\}$ is asymptotically chi-squared distributed with one degree of freedom, where $\hat\Delta(k) = \hat p_A(k) - \hat p_B(k)$. Thus an approximate $100(1-\alpha)\%$ profile likelihood based confidence interval for $p_A - p_B$ in adaptive designs is

$$CI_{PLB}(k) = \{\Delta \in (-1,1) : 2(l(\hat\Delta(k), \hat p_B(k)) - l(\Delta)) \le \chi^2_1(\alpha)\}. \tag{4.3.4}$$

To derive the confidence intervals with adjusted centers for adaptive designs, we define two estimators for $p_A$ and $p_B$:

$$\tilde p_A(k) = \frac{S_A(k) + 1}{N_A(k) + 2}$$

and

$$\tilde p_B(k) = \frac{S_B(k) + 1}{N_B(k) + 2}.$$
Theorem 4.3.1. In the above adaptive setting, if $N_A(k)/a_k \to 1$ and $N_B(k)/b_k \to 1$ in probability as $k \to \infty$, where $\{a_k\}$ and $\{b_k\}$ are positive constants tending to infinity, then

$$\left(\frac{(N_A(k) + c)^{1/2}(\tilde p_A(k) - p_A)}{\sqrt{p_Aq_A}}, \frac{(N_B(k) + c)^{1/2}(\tilde p_B(k) - p_B)}{\sqrt{p_Bq_B}}\right) \Rightarrow (Z_1, Z_2), \tag{4.3.5}$$

where $c$ is a constant and $Z_1$, $Z_2$ are independent standard normal random variables.

Proof. The desired conclusion is a direct result of Corollary 3.1 in Melfi et al. (2001). □

This theorem establishes the validity of the confidence intervals with adjusted centers for adaptive designs. Hence, the nominal level $100(1-\alpha)\%$ Bayes and Agresti-Coull confidence intervals for adaptive designs are

$$CI_B(k) = \tilde p_A(k) - \tilde p_B(k) \pm z_{\alpha/2}\sqrt{\frac{\tilde p_A(k)\tilde q_A(k)}{N_A(k)+3} + \frac{\tilde p_B(k)\tilde q_B(k)}{N_B(k)+3}} \tag{4.3.6}$$

and

$$CI_{AC}(k) = \tilde p_A(k) - \tilde p_B(k) \pm z_{\alpha/2}\sqrt{\frac{\tilde p_A(k)\tilde q_A(k)}{N_A(k)+2} + \frac{\tilde p_B(k)\tilde q_B(k)}{N_B(k)+2}}. \tag{4.3.7}$$

We will consider the performance of the above confidence intervals in the next section.

4.4 Comparison of Confidence Intervals in Allocation Adaptive Designs

For convenience, we define some notation: $CI^A_*(k)$ is the confidence interval derived by method $*$ and based on a certain adaptive design up to stage $k$, where $*$ may be W, WCC, B, AC or PLB, indicating the Wald interval, the Wald interval with continuity correction, the approximate Bayes interval, the Agresti-Coull interval and the profile likelihood based interval respectively. If necessary, we may replace $A$ by a specific adaptive design. $CI_*(i,j)$ is the counterpart of $CI^A_*(k)$ from a non-adaptive design with $i$ observations from population A and $j$ observations from population B, where $i + j = k$. Similarly, $EL^A_*(k)$ denotes the expected length of the confidence interval derived by method $*$ and based on an adaptive design, and $EL_*(i,j)$ represents the counterpart of $EL^A_*(k)$ with $i$ observations from population A and $j$ observations from population B.

The following theorem explores the connection between the coverage probabilities of confidence intervals based on allocation adaptive designs and those based on non-adaptive designs.

Theorem 4.4.1. In allocation adaptive designs,

$$P(p_A - p_B \in CI^A_*(k)) = \sum_{j=0}^{k} P(p_A - p_B \in CI_*(j, k-j))\,P(N_A(k) = j)$$

where $*$ may be any confidence interval whose non-adaptive version $CI_*(j, k-j)$ only involves the sufficient statistics $S_A$ and $S_B$.

The proof of this theorem is based on the next lemma.

Lemma 4.4.1. In allocation adaptive designs, suppose $a$ and $b$ are any non-negative integers satisfying $a \le j$ and $b \le k - j$. Then

1. $P(S_A(k) = a \mid N_A(k) = j) = \binom{j}{a}p_A^a(1-p_A)^{j-a}$;

2. $P(S_B(k) = b \mid N_A(k) = j) = \binom{k-j}{b}p_B^b(1-p_B)^{k-j-b}$; and

3. $P(S_A(k) = a, S_B(k) = b \mid N_A(k) = j) = P(S_A(k) = a \mid N_A(k) = j)\,P(S_B(k) = b \mid N_A(k) = j)$.
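Proof. Let $\vec\delta = (\delta_1, \ldots, \delta_k)$.

For any $j \in \{0, 1, \ldots, k\}$,

$$\{N_A(k) = j\} = \bigcup_{t=1}^{C_j}\{\vec\delta = \vec\delta_t\}$$

where each $\vec\delta_t$ is a $k$-dimensional vector that has $j$ elements with value 1 and the other $k - j$ elements with value 0. There are $C_j = \binom{k}{j}$ different such vectors. We put them in order.

Note that

$$\begin{aligned} P(S_A(k) = a \mid N_A(k) = j) &= \frac{1}{P(N_A(k) = j)}\sum_{t=1}^{C_j} P(S_A(k) = a, \vec\delta = \vec\delta_t) \\ &= \frac{1}{P(N_A(k) = j)}\sum_{t=1}^{C_j} P\Big(\sum_{i:\,\delta_{t,i} = 1} X_i = a, \vec\delta = \vec\delta_t\Big) \\ &= \frac{1}{P(N_A(k) = j)}\sum_{t=1}^{C_j} P\Big(\sum_{i:\,\delta_{t,i} = 1} X_i = a\Big)P(\vec\delta = \vec\delta_t) \\ &= \binom{j}{a}p_A^a(1-p_A)^{j-a}. \end{aligned} \tag{4.4.2}$$

The third step is valid because $\{\delta_l, l \ge 1\}$ is independent of the i.i.d. sequence $\{X_i, i = 1, 2, \ldots\}$ in allocation adaptive designs. Hence, the $\delta$'s only indicate when to take observations from A and B. We have proved that $S_A(k)$ has a conditionally binomial distribution.

Similarly, we can prove the conclusion related to $P(S_B(k) = b \mid N_A(k) = j)$.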

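Next we prove that, given $N_A(k) = j$, $S_A(k)$ and $S_B(k)$ are independent.

The conditional joint distribution of $S_A(k)$ and $S_B(k)$ is

$$P(S_A(k) = a, S_B(k) = b \mid N_A(k) = j) = P\Big(\sum_{i=1}^{j} X_i = a, \sum_{i=1}^{k-j} Y_i = b \,\Big|\, \sum_{i=1}^{k}\delta_i = j\Big). \tag{4.4.3}$$

Therefore, by the independence of the responses and the allocations, equation (4.4.3) can be rewritten in the following way:

$$\begin{aligned} P(S_A(k) = a, S_B(k) = b \mid N_A(k) = j) &= P\Big(\sum_{i=1}^{j} X_i = a \,\Big|\, N_A(k) = j\Big)\,P\Big(\sum_{i=1}^{k-j} Y_i = b \,\Big|\, N_A(k) = j\Big) \\ &= P(S_A(k) = a \mid N_A(k) = j)\,P(S_B(k) = b \mid N_A(k) = j). \end{aligned} \tag{4.4.4}$$

Hence, the lemma holds. □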

Proof (of Theorem 4.4.1).

For any confidence interval,

$$\begin{aligned} P(p_A - p_B \in CI^A_*(k)) &= \sum_{j=0}^{k} P(p_A - p_B \in CI^A_*(k), N_A(k) = j) \\ &= \sum_{j=0}^{k} P(p_A - p_B \in CI^A_*(k) \mid N_A(k) = j)\,P(N_A(k) = j). \end{aligned} \tag{4.4.5}$$

If the non-adaptive version of the confidence interval only involves the sufficient statistics $S_A$ and $S_B$, the desired conclusion is achieved by applying Lemma 4.4.1 to (4.4.5). □

Remark 4.4.1. The condition that $N_A(k)/a_k \to 1$ and $N_B(k)/b_k \to 1$ in probability as $k \to \infty$ is not needed for this proof, but it does guarantee the asymptotic normality needed in constructing those confidence intervals in general adaptive designs.

There is a similar theorem concerning the connection between the expected lengths of confidence intervals in non-adaptive designs and in allocation adaptive designs.

Theorem 4.4.2. In allocation adaptive designs,

$$EL^A_*(k) = \sum_{j=0}^{k} EL_*(j, k-j)\,P(N_A(k) = j),$$

where $*$ may be any confidence interval whose non-adaptive version $CI_*(j, k-j)$ only involves the sufficient statistics $S_A$ and $S_B$.

Proof. This proof is similar to the one of Theorem 4.4.1. □

Remark 4.4.2. The ﬁve conﬁdence intervals considered in the dissertation satisfy

the requirements of the two theorems.

Remark 4.4.3. The two theorems imply that for allocation adaptive designs, a

conﬁdence interval should behave well if it behaves well in non-adaptive designs.
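The mixture representation of Theorem 4.4.1 can be evaluated exactly for a concrete allocation rule. The sketch below is an illustration under assumptions: it uses Efron's biased coin design as the allocation adaptive design, computes the distribution of $N_A(k)$ by recursion over the stages, and uses the adjusted-center interval with (x + 1)/(n + 2) and n + 2 in the variance, which remains defined even when one population receives no observations. All function names are hypothetical.

```python
from math import comb, sqrt

Z95 = 1.959963984540054  # 0.975 quantile of the standard normal


def ac_covers(x1, n1, x2, n2, delta, z=Z95):
    """True if the adjusted-center interval strictly contains delta."""
    p1, p2 = (x1 + 1) / (n1 + 2), (x2 + 1) / (n2 + 2)
    half = z * sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    return abs(p1 - p2 - delta) < half


def fixed_design_coverage(n1, p1, n2, p2):
    """Exact coverage in a non-adaptive design with sample sizes (n1, n2)."""
    delta = p1 - p2
    total = 0.0
    for x1 in range(n1 + 1):
        w1 = comb(n1, x1) * p1**x1 * (1 - p1) ** (n1 - x1)
        for x2 in range(n2 + 1):
            if ac_covers(x1, n1, x2, n2, delta):
                total += w1 * comb(n2, x2) * p2**x2 * (1 - p2) ** (n2 - x2)
    return total


def bcd_allocation_pmf(k, p0):
    """P(N_A(k) = j) for j = 0..k under Efron's biased coin design."""
    pmf = [1.0]  # before any assignment, N_A = 0 with probability 1
    for i in range(k):
        new = [0.0] * (i + 2)
        for j, pr in enumerate(pmf):
            d = j - (i - j)  # same sign as N_A(i)/i - N_B(i)/i
            prob_a = 0.5 if d == 0 else (1 - p0 if d > 0 else p0)
            new[j + 1] += pr * prob_a
            new[j] += pr * (1 - prob_a)
        pmf = new
    return pmf


def adaptive_coverage(k, p_a, p_b, p0):
    """Coverage under the adaptive design as a mixture over N_A(k)."""
    pmf = bcd_allocation_pmf(k, p0)
    return sum(w * fixed_design_coverage(j, p_a, k - j, p_b)
               for j, w in enumerate(pmf) if w > 0)


if __name__ == "__main__":
    print(adaptive_coverage(20, 0.6, 0.4, 2 / 3))
    print(fixed_design_coverage(10, 0.6, 10, 0.4))
```

With p0 = 1 and an even k the design is forced to be perfectly balanced, so the mixture collapses to the single term with j = k/2 and the adaptive coverage equals the coverage of the balanced non-adaptive design.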


4.5 Comparison of Confidence Intervals in Response Adaptive Designs

For response adaptive designs, we do not have results as simple as those for allocation adaptive designs. The main reason is that Lemma 4.4.1 does not hold in response adaptive designs, because the $\delta$'s are not independent of the responses X's and Y's.

However, for response adaptive designs, we still have the same conclusion
via simulation: if a conﬁdence interval behaves well in non-adaptive designs, one
may expect this conﬁdence interval to behave well in response adaptive designs.
We obtain this conclusion through extensive simulation studies on some response
adaptive designs. All the results shown use a simulation with 10000 iterations for
each realized (n,pA,pB).

We concentrate on RPW(1,1), the randomized play-the-winner design with α = 1 and β = 1, in this dissertation. Similar conclusions hold for some other response adaptive designs such as the randomized adaptive designs.

As we did in non-adaptive designs, to explore the average performance of the five confidence intervals, we randomly sampled 10,000 values of (n, pA, pB), taking pA and pB independently from U(0, 1) and taking n from the uniform distribution over {10, 11, ..., 100}. We then applied the RPW(1,1) rule to the sampled n to obtain the sample sizes for the two treatments.

Figure 4.1 shows the average coverage performance of the five intervals, with the means and medians of the coverage probabilities listed. Similar to the results in non-adaptive designs, the Wald interval behaves poorly: its coverage probability is unstable and very low, with median 0.914 and mean 0.875 at the 95% nominal level. It also occasionally has very low coverage probabilities. Though the coverage probabilities of the Wald interval with continuity correction are higher than those of the Wald interval, it inherits some disadvantages of the Wald interval too: occasional very low coverage probabilities and unstable performance. The average coverage behaviors of the Bayes interval, the Agresti-Coull interval and the profile likelihood based interval are very similar: their means and medians are close to the 95% nominal level. The profile likelihood interval is not as stable as the intervals with adjusted centers.

Figure 4.1: Coverage probability boxplots of some 95% nominal intervals under RPW(1,1)
[Boxplots of coverage probability. Wald: median = .914, mean = .875; Wald_CC: median = .962, mean = .948; Bayes: median = .950, mean = .952; Agresti-Coull: median = .955, mean = .958; PLB: median = .949, mean = .950.]

When comparing Figures 3.1 and 4.1, or simply comparing the corresponding mean and median coverage probabilities, we notice one interesting point. The average coverage performance of the Wald interval and the Wald interval with continuity correction in RPW(1,1) is much worse than it is in non-adaptive designs. However, this is not true for the intervals with adjusted centers, which makes the intervals with adjusted centers desirable with RPW(1,1) and some other adaptive designs because of their stable performance.

Figure 4.2: Expected length boxplots of some 95% nominal intervals under RPW(1,1)
[Boxplots of mean length. Wald: median = .475, mean = .499; Wald_CC: median = .587, mean = .646; Bayes: median = .488, mean = .517; Agresti-Coull: median = .499, mean = .539; PLB: median = .504, mean = .537.]

We also plot the mean length boxplots of the ﬁve intervals with RPW(1,1) in
Figure 4.2. Since the Wald interval with continuity correction is too wide compared

to other intervals, we discard it in the following comparisons. And because the two

intervals with adjusted centers are very similar, we will only consider the Agresti-

Coull interval henceforth.

Figure 4.3: Coverage probabilities of three 95% nominal intervals for n = 20 and pA = 0.5 under RPW(1,1)
[Plot: coverage probability against pB for the Wald, Agresti-Coull and profile likelihood intervals.]

Figure 4.4: Expected lengths of three 95% nominal intervals for n = 20 and pA = 0.5 under RPW(1,1)
[Plot: mean length against pB for the Wald, Agresti-Coull and profile likelihood intervals.]

Figure 4.5: Coverage probabilities of three 95% nominal intervals for n = 20 and pA = 0.9 under RPW(1,1)
[Plot: coverage probability against pB for the Wald, Agresti-Coull and profile likelihood intervals.]

Figure 4.3 plots the coverage probabilities of the Wald interval, the
Agresti-Coull interval and the profile likelihood based interval at the 95%
nominal level for n = 20, $p_A = 0.5$ and $p_B$ from 0.05 through 0.95 in steps of
0.05 under RPW(1,1). Figure 4.4 plots the corresponding mean lengths of the three
intervals. The Agresti-Coull interval has both satisfactory coverage and
satisfactory expected length in this setup. Its coverage probabilities are almost
all just above the nominal level, and it has the shortest length unless $p_B$ is
close to either boundary, i.e., as long as the difference $p_A - p_B$ is not very
large.

Although the values taken by $p_B$ are symmetric around $p_A = 0.5$, Figure 4.3
and Figure 4.4 do not exhibit any symmetry. This is due to the adaptive nature of
the RPW(1,1) design and to $p_A/p_B$ not being symmetric around $p_A = 0.5$.
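The asymmetry can be illustrated with a small simulation. The sketch below assumes the usual urn description of the randomized play-the-winner rule (Wei and Durham, 1978): the urn starts with one ball per treatment, a success adds a ball of the same type, and a failure adds a ball of the other type. This section does not restate the design, so that rule, and the function names `simulate_rpw` and `mean_allocation_to_a`, are illustrative assumptions rather than the author's code.

```python
import random


def simulate_rpw(p_a, p_b, n, rng):
    """Simulate one trial of n patients under the assumed RPW(1,1) urn rule.

    Returns (n_a, s_a, n_b, s_b): patients and successes on each treatment.
    """
    balls = {"A": 1, "B": 1}          # assumed start: one ball per treatment
    prob = {"A": p_a, "B": p_b}
    count = {"A": 0, "B": 0}
    succ = {"A": 0, "B": 0}
    for _ in range(n):
        # Draw a ball with probability proportional to the urn composition.
        share_a = balls["A"] / (balls["A"] + balls["B"])
        arm = "A" if rng.random() < share_a else "B"
        other = "B" if arm == "A" else "A"
        count[arm] += 1
        if rng.random() < prob[arm]:
            succ[arm] += 1
            balls[arm] += 1           # success: add a ball of the same type
        else:
            balls[other] += 1         # failure: add a ball of the other type
    return count["A"], succ["A"], count["B"], succ["B"]


def mean_allocation_to_a(p_a, p_b, n=20, reps=20000, seed=0):
    """Monte Carlo estimate of the expected proportion of patients on A."""
    rng = random.Random(seed)
    total = sum(simulate_rpw(p_a, p_b, n, rng)[0] for _ in range(reps))
    return total / (reps * n)


if __name__ == "__main__":
    # Two values of p_B placed symmetrically around p_A = 0.5.
    for p_b in (0.2, 0.8):
        print("p_B =", p_b, "mean share on A:",
              round(mean_allocation_to_a(0.5, p_b), 3))
```

Under this rule the allocation depends on the failure probabilities of both treatments, so the two printed proportions need not be mirror images of each other around 0.5. That is one way to see why curves plotted against $p_B$ need not be symmetric.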


Figure 4.6: Expected lengths of three 95% nominal intervals for n = 20 and
$p_A = 0.9$ under RPW(1,1)

[Plot: mean length against $p_B$ for the Wald, Agresti-Coull and profile
likelihood intervals]

In contrast to the setup of Figure 4.3 and Figure 4.4, we let $p_A = 0.9$ in
Figure 4.5 and Figure 4.6. When $p_A$ is far from $p_B$, the coverage probability
of the profile likelihood interval is rather high. It drops as $p_A$ and $p_B$ get
closer. Unlike that of the profile likelihood interval, the coverage probability
of the Agresti-Coull interval is much less sensitive to the relative positions of
$p_A$ and $p_B$, and it remains above the nominal level. Although the expected
length of the Agresti-Coull interval is much greater than that of the profile
likelihood based interval when $p_A - p_B$ is large, it is close to the latter
when $p_A - p_B$ is not very large. This is verified by our extensive simulation.
Moreover, as the total sample size n increases, the disadvantage in expected
length of the Agresti-Coull interval for large $p_A - p_B$ diminishes. For
example, when n = 100 and $p_A$ and $p_B$ are kept the same as in Figure 4.6, the
mean lengths of the three intervals are comparable. The expected length of the
Agresti-Coull interval is less than that of the profile likelihood based interval
most of the time, and it is the smallest of the three when $p_B$ is not very
close to either boundary.

In Figure 4.6, one may notice that the expected length of the Wald interval is
much smaller than those of the other two intervals, especially when $p_B$ is
close to 0 or 1. This is due to the frequent occurrence of an empty Wald interval
when the sample size is small and the success proportions are close to the
boundaries. It is also the reason for the low coverage probability of the Wald
interval in Figure 4.5. The much lower expected length of the Wald interval is
less pronounced or absent for moderate (say, n = 50) or large (say, n = 100)
sample sizes.
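A small numerical illustration may help here. The sketch below assumes the textbook Wald interval for a difference of proportions, and uses the Agresti and Caffo (2000) adjustment (one added success and one added failure per treatment) as a stand-in for the dissertation's Agresti-Coull interval, whose exact definition is not restated in this section. It also assumes that "empty" refers to the case where the interval collapses to a single point. The function names and the example counts are hypothetical.

```python
import math

Z95 = 1.959963984540054  # 97.5% quantile of the standard normal distribution


def wald_interval(s_a, n_a, s_b, n_b, z=Z95):
    """Wald interval for p_A - p_B; s_* are successes, n_* > 0 are sample sizes."""
    pa, pb = s_a / n_a, s_b / n_b
    half = z * math.sqrt(pa * (1 - pa) / n_a + pb * (1 - pb) / n_b)
    return pa - pb - half, pa - pb + half


def adjusted_interval(s_a, n_a, s_b, n_b, z=Z95):
    """Assumed adjusted-center interval: add one success and one failure per arm."""
    return wald_interval(s_a + 1, n_a + 2, s_b + 1, n_b + 2, z)


if __name__ == "__main__":
    # Hypothetical outcome of a small trial: all successes on A, none on B.
    data = (18, 18, 0, 2)
    for name, interval in (("Wald", wald_interval), ("adjusted", adjusted_interval)):
        lo, hi = interval(*data)
        print(name, "length:", round(hi - lo, 3))
```

When both sample proportions sit at 0 or 1, the estimated standard error is zero and the Wald interval has zero length, so such samples pull its expected length down while contributing nothing to coverage unless the true difference equals the point estimate. The adjusted interval keeps a positive length for the same data.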

Let us compare the three confidence intervals from another point of view: let
the total sample size n vary while $p_A$ and $p_B$ are held fixed.

Figure 4.7 and Figure 4.8 plot, respectively, the coverage probabilities and
mean lengths of the three confidence intervals for n varying from 10 through 100
with $p_A = 0.7$ and $p_B = 0.4$. The Agresti-Coull interval has both the highest
coverage probability and the shortest expected length for most values of n, which
makes it very attractive in applications. Another advantage of the Agresti-Coull
interval is that it can achieve the nominal level even for very small sample
sizes. As the sample size increases, its coverage probability tends to decrease
and then fluctuate around the nominal level, which may be explained by the
central limit theory for adaptive designs.

Figure 4.7: Coverage probabilities of three 95% nominal intervals for n = 10 to
100 and $p_A = 0.7$, $p_B = 0.4$ under RPW(1,1)

[Plot: coverage probability against n for the Wald, Agresti-Coull and profile
likelihood intervals]

Figure 4.8: Expected lengths of three 95% nominal intervals for n = 10 to 100 and
$p_A = 0.7$, $p_B = 0.4$ under RPW(1,1)

[Plot: mean length against n for the Wald, Agresti-Coull and profile likelihood
intervals]

Our extensive simulation shows that the Agresti-Coull interval always has the
most satisfactory coverage probability when $p_A$ and $p_B$ are not far apart
(say, $|p_A - p_B| < 0.5$). When $|p_A - p_B|$ is very large, the profile
likelihood based interval has the highest coverage probability. The expected
length of the Agresti-Coull interval is also satisfactory unless the two
proportions are close to the boundaries.
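A simulation of this kind can be sketched as follows. This is a minimal Monte Carlo sketch under stated assumptions, not the author's simulation code: it assumes the RPW(1,1) urn rule of Wei and Durham (1978) (start with one ball per treatment; a success adds a ball of the same type, a failure adds a ball of the other type), uses the Agresti and Caffo (2000) adjustment as a stand-in for the adjusted-center interval, and counts a Wald interval as missing the target when one treatment receives no patients. All function names are hypothetical, and the profile likelihood interval is omitted.

```python
import math
import random

Z95 = 1.959963984540054  # 97.5% quantile of the standard normal distribution


def rpw_trial(p_a, p_b, n, rng):
    """One trial under the assumed RPW(1,1) rule; returns (s_a, n_a, s_b, n_b)."""
    balls_a, balls_b = 1, 1
    s_a = n_a = s_b = n_b = 0
    for _ in range(n):
        to_a = rng.random() < balls_a / (balls_a + balls_b)
        success = rng.random() < (p_a if to_a else p_b)
        if to_a:
            n_a += 1
            s_a += int(success)
        else:
            n_b += 1
            s_b += int(success)
        # A success on A or a failure on B adds an A ball; otherwise a B ball.
        if to_a == success:
            balls_a += 1
        else:
            balls_b += 1
    return s_a, n_a, s_b, n_b


def wald(s_a, n_a, s_b, n_b, z=Z95):
    """Wald interval for p_A - p_B, or None if a treatment has no patients."""
    if n_a == 0 or n_b == 0:
        return None  # assumption: treated as a miss by coverage()
    pa, pb = s_a / n_a, s_b / n_b
    half = z * math.sqrt(pa * (1 - pa) / n_a + pb * (1 - pb) / n_b)
    return pa - pb - half, pa - pb + half


def adjusted(s_a, n_a, s_b, n_b, z=Z95):
    """Assumed adjusted-center interval: one extra success and failure per arm."""
    return wald(s_a + 1, n_a + 2, s_b + 1, n_b + 2, z)


def coverage(interval, p_a, p_b, n, reps=20000, seed=0):
    """Monte Carlo estimate of the coverage probability of `interval`."""
    rng = random.Random(seed)
    delta = p_a - p_b
    hits = 0
    for _ in range(reps):
        ci = interval(*rpw_trial(p_a, p_b, n, rng))
        if ci is not None and ci[0] <= delta <= ci[1]:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(n, coverage(wald, 0.7, 0.4, n), coverage(adjusted, 0.7, 0.4, n))
```

Using the same seed for both intervals means they are evaluated on the same simulated trials, which makes the comparison less noisy. The estimates depend on the assumptions above and on the number of replications, so they should not be expected to reproduce the figures exactly.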

4.6 Conclusion

In summary, compared to the other intervals discussed, the intervals with
adjusted centers behave best under RPW(1,1). They have both stable and
satisfactory coverage probabilities and expected lengths in a frequentist sense.
The stability of the two intervals makes them good candidates for other adaptive
designs. Our simulation with some other adaptive designs, such as the randomized
adaptive designs and the adaptive weighted difference designs due to Geraldes
(1999), confirms this conclusion. Therefore, we suggest that the intervals with
adjusted centers be used in adaptive designs.

One may expect to improve the coverage performance of the intervals with
adjusted centers for large sample sizes by adjusting the weights of
$\hat{p}_A(k)$ and 1/2 when defining $\tilde{p}_A(k)$ for large k. We may adjust
$\tilde{p}_B(k)$ in the same way.
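To make the weighting idea concrete, the identity below shows how an adjusted center can be written as a weighted average of the sample proportion and 1/2. It assumes the form that adds one success and one failure to treatment A. The symbols $S_A(k)$ (successes on A), $N_A(k)$ (patients on A) and $w_k$ are introduced here for illustration, and the definition used earlier in the dissertation may differ.

```latex
\tilde{p}_A(k) = \frac{S_A(k) + 1}{N_A(k) + 2}
             = w_k \, \hat{p}_A(k) + (1 - w_k) \cdot \frac{1}{2},
\qquad
\hat{p}_A(k) = \frac{S_A(k)}{N_A(k)},
\quad
w_k = \frac{N_A(k)}{N_A(k) + 2}.
```

In this form, adjusting the weights for large k amounts to replacing $w_k$ by another sequence in $[0, 1]$, for instance one that approaches 1 at a different rate.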


Bibliography

AGRESTI, A. and CAFFO, B. (2000). Simple and effective confidence intervals for
proportions and differences of proportions result from adding two successes and
two failures. Amer. Statist. 54 280-288.

AGRESTI, A. and COULL, B. A. (1998). Approximate is better than "exact" for
interval estimation of binomial proportions. Amer. Statist. 52 119-126.

BAI, Z. D., HU, F. and ROSENBERGER, W. F. (2002). Asymptotic properties
of adaptive designs for clinical trials with delayed response. Ann. Statist. 30
122-139.

BERRY, D. (1996). Statistics: A Bayesian Perspective. Belmont, CA: Wadsworth.

BHATTACHARYA, R. N. and RANGA RAO, R. (1976). Normal approximation and
asymptotic expansions. John Wiley & Sons, New York-London-Sydney. Wiley
Series in Probability and Mathematical Statistics.

BROWN, L. D., CAI, T. T. and DASGUPTA, A. (2002). Confidence intervals for
a binomial proportion and asymptotic expansions. Ann. Statist. 30 160-201.

COX, D. R. and HINKLEY, D. V. (1974). Theoretical statistics. Chapman and
Hall, London.

EFRON, B. (1971). Forcing a sequential experiment to be balanced. Biometrika
58 403-417.

EISELE, J. R. (1994). The doubly adaptive biased coin design for sequential
clinical trials. J. Statist. Plann. Inference 38 249-261.

EISELE, J. R. and WOODROOFE, M. B. (1995). Central limit theorems for doubly
adaptive biased coin designs. Ann. Statist. 23 234-254.

GERALDES, M. (1999). Covariates in adaptive designs for clinical trials. Ph.D.
dissertation, Michigan State University.

HALL, P. (1992). The bootstrap and Edgeworth expansion. Springer-Verlag, New
York.

MELFI, V. F. and PAGE, C. (1995). Randomized adaptive designs. Inst. Math.
Statist., Hayward, CA.

MELFI, V. F., PAGE, C. and GERALDES, M. (2001). An adaptive randomized
design with application to estimation. Canad. J. Statist. 29 107-116.

NEWCOMBE, R. G. (1998). Interval estimation for the difference between
independent proportions: Comparison of eleven methods. Statistics in Medicine 17
873-890.

ROSENBERGER, W. F. (1993). Asymptotic inference with response-adaptive
treatment allocation designs. Ann. Statist. 21 2098-2107.

ROSENBERGER, W. F., FLOURNOY, N. and DURHAM, S. D. (1997). Asymptotic
normality of maximum likelihood estimators from multiparameter response-driven
designs. J. Statist. Plann. Inference 60 69-76.

SMYTHE, R. T. and ROSENBERGER, W. F. (1995). Play-the-winner designs,
generalized Polya urns, and Markov branching processes. In Adaptive designs
(South Hadley, MA, 1992). Inst. Math. Statist., Hayward, CA 13-22.

WEI, L. J. (1978). The adaptive biased coin design for sequential experiments.
Ann. Statist. 6 92-100.

WEI, L. J. and DURHAM, S. (1978). The randomized play-the-winner rule in
medical trials. J. Amer. Statist. Assoc. 73 840-843.

WEI, L. J., SMYTHE, R. T., LIN, D. Y. and PARK, T. S. (1990). Statistical
inference with data-dependent treatment allocation rules. J. Amer. Statist. Assoc.
85 156-162.

WILSON, E. (1927). Probable inference, the law of succession, and statistical
inference. J. Amer. Statist. Assoc. 22 209-212.

WOODROOFE, M. (1982). Sequential allocation with covariates. Sankhya Ser. A
44 403-414.