
This is to certify that the

dissertation entitled
Covariates in Adaptive Designs

for Clinical Trials

presented by

Margarida Cristina Geraldes

has been accepted towards fulﬁllment
of the requirements for

Ph.D. degree in Statistics and Probability

 

 

Major professor

Date 06- 30*‘19

 


COVARIATES IN ADAPTIVE DESIGNS FOR CLINICAL TRIALS
By

Margarida Cristina Geraldes

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Statistics and Probability

1999


ABSTRACT
COVARIATES IN ADAPTIVE DESIGNS FOR CLINICAL TRIALS
By

Margarida Cristina Geraldes

This dissertation addresses the problem of designing clinical trials in such a way
that a good compromise is achieved between the need to draw reliable statistical
inferences from the data collected in the trial (utilitarian goal of a design) and the
concern that each patient enrolled in the trial is receiving the best possible medical
care (individualistic goal). These goals are often conflicting. The problem can further
be complicated when relevant clinical information (covariates), which is likely to affect
the responses of the patients to the treatments, is to be incorporated into the design.

In this dissertation we develop and study three new designs that seek a com-
promise between the utilitarian and individualistic goals. These procedures can be
implemented in clinical trials for which eligible patients arrive sequentially and can
be given either one of the two treatments being compared or evaluated in the trial.
It is assumed that the responses of patients to the treatments are dichotomous (i.e.
either a success or a failure). The new designs are randomized and adaptive. The first
design is called The Adaptive Weighted Differences Design, abbreviated AWD, and
does not use covariate information on the patients. It can be seen as a generalization
of The Adaptive Biased Coin Design of Wei (1978), so that ethical issues are also
taken into consideration. This is achieved by taking into account, at each stage of the
trial, both the proportion of patients assigned to each treatment and the proportion
of patients successfully treated in each treatment group. The second design is called
The Covariate Adaptive Weighted Differences Design, abbreviated CAWD, and incorporates
covariates into the AWD design using an innovative approach that consists in
crossing over information on the responses of patients from stratum to stratum. The
third design is called The Covariate Randomized Play-the-Winner Rule, abbreviated
CRPW, and corresponds to a multiple urn model; each urn represents a stratum.
The CRPW rule can be seen as a generalization of The Randomized Play-the-Winner
Rule of Wei and Durham (1978) within strata. It allows the responses of patients
in one stratum to change the composition of the urns corresponding to all the possi-
ble strata of the population of patients. Strong laws of large numbers and a central
limit theorem are proved for each design. The proofs rely on martingale techniques.
The main result for the CRPW rule is that the proportions of balls representing each
treatment in each urn converge almost surely as the number of patients enrolled in
the trial tends to infinity. The proof of this type of result for single urn models is
rather involved. The crossover of information on the responses of patients significantly
complicates the arguments needed to prove this type of convergence. Monte Carlo
simulations are used to evaluate the performance of the designs. The simulations
illustrate the excellent combined performance of the designs (in terms of both the
proportion of patients successfully treated and the mean squared error for estimating
the treatment difference), for suitable choices of their parameters, when compared
to complete randomization and the randomized play-the-winner rule.

To Miguel.


ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my dissertation advisers, Professor
Connie Page and Professor Vincent Melﬁ, for their constant guidance and mentorship
during the past two and a half years. Thank you for always having a word of encouragement
when I needed it the most. I sincerely appreciate the time you put into our
weekly meetings and that you were still willing to answer my questions when I came
knocking at your doors. Thank you for all the hours you spent proofreading my many
dissertation drafts. My job search showed me how important a good research theme
is; I am extremely grateful you suggested that I work in adaptive designs. Finally,
thank you so much for believing in me and giving me the courage to pursue a career
in the pharmaceutical industry.

I would also like to thank Professor James Stapleton and Professor Peter Lappan
for serving on my guidance committee and carefully reading my dissertation. I am
sure you could think of better things to do in these beautiful spring days.

I want to thank Professor Page for teaching me how to be a good consultant
(or so I would hope). Thank you for giving me the opportunity to learn applied
statistics, SAS and SPSS and for your words of encouragement during the entire SCS
experience. Thank you for being so friendly and patient, and for always having the
time and tact to ﬁx problems that arose when I allowed clients to step over the line.
There is nothing I would want to change in the way you supervised the activities of

the SCS. Finally, I can never thank you enough for agreeing to chair my
guidance committee. It's how all this started.

I also want to thank Professor Melﬁ for ﬁrst teaching me two of my favorite
subjects: martingales and adaptive designs. You are amongst the best teachers I
have had in my many years as a student. Thank you also for introducing me to
LaTeX, re-introducing me to Fortran and for not losing your patience every time I
went running to your office asking you to fix my errors. I shudder to imagine how
much time I would have spent if I had had to learn, on my own, all the computer-related
things that you taught me.

I would not want to forget to thank Professor Stapleton for being so friendly
when I ﬁrst contacted the department of Statistics and Probability. Thanks to your
efﬁciency and warm e-mails I never once doubted my decision to come to a country
on the other side of the Atlantic Ocean to continue my graduate studies. Thank you
for offering me an assistantship even though I had not requested one and for helping
me whenever my University back home tried to make my life difﬁcult. Finally, I truly
enjoyed taking your classes in linear models and categorical data analysis; the subject
and your teaching of it were like a breath of fresh air after some of the courses I had
to take in this department.

Last but not least, I want to thank my parents and brothers who encouraged
me to live my life on my own terms and to make my own decisions. I know you will
always be there to pick up the pieces...


Contents

LIST OF FIGURES
1 Literature Review
1.1 Introduction
1.2 Allocation-Adaptive Designs
1.3 Response-Adaptive Designs
1.4 Applications
2 The Adaptive Weighted Differences Design
2.1 Introduction
2.2 The Allocation Policy
2.3 Strong Laws of Large Numbers
2.4 Central Limit Theorem
2.5 Evaluation of the Design
3 The Covariate Adaptive Weighted Differences Design
3.1 Introduction
3.2 The Allocation Policy
3.2.1 General Notation and Assumptions
3.2.2 The CAWD Allocation Policy
3.3 Strong Laws of Large Numbers
3.4 Central Limit Theorem
3.5 Evaluation of the Design
4 The Covariate Randomized Play-the-Winner Rule
4.1 Introduction
4.2 The Allocation Policy
4.3 Strong Laws of Large Numbers
4.4 Central Limit Theorem
4.5 Evaluation of the Design


List of Figures

2.1 Comparisons (AWD design) for $n = 20$ and $p_A = 0.05$
2.2 Comparisons (AWD design) for $n = 20$ and $p_A = 0.35$
2.3 Comparisons (AWD design) for $n = 20$ and $p_A = 0.50$
2.4 Comparisons (AWD design) for $n = 20$ and $p_A = 0.65$
2.5 Comparisons (AWD design) for $n = 20$ and $p_A = 0.95$
2.6 Comparisons (AWD design) for $n = 100$ and $p_A = 0.05$
2.7 Comparisons (AWD design) for $n = 100$ and $p_A = 0.35$
2.8 Comparisons (AWD design) for $n = 100$ and $p_A = 0.50$
2.9 Comparisons (AWD design) for $n = 100$ and $p_A = 0.65$
2.10 Comparisons (AWD design) for $n = 100$ and $p_A = 0.95$
3.1 Comparisons of proportion of successes (CAWD design) for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$
3.2 Comparisons of mean squared error in stratum 1 (CAWD design) for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$
3.3 Comparisons of mean squared error in stratum 2 (CAWD design) for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$

3.4 Comparisons of proportion of successes (CAWD design) for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$
3.5 Comparisons of mean squared error in stratum 1 (CAWD design) for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$
3.6 Comparisons of mean squared error in stratum 2 (CAWD design) for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$
3.7 Comparisons of proportion of successes (CAWD design) for $n = 30$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$
3.8 Comparisons of mean squared error in stratum 1 (CAWD design) for $n = 30$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$
3.9 Comparisons of mean squared error in stratum 2 (CAWD design) for $n = 30$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$
3.10 Comparisons of proportion of successes (CAWD design) for $n = 150$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$
3.11 Comparisons of mean squared error in stratum 1 (CAWD design) for $n = 150$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$
3.12 Comparisons of mean squared error in stratum 2 (CAWD design) for $n = 150$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$
3.13 Comparisons of proportion of successes (CAWD design) for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$
3.14 Comparisons of mean squared error in stratum 1 (CAWD design) for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$

3.15 Comparisons of mean squared error in stratum 2 (CAWD design) for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$
3.16 Comparisons of proportion of successes (CAWD design) for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$
3.17 Comparisons of mean squared error in stratum 1 (CAWD design) for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$
3.18 Comparisons of mean squared error in stratum 2 (CAWD design) for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$
4.1 Comparisons of proportion of successes (CRPW design) for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$
4.2 Comparisons of mean squared error in stratum 1 (CRPW design) for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$
4.3 Comparisons of mean squared error in stratum 2 (CRPW design) for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$
4.4 Comparisons of proportion of successes (CRPW design) for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$
4.5 Comparisons of mean squared error in stratum 1 (CRPW design) for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$
4.6 Comparisons of mean squared error in stratum 2 (CRPW design) for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$
4.7 Comparisons of proportion of successes (CRPW design) for $n = 30$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$

4.8 Comparisons of mean squared error in stratum 1 (CRPW design) for $n = 30$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$
4.9 Comparisons of mean squared error in stratum 2 (CRPW design) for $n = 30$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$
4.10 Comparisons of proportion of successes (CRPW design) for $n = 150$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$
4.11 Comparisons of mean squared error in stratum 1 (CRPW design) for $n = 150$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$
4.12 Comparisons of mean squared error in stratum 2 (CRPW design) for $n = 150$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$
4.13 Comparisons of proportion of successes (CRPW design) for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$
4.14 Comparisons of mean squared error in stratum 1 (CRPW design) for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$
4.15 Comparisons of mean squared error in stratum 2 (CRPW design) for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$
4.16 Comparisons of proportion of successes (CRPW design) for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$
4.17 Comparisons of mean squared error in stratum 1 (CRPW design) for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$
4.18 Comparisons of mean squared error in stratum 2 (CRPW design) for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$

 


Chapter 1

Literature Review

1.1 Introduction

Consider a clinical trial to evaluate the relative effectiveness of two drugs, A and B,
in reducing the risk of rejection following kidney transplant. Suppose that eligible
patients arrive sequentially and must be treated immediately and that each patient
can be given either one of the drugs, A or B, and will be assigned to exactly one of
them.

The problem is to decide which patients involved in the study will be allocated
to which drug so that two goals are achieved. On the one hand it is desirable that the
data collected in the study can be used to draw reliable statistical inferences for
the benefit of future patients (this can be thought of as a utilitarian goal [Sarkar
(1991)]). On the other hand each patient should be allocated to the drug showing the
best performance thus far in the study (this can be thought of as an individualistic
goal [Sarkar (1991)]).

 


A design for the clinical trial is a procedure that attempts to provide the physi-
cian with a solution to this problem. Unfortunately, the utilitarian and individualistic
goals are usually conflicting. Thus, designs are developed that either focus on one
of the goals or seek a compromise between them. The problem can further be com-
plicated when relevant clinical information (such as age, sex, general physical status,
severity of the disease, etc.), which is likely to affect the responses of the patients
to the drugs, is to be incorporated into the design. Using this covariate information
should make the design more efficient in terms of achieving the individualistic goal
but is likely to complicate statistical inference.

In this dissertation three new designs that seek a compromise between the utilitarian
and individualistic goals are developed and studied. The first, called The Adaptive
Weighted Differences Design, is presented in Chapter 2 and does not use covariate
information on the patients. The second, called The Covariate Adaptive Weighted
Differences Design, is presented in Chapter 3 and incorporates covariates into the
adaptive weighted differences design using an innovative approach. The third, called
The Covariate Randomized Play-the-Winner Rule, is presented in Chapter 4 and uses
the approach introduced in Chapter 3 to incorporate covariates into the Randomized
Play-the-Winner Rule proposed by Wei and Durham (1978).

Before presenting and studying these new designs we begin by reviewing some
of the numerous procedures that have already been proposed in the literature. This
review will serve several purposes. Firstly, it introduces the designs that are used
as basic points of reference for developing the three new procedures described and

studied in later chapters. Secondly, it gives an idea about the different approaches

 


(for example, Bayesian theory, iterative procedures, optimum design theory) that have
been considered for developing designs for clinical trials. Finally, this review can be
regarded as an introduction and motivation for the three new procedures presented in
this dissertation; it will be particularly useful in showing what the novel aspects
of these new designs are.

The emphasis of this dissertation is on designs with covariates. Such designs will
henceforth be referred to as covariate designs, as opposed to non-covariate designs,
which do not take into consideration covariate information on the patients involved
in the clinical trial. The main focus of this chapter is on reviewing covariate designs.

Most of the designs that will now be presented fit into one of two general categories.
The first includes those approaches for which the allocation of each patient depends on
his or her covariate levels (when covariate information is taken into consideration) and
on the allocations and covariate levels of the previous patients; these will be referred
to as allocation-adaptive designs. The second category includes those approaches for
which the allocation of each patient depends also on the responses of the previous
patients; these will be called response-adaptive designs.

1.2 Allocation-Adaptive Designs

As deﬁned in the previous section, allocation-adaptive designs are such that the allo-
cation of each patient depends only on the allocations of the previous patients (and
possibly on the covariate levels of the present and previous patients). These designs

do not take into consideration the responses of patients to the treatments under evaluation
in a clinical trial and, therefore, do not address the individualistic issue. They
are oriented towards statistical inference. A popular approach, with this goal in mind,
is assigning treatments in such a way that some degree of balance is achieved in terms
of the number of patients allocated to each treatment. When covariates are consid-
ered, balance is also sought in the distribution of treatments across covariate levels.
This section reviews some of the approaches developed for achieving balanced alloca-
tions. Although the main emphasis will be on designs which incorporate covariates,
procedures which do not make use of this extra information will also be presented.
Traditional designs such as Complete Randomization, Randomized Permuted Blocks
[Blackwell and Hodges (1957)] and The Biased Coin Design [Efron (1971)] are often
used as basic points of reference when developing approaches that do use covariates.
We start with a brief review of these three non-covariate designs and show how they
have been modified to include covariates. The new procedures that will be developed
in Chapters 2 and 3 are related to The Adaptive Biased Coin Design proposed by
Wei (1978); this design is also included in this review.

In the remainder of this section, unless otherwise noted, it is assumed that patients
arrive sequentially at the experimental site and that each patient can be assigned
to either one of $T \geq 2$ treatments and will be assigned to exactly one of them.
Furthermore, when covariate information is to be used in the design, it is assumed that
covariate measurements are obtained on each patient before allocation to a treatment.

Throughout this section, reference is made only to categorical covariates. It is
worth mentioning that discrete or continuous covariates can also be considered by
grouping the possible values of the covariates into ﬁnitely many levels.


Suppose that there are $T \geq 2$ treatments under evaluation. Complete Randomization
consists of assigning each patient to treatment $t$, $t \in \{1, \ldots, T\}$, with probability
$1/T$. As discussed in Efron (1971), when the main emphasis of a design for a clinical
trial is on the utilitarian aspect, complete randomization is well established, for
it can be used as a basis for statistical inference while minimizing the possibility of
conscious or unconscious selection bias [Blackwell and Hodges (1957)] on the part of
the medical investigator. Ethical considerations aside, a major disadvantage of complete
randomization is that some unpleasant imbalances in treatment totals are likely,
especially when the number of patients in the trial is small. This problem becomes
even more apparent when covariates are considered relevant and strata are formed
by considering subgroups of patients with common combinations of covariate levels.
Complete randomization in this setting can create serious imbalances in treatment
totals across strata.
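
As a rough illustration of complete randomization and of the imbalance it can leave in a small trial, the following minimal Python sketch may help. The function names, the 0-based treatment labels, the seed and the trial size of 20 are assumptions made for this example only; they do not come from the designs reviewed here.

```python
import random


def complete_randomization(n_patients, n_treatments, rng):
    """Assign each patient independently to one of T treatments,
    each with probability 1/T (treatments are labelled 0, ..., T-1)."""
    return [rng.randrange(n_treatments) for _ in range(n_patients)]


def treatment_totals(assignments, n_treatments):
    """Count how many patients were allocated to each treatment."""
    totals = [0] * n_treatments
    for t in assignments:
        totals[t] += 1
    return totals


def largest_imbalance(totals):
    """Difference between the largest and smallest treatment totals."""
    return max(totals) - min(totals)


rng = random.Random(1999)  # arbitrary seed, chosen only for reproducibility
assignments = complete_randomization(20, 2, rng)
totals = treatment_totals(assignments, 2)
print(totals, largest_imbalance(totals))
```

Rerunning the sketch with different seeds gives a feel for how often a trial of this size ends up noticeably unbalanced.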

The Randomized Permuted Blocks Design, introduced by Blackwell and Hodges
(1957), assumes that the total number of patients involved in the trial is known in
advance. The design can be described as follows. If there are two treatments under
evaluation, randomly divide the number of patients into blocks of length 2b. Then
randomly assign, within each block, $b$ patients to each one of the treatments. Although
this design can be quite effective in achieving balance while retaining some
randomization, its main disadvantage is that the assignments of some patients will be
known in advance. This procedure can be modified to allow for covariates by simply
assigning patients within each stratum by means of randomized permuted blocks, as
proposed by Zelen (1974). However, the total number of patients in each stratum is
not usually known in advance, which is a restriction on the applicability of this proce-
dure. Also, Pocock and Simon (1975) note that a major difficulty with this approach
is that the number of strata increases rapidly as the number of covariates and their
levels increase. Furthermore, they argue that randomized permuted blocks within
strata can prove inadequate to achieve its basic goal (balance within each stratum)
as the number of strata approaches the number of patients involved in the clinical trial.
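
The permuted blocks procedure for two treatments, and the fact that some assignments are known in advance, can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the total number of patients is a multiple of the block length $2b$.

```python
import random


def permuted_blocks(n_patients, b, rng):
    """Two-treatment randomized permuted blocks: each block of length 2b
    contains exactly b patients on A and b on B, in random order.
    Assumes n_patients is a multiple of 2b (a simplification of this sketch)."""
    block_len = 2 * b
    if n_patients % block_len != 0:
        raise ValueError("n_patients must be a multiple of 2b in this sketch")
    assignments = []
    for _ in range(n_patients // block_len):
        block = ["A"] * b + ["B"] * b
        rng.shuffle(block)
        assignments.extend(block)
    return assignments


def predictable_positions(assignments, b):
    """Indices whose assignment is already determined by the earlier
    assignments in the same block (one treatment has reached its quota b)."""
    forced = []
    block_len = 2 * b
    for start in range(0, len(assignments), block_len):
        n_a = n_b = 0
        for i in range(start, start + block_len):
            if n_a == b or n_b == b:
                forced.append(i)
            if assignments[i] == "A":
                n_a += 1
            else:
                n_b += 1
    return forced


rng = random.Random(0)  # arbitrary seed
seq = permuted_blocks(12, 2, rng)
print(seq, predictable_positions(seq, 2))
```

In this sketch the last patient of every block is always predictable, which is the disadvantage noted above.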

The Biased Coin Design, proposed by Efron (1971), allocates patients to one
of two treatments, A or B, according to the following rule. Suppose that after k
assignments, $N_{A,k}$ patients have been allocated to treatment A and the remainder
$N_{B,k} = k - N_{A,k}$ have been allocated to treatment B. Let $D_k$ denote the difference
$N_{A,k}/k - N_{B,k}/k$. Let $p$ be a constant in $[0, 1]$. Then, for $p \geq 1/2$,

if $D_k > 0$, allocate the $(k + 1)$st patient to treatment A with probability $1 - p$;

if $D_k = 0$, allocate the $(k + 1)$st patient to treatment A with probability $1/2$;

if $D_k < 0$, allocate the $(k + 1)$st patient to treatment A with probability $p$.

This allocation policy tends to balance the number of patients allocated to both
treatments, the tendency being weakest if $p = 1/2$ (which corresponds to complete
randomization) and strongest if $p = 1$ (which corresponds to randomized permuted
blocks with $b = 1$). Wei (1978) notes that a disadvantage of this procedure is that,
in assigning the next patient to a treatment, the allocation policy neither takes into
consideration the number of patients treated thus far, nor does it discriminate between
small and large absolute values of $D_k$. Wei (1978) proposes a new procedure
of the biased coin type that takes these issues into consideration.
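
A minimal Python sketch of the biased coin allocation rule described above is given below. The function names are hypothetical. The sketch works with the sign of $N_{A,k} - N_{B,k}$, which agrees with the sign of $D_k$ for $k \geq 1$, and it treats the first patient (for whom $D_k$ is not defined) as the balanced case; both choices are assumptions of this example.

```python
import random


def biased_coin_prob_a(n_a, n_b, p):
    """Probability of allocating the next patient to treatment A under the
    biased coin rule with parameter p in [1/2, 1].
    The sign of n_a - n_b matches the sign of D_k for k >= 1."""
    if not 0.5 <= p <= 1.0:
        raise ValueError("p must lie in [1/2, 1]")
    if n_a > n_b:
        return 1.0 - p
    if n_a < n_b:
        return p
    return 0.5


def run_biased_coin(n_patients, p, rng):
    """Allocate n_patients sequentially and return the totals (N_A, N_B)."""
    n_a = n_b = 0
    for _ in range(n_patients):
        if rng.random() < biased_coin_prob_a(n_a, n_b, p):
            n_a += 1
        else:
            n_b += 1
    return n_a, n_b


print(run_biased_coin(20, 0.75, random.Random(0)))  # arbitrary seed and p
```

Setting `p = 1.0` in this sketch forces every second patient onto the lagging treatment, which matches the remark that $p = 1$ corresponds to permuted blocks with $b = 1$.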

The Adaptive Biased Coin Design, proposed by Wei (1978), allocates patients
according to the following rule. Suppose that after $k$ assignments, $N_{A,k}$ patients
have been allocated to treatment A and the remainder $N_{B,k} = k - N_{A,k}$ have been
allocated to treatment B. Let $D_k$ denote the difference $N_{A,k}/k - N_{B,k}/k$ and let
$h : [-1,1] \to [0,1]$ be a non-increasing function such that $h(x) = 1 - h(-x)$, for all
$x \in [-1,1]$. Then, allocate the $(k + 1)$st patient to treatment A with probability
$h(D_k)$. This allocation policy forces an extremely imbalanced experiment to be balanced
but tends to complete randomization as the difference $D_k$ tends to zero (i.e.
as the experiment approaches perfect balance). Efron's biased coin design with parameter
$p$ is the particular case of the adaptive biased coin design corresponding to
setting $h(x) = p$ for $-1 \leq x < 0$ and $h(0) = 1/2$.
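
The adaptive biased coin rule can be sketched in Python along the following lines. The linear choice $h(x) = (1 - x)/2$ is an assumption made for illustration (it is non-increasing and satisfies $h(x) = 1 - h(-x)$); the function names are hypothetical, and the first patient is allocated with probability $1/2$ because $D_k$ is not defined for $k = 0$. The step function returned by `make_h_step` reproduces the biased coin special case, with $h(x) = 1 - p$ for $0 < x \leq 1$ following from the symmetry condition.

```python
import random


def h_linear(x):
    """Illustrative allocation function h(x) = (1 - x) / 2 on [-1, 1]."""
    return (1.0 - x) / 2.0


def make_h_step(p):
    """Step function giving the biased coin design with parameter p."""
    def h(x):
        if x < 0:
            return p
        if x > 0:
            return 1.0 - p  # implied by h(x) = 1 - h(-x)
        return 0.5
    return h


def adaptive_biased_coin(n_patients, h, rng):
    """Allocate patients sequentially; patient k+1 goes to A with
    probability h(D_k), where D_k = (N_A - N_B) / k."""
    n_a = n_b = 0
    for k in range(n_patients):
        d_k = (n_a - n_b) / k if k > 0 else 0.0  # assumption: D_0 treated as 0
        if rng.random() < h(d_k):
            n_a += 1
        else:
            n_b += 1
    return n_a, n_b


print(adaptive_biased_coin(20, h_linear, random.Random(0)))  # arbitrary seed
print(adaptive_biased_coin(20, make_h_step(0.75), random.Random(0)))
```

With `h_linear`, the pull towards balance is strong when $|D_k|$ is large and fades as $D_k$ approaches zero, which is the behaviour the step function cannot reproduce.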

Both Efron (1971) and Wei (1978) mention that if covariate information is available,
it can be taken into consideration by applying the biased coin design procedures
separately within each stratum.

Pocock and Simon (1975) suggest an allocation rule which can be viewed as a
generalization of Efron’s biased coin design to more than two treatments and several
covariates. The design relies on a function G which measures the total amount of
imbalance (in the distribution of treatment numbers within the levels of each covariate)
resulting from each one of the possible assignments at each stage. Treatments
are then ranked according to their G-values (rank 1 = minimal imbalance value) and
allocation probabilities are such that the smaller the G-value for a treatment, the bigger
its chance of being assigned to the patient. Several possibilities for the G function
and allocation probabilities are suggested. Finally, simulations are used to compare
the performance of this new approach and three traditional designs: complete randomization,
randomly permuted blocks and randomly permuted blocks within strata.
For small trials, two treatments, a specific function G and other assumptions on the
distribution of covariate levels, Pocock and Simon's procedure is shown to enable
treatments to be balanced across several covariates more effectively than the other
three designs.
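
One possible shape of such an allocation rule is sketched below in Python. The specific choices are assumptions of this example rather than details taken from the text above: G is taken to be the sum, over covariates, of the range of the treatment totals at the new patient's level; the best-ranked treatment receives probability `p_best` and the remaining treatments share the rest equally; ties in G are broken at random. The function names and the nested-list layout `counts[covariate][level][treatment]` are hypothetical.

```python
import random


def imbalance_if_assigned(counts, levels, t):
    """G-value for hypothetically giving treatment t to a patient whose
    covariate levels are `levels`: the sum over covariates of
    (max - min) of the treatment totals at that patient's level."""
    total = 0
    for cov, level in enumerate(levels):
        row = list(counts[cov][level])
        row[t] += 1
        total += max(row) - min(row)
    return total


def minimization_assign(counts, levels, n_treatments, p_best, rng):
    """Rank treatments by G-value, pick one with rank-based probabilities,
    update the counts in place and return (chosen treatment, G-values)."""
    if n_treatments < 2:
        raise ValueError("at least two treatments are required")
    g = [imbalance_if_assigned(counts, levels, t) for t in range(n_treatments)]
    order = list(range(n_treatments))
    rng.shuffle(order)                # random tie-breaking
    order.sort(key=lambda t: g[t])    # stable sort: rank 1 = smallest G
    others = (1.0 - p_best) / (n_treatments - 1)
    weights = [p_best] + [others] * (n_treatments - 1)
    chosen = rng.choices(order, weights=weights, k=1)[0]
    for cov, level in enumerate(levels):
        counts[cov][level][chosen] += 1
    return chosen, g


# Two covariates (with 2 and 3 levels) and two treatments, all counts zero.
demo_counts = [[[0, 0], [0, 0]], [[0, 0], [0, 0], [0, 0]]]
demo_rng = random.Random(0)  # arbitrary seed
for patient_levels in [(0, 2), (0, 1), (1, 2), (0, 2)]:
    print(minimization_assign(demo_counts, patient_levels, 2, 0.8, demo_rng))
```

Setting `p_best = 1.0` turns the sketch into a deterministic rule (apart from tie-breaking), while smaller values retain more randomization.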

Atkinson (1982) uses optimum design theory to develop a randomized balanced
design of the biased coin type, for clinical trials with two or more treatments and
in the presence, or absence, of covariate information. The procedure can be applied
to clinical trials for which interest lies mainly in contrasts between treatment effects
on a response variable; the expected response is assumed to be linearly related
to the treatments and (if present) covariates. In this context, Atkinson refers to
$D_A$-optimum design theory [Sibson (1974)] to give a procedure for obtaining the assignment
probabilities in a biased coin rule. This procedure is called the $D_A$-Optimal
Biased Coin Design. Analytic expressions for the assignment probabilities are obtained
only in the absence of covariates. Similarly to Wei's adaptive biased coin
design, this new procedure is also shown to respond to increasing imbalance.


Atkinson (1998) follows up Atkinson (1982) and uses simulations to compare the
properties of the $D_A$-optimal biased coin design with those of three other procedures:
complete randomization, a balanced deterministic approach and Efron’s biased coin
design. Such properties are related to the loss of information due to imbalance, which
can be expressed as the number of patients on whom information is unavailable at each
stage of the trial. For two treatments the loss is deﬁned as the difference between the
number of patients treated thus far and the ratio between the variance of the response
variable and the variance of the estimated contrast of treatment effects. An analytical
expression for the loss is also derived when more than two treatments are considered;
the loss is, in this case, expressed as a function of the number of patients treated thus
far, the number of treatments, the design matrix and the matrix of contrasts. For
two treatments only, simulations are run both under the assumption of independent
covariates and correlated covariates, and for more than two treatments simulations
are run only under the assumption of independent covariates. The results show,
as expected, that the balanced deterministic approach has the best performance, in
terms of the loss, while complete randomization has the very worst. Efron’s biased
coin design is consistently better than Atkinson’s D A—optimal biased coin design. It
is also noted that, for all these designs, the loss increases as the number of covariates

increases, and that it is higher for correlated than for independent covariates.

Ball, Smith, and Verdinelli (1993) develop, within the Bayesian framework, a
randomized balanced design of the biased coin type for clinical trials with $T \ge 2$
treatments and in the presence of one covariate. The procedure only has direct
practical relevance to clinical trials where a pool of covariate-categorized potential
patients is available from which the new patients can be selected; this is one of the
main disadvantages of the design. The allocations and the model for the responses
of patients to treatments can be described as follows. Suppose that the covariate
has levels $v_j$ ($j = 1, \dots, J$) and that $k$ patients have already been assigned, with
proportions $f_{ij} = n_{ij}/k$ allocated to treatment $i$ ($i = 1, \dots, T$) when the covariate
level was $v_j$. Suppose that $k$ more patients are to be allocated, with $m_{ij} \ge 0$ patients
with covariate level $v_j$ to be allocated to treatment $i$. Let $p_{ij} = m_{ij}/k$. So, if $s_{ij} =
p_{ij} + f_{ij}$, then the overall design for the $2k$ observations allocates $k s_{ij} = k p_{ij} + k f_{ij}$
patients with covariate value $v_j$ to treatment $i$. The responses, $y_{ijl}$, of patients to
treatments are assumed to be such that

$$y_{ijl} \sim N(\theta_i + \beta v_j,\ \sigma^2),$$

where $i = 1, \dots, T$, $j = 1, \dots, J$, $l = 1, \dots, n_{ij} + m_{ij}$, $\sum_i \sum_j (n_{ij} + m_{ij}) = 2k$, and the
$y_{ijl}$ are conditionally independent given $\theta_i$, $\beta$ and $\sigma^2$. It is assumed that the effects of
the $T$ treatments or $T - 1$ new treatments (when a control treatment is considered) are
exchangeable with a common variance $\tau^2$ and that both $\sigma^2$ and $\tau^2$ are known. Ball,
Smith, and Verdinelli (1993) show how the optimal proportions $p^*_{ij}$ can be identified
with respect to design criteria such as D-optimality and A-optimality [Silvey (1980)].
The probability of choosing the $(k + 1)$st patient to have covariate value $v_j$ and to
be assigned to treatment $i$ is then computed based on the relation between these optimal
proportions and $p_{ij}$. Asymptotic properties of the above allocation scheme are considered when there
is either a vague prior knowledge of the exchangeable treatments or a strong prior
specification of the exchangeable treatments implying that they are essentially identical.
It is shown that, in these particular cases, the proportion of patients allocated
to each treatment converges almost surely to determined limits as the total number
of patients tends to infinity.

Wu (1981) gives an iterative construction of nearly balanced assignments that
can be applied in the context of clinical trials with $T \ge 2$ treatments and in the
presence of $p \ge 1$ covariates with, possibly, different numbers of levels. The procedure
can only be implemented if the covariate measurements of all the patients that will
be involved in the trial are known in advance. The new design criterion, called the
B-criterion, can be described as follows. Suppose that $N$ patients are available for
the experiment. Each patient has $p$ covariate measurements. Assume that the $i$th
covariate can take $r_i$ different levels, so that each patient has a combination of levels
$u = (u_1, \dots, u_p)$, $1 \le u_i \le r_i$, where the $i$th covariate is at level $u_i$. A measure of
imbalance of the assignment, within the set of patients with $i$th covariate at level $u_i$,
is defined as

$$\sum_{t=1}^{T} \Big( n_t(u_i) - \frac{1}{T} \sum_{t'=1}^{T} n_{t'}(u_i) \Big)^2,$$

where $n_t(u_i)$ represents the number of patients allocated to treatment $t$ and with $i$th
covariate at level $u_i$. Let $\lambda_i$ be a weight reflecting the importance of covariate $i$. Then,
the total imbalance of the assignment is defined as

$$I_B = \sum_{i=1}^{p} \lambda_i \sum_{u_i=1}^{r_i} \sum_{t=1}^{T} \Big( n_t(u_i) - \frac{1}{T} \sum_{t'=1}^{T} n_{t'}(u_i) \Big)^2,$$

 

and a treatment assignment is called B-optimal if the corresponding $I_B$ measure is
minimized among all possible assignments. A sufficient condition for B-optimality
is derived. The first step in the procedure is to check whether the number of patients
with level combination $u$, denoted $n(u)$, is at least $T$. If this is the case, say
$n(u) = qT + r$, $0 \le r < T$, randomly assign $qT$ patients in stratum $u$ to the $T$ treatments
so that each treatment is assigned to $q$ patients. This does not change the $I_B$
measure. Now the remaining $n(u)$ is less than $T$ and the so-called B-algorithm can be
executed. This heuristic method for constructing nearly B-optimal designs consists,
essentially, of applying a routine which is based on measuring two effects: the first is
the effect of switching a patient with a combination of levels $u$ from receiving treatment
$s$ to receiving treatment $t$; the second is the effect of exchanging the treatment
assignments of a patient with a combination of levels $u$ and another patient with a
combination of levels $v$, from $(s, t)$ to $(t, s)$. As initially described, this balanced
design is deterministic and tries to balance assignments only across main effects. Wu
(1981) also presents some modifications that can be introduced in the B-algorithm,
so that higher order effects and randomization can be taken into consideration. Possible
advantages of this design over, for example, the design of Atkinson (1982) are
that it neither assumes the existence of a regression model for the responses nor
requires matrix inversion. Still, the relative merits of these approaches are not
discussed. The B-optimal design can be particularly useful when $N$, the number of
patients involved in the trial, is not much larger than $r_1 \times \cdots \times r_p$, the total number
of possible level combinations. Its main disadvantage is that information on the
covariate levels of the patients may not be available in advance.
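
To make the B-criterion concrete, the following minimal sketch computes the total imbalance $I_B$ of a given assignment as a weighted sum, over covariates and levels, of the squared deviations of the per-treatment counts from their mean. The function name `total_imbalance`, its argument layout and the default of unit weights are illustrative assumptions; this is not Wu's B-algorithm, only an evaluation of the criterion it tries to minimize.

```python
from collections import Counter


def total_imbalance(levels, treatments, n_treatments, weights=None):
    """Total imbalance I_B of an assignment (illustrative sketch).

    levels[k][i]   -- level of covariate i for patient k
    treatments[k]  -- treatment index in range(n_treatments) for patient k
    weights[i]     -- importance weight of covariate i (assumed 1.0 if omitted)
    """
    n_covariates = len(levels[0])
    if weights is None:
        weights = [1.0] * n_covariates
    total = 0.0
    for i in range(n_covariates):
        # Count patients by (level of covariate i, treatment).
        counts = Counter((u[i], t) for u, t in zip(levels, treatments))
        # Levels with no patients contribute zero, so observed levels suffice.
        for level in {u[i] for u in levels}:
            n = [counts[(level, t)] for t in range(n_treatments)]
            mean = sum(n) / n_treatments
            total += weights[i] * sum((x - mean) ** 2 for x in n)
    return total


# Four patients, two binary covariates, two treatments.
patients = [(1, 1), (1, 2), (2, 1), (2, 2)]
balanced = total_imbalance(patients, [0, 1, 1, 0], n_treatments=2)
unbalanced = total_imbalance(patients, [0, 0, 1, 1], n_treatments=2)
```

In this small example the first assignment balances both covariates, while the second confounds treatment with the first covariate and is penalized accordingly.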

1.3 Response-Adaptive Designs

Recall that response-adaptive designs are such that the allocation of each patient
depends both on the previous allocations and on the responses of the patients to the
treatments under evaluation. Such designs are used when the individualistic goal
takes priority or when some compromise is sought between both the individualistic
and the utilitarian goals. Incorporating covariates in this setting is quite natural and
can, in some cases, simplify the design (see, for example, Woodroofe (1979)).

This section focuses on response-adaptive designs with covariates. It begins,
however, with a traditional non-covariate design, The Randomized Play-the-Winner
Rule proposed by Wei and Durham (1978), which is the basic point of reference for
the development of The Covariate Randomized Play-the-Winner Rule in Chapter 4.

In what follows it is assumed that patients arrive sequentially at the experimental
site, each patient can be assigned to either one of two treatments and will be assigned
to exactly one of them, and the response of each patient is observed before the next
patient is assigned to a treatment. Furthermore, when covariate information is to be
used in the design, it is assumed that covariate measurements are obtained on each

patient before allocation to a treatment.

Wei and Durham (1978) introduce The Randomized Play-the-Winner Rule as a
possible solution to the problem of designing a clinical trial in such a way that a good
compromise is achieved between the need to derive information about the relative
effectiveness of the treatments and the desire of treating each patient in the best
possible way. The design can be described with an urn model as follows. Suppose
that there are two treatments, A and B, under evaluation and the responses of the
patients to the treatments are dichotomous (either a success or a failure). The urn
initially contains $u$ balls of each type, $u \ge 1$. When a patient enters the study, a ball is
randomly drawn from the urn. If the ball is of type $i$, the patient is assigned to treatment
$i$, where $i \in \{A, B\}$. The ball is then replaced in the urn and the response of
the patient is observed. The composition of the urn is then changed according to
the following rule. If a success has occurred on treatment $i$, add $\beta$ balls of type $i$ and
$\alpha$ balls of type $j$; if a failure has occurred on treatment $i$, add $\alpha$ balls of type $i$ and
$\beta$ balls of type $j$, where $\beta \ge \alpha \ge 0$, $i, j \in \{A, B\}$ and $i \ne j$. This rule is denoted by
RPW($u, \alpha, \beta$). It is shown that the RPW($u, \alpha, \beta$) rule introduces more randomization
when $\beta/\alpha$ is small, but tends to put more patients on the better treatment when $\beta/\alpha$
is large. So, different choices for the pair $(\alpha, \beta)$ give different levels of compromise
between the two mentioned goals. The main advantage of this design is that it is easy
to implement in a real clinical trial. The new design introduced in Chapter 4 incorporates
covariates into a multiple urn model, and can be regarded as a generalization of the
RPW($u, 0, \beta$) rule. A more complete discussion of the properties of the randomized
play-the-winner rule is deferred to Chapter 4.
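
The urn scheme just described can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function name `simulate_rpw`, the dictionary representation of the urn and the use of a seeded pseudo-random generator are choices made here, not part of Wei and Durham's description.

```python
import random


def simulate_rpw(p_success, n_patients, u=1, alpha=0, beta=1, seed=0):
    """Simulate an RPW(u, alpha, beta) urn for two treatments "A" and "B".

    p_success maps each treatment to its (hypothetical) success probability.
    Returns the assignments, the 0/1 responses and the final urn composition.
    """
    rng = random.Random(seed)
    urn = {"A": u, "B": u}  # u balls of each type initially
    assignments, responses = [], []
    for _ in range(n_patients):
        # Draw a ball at random (with replacement) to choose the treatment.
        total = urn["A"] + urn["B"]
        arm = "A" if rng.random() < urn["A"] / total else "B"
        other = "B" if arm == "A" else "A"
        success = rng.random() < p_success[arm]
        # Success on `arm`: add beta balls of its type and alpha of the other;
        # failure on `arm`: add alpha balls of its type and beta of the other.
        if success:
            urn[arm] += beta
            urn[other] += alpha
        else:
            urn[arm] += alpha
            urn[other] += beta
        assignments.append(arm)
        responses.append(int(success))
    return assignments, responses, urn


assignments, responses, urn = simulate_rpw({"A": 0.7, "B": 0.4}, n_patients=100)
```

Each patient adds exactly $\alpha + \beta$ balls, so after $n$ patients the urn holds $2u + n(\alpha + \beta)$ balls whatever the responses were.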

This section will conclude with three designs whose goal is to assign patients to
treatments in order to maximize the expected total response of the patients to the
treatments. This problem is referred to, in the literature, as the bandit problem. A
thorough discussion of classical bandit models appears in Berry and Fristedt (1985).

Here, the focus is on bandit models which incorporate covariates. The ﬁrst of these
models was introduced by Woodroofe (1979) and is described below. All the models
that will now be presented were developed under the assumption that each patient
enrolled in the study can either be treated by a standard treatment, A, whose sta-
tistical characteristics are known, or a new treatment, B, whose characteristics are

unknown.

Woodroofe (1979) and Woodroofe (1982) consider the problem of sequentially
allocating patients to treatments, when covariate information is present, in such a
way that the expected value of a response variable is maximized. Woodroofe adopts
a Bayesian approach by assuming that the distribution of responses for the new
treatment depends on an unknown parameter which, in turn, is assumed to have
a known prior distribution. The problem can, formally, be described as follows.
Let $X_k$ and $Y_k$ denote the potential responses of patient $k$ to treatments A and B,
respectively. For each $k \ge 1$ exactly one of $(X_k, Y_k)$ is actually observed. Suppose that
before assigning patient $k$ to one of the treatments, a covariate $V_k$ can be observed
on the patient. It is assumed that

(i) $(V_k, X_k, Y_k)$, for $k \ge 1$, are conditionally independent and identically distributed
as $(V, X, Y)$ given the value of an unknown parameter $\Theta = \theta$;

(ii) $V$ has a known distribution $F$;

(iii) the conditional distribution $G_X(\cdot \mid v)$ of $X$ given $V = v$ is known;

(iv) the conditional distribution $G_Y(\cdot \mid v)$ of $Y$ given $V = v$ depends on $\Theta = \theta$;

(v) $\Theta$ has a known prior distribution $\pi$.

Here, $X$ and $Y$ are assumed to be real-valued but $V$ and $\Theta$ can be quite general,
taking values in Polish spaces. The distributions $F$ and $\pi$ are assumed to yield finite
expectations for $X$ and $Y$. An allocation policy is a sequence $\delta = \{\delta_k : k \ge 1\}$, where
$\delta_k = 1$ or $0$ if the $k$th patient is allocated to A or B, respectively. Furthermore, for each
$k \ge 1$, $\delta_k$ is a measurable function of $\{\delta_j,\ \delta_j X_j + (1 - \delta_j) Y_j,\ V_j,\ V_k : j = 1, \dots, k - 1\}$,
i.e. a measurable function of the previous allocations, responses, covariate values and
the covariate value of the present patient. Woodroofe (1979) defines, for an infinite
population of patients, the expected $\alpha$-worth of a policy $\delta$ when the prior is $\pi$, as

$$W_\alpha(\delta, \pi) = E^{\pi} \Big\{ \sum_{k=1}^{\infty} \alpha^{k-1} \big[ \delta_k X_k + (1 - \delta_k) Y_k \big] \Big\},$$

where $0 < \alpha < 1$ and $E^{\pi}$ denotes the expectation with respect to the joint distribution
of $\Theta$ and $\{(V_k, X_k, Y_k) : k \ge 1\}$. Given $\alpha$ and $\pi$, the goal is to maximize $W_\alpha(\delta, \pi)$
with respect to $\delta$. Under certain regularity conditions, Woodroofe (1979) describes a
class of asymptotically optimal allocation policies. Woodroofe (1982) focuses on finite
populations. If $N$ is the number of patients enrolled in the trial, and $\delta = \{\delta_1, \dots, \delta_N\}$
is an allocation policy as defined above, then the expected response of $\delta$ is defined as

$$R_N(\delta, \pi) = E^{\pi} \Big\{ \sum_{k=1}^{N} \big[ \delta_k X_k + (1 - \delta_k) Y_k \big] \Big\},$$

where $E^{\pi}$ denotes the expectation with respect to the joint distribution of $\Theta$ and
$\{(V_k, X_k, Y_k) : k = 1, \dots, N\}$. Given $\pi$, the goal is to maximize $R_N(\delta, \pi)$ with
respect to $\delta$. Woodroofe (1982) shows that the optimal policy can be determined by
an algorithm based on backward induction and investigates the asymptotic properties
of the policy in the case of a large population (i.e. when $N \to \infty$).

Clayton (1989) proposes a covariate model for a Bernoulli bandit, i.e. a bandit in
which the responses of the patients to the treatments are Bernoulli random variables.
In a typical bandit it is assumed that all patients receiving the same treatment have
the same marginal probability of success. By introducing a covariate, Clayton extends
this notion, assuming that the probability of success depends both on the treatment
and on the covariate. The model can formally be described as follows. Let $X_k$ and $Y_k$
denote the potential dichotomous responses (0 for failure and 1 for success) of patient
$k$ to treatments B and A, respectively. For each $k \ge 1$ exactly one of $(X_k, Y_k)$ is
actually observed. Suppose that before assigning patient $k$ to one of the treatments,
a covariate $V_k$ can be observed on the patient. It is assumed that the covariate
values are unknown before their observation, but the random variables $V_1, V_2, \dots$ are
i.i.d. with a known distribution $F$. Suppose that functions $p$ and $\lambda$ exist such that
$P(X_k = 1 \mid p(V_k)) = p(V_k)$ and $P(Y_k = 1 \mid \lambda(V_k)) = \lambda(V_k)$. The functions $p(\cdot)$ and
$\lambda(\cdot)$ are linked by $H$ (a known increasing and invertible function) in such a way that
$p(v) = H(\alpha + \beta v)$ and $\lambda(v) = H(c + dv)$. Here, $\alpha$ and $\beta$ are unknown constants and
$c$ and $d$ are known constants such that

(i) $\beta \ge 0$ and $d \ge 0$, and so $p(v)$ and $\lambda(v)$ are nondecreasing in $v$;

(ii) $\alpha$, $\beta$, $c$, $d$ and $v$ are constrained so that $p(v)$ and $\lambda(v)$ lie in $[0, 1]$;

(iii) prior information regarding $\alpha$ and $\beta$ is given by a probability distribution $R$.

Two link functions are studied in the paper: the logit link, $H(x) = e^x/(1 + e^x)$, and
the log-linear link, $H(x) = e^x$. The worth of a strategy (i.e. allocation policy) is
defined as the expected sum of the first $n$ observations for all possible histories resulting
from that strategy. A strategy will be called optimal if it yields the maximal
expected sum. Rather than investigating the asymptotic performance of strategies
as in Woodroofe (1979), Clayton focuses on the determination of the structural char-
acteristics of exactly optimal strategies for the covariate bandit. The relationship
between the standard bandit (which corresponds to the case where F is degenerate

at a point) and the covariate bandit is also studied.
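
As a small illustration of how a link function turns a covariate value into a success probability in a covariate bandit of this kind, the sketch below evaluates $H(\text{intercept} + \text{slope} \cdot v)$ for the logit and log-linear links. The function names and the explicit range check are assumptions made for the example; they are not taken from Clayton (1989).

```python
import math


def logit_link(x):
    """Logit link H(x) = e^x / (1 + e^x)."""
    return math.exp(x) / (1.0 + math.exp(x))


def log_linear_link(x):
    """Log-linear link H(x) = e^x."""
    return math.exp(x)


def success_probability(link, intercept, slope, v):
    """Success probability H(intercept + slope * v) at covariate value v.

    Raises ValueError when the constants and covariate are not constrained
    so that the result is a probability (possible with the log-linear link).
    """
    p = link(intercept + slope * v)
    if not 0.0 <= p <= 1.0:
        raise ValueError("link value outside [0, 1]; constrain the constants")
    return p


p_logit = success_probability(logit_link, intercept=-1.0, slope=0.5, v=2.0)
p_loglin = success_probability(log_linear_link, intercept=-1.0, slope=0.5, v=1.0)
```

The logit link always returns a value in $(0, 1)$, whereas the log-linear link needs the argument to be non-positive, which is why a constraint on the constants and the covariate range is required.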

1.4 Applications

Medical investigators who wish to perform clinical trials currently have a wide variety
of adaptive allocation procedures at their disposal. Still, very few clinical trials based
on adaptive designs have been reported in the literature. Cornell, Landenberger, and
Bartlett (1986) report on an adaptive clinical trial to test the efficacy of extracorporeal
membrane oxygenation (ECMO) for the treatment of persistent pulmonary hypertension
of newborn infants. The design used (the RPW(1, 0, 1) rule) and the subsequent
analysis of the ECMO trial data created great controversy, and much of the criticism
of adaptive designs has centered on this trial. Later, Tamura, Faries, Andersen,
and Heiligenstein (1994) describe the rationale, design, and statistical analysis of a
double-blind, stratified (two strata), placebo-controlled trial of outpatients suffering
from depressive disorder. Patients were allocated to treatment or placebo according
to an RPW(1, 0, 1) rule within strata.

The simplicity of implementation of the randomized play-the-winner rule is perhaps
its most attractive feature from the point of view of applications. The choice of a
design should be driven by the simplicity of implementation but also by its statistical
properties and the nature of the clinical trial. We hope that the new designs proposed

in this dissertation can be successfully used in adaptive clinical trials.

Chapter 2

The Adaptive Weighted

Differences Design

2.1 Introduction

In this chapter, a new adaptive design is defined and studied. It is called The Adaptive
Weighted Differences Design, abbreviated AWD, and it offers a compromise between
the individualistic and utilitarian goals of a design for clinical trials. Patients are
randomly assigned to treatments according to a response-adaptive rule. This new
randomized response-adaptive design can be applied in clinical trials for which

• patients arrive sequentially;

• each patient can be assigned to either one of two treatments and will be assigned
to exactly one of them;

• the responses of patients to treatments are dichotomous (either a success or a
failure);

• the response of each patient is observed before the next patient is assigned to a
treatment.

Recall, from Section 1.2, that Wei’s adaptive biased coin design is a randomized
allocation—adaptive design that attempts to assign patients to treatments in such a
way that, at the end of the trial, both treatment groups have received the same num-
ber of patients. As mentioned previously, this design is oriented towards statistical
inference (utilitarian goal). The AWD design can be seen as a generalization of Wei’s
adaptive biased coin design so that ethical issues (individualistic goal) are also taken
into consideration. This is achieved by taking into account, at each stage of the trial,
both the proportion of patients assigned to each treatment and the proportion of

patients successfully treated in each treatment group.

In what follows, the allocation policy for the AWD design is formally described
and strong laws of large numbers and a central limit theorem are proved. The design
is evaluated by comparing its performance with that of complete randomization and

the RPW(1, 0, 1) rule (see Section 1.3).

2.2 The Allocation Policy

Suppose that patients arrive sequentially for treatment and are immediately allocated
to one of two treatments, A or B. For each $k \ge 1$, define $\delta_k$ to be 1 or 0 according
to whether the $k$th patient is assigned to treatment A or to treatment B. Let $X_k$
and $Y_k$ denote the potential dichotomous responses (0 for failure and 1 for success)
of patient $k$ to treatments A and B, respectively. For each $k \ge 1$ exactly one of
$(X_k, Y_k)$ is actually observed. Suppose that $\{(X_k, Y_k) : k \ge 1\}$ is a sequence of i.i.d.
random vectors. From the point of view of statistical inference we are interested in
estimating the true (unknown) success probabilities, $p_A$ and $p_B$, for treatments A and
B, respectively. Here, $p_A = P(X_k = 1)$ and $p_B = P(Y_k = 1)$.

Usual point estimators of $p_A$ and $p_B$ are the proportions of patients successfully
treated in the trial by treatments A and B. To formally define these estimators we
need to introduce some notation. Define $N_{A,k}$ and $N_{B,k}$ to be the number of patients
allocated to treatments A and B through stage $k$. Then

$$N_{A,k} = \sum_{i=1}^{k} \delta_i$$

and

$$N_{B,k} = \sum_{i=1}^{k} (1 - \delta_i) = k - N_{A,k}.$$

Define $S_{A,k}$ and $S_{B,k}$ to be the number of patients successfully treated by treatments
A and B through stage $k$. Then

$$S_{A,k} = \sum_{i=1}^{k} \delta_i X_i$$

and

$$S_{B,k} = \sum_{i=1}^{k} (1 - \delta_i) Y_i.$$

The point estimators, at stage $k$, of the success probabilities $p_A$ and $p_B$ are then
defined to be

$$\hat p_{A,k} = \frac{S_{A,k}}{N_{A,k}}$$

and

$$\hat p_{B,k} = \frac{S_{B,k}}{N_{B,k}},$$

respectively.
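
The counts and estimators above translate directly into code. The sketch below is illustrative only: the function name `stage_estimates` is hypothetical, and returning `None` for an estimator whose treatment group is still empty is a convention assumed here, since the ratio is undefined in that case.

```python
def stage_estimates(delta, x, y):
    """Counts and success-rate estimates through stage k = len(delta).

    delta[i] -- 1 if patient i was assigned to A, 0 if assigned to B
    x[i], y[i] -- potential 0/1 responses to A and B; only the response to
                  the assigned treatment enters the sums.
    """
    n_a = sum(delta)
    n_b = len(delta) - n_a
    s_a = sum(d * xi for d, xi in zip(delta, x))
    s_b = sum((1 - d) * yi for d, yi in zip(delta, y))
    # Assumed convention: the estimate is None while the group is empty.
    p_hat_a = s_a / n_a if n_a > 0 else None
    p_hat_b = s_b / n_b if n_b > 0 else None
    return n_a, n_b, s_a, s_b, p_hat_a, p_hat_b


n_a, n_b, s_a, s_b, p_hat_a, p_hat_b = stage_estimates(
    delta=[1, 0, 1, 1], x=[1, 0, 0, 1], y=[0, 1, 1, 0]
)
```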

Let $\{U_k : k \ge 1\}$ denote a sequence of i.i.d. Uniform[0, 1] random variables,
independent of the sequence $\{(X_k, Y_k) : k \ge 1\}$. The sequence of $U_k$’s is used to
describe the randomization in the allocation policy.

Denote by $I(\cdot)$ the indicator function.

Wei’s adaptive biased coin design (see Section 1.2) is based on the difference, $D_k$,
between the proportions of patients allocated to treatments A and B through stage $k$,

$$D_k = \frac{N_{A,k}}{k} - \frac{N_{B,k}}{k} = 2\,\frac{N_{A,k}}{k} - 1.$$

$D_k$ gives a measure of imbalance of the experiment at stage $k$. The allocation policy
for the adaptive biased coin design consists of allocating patient $k + 1$ to treatment
A with probability $h(D_k)$, where $h : [-1, 1] \to [0, 1]$ is a non-increasing function such
that $h(x) = 1 - h(-x)$ for all $x \in [-1, 1]$. Wei (1978) suggests using the function $h(\cdot)$
defined by $h(x) = (1 - x)/2$, which yields

$$\delta_{k+1} = I\Big\{ U_{k+1} \le \frac{1 - D_k}{2} \Big\}. \tag{2.1}$$

So, if $D_k = 0$ then there is perfect balance and the next patient will be allocated
to treatment A with probability $\frac{1}{2}$ (which corresponds to complete randomization).
Then, as $D_k$ increases from zero (i.e. the more treatment B gets under-represented)
the probability of allocating the next patient to treatment A decreases to zero. So, this
allocation policy forces an extremely imbalanced experiment to be balanced but tends
to complete randomization as the difference $D_k$ tends to zero (i.e. as the experiment
approaches perfect balance).

This same reasoning can be applied when the focus is on ethical allocation. In this
setting, the difference between the proportions of patients successfully treated by A
and B, $\hat p_{A,k} - \hat p_{B,k}$, is used. This difference will be denoted by $\Delta_k$ and the difference
between the success probabilities, $p_A - p_B$, will be denoted by $\Delta$. If $\Delta_k = 0$ then both
treatments are performing equally well and the next patient is allocated to treatment
A with probability $\frac{1}{2}$ (which, once again, corresponds to complete randomization).
Then, as $\Delta_k$ increases from zero (i.e. the better treatment A is performing in comparison
with treatment B) the probability of allocating the next patient to treatment
A increases to 1. This leads to an allocation policy of the form

$$\delta_{k+1} = I\Big\{ U_{k+1} \le \frac{1 + \Delta_k}{2} \Big\}. \tag{2.2}$$


To achieve a compromise between the two previous allocation policies, (2.1) and
(2.2), we consider a convex combination of $\Delta_k$ and $-D_k$, namely,

$$\lambda \Delta_k - (1 - \lambda) D_k \tag{2.3}$$

with $\lambda \in (0, 1)$. Then (2.3) can be used to allocate patients according to the policy

$$\delta_{k+1} = I\Big\{ U_{k+1} \le \frac{1 + \lambda \Delta_k - (1 - \lambda) D_k}{2} \Big\}. \tag{2.4}$$

Henceforth, the notation AWD($\lambda$) will refer to the allocation policy (2.4) for a constant
$\lambda \in (0, 1)$, and $\lambda$ will be referred to as the compromise weight.

Note that as $\lambda$ increases from 0 to 1, the allocation policy AWD($\lambda$) places less
emphasis on balance and more emphasis on ethical allocation.
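
A minimal simulation sketch of the AWD allocation rule may help fix ideas. Here `lam` stands for the compromise weight, `delta_k` for the observed difference in success proportions and `d_k` for the allocation imbalance; the function names are hypothetical. One point is an assumption rather than part of the definition: while either treatment group is empty the success-proportion difference is taken to be 0, and the imbalance before the first patient is taken to be 0, since the text does not say how undefined ratios are handled.

```python
import random


def awd_probability(lam, delta_k, d_k):
    """Probability that the next patient is assigned to A under the AWD rule."""
    return (1.0 + lam * delta_k - (1.0 - lam) * d_k) / 2.0


def simulate_awd(p_a, p_b, n_patients, lam, seed=0):
    """Simulate AWD(lam) allocations; returns (n_a, n_b, s_a, s_b)."""
    rng = random.Random(seed)
    n_a = n_b = s_a = s_b = 0
    for k in range(n_patients):
        # Assumption: difference of success proportions is 0 until both
        # groups are non-empty, and the imbalance is 0 before any allocation.
        delta_k = (s_a / n_a - s_b / n_b) if (n_a and n_b) else 0.0
        d_k = (n_a - n_b) / k if k else 0.0
        if rng.random() <= awd_probability(lam, delta_k, d_k):
            n_a += 1
            s_a += int(rng.random() < p_a)  # observed response to A
        else:
            n_b += 1
            s_b += int(rng.random() < p_b)  # observed response to B
    return n_a, n_b, s_a, s_b


n_a, n_b, s_a, s_b = simulate_awd(p_a=0.8, p_b=0.4, n_patients=5000, lam=0.8)
```

Because both `delta_k` and `d_k` lie in $[-1, 1]$, the value returned by `awd_probability` always lies in $[0, 1]$ for a weight in $(0, 1)$, so no clipping is needed.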

2.3 Strong Laws of Large Numbers

In this section, unless otherwise stated, it is supposed that patients are allocated to
treatments A or B according to the AWD($\lambda$) allocation policy, where $\lambda$ is a constant
in (0, 1). A fundamental result that will be proved here is the almost sure convergence
of the proportion of patients allocated to each treatment, as the number of patients
treated tends to infinity. While working towards proving this result it will be shown
that $\hat p_{A,k}$ and $\hat p_{B,k}$ are strongly consistent estimators of $p_A$ and $p_B$, the true success
probabilities for each treatment. An expression for the asymptotic proportion of
patients successfully treated following AWD($\lambda$) allocations is derived and compared
with the corresponding expression resulting from allocating patients according to
complete randomization.

For $k \ge 1$, let $\mathcal{F}_k$ be the $\sigma$-algebra generated by the first $k$ allocations, potential
responses and auxiliary randomization, i.e.

$$\mathcal{F}_k = \sigma\{\delta_i, X_i, Y_i, U_i : 1 \le i \le k\},$$

and let $\mathcal{F}_0$ denote the trivial $\sigma$-algebra. It is also useful, in the proofs that follow, to
consider the $\sigma$-algebra

$$\mathcal{G}_k = \mathcal{F}_k \vee \sigma\{U_{k+1}\}.$$

Note that $\delta_{k+1}$ is $\mathcal{G}_k$-measurable and the random vector $(X_{k+1}, Y_{k+1})$ is independent
of $\mathcal{G}_k$.

Although the results of this section are only proved for treatment A, similar results

hold for treatment B.

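Proposition 2.3.1

$$\lim_{k \to \infty} N_{A,k} = \infty \quad \text{a.s.}$$

Proof. Since $\{N_{A,k} : k \ge 1\}$ is a nondecreasing sequence of random variables,
$\lim_{k \to \infty} N_{A,k}$ exists and

$$\Big\{ \lim_{k \to \infty} N_{A,k} < \infty \Big\} = \bigcup_{k=1}^{\infty} \big\{ N_{A,j} = N_{A,k},\ \forall j > k \big\}. \tag{2.5}$$

Fix $k \ge 1$. Note that

$$N_{A,k+1} = N_{A,k} + \delta_{k+1} = N_{A,k} + I\Big\{ U_{k+1} \le \frac{1 + \lambda \Delta_k - (1 - \lambda)\big(2\,\frac{N_{A,k}}{k} - 1\big)}{2} \Big\}.$$

Therefore,

$$\begin{aligned}
\big\{ N_{A,j} = N_{A,k},\ \forall j > k \big\}
&\subseteq \Big\{ U_{j+1} > \frac{1 + \lambda \Delta_j - (1 - \lambda)\big(2\,\frac{N_{A,k}}{j} - 1\big)}{2},\ \forall j > k \Big\} \\
&\subseteq \Big\{ U_{j+1} > \frac{1 + \lambda(-1) - (1 - \lambda)\big(2 \cdot \frac{1}{2} - 1\big)}{2},\ \forall j \ge 2k \Big\} \\
&= \Big\{ U_{j+1} > \frac{1 - \lambda}{2},\ \forall j \ge 2k \Big\}
\end{aligned} \tag{2.6}$$

since $\Delta_j \ge -1$ and $N_{A,k}/j \le 1/2$ for $j \ge 2k$. But (2.6) has probability zero for all $\lambda$
in (0, 1) and so, by (2.5),

$$P\Big\{ \lim_{k \to \infty} N_{A,k} < \infty \Big\}
\le \sum_{k=1}^{\infty} P\big\{ N_{A,j} = N_{A,k},\ \forall j > k \big\}
\le \sum_{k=1}^{\infty} P\Big\{ U_{j+1} > \frac{1 - \lambda}{2},\ \forall j \ge 2k \Big\} = 0.$$

The result follows. ∎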

The following lemma is a technical result useful in proving the strong consistency
of $\hat p_{A,k}$ and $\hat p_{B,k}$ as estimators of the success probabilities $p_A$ and $p_B$.

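Lemma 2.3.1 Let $\{Z_k : k \ge 1\}$ be a sequence of random variables such that $Z_k =
0$ or $1$, for each $k \ge 1$. Then

$$\sum_{k=1}^{\infty} \frac{Z_k}{(Z_1 + \cdots + Z_k)^2} < \infty.$$

Proof. For each $\omega$ and each $k \ge 1$ let $Z_k(\omega) = z_k$. Fix $n \ge 1$ and $\omega$. Let
$n_0 = \sum_{k=1}^{n} I\{z_k = 1\}$. Then

$$\sum_{k=1}^{n} \frac{Z_k(\omega)}{(Z_1(\omega) + \cdots + Z_k(\omega))^2}
= \sum_{k=1}^{n} \frac{z_k}{(z_1 + \cdots + z_k)^2}
= \sum_{k=1}^{n_0} \frac{1}{k^2}
\le \sum_{k=1}^{n} \frac{1}{k^2}.$$

Since $\sum_{k=1}^{\infty} 1/k^2 < \infty$, the result follows. ∎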
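
The key identity in this proof, that the partial sum over a 0/1 sequence collapses to $\sum_{k=1}^{n_0} 1/k^2$ with $n_0$ the number of ones, can be checked numerically. The sketch below is only such a check; the function name is hypothetical, and treating terms with $z_k = 0$ as zero (including those before the first one, where the denominator also vanishes) is an assumed convention.

```python
import math


def lemma_partial_sum(z):
    """Partial sum of z_k / (z_1 + ... + z_k)^2 for a 0/1 sequence z."""
    total, running = 0.0, 0
    for zk in z:
        running += zk
        if zk == 1:  # terms with z_k = 0 contribute nothing (assumed convention)
            total += 1.0 / running ** 2
    return total


z = [0, 1, 1, 0, 1]                      # n_0 = 3 ones
lhs = lemma_partial_sum(z)
rhs = sum(1.0 / k ** 2 for k in range(1, 4))
bound = math.pi ** 2 / 6                 # value of the full series of 1/k^2
```

Here `lhs` and `rhs` agree, and both stay below `bound` however long the sequence is.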

The proof of Theorem 2.3.1 below uses a result from Hall and Heyde (1980) which

is included in Appendix A of this dissertation.

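Theorem 2.3.1 $\hat p_{A,k}$ and $\hat p_{B,k}$ are strongly consistent estimators of the success
probabilities $p_A$ and $p_B$, i.e.,

$$\lim_{k \to \infty} \hat p_{A,k} = p_A \quad \text{a.s.}$$

and

$$\lim_{k \to \infty} \hat p_{B,k} = p_B \quad \text{a.s.}$$

Proof. Fix $k \ge 1$. We can write

$$\hat p_{A,k} - p_A = \frac{1}{N_{A,k}} \sum_{j=1}^{k} \delta_j (X_j - p_A).$$

Now, let

$$M_k = \sum_{j=1}^{k} \delta_j (X_j - p_A).$$

Since $\delta_k$ is $\mathcal{G}_{k-1}$-measurable and $X_k$ is independent of $\mathcal{G}_{k-1}$, then

$$E\big[ \delta_k (X_k - p_A) \mid \mathcal{G}_{k-1} \big] = \delta_k E\big[ (X_k - p_A) \big] = \delta_k (p_A - p_A) = 0.$$

Hence, $\{M_k, \mathcal{G}_k : k \ge 1\}$ is a martingale. Furthermore, $\{N_{A,k} : k \ge 1\}$ is a nondecreasing
sequence of non-negative random variables such that $N_{A,k}$ is $\mathcal{G}_{k-1}$-measurable
for each $k \ge 1$. Finally, Lemma 2.3.1 and the definition of $N_{A,k}$ imply that

$$\sum_{k=1}^{\infty} \frac{1}{N_{A,k}^2}\, E\big[ \delta_k (X_k - p_A)^2 \mid \mathcal{G}_{k-1} \big] < \infty \quad \text{a.s.}$$

The result follows by Proposition 2.3.1 and a direct application of Theorem A.0.1. ∎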

It can now be proved that the proportion of patients allocated to each treatment
converges almost surely to a constant, which depends on $\lambda$, the compromise weight,
and $\Delta$, the difference between the success probabilities for treatments A and B.

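Before formally stating and proving this result, we give some heuristics to show what
the limiting constant should be. So, suppose that

$$\lim_{k \to \infty} \frac{N_{A,k}}{k} = \eta \quad \text{a.s.} \tag{2.7}$$

where $\eta$ can be a random variable. Then,

$$D_k = \frac{N_{A,k}}{k} - \frac{N_{B,k}}{k} = 2\,\frac{N_{A,k}}{k} - 1 \xrightarrow[k \to \infty]{} 2\eta - 1 \quad \text{a.s.} \tag{2.8}$$

We expect that

$$\lim_{k \to \infty} \Big[ \frac{N_{A,k}}{k} - P(\delta_{k+1} = 1 \mid \mathcal{F}_k) \Big] = 0 \quad \text{a.s.} \tag{2.9}$$

Now, Theorem 2.3.1 and (2.8) imply that

$$P(\delta_{k+1} = 1 \mid \mathcal{F}_k) = \frac{1 + \lambda \Delta_k - (1 - \lambda) D_k}{2}
\xrightarrow[k \to \infty]{} \frac{1 + \lambda \Delta - (1 - \lambda)(2\eta - 1)}{2} \quad \text{a.s.}$$

So, if (2.9) holds then

$$\eta = \frac{1 + \lambda \Delta - (1 - \lambda)(2\eta - 1)}{2} \quad \text{a.s.},$$

which, solving for $\eta$, yields

$$\eta = \frac{1}{2} \Big( 1 + \frac{\lambda}{2 - \lambda}\, \Delta \Big). \tag{2.10}$$

Therefore, if $N_{A,k}/k$ converges, we expect that it converges almost surely to (2.10).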
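
For completeness, the algebra behind the step "solving for $\eta$" can be spelled out, writing $\lambda$ for the compromise weight, $\Delta = p_A - p_B$ and $\eta$ for the assumed limit of $N_{A,k}/k$. This is only an expansion of the fixed-point equation stated above, not an additional result.

```latex
\begin{align*}
2\eta &= 1 + \lambda\Delta - (1-\lambda)(2\eta - 1) \\
2\eta + 2(1-\lambda)\eta &= 1 + \lambda\Delta + (1-\lambda) \\
2(2-\lambda)\,\eta &= (2-\lambda) + \lambda\Delta \\
\eta &= \frac{1}{2}\Big(1 + \frac{\lambda}{2-\lambda}\,\Delta\Big)
\end{align*}
```

As a quick check, $\lambda = 0.5$ and $\Delta = 0.3$ give $\eta = \tfrac{1}{2}(1 + 0.1) = 0.55$, and substituting back, $(1 + 0.15 - 0.5 \cdot 0.1)/2 = 0.55$ as required.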

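Theorem 2.3.2

$$\lim_{k \to \infty} \frac{N_{A,k}}{k} = \frac{1}{2} \Big( 1 + \frac{\lambda}{2 - \lambda}\, \Delta \Big) \quad \text{a.s.} \tag{2.11}$$

and

$$\lim_{k \to \infty} \frac{N_{B,k}}{k} = \frac{1}{2} \Big( 1 - \frac{\lambda}{2 - \lambda}\, \Delta \Big) \quad \text{a.s.} \tag{2.12}$$

Proof. Define a function $q$ on $(0, 1) \times (0, 1)$ by setting

$$q(s, t) = (2 - \lambda)\, t - (1 - \lambda)\, s.$$

Note that $q$ satisfies the regularity conditions of Section 2 in Eisele (1990), namely

(i) $q$ is continuous;

(ii) $q(s, s) = s$;

(iii) $q(s, t)$ is strictly decreasing in $s$ and strictly increasing in $t$.

Now, the allocation rule for the AWD($\lambda$) design can be written as

$$\delta_{k+1} = I\Big\{ U_{k+1} \le q\Big( \frac{N_{A,k}}{k},\ \frac{1}{2} \Big( 1 + \frac{\lambda}{2 - \lambda}\, \Delta_k \Big) \Big) \Big\},$$

where, by Theorem 2.3.1,

$$\lim_{k \to \infty} \frac{1}{2} \Big( 1 + \frac{\lambda}{2 - \lambda}\, \Delta_k \Big) = \frac{1}{2} \Big( 1 + \frac{\lambda}{2 - \lambda}\, \Delta \Big) \quad \text{a.s.}$$

Hence, relation (2.11) can be proved by following the same arguments used in the
proof of part (iii) of Lemma 1 in Eisele (1990).

Relation (2.12) follows from (2.11) and the fact that $N_{B,k}/k = 1 - N_{A,k}/k$. ∎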
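
The rewriting of the allocation rule in terms of $q$ is a short algebraic identity, and a numerical spot check can make it easier to trust. In the sketch below, `lam` is the compromise weight, `delta_k` the observed difference in success proportions, and all function names are hypothetical; the code only compares the two expressions for the allocation threshold on a grid of values.

```python
def q(s, t, lam):
    """q(s, t) = (2 - lam) * t - (1 - lam) * s."""
    return (2.0 - lam) * t - (1.0 - lam) * s


def awd_threshold(lam, delta_k, n_a, k):
    """Threshold of the AWD rule written with the imbalance D_k = 2 n_a / k - 1."""
    d_k = 2.0 * n_a / k - 1.0
    return (1.0 + lam * delta_k - (1.0 - lam) * d_k) / 2.0


def target_proportion(lam, delta_k):
    """Second argument of q in the rewritten allocation rule."""
    return 0.5 * (1.0 + lam / (2.0 - lam) * delta_k)


# Largest discrepancy between the two forms over a small grid of inputs.
max_gap = max(
    abs(q(n_a / 20, target_proportion(lam, d), lam) - awd_threshold(lam, d, n_a, 20))
    for lam in (0.1, 0.5, 0.9)
    for d in (-0.6, 0.0, 0.3)
    for n_a in (1, 10, 19)
)
```

The discrepancy is at the level of floating-point rounding, consistent with the two forms being algebraically identical.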

Note that a direct consequence of Theorem 2.3.2 is that the relation (2.9) (that
we used to heuristically deduce the limiting proportion of patients allocated to A)
holds.

From an ethical point of view (individualistic goal) we are interested in the proportion
of patients successfully treated following AWD($\lambda$) allocations. Below it is
proved that the asymptotic proportion of patients successfully treated as a result of
AWD($\lambda$) allocations is, whenever $p_A \ne p_B$, almost surely greater than that for
complete randomization; as expected, the difference between the ethical performances
of AWD($\lambda$) and complete randomization increases as $\lambda$ increases or as the difference
between the treatments increases.

For each $k \ge 1$, let $S_k$ denote the number of patients successfully treated through
stage $k$, i.e. $S_k = S_{A,k} + S_{B,k}$.
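Proposition 2.3.2 For complete randomization,

$$\lim_{k \to \infty} \frac{S_k}{k} = \frac{1}{2} (p_A + p_B) \quad \text{a.s.} \tag{2.13}$$

and for AWD($\lambda$) allocations,

$$\lim_{k \to \infty} \frac{S_k}{k} = \frac{1}{2} \Big[ (p_A + p_B) + \frac{\lambda}{2 - \lambda}\, (p_A - p_B)^2 \Big] \quad \text{a.s.} \tag{2.14}$$

Proof. To prove (2.14) note that

$$\frac{S_k}{k} = \hat p_{A,k}\, \frac{N_{A,k}}{k} + \hat p_{B,k}\, \frac{N_{B,k}}{k}$$

and use Theorem 2.3.1 (to get the a.s. limit of $\hat p_{A,k}$ and $\hat p_{B,k}$), Theorem 2.3.2 (to get
the a.s. limit of $N_{A,k}/k$ and $N_{B,k}/k$) and algebra.

Similar arguments can be used to prove that (2.13) holds. ∎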
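
The two limits in Proposition 2.3.2 are easy to evaluate and compare numerically. The sketch below does so for hypothetical success probabilities; the function names are illustrative, `lam` denotes the compromise weight, and the code simply combines the limiting allocation proportion of Theorem 2.3.2 with the success probabilities.

```python
def limit_success_cr(p_a, p_b):
    """Asymptotic proportion of successes under complete randomization."""
    return 0.5 * (p_a + p_b)


def limit_allocation_awd(p_a, p_b, lam):
    """Asymptotic proportion of patients allocated to A under AWD(lam)."""
    return 0.5 * (1.0 + lam / (2.0 - lam) * (p_a - p_b))


def limit_success_awd(p_a, p_b, lam):
    """Asymptotic proportion of successes under AWD(lam)."""
    eta = limit_allocation_awd(p_a, p_b, lam)
    return p_a * eta + p_b * (1.0 - eta)


# Hypothetical example: p_A = 0.8, p_B = 0.4, compromise weight 0.5.
gain = limit_success_awd(0.8, 0.4, 0.5) - limit_success_cr(0.8, 0.4)
```

The gain over complete randomization equals $\frac{\lambda}{2(2 - \lambda)} (p_A - p_B)^2$, so it is never negative and grows with both the compromise weight and the squared treatment difference.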

2.4 Central Limit Theorem

In this section it is shown that the strongly consistent estimators of the success
probabilities $p_A$ and $p_B$ are asymptotically independent and normally distributed.
The proof of the theorem below uses results from Hall and Heyde (1980) which
are included in Appendix A of this dissertation.

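Theorem 2.4.1 As $k \to \infty$,

\[
\begin{pmatrix} \sqrt{N_{A,k}}\,(\hat{p}_{A,k} - p_A) \\ \sqrt{N_{B,k}}\,(\hat{p}_{B,k} - p_B) \end{pmatrix}
\xrightarrow{D}
N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} p_A q_A & 0 \\ 0 & p_B q_B \end{pmatrix} \right),
\]

where $q_A = 1 - p_A$ and $q_B = 1 - p_B$.

Proof. Fix real constants $a$ and $b$, define for each $k \ge 1$ and $i = 1, \ldots, k$,

\[
M_{k,i} = \frac{1}{\sqrt{k}} \sum_{j=1}^{i} \left[ a\,(X_j - p_A)\,\delta_j + b\,(Y_j - p_B)(1 - \delta_j) \right],
\]

and let $\mathcal{G}_{k,i} = \mathcal{G}_i$. For each $j = 1, \ldots, i$ let

\[
Z_{k,j} = \frac{1}{\sqrt{k}} \left[ a\,(X_j - p_A)\,\delta_j + b\,(Y_j - p_B)(1 - \delta_j) \right].
\]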

The proof of the theorem uses the following three lemmas.

Lemma 2.4.1 $\{M_{k,i}, \mathcal{G}_{k,i} : k \ge 1, 1 \le i \le k\}$ is a zero-mean and square-integrable martingale array with differences $\{Z_{k,i} : 1 \le i \le k, k \ge 1\}$.

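Proof. For each $k \ge 1$ and each $j = 1, \ldots, k$, $Z_{k,j}$ is $\mathcal{G}_{k,j}$-measurable and integrable, and $E[Z_{k,j} \mid \mathcal{G}_{k,j-1}] = 0$ (which can be shown as in the proof of Theorem 2.3.1). Hence, $\{Z_{k,j}, \mathcal{G}_{k,j} : 1 \le j \le k, k \ge 1\}$ is a martingale difference array. Therefore, $\{M_{k,i}, \mathcal{G}_{k,i} : k \ge 1, 1 \le i \le k\}$ is a zero-mean martingale array with differences $\{Z_{k,i} : 1 \le i \le k, k \ge 1\}$ and for each $k \ge 1$ and each $i = 1, \ldots, k$

\[
E\left(M_{k,i}^2\right) = \sum_{j=1}^{i} E\left(Z_{k,j}^2\right) \le \frac{i}{k}\,\max(a^2, b^2) < \infty.
\]

The result follows. ∎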

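Lemma 2.4.2 As $k \to \infty$,

\[
\sum_{i=1}^{k} Z_{k,i} \xrightarrow{D} N\left( 0,\ \frac{a^2}{2}\,p_A q_A \left(1 + \frac{\lambda}{2-\lambda}\,\Delta\right) + \frac{b^2}{2}\,p_B q_B \left(1 - \frac{\lambda}{2-\lambda}\,\Delta\right) \right),
\]

where $\Delta = p_A - p_B$.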

Proof. To prove the lemma we use Theorems A.0.2 and A.0.3 in Appendix A. Note that condition (A.4) is trivially satisfied by the $\sigma$-algebras $\{\mathcal{G}_{k,i} : 1 \le i \le k, k \ge 1\}$. Furthermore,

\[
\max_{1 \le i \le k} |Z_{k,i}| \le \frac{1}{\sqrt{k}}\,(|a| + |b|) \to 0, \quad \text{as } k \to \infty
\]

and

\[
E\left( \max_{1 \le i \le k} Z_{k,i}^2 \right) \le \frac{1}{k}\,(|a| + |b|)^2 \quad \text{is bounded for each } k \ge 1.
\]

So, conditions (A.1) and (A.3) are satisfied by the martingale differences array $\{Z_{k,i}\}$.

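Finally, for verifying condition (A.2), note that

\[
\begin{aligned}
\sum_{i=1}^{k} Z_{k,i}^2 &- \frac{a^2}{2}\,p_A q_A \left(1 + \frac{\lambda}{2-\lambda}\,\Delta\right) - \frac{b^2}{2}\,p_B q_B \left(1 - \frac{\lambda}{2-\lambda}\,\Delta\right) \\
&= \frac{a^2}{k} \sum_{i=1}^{k} \left[ (X_i - p_A)^2\,\delta_i - \frac{1}{2}\,p_A q_A \left(1 + \frac{\lambda}{2-\lambda}\,\Delta\right) \right] \\
&\quad + \frac{b^2}{k} \sum_{i=1}^{k} \left[ (Y_i - p_B)^2 (1 - \delta_i) - \frac{1}{2}\,p_B q_B \left(1 - \frac{\lambda}{2-\lambda}\,\Delta\right) \right].
\end{aligned} \tag{2.15}
\]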

So, to verify condition (A.2) it suffices to show that both terms on the right-hand side converge in probability to 0. Rewrite the first term on the right-hand side of (2.15) as

\[
\frac{a^2}{k} \sum_{i=1}^{k} \delta_i \left[ (X_i - p_A)^2 - p_A q_A \right] + a^2\,p_A q_A \left[ \frac{N_{A,k}}{k} - \frac{1}{2}\left(1 + \frac{\lambda}{2-\lambda}\,\Delta\right) \right]. \tag{2.16}
\]

The second term in (2.16) converges almost surely (hence, in probability) to 0, by Theorem 2.3.2. To show that the first term in (2.16) also converges in probability to 0, we use Theorem A.0.3. Define for each $k \ge 1$ and each $i = 1, \ldots, k$, $W_i = \delta_i \left[ (X_i - p_A)^2 - p_A q_A \right]$ and $W_{k,i} = W_i\, I\{|W_i| \le k\}$. Since $\delta_i$ is $\mathcal{G}_{i-1}$-measurable and $X_i$ is independent of $\mathcal{G}_{i-1}$, then

\[
E[W_i \mid \mathcal{G}_{i-1}] = E\left\{ \delta_i \left[ (X_i - p_A)^2 - p_A q_A \right] \mid \mathcal{G}_{i-1} \right\} = 0 \quad \text{a.s.} \tag{2.17}
\]

This implies that $\{\sum_{i=1}^{k} W_i, \mathcal{G}_k : k \ge 1\}$ is a martingale. Finally, (2.17) together with the fact that $|W_i| \le 1$ imply that conditions (A.5), (A.6) and (A.7) of Theorem A.0.3 are satisfied. So, the first term in (2.16) converges in probability to 0. We have shown that the first term on the right-hand side of (2.15) converges in probability to 0. Similar arguments can be used to show that the second term on the right-hand side of (2.15) also converges in probability to 0. Hence condition (A.2) of Theorem A.0.2 is satisfied. The result follows. ∎

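Lemma 2.4.3 As $k \to \infty$,

\[
\frac{1}{\sqrt{k}} \begin{pmatrix} S_{A,k} - p_A N_{A,k} \\ S_{B,k} - p_B N_{B,k} \end{pmatrix}
\xrightarrow{D}
N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \frac{1}{2}\,p_A q_A \left(1 + \frac{\lambda}{2-\lambda}\,\Delta\right) & 0 \\ 0 & \frac{1}{2}\,p_B q_B \left(1 - \frac{\lambda}{2-\lambda}\,\Delta\right) \end{pmatrix} \right).
\]

Proof. Since

\[
\sum_{i=1}^{k} Z_{k,i} = \frac{a}{\sqrt{k}}\,(S_{A,k} - p_A N_{A,k}) + \frac{b}{\sqrt{k}}\,(S_{B,k} - p_B N_{B,k}),
\]

the result follows from Lemma 2.4.2 and the Cramér-Wold technique. ∎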

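It is now easy to complete the proof of Theorem 2.4.1. Note that

\[
\sqrt{N_{A,k}}\,(\hat{p}_{A,k} - p_A) = \sqrt{\frac{k}{N_{A,k}}}\ \frac{1}{\sqrt{k}}\,(S_{A,k} - p_A N_{A,k})
\]

and

\[
\sqrt{N_{B,k}}\,(\hat{p}_{B,k} - p_B) = \sqrt{\frac{k}{N_{B,k}}}\ \frac{1}{\sqrt{k}}\,(S_{B,k} - p_B N_{B,k}),
\]

and use Theorem 2.3.2, Lemma 2.4.3, Slutsky's Theorem and algebra. ∎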
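The closing "algebra" step amounts to checking that the random scaling factors cancel the variance inflation in Lemma 2.4.3. The display below is a sketch of that bookkeeping only; it uses the limits stated in Theorem 2.3.2 and Lemma 2.4.3, and the shorthand $c = \lambda/(2-\lambda)$ is introduced here for convenience (note that $|c\,\Delta| < 1$ for $\lambda \in (0,1)$, so the denominators are positive).

```latex
\[
\frac{k}{N_{A,k}} \xrightarrow{\text{a.s.}} \frac{2}{1 + c\,\Delta},
\qquad
\frac{2}{1 + c\,\Delta} \cdot \frac{1}{2}\,p_A q_A\,(1 + c\,\Delta) = p_A q_A ,
\]
\[
\frac{k}{N_{B,k}} \xrightarrow{\text{a.s.}} \frac{2}{1 - c\,\Delta},
\qquad
\frac{2}{1 - c\,\Delta} \cdot \frac{1}{2}\,p_B q_B\,(1 - c\,\Delta) = p_B q_B .
\]
```

The off-diagonal entries stay at zero, which gives the diagonal covariance matrix in Theorem 2.4.1.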

2.5 Evaluation of the Design

Recall that the adaptive weighted differences design seeks a compromise between the individualistic and utilitarian goals. So, we evaluate the AWD design at two levels:

• how ethical are the assignments of patients to treatments?

• how good is the estimator of the treatments difference ($p_A - p_B$)?

Monte Carlo simulations are used to address these questions. For the first one we look at the proportion of patients successfully treated as a result of AWD($\lambda$) allocations, for different values of $\lambda$ in (0, 1); for the second question we look at the empirical mean squared error.

Graphical comparisons, in terms of proportion of patients successfully treated and mean squared error, are made between the AWD design (with compromise weights equal to 0.2, 0.5 and 0.8), complete randomization (which focuses on utilitarian issues) and the RPW(1, 0, 1) rule (which puts more emphasis on individualistic aspects).

Figures 2.1 through 2.10 show the results of 100,000 replications of clinical trials with sample sizes $n$ = 20 and 100, success probabilities for treatment A, $p_A$ = 0.05, 0.35, 0.50, 0.65 and 0.95 and a range of values of the success probability for treatment B, $p_B$ = 0.05 through 0.95. For each allocation policy and each value of $n$, of $p_A$ and of $p_B$, the proportion of successes is computed as the average, over 100,000 replications, of the proportion of patients successfully treated in the simulated trial; the empirical mean squared error is computed as the average, over 100,000 replications, of the squared difference between the estimates of ($p_A - p_B$) and the parameter.
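The simulation procedure just described can be sketched as follows. This is a minimal illustration, not the code used for the figures: it assumes the AWD($\lambda$) allocation probability has the form $[1 + \lambda\hat{\Delta}_k - (1-\lambda)D_k]/2$, and the start-up conventions (an estimate of 0.5 for an arm with no patients, and $D_0 = 0$) are assumptions made here so that the rule is defined from the first patient. The function names are hypothetical.

```python
import random


def simulate_awd_trial(n, p_a, p_b, lam, rng):
    """Simulate one trial of size n.

    Returns (proportion of successes, estimate of p_A - p_B)."""
    n_a = n_b = s_a = s_b = 0
    for k in range(n):
        # Assumed start-up convention: an empty arm gets the estimate 0.5.
        est_a = s_a / n_a if n_a else 0.5
        est_b = s_b / n_b if n_b else 0.5
        # Difference between the proportions allocated to A and B so far.
        d_k = (n_a - n_b) / k if k else 0.0
        prob_a = (1.0 + lam * (est_a - est_b) - (1.0 - lam) * d_k) / 2.0
        if rng.random() <= prob_a:
            n_a += 1
            s_a += int(rng.random() < p_a)
        else:
            n_b += 1
            s_b += int(rng.random() < p_b)
    est_a = s_a / n_a if n_a else 0.5
    est_b = s_b / n_b if n_b else 0.5
    return (s_a + s_b) / n, est_a - est_b


def monte_carlo(n, p_a, p_b, lam, replications, seed=0):
    """Average proportion of successes and empirical mean squared error."""
    rng = random.Random(seed)
    total_prop = 0.0
    total_sq_err = 0.0
    for _ in range(replications):
        prop, est_diff = simulate_awd_trial(n, p_a, p_b, lam, rng)
        total_prop += prop
        total_sq_err += (est_diff - (p_a - p_b)) ** 2
    return total_prop / replications, total_sq_err / replications


if __name__ == "__main__":
    # Far fewer replications than the 100,000 used in the text.
    print(monte_carlo(n=20, p_a=0.65, p_b=0.35, lam=0.5, replications=2000))
```

Different start-up conventions would change small-sample results slightly, so numbers from this sketch should not be expected to reproduce the figures exactly.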

The following labels are used in Figures 2.1 through 2.10.

• r = complete randomization;

• s = AWD(0.2);

• m = AWD(0.5);

• l = AWD(0.8);

• w = RPW(1, 0, 1) rule.

Simulations confirm some natural expectations on the performance of the five allocation rules being compared (see Figures 2.1 through 2.10). The RPW(1, 0, 1) rule and AWD(0.8) yield the highest proportion of successes followed by AWD(0.5), AWD(0.2) and complete randomization, in this order. Also, the differences between the proportions of patients successfully treated following the five allocation rules increase as the treatments difference, $|p_A - p_B|$, increases and as the trial size, $n$, increases. Finally, as the trial size increases the mean squared error decreases for each one of the allocation rules.

Figures 2.1 and 2.6 show that the two goals of a design for a clinical trial (individualistic and utilitarian) are not always conflicting. If, for example, treatment A has a very small success probability and the success probability for treatment B is not large, then ethical rules like RPW(1, 0, 1) and AWD with large $\lambda$ yield not only the highest proportion of patients successfully treated but also the lowest mean squared errors for estimating the treatments difference. Note also that, if the success probability for treatment A is very small but the success probability for treatment B is large, then the RPW(1, 0, 1) rule yields a high proportion of successes but also very high mean squared errors (when compared to the other rules); the AWD allocation policy with large $\lambda$ performs nearly as well as the RPW(1, 0, 1) rule in terms of proportion of successes but much better in terms of estimating the treatments difference.

Figures 2.5 and 2.10 illustrate the case when the individualistic and utilitarian goals are, in fact, conflicting. If, for example, treatment A has a very large success probability and treatment B has a small or moderate success probability, then using an ethical rule (versus using a rule that focuses on statistical inference) gives a much higher proportion of patients successfully treated in the trial but performs rather poorly when estimating the treatments difference.

Figures 2.1 through 2.10 illustrate the excellent combined performance (in terms of both the proportion of patients successfully treated and the mean squared error for estimating the treatments difference) of the AWD allocation policy for suitable choices of $\lambda$. If, as in many real clinical trials, there is some previous information on the success probability of one of the treatments, the simulations give rough guidelines as to good choices of the compromise weight for each situation. So, suppose there is some information on the success probability of treatment A. We suggest choosing compromise weights as follows.

• If $p_A$ is very small then use a large $\lambda$ for both small (Figure 2.1) and large (Figure 2.6) trials;

• if $p_A$ is moderately small then use a moderate $\lambda$ for small trials (Figure 2.2) and a moderately large $\lambda$ for large trials (Figure 2.7);

• if $p_A$ is moderate then use a moderate $\lambda$ for small trials (Figure 2.3) and a large $\lambda$ for large trials (Figure 2.8);

• if $p_A$ is moderately large then use a moderately large $\lambda$ for small trials (Figure 2.4) and a large $\lambda$ for large trials (Figure 2.9);

• if $p_A$ is very large then use a moderately large $\lambda$ for small trials (Figure 2.5) and a large $\lambda$ for large trials (Figure 2.10).

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.1: Comparisons in terms of proportion of successes and mean squared error for $n = 20$ and $p_A = 0.05$.

 

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.2: Comparisons in terms of proportion of successes and mean squared error for $n = 20$ and $p_A = 0.35$.

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.3: Comparisons in terms of proportion of successes and mean squared error for $n = 20$ and $p_A = 0.50$.

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.4: Comparisons in terms of proportion of successes and mean squared error for $n = 20$ and $p_A = 0.65$.

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.5: Comparisons in terms of proportion of successes and mean squared error for $n = 20$ and $p_A = 0.95$.

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.6: Comparisons in terms of proportion of successes and mean squared error for $n = 100$ and $p_A = 0.05$.

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.7: Comparisons in terms of proportion of successes and mean squared error for $n = 100$ and $p_A = 0.35$.

 

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.8: Comparisons in terms of proportion of successes and mean squared error for $n = 100$ and $p_A = 0.50$.

 

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.9: Comparisons in terms of proportion of successes and mean squared error for $n = 100$ and $p_A = 0.65$.

[Figure: two panels, "Proportion of Successes" and "Mean Squared Error", each plotted against $p_B$.]

Figure 2.10: Comparisons in terms of proportion of successes and mean squared error for $n = 100$ and $p_A = 0.95$.

 


Chapter 3

The Covariate Adaptive Weighted Differences Design

3.1 Introduction

As seen in Chapter 1, a widely used approach for incorporating covariate information
into an adaptive design consists of ﬁrst forming strata by considering all possible
combinations of levels of relevant covariates and, then, using an adaptive design to
independently allocate patients within each stratum. Clearly, this procedure can also
be used to incorporate covariates into the AWD allocation policy. In this chapter, we
introduce a new approach that generalizes the one described above by allowing the
allocation of patients in one stratum to depend on the allocations and responses of
patients previously treated in the same and other strata. This new adaptive design
is called The Covariate Adaptive Weighted Differences Design, abbreviated CAWD.

Similarly to the AWD design, this new design can be applied in clinical trials for which

• patients arrive sequentially;

• each patient can be assigned to either one of two treatments and will be assigned to exactly one of them;

• the responses of patients to treatments are dichotomous (either a success or a failure);

• the response of each patient is observed before the next patient is assigned to a treatment.

Since the CAWD procedure makes use of covariate information, further assumptions are needed. It will also be assumed that

• relevant covariates on a patient (concomitant information such as age, sex, general physical status, severity of the disease, etc.) are available before assigning him/her to a treatment;

• all the strata that can be formed by considering common combinations of covariate levels are known before the trial begins and there will be at least one patient treated in each one of those strata.

With the CAWD design, patients are randomly assigned to treatments according
to a response-adaptive covariate design. A compromise is again sought between the
individualistic and utilitarian goals of a design for clinical trials.

In what follows, the allocation policy for the CAWD design is formally described and strong laws of large numbers and a central limit theorem are proved. The design is evaluated by comparing its performance with that of the AWD design within strata, complete randomization within strata and the RPW(1, 0, 1) rule within strata (see Section 1.3).

3.2 The Allocation Policy

3.2.1 General Notation and Assumptions

The following notation and assumptions will be used here and in Chapter 4.
Suppose that patients arrive sequentially for treatment. Upon arrival each patient is examined to determine his/her covariate levels. Let $v$ denote the stratum to which the patient belongs, where $v \in \{1, 2, \ldots, r\}$. The patient is then allocated to one of two treatments, A or B. For each $k \ge 1$, define $\delta_k$ to be 1 or 0 according to whether the $k$th patient is assigned to treatment A or to treatment B. Let $X_k$ and $Y_k$ denote the potential dichotomous responses (0 for failure and 1 for success) of patient $k$ to treatments A and B, respectively. For each $k \ge 1$ exactly one of $(X_k, Y_k)$ is actually observed. Suppose that $\{(X_k, Y_k) : k \ge 1\}$ is a sequence of i.i.d. random vectors. For each $k \ge 1$, let $V_k$ denote the stratum to which patient $k$ belongs. Denote by $V$ the set of all possible strata $\{1, 2, \ldots, r\}$. Suppose that $\{V_k : k \ge 1\}$ is a sequence of i.i.d. random variables such that

\[
P(V_1 = v) = \zeta(v), \quad \forall v \in V,
\]

where, for each $v \in V$, $\zeta(v)$ is an unknown constant in (0, 1) and $\sum_{v \in V} \zeta(v) = 1$. From the point of view of statistical inference we are interested in estimating the true (unknown) success probabilities, $p_A(v)$ and $p_B(v)$, for treatments A and B, within each stratum $v \in V$. Here, for each $v \in V$, $p_A(v) = P(X_k = 1 \mid V_k = v)$ and $p_B(v) = P(Y_k = 1 \mid V_k = v)$.

Throughout the remainder of this subsection, $k \ge 1$ and $v \in V$ are fixed.

Usual point estimators of $p_A(v)$ and $p_B(v)$ are the proportions of patients successfully treated in the trial by treatments A and B within stratum $v$. To formally define these estimators we need to introduce some notation. Let $N_k(v)$ denote the number of patients treated in stratum $v$ through stage $k$. Then

\[
N_k(v) = \sum_{i=1}^{k} I\{V_i = v\}.
\]

Define $N_{A,k}(v)$ and $N_{B,k}(v)$ to be the number of patients in stratum $v$ who are allocated, through stage $k$, to treatments A and B, respectively. Then,

\[
N_{A,k}(v) = \sum_{i=1}^{k} I\{V_i = v\}\,\delta_i
\]

and

\[
N_{B,k}(v) = \sum_{i=1}^{k} I\{V_i = v\}\,(1 - \delta_i) = N_k(v) - N_{A,k}(v).
\]

Define $S_{A,k}(v)$ and $S_{B,k}(v)$ to be the number of patients who belong to stratum $v$ and are successfully treated by treatments A and B through stage $k$. Then

\[
S_{A,k}(v) = \sum_{i=1}^{k} I\{V_i = v\}\,\delta_i\,X_i
\]

and

\[
S_{B,k}(v) = \sum_{i=1}^{k} I\{V_i = v\}\,(1 - \delta_i)\,Y_i.
\]

The point estimators, at stage $k$, of the success probabilities within stratum $v$, $p_A(v)$ and $p_B(v)$, are then defined to be

\[
\hat{p}_{A,k}(v) = \frac{S_{A,k}(v)}{N_{A,k}(v)}
\]

and

\[
\hat{p}_{B,k}(v) = \frac{S_{B,k}(v)}{N_{B,k}(v)},
\]

respectively.

Let $\{U_k : k \ge 1\}$ denote a sequence of i.i.d. Uniform[0, 1] random variables independent of the sequence $\{(V_k, X_k, Y_k) : k \ge 1\}$. The sequence of $U_k$'s is used to describe the randomization in the allocation policy.
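The counts and estimators defined above can be computed from the trial history as in the following sketch. The function name, the dictionary keys and the choice to return `None` for an estimator whose denominator is zero are illustrative assumptions; the text does not specify a convention for an arm with no patients.

```python
def stratum_estimates(strata, allocations, responses, v):
    """Counts and success-probability estimators within stratum v.

    strata[i]      -- stratum V_i of patient i
    allocations[i] -- delta_i (1 for treatment A, 0 for treatment B)
    responses[i]   -- the observed response (X_i if delta_i == 1, else Y_i)
    """
    n_a = n_b = s_a = s_b = 0
    for v_i, delta_i, response_i in zip(strata, allocations, responses):
        if v_i != v:
            continue  # the indicator I{V_i = v} is zero
        if delta_i == 1:
            n_a += 1
            s_a += response_i
        else:
            n_b += 1
            s_b += response_i
    return {
        "N": n_a + n_b,
        "N_A": n_a,
        "N_B": n_b,
        "S_A": s_a,
        "S_B": s_b,
        # Assumed convention: the estimator is left undefined (None) for an empty arm.
        "p_hat_A": s_a / n_a if n_a else None,
        "p_hat_B": s_b / n_b if n_b else None,
    }


if __name__ == "__main__":
    print(stratum_estimates([1, 2, 1, 1, 2], [1, 0, 0, 1, 1], [1, 0, 1, 0, 1], v=1))
```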

3.2.2 The CAWD Allocation Policy

In what follows, the general allocation policy for the CAWD design is of the form

\[
\delta_{k+1} = \sum_{v \in V} \delta_{k+1}(v)\, I\{V_{k+1} = v\} \tag{3.1}
\]

where, for each $v \in V$, $\delta_{k+1}(v)$ is to be specified.

The allocation policy will first be described in the particular case corresponding to allocating patients using an AWD allocation policy (2.4) within strata. Suppose that patient $k + 1$ is in stratum $v_0$. Then, in the expression for the AWD allocation policy, replace $\hat{\Delta}_k$ (the difference between the estimated success probabilities for A and B) by the difference between the estimated success probabilities for A and B within stratum $v_0$,

\[
\hat{\Delta}_k(v_0) = \hat{p}_{A,k}(v_0) - \hat{p}_{B,k}(v_0) \tag{3.2}
\]

and replace $D_k$ (the difference between the proportion of patients allocated to A and B) by the difference between the proportion of patients allocated to A and B within stratum $v_0$,

\[
D_k(v_0) = \frac{N_{A,k}(v_0)}{N_k(v_0)} - \frac{N_{B,k}(v_0)}{N_k(v_0)}. \tag{3.3}
\]

This yields

\[
\delta_{k+1}(v_0) = I\left\{ U_{k+1} \le \frac{1 + \lambda\,\hat{\Delta}_k(v_0) - (1 - \lambda)\,D_k(v_0)}{2} \right\}. \tag{3.4}
\]

The idea behind the CAWD design is that, when allocating a patient with covariate value $v_0$, it may be possible to increase the overall proportion of patients successfully treated in the trial by using information on the responses of patients treated in the previous stages in strata $v \neq v_0$. So, instead of simply using $\hat{\Delta}_k(v_0)$ in the allocation policy, we use a weighted average of the $\hat{\Delta}_k(v)$'s, namely

\[
\tilde{\Delta}_k(v_0, M) = \sum_{v \in V} m(v_0, v)\,\hat{\Delta}_k(v) \tag{3.5}
\]

where the $m(v_0, v)$'s are non-negative real numbers such that $\sum_{v \in V} m(v_0, v) = 1$. This yields

\[
\delta_{k+1}(v_0) = I\left\{ U_{k+1} \le \frac{1 + \lambda\,\tilde{\Delta}_k(v_0, M) - (1 - \lambda)\,D_k(v_0)}{2} \right\}. \tag{3.6}
\]

Suitable choices for the constants $m(v_0, v)$ will be discussed in Sections 3.3 and 3.5. The $m(v_0, v)$'s can be interpreted as the weights to be placed on responses of patients previously treated in strata $v \in V$, when allocating a patient in stratum $v_0$; these constants will be referred to as the crossover weights (from strata $v$ to stratum $v_0$). Denote by $M$ the matrix of crossover weights,

\[
M = \begin{pmatrix}
m(1,1) & m(1,2) & \cdots & m(1,r) \\
m(2,1) & m(2,2) & \cdots & m(2,r) \\
\vdots & \vdots & & \vdots \\
m(r,1) & m(r,2) & \cdots & m(r,r)
\end{pmatrix} \tag{3.7}
\]

with non-negative elements and such that the sum of the elements in each row equals 1.

Note that (3.4) (which is the AWD($\lambda$) allocation policy within strata) is the particular case of (3.6) corresponding to choosing a diagonal matrix of crossover weights with diagonal elements equal to 1. We will denote such a matrix by $M_s$.

Henceforth, the notation CAWD($\lambda$, $M$) will refer to the allocation policy given by (3.1), (3.6) and (3.5), for a constant $\lambda \in (0, 1)$ (again referred to as the compromise weight) and a matrix of crossover weights $M$.
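To make the rule concrete, the sketch below computes the probability that the next patient, who belongs to stratum $v_0$, is assigned to treatment A under (3.5) and (3.6). It is an illustration under stated assumptions: strata are indexed from 0 rather than 1, the function and argument names are hypothetical, and the caller is assumed to supply estimates for every stratum (the text does not say how an undefined estimate is handled early in the trial).

```python
def cawd_allocation_probability(v0, lam, M, p_hat_a, p_hat_b, n_a, n_b):
    """Probability of assigning the next patient in stratum v0 to A.

    M[v0][v] is the crossover weight m(v0, v); p_hat_a[v] and p_hat_b[v]
    are the current estimates in stratum v; n_a[v] and n_b[v] are the
    numbers of patients allocated so far to A and B in stratum v.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("the compromise weight lam must lie in (0, 1)")
    row = M[v0]
    if min(row) < 0.0 or abs(sum(row) - 1.0) > 1e-9:
        raise ValueError("row v0 of M must be non-negative and sum to 1")
    # Weighted average of the estimated treatment differences, as in (3.5).
    delta_tilde = sum(row[v] * (p_hat_a[v] - p_hat_b[v]) for v in range(len(row)))
    n_v0 = n_a[v0] + n_b[v0]
    if n_v0 == 0:
        raise ValueError("stratum v0 has no patients yet; a start-up rule is needed")
    # Allocation imbalance within stratum v0, as in (3.3).
    d_k = (n_a[v0] - n_b[v0]) / n_v0
    # Allocation probability appearing inside the indicator in (3.6).
    return (1.0 + lam * delta_tilde - (1.0 - lam) * d_k) / 2.0


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # AWD(lam) within strata
    print(cawd_allocation_probability(0, 0.5, identity, [0.6, 0.9], [0.4, 0.1], [3, 5], [1, 5]))
```

With the identity matrix the function reduces to the within-strata rule (3.4); moving weight to another stratum changes only the `delta_tilde` term, never the balancing term `d_k`.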

3.3 Strong Laws of Large Numbers

In this section, unless otherwise stated, it is supposed that patients are allocated to treatments A or B according to the CAWD($\lambda$, $M$) allocation policy. The asymptotic results proved in this section are the analogues of the asymptotic results proved for the AWD($\lambda$) allocation policy in Section 2.3. A fundamental result that will be proved here is the almost sure convergence of the proportion of patients allocated to each treatment within each stratum, as the number of patients treated tends to infinity. While working towards proving this result it will be shown that $\hat{p}_{A,k}(v)$ and $\hat{p}_{B,k}(v)$ are strongly consistent estimators of $p_A(v)$ and $p_B(v)$, the true success probabilities for treatments A and B, within each stratum $v \in V$. An expression for the asymptotic proportion of patients successfully treated following a CAWD($\lambda$, $M$) rule is derived and compared with the corresponding expressions resulting from allocating patients according to complete randomization within strata and according to the AWD($\lambda$) rule within strata.

For $k \ge 1$, let $\mathcal{F}_k$ be the $\sigma$-algebra generated by the first $k$ allocations, potential responses, strata and auxiliary randomization, i.e.

\[
\mathcal{F}_k = \sigma\{(\delta_j, X_j, Y_j, V_j, U_j) : 1 \le j \le k\},
\]

and let $\mathcal{F}_0$ denote the trivial $\sigma$-algebra. It is also useful in the proofs that follow to consider, for $k \ge 1$, the $\sigma$-algebras

\[
\mathcal{G}_k = \mathcal{F}_k \vee \sigma\{V_{k+1}\}
\]

and

\[
\mathcal{H}_k = \mathcal{F}_k \vee \sigma\{V_{k+1}, U_{k+1}\}.
\]

Note that $\delta_{k+1}$ is $\mathcal{H}_k$-measurable.

Lemma 3.3.1 For each $v \in V$,

\[
\lim_{k\to\infty} \frac{N_k(v)}{k} = \zeta(v) \quad \text{a.s.}
\]

Proof. Fix $v \in V$. Since

\[
\frac{N_k(v)}{k} = \frac{1}{k} \sum_{i=1}^{k} I\{V_i = v\}
\]

and $\{V_k : k \ge 1\}$ is a sequence of i.i.d. random variables with $E[I\{V_1 = v\}] = \zeta(v)$, the result follows by the Strong Law of Large Numbers. ∎

Lemma 3.3.2 For each $v \in V$,

\[
\lim_{k\to\infty} N_k(v) = \infty \quad \text{a.s.}
\]

Proof. Follows directly from Lemma 3.3.1. ∎

Although the results that follow are only proved for treatment A, similar results hold for treatment B.

Proposition 3.3.1 For each $v \in V$,

\[
\lim_{k\to\infty} N_{A,k}(v) = \infty \quad \text{a.s.}
\]

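Proof. Fix $v \in V$. Since $\{N_{A,k}(v) : k \ge 1\}$ is a nondecreasing sequence of random variables, then $\lim_{k\to\infty} N_{A,k}(v)$ exists and

\[
\left\{ \lim_{k\to\infty} N_{A,k}(v) < \infty \right\} = \bigcup_{k \ge 1} \left\{ N_{A,j}(v) = N_{A,k}(v),\ \forall j > k \right\}. \tag{3.8}
\]

Fix $k \ge 1$. Note that

\[
\begin{aligned}
N_{A,k+1}(v) &= N_{A,k}(v) + I\{V_{k+1} = v\}\,\delta_{k+1}(v) \\
&= N_{A,k}(v) + I\{V_{k+1} = v\}\, I\left\{ U_{k+1} \le \frac{1 + \lambda\,\tilde{\Delta}_k(v, M) - (1 - \lambda)\left[ 2\,\frac{N_{A,k}(v)}{N_k(v)} - 1 \right]}{2} \right\}.
\end{aligned} \tag{3.9}
\]

By Lemma 3.3.2, there exists $k_0 = k_0(k, v) > k$ such that $N_{k_0}(v) \ge 2k$. Consider any such $k_0$. Then, for any $j \ge k_0$, $N_j(v) \ge 2k$ and so

\[
\frac{N_{A,k}(v)}{N_j(v)} \le \frac{k}{2k} = \frac{1}{2}, \quad \forall j \ge k_0. \tag{3.10}
\]

Clearly

\[
\tilde{\Delta}_j(v, M) \ge -1, \quad \forall j \ge k_0. \tag{3.11}
\]

Now, (3.9), (3.10) and (3.11) together yield

\[
\begin{aligned}
\left\{ N_{A,j}(v) = N_{A,k}(v),\ \forall j > k \right\}
&\subseteq \left\{ U_{j+1} > \frac{1 + \lambda\,\tilde{\Delta}_j(v, M) - (1 - \lambda)\left[ 2\,\frac{N_{A,k}(v)}{N_j(v)} - 1 \right]}{2} \ \text{or}\ V_{j+1} \neq v,\ \forall j > k \right\} \\
&\subseteq \left\{ U_{j+1} > \frac{1 - \lambda}{2} \ \text{or}\ V_{j+1} \neq v,\ \forall j \ge k_0 \right\}.
\end{aligned} \tag{3.12}
\]

We now show that (3.12) has probability zero. We can rewrite (3.12) as $\bigcap_{j \ge k_0} C_j$ where, for each $j \ge k_0$,

\[
C_j = \left\{ U_{j+1} > \frac{1 - \lambda}{2} \right\} \cup \left\{ V_{j+1} \neq v \right\}.
\]

Note that $\{C_j, j \ge k_0\}$ is a set of independent events with

\[
P(C_j) = 1 - \zeta(v)\,\frac{1 - \lambda}{2}.
\]

Since $\zeta(v) \in (0, 1)$ and $\lambda \in (0, 1)$, then $P\{\bigcap_{j \ge k_0} C_j\} = 0$, i.e., (3.12) has probability zero. Hence, by (3.8) and (3.12),

\[
P\left\{ \lim_{k\to\infty} N_{A,k}(v) < \infty \right\} \le \sum_{k \ge 1} P\left\{ N_{A,j}(v) = N_{A,k}(v),\ \forall j > k \right\} = 0.
\]

The result follows. ∎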

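The proof of Theorem 3.3.1 below uses a technical result proved in Section 2.3 (Lemma 2.3.1) and a theorem from Hall and Heyde (1980) which is included in Appendix A of this dissertation.

Theorem 3.3.1 For each $v \in V$, $\hat{p}_{A,k}(v)$ and $\hat{p}_{B,k}(v)$ are strongly consistent estimators of the success probabilities $p_A(v)$ and $p_B(v)$ within stratum $v$, i.e.,

\[
\lim_{k\to\infty} \hat{p}_{A,k}(v) = p_A(v) \quad \text{a.s.}
\]

and

\[
\lim_{k\to\infty} \hat{p}_{B,k}(v) = p_B(v) \quad \text{a.s.}
\]

Proof. Fix $v \in V$ and $k \ge 1$. We can write

\[
\hat{p}_{A,k}(v) - p_A(v) = \frac{1}{N_{A,k}(v)} \sum_{i=1}^{k} I\{V_i = v\}\,\delta_i\,(X_i - p_A(v)).
\]

Now, let

\[
M_k = \sum_{i=1}^{k} I\{V_i = v\}\,\delta_i\,(X_i - p_A(v)).
\]

Since $\delta_k$ and $V_k$ are $\mathcal{H}_{k-1}$-measurable, then

\[
\begin{aligned}
E[I\{V_k = v\}\,\delta_k\,(X_k - p_A(v)) \mid \mathcal{H}_{k-1}]
&= I\{V_k = v\}\, I\{\delta_k = 1\}\, E[(X_k - p_A(v)) \mid \mathcal{H}_{k-1}] \\
&= I\{V_k = v\}\, I\{\delta_k = 1\}\, [E(X_k \mid \mathcal{H}_{k-1}) - p_A(v)] \\
&= I\{V_k = v\}\, I\{\delta_k = 1\}\, [p_A(v) - p_A(v)] = 0 \quad \text{a.s.}
\end{aligned}
\]

Hence $\{M_k, \mathcal{H}_k : k \ge 1\}$ is a martingale. Furthermore, $\{N_{A,k}(v) : k \ge 1\}$ is a nondecreasing sequence of non-negative random variables such that $N_{A,k}(v)$ is $\mathcal{H}_{k-1}$-measurable for each $k \ge 1$. Finally, Lemma 2.3.1 and the definition of $N_{A,k}(v)$ imply that

\[
\begin{aligned}
\sum_{k=1}^{\infty} N_{A,k}^{-2}(v)\, E\left\{ \left[ I\{V_k = v\}\,\delta_k\,(X_k - p_A(v)) \right]^2 \mid \mathcal{H}_{k-1} \right\}
&= \sum_{k=1}^{\infty} \frac{I\{V_k = v\}\, I\{\delta_k = 1\}}{N_{A,k}^2(v)}\, E\left[ (X_k - p_A(v))^2 \mid \mathcal{H}_{k-1} \right] \\
&= p_A(v)\,(1 - p_A(v)) \sum_{k=1}^{\infty} \frac{I\{V_k = v\}\,\delta_k}{N_{A,k}^2(v)} < \infty \quad \text{a.s.}
\end{aligned}
\]

The result follows by Proposition 3.3.1 and a direct application of Theorem A.0.1. ∎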

It can now be proved that the proportions of patients allocated within each stratum to treatments A and B converge almost surely to constants which depend on the compromise weight, the crossover weights and the difference between the success probabilities for treatments A and B within each stratum. Before formally stating and proving this result we give some heuristics to show what the limiting constants should be. Further notation is needed for the sake of simplicity. For each $v \in V$, let

\[
\Delta(v) = p_A(v) - p_B(v) \tag{3.13}
\]

and

\[
\tilde{\Delta}(v, M) = \sum_{v' \in V} m(v, v')\,\Delta(v'). \tag{3.14}
\]

Now, suppose that

\[
\lim_{k\to\infty} \frac{N_{A,k}(v)}{N_k(v)} = \eta(v) \quad \text{a.s.} \tag{3.15}
\]

where $\eta(v)$ can be a random variable. Then,

\[
D_k(v) = \frac{N_{A,k}(v)}{N_k(v)} - \frac{N_{B,k}(v)}{N_k(v)} = 2\,\frac{N_{A,k}(v)}{N_k(v)} - 1 \xrightarrow[k\to\infty]{} 2\,\eta(v) - 1 \quad \text{a.s.} \tag{3.16}
\]

We expect that

\[
\lim_{k\to\infty} \left[ \frac{N_{A,k}(v)}{N_k(v)} - P(\delta_{k+1}(v) = 1 \mid \mathcal{G}_k) \right] = 0 \quad \text{a.s.} \tag{3.17}
\]

Now Theorem 3.3.1 and (3.16) imply that

\[
P(\delta_{k+1}(v) = 1 \mid \mathcal{G}_k) = \frac{1 + \lambda\,\tilde{\Delta}_k(v, M) - (1 - \lambda)\,D_k(v)}{2} \xrightarrow[k\to\infty]{} \frac{1 + \lambda\,\tilde{\Delta}(v, M) - (1 - \lambda)(2\,\eta(v) - 1)}{2} \quad \text{a.s.}
\]

So, if (3.17) holds then

\[
\eta(v) = \frac{1 + \lambda\,\tilde{\Delta}(v, M) - (1 - \lambda)(2\,\eta(v) - 1)}{2}
\]

which, solving for $\eta(v)$, gives

\[
\eta(v) = \frac{1}{2}\left( 1 + \frac{\lambda}{2 - \lambda}\,\tilde{\Delta}(v, M) \right) \quad \text{a.s.} \tag{3.18}
\]

Therefore, if $N_{A,k}(v)/N_k(v)$ converges, we expect that it converges almost surely to (3.18). The formal proof of this result follows arguments similar to those used to prove Theorem 2.3.2.
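The solution of the heuristic equation for $\eta(v)$ can be checked numerically. The sketch below iterates the map $\eta \mapsto [1 + \lambda\tilde{\Delta} - (1-\lambda)(2\eta - 1)]/2$, which is a contraction with factor $1 - \lambda$ for $\lambda \in (0, 1)$, and compares the result with the closed form (3.18). The function names and the starting value 0.5 are illustrative choices, and the iteration is only a numerical device, not part of the proof.

```python
def eta_closed_form(delta_tilde, lam):
    """Closed-form limiting proportion allocated to A, as in (3.18)."""
    return 0.5 * (1.0 + lam / (2.0 - lam) * delta_tilde)


def eta_fixed_point(delta_tilde, lam, iterations=500):
    """Solve eta = [1 + lam*delta_tilde - (1 - lam)*(2*eta - 1)] / 2 by iteration."""
    eta = 0.5  # arbitrary starting value
    for _ in range(iterations):
        eta = (1.0 + lam * delta_tilde - (1.0 - lam) * (2.0 * eta - 1.0)) / 2.0
    return eta


if __name__ == "__main__":
    for lam in (0.2, 0.5, 0.8):
        print(lam, eta_fixed_point(0.4, lam), eta_closed_form(0.4, lam))
```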

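Theorem 3.3.2 For each $v \in V$,

\[
\lim_{k\to\infty} \frac{N_{A,k}(v)}{N_k(v)} = \frac{1}{2}\left( 1 + \frac{\lambda}{2 - \lambda}\,\tilde{\Delta}(v, M) \right) \quad \text{a.s.} \tag{3.19}
\]

and

\[
\lim_{k\to\infty} \frac{N_{B,k}(v)}{N_k(v)} = \frac{1}{2}\left( 1 - \frac{\lambda}{2 - \lambda}\,\tilde{\Delta}(v, M) \right) \quad \text{a.s.} \tag{3.20}
\]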

Proof. Define a function $q$ on $(0, 1) \times (0, 1)$ by setting

\[
q(s, t) = (2 - \lambda)\,t - (1 - \lambda)\,s.
\]

Note that $q$ satisfies the regularity conditions of Section 2 in Eisele (1990), namely

(i) $q$ is continuous;

(ii) $q(s, s) = s$;

(iii) $q(s, t)$ is strictly decreasing in $s$ and strictly increasing in $t$.

Fix $v \in V$. Now, for the allocation rule CAWD($\lambda$, $M$) we can write

\[
\delta_{k+1}(v) = I\left\{ U_{k+1} \le q\left( \frac{N_{A,k}(v)}{N_k(v)},\ \frac{1}{2}\left( 1 + \frac{\lambda}{2 - \lambda}\,\tilde{\Delta}_k(v, M) \right) \right) \right\},
\]

where, by Theorem 3.3.1,

\[
\lim_{k\to\infty} \left[ \frac{1}{2}\left( 1 + \frac{\lambda}{2 - \lambda}\,\tilde{\Delta}_k(v, M) \right) \right] = \frac{1}{2}\left( 1 + \frac{\lambda}{2 - \lambda}\,\tilde{\Delta}(v, M) \right) \quad \text{a.s.}
\]

Hence, relation (3.19) can be proved by following the same arguments used in the proof of part (iii) of Lemma 1 in Eisele (1990).

Since $N_{B,k}(v)/N_k(v) = 1 - N_{A,k}(v)/N_k(v)$, then (3.20) follows from (3.19). ∎
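The identification of the CAWD($\lambda$, $M$) rule with the function $q$ can be verified directly. The computation below is a sketch that uses only the definition $q(s, t) = (2 - \lambda)t - (1 - \lambda)s$, the policy (3.6), and the identity $D_k(v) = 2N_{A,k}(v)/N_k(v) - 1$.

```latex
\begin{align*}
q\left( \frac{N_{A,k}(v)}{N_k(v)},\ \frac{1}{2}\left( 1 + \frac{\lambda}{2-\lambda}\,\tilde{\Delta}_k(v, M) \right) \right)
&= \frac{2-\lambda}{2} + \frac{\lambda}{2}\,\tilde{\Delta}_k(v, M) - (1-\lambda)\,\frac{N_{A,k}(v)}{N_k(v)} \\
&= \frac{1 + \lambda\,\tilde{\Delta}_k(v, M) - (1-\lambda)\left( 2\,\frac{N_{A,k}(v)}{N_k(v)} - 1 \right)}{2} \\
&= \frac{1 + \lambda\,\tilde{\Delta}_k(v, M) - (1-\lambda)\,D_k(v)}{2}.
\end{align*}
```

The last line is the allocation probability appearing in (3.6), so the two forms of the rule agree.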

Note that a direct consequence of Theorem 3.3.2 is that the relation (3.17) (that we used to heuristically deduce the limiting proportion of patients allocated to A within stratum $v$) holds.

From an ethical point of view (individualistic goal) we are interested in the proportion of patients successfully treated following CAWD($\lambda$, $M$) allocations. This asymptotic proportion is compared, in the proposition below, with the corresponding expressions for complete randomization within strata and AWD($\lambda$) within strata. Recall that allocating patients according to AWD($\lambda$) within strata corresponds to choosing a diagonal matrix ($M_s$) of crossover weights with diagonal elements equal to 1. Hence, for this policy

\[
\tilde{\Delta}(v, M_s) = \Delta(v). \tag{3.21}
\]

For each $k \ge 1$, let $S_k$ denote the number of patients successfully treated through stage $k$, i.e. $S_k = \sum_{v \in V} [S_{A,k}(v) + S_{B,k}(v)]$. Note that the true success probabilities for treatments A and B, $p_A$ and $p_B$, can be written as

\[
p_A = \sum_{v \in V} \zeta(v)\,p_A(v)
\]

and

\[
p_B = \sum_{v \in V} \zeta(v)\,p_B(v),
\]

respectively.

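Proposition 3.3.2 For complete randomization within strata,

\[
\lim_{k\to\infty} \frac{S_k}{k} = \frac{1}{2}\,(p_A + p_B) \quad \text{a.s.} \tag{3.22}
\]

For the CAWD($\lambda$, $M$) allocation policy,

\[
\lim_{k\to\infty} \frac{S_k}{k} = \frac{1}{2}\left[ (p_A + p_B) + \frac{\lambda}{2 - \lambda} \sum_{v \in V} \zeta(v)\,\tilde{\Delta}(v, M)\,\Delta(v) \right] \quad \text{a.s.} \tag{3.23}
\]

For the AWD($\lambda$) allocation policy within strata,

\[
\lim_{k\to\infty} \frac{S_k}{k} = \frac{1}{2}\left[ (p_A + p_B) + \frac{\lambda}{2 - \lambda} \sum_{v \in V} \zeta(v)\,\Delta^2(v) \right] \quad \text{a.s.} \tag{3.24}
\]

Proof. Relation (3.24) follows from (3.21) and (3.23). To show that (3.23) holds write

\[
\frac{S_k}{k} = \sum_{v \in V} \left( \hat{p}_{A,k}(v)\,\frac{N_{A,k}(v)}{N_k(v)} + \hat{p}_{B,k}(v)\,\frac{N_{B,k}(v)}{N_k(v)} \right) \frac{N_k(v)}{k}.
\]

The result follows by Theorem 3.3.1 (a.s. convergence of $\hat{p}_{A,k}(v)$ and $\hat{p}_{B,k}(v)$), Theorem 3.3.2 (a.s. convergence of $N_{A,k}(v)/N_k(v)$ and $N_{B,k}(v)/N_k(v)$), Lemma 3.3.1 (a.s. convergence of $N_k(v)/k$) and algebra.

A similar argument proves that (3.22) holds. ∎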
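The three limits in Proposition 3.3.2 are easy to compare numerically. The sketch below evaluates (3.23) for a given matrix of crossover weights; passing the identity matrix gives (3.24). Strata are indexed from 0, `zeta[v]` stands for the stratum probability $\zeta(v)$, and the function name and example values are hypothetical.

```python
def cawd_asymptotic_success(zeta, p_a, p_b, lam, M):
    """Limiting proportion of successes under CAWD(lam, M), as in (3.23)."""
    r = len(zeta)
    delta = [p_a[v] - p_b[v] for v in range(r)]
    # Weighted treatment differences, as in (3.14).
    delta_tilde = [sum(M[v][w] * delta[w] for w in range(r)) for v in range(r)]
    overall_a = sum(zeta[v] * p_a[v] for v in range(r))
    overall_b = sum(zeta[v] * p_b[v] for v in range(r))
    cross = sum(zeta[v] * delta_tilde[v] * delta[v] for v in range(r))
    return 0.5 * ((overall_a + overall_b) + lam / (2.0 - lam) * cross)


if __name__ == "__main__":
    zeta, p_a, p_b, lam = [0.5, 0.5], [0.8, 0.6], [0.4, 0.5], 0.5
    within_strata = [[1.0, 0.0], [0.0, 1.0]]   # AWD(lam) within strata
    from_stratum_1 = [[1.0, 0.0], [1.0, 0.0]]  # all weight on the first stratum
    print(cawd_asymptotic_success(zeta, p_a, p_b, lam, within_strata))
    print(cawd_asymptotic_success(zeta, p_a, p_b, lam, from_stratum_1))
```

In this made-up example the treatment difference has the same sign in both strata and is larger in the first, so placing all weight on the first stratum gives a slightly larger limit than the within-strata rule.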

As mentioned in Section 3.2.2, the idea behind the CAWD design is that it might be possible, for suitable choices of the crossover weights, to improve upon the AWD design within strata in terms of the proportion of patients successfully treated. We now use Proposition 3.3.2 to derive the best choices, in terms of ethical allocation, for the crossover weights.

In the case of two strata the best choices for crossover weights have simple interpretations. Only the case of two strata is considered here and when evaluating the CAWD design in Section 3.5. In the remainder of this section we assume that $V = \{1, 2\}$. Hence

\[
M = \begin{pmatrix} m(1,1) & m(1,2) \\ m(2,1) & m(2,2) \end{pmatrix}.
\]

Recall that, in this setting,

• the crossover weights for allocating a patient in stratum 1 are $m(1,1)$ and $m(1,2)$ with $m(1,1) + m(1,2) = 1$, and the crossover weights for allocating a patient in stratum 2 are $m(2,1)$ and $m(2,2)$ with $m(2,1) + m(2,2) = 1$;

• $\Delta(1) = p_A(1) - p_B(1)$ and $\Delta(2) = p_A(2) - p_B(2)$;

• $\tilde{\Delta}(1, M) = m(1,1)\,\Delta(1) + m(1,2)\,\Delta(2)$ and $\tilde{\Delta}(2, M) = m(2,1)\,\Delta(1) + m(2,2)\,\Delta(2)$.

Theorem 3.3.3 If $\Delta(1)\,\Delta(2) \le 0$, the choice of crossover weights that yields the largest asymptotic proportion of patients successfully treated is given by

\[
M_s = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{3.25}
\]

which corresponds to allocating patients according to AWD($\lambda$) within strata.

If $\Delta(1)\,\Delta(2) > 0$ and $|\Delta(1)| > |\Delta(2)|$, the choice of crossover weights that yields the largest asymptotic proportion of patients successfully treated is

\[
M_1 = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}. \tag{3.26}
\]

If $\Delta(1)\,\Delta(2) > 0$ and $|\Delta(1)| < |\Delta(2)|$, the choice of crossover weights that yields the largest asymptotic proportion of patients successfully treated is

\[
M_2 = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}. \tag{3.27}
\]
Proof. Let $S_{gen}$ denote the a.s. asymptotic proportion of successes for CAWD($\lambda$, $M$) with a general choice of crossover weights. By (3.23), the maximum of $S_{gen}$ is achieved for the choice of $M$ that maximizes $\sum_{v \in V} \zeta(v)\,\tilde{\Delta}(v, M)\,\Delta(v)$. Since $m(1,1) = 1 - m(1,2)$ and $m(2,2) = 1 - m(2,1)$, we may write

\[
\sum_{v \in V} \zeta(v)\,\tilde{\Delta}(v, M)\,\Delta(v) = \sum_{v \in V} \zeta(v)\,\Delta^2(v) + \zeta(1)\,m(1,2)\,\Delta(1)\,(\Delta(2) - \Delta(1)) + \zeta(2)\,m(2,1)\,\Delta(2)\,(\Delta(1) - \Delta(2)), \tag{3.28}
\]

where the first term on the right-hand side does not depend on $M$.

If $\Delta(1)\,\Delta(2) \le 0$, then both $\Delta(1)(\Delta(2) - \Delta(1)) \le 0$ and $\Delta(2)(\Delta(1) - \Delta(2)) \le 0$. Hence (3.28) is largest when $m(1,2) = m(2,1) = 0$ and the first part of the theorem follows.

If $\Delta(1)\,\Delta(2) > 0$ and $|\Delta(1)| > |\Delta(2)|$, then both $\Delta(1)(\Delta(2) - \Delta(1)) < 0$ and $\Delta(2)(\Delta(1) - \Delta(2)) > 0$. Hence (3.28) is largest when $m(1,2) = 0$ and $m(2,1) = 1$. The second part of the theorem follows. The proof of the result for $\Delta(1)\,\Delta(2) > 0$ and $|\Delta(1)| < |\Delta(2)|$ is similar. ∎

The results stated in Theorem 3.3.3 can be interpreted, from the point of view of
ethical allocations, as follows. Suppose stratum 1 corresponds to female patients and
stratum 2 corresponds to male patients.

• If A is as good as B for treating either female or male patients or if A is better
than B for treating females but B is better than A for treating males, then females
(respectively, males) should be allocated without using information on the responses
of males (respectively, females) previously treated; there should be no crossover of
information.

• If A is better than B for treating both female and male patients and the difference
between the success probabilities is largest for the female group (respectively, male
group), then both females and males should be allocated by using only information on
the responses of females (respectively, males); there should be crossover of information
only from the stratum with the largest difference between treatments.
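
The case analysis of Theorem 3.3.3 can be sketched in a few lines of Python. This is only an illustration: the function names `best_crossover_weights` and `objective` are hypothetical, the treatment differences are passed in as plain numbers, and the tie case (equal absolute differences with the same sign), which the theorem does not cover, is resolved here by an arbitrary choice.

```python
def best_crossover_weights(delta1, delta2):
    """Crossover-weight matrix singled out by Theorem 3.3.3 for two strata.

    delta1, delta2: treatment differences p_A(v) - p_B(v) in strata 1 and 2.
    """
    if delta1 * delta2 <= 0:
        return [[1, 0], [0, 1]]      # M_s: no crossover of information
    if abs(delta1) > abs(delta2):
        return [[1, 0], [1, 0]]      # M_1: use stratum 1 responses only
    if abs(delta1) < abs(delta2):
        return [[0, 1], [0, 1]]      # M_2: use stratum 2 responses only
    # Assumption: when delta1 == delta2 the M-dependent terms of (3.28)
    # vanish, so any matrix is optimal; the identity is returned by convention.
    return [[1, 0], [0, 1]]


def objective(M, c, delta):
    """Sum over strata of c(v) * Delta(v, M) * Delta(v), the quantity maximized in the proof."""
    return sum(
        c[v] * (M[v][0] * delta[0] + M[v][1] * delta[1]) * delta[v]
        for v in range(2)
    )
```

For the three parameter sets used in Section 3.5, the differences are (0.40, -0.25), (0.80, 0.45) and (-0.15, -0.70), and the function returns the identity matrix, the matrix with rows (1, 0) and the matrix with rows (0, 1), respectively.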

3.4 Central Limit Theorem

In this section it is shown that within each stratum $v \in V$, the strongly consistent
estimators of the success probabilities $p_A(v)$ and $p_B(v)$ are asymptotically independent
and normally distributed.

Theorem 3.4.1 For each $v \in V$, as $k \to \infty$,

$$\begin{pmatrix} \sqrt{N_{A,k}(v)}\,\bigl(\hat{p}_{A,k}(v) - p_A(v)\bigr) \\ \sqrt{N_{B,k}(v)}\,\bigl(\hat{p}_{B,k}(v) - p_B(v)\bigr) \end{pmatrix} \xrightarrow{d} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} p_A(v)\,q_A(v) & 0 \\ 0 & p_B(v)\,q_B(v) \end{pmatrix} \right),$$

where $q_A(v) = 1 - p_A(v)$ and $q_B(v) = 1 - p_B(v)$.

Proof. Fix $v \in V$. Fix real constants $a$ and $b$, and define for each $k \ge 1$ and
$i = 1, \ldots, k$,

$$M_{k,i} = \frac{1}{\sqrt{k}} \sum_{j=1}^{i} \bigl[ a\,(X_j - p_A(v))\,\delta_j\, I\{V_j = v\} + b\,(Y_j - p_B(v))\,(1 - \delta_j)\, I\{V_j = v\} \bigr],$$

and let $\mathcal{H}_{k,i} = \mathcal{H}_i$. For each $j = 1, \ldots, i$ let

$$Z_{k,j} = \frac{1}{\sqrt{k}} \bigl[ a\,(X_j - p_A(v))\,\delta_j\, I\{V_j = v\} + b\,(Y_j - p_B(v))\,(1 - \delta_j)\, I\{V_j = v\} \bigr].$$

The theorem can now be established using arguments similar to the ones used in the
proof of Theorem 2.4.1. ∎

3.5 Evaluation of the Design

Evaluating the performance of the CAWD design is not a simple task. Many parameters
($p_A(v)$, $p_B(v)$ and $c(v)$ for $v \in V$) and choices of weights ($\lambda$ and $M$) have to be
considered. The number of parameters increases rapidly as $r$, the number of strata,
increases. Simulations are, in this setting, extremely time consuming; a thorough
evaluation of the CAWD design is relegated to future work. Here our goal is simply
to show that the idea behind the CAWD design has merit. Only the case $r = 2$ is
considered. Monte Carlo simulations are used to evaluate the CAWD design from the
individualistic point of view:

• how ethical are the assignments of patients to treatments?

and from the utilitarian point of view:

• how good are the estimators of the treatments difference within strata, $p_A(1) - p_B(1)$
and $p_A(2) - p_B(2)$?

These questions are addressed by looking, respectively, at the proportion of patients
successfully treated and at the empirical mean squared error within strata obtained
as a result of CAWD($\lambda$, $M$) allocations.

Theorem 3.3.3 identifies the best choices for the crossover weights from an
individualistic perspective. We now use simulations to assess, for both small and large
trials, how good these ethical choices can be and how much may be lost in terms of
statistical inference. Comparisons are made between the CAWD design with $\lambda = 0.8$
and matrices of crossover weights given by Theorem 3.3.3, the CAWD design with
$\lambda = 0.8$ and a matrix of crossover weights, denoted $M_h$, with all elements equal to
0.5, complete randomization within strata and the RPW(1, 0, 1) rule within strata.
Here the choice of $\lambda = 0.8$ is justified by the good performance shown by an AWD
design with a large compromise weight.

Figures 3.1 through 3.18 show the results of 10,000 replications of clinical trials
with sample sizes $n = 30$ and $150$, success probabilities $(p_A(1), p_B(1), p_A(2), p_B(2)) =$
(0.50, 0.10, 0.15, 0.40), (0.90, 0.10, 0.60, 0.15) and (0.35, 0.50, 0.15, 0.85) and a range
of values for $c(1)$ from 0.20 through 0.80. For each of the allocation policies considered
and each combination of values for $n$, $(p_A(1), p_B(1), p_A(2), p_B(2))$ and $c(1)$, the
proportion of successes is computed as the average, over 10,000 replications, of the
proportion of patients successfully treated in the simulated trial; the empirical mean
squared error within stratum $v$ ($v \in \{1, 2\}$) is computed as the average, over 10,000
replications, of the squared difference between the estimate of $p_A(v) - p_B(v)$ and
the parameter.
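
The two performance measures just described can be sketched for the simplest of the compared policies, complete randomization within strata. The code below is a minimal illustration under stated assumptions, not the simulation code used for the figures: the function name is hypothetical, treatments are assigned by a fair coin, and a replication in which one arm of a stratum is empty is simply skipped when averaging the squared error (how such cases were handled in the original study is not stated).

```python
import random


def simulate_complete_randomization(n, p, c1, reps, seed=0):
    """Monte Carlo estimate of (proportion of successes, MSE within each stratum).

    n: trial size; c1: probability that a patient falls in stratum 1;
    p: dict mapping (treatment, stratum) to a success probability.
    """
    rng = random.Random(seed)
    total_prop = 0.0
    sq_err = {1: 0.0, 2: 0.0}
    used = {1: 0, 2: 0}
    for _ in range(reps):
        succ = {(t, v): 0 for t in "AB" for v in (1, 2)}
        cnt = {(t, v): 0 for t in "AB" for v in (1, 2)}
        for _ in range(n):
            v = 1 if rng.random() < c1 else 2          # stratum of the patient
            t = "A" if rng.random() < 0.5 else "B"     # fair-coin allocation
            cnt[t, v] += 1
            succ[t, v] += rng.random() < p[t, v]       # Bernoulli response
        total_prop += sum(succ.values()) / n
        for v in (1, 2):
            if cnt["A", v] and cnt["B", v]:            # assumption: skip empty arms
                est = succ["A", v] / cnt["A", v] - succ["B", v] / cnt["B", v]
                sq_err[v] += (est - (p["A", v] - p["B", v])) ** 2
                used[v] += 1
    mse = {v: sq_err[v] / used[v] if used[v] else float("nan") for v in (1, 2)}
    return total_prop / reps, mse
```

Under this policy the expected proportion of successes is $c(1)\,(p_A(1) + p_B(1))/2 + (1 - c(1))\,(p_A(2) + p_B(2))/2$, which gives a simple sanity check on the simulated value.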

The following labels are used in Figures 3.1 through 3.18.
• r = complete randomization within strata;
• a = CAWD(0.8, $M_s$);
• b = CAWD(0.8, $M_1$);
• c = CAWD(0.8, $M_2$);
• d = CAWD(0.8, $M_h$);
• w = RPW(1, 0, 1) rule within strata.
Here, $M_s$, $M_1$ and $M_2$ are the matrices defined in (3.25), (3.26) and (3.27),
respectively, and $M_h$ is a 2×2 matrix with all elements equal to 0.5.

For Figures 3.1 through 3.6, the success probabilities satisfy $\Delta(1)\Delta(2) < 0$. In
this case, Theorem 3.3.3 states that the best ethical choice of crossover weights for the
CAWD design is given by $M_s$ (which is equivalent to AWD allocations within strata).
Simulations show that (for a particular choice of the success probabilities) not only
is the overall proportion of patients successfully treated larger for the CAWD(0.8,
$M_s$) design than for the CAWD(0.8, $M_h$) design and complete randomization within
strata, but the mean squared error for estimating the treatments difference within
each stratum is also smaller for the CAWD(0.8, $M_s$) design than for the other two
rules. The RPW(1, 0, 1) rule within strata and the CAWD(0.8, $M_s$) design show similar
combined performances in these figures.

For Figures 3.7 through 3.12, the success probabilities satisfy $\Delta(1)\Delta(2) > 0$ and
$|\Delta(1)| > |\Delta(2)|$. In this case, Theorem 3.3.3 states that the best ethical choice
of crossover weights for the CAWD design is given by $M_1$. Simulations show that
(for a particular choice of the success probabilities) CAWD(0.8, $M_1$) yields, as
expected, a larger proportion of successes than CAWD(0.8, $M_h$). The RPW(1, 0, 1)
rule within strata and CAWD(0.8, $M_1$) have similar performances with respect to the
proportion of patients successfully treated. Complete randomization within strata is
markedly deficient in this aspect. A surprising feature of the CAWD(0.8, $M_1$) design
and the RPW(1, 0, 1) rule within strata is shown in Figures 3.8, 3.9, 3.11 and
3.12. CAWD(0.8, $M_1$) does better than RPW(1, 0, 1) within strata when estimating
the treatments difference in stratum 1, but the reverse happens when estimating the
treatments difference in stratum 2. Still, RPW(1, 0, 1) within strata has a good
performance in this respect, comparable to that of complete randomization within strata.
This suggests that the crossover weights may be adjusted to increase the proportion
of patients successfully treated while yielding low mean squared errors for estimating
the treatments difference within strata.

Finally, for Figures 3.13 through 3.18, the success probabilities satisfy $\Delta(1)\Delta(2) > 0$
and $|\Delta(1)| < |\Delta(2)|$. In this case, Theorem 3.3.3 states that the best ethical choice
of crossover weights for the CAWD design is given by $M_2$. Simulations yielded
conclusions similar to the ones presented in the previous paragraph, but now for the
CAWD(0.8, $M_2$) design.

[Plot omitted: proportion of successes against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.1: Comparisons in terms of proportion of successes for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$.

[Plot omitted: mean squared error in stratum 1 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.2: Comparisons in terms of mean squared error in stratum 1 for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$.

[Plot omitted: mean squared error in stratum 2 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.3: Comparisons in terms of mean squared error in stratum 2 for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$.

[Plot omitted: proportion of successes against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.4: Comparisons in terms of proportion of successes for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$.

[Plot omitted: mean squared error in stratum 1 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.5: Comparisons in terms of mean squared error in stratum 1 for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$.

[Plot omitted: mean squared error in stratum 2 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.6: Comparisons in terms of mean squared error in stratum 2 for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.15$ and $p_B(2) = 0.40$.

[Plot omitted: proportion of successes against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.7: Comparisons in terms of proportion of successes for $n = 30$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$.

[Plot omitted: mean squared error in stratum 1 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.8: Comparisons in terms of mean squared error in stratum 1 for $n = 30$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$.

[Plot omitted: mean squared error in stratum 2 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.9: Comparisons in terms of mean squared error in stratum 2 for $n = 30$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$.

[Plot omitted: proportion of successes against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.10: Comparisons in terms of proportion of successes for $n = 150$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$.

[Plot omitted: mean squared error in stratum 1 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.11: Comparisons in terms of mean squared error in stratum 1 for $n = 150$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$.

[Plot omitted: mean squared error in stratum 2 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.12: Comparisons in terms of mean squared error in stratum 2 for $n = 150$, $p_A(1) = 0.90$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.15$.

[Plot omitted: proportion of successes against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.13: Comparisons in terms of proportion of successes for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$.

[Plot omitted: mean squared error in stratum 1 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.14: Comparisons in terms of mean squared error in stratum 1 for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$.

[Plot omitted: mean squared error in stratum 2 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.15: Comparisons in terms of mean squared error in stratum 2 for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$.

[Plot omitted: proportion of successes against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.16: Comparisons in terms of proportion of successes for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$.

[Plot omitted: mean squared error in stratum 1 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.17: Comparisons in terms of mean squared error in stratum 1 for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$.

[Plot omitted: mean squared error in stratum 2 against c(1), for c(1) from 0.20 to 0.80.]

Figure 3.18: Comparisons in terms of mean squared error in stratum 2 for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.50$, $p_A(2) = 0.15$ and $p_B(2) = 0.85$.

Chapter 4

The Covariate Randomized

Play-the-Winner Rule

4.1 Introduction

As mentioned in Chapter 1, very few clinical trials based on adaptive designs have
been reported in the literature. The logistics of conducting a clinical trial based on
an adaptive design are complicated, and more so if the design is in itself difﬁcult to
implement. The simplicity of implementation of an urn model (when compared to
other types of adaptive designs) is, perhaps, the main reason why the randomized
play-the-winner rule has been the most attractive adaptive design from the point of
view of applications (see Section 1.4). In this chapter, we use the idea behind the
covariate adaptive weighted differences design, crossover of information from strata
to strata, to incorporate covariates into a randomized play-the-winner rule. This
new adaptive design is called the Covariate Randomized Play-the-Winner Rule,
abbreviated CRPW, and it corresponds to a multiple urn model, with one urn for each
combination of covariate levels (stratum). Similarly to the CAWD design, the CRPW
design can be applied in clinical trials for which

• patients arrive sequentially;

• each patient can be assigned to either one of two treatments and will be assigned
to exactly one of them;

• the responses of patients to treatments are dichotomous (either a success or a
failure);

• the response of each patient is observed before the next patient is assigned to a
treatment;

• relevant covariates on a patient (concomitant information such as age, sex, general
physical status, severity of the disease, etc.) are available before assigning him/her
to a treatment;

• all the strata that can be formed by considering common combinations of covariate
levels are known before the trial begins and there will be at least one patient treated
in each one of those strata.

With the CRPW rule, patients are randomly assigned to treatments according to
a response-adaptive covariate design. A compromise is once again sought between
the individualistic and utilitarian goals of a design for clinical trials.

In what follows, the allocation policy for the CRPW rule is formally described and
strong laws of large numbers and a central limit theorem are proved. The design is
evaluated by comparing its performance with that of complete randomization within
strata and the RPW(1, 0, 1) rule within strata.

4.2 The Allocation Policy

The notation and assumptions of Section 3.2.1 are in force for the remainder of this
chapter.

The CRPW rule can easily be described with a multiple urn model. Suppose there
are $r$ urns (which correspond to the $r$ possible strata). When a patient enters the
trial he/she is examined to determine the stratum to which he/she belongs. A ball
is then randomly drawn from the urn corresponding to that stratum. If the ball is
of type $i$, assign the patient to treatment $i$, where $i \in \{A, B\}$. The ball is replaced
in the urn and the response of the patient is observed. The composition of (possibly
all) the urns is then changed according to the rule described below.

The idea developed in Chapter 3 for incorporating covariates into an adaptive
design translates, in the present setting, into allowing for the possibility of changing
the composition of all the urns after observing the response of a patient belonging
to any one stratum. With the CRPW rule the composition of the urns is updated
as follows. Suppose the patient whose response to treatment $i$ ($i \in \{A, B\}$) has just
been observed belongs to stratum $v$ ($v \in V = \{1, 2, \ldots, r\}$). If the response is a
success, add $\beta(v^*, v)$ balls of type $i$ to urn $v^*$, for each $v^* \in V$; if the response is a
failure, add $\beta(v^*, v)$ balls of type $j$ ($j \ne i$) to urn $v^*$, for each $v^* \in V$. Here the
$\beta(v^*, v)$'s are non-negative integers. With the RPW rule within strata only the
composition of urn $v$ is changed; this is equivalent to setting, in the CRPW rule,
$\beta(v^*, v) = 0$ for all $v^* \ne v$. Hence, the RPW rule within strata is just a particular
case of the CRPW rule.
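
One step of the urn scheme just described can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `crpw_step` and the dictionary-based data layout are hypothetical choices, and responses are simulated as Bernoulli draws from assumed success probabilities.

```python
import random


def crpw_step(urns, beta, stratum, p, rng):
    """Allocate one patient in `stratum`, observe the response, update every urn.

    urns: dict mapping urn label v* to {"A": count, "B": count}
    beta: dict mapping (v*, v) to the non-negative integer beta(v*, v)
    p:    dict mapping (treatment, stratum) to a success probability
    """
    urn = urns[stratum]
    # Draw a ball (with replacement) from the urn of the patient's stratum.
    treatment = "A" if rng.random() < urn["A"] / (urn["A"] + urn["B"]) else "B"
    success = rng.random() < p[treatment, stratum]
    other = "B" if treatment == "A" else "A"
    # Success: add balls of the assigned type; failure: add balls of the other type.
    ball = treatment if success else other
    for v_star in urns:
        urns[v_star][ball] += beta[v_star, stratum]
    return treatment, success
```

Taking `beta[v_star, v] = 0` whenever `v_star != v` leaves all other urns untouched, which is the RPW rule within strata.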
To formally describe the CRPW rule, the following notation and assumptions are
needed. Let $C_0$ denote the set of the initial compositions of the $r$ urns,

$$C_0 = \{(r_0(1), n_0(1)), \ldots, (r_0(r), n_0(r))\},$$

where, for each $v^* \in V = \{1, 2, \ldots, r\}$, $r_0(v^*)$ denotes the initial proportion of type
A balls in urn $v^*$ and $n_0(v^*)$ denotes the initial total number of balls in urn $v^*$. It
is assumed throughout that, for each $v^* \in V$, $r_0(v^*) \in (0, 1)$ and $n_0(v^*) \ge 2$. Let $B$
denote the matrix of $\beta$'s,

$$B = \begin{pmatrix} \beta(1,1) & \beta(1,2) & \cdots & \beta(1,r) \\ \beta(2,1) & \beta(2,2) & \cdots & \beta(2,r) \\ \vdots & \vdots & & \vdots \\ \beta(r,1) & \beta(r,2) & \cdots & \beta(r,r) \end{pmatrix}.$$

Each row, $v^*$, of the matrix $B$ represents the number of balls added to urn $v^*$ after
observing responses of patients in strata $1, 2, \ldots, r$. It is assumed throughout that
the $\beta$'s are non-negative integers and that each column of $B$ has at least one non-zero
component (i.e., at each stage, the composition of at least one of the urns is changed).

The $\beta$'s can be interpreted as follows. Let

$$m(v^*, v) = \frac{\beta(v^*, v)}{\beta(v^*, 1) + \cdots + \beta(v^*, r)}.$$

Then $m(v^*, v)$ is the weight placed on the responses of patients previously treated in
stratum $v \in V$ when allocating a patient in stratum $v^*$. Note that for each $v^* \in V$,
$\sum_{v \in V} m(v^*, v) = 1$.
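
As a small illustration of this interpretation, the weights can be obtained by normalizing each row of the matrix of $\beta$'s. The sketch below uses a hypothetical function name and assumes every row has a positive sum, which the stated assumptions on the columns do not by themselves guarantee.

```python
from fractions import Fraction


def crossover_weights(beta):
    """Row-normalize a matrix of non-negative integers beta(v*, v) into weights m(v*, v)."""
    weights = []
    for row in beta:
        total = sum(row)
        if total == 0:
            # Assumption of this sketch: each urn receives balls from some stratum.
            raise ValueError("row of beta sums to zero; weights are undefined")
        weights.append([Fraction(b, total) for b in row])
    return weights
```

For example, the rows (1, 3) and (2, 2) give the weights (1/4, 3/4) and (1/2, 1/2).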

The allocation policy for the CRPW rule can formally be described by setting, for
each $k \ge 0$,

$$\delta_{k+1} = \sum_{v \in V} \delta_{k+1}(v)\, I\{V_{k+1} = v\}, \tag{4.2}$$

where, for each $v \in V$, $\delta_{k+1}(v)$ is to be specified.

Suppose patient $k + 1$ is in stratum $v^*$. For each $k \ge 0$ and each $v^* \in V$ define
$A_k(v^*)$ and $B_k(v^*)$ to be the number of balls of types A and B in urn $v^*$ after the
response of patient $k$ has been observed and balls have been added to the urns. Then,

$$A_0(v^*) = r_0(v^*)\, n_0(v^*), \qquad B_0(v^*) = (1 - r_0(v^*))\, n_0(v^*), \tag{4.3}$$

and for $k \ge 1$

$$A_k(v^*) = A_0(v^*) + \sum_{v \in V} \beta(v^*, v)\, \bigl[ S_{A,k}(v) + N_{B,k}(v) - S_{B,k}(v) \bigr] \tag{4.4}$$

and

$$B_k(v^*) = B_0(v^*) + \sum_{v \in V} \beta(v^*, v)\, \bigl[ S_{B,k}(v) + N_{A,k}(v) - S_{A,k}(v) \bigr]. \tag{4.5}$$

So, according to the above description of the model,

$$\delta_{k+1}(v^*) = I\left\{ U_{k+1} \le \frac{A_k(v^*)}{A_k(v^*) + B_k(v^*)} \right\}. \tag{4.6}$$
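
The closed forms (4.3) to (4.5) can be evaluated directly from the per-stratum counts. The following sketch assumes that $S_{A,k}(v)$, $N_{A,k}(v)$, $S_{B,k}(v)$ and $N_{B,k}(v)$ are supplied as lists indexed by stratum; the function name and argument layout are illustrative only.

```python
def urn_counts(r0, n0, beta_row, S_A, N_A, S_B, N_B):
    """Numbers of type A and type B balls in one urn, following (4.3)-(4.5).

    beta_row: the row of the beta matrix for this urn, one entry per stratum.
    S_A, N_A, S_B, N_B: successes and patients on A and on B, per stratum.
    """
    a = r0 * n0
    b = (1 - r0) * n0
    for v, weight in enumerate(beta_row):
        # A balls come from successes on A and failures on B in stratum v.
        a += weight * (S_A[v] + N_B[v] - S_B[v])
        # B balls come from successes on B and failures on A in stratum v.
        b += weight * (S_B[v] + N_A[v] - S_A[v])
    return a, b
```

The ratio `a / (a + b)` is then the probability, as in (4.6), that the next patient in that stratum is assigned to treatment A.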

Henceforth, the notation CRPW($C_0$, $B$) will refer to the allocation policy described
in this section, where the set $C_0$ gives the initial composition of the urns and the matrix
$B$ gives the number of balls to be added to each urn at each stage of the trial. Note
that the RPW($u$, 0, $\beta$) rule within strata is the particular case of the CRPW($C_0$, $B$)
rule corresponding to choosing a set $C_0$ with all pairs equal to $(\tfrac{1}{2}, 2u)$ and a diagonal
matrix $B$ with all diagonal components equal to $\beta$.

4.3 Strong Laws of Large Numbers

In this section it is assumed that patients are allocated to treatments A or B according
to the CRPW($C_0$, $B$) rule. The main result proved here is that the proportion of balls
of type A in each urn converges almost surely to a constant as the trial size converges
to $\infty$. The proof of this type of result for single urn models is rather involved. See, for
example, Hill, Lane, and Sudderth (1980), Athreya and Karlin (1967) and Bai and
Hu (1999). The crossover of information on the responses of patients significantly
complicates the arguments needed to prove this type of convergence. We prove the
result, for the CRPW rule, in the case of two urns and for matrices $B$ with components
satisfying some conditions. Our proof follows the structure of the proof of Theorem
2.1 in Hill, Lane, and Sudderth (1980) with several modifications to allow for the
specific characteristics of a multiple urn model with crossover of information.

It is also proved, in this section, that for any number of urns and any matrices $B$
for which the above convergence holds, $\hat{p}_{A,k}(v)$ and $\hat{p}_{B,k}(v)$ are strongly consistent
estimators of $p_A(v)$ and $p_B(v)$, the success probabilities for treatments A and B, within
each stratum $v \in V$.

For $k \ge 1$, let $\mathcal{F}_k$ be the $\sigma$-algebra generated by the first $k$ allocations, potential
responses, strata and auxiliary randomization,

$$\mathcal{F}_k = \sigma\{\delta_i, V_i, X_i, Y_i, U_i : 1 \le i \le k\},$$

and let $\mathcal{F}_0$ denote the trivial $\sigma$-algebra. It is also useful, in what follows, to consider,
for $k \ge 1$, the $\sigma$-algebras

$$\mathcal{G}_k = \mathcal{F}_k \vee \sigma\{V_{k+1}\}$$

and

$$\mathcal{H}_k = \mathcal{F}_k \vee \sigma\{V_{k+1}, U_{k+1}\}.$$

Note that $\delta_{k+1}$ is $\mathcal{H}_k$-measurable.

The following lemma is useful in proving some of the theorems in this section.

Lemma 4.3.1 For each $v \in V$,

$$\lim_{k \to \infty} \frac{N_k(v)}{k} = c(v) \quad \text{a.s.}$$

Proof. The proof is similar to that of Lemma 3.3.1. ∎

It is reasonable to assume that, for each $v \in V$, the limiting proportion of type
A balls in urn $v$ is almost surely the same as the limiting proportion of patients
allocated to treatment A within stratum $v$ (if the limits exist). So, in what follows,
unless otherwise noted, we assume that for each $v \in V$

$$\lim_{k \to \infty} \left( \frac{A_k(v)}{A_k(v) + B_k(v)} - \frac{N_{A,k}(v)}{N_k(v)} \right) = 0 \quad \text{a.s.} \tag{4.7}$$

If (4.7) holds, then Lemma 4.3.1 implies that

$$\lim_{k \to \infty} \left( \frac{B_k(v)}{A_k(v) + B_k(v)} - \frac{N_{B,k}(v)}{N_k(v)} \right) = 0 \quad \text{a.s.} \tag{4.8}$$

For each $v \in V = \{1, 2, \ldots, r\}$ and each $k \ge 0$ define $R_k(v)$ to be the proportion
of type A balls in urn $v$ just before patient $k + 1$ is assigned to a treatment. Then

$$R_k(v) = \frac{A_k(v)}{A_k(v) + B_k(v)}. \tag{4.9}$$

Note that $R_0(v) = r_0(v)$.

The next theorem states that, for any specific urn, if the proportion of type A balls
converges a.s. then the usual estimators of the success probabilities for treatments A
and B within the stratum corresponding to that urn are strongly consistent.

Theorem 4.3.1 For each $v \in V$, if $R_k(v)$ converges almost surely as $k \to \infty$ then
$\hat{p}_{A,k}(v)$ and $\hat{p}_{B,k}(v)$ are strongly consistent estimators of the success probabilities $p_A(v)$
and $p_B(v)$ within stratum $v$, i.e.,
(i) $\lim_{k \to \infty} \hat{p}_{A,k}(v) = p_A(v)$ a.s.;

(ii) $\lim_{k \to \infty} \hat{p}_{B,k}(v) = p_B(v)$ a.s.

Proof. First note that Lemma 4.3.1 and Assumption (4.7) imply that

$$\lim_{k \to \infty} N_{A,k}(v) = \infty \quad \text{a.s.}$$

The result can now be proved by following the arguments used in the proof of
Theorem 3.3.1. ∎

The previous results were proved with no restrictions on the number of strata or
on matrix $B$. The next theorems summarize the main result of this section. The
theorems are proved under the assumption that for each $v \in V$,

$$q_A(v) + q_B(v) > 0,$$

where $q_A(v) = 1 - p_A(v)$ and $q_B(v) = 1 - p_B(v)$. So, it is assumed that, within each
stratum, at least one of the success probabilities is less than 1. The proofs are only
valid for $r = 2$. Henceforth, $V = \{1, 2\}$ and the matrix of $\beta$'s is of the form

$$B = \begin{pmatrix} \beta(1,1) & \beta(1,2) \\ \beta(2,1) & \beta(2,2) \end{pmatrix}.$$

Theorem 4.3.2 below states that, under the above assumptions and some further
constraints on the components of $B$, the proportion of type A balls in each urn
converges a.s. as the trial size converges to $\infty$. Note that the convergence for urn $v^*$
($v^* \in V$) is proved by constraining only the $\beta$'s in row $v^*$ of $B$. Assumption (4.7) is
not needed to prove the theorem.

Theorem 4.3.2 The following hold.
(i) If $1 \le \beta(1,1) \le \beta(1,2)$ then $R_k(1)$ converges almost surely as $k \to \infty$;

(ii) if $1 \le \beta(2,2) \le \beta(2,1)$ then $R_k(2)$ converges almost surely as $k \to \infty$.

The proof of the theorem follows three lemmas. Before stating these lemmas we
introduce some notation, define some concepts and present general arguments that
will be used in the proofs.

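For simplicity, in what follows we write $\beta_{i,j}$ instead of $\beta(i, j)$, for $i, j \in \{1, 2\}$.
Fix $v^* \in V$. For each $k \ge 1$, define $\tau_{A,k}(v^*)$ to be the number of A balls added to
urn $v^*$ after observing the response of patient $k$. Then,

$$\tau_{A,k}(v^*) = \bigl( \beta_{v^*,1}\, I\{V_k = 1\} + \beta_{v^*,2}\, I\{V_k = 2\} \bigr)\, \bigl( \delta_k X_k + (1 - \delta_k)(1 - Y_k) \bigr).$$

Now,

$$P[\tau_{A,k}(v^*) = \beta_{v^*,1} \mid \mathcal{H}_{k-1}] = I\{V_k = 1\}\, I\{\delta_k = 1\}\, p_A(1) + I\{V_k = 1\}\, I\{\delta_k = 0\}\, (1 - p_B(1))$$

and so

$$P[\tau_{A,k}(v^*) = \beta_{v^*,1} \mid \mathcal{G}_{k-1}] = p_A(1)\, I\{V_k = 1\}\, P(\delta_k = 1 \mid \mathcal{G}_{k-1}) + (1 - p_B(1))\, I\{V_k = 1\}\, P(\delta_k = 0 \mid \mathcal{G}_{k-1})$$
$$= \bigl[ p_A(1)\, R_{k-1}(1) + (1 - p_B(1))\,(1 - R_{k-1}(1)) \bigr]\, I\{V_k = 1\}.$$

Similarly, it can be shown that

$$P[\tau_{A,k}(v^*) = \beta_{v^*,2} \mid \mathcal{G}_{k-1}] = \bigl[ p_A(2)\, R_{k-1}(2) + (1 - p_B(2))\,(1 - R_{k-1}(2)) \bigr]\, I\{V_k = 2\}.$$

Define on $V \times [0, 1] \times [0, 1]$ a function $f_{v^*}$ by setting

$$f_{v^*}(v, r_1, r_2) = \begin{cases} p_A(1)\, r_1 + (1 - p_B(1))\,(1 - r_1), & \text{if } v = 1, \\ p_A(2)\, r_2 + (1 - p_B(2))\,(1 - r_2), & \text{if } v = 2. \end{cases} \tag{4.10}$$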

So, if $\{U_k : k \ge 1\}$ is a sequence of i.i.d. Uniform[0, 1] random variables independent
of the sequence $\{(V_k, X_k, Y_k) : k \ge 1\}$, then

$$\tau_{A,k}(v^*) = \beta_{v^*,1}\, I\bigl\{ U_k \le I\{V_k = 1\}\, f_{v^*}(V_k, R_{k-1}(1), R_{k-1}(2)) \bigr\} + \beta_{v^*,2}\, I\bigl\{ U_k \le I\{V_k = 2\}\, f_{v^*}(V_k, R_{k-1}(1), R_{k-1}(2)) \bigr\}. \tag{4.11}$$

Also, note that

$$R_k(v^*) = \frac{\bigl( A_{k-1}(v^*) + B_{k-1}(v^*) \bigr)\, R_{k-1}(v^*) + \tau_{A,k}(v^*)}{A_k(v^*) + B_k(v^*)}, \tag{4.12}$$

where, because of the restrictions on the components of $B$ (which give $\beta_{v^*,1} \ge 1$ and $\beta_{v^*,2} \ge 1$),

$$A_k(v^*) + B_k(v^*) = n_0(v^*) + \beta_{v^*,1}\, N_k(1) + \beta_{v^*,2}\, N_k(2) \ge n_0(v^*) + k. \tag{4.13}$$

The notions of urn process and urn function defined in Hill, Lane, and Sudderth
(1980) can be generalized as follows. For each $v^* \in V$, we say that $\{R_k(v^*), k \ge 1\}$ is
the urn $v^*$ process, with urn function $f_{v^*}$ and initial urn composition $(r_0(v^*), n_0(v^*))$.
The distribution of such an urn process will be denoted by $P_{(r_0(v^*),\, n_0(v^*))}$.

We prove the theorem for $v = 1$ under the assumption $1 \le \beta(1,1) \le \beta(1,2)$. The
proof for $v = 2$ under the assumption $1 \le \beta(2,2) \le \beta(2,1)$ follows similarly.

For simplicity, when an argument or statement refers only to the urn 1 function
and the initial composition of urn 1 we omit the subscript $v^*$ ($v^* = 1$) from the urn
function and the argument $v^*$ from the initial urn composition.

We now state and prove the three lemmas used in the proof of Theorem 4.3.2.

Let $I = (a, b)$ and $J = (c, d)$, with $0 \le a < c < d < b \le 1$, and let $U_{R(1)}$ be the
event that $\{R_k(1), k \ge 1\}$ upcrosses the interval $I$ infinitely often.

Lemma 4.3.2 If $P_{(r_0, n_0)}(U_{R(1)}) > 0$ then for every $\epsilon > 0$ and positive integer $M$,
there exist $s_0 \in J$ and $m_0 \ge M$ such that $P_{(s_0, m_0)}(U_{R(1)}) \ge 1 - \epsilon$.

Proof. The proof of the lemma follows three claims.
Claim 1. The increments of $\{R_k(1), k \ge 1\}$ converge uniformly to zero, i.e. for every
$\epsilon > 0$ there exists a positive integer $M_1$ such that for every $k \ge M_1$,

$$\sup_{\omega \in \Omega} |R_{k+1}(1)(\omega) - R_k(1)(\omega)| < \epsilon.$$

Proof of Claim 1. Fix $\omega \in \Omega$. Then (4.13) implies

$$|R_{k+1}(1)(\omega) - R_k(1)(\omega)| = \left| \frac{\{A_k(1)(\omega) + B_k(1)(\omega)\}\, R_k(1)(\omega) + \tau_{A,k+1}(1)(\omega)}{A_{k+1}(1)(\omega) + B_{k+1}(1)(\omega)} - R_k(1)(\omega) \right|$$
$$\le \frac{\bigl| \tau_{A,k+1}(1)(\omega) - \{\beta_{1,1}\, I\{V_{k+1}(\omega) = 1\} + \beta_{1,2}\, I\{V_{k+1}(\omega) = 2\}\}\, R_k(1)(\omega) \bigr|}{n_0 + k}$$
$$< \frac{2\,(\beta_{1,1} + \beta_{1,2})}{n_0 + k}. \tag{4.14}$$

The expression in (4.14) does not depend on $\omega$ and converges to 0 as $k \to \infty$. The
claim follows. □

Claim 2. Any path in $U_{R(1)}$ must visit $J$ infinitely often, i.e. for every $\omega \in U_{R(1)}$ and
positive integer $M_2$, there exists $k \ge M_2$ such that $R_k(1)(\omega) \in J$.

Proof of Claim 2. We prove the validity of the claim by contradiction. So, assume
that there exist $\omega_0 \in U_{R(1)}$ and a positive integer $M_2$ such that $R_k(1)(\omega_0) \notin J$, for all
$k \ge M_2$. Let $\epsilon = d - c$. Note that $\epsilon > 0$. Then, the previous assumption and Claim
1 imply that, for $M_3 = \max\{M_1, M_2\}$,

$$R_k(1)(\omega_0) \notin J \quad \text{and} \quad |R_{k+1}(1)(\omega_0) - R_k(1)(\omega_0)| < d - c, \quad \forall k \ge M_3.$$

Hence, if $R_{M_3}(1)(\omega_0) \le c$ then $R_{M_3+1}(1)(\omega_0) \le c$. Iterating this reasoning yields

$$R_{M_3}(1)(\omega_0) \le c \implies R_{M_3+k}(1)(\omega_0) \le c, \quad \forall k \ge 1. \tag{4.15}$$

Similarly,

$$R_{M_3}(1)(\omega_0) \ge d \implies R_{M_3+k}(1)(\omega_0) \ge d, \quad \forall k \ge 1. \tag{4.16}$$

Expressions (4.15) and (4.16) contradict the assumption that $\omega_0 \in U_{R(1)}$. Claim 2
follows. □

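Claim 3. Let $C$ be an event and let $I_C$ be the function defined on $\Omega$ as

$$I_C(\omega) = \begin{cases} 1 & \text{if } \omega \in C, \\ 0 & \text{if } \omega \in C^c. \end{cases}$$

Then

$$\lim_{k \to \infty} P_{(R_k(1),\, A_k(1) + B_k(1))}(U_{R(1)}) = I_{U_{R(1)}} \quad \text{a.s.}$$

Proof of Claim 3. Consider the $\sigma$-algebra

$$\mathcal{R}_k(1) = \sigma\{R_i(1), V_i : 1 \le i \le k\}.$$

Then $\mathcal{R}_k(1) \uparrow \sigma\bigl(\bigcup_k \mathcal{R}_k(1)\bigr) = \mathcal{R}_\infty(1)$ and $U_{R(1)} \in \mathcal{R}_\infty(1)$. So, Lévy's 0-1 Law
implies that

$$\lim_{k \to \infty} P_{(r_0, n_0)}\bigl(U_{R(1)} \mid \mathcal{R}_k(1)\bigr) = I_{U_{R(1)}} \quad \text{a.s.}$$

On the other hand,

$$P_{(r_0, n_0)}\bigl(U_{R(1)} \mid \mathcal{R}_k(1)\bigr) = P_{(r_0, n_0)}\bigl(U_{R(1)} \mid R_1(1), V_1, \ldots, R_k(1), V_k\bigr) = P_{(R_k(1),\, A_k(1) + B_k(1))}(U_{R(1)}).$$

Claim 3 follows. □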
Now, let.
G = {.11 E Q 3 3111310131111111111111111111111 11(1(UR1) = 111111,(W)}-

By Claim 3, P(,O,,,O)(G) = 1. Let

G1 = {111 E Ufa” 1 [$131.10 11111111111111 41(1)(w)+B1(l) (1111 1(UR< 11) - 0}
and

32 = {111 E UR<11 : 3320 P(1110111111.11141111111131(11(1111 (011(1)) = 1}-
Then, G: G1 U 62 and G1 0 G2: (11. By assumption, PM, n0)(UR(1)) > 0. Hence

P(ro1no)(Gll S P(ro1no)(Ufa(1)) < 1
and, consequently,
P(To,n0)(G2) > 0

Consider a ﬁxed 1110 E 02. Fix 6 > 0 and a positive integer M. By (4.13) there exists

a positive integer 1111 such that

A1(1)(110) + 3.01%) >111, v11 2 111,. (4.17)

103

Since too E G2, there exists a positive integer .1113 such that

P(Rk(1)(~00)1r111(1)(‘~110)+131~(1)(uJ0)) (1711(1)) 2 1 —' 1% WC 2 1113- (418)
Let Mg : ma.r{.-l[, MI, .113}. Then, Claim 2, (4.17) and (4.18) imply that there

exists an integer M4 2 Mo such that

R1114 (1)(w'0) E J,

AM4(1)(L00) + B,r([4(1)(w0) >111,
and

PlRM4(1)(wo),A114(1)(1110)+B,,,4(1)(w0)) (U1{(1)) 2 1 — 6.

Consider any such M4. Let so = RM,(1)(wo) and no :2 AM,(1)(wo) +BM4(1)(wo). The

lemma follows. I

For each $v^* \in V$, let $T(v^*) = \{T_k(v^*), k \ge 1\}$ and $Q(v^*) = \{Q_k(v^*), k \ge 1\}$
denote urn $v^*$ processes with urn functions $g_{v^*}$ and $h_{v^*}$, and initial urn compositions
$(t_0(v^*), n_0(v^*))$ and $(q_0(v^*), n_0(v^*))$, respectively. Recall that when an argument or
statement refers only to the urn 1 function and the initial composition of urn 1 we
omit the subscript $v^*$ ($v^* = 1$) from the urn function and the argument $v^*$ from the
initial urn composition.

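Lemma 4.3.3 If $g(v, r_1, r_2) = r_1$, for all $(v, r_1, r_2) \in V \times [0,1] \times [0,1]$, then there exists a random variable $T$ such that
$$\lim_{k \to \infty} T_k(1) = T \quad \text{a.s.} \tag{4.19}$$
$$E_{(t_0,n_0)}(T) = t_0 \tag{4.20}$$
$$\lim_{n_0 \to \infty} \sup_{t_0 \in [0,1]} E_{(t_0,n_0)}(T - t_0)^2 = 0 \tag{4.21}$$
and, for every $\epsilon > 0$
$$P_{(t_0,n_0)}\left\{\sup_{k \ge 1} |T_k(1) - t_0| \ge \epsilon\right\} \le \frac{1}{\epsilon^2}\, E_{(t_0,n_0)}(T - t_0)^2. \tag{4.22}$$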
121
Proof. For simplicity, omit the subscript $(t_0, n_0)$ when referring to expectations or probabilities in this proof. To distinguish between the several urn processes now being considered, denote by $\mathcal{T}_{g,A,k}(1)$ the number of A balls added to urn 1 after observing the response of patient $k$, when the urn function is $g$. Hence (see (4.11))
$$\mathcal{T}_{g,A,k}(1) = \beta_{1,1}\, I\{U_k \le I\{V_k = 1\}\, g(V_k, T_{k-1}(1), T_{k-1}(2))\} + \beta_{1,2}\, I\{U_k \le I\{V_k = 2\}\, g(V_k, T_{k-1}(1), T_{k-1}(2))\}. \tag{4.23}$$
Since the total number of balls in urn 1 at any given stage does not depend on the urn function, we can write
$$T_k(1) = \frac{(A_{k-1}(1) + B_{k-1}(1))\, T_{k-1}(1) + \mathcal{T}_{g,A,k}(1)}{A_k(1) + B_k(1)}.$$
Now, the assumption on $g$ and (4.23) imply that
$$E[\mathcal{T}_{g,A,k}(1) \mid \mathcal{G}_{k-1}] = \beta_{1,1}\, I\{V_k = 1\}\, T_{k-1}(1) + \beta_{1,2}\, I\{V_k = 2\}\, T_{k-1}(1) \quad \text{a.s.} \tag{4.24}$$
Hence,
$$E[T_k(1) \mid \mathcal{G}_{k-1}] = \frac{(A_{k-1}(1) + B_{k-1}(1))\, T_{k-1}(1) + (\beta_{1,1}\, I\{V_k = 1\} + \beta_{1,2}\, I\{V_k = 2\})\, T_{k-1}(1)}{A_k(1) + B_k(1)} = T_{k-1}(1) \quad \text{a.s.}$$
So, $\{T_k(1), \mathcal{G}_k : k \ge 1\}$ is a martingale. Furthermore, $|T_k(1)| \le 1$ for all $k \ge 1$. By the $L^2$-martingale convergence theorem, $T_k(1)$ converges almost surely (to an almost surely finite random variable, $T$) as $k \to \infty$. So (4.19) holds.

Since $\{T_k(1) : k \ge 1\}$ is bounded (and hence uniformly integrable), (4.19) implies that
$$\lim_{k \to \infty} E(T_k(1)) = E(T).$$
But, since $\{T_k(1), \mathcal{G}_k : k \ge 1\}$ is a martingale,
$$E(T_k(1)) = E(T_0(1)) = t_0, \quad \forall k \ge 1.$$
Hence (4.20) holds.

Since $\{(T_k(1) - t_0)^2 : k \ge 1\}$ is uniformly integrable, (4.19) also implies that
$$\lim_{k \to \infty} E(T_k(1) - t_0)^2 = E(T - t_0)^2. \tag{4.25}$$
Now, because $\{T_k(1), \mathcal{G}_k : k \ge 1\}$ is a martingale,
$$E(T_k(1) - t_0)^2 = E\left[\sum_{i=1}^{k} (T_i(1) - T_{i-1}(1))\right]^2 = \sum_{i=1}^{k} E(T_i(1) - T_{i-1}(1))^2.$$
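Fix $i \in \{1, \ldots, k\}$. We can write
$$T_i(1) - T_{i-1}(1) = \frac{\mathcal{T}_{g,A,i}(1) - (\beta_{1,1}\, I\{V_i = 1\} + \beta_{1,2}\, I\{V_i = 2\})\, T_{i-1}(1)}{A_i(1) + B_i(1)}.$$
The assumptions on the $\beta$'s and (4.13) imply that
$$\begin{aligned}
(n_0 + i)^2\, (T_i(1) - T_{i-1}(1))^2 \le{}& (\mathcal{T}_{g,A,i}(1))^2 - 2\, \mathcal{T}_{g,A,i}(1)\, (\beta_{1,1}\, I\{V_i = 1\} + \beta_{1,2}\, I\{V_i = 2\})\, T_{i-1}(1) \\
&+ (\beta_{1,1}^2\, I\{V_i = 1\} + \beta_{1,2}^2\, I\{V_i = 2\})\, T_{i-1}^2(1).
\end{aligned}$$
Now, a similar argument to the one used to obtain (4.24) yields
$$E\left[(\mathcal{T}_{g,A,i}(1))^2 \mid \mathcal{G}_{i-1}\right] = \beta_{1,1}^2\, I\{V_i = 1\}\, T_{i-1}(1) + \beta_{1,2}^2\, I\{V_i = 2\}\, T_{i-1}(1)$$
and
$$E\left[\mathcal{T}_{g,A,i}(1)\, (\beta_{1,1}\, I\{V_i = 1\} + \beta_{1,2}\, I\{V_i = 2\})\, T_{i-1}(1) \mid \mathcal{G}_{i-1}\right] = (\beta_{1,1}^2\, I\{V_i = 1\} + \beta_{1,2}^2\, I\{V_i = 2\})\, (T_{i-1}(1))^2.$$
Thus,
$$E\left[(T_i(1) - T_{i-1}(1))^2 \mid \mathcal{G}_{i-1}\right] \le \frac{\beta_{1,1}^2\, I\{V_i = 1\} + \beta_{1,2}^2\, I\{V_i = 2\}}{(n_0 + i)^2}\; T_{i-1}(1)\, (1 - T_{i-1}(1)).$$
Since $V_i$ and $T_{i-1}(1)$ are independent, then
$$\begin{aligned}
E\left[(T_i(1) - T_{i-1}(1))^2\right] &= E\left\{E\left[(T_i(1) - T_{i-1}(1))^2 \mid \mathcal{G}_{i-1}\right]\right\} \\
&\le \frac{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))}{(n_0 + i)^2}\; E\left[T_{i-1}(1)\, (1 - T_{i-1}(1))\right] \\
&\le \frac{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))}{(n_0 + i)^2}.
\end{aligned}$$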

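Therefore, for any $k \ge 1$ and $t_0 \in [0,1]$,
$$\begin{aligned}
E(T_k(1) - t_0)^2 &\le \sum_{i=1}^{k} \frac{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))}{(n_0 + i)^2} \\
&\le \left\{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))\right\} \sum_{i=1}^{\infty} \frac{1}{(n_0 + i)^2} \\
&\le \left\{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))\right\} \sum_{i=1}^{\infty} \frac{1}{i\, (n_0 + i)} \\
&= \left\{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))\right\} \frac{1}{n_0} \sum_{i=1}^{\infty} \left(\frac{1}{i} - \frac{1}{n_0 + i}\right) \\
&= \left\{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))\right\} \frac{1}{n_0} \sum_{i=1}^{n_0} \frac{1}{i}.
\end{aligned}$$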

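Hence,
$$\sup_{t_0 \in [0,1]} E(T - t_0)^2 = \sup_{t_0 \in [0,1]} \lim_{k \to \infty} E(T_k(1) - t_0)^2 \le \left\{\beta_{1,1}^2\, c(1) + \beta_{1,2}^2\, (1 - c(1))\right\} \frac{1}{n_0} \sum_{i=1}^{n_0} \frac{1}{i}. \tag{4.26}$$
By Cesàro's Lemma, the right-hand side of (4.26) converges to 0 as $n_0 \to \infty$. So (4.21) holds.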
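The behaviour of the bound in (4.26) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the values chosen for $\beta_{1,1}$, $\beta_{1,2}$ and $c(1)$ are illustrative only. It verifies in exact arithmetic that the partial sums $\sum_{i=1}^{k} (n_0+i)^{-2}$ never exceed the harmonic average $n_0^{-1} \sum_{i=1}^{n_0} i^{-1}$, and evaluates the right-hand side of (4.26) for increasing $n_0$.

```python
from fractions import Fraction


def tail_sum(n0: int, k: int) -> Fraction:
    """Exact value of sum_{i=1}^{k} 1/(n0+i)^2 (hypothetical helper)."""
    return sum((Fraction(1, (n0 + i) ** 2) for i in range(1, k + 1)), Fraction(0))


def harmonic_average(n0: int) -> Fraction:
    """Exact value of (1/n0) * sum_{i=1}^{n0} 1/i, the factor appearing in (4.26)."""
    return sum((Fraction(1, i) for i in range(1, n0 + 1)), Fraction(0)) / n0


def bound_4_26(n0: int, beta11: float, beta12: float, c1: float) -> float:
    """Right-hand side of (4.26), evaluated in floating point."""
    harmonic = sum(1.0 / i for i in range(1, n0 + 1))
    return (beta11 ** 2 * c1 + beta12 ** 2 * (1.0 - c1)) * harmonic / n0


if __name__ == "__main__":
    # Illustrative values only: beta_{1,1} = 1, beta_{1,2} = 2, c(1) = 0.5.
    for n0 in (1, 10, 100, 1000, 10000):
        print(n0, bound_4_26(n0, beta11=1.0, beta12=2.0, c1=0.5))
```

The printed values decrease towards 0 as $n_0$ grows, in line with the Cesàro argument, although the decay is only of order $\log(n_0)/n_0$.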
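Finally, since $\{(T_k(1) - t_0)^2, \mathcal{G}_k : k \ge 1\}$ is a submartingale, Doob's inequality implies that for any $m \ge 1$
$$P\left\{\max_{1 \le k \le m} |T_k(1) - t_0| \ge \epsilon\right\} \le \frac{1}{\epsilon^2}\, E(T_m(1) - t_0)^2. \tag{4.27}$$
But, as $m \to \infty$,
$$\left\{\max_{1 \le k \le m} |T_k(1) - t_0| \ge \epsilon\right\} \uparrow \left\{\sup_{k \ge 1} |T_k(1) - t_0| \ge \epsilon\right\}$$
and by continuity
$$\lim_{m \to \infty} P\left\{\max_{1 \le k \le m} |T_k(1) - t_0| \ge \epsilon\right\} = P\left\{\sup_{k \ge 1} |T_k(1) - t_0| \ge \epsilon\right\}. \tag{4.28}$$
So, (4.22) follows from (4.25), (4.27) and (4.28). $\blacksquare$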

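Lemma 4.3.4 If $h(v, r_1, r_2) = r_1$, for all $(v, r_1, r_2) \in V \times I \times [0,1]$, then
$$\lim_{n_0 \to \infty} \sup_{s_0 \in J} P_{(s_0,n_0)}\left\{\{Q_k(1), k \ge 1\} \text{ visits } I^c\right\} = 0. \tag{4.29}$$

Proof. Let $\eta = \min\{c - a, b - d\}$. Note that $\eta > 0$. For each $s_0 \in J$, define $\Omega_{\eta,s_0}$ as
$$\Omega_{\eta,s_0} = \left\{\sup_{k \ge 1} |T_k(1) - s_0| < \eta\right\}.$$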

Suppose that the urn 1 process $\mathcal{T}(1)$ has urn 1 function $g$ satisfying $g(v, r_1, r_2) = r_1$ for all $(v, r_1, r_2) \in V \times [0,1] \times [0,1]$, and the initial urn 1 composition is $(s_0, n_0)$. Then Lemma 4.3.3 implies that
$$P_{(s_0,n_0)}(\Omega_{\eta,s_0}) \ge 1 - \frac{E_{(s_0,n_0)}(T - s_0)^2}{\eta^2} \tag{4.30}$$
and
$$\lim_{n_0 \to \infty} \sup_{s_0 \in [0,1]} E_{(s_0,n_0)}(T - s_0)^2 = 0. \tag{4.31}$$
Fix $\epsilon > 0$. Then by (4.30) and (4.31) there exists a positive integer $M$ such that
$$P_{(s_0,n_0)}(\Omega_{\eta,s_0}) \ge 1 - \epsilon, \quad \forall n_0 \ge M,\ s_0 \in J.$$
Consider any such $M$. Fix $n_0 \ge M$ and $s_0 \in J$. Suppose that $(s_0, n_0) = (q_0(1), n_0(1))$ is the initial urn 1 composition for the urn 1 processes $\mathcal{T}(1)$ and $\mathcal{Q}(1)$. Recall that $\mathcal{T}_{g,A,k}(1)$ and $\mathcal{T}_{h,A,k}(1)$ denote the number of A balls added to urn 1 after observing the response of patient $k$, when the urn 1 function is $g$ and $h$, respectively. Also,
$$\mathcal{T}_{h,A,k}(1) = \beta_{1,1}\, I\{U_k \le I\{V_k = 1\}\, h(V_k, Q_{k-1}(1), Q_{k-1}(2))\} + \beta_{1,2}\, I\{U_k \le I\{V_k = 2\}\, h(V_k, Q_{k-1}(1), Q_{k-1}(2))\}, \tag{4.32}$$
$$Q_k(1) = \frac{(A_{k-1}(1) + B_{k-1}(1))\, Q_{k-1}(1) + \mathcal{T}_{h,A,k}(1)}{A_k(1) + B_k(1)}, \tag{4.33}$$
$$\mathcal{T}_{g,A,k}(1) = \beta_{1,1}\, I\{U_k \le I\{V_k = 1\}\, g(V_k, T_{k-1}(1), T_{k-1}(2))\} + \beta_{1,2}\, I\{U_k \le I\{V_k = 2\}\, g(V_k, T_{k-1}(1), T_{k-1}(2))\} \tag{4.34}$$
and
$$T_k(1) = \frac{(A_{k-1}(1) + B_{k-1}(1))\, T_{k-1}(1) + \mathcal{T}_{g,A,k}(1)}{A_k(1) + B_k(1)}. \tag{4.35}$$

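Now, fix $\omega \in \Omega_{\eta,s_0}$. Since $s_0 \in J \subset I$, the assumptions on $g$ and $h$ imply that
$$g(V_1(\omega), s_0, t_0(2)) = s_0 = h(V_1(\omega), s_0, q_0(2)).$$
Then, by (4.32) and (4.34)
$$\mathcal{T}_{g,A,1}(1)(\omega) = \mathcal{T}_{h,A,1}(1)(\omega)$$
and by (4.33) and (4.35)
$$T_1(1)(\omega) = Q_1(1)(\omega).$$
Since $\omega \in \Omega_{\eta,s_0}$ and $\eta = \min\{c - a, b - d\}$ then
$$Q_1(1)(\omega) = T_1(1)(\omega) \in I.$$
Hence
$$g(V_2(\omega), T_1(1)(\omega), T_1(2)(\omega)) = T_1(1)(\omega) = Q_1(1)(\omega) = h(V_2(\omega), Q_1(1)(\omega), Q_1(2)(\omega))$$
and, as before, (4.32), (4.34), (4.33), (4.35) and the choices of $\omega$ and $\eta$ yield
$$Q_2(1)(\omega) = T_2(1)(\omega) \in I.$$
Continuing in this way we conclude that
$$Q_k(1)(\omega) = T_k(1)(\omega) \in I, \quad \forall k \ge 1$$
and, consequently, $\{Q_k(1)(\omega) : k \ge 1\}$ does not visit $I^c$. So, for every $n_0 \ge M$ and $s_0 \in J$,
$$P_{(s_0,n_0)}\left\{\{Q_k(1), k \ge 1\} \text{ visits } I^c\right\} \le P_{(s_0,n_0)}\left(\Omega_{\eta,s_0}^c\right) \le \epsilon,$$
and the lemma follows. $\blacksquare$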

We can now prove the theorem.
Proof of Theorem 4.3.2. Recall that $\{R_k(1), k \ge 1\}$ is the urn 1 process with urn 1 function $f$ and initial urn 1 composition $(r_0, n_0)$. To prove that $R_k(1)$ converges almost surely as $k \to \infty$ it is sufficient to show that
$$P_{(r_0,n_0)}(U_{R(1)}) = 0, \quad \forall I = (a, b) \subset [0,1]. \tag{4.36}$$
Suppose, to the contrary, that (4.36) does not hold. Let $I_0 = (a_0, b_0) \subset [0,1]$ be such that
$$P_{(r_0,n_0)}(U_{R(1)}) > 0. \tag{4.37}$$
Now, for every non-degenerate interval $I_1 \subseteq I_0$, $v \in V$ and $r_2 \in [0,1]$, there exists $r_1 \in I_1$ such that $f(v, r_1, r_2) \ne r_1$. So, a non-degenerate interval $I_1 \subseteq I_0$ can always be found so that, either
$$f(v, r_1, r_2) < r_1, \quad \forall (v, r_1, r_2) \in V \times I_1 \times [0,1] \tag{4.38}$$
or
$$f(v, r_1, r_2) > r_1, \quad \forall (v, r_1, r_2) \in V \times I_1 \times [0,1]. \tag{4.39}$$

Let us construct one such interval $I_1$. It follows from (4.10), the urn 1 function for the urn 1 process $\{R_k(1), k \ge 1\}$, that
$$f(1, r_1, r_2) \lessgtr r_1 \iff r_1 \gtrless \frac{q_B(1)}{q_A(1) + q_B(1)},$$
$$f(2, r_1, r_2) \lessgtr r_1 \iff r_1 \gtrless r_2\, (p_A(2) - q_B(2)) + q_B(2).$$

As $r_2$ varies from 0 to 1, the expression $r_2\, (p_A(2) - q_B(2)) + q_B(2)$ varies from $\min\{p_A(2), q_B(2)\}$ to $\max\{p_A(2), q_B(2)\}$. Let
$$m = \min\left\{\frac{q_B(1)}{q_A(1) + q_B(1)},\ p_A(2),\ q_B(2)\right\},$$
$$M = \max\left\{\frac{q_B(1)}{q_A(1) + q_B(1)},\ p_A(2),\ q_B(2)\right\}.$$
Recall that $I_0 = (a_0, b_0)$. Note that

(1) if $M \le a_0$, then (4.38) holds for a choice $I_1 = I_0$;

(2) if $m \ge b_0$, then (4.39) holds for a choice $I_1 = I_0$;

(3) if $M \in (a_0, b_0)$, then (4.38) holds for a choice $I_1 = (M, b_0)$;

(4) if $m \in (a_0, b_0)$, then (4.39) holds for a choice $I_1 = (a_0, m)$.
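The construction of $I_1$ can be illustrated with a short sketch. It is offered under stated assumptions rather than as the author's procedure: the function names are hypothetical, and the form used for $f(1, \cdot)$, namely $p_A(1)\, r_1 + q_B(1)(1 - r_1)$, is inferred from the equivalence for $f(1, r_1, r_2)$ stated above, since (4.10) is not reproduced in this part of the text. The sketch applies conditions (1) to (4) in the order listed and returns `None` when none of them is met.

```python
def urn_function(v, r1, r2, p_a, q_b):
    """Assumed urn 1 function f(v, r1, r2); p_a and q_b are dicts keyed by stratum."""
    if v == 1:
        return p_a[1] * r1 + q_b[1] * (1.0 - r1)
    return r2 * (p_a[2] - q_b[2]) + q_b[2]


def choose_subinterval(a0, b0, p_a, q_a, q_b):
    """Return (label, (lo, hi)), where label names the inequality, (4.38) or (4.39),
    that holds on the open interval (lo, hi) contained in I_0 = (a0, b0)."""
    threshold = q_b[1] / (q_a[1] + q_b[1])
    m = min(threshold, p_a[2], q_b[2])
    big_m = max(threshold, p_a[2], q_b[2])
    if big_m <= a0:              # case (1): f < r1 on all of I_0
        return "4.38", (a0, b0)
    if m >= b0:                  # case (2): f > r1 on all of I_0
        return "4.39", (a0, b0)
    if a0 < big_m < b0:          # case (3): f < r1 to the right of M
        return "4.38", (big_m, b0)
    if a0 < m < b0:              # case (4): f > r1 to the left of m
        return "4.39", (a0, m)
    return None                  # none of the four listed conditions applies


if __name__ == "__main__":
    # Illustrative values: p_A = (0.50, 0.60), p_B = (0.10, 0.85).
    p_a = {1: 0.50, 2: 0.60}
    q_a = {1: 0.50, 2: 0.40}
    q_b = {1: 0.90, 2: 0.15}
    print(choose_subinterval(0.5, 0.9, p_a, q_a, q_b))
```

For the illustrative values, $M = 0.9/1.4 \approx 0.643$ lies in $I_0 = (0.5, 0.9)$, so the sketch selects case (3).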

These are the only possible four cases for values of $m$ and $M$. To be precise, suppose that $I_1$ is such that (4.38) holds; the other case can be treated similarly. Let $J_1$ be a proper subinterval of $I_1$. Define an urn 1 function $h$ on $V \times [0,1] \times [0,1]$ by setting
$$h(v, r_1, r_2) = \begin{cases} f(v, r_1, r_2) & \text{if } r_1 \notin I_1 \\ r_1 & \text{if } r_1 \in I_1. \end{cases} \tag{4.40}$$
Let $\{Q_k(1), k \ge 1\}$ be the corresponding urn 1 process with initial urn 1 composition $(r_0, n_0)$. Since $\{\{R_k(1), k \ge 1\} \text{ upcrosses } I_1 \text{ infinitely often}\} \supseteq U_{R(1)}$ then (4.37) and Lemma 4.3.2 imply that
$$\lim_{n_0 \to \infty} \sup_{s_0 \in J_1} P_{(s_0,n_0)}\left\{\{R_k(1), k \ge 1\} \text{ upcrosses } I_1 \text{ infinitely often}\right\} = 1. \tag{4.41}$$
On the other hand, since $h(v, r_1, r_2) = r_1$ for all $r_1 \in I_1$, Lemma 4.3.4 implies that
$$\lim_{n_0 \to \infty} \sup_{s_0 \in J_1} P_{(s_0,n_0)}\left\{\{Q_k(1), k \ge 1\} \text{ visits } I_1^c\right\} = 0. \tag{4.42}$$

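Define $\Omega_R$ and $\Omega_Q$ as
$$\Omega_R = \{\omega \in \Omega : \{R_k(1)(\omega), k \ge 1\} \text{ upcrosses } I_1 \text{ infinitely often}\}$$
and
$$\Omega_Q = \{\omega \in \Omega : \{Q_k(1)(\omega), k \ge 1\} \text{ visits } I_1^c\}.$$
Suppose $\Omega_R \cap \Omega_Q^c \ne \emptyset$. Fix $\omega_0 \in \Omega_R \cap \Omega_Q^c$. Note that
$$\Omega_Q^c = \{\omega \in \Omega : Q_k(1)(\omega) \in I_1, \text{ for all } k \ge 1\}.$$
We now show, by induction, that
$$R_k(1)(\omega_0) \le Q_k(1)(\omega_0), \quad \forall k \ge 0. \tag{4.43}$$
Since the initial urn 1 composition is the same for both processes, the above relation trivially holds when $k = 0$. Suppose (induction hypothesis) that for $k \ge 1$
$$R_i(1)(\omega_0) \le Q_i(1)(\omega_0), \quad \forall i \in \{0, \ldots, k\}.$$
There are three possible cases to be considered.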
Case 1. $R_k(1)(\omega_0) \in I_1$.
In this case, (4.38), the induction hypothesis, the choice of $\omega_0 \in \Omega_Q^c$ and the definition of $h$ (see (4.40)) yield
$$f(V_{k+1}(\omega_0), R_k(1)(\omega_0), R_k(2)(\omega_0)) < R_k(1)(\omega_0) \le Q_k(1)(\omega_0) = h(V_{k+1}(\omega_0), Q_k(1)(\omega_0), Q_k(2)(\omega_0)).$$
Hence, $\mathcal{T}_{f,A,k+1}(1)(\omega_0) \le \mathcal{T}_{h,A,k+1}(1)(\omega_0)$ and, consequently
$$R_{k+1}(1)(\omega_0) \le Q_{k+1}(1)(\omega_0).$$

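Case 2. $R_k(1)(\omega_0) \notin I_1$ and $V_{k+1}(\omega_0) = 2$.
In this case, the definitions of $f$ and $h$ (see (4.10) and (4.40)), the choice of $\omega_0 \in \Omega_Q^c$ and the fact that if $r_1 \in I_1$ then $r_1 > r_2\, (p_A(2) - q_B(2)) + q_B(2)$, for any $r_2 \in [0,1]$, yield
$$\begin{aligned}
f(V_{k+1}(\omega_0), R_k(1)(\omega_0), R_k(2)(\omega_0)) &= f(2, R_k(1)(\omega_0), R_k(2)(\omega_0)) \\
&= R_k(2)(\omega_0)\, (p_A(2) - q_B(2)) + q_B(2) \\
&< Q_k(1)(\omega_0) = h(V_{k+1}(\omega_0), Q_k(1)(\omega_0), Q_k(2)(\omega_0)).
\end{aligned}$$
Hence, $\mathcal{T}_{f,A,k+1}(1)(\omega_0) \le \mathcal{T}_{h,A,k+1}(1)(\omega_0)$ and, consequently
$$R_{k+1}(1)(\omega_0) \le Q_{k+1}(1)(\omega_0).$$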

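Case 3. $R_k(1)(\omega_0) \notin I_1$ and $V_{k+1}(\omega_0) = 1$.
The assumption $\beta_{1,1} \le \beta_{1,2}$ is used only to prove this part of the theorem. Since $R_k(1)(\omega_0) \notin I_1$ and $Q_k(1)(\omega_0) \in I_1$, then the induction hypothesis implies that
$$R_k(1)(\omega_0) < Q_k(1)(\omega_0).$$
The number of A balls in urn 1 (for both $f$ and $h$ urn 1 functions) is increased at each stage by either 0, $\beta_{1,1}$ or $\beta_{1,2}$. This together with the induction hypothesis yields
$$(A_k(1)(\omega_0) + B_k(1)(\omega_0))\, R_k(1)(\omega_0) \le (A_k(1)(\omega_0) + B_k(1)(\omega_0))\, Q_k(1)(\omega_0) - \min\{\beta_{1,1}, \beta_{1,2}\}.$$
Hence, because $V_{k+1}(\omega_0) = 1$ and $\beta_{1,1} \le \beta_{1,2}$
$$\begin{aligned}
R_{k+1}(1)(\omega_0) &= \frac{(A_k(1)(\omega_0) + B_k(1)(\omega_0))\, R_k(1)(\omega_0) + \mathcal{T}_{f,A,k+1}(1)(\omega_0)}{A_{k+1}(1)(\omega_0) + B_{k+1}(1)(\omega_0)} \\
&\le \frac{(A_k(1)(\omega_0) + B_k(1)(\omega_0))\, R_k(1)(\omega_0) + \beta_{1,1}}{A_{k+1}(1)(\omega_0) + B_{k+1}(1)(\omega_0)} \\
&\le \frac{(A_k(1)(\omega_0) + B_k(1)(\omega_0))\, Q_k(1)(\omega_0) - \min\{\beta_{1,1}, \beta_{1,2}\} + \beta_{1,1}}{A_{k+1}(1)(\omega_0) + B_{k+1}(1)(\omega_0)} \\
&= \frac{(A_k(1)(\omega_0) + B_k(1)(\omega_0))\, Q_k(1)(\omega_0) + 0}{A_{k+1}(1)(\omega_0) + B_{k+1}(1)(\omega_0)} \\
&\le \frac{(A_k(1)(\omega_0) + B_k(1)(\omega_0))\, Q_k(1)(\omega_0) + \mathcal{T}_{h,A,k+1}(1)(\omega_0)}{A_{k+1}(1)(\omega_0) + B_{k+1}(1)(\omega_0)} \\
&= Q_{k+1}(1)(\omega_0).
\end{aligned}$$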

So (4.43) follows by the induction principle. We have proved that if $\omega_0 \in \Omega_Q^c$ then (4.43) holds. But then $\omega_0 \notin \Omega_R$. This contradicts the assumption $\Omega_R \cap \Omega_Q^c \ne \emptyset$. Hence, $\Omega_R \cap \Omega_Q^c = \emptyset$, i.e. $\Omega_R \subseteq \Omega_Q$. Now (4.41) and (4.42) imply that
$$1 = \lim_{n_0 \to \infty} \sup_{s_0 \in J_1} P_{(s_0,n_0)}(\Omega_R) \le \lim_{n_0 \to \infty} \sup_{s_0 \in J_1} P_{(s_0,n_0)}(\Omega_Q) = 0,$$
which is impossible. So assuming (4.37) yields a contradiction. Therefore (4.36) must hold. The result follows. $\blacksquare$

Theorem 4.3.3 below allows us to relax the constraints imposed on the components of $B$ (in Theorem 4.3.2) and still ensure that the proportion of type A balls in both urns converges almost surely.

Theorem 4.3.3 The following two conditions are equivalent.
(i) $R_k(1)$ converges almost surely as $k \to \infty$;
(ii) $R_k(2)$ converges almost surely as $k \to \infty$.

Proof. Suppose that there exists a random variable $R(1)$ such that
$$\lim_{k \to \infty} R_k(1) = R(1) \quad \text{a.s.} \tag{4.44}$$
Recall that $R_k(1) = A_k(1)/(A_k(1) + B_k(1))$. Lemma 4.3.1, (4.3), (4.4) and (4.5) imply that, for each $v^* \in V$
$$\frac{A_k(v^*) + B_k(v^*)}{k} = \frac{A_0(v^*) + B_0(v^*)}{k} + \beta_{v^*,1}\, \frac{N_k(1)}{k} + \beta_{v^*,2}\, \frac{N_k(2)}{k} \xrightarrow[k \to \infty]{} \beta_{v^*,1}\, c(1) + \beta_{v^*,2}\, c(2) \quad \text{a.s.} \tag{4.45}$$
We can write
$$\frac{A_k(1)}{k} = \frac{A_0(1)}{k} + \beta_{1,1}\, \frac{S_{A,k}(1) + N_{B,k}(1) - S_{B,k}(1)}{k} + \beta_{1,2}\, \frac{S_{A,k}(2) + N_{B,k}(2) - S_{B,k}(2)}{k}. \tag{4.46}$$
By (4.44) and (4.45), $A_k(1)/k$ converges a.s.; Lemma 4.3.1, Assumption (4.7), (4.8), Theorem 4.3.1, (4.44) and algebra imply that $(S_{A,k}(1) + N_{B,k}(1) - S_{B,k}(1))/k$ converges a.s. Hence, by (4.46), $(S_{A,k}(2) + N_{B,k}(2) - S_{B,k}(2))/k$ also converges a.s. as $k \to \infty$. Since
$$\frac{A_k(2)}{k} = \frac{A_0(2)}{k} + \beta_{2,1}\, \frac{S_{A,k}(1) + N_{B,k}(1) - S_{B,k}(1)}{k} + \beta_{2,2}\, \frac{S_{A,k}(2) + N_{B,k}(2) - S_{B,k}(2)}{k},$$
then $A_k(2)/k$ converges a.s. This together with (4.45) yields the almost sure convergence of $A_k(2)/(A_k(2) + B_k(2))$, as $k \to \infty$.

A similar argument proves the other part of the theorem. $\blacksquare$

By Theorem 4.3.2 and Theorem 4.3.3 a sufficient condition for the almost sure convergence of the proportion of type A balls in both urns is that either $1 \le \beta_{1,1} \le \beta_{1,2}$ or $1 \le \beta_{2,2} \le \beta_{2,1}$.

In future work we will try to establish this result for $r > 2$ and matrices $B$ with one zero component in each row. Simulations (see Section 4.5) do suggest convergence of the proportion of type A balls in the urns for these types of matrices.

We conclude this section by giving an expression for the limiting proportion of patients allocated to treatment A within each of two strata. So, suppose conditions on the components of the matrix $B$ are satisfied that ensure the existence of random variables $\xi(v)$ such that for each $v \in V = \{1, 2\}$,
$$\lim_{k \to \infty} \frac{A_k(v)}{A_k(v) + B_k(v)} = \xi(v) \quad \text{a.s.} \tag{4.47}$$
Fix $v \in V$. Assumption (4.7) and (4.8) imply that
$$\lim_{k \to \infty} \frac{N_{A,k}(v)}{N_k(v)} = \xi(v) \quad \text{a.s.} \tag{4.48}$$
As in the proof of Theorem 4.3.3
$$\lim_{k \to \infty} \frac{A_k(v) + B_k(v)}{k} = \beta_{v,1}\, c(1) + \beta_{v,2}\, (1 - c(1)) \quad \text{a.s.} \tag{4.49}$$
Theorem 4.3.1, Lemma 4.3.1, (4.48), (4.4) and algebra yield
$$\lim_{k \to \infty} \frac{A_k(v)}{k} = \beta_{v,1}\, c(1)\, \{p_A(1)\, \xi(1) + q_B(1)\, (1 - \xi(1))\} + \beta_{v,2}\, (1 - c(1))\, \{p_A(2)\, \xi(2) + q_B(2)\, (1 - \xi(2))\} \quad \text{a.s.} \tag{4.50}$$
Take the ratio of (4.50) and (4.49) and use (4.47) to get, for each $v \in V$
$$\xi(v)\, \{\beta_{v,1}\, c(1) + \beta_{v,2}\, (1 - c(1))\} = \beta_{v,1}\, c(1)\, \{p_A(1)\, \xi(1) + q_B(1)\, (1 - \xi(1))\} + \beta_{v,2}\, (1 - c(1))\, \{p_A(2)\, \xi(2) + q_B(2)\, (1 - \xi(2))\} \quad \text{a.s.} \tag{4.51}$$

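Write (4.51) for $v = 1$ and for $v = 2$. Solve the resulting system of two equations for $\xi(1)$ and $\xi(2)$. We have shown that, if the proportions of patients allocated to treatment A within strata 1 and 2 converge a.s., they converge a.s. to constants $\xi(1)$ and $\xi(2)$, respectively, which depend on $c(1)$, $B$ and the failure probabilities within strata. Explicit expressions for these limiting constants are given below. First, denote by $\det(B)$ the determinant of the matrix $B$ and let
$$\begin{aligned}
D ={}& \det(B)\, c(1)\, (1 - c(1))\, (q_A(1) + q_B(1))\, (q_A(2) + q_B(2)) \\
&+ \{\beta_{1,1}\, c(1) + \beta_{1,2}\, (1 - c(1))\}\, \beta_{2,1}\, c(1)\, (q_A(1) + q_B(1)) \\
&+ \{\beta_{2,1}\, c(1) + \beta_{2,2}\, (1 - c(1))\}\, \beta_{1,2}\, (1 - c(1))\, (q_A(2) + q_B(2)).
\end{aligned}$$
Then
$$\begin{aligned}
D \cdot \xi(1) ={}& \det(B)\, c(1)\, (1 - c(1))\, (q_A(2) + q_B(2))\, q_B(1) \\
&+ \{\beta_{1,1}\, c(1) + \beta_{1,2}\, (1 - c(1))\}\, \beta_{2,1}\, c(1)\, q_B(1) \\
&+ \{\beta_{2,1}\, c(1) + \beta_{2,2}\, (1 - c(1))\}\, \beta_{1,2}\, (1 - c(1))\, q_B(2)
\end{aligned}$$
and
$$\begin{aligned}
D \cdot \xi(2) ={}& \det(B)\, c(1)\, (1 - c(1))\, (q_A(1) + q_B(1))\, q_B(2) \\
&+ \{\beta_{1,1}\, c(1) + \beta_{1,2}\, (1 - c(1))\}\, \beta_{2,1}\, c(1)\, q_B(1) \\
&+ \{\beta_{2,1}\, c(1) + \beta_{2,2}\, (1 - c(1))\}\, \beta_{1,2}\, (1 - c(1))\, q_B(2).
\end{aligned}$$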
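The closed-form limits can be checked against the system (4.51) numerically. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical, and the expression used for $D$ is the one obtained by solving (4.51) for $v = 1, 2$ by Cramer's rule. It evaluates $\xi(1)$ and $\xi(2)$ and then substitutes them back into both equations of (4.51).

```python
def limiting_proportions(beta, c1, p_a, p_b):
    """Closed-form xi(1), xi(2); beta is [[b11, b12], [b21, b22]],
    p_a and p_b hold the success probabilities for strata 1 and 2."""
    c2 = 1.0 - c1
    q_a = [1.0 - p for p in p_a]
    q_b = [1.0 - p for p in p_b]
    s = [beta[v][0] * c1 + beta[v][1] * c2 for v in range(2)]   # beta_{v,1}c(1)+beta_{v,2}(1-c(1))
    q = [q_a[i] + q_b[i] for i in range(2)]                      # q_A(i) + q_B(i)
    det_b = beta[0][0] * beta[1][1] - beta[0][1] * beta[1][0]
    d = (det_b * c1 * c2 * q[0] * q[1]
         + s[0] * beta[1][0] * c1 * q[0]
         + s[1] * beta[0][1] * c2 * q[1])
    common = s[0] * beta[1][0] * c1 * q_b[0] + s[1] * beta[0][1] * c2 * q_b[1]
    xi1 = (det_b * c1 * c2 * q[1] * q_b[0] + common) / d
    xi2 = (det_b * c1 * c2 * q[0] * q_b[1] + common) / d
    return xi1, xi2


def residuals_4_51(beta, c1, p_a, p_b, xi):
    """Left-hand side minus right-hand side of (4.51) for v = 1 and v = 2."""
    c2 = 1.0 - c1
    q_b = [1.0 - p for p in p_b]
    inner = [p_a[i] * xi[i] + q_b[i] * (1.0 - xi[i]) for i in range(2)]
    out = []
    for v in range(2):
        lhs = xi[v] * (beta[v][0] * c1 + beta[v][1] * c2)
        rhs = beta[v][0] * c1 * inner[0] + beta[v][1] * c2 * inner[1]
        out.append(lhs - rhs)
    return out


if __name__ == "__main__":
    beta = [[1.0, 2.0], [3.0, 1.0]]          # illustrative matrix B
    xi = limiting_proportions(beta, 0.3, (0.50, 0.60), (0.10, 0.85))
    print(xi, residuals_4_51(beta, 0.3, (0.50, 0.60), (0.10, 0.85), xi))
```

When $B$ is the identity matrix the formulas reduce to $\xi(v) = q_B(v)/(q_A(v) + q_B(v))$, the familiar limit of the randomized play-the-winner rule applied separately within each stratum.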

 

4.4 Central Limit Theorem

The arguments used to prove Theorem 3.4.1 would allow us to show that, if the proportion of type A balls in urn $v \in V = \{1, 2, \ldots, r\}$ converges almost surely as the trial size converges to $\infty$, then the strongly consistent estimators of success probabilities $p_A(v)$ and $p_B(v)$ within stratum $v$ are asymptotically independent and normally distributed.

Theorem 4.4.1 For each $v \in V$, if $R_k(v)$ converges almost surely as $k \to \infty$ then
$$\begin{pmatrix} \sqrt{N_{A,k}(v)}\, (\hat{p}_{A,k}(v) - p_A(v)) \\ \sqrt{N_{B,k}(v)}\, (\hat{p}_{B,k}(v) - p_B(v)) \end{pmatrix} \xrightarrow{D} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} p_A(v)\, q_A(v) & 0 \\ 0 & p_B(v)\, q_B(v) \end{pmatrix} \right)$$
as $k \to \infty$.

4.5 Evaluation of the Design

For the same reasons mentioned when studying the CAWD design, the CRPW rule
is only evaluated in the case r = 2.

Although we have not yet been able to prove convergence results (to be more precise, Theorem 4.3.2) when at least one of the elements of the matrix $B$ is 0, simulations support that convergence holds in those cases. Simulations also suggest that the best ethical choices of $B$, according to the values of $q_A(v)$ and $q_B(v)$ ($v \in V$), correspond to cases where in each row of $B$, one of the $\beta$'s is 0.

Let $B_S$ be the matrix corresponding to allocating patients according to the RPW rule within strata,

$B_S =$

Let $B_1$ be the matrix corresponding to allocating patients by using only information on the responses of patients previously treated in stratum 1 when allocating patients in any of the two strata,

Finally, let $B_2$ be the matrix corresponding to allocating patients by using only information on the responses of patients previously treated in stratum 2 when allocating patients in any of the two strata,

For each $v \in V$, let $\Gamma(v) = q_A(v)/q_B(v)$. Simulations indicate that the best choices in terms of ethical allocation are as follows.

• If $(\Gamma(1) - 1)(\Gamma(2) - 1) \le 0$ then $B_S$ yields the best ethical choice.

• If $(\Gamma(1) - 1)(\Gamma(2) - 1) > 0$ and $|\Gamma(1) - 1| > |\Gamma(2) - 1|$ then $B_1$ yields the best ethical choice.

• If $(\Gamma(1) - 1)(\Gamma(2) - 1) > 0$ and $|\Gamma(1) - 1| < |\Gamma(2) - 1|$ then $B_2$ yields the best ethical choice.

Note that $(\Gamma(1) - 1)(\Gamma(2) - 1) < 0$ means that A is better than B for treating patients in stratum 1 (respectively 2) but B is better than A for treating patients in stratum 2 (respectively 1). So, similarly to what was proved for the CAWD design, we again observe that in this case, and from an ethical point of view, there should be no crossover of information from stratum to stratum.
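The selection rule suggested by the simulations can be written as a small function. This is a sketch of the rule as stated above, with a hypothetical function name; the case $(\Gamma(1) - 1)(\Gamma(2) - 1) > 0$ with $|\Gamma(1) - 1| = |\Gamma(2) - 1|$ is not covered by the stated rule, so the sketch returns `None` there.

```python
def best_ethical_matrix(p_a, p_b):
    """Return the label of the matrix suggested by the simulation-based rule.

    p_a, p_b: success probabilities (stratum 1, stratum 2) for treatments A and B.
    Gamma(v) = q_A(v) / q_B(v), with q = 1 - p; requires p_b[v] < 1.
    """
    gamma = [(1.0 - p_a[v]) / (1.0 - p_b[v]) for v in range(2)]
    product = (gamma[0] - 1.0) * (gamma[1] - 1.0)
    if product <= 0.0:
        return "B_S"          # no crossover of information between strata
    d1 = abs(gamma[0] - 1.0)
    d2 = abs(gamma[1] - 1.0)
    if d1 > d2:
        return "B_1"          # use stratum 1 responses for both urns
    if d1 < d2:
        return "B_2"          # use stratum 2 responses for both urns
    return None               # tie: not covered by the stated rule


if __name__ == "__main__":
    # The three parameter settings used in the simulation study.
    for p in ((0.50, 0.10, 0.60, 0.85), (0.65, 0.10, 0.40, 0.15), (0.35, 0.10, 0.60, 0.15)):
        print(p, best_ethical_matrix((p[0], p[2]), (p[1], p[3])))
```

For the three settings of $(p_A(1), p_B(1), p_A(2), p_B(2))$ used in the simulation study, the rule selects $B_S$, $B_1$ and $B_2$, respectively.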

Monte Carlo simulations were used to evaluate the CRPW rule from the individualistic point of view (how ethical are the assignments of patients to treatments?) and from the utilitarian point of view (how good are the estimators of the treatment differences within strata, $p_A(1) - p_B(1)$ and $p_A(2) - p_B(2)$?).

These questions were addressed by looking, respectively, at the proportion of patients successfully treated and at the empirical mean squared error within strata obtained as a result of CRPW($C_0$, $B$) allocations.

Comparisons were made between complete randomization within strata and the CRPW rule with one ball of each type initially in each of the two urns, i.e. with $C_0 = \{(\tfrac{1}{2}, 2), (\tfrac{1}{2}, 2), (\tfrac{1}{2}, 2), (\tfrac{1}{2}, 2)\}$, and matrices $B_S$, $B_1$ and $B_2$.

Note that $B_S$ and $B_1$ yield the same mean squared errors in stratum 1 and that $B_S$ and $B_2$ yield the same mean squared errors in stratum 2. Therefore, in the figures corresponding to mean squared errors within strata, only three designs are represented.

The following labels are used in Figures 4.1 through 4.18.
• r = complete randomization within strata;
• f = CRPW($C_0$, $B_1$);
• s = CRPW($C_0$, $B_2$);
• w = CRPW($C_0$, $B_S$) = RPW(1, 0, 1) rule within strata.

Figures 4.1 through 4.18 show the results of 10,000 replications of clinical trials with sample sizes $n = 30$ and 150, success probabilities $(p_A(1), p_B(1), p_A(2), p_B(2)) =$ (0.35, 0.10, 0.60, 0.15), (0.50, 0.10, 0.60, 0.85) and (0.65, 0.10, 0.40, 0.15) and a range of values for $c(1)$ from 0.20 through 0.80. For each of the allocation policies considered and each combination of values for $n$, $(p_A(1), p_B(1), p_A(2), p_B(2))$ and $c(1)$, the proportion of successes is computed as the average, over 10,000 replications, of the proportion of patients successfully treated in the simulated trial; the empirical mean squared error within stratum $v$ ($v \in \{1, 2\}$) is computed as the average, over 10,000 replications, of the squared difference between the estimates of $(p_A(v) - p_B(v))$ and the parameter.
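A simulation of this kind can be sketched as follows. This is a simplified illustration, not the code used to produce the figures. It assumes that a patient in stratum $v$ is assigned to A with probability equal to the current proportion of type A balls in urn $v$, and that a success on A or a failure on B in stratum $v$ adds $\beta_{u,v}$ type A balls to each urn $u$ (otherwise type B balls), which agrees with the decomposition of $A_k(1)$ in (4.46). The function names, the seeding scheme and the identity matrix used in the example are hypothetical choices.

```python
import random


def simulate_crpw(n, c1, p_a, p_b, beta, seed=0):
    """Simulate one trial of n patients; return (proportion of successes, A balls, B balls).

    c1: probability that a patient belongs to stratum 1.
    p_a, p_b: success probabilities (stratum 1, stratum 2) for treatments A and B.
    beta: 2x2 matrix; beta[u][v] balls are added to urn u after a stratum-v response.
    """
    rng = random.Random(seed)
    a = [1.0, 1.0]   # type A balls in urns 1 and 2 (one ball of each type initially)
    b = [1.0, 1.0]   # type B balls in urns 1 and 2
    successes = 0
    for _ in range(n):
        v = 0 if rng.random() < c1 else 1                 # stratum of the patient
        on_a = rng.random() < a[v] / (a[v] + b[v])        # draw from urn v
        success = rng.random() < (p_a[v] if on_a else p_b[v])
        successes += int(success)
        add_type_a = (on_a and success) or (not on_a and not success)
        for u in range(2):
            if add_type_a:
                a[u] += beta[u][v]
            else:
                b[u] += beta[u][v]
    return successes / n, a, b


def mean_success_proportion(reps, n, c1, p_a, p_b, beta, seed=0):
    """Average proportion of successes over independent replications."""
    total = 0.0
    for r in range(reps):
        prop, _, _ = simulate_crpw(n, c1, p_a, p_b, beta, seed=seed + r)
        total += prop
    return total / reps


if __name__ == "__main__":
    # Hypothetical choice of B: each urn is updated only by its own stratum.
    beta_within = [[1.0, 0.0], [0.0, 1.0]]
    print(mean_success_proportion(1000, 30, 0.5, (0.50, 0.60), (0.10, 0.85), beta_within))
```

The empirical mean squared error within a stratum could be obtained in the same way, by also recording the numbers of patients and successes per treatment and stratum in each replication.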

In Figures 4.1 through 4.6 the success probabilities satisfy $(\Gamma(1) - 1)(\Gamma(2) - 1) < 0$. Simulations show that the overall proportion of patients successfully treated is larger for the CRPW($C_0$, $B_S$) rule than for complete randomization within strata and the CRPW rule with the other two choices of $B$ matrices. As seen in Figures 4.2, 4.3, 4.5 and 4.6, the CRPW($C_0$, $B_S$) rule has the best performance when estimating the treatment difference in stratum 1, but has the very worst when estimating the treatment difference in stratum 2.

In Figures 4.7 through 4.12 the success probabilities satisfy $(\Gamma(1) - 1)(\Gamma(2) - 1) > 0$ and $|\Gamma(1) - 1| > |\Gamma(2) - 1|$. Simulations show that the overall proportion of patients successfully treated is larger for the CRPW($C_0$, $B_1$) rule than for complete randomization within strata and the CRPW rule with the other two choices of $B$ matrices. The best ethical choice for $B$ yields a good performance when estimating the treatment difference in stratum 1 both for small and large sample trials.

In Figures 4.13 through 4.18 the success probabilities satisfy $(\Gamma(1) - 1)(\Gamma(2) - 1) > 0$ and $|\Gamma(1) - 1| < |\Gamma(2) - 1|$. Simulations show that the overall proportion of patients successfully treated is larger for the CRPW($C_0$, $B_2$) rule than for complete randomization within strata and the CRPW rule with the other two choices of $B$ matrices. The best ethical choice for $B$ yields a good performance when estimating the treatment difference in stratum 1 both for small and large sample trials.

[Plot: Proportion of Successes versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.1: Comparisons in terms of proportion of successes for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$.

[Plot: Mean Squared Error in Stratum 1 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.2: Comparisons in terms of mean squared error in stratum 1 for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$.

[Plot: Mean Squared Error in Stratum 2 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.3: Comparisons in terms of mean squared error in stratum 2 for $n = 30$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$.

[Plot: Proportion of Successes versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.4: Comparisons in terms of proportion of successes for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$.

[Plot: Mean Squared Error in Stratum 1 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.5: Comparisons in terms of mean squared error in stratum 1 for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$.

[Plot: Mean Squared Error in Stratum 2 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.6: Comparisons in terms of mean squared error in stratum 2 for $n = 150$, $p_A(1) = 0.50$, $p_B(1) = 0.10$, $p_A(2) = 0.60$ and $p_B(2) = 0.85$.

[Plot: Proportion of Successes versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.7: Comparisons in terms of proportion of successes for $n = 30$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 1 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.8: Comparisons in terms of mean squared error in stratum 1 for $n = 30$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 2 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.9: Comparisons in terms of mean squared error in stratum 2 for $n = 30$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$.

[Plot: Proportion of Successes versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.10: Comparisons in terms of proportion of successes for $n = 150$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 1 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.11: Comparisons in terms of mean squared error in stratum 1 for $n = 150$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 2 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.12: Comparisons in terms of mean squared error in stratum 2 for $n = 150$, $p_A(1) = 0.65$, $p_B(1) = 0.10$, $p_A(2) = 0.40$ and $p_B(2) = 0.15$.

[Plot: Proportion of Successes versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.13: Comparisons in terms of proportion of successes for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 1 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.14: Comparisons in terms of mean squared error in stratum 1 for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 2 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.15: Comparisons in terms of mean squared error in stratum 2 for $n = 30$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$.

[Plot: Proportion of Successes versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.16: Comparisons in terms of proportion of successes for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 1 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.17: Comparisons in terms of mean squared error in stratum 1 for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$.

[Plot: Mean Squared Error in Stratum 2 versus $c(1)$, in two panels, for $c(1)$ from 0.20 to 0.50 and from 0.50 to 0.80.]

Figure 4.18: Comparisons in terms of mean squared error in stratum 2 for $n = 150$, $p_A(1) = 0.35$, $p_B(1) = 0.10$, $p_A(2) = 0.65$ and $p_B(2) = 0.15$.

Appendix A

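Theorem A.0.1 (Hall and Heyde (1980)) Let $\{M_k = \sum_{i=1}^{k} W_i, \mathcal{H}_k : k \ge 1\}$ be a martingale and $\{T_k : k \ge 1\}$ be a nondecreasing sequence of positive random variables such that $T_k$ is $\mathcal{H}_{k-1}$-measurable for each $k \ge 1$. Then
$$\lim_{k \to \infty} \frac{M_k}{T_k} = 0 \quad \text{a.s.}$$
on the set
$$\left\{\lim_{k \to \infty} T_k = \infty,\ \sum_{k=1}^{\infty} \frac{1}{T_k^2}\, E(W_k^2 \mid \mathcal{H}_{k-1}) < \infty\right\}.$$

Proof. See Theorem 2.18 in Hall and Heyde (1980). $\blacksquare$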

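Theorem A.0.2 (Hall and Heyde (1980)) Let $\{M_{k,i}, \mathcal{H}_{k,i} : 1 \le i \le n_k, k \ge 1\}$ be a zero-mean and square integrable martingale array with differences $\{W_{k,i}\}$, and let $\sigma^2$ be an almost surely finite random variable. Suppose that
$$\max_{1 \le i \le n_k} |W_{k,i}| \xrightarrow{p} 0, \tag{A.1}$$
$$\sum_{i=1}^{n_k} W_{k,i}^2 \xrightarrow{p} \sigma^2, \tag{A.2}$$
$$E\left(\max_{1 \le i \le n_k} W_{k,i}^2\right) \text{ is bounded in } k, \tag{A.3}$$
and the $\sigma$-algebras are nested:
$$\mathcal{H}_{k,i} \subseteq \mathcal{H}_{k+1,i}, \quad \text{for } 1 \le i \le n_k,\ k \ge 1. \tag{A.4}$$
Then $M_{k,n_k} = \sum_{i=1}^{n_k} W_{k,i} \xrightarrow{d} N(0, \sigma^2)$.
Proof. See Theorem 3.2 in Hall and Heyde (1980). $\blacksquare$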

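Theorem A.0.3 (Hall and Heyde (1980)) Let $\{\sum_{i=1}^{k} W_i, \mathcal{H}_k : k \ge 1\}$ be a martingale and define for each $k \ge 1$ and each $i = 1, \ldots, k$, $W_{k,i} = W_i\, I\{|W_i| \le k\}$. Suppose that, as $k \to \infty$
$$\sum_{i=1}^{k} P(|W_i| > k) \to 0, \tag{A.5}$$
$$\frac{1}{k} \sum_{i=1}^{k} E(W_{k,i} \mid \mathcal{H}_{i-1}) \xrightarrow{p} 0, \tag{A.6}$$
and
$$\frac{1}{k^2} \sum_{i=1}^{k} \left\{E W_{k,i}^2 - E\left[E(W_{k,i} \mid \mathcal{H}_{i-1})\right]^2\right\} \to 0. \tag{A.7}$$
Then $k^{-1} \sum_{i=1}^{k} W_i \xrightarrow{p} 0$, as $k \to \infty$.

Proof. See Theorem 2.13 in Hall and Heyde (1980). $\blacksquare$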

Bibliography

Athreya, K. B. and S. Karlin (1967). Limit theorems for the split times of branching processes. J. Math. Mech. 17, 257–277.

Atkinson, A. C. (1982). Optimum biased coin designs for sequential clinical trials with prognostic factors. Biometrika 69(1), 61–67.

Atkinson, A. C. (1998). Optimum experimental designs for chemical kinetics and clinical trials. In New developments and applications in experimental design, pp. 36–49. Institute of Mathematical Statistics.

Bai, Z. D. and F. Hu (1999). Asymptotic theorems for urn models with nonhomogeneous generating matrices. Stochastic Process. Appl. 80(1), 87–101.

Ball, F. G., A. F. M. Smith, and I. Verdinelli (1993). Biased coin designs with a Bayesian bias. J. Statist. Plann. Inference 34(3), 403–421.

Berry, D. A. and B. Fristedt (1985). Bandit problems. Chapman & Hall, London-New York. Sequential allocation of experiments.

Blackwell, D. and J. L. Hodges, Jr. (1957). Design for the control of selection bias. Ann. Math. Statist. 28, 449–460.

Clayton, M. K. (1989). Covariate models for Bernoulli bandits. Sequential Anal. 8(4), 405–426.

Cornell, R. G., B. D. Landenberger, and R. H. Bartlett (1986). Randomized play-the-winner clinical trials. Communications in Statistics, Part A, Theory and Methods 15, 159–178.

Efron, B. (1971). Forcing a sequential experiment to be balanced. Biometrika 58, 403–417.

Eisele, J. R. (1990). An adaptive biased coin design for the Behrens-Fisher problem. Sequential Anal. 9(4), 343–359 (1991).

Hall, P. and C. C. Heyde (1980). Martingale limit theory and its application. New York: Academic Press Inc. [Harcourt Brace Jovanovich Publishers]. Probability and Mathematical Statistics.

Hill, B. M., D. Lane, and W. Sudderth (1980). A strong law for some generalized urn processes. Ann. Probab. 8(2), 214–226.

Pocock, S. J. and R. Simon (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31, 103–115.

Sarkar, J. (1991). One-armed bandit problems with covariates. Ann. Statist. 19(4), 1978–2002.

Sibson, R. (1974). $D_A$-optimality and duality. pp. 677–692. Colloq. Math. Soc. János Bolyai, Vol. 9.

Silvey, S. D. (1980). Optimal design. London: Chapman & Hall. An introduction to the theory for parameter estimation, Monographs on Applied Probability and Statistics.

Tamura, R. N., D. E. Faries, J. S. Andersen, and J. H. Heiligenstein (1994). A case study of an adaptive clinical trial in the treatment of out-patients with depressive disorder. J. Amer. Statist. Assoc. 89(427), 768–776.

Wei, L. J. (1978). The adaptive biased coin design for sequential experiments. Ann. Statist. 6, 92–100.

Wei, L. J. and S. Durham (1978). The randomized play-the-winner rule in medical trials. J. Amer. Statist. Assoc. 73(364), 840–843.

Woodroofe, M. (1979). A one-armed bandit problem with a concomitant variable. J. Amer. Statist. Assoc. 74(368), 799–806.

Woodroofe, M. (1982). Sequential allocation with covariates. Sankhyā Ser. A 44(3), 403–414.

Wu, C.-F. (1981). Iterative construction of nearly balanced assignments. I. Categorical covariates. Technometrics 28(1), 37–44.

Zelen, M. (1974). The randomization and stratification of patients to clinical trials. Journal of Chronic Diseases 27, 365–375.