ABSTRACT

A MULTIVARIATE APPROACH TO TESTING FOR HOMOGENEITY OF CORRELATED PROPORTIONS

By

Barry M. Katz

A design which frequently appears in behavioral science research is one in which a single group of subjects is classified into unordered categories by multiple raters. A test of homogeneity of classification among the raters can be used to determine rater bias. Analyzing attitudinal change over time, over objects, or over treatments is also frequently of concern in the behavioral sciences. These illustrations are specific cases of the more general mixed categorical data model of order-d, which is characterized by a situation in which n randomly chosen subjects from some homogeneous population are measured on an r-valued categorical dependent variable under d different conditions. The hypothesis of general interest is whether the distribution of responses is the same under the d different conditions. Koch and Reinfurt (1971) stated that such a hypothesis best examines the relative effects of the d conditions on the dependent measure of interest. The probable lack of independence among measures or samples precludes the valid use of such standard techniques as the chi-square test of homogeneity. A technique which accounts for the correlated nature of the resulting distributions is needed if the hypothesis of homogeneity is to be tested validly.

The probability model associated with the mixed categorical data model of order-d is an r x r x ... x r contingency table of d dimensions whose cell frequencies are characterized by a multinomial distribution with $r^d$ cell probabilities and a sample size of n. Testing the hypothesis of homogeneity of the d correlated distributions then reduces to testing for marginal homogeneity in the d-dimensional table. Four different statistical approaches to the problem of testing for homogeneity of the marginal distributions in the mixed categorical data model were examined and compared along theoretical lines.
The four approaches examined were the statistic of Stuart (1955), a quadratic form in the differences of the marginal proportions; the statistic of Madansky (1963), based on the likelihood ratio criterion; the $X^2_{GSK}$ statistic of Koch and Reinfurt (1971), based on weighted least squares; and the statistic of Ireland et al. (1969), based on minimum discrimination information estimation. All four approaches were shown to belong to the same general class of large sample chi-square statistics. Each of the techniques was shown to be based on the use of best asymptotically normal (BAN) estimators, and the four techniques were shown to be asymptotically equivalent. An explicit algebraic relationship between Stuart's statistic and $X^2_{GSK}$ was also demonstrated.

A fifth statistic, $I^2$, algebraically equal to $X^2_{GSK}$, was proposed. The $I^2$ statistic has the advantage that, unlike the $X^2_{GSK}$ statistic, it does not require a knowledge of linear models for its understanding. The limiting distribution of $I^2$ was shown to be chi-square with (d-1)(r-1) degrees of freedom. A detailed set of computational formulas was derived, and a program written in Fortran IV to calculate the $I^2$ statistic was given.

The development of two different techniques for generating confidence intervals for contrasts involving the marginal proportions was also given. One of the procedures, a simultaneous procedure, was developed along the lines of the results of Scheffé (1959) and Goodman (1964). A second technique was developed based on the Bonferroni inequality.

The behavior of the $I^2$ statistic in the finite sample situation was examined by the method of simulation. The data were generated in groups of 2000 samples of a given size (n). For this study the values of n were chosen to yield average expected cell frequencies ($\bar{n}$) of 3, 5, 10, 20, 40, and 60.
For each of the 3 x 3, 4 x 4, and 5 x 5 contingency tables, five different null distributions and three different non-null distributions were considered. For the 3 x 3 x 3 table, four null and three non-null distributions were considered. Empirical estimates of the actual significance level and power of the $I^2$ procedure were found by counting the number of rejections out of 2000 using the theoretical cutoff levels of α = .01, .05, .10 of the central chi-square distribution and were compared to their respective theoretical values. Empirical estimates of the actual significance level and power of $I^{2*}$, a slight variant of the $I^2$ statistic, were also found. The $I^2$ and $I^{2*}$ statistics are related by an equation involving n [equation illegible in source].

It was found that for $\bar{n} \geq 10$ both statistics approximated their respective asymptotic behaviors quite well. When $\bar{n} = 5$, for those null distributions in which most of the off-diagonal cell expectations were two or greater, $I^2$ was quite liberal while $I^{2*}$ approximated its limiting distribution quite closely. For null distributions in which most of the off-diagonal cell expectations were one or less, $I^{2*}$ was extremely conservative while $I^2$ was slightly conservative. For $\bar{n} = 3$, values of actual alpha were found to be more extreme than those found at $\bar{n} = 5$. In cases where the $I^2$ procedure was found to be liberal at $\bar{n} = 5$, $I^2$ was more liberal at $\bar{n} = 3$, and in cases where the $I^2$ procedure was found to be conservative at $\bar{n} = 5$, $I^2$ was more conservative at $\bar{n} = 3$. The $I^{2*}$ statistic was more conservative at $\bar{n} = 3$ than it was at $\bar{n} = 5$. The $I^2$ procedure was found to be somewhat more powerful than the $I^{2*}$ procedure for $\bar{n} = 5$. A series of guidelines was set forth based on the findings of the simulation study.

A MULTIVARIATE APPROACH TO TESTING FOR HOMOGENEITY OF CORRELATED PROPORTIONS

By

Barry M.
Katz

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Personnel Services, and Educational Psychology

1975

TO THE MEMORY OF MY FATHER

ACKNOWLEDGMENTS

I would like to thank the members of my guidance committee, who were most supportive of me throughout my graduate studies.

Dr. John Wagner has always given me much encouragement and provided me with a very stimulating introduction to mathematics education. I am most grateful.

Dr. Dennis Gilliland was most helpful on the dissertation, and I was greatly impressed with him both as a teacher and as a person.

Dr. William Schmidt has played an integral part in my professional development both as a teacher and as a friend. I have greatly enjoyed my association with him on both a professional and an informal basis.

Dr. Maryellen McSweeney, who has served as my advisor, committee chairman, teacher, and friend, has been an inspiration to me as a teacher, researcher, and person. I cannot begin to thank her for all that she has done for me. Dr. McSweeney gave unselfishly of her time and efforts, which were far beyond the call of duty. I only hope that I can live up to the example which she has set for me.

I would also like to thank Sidney Sytsma for his development of the random number generator and his helpful explanations concerning the generator.

In addition I would like to thank Noralee Burkhardt for typing the dissertation and for her genuine pleasantness throughout the project. She has been most enjoyable to work with.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter

I INTRODUCTION

II TESTING FOR HOMOGENEITY
MULTIVARIATE VERSUS UNIVARIATE APPROACH
Likelihood Ratio Statistic
Minimum Discrimination Information Statistic
Weighted Least Squares Approach
COMPARING TEST PROCEDURES
Neyman and BAN Estimators
Some Results by Bhapkar
$\chi^2$ Statistic and Maximum Likelihood Estimators
Explicit Relationship Among Statistics

III THE $I^2$ STATISTIC AND ASSOCIATED POST HOC PROCEDURES
THE $I^2$ STATISTIC
Model and Hypothesis
Large Sample Distribution of $I^2$ Under $H_0$
Invariance of $I^2$
Computational Form of $I^2$
Asymptotic Power of $I^2$
TECHNIQUES FOR ISOLATING SOURCES OF SIGNIFICANCE
Scheffé-type Solution
Bonferroni-type Solution
DATA EXAMPLE

IV A MONTE CARLO STUDY OF THE $I^2$ STATISTIC
RELATED RESEARCH
DESIGN PARAMETERS USED IN THE INVESTIGATION
Contingency Tables Examined
Distributions Considered Under the Null Hypothesis
Distributions Considered Under the Alternative Hypothesis
DATA GENERATION
Random Number Generator
Generation of Discrete Valued Random Variables
Analysis Routines
SIMULATION RESULTS
Occurrence of Singularities
Estimates of Actual Alpha for the $I^2$ Statistic
Estimates of Actual Alpha for the $I^{2*}$ Statistic
Estimates of Actual Power for the $I^2$ Procedure
Estimates of Actual Power for the $I^{2*}$ Procedure
CONCLUSIONS AND IMPLICATIONS OF SIMULATION RESULTS

V SUMMARY AND SUGGESTIONS FOR FURTHER RESEARCH
SUMMARY
SUGGESTIONS FOR FURTHER RESEARCH

Appendices

A PROOF OF THEOREM 2.1
B PROOF OF THEOREM 2.2
C NULL DISTRIBUTIONS FOR TWO DIMENSIONAL TABLES
D NULL DISTRIBUTIONS FOR THREE DIMENSIONAL TABLES
E NON-NULL DISTRIBUTIONS FOR TWO DIMENSIONAL TABLES
F NON-NULL DISTRIBUTIONS FOR THREE DIMENSIONAL TABLES
G PROGRAM GEN
H PROGRAM XSTAT

BIBLIOGRAPHY
General References

LIST OF TABLES

Table

1-1 Data Illustration for the Chi-Square Test of Homogeneity
3-1 Laumann's Social Interaction Data
3-2 Scheffé-like Confidence Intervals at α = .05 and Bonferroni Confidence Intervals with α = .005 per Contrast
4-1 Cases Investigated Under the Null Hypothesis
4-2 Cases Investigated Under the Alternative Hypothesis
4-3 Illustrative Discrete Random Variable
4-4 Computer Memory Scheme for Discrete Generation
4-5 Goodness of Fit Tests for Null Distributions
4-6 Goodness of Fit Tests for Non-Null Distributions
4-7 Number of Singularities in 2000 Samples
4-8 Monte Carlo Estimates of Exact Level of $I^2$ Test Associated with Nominal 1% Level
4-9 Monte Carlo Estimates of Exact Level of $I^2$ Test Associated with Nominal 5% Level
4-10 Monte Carlo Estimates of Exact Level of $I^2$ Test Associated with Nominal 10% Level
4-11 Number and Percentage (out of 19 Different Distributions) of $\hat{\alpha}$ Within 95% Confidence Limits of the Corresponding Nominal α
4-12 Monte Carlo Estimates of Exact Level of $I^{2*}$ Test Associated with Nominal 1% Level
4-13 Monte Carlo Estimates of Exact Level of $I^{2*}$ Test Associated with Nominal 5% Level
4-14 Monte Carlo Estimates of Exact Level of $I^{2*}$ Test Associated with Nominal 10% Level
4-15 Number and Percentage (out of 19 Different Distributions) of $\hat{\alpha}$ Within 95% Confidence Limits of the Corresponding Nominal α for the $I^{2*}$ Statistic
4-16 Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^2$ Test Associated with the Nominal 1% Level
4-17 Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^2$ Test Associated with the Nominal 5% Level
4-18 Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^2$ Test Associated with the Nominal 10% Level
4-19 Number and Percentage (out of 12 Different Distributions) of Empirical Power Values Beyond the Upper Limits of the 95% Confidence Intervals for Their Respective Asymptotic Values for Nominal α of .01, .05, .10
4-20 Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^{2*}$ Test Associated with the 1% Level
4-21 Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^{2*}$ Test Associated with the 5% Level
4-22 Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^{2*}$ Test Associated with the Nominal 10% Level
4-23 Number and Percentage (out of 12 Different Distributions) of Empirical Power Values of $I^{2*}$ Beyond the 95% Confidence Limits of Their Asymptotic Values for Nominal α of .01, .05, .10
4-24 Estimates of Actual Power of the $I^2$ and $I^{2*}$ Procedures for Tests Performed at the Nominal 5% Level

LIST OF FIGURES

Figure

4-1 Two Correlated Distributions
C-1 Class-A Null Distributions
C-2 Class-B Null Distributions
C-3 Class-C Null Distributions
C-4 Class-D Null Distributions
C-5 Class-E Null Distributions
D-1 The 3 x 3 x 3 - A Distribution
D-2 The 3 x 3 x 3 - B Distribution
D-3 The 3 x 3 x 3 - C Distribution
D-4 The 3 x 3 x 3 - D Distribution
E-1 Class-F Distributions
E-2 Class-G Distributions
E-3 Class-H Distributions
F-1 The 3 x 3 x 3 - I Distribution
F-2 The 3 x 3 x 3 - J Distribution
F-3 The 3 x 3 x 3 - K Distribution

CHAPTER I

INTRODUCTION

Questions about the homogeneity of multiple populations are common in behavioral research. When continuous variables are used, the questions may focus on population homogeneity with respect to mean level of performance or variability in performance. When categorical variables are used, the questions deal with the equivalence of the populations with respect to a variable of classification. If multiple populations have been sampled independently and categorized independently with respect to the classification variable, then standard data analytic techniques can be used to determine whether the respective populations are homogeneous.
On the other hand, if the samples have been matched (either by pairing or blocking of subjects across the samples or by repeated measurement on a single sample of subjects), the corresponding data analytic techniques to examine the question of homogeneity are not well known in the behavioral sciences, nor have their small sample properties been investigated. This dissertation develops data analysis techniques to examine the question of homogeneity in matched populations, studies the small sample properties of these techniques, and compares these techniques with other competing approaches that have recently appeared in the statistical literature.

To orient the reader to the design and analysis techniques presented in this thesis, the independent sample and the matched sample designs are briefly compared and the need for a separate matched sample test is explained.

The chi-square test of homogeneity is a categorical data analytic technique used to analyze data from two-way contingency tables in which one margin is fixed and the other margin is not. The corresponding experimental situation involves choosing a random sample of subjects from each of d different underlying populations and classifying each subject into one of r categories. The sample sizes chosen need not be equal, but there must be independence both within and between the samples. The hypothesis of interest is whether the d populations are homogeneous with respect to the r categories of the variable of classification. The chi-square test of homogeneity is analogous in some respects to the one-way ANOVA used in the analysis of continuous data. A data example to illustrate the chi-square test of homogeneity follows.

A survey of community needs was conducted in a western Michigan county. The county was subdivided into four geographic regions of varying size, and samples of size 23, 25, 93, and 235 were randomly chosen from the adult populations of the respective four regions.
Each respondent was asked to rate the degree to which he felt it was necessary to solve racial problems. The three categories of response were: (1) urgent, (2) important, (3) unimportant. The survey yielded the data given in Table 1-1.

TABLE 1-1
Data Illustration for the Chi-Square Test of Homogeneity

Regions         (1) Urgent   (2) Important   (3) Unimportant   Total
Region A (1)         2            11               10             23
Region B (2)         9            13                3             25
Region C (3)        20            46               27             93
Region D (4)        86           104               45            235
Total              117           174               85            376

The row totals for these data are fixed quantities corresponding to the sample sizes used in the study, while the column totals are random, with their size determined by the respondents' replies. The hypothesis of interest is whether the four regions are homogeneous with respect to the distribution on the dependent variable. If $P_{qj}$ denotes the probability of someone from region q responding with category j, then the null hypothesis of homogeneity can be stated in the form

$$H_0: P_{1j} = P_{2j} = P_{3j} = P_{4j} \quad \text{for } j = 1,2,3. \tag{1-1}$$

More generally, if independent random samples are taken from d different populations and each subject is classified into one of r possible categories, the null hypothesis of homogeneity of the resulting d distributions can be written as

$$H_0: P_{1j} = P_{2j} = \cdots = P_{dj} \quad \text{for } j = 1,2,\ldots,r. \tag{1-2}$$

Let $n_{qj}$ denote the number of observations from sample q which fall in category j. Let $n_q$, $n_{0j}$, $n$, $\hat{P}_{0j}$, and $\hat{P}_{qj}$ be defined as follows:

$$n_q = \sum_{j=1}^{r} n_{qj} \quad \text{for } q = 1,2,\ldots,d \tag{1-3}$$

$$n_{0j} = \sum_{q=1}^{d} n_{qj} \quad \text{for } j = 1,2,\ldots,r \tag{1-4}$$

$$n = \sum_{q=1}^{d} n_q \tag{1-5}$$

$$\hat{P}_{0j} = \frac{n_{0j}}{n} \quad \text{for } j = 1,2,\ldots,r \tag{1-6}$$

$$\hat{P}_{qj} = \frac{n_{qj}}{n_q} \quad \text{for } j = 1,2,\ldots,r;\ q = 1,2,\ldots,d \tag{1-7}$$

where $n_q$ is the size of the qth sample, $n_{0j}$ is the total number of observations falling into category j from the combined samples, and n is the size of the combined samples.
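As a rough illustration of definitions (1-3) through (1-7), the following Python sketch computes the sample sizes, category totals, and the two sets of proportion estimates for the data of Table 1-1. The variable names (`counts`, `sample_sizes`, `category_totals`, `pooled`, `per_sample`) are illustrative choices made here and do not come from the dissertation.

```python
# Cell counts n_qj from Table 1-1: one row per region, columns are
# (urgent, important, unimportant).
counts = {
    "A": [2, 11, 10],
    "B": [9, 13, 3],
    "C": [20, 46, 27],
    "D": [86, 104, 45],
}

# (1-3): n_q, the size of the qth sample (row total).
sample_sizes = {q: sum(row) for q, row in counts.items()}

# (1-4): n_0j, the number of observations in category j over all samples.
category_totals = [sum(col) for col in zip(*counts.values())]

# (1-5): n, the size of the combined samples.
n = sum(sample_sizes.values())

# (1-6): pooled estimate of the probability of category j,
# computed as if the null hypothesis (1-2) were true.
pooled = [n_0j / n for n_0j in category_totals]

# (1-7): separate estimate of the category probabilities for each sample.
per_sample = {
    q: [n_qj / sample_sizes[q] for n_qj in row] for q, row in counts.items()
}

print(sample_sizes, category_totals, n)
```

The pooled estimates use every observation, whereas the per-sample estimates use only the observations from one region; the test statistic introduced next compares the two.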
The formula in (1-6) is an estimate of the probability of an observation falling into category j under the assumption that this probability is the same for each of the populations; that is, $\hat{P}_{0j}$ is computed under the assumption that the null hypothesis in (1-2) is true. The formula in (1-7) is an estimate of the probability that an observation from population q will fall into category j. A statistic which can be used to test the hypothesis given in (1-2) is

$$C^2 = \sum_{q=1}^{d} \sum_{j=1}^{r} \frac{(n_{qj} - n_q \hat{P}_{0j})^2}{n_q \hat{P}_{0j}}. \tag{1-8}$$

The statistic in (1-8) has the general structure

$$C^2 = \sum_{q=1}^{d} \sum_{j=1}^{r} \frac{(O_{qj} - E_{qj})^2}{E_{qj}} \tag{1-9}$$

where $O_{qj}$ represents the actual number of observations in the qth sample falling into category j and $E_{qj}$ represents the number of observations expected to fall into category j for the qth sample under the assumption of the null hypothesis given in (1-2). Large values of the $C^2$ statistic imply that the actual cell frequencies vary widely from those expected under the null hypothesis. Thus the null hypothesis in (1-2) is rejected for large values of the $C^2$ statistic. When the sample sizes are suitably large, it is known that the $C^2$ statistic has approximately a chi-square distribution with (d-1)(r-1) degrees of freedom under the null hypothesis. For tests performed at the α level of significance, $H_0$ is rejected whenever the value of the $C^2$ statistic exceeds the (1-α)th quantile of the chi-square distribution with (d-1)(r-1) degrees of freedom.

An alternate way of writing the statistic $C^2$, which sometimes appears in the literature, is given in (1-10):

$$C^2 = \sum_{q=1}^{d} \sum_{j=1}^{r} \frac{n_q (\hat{P}_{qj} - \hat{P}_{0j})^2}{\hat{P}_{0j}}. \tag{1-10}$$

To use the chi-square distribution validly as the reference distribution for the $C^2$ statistic, one of the assumptions which must be made is that responses across samples are made independently. One of the reasons for this assumption is that the $C^2$ statistic does not make any allowances for covariation between random variables of the form $\hat{P}_{qj}$ and $\hat{P}_{q'k}$ for q ≠ q'.
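The $C^2$ statistic in (1-8) through (1-10) can be sketched in a few lines of Python. This is a minimal illustration under the independent-samples assumption, applied to the data of Table 1-1; the function name `chi_square_homogeneity` is hypothetical, and the critical value 12.592 is the .95 quantile of the chi-square distribution with 6 degrees of freedom as given in standard tables.

```python
def chi_square_homogeneity(table):
    """Return (C2 via (1-9), C2 via (1-10), degrees of freedom).

    `table` is a list of rows, one row of category counts per sample.
    """
    row_totals = [sum(row) for row in table]        # n_q, (1-3)
    col_totals = [sum(col) for col in zip(*table)]  # n_0j, (1-4)
    n = sum(row_totals)                             # (1-5)
    pooled = [c / n for c in col_totals]            # P-hat_0j, (1-6)

    c2_observed_expected = 0.0
    c2_proportions = 0.0
    for row, n_q in zip(table, row_totals):
        for n_qj, p_0j in zip(row, pooled):
            expected = n_q * p_0j                   # E_qj under H0
            c2_observed_expected += (n_qj - expected) ** 2 / expected
            # Same term written with proportions, as in (1-10).
            c2_proportions += n_q * (n_qj / n_q - p_0j) ** 2 / p_0j

    df = (len(table) - 1) * (len(col_totals) - 1)   # (d-1)(r-1)
    return c2_observed_expected, c2_proportions, df


table_1_1 = [[2, 11, 10], [9, 13, 3], [20, 46, 27], [86, 104, 45]]
c2, c2_alt, df = chi_square_homogeneity(table_1_1)

# Tabled .95 quantile of chi-square with 6 degrees of freedom.
CRITICAL_VALUE_05_DF6 = 12.592
reject_h0 = c2 > CRITICAL_VALUE_05_DF6
print(round(c2, 3), df, reject_h0)
```

The two accumulators agree up to rounding error, which is a quick numerical check that (1-10) is only a rewriting of (1-8).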
In effect, the $C^2$ statistic assumes that

$$\mathrm{Cov}(\hat{P}_{qj}, \hat{P}_{q'k}) = 0 \quad \text{for all } q \neq q';\ j,k = 1,2,\ldots,r. \tag{1-11}$$

If responses between samples are not independent, the random variables of the form $\hat{P}_{qj}$ and $\hat{P}_{q'k}$ will be correlated. Since the $C^2$ statistic makes no provision for this correlation, an invalid test results.

The chi-square test of homogeneity is considered for the special case where d = r = 2 to illustrate the need for between-sample independence. The 2 x 2 table corresponds to an experimental situation in which samples of sizes $n_1$ and $n_2$ are randomly selected from two respective underlying populations. The subjects are then classified on some dichotomous variable. The null hypothesis of homogeneity can be put in the form

$$H_0: P_{1j} = P_{2j} \quad \text{for } j = 1,2 \tag{1-12}$$

or, equivalently,

$$H_0: P_{11} = P_{21} \tag{1-13}$$

since the equality in (1-13) implies the equality in (1-12) for the case when j = 2. To test $H_0$, the $C^2$ statistic given in (1-10) is computed for the special case when d = r = 2 and is given by

$$C^2 = \sum_{q=1}^{2} \sum_{j=1}^{2} \frac{n_q (\hat{P}_{qj} - \hat{P}_{0j})^2}{\hat{P}_{0j}}. \tag{1-14}$$

Making use of the equalities given in (1-15), which hold for the special case under consideration,

$$\hat{P}_{12} = 1 - \hat{P}_{11}, \quad \hat{P}_{22} = 1 - \hat{P}_{21}, \quad \hat{P}_{02} = 1 - \hat{P}_{01} \tag{1-15}$$

it can be shown through some algebraic simplification that the expression for $C^2$ in (1-14) can be rewritten in the form

$$C^2 = \frac{(\hat{P}_{11} - \hat{P}_{21})^2}{\dfrac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_1} + \dfrac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_2}}. \tag{1-16}$$

Because $C^2$ is asymptotically distributed as $\chi^2$ with 1 degree of freedom,

$$C = \frac{\hat{P}_{11} - \hat{P}_{21}}{\sqrt{\dfrac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_1} + \dfrac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_2}}} \quad \text{is asymptotically } N(0,1). \tag{1-17}$$

The statistic in (1-17) can be written in the form

$$C = \frac{(\hat{P}_{11} - \hat{P}_{21}) - E_{H_0}(\hat{P}_{11} - \hat{P}_{21})}{\sqrt{\dfrac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_1} + \dfrac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_2}}} \tag{1-18}$$

because $E_{H_0}(\hat{P}_{11} - \hat{P}_{21})$ is 0 by the unbiasedness of the estimators $\hat{P}_{11}$ and $\hat{P}_{21}$ and the specification of the null hypothesis, $P_{11} = P_{21}$. The C statistic thus has the general form

$$\frac{\hat{\theta} - E_{H_0}(\hat{\theta})}{\sqrt{\widehat{\mathrm{Var}}_{H_0}(\hat{\theta})}} \tag{1-19}$$

where $\hat{\theta}$ is an estimate of a parameter $\theta$.
Consequently the estimated variance of $\hat{P}_{11} - \hat{P}_{21}$ under the null hypothesis is given by

$$\widehat{\mathrm{Var}}_{H_0}(\hat{P}_{11} - \hat{P}_{21}) = \frac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_1} + \frac{\hat{P}_{01}(1 - \hat{P}_{01})}{n_2} = \widehat{\mathrm{Var}}_{H_0}(\hat{P}_{11}) + \widehat{\mathrm{Var}}_{H_0}(\hat{P}_{21}). \tag{1-20}$$

The result in (1-20) is a special case of

$$\mathrm{Var}(\hat{P}_{11} - \hat{P}_{21}) = \mathrm{Var}(\hat{P}_{11}) + \mathrm{Var}(\hat{P}_{21}) - 2\,\mathrm{Cov}(\hat{P}_{11}, \hat{P}_{21}) \tag{1-21}$$

when $\mathrm{Cov}(\hat{P}_{11}, \hat{P}_{21}) = 0$. Thus the normal curve statistic (1-18) and the chi-square statistic (1-14) treat $\hat{P}_{11}$ and $\hat{P}_{21}$ as uncorrelated random variables, since (1-20) implies that $\widehat{\mathrm{Cov}}_{H_0}(\hat{P}_{11}, \hat{P}_{21})$ is taken to be 0. For the general case of d samples measured on a variable of r categories, the result in (1-11) is assumed to hold. Thus valid use of the chi-square test of homogeneity requires between-sample independence, because data from dependent samples are correlated. Use of $C^2$ for dependent samples assumes incorrectly that condition (1-11) is satisfied and results in an invalid test.

The mixed categorical data model of order-d involves a situation in which n randomly chosen subjects or blocks from some homogeneous population are measured on an r-valued categorical dependent variable under d different conditions. The hypothesis of general interest is whether the distribution of responses is the same under the d different conditions. Koch and Reinfurt (1971) stated that such a hypothesis best examines the relative effects of the d conditions on the dependent measure of interest. Because of the nature of the design, the chi-square test of homogeneity is most likely inappropriate for analyzing the data from such a model. It is doubtful that the observations or responses made by the same subject are independent of one another. Since the chi-square test of homogeneity does not make any allowances for correlated responses, what is needed is a test of homogeneity for correlated samples that builds the correlation into the procedure.
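The reduction of the 2 x 2 chi-square statistic (1-14) to the square of the normal curve statistic (1-17) can be checked numerically. The sketch below assumes two independent samples; the counts (30 of 50 and 20 of 60 subjects in category 1) are hypothetical values chosen only for illustration, and the function name `two_sample_statistics` is likewise an assumption.

```python
import math


def two_sample_statistics(x1, n1, x2, n2):
    """x_q is the number of the n_q subjects in sample q falling in category 1.

    Returns (C2 from (1-14), C from (1-17)).
    """
    p11, p21 = x1 / n1, x2 / n2
    p01 = (x1 + x2) / (n1 + n2)          # pooled estimate, (1-6)

    # Chi-square form (1-14): sum over both samples and both categories.
    c2 = 0.0
    for n_q, p_q1 in ((n1, p11), (n2, p21)):
        for p_qj, p_0j in ((p_q1, p01), (1 - p_q1, 1 - p01)):
            c2 += n_q * (p_qj - p_0j) ** 2 / p_0j

    # Estimated null variance (1-20): the two variances are simply added,
    # so the covariance term of (1-21) is implicitly set to zero.
    var_h0 = p01 * (1 - p01) / n1 + p01 * (1 - p01) / n2
    c = (p11 - p21) / math.sqrt(var_h0)  # normal curve form (1-17)
    return c2, c


c2_value, c_value = two_sample_statistics(30, 50, 20, 60)
print(abs(c2_value - c_value ** 2) < 1e-9)
```

The comment on `var_h0` marks the step at which independence enters: if the two proportions came from the same subjects, the covariance term in (1-21) would generally be nonzero and this denominator would misstate the variance of the difference.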
The use of the mixed categorical data model of order-d and a subsequent test for the homogeneity of the resulting d correlated distributions are required in the following types of situations.

i. Panel studies in which d different attitudinal policy questions are asked of the subjects, with each question having the same r categories of response. A test for homogeneity of the d distributions of response would determine if the group held the same attitude for each of the policy questions.

ii. Longitudinal studies in which each of the subjects in a group is measured at d different time intervals on the same r-valued categorical variable. A test for homogeneity of the d distributions of response would determine if the group as a whole changed its mode of classification over time.

iii. Studies in which a single group of subjects is classified into r categories by d different raters. A test for homogeneity of the d distributions would determine if the d raters viewed the group as a unit in the same manner.

iv. Studies in which the same group of subjects is asked the same question, which has r categories of response, under d different treatment conditions. A test for homogeneity of the d distributions would determine the relative effects of the d treatments.

v. Studies in which the subjects in each of d matched samples are each classified into one of r categories. A test for homogeneity of the d distributions would determine the degree to which the d groups are the same in terms of the variable of classification.

Design situation (i) can be illustrated by part of a survey reported by Marascuilo and McSweeney (1969). The survey concerned the attitude of Berkeley residents toward the integration of the city's schools. A sample of the city's adult population was drawn and the subjects were asked to respond to the following questions.
(1) For some grade schools, [it has been] suggested that lines be changed so that the percentage of nonwhite and white children in these schools would be more like the percentage for the entire school system.
_____ I agree _____ I disagree _____ I am not sure

(2) For [two of the three] Junior High Schools, [it has been] suggested that school lines be changed so that the percentage of nonwhite and white children in these schools would be more like the percentage for the entire school system.
_____ I agree _____ I disagree _____ I am not sure

(3) If more day-care centers and nursery schools are set up, [it has been] suggested that they be set up to permit a greater integration of the races than is found at present.
_____ I agree _____ I disagree _____ I am not sure

A test for the homogeneity of the distributions of the three questions would provide a test of whether the adult population of Berkeley viewed each of the issues in the same manner. Such an analysis would be an improvement upon the pairwise analysis used by Marascuilo and McSweeney (1969) in the absence of a simultaneous test for three correlated distributions on a polychotomous variable of three categories.

From the same study reported by Marascuilo and McSweeney (1969), the following question was asked of the group.

[It has been] suggested that more day-care centers and nursery schools be set up to let more children attend.
_____ I agree _____ I disagree _____ I am not sure

If such a question were asked over d increments of time, a test for the homogeneity of the d distributions would examine the attitude of the Berkeley adult community to this issue as a function of time. This is an example of situation (ii).

In the Miller et al. (1970) study, a sample of 84 infants was tested on a series of tasks involving object concept development in the sensory-motor period. Two observers categorized what they believed to be each child's overall performance on a series of Piagetian tasks.
The categories of classification were (0) failure, (1) partial pass, and (2) pass. If the researchers were interested in determining whether the two observers applied equally stringent standards to the group, a test for homogeneity of the distributions of ratings for each of the observers would answer such a question. This illustration is an example of design situation (iii).

Yoshinaga (1974) investigated the constancy of teachers' choices of a disciplinary strategy (positive reinforcement, social modeling, and punishment) when increasing increments of information were given about the child involved in the disciplinary incident. The teachers were first given a written description of the disciplinary incident and asked to choose a strategy to deal with the incident. The teachers were then given some biographical information about the child and again asked to choose a disciplinary strategy. Finally, the teachers were presented with additional biographical information about the child and again asked to choose a strategy. The researcher was interested in knowing whether teachers change their strategy for handling disciplinary problems as more is known about the child. A test for homogeneity of the resulting three distributions answers such a research question. This is an illustration of design situation (iv).

In Laumann's (1973) study of social interaction and social mobility, a husband and the fathers of both the husband and wife were each classified into one of a number of social classes. This design can be considered as consisting of three matched samples and is an example of design situation (v). By testing for homogeneity of the three correlated distributions, the researcher can make inferences concerning the social interaction of various social classes as well as inferences about generational changes in social class. The classifications of a husband, his father, and his father-in-law are most probably not made independently.
The matched triples of husband, father, and father-in-law can be considered as blocks in a randomized block design.

The previous examples show the different ways in which the mixed categorical data model can be used in behavioral science research. The studies of Marascuilo and McSweeney (1969), Miller et al. (1970), and Yoshinaga (1974) illustrate data analytic situations which demanded a statistic to test for homogeneity of d correlated distributions. Such a statistic can be applied in a variety of settings which require categorical data analysis of either matched samples or repeated measures data.

This dissertation focuses on testing for homogeneity of distributions on a polychotomous variable for nonindependent samples. The utility of such a technique in educational and behavioral science research has been pointed out in this introductory chapter. The dissertation has three major sections. The first section provides a review and synthesis of some of the more significant literature relating to the problem. Several methods for dealing with the problem are presented and compared in this review. The second major section provides a detailed development of a large sample statistic which can be used to test for homogeneity of correlated distributions. The statistic has the appeal of simplicity and can easily be related to techniques which are familiar to behavioral researchers who have had some exposure to statistical methods. In addition, a development of post hoc techniques to be used in conjunction with the test statistic is provided. The third major portion of the dissertation reports a simulation study of the small sample behavior of the statistic developed in the second section. The distributional properties of the statistic under both the null and alternative hypotheses are considered for a number of small sample cases.
The purposes of this third section are to assess the degree to which the asymptotic results hold in settings which are frequently encountered in data analytic situations and to establish a series of guidelines which can help the potential user of the technique in determining its appropriateness. A final chapter ties the body of the dissertation together in the form of a summary and the conclusions reached as a result of the work.

CHAPTER II

TESTING FOR HOMOGENEITY

MULTIVARIATE VERSUS UNIVARIATE APPROACH

The mixed categorical data model of order-d is characterized by a situation in which $n$ randomly chosen subjects or blocks from some homogeneous population are measured on an $r$-valued categorical dependent variable under $d$ different conditions. The hypothesis of interest is whether the distribution of responses is the same under the $d$ different conditions. Because it is doubtful that responses made by the same subject or by matched subjects are independent of one another, a multivariate model is assumed to fit the data. The data for this model are represented in an $r \times r \times \cdots \times r$ contingency table of $d$ dimensions. The set of $d$ responses made by each subject is counted as a single observation in the contingency table. The set of responses made by a subject can be characterized by a vector $(j_1, j_2, \ldots, j_d)$ where $j_g = 1, 2, \ldots, r$ for $g = 1, 2, \ldots, d$. In all there are $r^d$ such possible vectors or modes of response, which characterize the $r^d$-celled contingency table. Let $P_{j_1,j_2,\ldots,j_d}$ represent the probability of a response profile $(j_1, j_2, \ldots, j_d)$ or, equivalently, that of an observation falling into cell $(j_1, j_2, \ldots, j_d)$ of the contingency table. A multinomial distribution with parameters $\{P_{j_1,j_2,\ldots,j_d}\}$ and $n$ is assumed to fit the cell frequencies $\{n_{j_1,j_2,\ldots,j_d}\}$ with

$$\sum_{(j_1,j_2,\ldots,j_d)} P_{j_1,j_2,\ldots,j_d} = 1, \qquad P_{j_1,j_2,\ldots,j_d} > 0, \qquad \sum_{(j_1,j_2,\ldots,j_d)} n_{j_1,j_2,\ldots,j_d} = n. \tag{2-1}$$

The distinction between a univariate and a multivariate approach is a consequence of the manner in which an observation is defined. A univariate approach would count the $d$ responses made by each of the subjects as $d$ separate observations. In all there would be $nd$ observations which could be represented by a $d \times r$ contingency table. Such a model is appropriate for a design in which the $d$ responses made by a subject are assumed to be mutually independent. As was indicated in Chapter I, the chi-square test of homogeneity is the appropriate procedure for this design to test whether the distribution of responses is the same under the $d$ different conditions. This is contrasted with the multivariate approach, which views the $d$ responses made by each of the subjects as a single response profile characterized by a response vector of $d$ components. It is the response vectors which serve as the conceptual units of observation. Under the multivariate approach there are thus $n$ and not $nd$ observations. The $n$ observations can be represented in an $r \times r \times \cdots \times r$ contingency table of $d$ dimensions. Such a model is appropriate for a design in which the $d$ responses made by a subject are assumed to be correlated, as is often the case with repeated measures or matched-sample data.

In Chapter I it was pointed out that when observations over the repeated measures are correlated, estimates of the proportions which characterize the $d$ distributions of response are also correlated. The multivariate approach examines the joint distribution of responses over the repeated measures. A test of whether the distribution of responses is the same under the $d$ different conditions reduces to testing whether the marginal distributions in the $r \times r \times \cdots \times r$ contingency table are homogeneous.
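The difference between counting $n$ response profiles and counting $nd$ separate responses can be made concrete with a small sketch. The data below are hypothetical (six subjects, $d = 3$ conditions, $r = 2$ categories), and the function names `joint_table` and `marginal_proportions` are illustrative assumptions rather than anything defined in the dissertation.

```python
from collections import Counter


def joint_table(profiles):
    """Count response profiles; each subject contributes ONE observation."""
    return Counter(tuple(p) for p in profiles)


def marginal_proportions(profiles, r):
    """Return M[q][b]: observed proportion of subjects placed in
    category b+1 under condition q+1 (marginals of the joint table)."""
    n = len(profiles)
    d = len(profiles[0])
    counts = joint_table(profiles)
    M = [[0.0] * r for _ in range(d)]
    for cell, freq in counts.items():
        # A cell (j_1, ..., j_d) adds its frequency to one marginal
        # cell per condition: category j_q under condition q.
        for q, category in enumerate(cell):
            M[q][category - 1] += freq / n
    return M


# Hypothetical data: categories are coded 1..r as in the text.
profiles = [(1, 1, 2), (1, 2, 2), (2, 2, 2), (1, 1, 1), (2, 1, 2), (1, 1, 2)]
table = joint_table(profiles)
M = marginal_proportions(profiles, r=2)

print(sum(table.values()))        # number of multivariate observations (n)
print(len(profiles) * 3)          # number of univariate observations (nd)
for q, row in enumerate(M, start=1):
    print(f"condition {q}: {row}")
```

Marginal homogeneity would say that the rows of `M` estimate the same distribution; the joint table is what carries the information about how the rows are correlated.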
Because the multivariate approach makes use of the joint distribution, the nature of the correlation between the estimates of the proportions which characterize the $d$ distributions of response can be determined. Such information clearly could not be determined using the $d \times r$ contingency table which characterizes the univariate approach. The multivariate approach, which is employed throughout the dissertation, takes account of the correlated nature of the responses by focusing on the joint distribution.

Let $M_{qb}$ represent the probability that a subject is classified into category $b$ under the $q$th condition. $M_{qb}$ is a marginal probability which can be written as

$$M_{qb} = \sum_{(j_1,j_2,\ldots,j_d)\,:\,j_q = b} P_{j_1,j_2,\ldots,j_d} \qquad \text{for } q = 1,2,\ldots,d;\ b = 1,2,\ldots,r. \tag{2-2}$$

The formula given in (2-2) for the marginal probability indicates that all joint probabilities which correspond to response vectors having response category $b$ under the $q$th condition are summed.

The hypothesis of interest is that the distribution of responses is the same under the $d$ different conditions. This hypothesis of marginal homogeneity can be written as

$$H_0: M_{1b} = M_{2b} = \cdots = M_{db} \qquad \text{for } b = 1,2,\ldots,r. \tag{2-3}$$

A number of different techniques have appeared in the literature for testing $H_0$. Each of these techniques assumes the multivariate model as set forth in this chapter. The remainder of the chapter is divided into two main sections. The first section presents a number of different techniques to test $H_0$, while the second section provides a comparison among the techniques.

TEST PROCEDURES

Notation

In presenting the different approaches to testing $H_0$ a common notation is used. The multivariate categorical model as described in the previous section is assumed for each of the techniques described, together with the notation introduced in that section.
Let $\hat{P}_{j_1,j_2,\ldots,j_d}$ and $\hat{M}_{qb}$ denote the unrestricted estimates of the cell and marginal probabilities respectively, with

$$\hat{P}_{j_1,j_2,\ldots,j_d} = \frac{n_{j_1,j_2,\ldots,j_d}}{n}, \qquad \hat{M}_{qb} = \sum_{(j_1,j_2,\ldots,j_d)\,:\,j_q = b} \frac{n_{j_1,j_2,\ldots,j_d}}{n}. \tag{2-4}$$

For certain of the techniques it is necessary to estimate cell probabilities under the constraints of the null hypothesis. Such estimates will be denoted as $\tilde{P}_{j_1,j_2,\ldots,j_d}$, with the method of estimation defined within the context of the technique presented.

Stuart's Statistic

The technique developed by Stuart (1955) is a large sample statistic which tests for homogeneity of two correlated distributions. The statistic is confined to the mixed categorical data model of order-2, whose data can be represented by an $r \times r$ contingency table. The null hypothesis can be written as

$$H_0: M_{1b} = M_{2b} \qquad \text{for } b = 1,2,\ldots,r. \tag{2-5}$$

Stuart argued that the cell frequencies $n_{j_1,j_2}$ have a limiting multivariate normal distribution. Stuart then defined the variate $d_i = n(\hat{M}_{2i} - \hat{M}_{1i})$ for $i = 1,2,\ldots,r$ and argued that since the $d_i$ are linear functions of the cell frequencies, the $d_i$ will also have a limiting multivariate normal distribution, but of rank $(r-1)$ because of the constraint $\sum_i d_i = 0$. Stuart then suggested a statistic which could be used to test $H_0$. The statistic has the form

$$X_S^2 = \mathbf{d}'\,\hat{\Sigma}_d^{-1}\,\mathbf{d} \tag{2-6}$$

with

$$\underset{1 \times (r-1)}{\mathbf{d}'} = [d_1, d_2, \ldots, d_{r-1}], \qquad \underset{(r-1) \times (r-1)}{\hat{\Sigma}_d} = ((\hat{\sigma}_{ij})).$$

The matrix $\hat{\Sigma}_d$ is a consistent estimator of the variance-covariance matrix for the random vector $\mathbf{d}$ under the assumption that $H_0$ is true, with the $\hat{\sigma}_{ij}$ being defined as

$$\hat{\sigma}_{ij} = \widehat{\mathrm{Cov}}(d_i, d_j \mid H_0) = -n(\hat{P}_{i,j} + \hat{P}_{j,i}) \quad (i \ne j), \qquad \hat{\sigma}_{ii} = \widehat{\mathrm{Var}}(d_i \mid H_0) = n(\hat{M}_{2i} + \hat{M}_{1i} - 2\hat{P}_{i,i}). \tag{2-7}$$

Because the vector $\mathbf{d}$ is approximately multivariate normal of full rank $(r-1)$ and $\hat{\Sigma}_d$ is a consistent estimator of the variance-covariance matrix of $\mathbf{d}$ under $H_0$, Stuart claimed that for large $n$, the statistic $X_S^2$ has a chi-square distribution with $(r-1)$ degrees of freedom under the null hypothesis of marginal homogeneity.
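The construction in (2-6) and (2-7) can be sketched in a few lines of code. This is a minimal illustration written in terms of cell frequencies, so that $d_i$ is the column total minus the row total; the helper `solve`, the function name `stuart_statistic`, and the tables are assumptions made for the example and are not taken from Stuart (1955).

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [[float(v) for v in A[i]] + [float(b[i])] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * m
    for i in reversed(range(m)):
        s = aug[i][m] - sum(aug[i][k] * x[k] for k in range(i + 1, m))
        x[i] = s / aug[i][i]
    return x


def stuart_statistic(counts, drop=None):
    """Quadratic form d' V^{-1} d for an r x r table of frequencies.

    Rows index the category under condition 1, columns under condition 2.
    One category (the last by default) is dropped to obtain full rank.
    """
    r = len(counts)
    drop = r - 1 if drop is None else drop
    keep = [i for i in range(r) if i != drop]
    row = [sum(counts[i]) for i in range(r)]
    col = [sum(counts[i][j] for i in range(r)) for j in range(r)]
    # d_i = n(M2_i - M1_i) = column total - row total
    d = [col[i] - row[i] for i in keep]
    # Estimated covariance under H0, as in (2-7), in frequency form
    V = [[(row[i] + col[i] - 2 * counts[i][i]) if i == j
          else -(counts[i][j] + counts[j][i])
          for j in keep] for i in keep]
    w = solve(V, d)
    return sum(di * wi for di, wi in zip(d, w))


# Hypothetical 3 x 3 table of two raters' classifications.
table = [[20, 5, 3], [8, 15, 4], [6, 7, 12]]
print(stuart_statistic(table))            # refer to chi-square with r - 1 = 2 df
print(stuart_statistic(table, drop=0))    # same value with a different d_i dropped
```

For $r = 2$ the quadratic form collapses to the squared difference of the two off-diagonal frequencies divided by their sum, which gives a quick sanity check on any implementation.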
Stuart then defined the rejection region for the large-sample test as the upper tail of the chi-square distribution with $(r-1)$ degrees of freedom. Because any $(r-1)$ of the $d_i$ uniquely determine the remaining one, Stuart argued that the value of the $X_S^2$ statistic remains invariant under the choice of which $d_i$ to eliminate in the formation of the $\mathbf{d}$ vector.

Likelihood Ratio Statistic

Madansky (1963) developed a large sample statistic, based upon the likelihood ratio criterion of Neyman and Pearson (1928), which tests the null hypothesis of marginal homogeneity in the general $r \times r \times \cdots \times r$ contingency table of $d$ dimensions. Using Madansky's technique the null hypothesis is stated as

$$H_0: M_{qb} = M_{1b} \qquad \text{for } q = 2,\ldots,d;\ b = 1,2,\ldots,r. \tag{2-8}$$

Let $\mathbf{P}$ represent the vector of cell probabilities with

$$\underset{1 \times r^d}{\mathbf{P}'} = [P_{j_1,j_2,\ldots,j_d}]. \tag{2-9}$$

The likelihood function for the multinomial distribution which characterizes the cell frequencies is

$$L(\mathbf{P}) = n! \prod_{(j_1,j_2,\ldots,j_d)} \frac{P_{j_1,j_2,\ldots,j_d}^{\,n_{j_1,j_2,\ldots,j_d}}}{n_{j_1,j_2,\ldots,j_d}!}. \tag{2-10}$$

The likelihood ratio statistic is given by

$$\lambda = \frac{L(\tilde{\mathbf{P}})}{L(\hat{\mathbf{P}})}. \tag{2-11}$$

The $\tilde{\mathbf{P}}$ vector of probabilities is that estimate of $\mathbf{P}$ which maximizes the quantity in (2-10) subject to the restrictions specified by the null hypothesis and the additional constraint that the sum of the components in $\tilde{\mathbf{P}}$ is 1. The $\hat{\mathbf{P}}$ vector of probabilities is that estimate of $\mathbf{P}$ which maximizes the likelihood function in (2-10) subject only to the constraint that the sum of the cell probabilities is 1.

Madansky specified the constraints which define the null hypothesis. In all there are $(d-1)(r-1)$ linearly independent constraints which do not depend on the added constraint that

$$\sum_{(j_1,j_2,\ldots,j_d)} P_{j_1,j_2,\ldots,j_d} = 1. \tag{2-12}$$

The constraints which are implied by the null hypothesis can be expressed as

$$\sum_{j_1,\ldots,j_{q-1},j_{q+1},\ldots,j_d} \left[ P_{j_1,j_2,\ldots,j_{q-1},b,j_{q+1},\ldots,j_d} - P_{b,j_2,\ldots,j_{q-1},j_1,j_{q+1},\ldots,j_d} \right] = 0; \qquad b = 1,2,\ldots,r-1;\ q = 2,\ldots,d. \tag{2-13}$$
The $(d-1)(r-1)$ constraints given in (2-13) are equivalent to the statement that $M_{qb} = M_{1b}$ for $q = 2,\ldots,d$ and $b = 1,2,\ldots,r-1$. But this, together with the constraint given in (2-12), implies the null hypothesis of homogeneity as stated in (2-8).

Maximizing the function $L(\mathbf{P})$ in (2-10) under the hypothesis of homogeneity is equivalent to maximizing the function

$$H(\mathbf{P}) = \sum_{(j_1,j_2,\ldots,j_d)} n_{j_1,j_2,\ldots,j_d} \log P_{j_1,j_2,\ldots,j_d} \tag{2-14}$$

subject to the $(d-1)(r-1) + 1$ constraints given in (2-12) and (2-13). If the Lagrangian multipliers $\lambda_0$ and $\mu_{qb}$ (with $q = 2,\ldots,d$ and $b = 1,\ldots,r-1$) are introduced, the problem reduces to finding that value of $\mathbf{P}$ which maximizes the function

$$L^*(\mathbf{P}) = H(\mathbf{P}) - \lambda_0 \left[ \sum_{(j_1,\ldots,j_d)} P_{j_1,\ldots,j_d} - 1 \right] - \sum_{q=2}^{d} \sum_{b=1}^{r-1} \mu_{qb}\,\phi_{qb} \tag{2-15}$$

where

$$\phi_{qb} = \sum_{j_1,\ldots,j_{q-1},j_{q+1},\ldots,j_d} \left[ P_{j_1,j_2,\ldots,j_{q-1},b,j_{q+1},\ldots,j_d} - P_{b,j_2,\ldots,j_{q-1},j_1,j_{q+1},\ldots,j_d} \right].$$

Differentiating $L^*(\mathbf{P})$ with respect to each of the $P_{j_1,j_2,\ldots,j_d}$ and setting the derivatives equal to zero, Madansky obtained $r^d$ equations of the form

$$\frac{n_{j_1,\ldots,j_d}}{P_{j_1,\ldots,j_d}} - \lambda_0 - \sum_{q=2}^{d} \left( \mu_{q j_q} - \mu_{q j_1} \right) = 0, \tag{2-16}$$

where $\mu_{qr}$ is understood to be zero. The estimates of the $\{P_{j_1,\ldots,j_d}\}$ can then be expressed in the form

$$\tilde{P}_{j_1,\ldots,j_d} = \frac{n_{j_1,\ldots,j_d}}{\lambda_0 + \sum_{q=2}^{d} \left( \mu_{q j_q} - \mu_{q j_1} \right)}. \tag{2-17}$$

In order to find the actual values of the $\{\tilde{P}_{j_1,\ldots,j_d}\}$, the corresponding expressions given in (2-17) are substituted into the constraint equations (2-12) and (2-13). Since the resulting equations are nonlinear in the unknown Lagrangian multipliers, an iterative method is needed to solve these equations. Madansky described a linear approximation method for handling the problem. A detailed algorithm is also given by the author. The resulting $\tilde{P}_{j_1,\ldots,j_d}$ which are obtained using this procedure are the maximum likelihood estimates of the $P_{j_1,\ldots,j_d}$ under the condition specified by the null hypothesis of marginal homogeneity.
The unrestricted maximum likelihood estimates of the $P_{j_1,\ldots,j_d}$, subject only to the constraint given in (2-12), are the observed relative frequencies given by

$$\hat{P}_{j_1,\ldots,j_d} = \frac{n_{j_1,\ldots,j_d}}{n}. \tag{2-18}$$

Substituting both sets of estimates into (2-11), the likelihood ratio becomes

$$\lambda = \prod_{(j_1,\ldots,j_d)} \left( \frac{\tilde{P}_{j_1,\ldots,j_d}}{\hat{P}_{j_1,\ldots,j_d}} \right)^{n_{j_1,\ldots,j_d}}. \tag{2-19}$$

The value of $\lambda$ always falls between 0 and 1. A small value of $\lambda$ indicates that the data provide evidence that the null hypothesis is false. A test statistic of the form

$$X_L^2 = -2 \log \lambda \tag{2-20}$$

was given by Madansky. Large values of $X_L^2$ lead to a rejection of the null hypothesis. For large sample sizes the $X_L^2$ statistic is approximately distributed as a chi-square random variable with $(d-1)(r-1)$ degrees of freedom.

Minimum Discrimination Information Statistic

Ireland, Ku, and Kullback (1969) considered the problem of testing for marginal homogeneity within the context of the mixed categorical data model of order-2. The authors proposed a statistic based upon minimum discrimination information estimation (MDIE) of the cell probabilities of an observed $r \times r$ contingency table under the null hypothesis of marginal homogeneity.

Before discussing the actual procedure used by the authors for the particular problem under consideration, a more general discussion of the MDIE technique and how it can be used to produce estimates of cell probabilities under a general null hypothesis is given. An associated hypothesis testing procedure is also discussed.

Let $\{P_{j_1,j_2}\}$ characterize the cell probabilities in an $r \times r$ contingency table such that

$$\sum_{j_1,j_2} P_{j_1,j_2} = 1, \qquad P_{j_1,j_2} > 0, \qquad j_1 = 1,\ldots,r;\ j_2 = 1,\ldots,r. \tag{2-21}$$

Such a table will be denoted as the $P$-table. Let $\{\hat{P}_{j_1,j_2}\}$ be the observed set of proportions based upon $n$ observations from an $r \times r$ contingency table whose classifications are common to the $P$-table and such that

$$\sum_{j_1,j_2} \hat{P}_{j_1,j_2} = 1, \qquad \hat{P}_{j_1,j_2} > 0, \qquad j_1 = 1,\ldots,r;\ j_2 = 1,\ldots,r. \tag{2-22}$$

Such a table will be referred to as the $\hat{P}$-table.
A distance-like measure from the $P$-table to the $\hat{P}$-table, known as the discrimination information and denoted by $I(\mathbf{P};\hat{\mathbf{P}})$, can be defined as

$$I(\mathbf{P};\hat{\mathbf{P}}) = \sum_{j_1,j_2} P_{j_1,j_2} \log \frac{P_{j_1,j_2}}{\hat{P}_{j_1,j_2}}. \tag{2-23}$$

The discrimination information is a distance-like function in the sense that

$$\text{(i)}\ \ I(\mathbf{P};\hat{\mathbf{P}}) = 0 \ \text{ for } \mathbf{P} = \hat{\mathbf{P}}, \qquad \text{(ii)}\ \ I(\mathbf{P};\hat{\mathbf{P}}) > 0 \ \text{ for } \mathbf{P} \ne \hat{\mathbf{P}}. \tag{2-24}$$

It should be pointed out, however, that in general the values $I(\mathbf{P};\hat{\mathbf{P}})$ and $I(\hat{\mathbf{P}};\mathbf{P})$ are not the same.

Let $W$ represent the family of all $P$-tables which satisfy the constraints of some null hypothesis $H_0$. The method of MDIE then consists of finding that $\mathbf{P} \in W$ which minimizes the discrimination information in (2-23). In a sense the technique consists of finding that $P$-table which satisfies the constraints of the null hypothesis and at the same time most closely resembles the observed table $\hat{\mathbf{P}}$. The $\mathbf{P} \in W$ which is closest in distance to the observed table will be denoted as $\tilde{\mathbf{P}}$. The $\{\tilde{P}_{j_1,j_2}\}$ are then the MDIE of the cell probabilities under the constraints of the null hypothesis $H_0$. Taylor (1953) has shown that estimates obtained in this manner are BAN in the sense of Neyman (1949). A discussion of BAN estimators is given in more detail later in the chapter.

In order to test a null hypothesis a statistic of the form

$$X_I^2 = 2n \sum_{j_1,j_2} \tilde{P}_{j_1,j_2} \log \frac{\tilde{P}_{j_1,j_2}}{\hat{P}_{j_1,j_2}} \tag{2-25}$$

can be used. Using the results of Kullback (1968) and Neyman (1949), it can be shown that the statistic $X_I^2$ has a limiting chi-square distribution under the null hypothesis. The degrees of freedom are given by the number of linearly independent constraints put on the components of $\mathbf{P}$ which are specified by the null hypothesis $H_0$ and which are independent of the constraint that the sum of the components in $\mathbf{P}$ is 1. The $X_I^2$ statistic was called the minimum discrimination information statistic (MDIS) by Ireland, Ku, and Kullback (1969).
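The discrimination information in (2-23) and the statistic in (2-25) are simple to evaluate once a table has been flattened into a list of cell probabilities. The following sketch uses hypothetical two-cell distributions only to illustrate the two distance-like properties and the lack of symmetry; the function names are assumptions, and finding the minimizing table under a null hypothesis is not attempted here.

```python
import math


def discrimination_information(p, p_hat):
    """I(P; P_hat) = sum of P * log(P / P_hat) over cells.

    Both arguments are flat lists of cell probabilities. Every entry of
    p_hat must be positive, as required in (2-22); cells with P = 0
    contribute nothing to the sum.
    """
    return sum(a * math.log(a / b) for a, b in zip(p, p_hat) if a > 0)


def mdis(p_tilde, p_hat, n):
    """Minimum discrimination information statistic of the form (2-25),
    given estimates p_tilde that already satisfy the null hypothesis."""
    return 2.0 * n * discrimination_information(p_tilde, p_hat)


# Hypothetical distributions over the same cells.
p = [0.5, 0.5]
q = [0.9, 0.1]

print(discrimination_information(p, p))   # zero when the tables coincide
print(discrimination_information(p, q))   # positive when they differ
print(discrimination_information(q, p))   # generally a different value
```

Because the measure is not symmetric, the order of the arguments matters: the MDIE approach minimizes over the first argument with the observed table held fixed as the second.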
Large values of the statistic indicate that the observed proportions differ considerably from those cell probabilities which would be expected under the null hypothesis. Hence large values of $X_I^2$ lead to rejection of the null hypothesis.

Now consider the specific problem investigated by Ireland et al. (1969), who were interested in testing for the marginal homogeneity of two distributions in an $r \times r$ table. The procedure which these authors used follows the general methodology which has just been described. The specific null hypothesis under consideration is

$$H_0: M_{1b} = M_{2b} \qquad \text{for } b = 1,2,\ldots,r. \tag{2-26}$$

The linearly independent constraints specified by $H_0$ which are independent of $\sum_{i,j} P_{i,j} = 1$ are then expressed as

$$\sum_{j=1}^{r} P_{i,j} = \sum_{k=1}^{r} P_{k,i} \qquad \text{for } i = 1,2,\ldots,r-1. \tag{2-27}$$

The first part of the procedure consists of finding that $\mathbf{P}$ vector whose components satisfy the constraints given in (2-27) and which minimizes the discrimination information

$$I(\mathbf{P};\hat{\mathbf{P}}) = \sum_{i,j} P_{i,j} \log \frac{P_{i,j}}{\hat{P}_{i,j}}. \tag{2-28}$$

If the Lagrangian multipliers $\gamma_0$ and $\alpha_i$ are introduced, the problem reduces to minimizing

$$\sum_{i=1}^{r} \sum_{j=1}^{r} P_{i,j} \log \frac{P_{i,j}}{\hat{P}_{i,j}} + \sum_{i=1}^{r-1} \alpha_i \left[ \sum_{j=1}^{r} P_{i,j} - \sum_{k=1}^{r} P_{k,i} \right] + \gamma_0 \left[ \sum_{i=1}^{r} \sum_{j=1}^{r} P_{i,j} - 1 \right] \tag{2-29}$$

with respect to the $P_{i,j}$. Expressions for the $\tilde{P}_{i,j}$ can be obtained in terms of the unknown Lagrangian multipliers $\gamma_0$ and $\alpha_i$, $i = 1,\ldots,r-1$, by differentiating the expression in (2-29) with respect to each of the $P_{i,j}$ and solving the resulting $r^2$ equations together with the constraint equations given in (2-27) and by $\sum_{i,j} P_{i,j} = 1$.

Rather than solving for the unknown Lagrangian multipliers, the authors developed a convergent iterative procedure which leads directly to the estimates $\{\tilde{P}_{i,j}\}$. Approximations to the $\{\tilde{P}_{i,j}\}$ can be achieved to any desired level of accuracy. A proof that the approximations do in fact converge to the desired $\{\tilde{P}_{i,j}\}$ was also given by the authors.
Once the $\{\tilde{P}_{i,j}\}$ are found, or at least approximated, the MDIS to test the null hypothesis (2-26) is given by

$$X_I^2 = 2n \sum_{i,j} \tilde{P}_{i,j} \log \frac{\tilde{P}_{i,j}}{\hat{P}_{i,j}}. \tag{2-30}$$

Values of $X_I^2$ larger than the $(1-\alpha)$th quantile of the tabled chi-square distribution with $(r-1)$ degrees of freedom will reject the null hypothesis of marginal homogeneity for tests performed at the $\alpha$ level of significance.

Weighted Least Squares Approach

Koch and Reinfurt (1971) derived a statistic which tests for homogeneity of the marginal distributions within the context of the general case of the mixed categorical data model of order-d. The development of the test statistic makes use of a general methodology given in Grizzle, Starmer, and Koch (1969). This general methodology involves the derivation of test statistics in terms of weighted least squares analysis of certain linear models. A brief outline of the general methodology given by Grizzle et al. (1969) is discussed, followed by the Koch and Reinfurt (1971) adaptation of the technique to the problem of testing for homogeneity of the $d$ marginal distributions in the $r \times r \times \cdots \times r$ contingency table.

Let $\mathbf{P}$ represent the vector of cell probabilities which characterizes the $r \times r \times \cdots \times r$ contingency table of $d$ dimensions and let $\hat{\mathbf{P}}$ represent the corresponding vector of observed proportions. Let $F_m(\mathbf{P})$ be any function of the elements of $\mathbf{P}$ that has partial derivatives up to the second order with respect to the $\{P_{j_1,\ldots,j_d}\}$, $m = 1,2,\ldots,u$. Also let

$$F_m(\hat{\mathbf{P}}) = F_m(\mathbf{P}) \text{ evaluated at } \mathbf{P} = \hat{\mathbf{P}},$$
$$\underset{1 \times u}{[\mathbf{F}(\mathbf{P})]'} = [F_1(\mathbf{P}), F_2(\mathbf{P}), \ldots, F_u(\mathbf{P})], \tag{2-31}$$
$$\underset{1 \times u}{[\mathbf{F}(\hat{\mathbf{P}})]'} = [F_1(\hat{\mathbf{P}}), F_2(\hat{\mathbf{P}}), \ldots, F_u(\hat{\mathbf{P}})].$$

It is assumed that the functions $F_m(\mathbf{P})$, $m = 1,2,\ldots,u$, are linearly independent of one another and of the constraint that the sum of the components of the $\mathbf{P}$ vector is 1. Let

$$\underset{u \times 1}{\mathbf{F}(\mathbf{P})} = \underset{u \times v}{X}\ \underset{v \times 1}{\boldsymbol{\beta}}, \tag{2-32}$$

where $X$ is a known matrix of coefficients and $\boldsymbol{\beta}$ is a vector of unknown parameters.
A general equation of the form given in (2-32) can be used to specify a linear model which is hypothesized to characterize the data.

Grizzle et al. (1969) proposed a statistic to test the fit of the data to the linear model proposed in (2-32). The test statistic is given by

$$X_{GSK}^2 = \left( \mathbf{F}(\hat{\mathbf{P}}) - X\tilde{\boldsymbol{\beta}} \right)' S^{-1} \left( \mathbf{F}(\hat{\mathbf{P}}) - X\tilde{\boldsymbol{\beta}} \right) \tag{2-33}$$

where $S$ is the sample estimate of the variance-covariance matrix of $\mathbf{F}(\hat{\mathbf{P}})$ and $\tilde{\boldsymbol{\beta}}$ is that value of the $\boldsymbol{\beta}$ vector which minimizes the quantity

$$\left( \mathbf{F}(\hat{\mathbf{P}}) - X\boldsymbol{\beta} \right)' S^{-1} \left( \mathbf{F}(\hat{\mathbf{P}}) - X\boldsymbol{\beta} \right). \tag{2-34}$$

The estimate of the vector of unknown parameters which is given by $\tilde{\boldsymbol{\beta}}$ is the weighted least squares estimate, and the test statistic is simply the residual sum of squares or sum of squares error due to lack of fit. If $X$ is of rank $v$, the test statistic $X_{GSK}^2$ has approximately a chi-square distribution with $u - v$ degrees of freedom for large sample sizes under the assumption that the model in (2-32) fits. Large values of the statistic provide evidence that the model does not fit.

Consider the Koch and Reinfurt (1971) adaptation of the general weighted least squares methodology to test for homogeneity of $d$ marginal distributions in a mixed categorical data model of order-d. The null hypothesis of marginal homogeneity can be stated as

$$H_0: M_{1b} = M_{2b} = \cdots = M_{db} \qquad \text{for } b = 1,2,\ldots,r. \tag{2-35}$$

In terms of a linear model approach the hypothesis in (2-35) could be stated in the form

$$H_0: M_{qb} = \beta_b, \qquad q = 1,2,\ldots,d;\ b = 1,2,\ldots,r-1 \tag{2-36}$$

where the $\beta_b$ are unknown parameters which are estimated from the data. The linear model can be written in matrix terms as

$$\begin{bmatrix} M_{11} \\ M_{21} \\ \vdots \\ M_{d1} \\ M_{12} \\ M_{22} \\ \vdots \\ M_{d2} \\ \vdots \\ M_{1,r-1} \\ M_{2,r-1} \\ \vdots \\ M_{d,r-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{r-1} \end{bmatrix}, \qquad \underset{d(r-1) \times 1}{\mathbf{F}(\mathbf{P})} = \underset{d(r-1) \times (r-1)}{X}\ \underset{(r-1) \times 1}{\boldsymbol{\beta}}. \tag{2-37}$$

Testing the null hypothesis of marginal homogeneity as given in (2-35) can be accomplished by testing the fit of the model given in (2-36) to the observed data.
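The weighted least squares computation in (2-33) and (2-34) can be sketched for the simplest case, $d = 2$. The code below is an illustration under stated assumptions and not the Koch and Reinfurt program: `S` is taken to be the usual multinomial sample covariance of the observed marginal proportions, all cells are assumed to have positive counts, and the helper names and the table are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [[float(v) for v in A[i]] + [float(b[i])] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * m
    for i in reversed(range(m)):
        s = aug[i][m] - sum(aug[i][k] * x[k] for k in range(i + 1, m))
        x[i] = s / aug[i][i]
    return x


def gsk_marginal_homogeneity(counts):
    """Fit M_qb = beta_b by weighted least squares for an r x r table.

    Returns (beta, X2): the estimates and the residual sum of squares,
    to be referred to chi-square with (d-1)(r-1) = r-1 degrees of freedom.
    """
    r = len(counts)
    n = sum(map(sum, counts))
    cells = [(i, j) for i in range(r) for j in range(r)]
    p = [counts[i][j] / n for (i, j) in cells]
    # Indicator rows giving F = (M_11, M_21, M_12, M_22, ...), b < r.
    A = []
    for b in range(r - 1):
        A.append([1.0 if i == b else 0.0 for (i, j) in cells])  # M_1b
        A.append([1.0 if j == b else 0.0 for (i, j) in cells])  # M_2b
    u, v = len(A), r - 1
    F = [sum(a * q for a, q in zip(row, p)) for row in A]
    # Multinomial sample covariance of F
    S = [[(sum(A[m][k] * A[l][k] * p[k] for k in range(len(p)))
           - F[m] * F[l]) / n for l in range(u)] for m in range(u)]
    # Design matrix: rows 2b and 2b+1 both load on beta_b
    X = [[1.0 if c == m // 2 else 0.0 for c in range(v)] for m in range(u)]
    SinvF = solve(S, F)
    SinvX = [solve(S, [X[m][c] for m in range(u)]) for c in range(v)]
    XtSinvX = [[sum(X[m][a] * SinvX[c][m] for m in range(u))
                for c in range(v)] for a in range(v)]
    XtSinvF = [sum(X[m][a] * SinvF[m] for m in range(u)) for a in range(v)]
    beta = solve(XtSinvX, XtSinvF)
    resid = [F[m] - sum(X[m][a] * beta[a] for a in range(v)) for m in range(u)]
    X2 = sum(e * w for e, w in zip(resid, solve(S, resid)))
    return beta, X2


# Hypothetical 3 x 3 table of two raters' classifications.
beta, X2 = gsk_marginal_homogeneity([[20, 5, 3], [8, 15, 4], [6, 7, 12]])
print(beta, X2)
```

The returned `beta` values are the pooled estimates of the common marginal proportions for the first $r - 1$ categories; the last category is determined by the constraint that the proportions sum to 1.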
A significant test of fit statistic (2-33) indicates that the hypothesized linear model in (2-36) does not fit the data. This is equivalent to rejecting the null hypothesis of marginal homogeneity as stated in (2-35). The test statistic $X_{GSK}^2$ has an asymptotic chi-square null distribution with $d(r-1) - (r-1) = (d-1)(r-1)$ degrees of freedom.

COMPARING TEST PROCEDURES

This section of the chapter is devoted to a discussion of the relationships among the four procedures presented in the previous section. The relationships developed are, in some cases, based upon modifications and elaborations of previous research. Before any direct comparisons are made among the four procedures under consideration, a series of theoretical results are presented which serve as the basis for the subsequent comparisons.

Throughout this section the symbol $P_i$ will be used in place of $P_{j_1,\ldots,j_d}$ for the sake of notational simplicity, and the multinomial probability model will be assumed throughout.

Neyman and BAN Estimators

The mixed categorical data model of order-d, as considered in this dissertation, assumes an $r \times r \times \cdots \times r$ contingency table of $d$ dimensions whose cell frequencies are characterized by the probability distribution

$$\phi = n! \prod_i \frac{P_i^{\,n_i}}{n_i!} \tag{2-38}$$

such that $\sum_i P_i = 1$ and $\sum_i n_i = n$. Let

$$f_m(\mathbf{P}) = 0, \qquad m = 1,2,\ldots,t \tag{2-39}$$

define $t$ linearly independent constraints on the components of $\mathbf{P}$ which are independent of the constraint $\sum_i P_i = 1$. It is assumed that the $f_m(\mathbf{P})$, $m = 1,2,\ldots,t$, possess continuous partial derivatives up to the second order with respect to the $P_i$ and that there is at least one solution such that $P_i > 0$ for all $i$.

The problem of estimating the components of the $\mathbf{P}$ vector given the model in (2-38) and the constraints (2-39) was considered by Neyman (1949).
Neyman defined a class of estimators known as best asymptotically normal (BAN) estimators which possess the following properties. If $\tilde{P}_i$ is a BAN estimator of $P_i$ then

(i) $\tilde{P}_i$ is a consistent estimator of the parameter $P_i$. This means that as $n$, the sample size, tends to infinity, the estimator $\tilde{P}_i$ approaches the parameter $P_i$.

(ii) $\sqrt{n}\,(\tilde{P}_i - P_i)$ is asymptotically normal with zero mean and asymptotic variance $\sigma^2$, where $\sigma^2$ is independent of $n$. This means that as $n$ tends to infinity, for any real number $t$,

$$P\left\{ \frac{\sqrt{n}\,(\tilde{P}_i - P_i)}{\sigma} < t \right\} \to P\{Z < t\}, \qquad \text{where } Z \sim N(0,1). \tag{2-40}$$

(iii) If $\tilde{P}_i^*$ is any estimator of $P_i$ satisfying (i) and (ii) but with $\sigma^*$ taking the place of $\sigma$ in (ii), then

$$\sigma^* \ge \sigma. \tag{2-41}$$

(iv) $\tilde{P}_i$ has continuous partial derivatives with respect to the observed proportions.

Neyman (1949) showed that BAN estimators for the components of $\mathbf{P}$ under the required restrictions can be obtained either by minimizing the quantities

$$\text{(a)}\ \ \sum_i \frac{(n_i - nP_i)^2}{nP_i} \qquad \text{or} \qquad \text{(b)}\ \ \sum_i \frac{(n_i - nP_i)^2}{n_i} \tag{2-42}$$

or by maximizing the quantity

$$\text{(c)}\ \ \prod_i P_i^{\,n_i} \tag{2-43}$$

with respect to the $P_i$ under the given constraints. Estimates obtained using criterion (a) are called minimum chi-square estimates. Estimates obtained using criterion (b) are called modified minimum chi-square estimates, while those estimates computed using criterion (c) are called maximum likelihood estimates. Maximum likelihood estimators constitute a subclass of BAN estimators. The motivation for Neyman's (1949) paper was to find a class of estimators more general than that of maximum likelihood but possessing the same desirable asymptotic properties as maximum likelihood estimators. It was hoped that in certain cases some of the BAN estimators would be easier to compute than maximum likelihood estimators. Neyman showed that if the $f_m(\mathbf{P})$, $m = 1,2,\ldots,t$, given in (2-39) are linear in the $P_i$, then a set of modified minimum chi-square estimates could be obtained by solving only a system of linear equations.
Neyman demonstrated that a null hypothesis defined by the $t$ linearly independent constraints given in (2-39) can be tested by using either the chi-square statistic, the modified chi-square statistic, or the likelihood ratio test statistic, defined respectively as

$$X^2 = \sum_i \frac{(n_i - n\tilde{P}_i)^2}{n\tilde{P}_i}, \qquad X_1^2 = \sum_i \frac{(n_i - n\tilde{P}_i)^2}{n_i}, \qquad -2\log\lambda = 2\sum_i n_i \left( \log n_i - \log n\tilde{P}_i \right). \tag{2-44}$$

Neyman proved that each of the statistics in (2-44), using any set of BAN estimators $\{\tilde{P}_i\}$ defined in (2-42, 2-43), has a limiting chi-square distribution with $t$ degrees of freedom under the null hypothesis as the sample size, $n$, approaches infinity. All three test criteria given in (2-44) are consistent for the null hypothesis being tested. That is, under any admissible form of the alternative, the power of each of the procedures tends to one as $n$ approaches infinity for every fixed level of significance.

Neyman showed that the three test criteria in (2-44) are asymptotically equivalent in the sense that the probability of the respective tests' contradicting each other tends to zero as $n$ approaches infinity for every admissible hypothesis specifying either the null or the alternative. The Neyman results were reported by Bhapkar (1966).

Mitra (1958) has shown that under a suitable sequence of alternatives tending to the null hypothesis at a suitable rate, the $X^2$ statistic given in (2-44) has a limiting noncentral chi-square distribution if the $\{\tilde{P}_i\}$ are maximum likelihood estimators. Bhapkar (1966) conjectured that Mitra's results should hold for any of the statistics in (2-44) using any system of BAN estimators. This conjecture was based on the asymptotic equivalence of the chi-square statistics in (2-44) and the fact that any BAN estimators possess the asymptotic properties of the maximum likelihood estimators used by Mitra in his proofs.
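The three criteria in (2-44) differ only in how the discrepancy between the observed frequencies and the fitted frequencies is scaled, which a short sketch can make visible. The example below is an assumption-laden illustration: it uses a hypothetical $2 \times 2$ table, for which marginal homogeneity is the same as equality of the two off-diagonal probabilities, so the restricted maximum likelihood estimates simply average the off-diagonal proportions. The function names are not from the source.

```python
import math


def pearson_chi_square(counts, probs):
    """X^2: squared deviations scaled by the fitted frequencies."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def modified_chi_square(counts, probs):
    """X_1^2: squared deviations scaled by the observed frequencies.
    Requires every observed count to be positive."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / c for c, p in zip(counts, probs))


def likelihood_ratio(counts, probs):
    """-2 log(lambda) = 2 * sum n_i (log n_i - log(n * P_i))."""
    n = sum(counts)
    return 2.0 * sum(c * (math.log(c) - math.log(n * p))
                     for c, p in zip(counts, probs) if c > 0)


# Hypothetical 2 x 2 table, flattened as (n11, n12, n21, n22).
counts = [10, 5, 15, 20]
n = sum(counts)
off_diagonal = (counts[1] + counts[2]) / (2 * n)
# Restricted ML estimates for this special case: diagonal cells keep
# their observed proportions, off-diagonal cells are averaged.
fitted = [counts[0] / n, off_diagonal, off_diagonal, counts[3] / n]

print(pearson_chi_square(counts, fitted))
print(modified_chi_square(counts, fitted))
print(likelihood_ratio(counts, fitted))
```

With moderate counts the three values are close but not identical, which is in line with the asymptotic (rather than exact) equivalence described above.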
The results of Neyman discussed in this subsection, together with certain results which are now discussed, serve as the basis for comparisons among the four procedures outlined in the second section of this chapter.

Some Results by Bhapkar

In this subsection, a major result by Bhapkar (1961) which links the modified minimum chi-square estimators to a quadratic form of the unbiased estimators of the $f_m(\mathbf{P})$ is presented. The result serves as a theoretical basis for the Grizzle, Starmer, and Koch (1969) methodology which uses a weighted least squares approach. The result also provides some common ground for comparing the four procedures presented in the second section.

Let the probability model in (2-38) be assumed. Also assume that $n$ is large enough so that $n_i > 0$ for all $i$. Consider a null hypothesis defined by $t$ linearly independent constraints on the $\{P_i\}$ (independent of $\sum_i P_i = 1$) of the form

$$H_0: f_m(\mathbf{P}) = \sum_i f_{mi} P_i = 0, \qquad m = 1,2,\ldots,t. \tag{2-45}$$

The $f_{mi}$ are known constants such that the equations in (2-45), together with $\sum_i P_i = 1$, have at least one set of solutions for which the $P_i$'s are all positive.

Let

$$\hat{f}_m = f_m(\hat{\mathbf{P}}) = \sum_i f_{mi} \hat{P}_i, \quad m = 1,2,\ldots,t, \qquad \underset{1 \times t}{\hat{\mathbf{f}}'} = [\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_t], \tag{2-46}$$

with $\hat{P}_i = n_i / n$ for all $i$. The $\hat{f}_m$, $m = 1,2,\ldots,t$, are the unbiased estimates of the corresponding $f_m(\mathbf{P})$.

Using the multinomial probability model defined in (2-38) and the linearity of the covariance operator, it can be shown that

$$\mathrm{Cov}(\hat{f}_m, \hat{f}_{m'}) = \frac{1}{n} \sum_i f_{mi} f_{m'i} P_i (1 - P_i) - \frac{1}{n} \sum_{i \ne i'} f_{mi} f_{m'i'} P_i P_{i'} = \frac{1}{n} \sum_i f_{mi} f_{m'i} P_i - \frac{1}{n} f_m(\mathbf{P}) f_{m'}(\mathbf{P}) = \phi_{mm'}. \tag{2-47}$$

Let the variance-covariance matrix of $\hat{\mathbf{f}}$ be denoted as $\Phi$ where

$$\underset{t \times t}{\Phi} = ((\phi_{mm'})), \qquad m, m' = 1,2,\ldots,t. \tag{2-48}$$

Let $G$ denote the sample variance-covariance matrix of $\hat{\mathbf{f}}$ formed by replacing the $P_i$ in $\Phi$ by their respective sample estimates $\hat{P}_i$, with
$$\underset{t \times t}{G} = ((g_{mm'})), \qquad m, m' = 1,2,\ldots,t. \tag{2-49}$$

The $G$ matrix is a consistent estimator of $\Phi$, the true variance-covariance matrix of $\hat{\mathbf{f}}$, under both the null hypothesis and the alternative.

Theorem 2.1.

$$\min_{\text{subject to } H_0} X_1^2 = \hat{\mathbf{f}}'\, G^{-1}\, \hat{\mathbf{f}}. \tag{2-50}$$

The expression on the left side of the equality in (2-50) represents the value of the modified chi-square statistic (2-44) used in conjunction with modified minimum chi-square estimators, computed under the constraints of the null hypothesis. Theorem 2.1 shows that the $X_1^2$ method to test the linear hypothesis in (2-45) is algebraically equivalent to a test statistic based upon the asymptotic normality of the unbiased estimators of the $f_m(\mathbf{P})$, whose variance-covariance matrix is estimated by the sample variance-covariance matrix. The quadratic form in (2-50) bears some resemblance to Stuart's statistic (2-6). The explicit relationship is specified later in this chapter. A modification and elaboration of the proof of Theorem 2.1 originally formulated by Bhapkar (1961) is given in Appendix A.

Bhapkar's result given by Theorem 2.1 serves as the theoretical basis for the Grizzle, Starmer, and Koch (GSK) methodology presented earlier in the chapter. The GSK methodology is concerned with testing the null hypothesis that a specified linear model characterizes the data:

$$H_0: \underset{u \times 1}{\mathbf{F}(\mathbf{P})} = \underset{u \times v}{X}\ \underset{v \times 1}{\boldsymbol{\beta}} \tag{2-51}$$

where $X$ is a prespecified matrix of coefficients with full rank $v \le u$, and $\boldsymbol{\beta}$ is an unknown $(v \times 1)$ vector of parameters. Bhapkar (1966) stated that there exists a $[(u - v) \times u]$ matrix $C$ of full rank which is orthogonal to $X$ such that

$$C\,\mathbf{F}(\mathbf{P}) = C X \boldsymbol{\beta} = \underset{(u-v) \times 1}{\mathbf{0}}. \tag{2-52}$$

The model in (2-51) implies the $u - v$ constraint equations given in (2-52). These constraints represent $u - v$ linearly independent constraints on the components of the $\mathbf{P}$ vector.

As a concrete illustration of this abstract argument, consider the case of testing for marginal homogeneity in an $r \times r$ table where $r = 3$.
The null hypothesis can be formulated in terms of testing the fit of the linear model when the Koch and Reinfurt (1971) approach based on GSK methodology is used:

$$\begin{bmatrix} M_{11} \\ M_{21} \\ M_{12} \\ M_{22} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \qquad \underset{4 \times 1}{\mathbf{F}(\mathbf{P})} = \underset{4 \times 2}{X}\ \underset{2 \times 1}{\boldsymbol{\beta}}. \tag{2-53}$$

If the $C$ matrix is taken to be

$$\underset{2 \times 4}{C} = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix},$$

then $C$ and $X$ are orthogonal, $CX = \underset{2 \times 2}{0}$, and the linear model in (2-53) gives rise to the constraint equations $C\,\mathbf{F}(\mathbf{P}) = \mathbf{0}$, which can be written as

$$M_{11} - M_{21} = 0, \qquad M_{12} - M_{22} = 0. \tag{2-54}$$

For this illustration, testing for marginal homogeneity can be accomplished either by testing the fit of the model in (2-53) using the weighted least squares approach as given in (2-32 through 2-34) or by testing the null hypothesis as it is formulated in (2-54) in terms of linear constraints placed upon the components of the $\mathbf{P}$ vector. The constraint equations given by (2-54) can actually be written out in terms of the components of $\mathbf{P}$ by writing the marginal probabilities in terms of the $P_i$ as given in (2-2). Testing the null hypothesis as it is formulated in (2-54) can be accomplished by using any of the techniques given in (2-44).

Bhapkar (1966) showed that testing a null hypothesis of the form given in (2-52) using the $X_1^2$ statistic with modified minimum chi-square estimators computed under $H_0$ results in a test statistic which is algebraically equal to the statistic used to test the fit of the corresponding linear model given by (2-51). Bhapkar thus demonstrated that by thinking of the null hypothesis either in terms of a linear model (2-51) or in terms of a group of constraint equations which the linear model defines (2-52), two algebraically equivalent test procedures result.

If the null hypothesis is defined in terms of a linear model

$$H_0: \underset{u \times 1}{\mathbf{F}(\mathbf{P})} = \underset{u \times v}{X}\ \underset{v \times 1}{\boldsymbol{\beta}}, \tag{2-55}$$

the corresponding statistic to test $H_0$ is given by the sum of squares residual

$$X_{GSK}^2 = \left( \mathbf{F}(\hat{\mathbf{P}}) - X\tilde{\boldsymbol{\beta}} \right)' S^{-1} \left( \mathbf{F}(\hat{\mathbf{P}}) - X\tilde{\boldsymbol{\beta}} \right) \tag{2-33}$$

with $\tilde{\boldsymbol{\beta}}$ a weighted least squares estimator of $\boldsymbol{\beta}$
(2-34) and $S$ a consistent estimator of the variance-covariance matrix of $\mathbf{F}(\hat{\mathbf{P}})$ under both the null and alternative hypotheses. If the null hypothesis is defined in terms of the constraint equations

$$H_0: \underset{(u-v) \times 1}{\mathbf{f}(\mathbf{P})} = \underset{(u-v) \times u}{C}\; \underset{u \times 1}{\mathbf{F}(\mathbf{P})} = \underset{(u-v) \times 1}{\mathbf{0}} \tag{2-52}$$

which are induced by the linear model, then it can be tested using the statistic

$$X_1^2 = \sum_{\mathbf{i}} \frac{(n_{\mathbf{i}} - n \tilde{P}_{\mathbf{i}})^2}{n_{\mathbf{i}}} \tag{2-43}$$

where the $\{\tilde{P}_{\mathbf{i}}\}$ are the modified minimum chi-square estimators computed under the constraints given in (2-52) and the additional constraint that $\sum_{\mathbf{i}} P_{\mathbf{i}} = 1$.

Bhapkar's (1966) result states that the statistics $X^2_{GSK}$ and $X_1^2$ are algebraically equal whenever the latter is defined. Each has a limiting chi-square distribution with $u - v$ degrees of freedom under the null hypothesis. In addition, by Theorem 2.1 the result

$$X^2_{GSK} = \hat{\mathbf{f}}'\, G^{-1}\, \hat{\mathbf{f}} \tag{2-56}$$

follows. The $(u - v) \times (u - v)$ matrix $G$ is a consistent estimator of the variance-covariance matrix of $\hat{\mathbf{f}}$ under both the null and alternative hypotheses with

$$G = C S C' . \tag{2-57}$$

$X_1^2$ Statistic and Maximum Likelihood Estimators

It was shown by Theorem 2.1 that if a null hypothesis of the form (2-45)

$$H_0: f_m(\mathbf{P}) = \sum_{\mathbf{i}} f_{m\mathbf{i}} P_{\mathbf{i}} = 0, \qquad m = 1, 2, \ldots, t$$

is tested using the $X_1^2$ statistic in conjunction with modified minimum chi-square estimators computed under the null hypothesis, a statistic results which is equal to the quadratic form $\hat{\mathbf{f}}' G^{-1} \hat{\mathbf{f}}$ with $\hat{\mathbf{f}}$ defined by (2-46) and $G$ defined in (2-47 - 2-48). Define

$${}_0G = (({}_0g_{mm'})), \qquad m, m' = 1, 2, \ldots, t \tag{2-58}$$

with

$${}_0g_{mm'} = \frac{1}{n} \sum_{\mathbf{i}} f_{m\mathbf{i}}\, f_{m'\mathbf{i}}\, \hat{P}_{\mathbf{i}} .$$

The ${}_0G$ matrix is a consistent estimator of the variance-covariance matrix of $\hat{\mathbf{f}}$ provided the null hypothesis is true. From (2-47) and the definition of ${}_0G$ in (2-58) the relationship between $G$ and ${}_0G$ can be written as

$$G = {}_0G - \frac{1}{n}\, \hat{\mathbf{f}}\, \hat{\mathbf{f}}' . \tag{2-59}$$

In the computation of ${}_0G$, the zero vector is taken as an estimate of $\mathbf{f}$, which results in a consistent estimate only under the condition that the null hypothesis (2-45) holds.

Theorem 2.2.
If the null hypothesis

$$H_0: f_m(\mathbf{P}) = \sum_{\mathbf{i}} f_{m\mathbf{i}} P_{\mathbf{i}} = 0, \qquad m = 1, 2, \ldots, t$$

is tested using the $X_1^2$ statistic with approximate maximum likelihood estimators computed under $H_0$ then

$$X_1^{2*} = \hat{\mathbf{f}}'\; {}_0G^{-1}\, \hat{\mathbf{f}} \tag{2-60}$$

where $X_1^{2*}$ is used to denote the modified chi-square statistic computed with approximate maximum likelihood estimators.

The proof of Theorem 2.2, which is an extension and elaboration of a proof given by Bennett (1968), appears in Appendix B.

It can be shown that when $\hat{\mathbf{f}}$ is taken to be the vector $\mathbf{d}$ defined in (2-6) the quadratic form in (2-60) is the test statistic given by Stuart (1955), which was the first procedure introduced (2-5 - 2-7) to test for homogeneity of marginal distributions.

Explicit Relationships Among Statistics

The four statistics discussed in section two of the chapter are now compared from the standpoint of the research and results cited in (2-38 through 2-60). Under consideration are the $X_S^2$ statistic of Stuart (1955), the $X_M^2$ statistic of Madansky (1963) based upon the likelihood ratio criterion, the $X_I^2$ statistic of Ireland et al. (1969) based upon minimum discrimination information estimation, and the $X^2_{GSK}$ statistic of Koch and Reinfurt (1971) based upon fitting a specified linear model by weighted least squares.

There is an algebraic relationship between the $X^2_{GSK}$ and $X_S^2$ statistics. For the case of marginal homogeneity of the $r \times r$ table the linear model can be specified as

$$\begin{bmatrix} M_{11} \\ M_{21} \\ M_{12} \\ M_{22} \\ \vdots \\ M_{1,r-1} \\ M_{2,r-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 1 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{r-1} \end{bmatrix} \tag{2-61}$$

using the Koch and Reinfurt approach.
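This model leads to the resulting constraint equations

$$H_0: \begin{cases} M_{21} - M_{11} = 0 \\ M_{22} - M_{12} = 0 \\ \quad\vdots \\ M_{2,r-1} - M_{1,r-1} = 0 \end{cases} \quad \text{or equivalently} \quad \begin{cases} n(M_{21} - M_{11}) = 0 \\ n(M_{22} - M_{12}) = 0 \\ \quad\vdots \\ n(M_{2,r-1} - M_{1,r-1}) = 0 \end{cases} \tag{2-62}$$

Let

$$d_i = n(\hat{M}_{2i} - \hat{M}_{1i}), \quad i = 1, 2, \ldots, r-1, \qquad \underset{1 \times (r-1)}{\mathbf{d}'} = [d_1, d_2, \ldots, d_{r-1}] .$$

The result that $X^2_{GSK} = \hat{\mathbf{f}}' G^{-1} \hat{\mathbf{f}}$ with $\hat{\mathbf{f}}$ taken as $\mathbf{d}$, and the relationship between $G$ and ${}_0G$ given in (2-59) with ${}_0G$ taken as $\Sigma_{\mathbf{d}}$ defined by (2-7), can be used to write the $X^2_{GSK}$ statistic to test $H_0$ as

$$X^2_{GSK} = \mathbf{d}' \left[\Sigma_{\mathbf{d}} - \frac{1}{n}\, \mathbf{d}\mathbf{d}'\right]^{-1} \mathbf{d} . \tag{2-63}$$

For notational simplicity let $A$ equal $\Sigma_{\mathbf{d}}$.

Claim: By Ireland et al. (1969)

$$\left[A - \frac{1}{n}\, \mathbf{d}\mathbf{d}'\right]^{-1} = A^{-1} + \frac{A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n - \mathbf{d}' A^{-1} \mathbf{d}} . \tag{2-64}$$

Proof of Claim: Show

$$\left[A - \frac{1}{n}\, \mathbf{d}\mathbf{d}'\right] \left[A^{-1} + \frac{A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n - \mathbf{d}' A^{-1} \mathbf{d}}\right] = I \tag{2-65}$$

where $I$ is the identity matrix of dimensions $(r-1) \times (r-1)$. The product of the two matrices in (2-65) can be written as

$$\begin{aligned}
& A A^{-1} - \frac{1}{n}\, \mathbf{d}\mathbf{d}' A^{-1} + \frac{A A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n - \mathbf{d}' A^{-1} \mathbf{d}} - \frac{\mathbf{d}\mathbf{d}' A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n(n - \mathbf{d}' A^{-1} \mathbf{d})} \\
&\quad = A A^{-1} + \frac{-n\, \mathbf{d}\mathbf{d}' A^{-1} + \mathbf{d}(\mathbf{d}' A^{-1} \mathbf{d})\mathbf{d}' A^{-1} + n\, \mathbf{d}\mathbf{d}' A^{-1} - \mathbf{d}\mathbf{d}' A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n(n - \mathbf{d}' A^{-1} \mathbf{d})} \\
&\quad = A A^{-1} + \frac{\mathbf{d}(\mathbf{d}' A^{-1} \mathbf{d})\mathbf{d}' A^{-1} - \mathbf{d}(\mathbf{d}' A^{-1} \mathbf{d})\mathbf{d}' A^{-1}}{n(n - \mathbf{d}' A^{-1} \mathbf{d})} \\
&\quad = A A^{-1} + 0 = \underset{(r-1) \times (r-1)}{I}
\end{aligned}$$

where the third line uses the fact that $\mathbf{d}' A^{-1} \mathbf{d}$ is a scalar. In a similar manner it can be shown that

$$\left[A^{-1} + \frac{A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n - \mathbf{d}' A^{-1} \mathbf{d}}\right] \left[A - \frac{1}{n}\, \mathbf{d}\mathbf{d}'\right] = I .$$

Substituting the result in (2-64) into (2-63) one obtains

$$\begin{aligned}
X^2_{GSK} &= \mathbf{d}' \left[A^{-1} + \frac{A^{-1} \mathbf{d}\mathbf{d}' A^{-1}}{n - \mathbf{d}' A^{-1} \mathbf{d}}\right] \mathbf{d} = \mathbf{d}' A^{-1} \mathbf{d} + \frac{(\mathbf{d}' A^{-1} \mathbf{d})^2}{n - \mathbf{d}' A^{-1} \mathbf{d}} \\
&= \frac{n\, \mathbf{d}' A^{-1} \mathbf{d} - (\mathbf{d}' A^{-1} \mathbf{d})^2 + (\mathbf{d}' A^{-1} \mathbf{d})^2}{n - \mathbf{d}' A^{-1} \mathbf{d}} = \frac{\mathbf{d}' A^{-1} \mathbf{d}}{1 - \frac{1}{n}\, \mathbf{d}' A^{-1} \mathbf{d}} ;
\end{aligned} \tag{2-66}$$

but $\mathbf{d}' \Sigma_{\mathbf{d}}^{-1} \mathbf{d}$ is the Stuart statistic. Hence from (2-66) the relationship between $X^2_{GSK}$ and $X_S^2$ can be stated as

$$X^2_{GSK} = \frac{X_S^2}{1 - \frac{1}{n} X_S^2} \qquad \text{or} \qquad X_S^2 = \frac{X^2_{GSK}}{1 + \frac{1}{n} X^2_{GSK}} . \tag{2-67}$$

As $n \to \infty$ under $H_0$, $X^2_{GSK} \to X_S^2$, although $X^2_{GSK} \ge X_S^2$ for all finite $n$. When $n$ is large and the components of $\mathbf{d}$ are small, the $X^2_{GSK}$ and $X_S^2$ statistics will be close in value.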
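The conversion in (2-67) can be checked numerically. The following is a minimal sketch rather than part of the original development: the function names are hypothetical, and it assumes $0 \le X_S^2 < n$ so that the denominator in (2-67) stays positive.

```python
def gsk_from_stuart(x_s, n):
    """Convert Stuart's statistic to the GSK statistic using (2-67)."""
    if not 0 <= x_s < n:
        raise ValueError("(2-67) requires 0 <= X_S^2 < n")
    return x_s / (1 - x_s / n)


def stuart_from_gsk(x_gsk, n):
    """Convert the GSK statistic back to Stuart's statistic using (2-67)."""
    return x_gsk / (1 + x_gsk / n)


# Hypothetical values: a Stuart statistic of 5 based on n = 50 subjects.
x_gsk = gsk_from_stuart(5.0, 50)
x_s_again = stuart_from_gsk(x_gsk, 50)
```

With these hypothetical inputs $X^2_{GSK} = 5/0.9 \approx 5.56$, which exceeds $X_S^2 = 5$ as the inequality following (2-67) requires, and converting back recovers the original value.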
The $X^2_{GSK}$ statistic provides a more powerful test under forms of the alternative hypothesis since it uses a variance-covariance matrix whose variance terms are smaller in value than those used by $X_S^2$.

In addition to the algebraic relationships given in (2-67), there is an underlying relationship among all four statistics. Each of the techniques either is itself or reduces to a large sample chi-square test based upon the use of BAN estimators. The $X_S^2$ statistic of Stuart can be shown to be approximately equal to the modified chi-square statistic $X_1^2$ (2-44) used in conjunction with approximate maximum likelihood estimators. It was also shown that the $X^2_{GSK}$ statistic of Koch and Reinfurt is equal to the modified chi-square statistic $X_1^2$ used in conjunction with modified minimum chi-square estimators. The $X_M^2$ statistic of Madansky is the likelihood ratio test statistic (2-44) used in conjunction with maximum likelihood estimators. According to Neyman (1949) these three techniques possess the same asymptotic properties and are all asymptotically equivalent.

Ireland et al. (1969) pointed out that by using results of Taylor (1953) and Neyman (1949) it can be demonstrated that minimum discrimination information estimators are also BAN and the $X_I^2$ statistic thus belongs to the same general class of statistics as $X^2_{GSK}$, $X_M^2$ and $X_S^2$.

The technique of estimating a set of cell probabilities $\{P_{\mathbf{i}}\}$ under the constraints of a given null hypothesis using MDIE consists of finding that $\mathbf{P}$ vector whose components satisfy the null hypothesis and which minimizes the function

$$I(\mathbf{P}, \hat{\mathbf{P}}) = \sum_{\mathbf{i}} P_{\mathbf{i}} \log \frac{P_{\mathbf{i}}}{\hat{P}_{\mathbf{i}}} . \tag{2-68}$$

If, instead of minimizing the function in (2-68), a $\mathbf{P}$ vector is chosen which minimizes the function

$$I(\hat{\mathbf{P}}, \mathbf{P}) = \sum_{\mathbf{i}} \hat{P}_{\mathbf{i}} \log \frac{\hat{P}_{\mathbf{i}}}{P_{\mathbf{i}}} , \tag{2-69}$$

an estimation procedure results which is also referred to as minimum discrimination estimation by Kullback (1968). The estimates found using (2-68) and (2-69) are, in general, different.
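The asymmetry between the two distance-like functions can be seen in a small numerical sketch. The code below is illustrative only: the function name and both probability vectors are hypothetical, and it merely evaluates (2-68) and (2-69) at one fixed pair of vectors; it does not carry out the constrained minimization that defines either estimator.

```python
import math


def directed_divergence(p, q):
    """Return sum_i p_i * log(p_i / q_i), taking 0 * log(0 / q_i) as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p_hat = [0.5, 0.3, 0.2]  # hypothetical observed proportions
p_fit = [0.4, 0.4, 0.2]  # hypothetical candidate probabilities

value_2_68 = directed_divergence(p_fit, p_hat)  # I(P, P-hat), as in (2-68)
value_2_69 = directed_divergence(p_hat, p_fit)  # I(P-hat, P), as in (2-69)
```

Because the two values differ for the same pair of vectors, minimizing one function over the constrained set need not give the same minimizer as minimizing the other.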
When (2-69) is rewritten as

$$\begin{aligned}
I(\hat{\mathbf{P}}, \mathbf{P}) &= \sum_{\mathbf{i}} \frac{n_{\mathbf{i}}}{n} \log \frac{n_{\mathbf{i}}}{n} - \sum_{\mathbf{i}} \frac{n_{\mathbf{i}}}{n} \log P_{\mathbf{i}} \\
&= \text{constant}_1 - \frac{1}{n} \sum_{\mathbf{i}} n_{\mathbf{i}} \log P_{\mathbf{i}} \\
&= \text{constant}_1 - \frac{1}{n} \log \prod_{\mathbf{i}} P_{\mathbf{i}}^{\,n_{\mathbf{i}}}
\end{aligned} \tag{2-70}$$

it is seen that the value of $\mathbf{P}$ which minimizes $I(\hat{\mathbf{P}}, \mathbf{P})$ is the same as that value of $\mathbf{P}$ which maximizes the quantity

$$\log \prod_{\mathbf{i}} P_{\mathbf{i}}^{\,n_{\mathbf{i}}} = \log L(\mathbf{P}) + \text{constant}_2 \tag{2-71}$$

where $L(\mathbf{P})$ is the likelihood function defined in (2-10). The equivalence of minimum discrimination estimation and maximum likelihood estimation thus follows from this last result, provided the distance-like measure defined by (2-69) is used. The technique of MDIE has appeared in the literature using both definitions of distance, (2-68) and (2-69). When (2-69) is used, the resulting $X_I^2$ statistic, which is given by

$$X_I^2 = 2 \sum_{\mathbf{i}} n_{\mathbf{i}} \log \frac{n_{\mathbf{i}}}{n \tilde{P}_{\mathbf{i}}} , \tag{2-72}$$

is equal to the likelihood ratio test statistic given by (2-44) since the $\{\tilde{P}_{\mathbf{i}}\}$ are maximum likelihood estimators. When the distance-like measure defined in (2-68) is used, the $X_I^2$ statistic is generally not the same as the likelihood ratio statistic. Berkson (1972) noted that the differences between the two estimation procedures defined by the two different distance functions are analogous to the differences between the modified minimum chi-square and minimum chi-square estimation procedures defined in (2-42) respectively. The point of comparison is the fact that $I(\mathbf{P}, \hat{\mathbf{P}})$ and $I(\hat{\mathbf{P}}, \mathbf{P})$ interchange the observed and estimated probabilities, as do the respective chi-square procedures. Berkson then stated that the $X_I^2$ statistic using estimation procedure (2-68) is to the $X_1^2$ statistic (2-44) as the $X_I^2$ statistic using estimation procedure (2-69) is to the $X^2$ statistic (2-44). The analogy is a conceptual rather than a mathematical statement.

It is conjectured that Ireland et al. (1969) chose to use the distance-like function given by (2-68) rather than that given by (2-69) to define their statistic because of the difficulty in computing maximum likelihood estimators for the problem of testing for marginal homogeneity.
The use of the (2-69) definition would require the computation of maximum likelihood estimators.

This somewhat lengthy discussion of the $X_I^2$ statistic and two associated estimation procedures has been given in order to relate this statistic to the better known statistics in (2-44) which have served as the basis for comparing the techniques discussed in the second section of the chapter. It should be pointed out that the comparisons made are all based upon large sample theory.

In Chapter III an explicit statement of a statistic which is algebraically equivalent to $X^2_{GSK}$ is given. The statistic has the advantage of not requiring a knowledge of linear models as is the case for $X^2_{GSK}$. A development of the large sample distribution of the statistic is given. In addition post hoc procedures which are used to locate sources of significance are developed and illustrated.

CHAPTER III

THE $I^2$ STATISTIC AND ASSOCIATED POST HOC PROCEDURES

THE $I^2$ STATISTIC

In Chapter II four different techniques to test for homogeneity of the marginal distributions in the mixed categorical data model were given. The $X_I^2$ statistic of Ireland et al. (1969) and the $X_M^2$ statistic of Madansky (1963) each require iterative procedures to obtain estimates of the cell probabilities computed under the null hypothesis. The $X^2_{GSK}$ statistic of Koch and Reinfurt (1971) does not require the estimation of the individual cell probabilities but does require the estimation of a vector of parameters specified by the linear model which is to be fitted to the data. In the Koch and Reinfurt procedure the parameters of the model are estimated using a weighted least squares technique and the estimates are computed using a straightforward chain of matrix multiplications and inversions.
Although the $X^2_{GSK}$ statistic is less computationally difficult than either the $X_I^2$ or $X_M^2$ statistics, it does require a certain knowledge of linear models and weighted least squares analysis which may not be possessed by potential users of the technique.

The statistic developed in this chapter has the advantage of both computational simplicity relative to the $X_M^2$ and $X_I^2$ statistics and conceptual simplicity relative to the $X^2_{GSK}$ statistic in that a knowledge of linear models is not required in order to understand the technique. If the null hypothesis of marginal homogeneity is stated in terms of $t$ linearly independent constraint equations, each linearly independent of the constraint $\sum_{\mathbf{i}} P_{\mathbf{i}} = 1$, and is written in the form

$$H_0: f_m = \sum_{\mathbf{i}} f_{m\mathbf{i}} P_{\mathbf{i}} = 0 \qquad \text{for } m = 1, \ldots, t \tag{3-1}$$

a statistic to test $H_0$ can be written as

$$X^2 = \hat{\mathbf{f}}'\, G^{-1}\, \hat{\mathbf{f}}$$

with $\hat{\mathbf{f}}$ and $G$ as defined in (2-46) and (2-49) of Chapter II respectively. The structure of the null hypothesis and associated test statistic in (3-1) serve as the general framework for developing an explicit statistic to test for marginal homogeneity in the mixed categorical data model of order-d.

Model and Hypothesis

Throughout the discussion it is assumed that $d$ repeated measures are taken on each of $n$ subjects and the measures are based on the same $r$-level categorical dependent variable. For notational simplicity the numbers $1, 2, \ldots, r$ are used to index the levels of the dependent variable. The data from such a design can be represented as an $r \times r \times \cdots \times r$ contingency table of $d$ dimensions with a multinomial distribution assumed to fit the observed set of $r^d$ frequencies. Any cell in the contingency table can be represented as an ordered $d$-tuple each of whose components represents a particular level of the dependent variable on a particular measure. In general any cell can be represented in the form $(j_1, j_2, \ldots, j_d)$, with $j_i = 1, 2, \ldots, r$ for all $i = 1, 2, \ldots, d$.
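Following the notation introduced in Chapter II let $n_{j_1, \ldots, j_d}$ denote the number of observations which fall in cell $(j_1, \ldots, j_d)$. Let $P_{j_1, \ldots, j_d}$ denote the probability of a subject's set of responses falling into cell $(j_1, \ldots, j_d)$, with the cell probabilities summing to one. Let $M_{qb}$ denote the marginal probability that a subject is classified by category $b$ on the $q$th measure with

$$M_{qb} = \sum_{\substack{(j_1, \ldots, j_d) \\ \text{with } j_q = b}} P_{j_1, \ldots, j_d}, \qquad q = 1, \ldots, d; \quad b = 1, \ldots, r . \tag{3-2}$$

An unbiased estimator of $P_{j_1, \ldots, j_d}$ is

$$\hat{P}_{j_1, \ldots, j_d} = \frac{n_{j_1, \ldots, j_d}}{n} \tag{3-3}$$

and an unbiased estimator of $M_{qb}$ is

$$\hat{M}_{qb} = \sum_{\substack{(j_1, \ldots, j_d) \\ \text{with } j_q = b}} \hat{P}_{j_1, \ldots, j_d}$$

with

$$\sum_{b=1}^{r} M_{qb} = \sum_{b=1}^{r} \hat{M}_{qb} = 1 \qquad \forall q = 1, \ldots, d .$$

Define

$$\bar{M}_{\cdot b} = \frac{1}{d} \sum_{q=1}^{d} M_{qb}, \qquad \hat{\bar{M}}_{\cdot b} = \frac{1}{d} \sum_{q=1}^{d} \hat{M}_{qb} \qquad \forall b = 1, \ldots, r . \tag{3-4}$$

Using the definitions of $\bar{M}_{\cdot b}$ and $\hat{\bar{M}}_{\cdot b}$ and the constraints on the marginal probabilities given in (3-3) it follows that

$$\sum_{b=1}^{r} (M_{qb} - \bar{M}_{\cdot b}) = \sum_{b=1}^{r} (\hat{M}_{qb} - \hat{\bar{M}}_{\cdot b}) = 0 \qquad \forall q = 1, \ldots, d \tag{3-5}$$

and

$$\sum_{q=1}^{d} (M_{qb} - \bar{M}_{\cdot b}) = \sum_{q=1}^{d} (\hat{M}_{qb} - \hat{\bar{M}}_{\cdot b}) = 0 \qquad \forall b = 1, \ldots, r . \tag{3-6}$$

The null hypothesis of equality of the $d$ marginal distributions can be stated as

$$H_0: M_{1b} = M_{2b} = \cdots = M_{db} \qquad \forall b = 1, \ldots, r$$

or equivalently as

$$H_0: \underset{(dr \times 1)}{\mathbf{V}} = \underset{(dr \times 1)}{\mathbf{0}} \tag{3-7}$$

where

$$\underset{(1 \times d)}{\mathbf{V}_b'} = [M_{1b} - \bar{M}_{\cdot b},\; M_{2b} - \bar{M}_{\cdot b},\; \ldots,\; M_{db} - \bar{M}_{\cdot b}] \qquad \forall b = 1, \ldots, r \tag{3-8}$$

and $\mathbf{V}' = [\mathbf{V}_1', \mathbf{V}_2', \ldots, \mathbf{V}_r']$. The corresponding sample estimates are given by

$$\hat{\mathbf{V}}_b' = [\hat{M}_{1b} - \hat{\bar{M}}_{\cdot b},\; \hat{M}_{2b} - \hat{\bar{M}}_{\cdot b},\; \ldots,\; \hat{M}_{db} - \hat{\bar{M}}_{\cdot b}] \qquad \forall b = 1, \ldots, r \tag{3-9}$$

and $\hat{\mathbf{V}}' = [\hat{\mathbf{V}}_1', \hat{\mathbf{V}}_2', \ldots, \hat{\mathbf{V}}_r']$.

In the next subsection a statistic for testing $H_0: \mathbf{V} = \mathbf{0}$ is developed and its large sample distribution is derived. The test statistic is a quadratic form which involves a reduced form $\hat{\mathbf{V}}^*$ of the vector $\hat{\mathbf{V}}$, and the sample variance-covariance matrix $\hat{\Sigma}_{\hat{\mathbf{V}}^*}$, which is a consistent estimator of $\Sigma_{\hat{\mathbf{V}}^*}$. The test statistic has the form

$$I^2 = \hat{\mathbf{V}}^{*\prime}\, \hat{\Sigma}_{\hat{\mathbf{V}}^*}^{-1}\, \hat{\mathbf{V}}^* \tag{3-10}$$

which is shown under $H_0$ and for large $n$ to be approximately chi-square with $(d-1)(r-1)$ degrees of freedom.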
Large Sample Distribution of $I^2$ Under $H_0$

Because each subject is measured $d$ times, the outcome for subject $i$ can be represented as falling into cell $(j_{1i}, j_{2i}, \ldots, j_{di})$. Define a set of Bernoulli random variables on the sample space of outcomes as follows:

$$X_{a_1, a_2, \ldots, a_d}((j_1, j_2, \ldots, j_d)) = \begin{cases} 1 & \text{if } a_k = j_k \;\; \forall k = 1, \ldots, d \\ 0 & \text{otherwise.} \end{cases} \tag{3-11}$$

In all there will be $r^d$ such Bernoulli random variables, with one being defined for each value of $a_k$, $a_k = 1, 2, \ldots, r$ and $k = 1, 2, \ldots, d$. One can now represent the $d$ responses for each subject $i$ as a vector of the form

$$\underset{r^d \times 1}{\mathbf{X}_i} = \begin{bmatrix} X_{1,1,\ldots,1}((j_{1i}, j_{2i}, \ldots, j_{di})) \\ X_{1,1,\ldots,2}((j_{1i}, j_{2i}, \ldots, j_{di})) \\ \vdots \\ X_{1,1,\ldots,r}((j_{1i}, j_{2i}, \ldots, j_{di})) \\ \vdots \\ X_{j_1, j_2, \ldots, j_d}((j_{1i}, j_{2i}, \ldots, j_{di})) \\ \vdots \\ X_{r,r,\ldots,r}((j_{1i}, j_{2i}, \ldots, j_{di})) \end{bmatrix} \tag{3-12}$$

The vector in (3-12) contains $r^d - 1$ zeros and a single 1 which indicates the cell into which subject $i$'s set of observations falls. The random vector $\mathbf{X}_i$ has a multinomial distribution with parameters $\{P_{j_1, \ldots, j_d}\}$ and $n = 1$. Let

$$\underset{1 \times r^d}{\mathbf{P}'} = [P_{j_1, \ldots, j_d}] \qquad \text{and} \qquad \underset{1 \times r^d}{\hat{\mathbf{P}}'} = [\hat{P}_{j_1, \ldots, j_d}] . \tag{3-13}$$

By the definition of $\mathbf{X}_i$, $\hat{\mathbf{P}}$ can be written in the form

$$\hat{\mathbf{P}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{X}_i . \tag{3-14}$$

By the definition of the mixed categorical data model it is assumed that the $n$ subjects are a random sample from a homogeneous population. Then it follows that $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ are independent and identically distributed. By a form of the multivariate central limit theorem given in Rao (1965)

$$\sqrt{n}\, (\hat{\mathbf{P}} - \mathbf{P}) \xrightarrow{\;d\;} N_{r^d}(\mathbf{0}, \Sigma) \tag{3-15}$$

where $\Sigma = \Sigma_{\mathbf{X}_i}$. For a fixed $n$, where $n$ is taken large, it follows from (3-15) that $\hat{\mathbf{P}}$ is approximately $N_{r^d}(\mathbf{P}, \frac{1}{n} \Sigma)$, and this approximation improves as $n$ approaches infinity. Because a multinomial distribution is assumed to characterize the cell frequencies

$$\underset{(r^d \times r^d)}{\Sigma} = D_{\mathbf{P}} - \mathbf{P}\mathbf{P}' \tag{3-16}$$

where $D_{\mathbf{P}}$ is an $(r^d \times r^d)$ diagonal matrix with elements of the $\mathbf{P}$ vector on the main diagonal.
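Let

$$\underset{1 \times d}{\hat{\mathbf{M}}_b'} = [\hat{M}_{1b}, \hat{M}_{2b}, \ldots, \hat{M}_{db}] \quad \forall b = 1, \ldots, r \qquad \text{and} \qquad \underset{1 \times dr}{\hat{\mathbf{M}}'} = [\hat{\mathbf{M}}_1', \hat{\mathbf{M}}_2', \ldots, \hat{\mathbf{M}}_r'] . \tag{3-17}$$

It is possible to find a $dr \times r^d$ matrix $K$, the elements of which are zeros and ones, such that $K \hat{\mathbf{P}} = \hat{\mathbf{M}}$. This can be done because $\hat{\mathbf{M}}$ is a vector of estimates of marginal probabilities and $\hat{\mathbf{P}}$ is a vector of estimates of joint probabilities which, when summed appropriately, yield the marginal estimates. The following proposition from multivariate statistics is used to derive the approximate sampling distribution of $\hat{\mathbf{M}}$ for large $n$.

Proposition 1. If $\underset{k \times 1}{\mathbf{Y}} \sim N_k(\boldsymbol{\mu}, \Sigma_{\mathbf{Y}})$ and if $\underset{m \times k}{L}$ is a matrix of constants then $L\mathbf{Y} \sim N_m(L\boldsymbol{\mu}, L \Sigma_{\mathbf{Y}} L')$.

Using the facts that $\hat{\mathbf{P}} \sim N_{r^d}(\mathbf{P}, \Sigma_{\hat{\mathbf{P}}})$ for large $n$, and $K \hat{\mathbf{P}} = \hat{\mathbf{M}}$, from Proposition 1 it follows that the distribution of $\hat{\mathbf{M}}$ can be approximated by

$$\hat{\mathbf{M}} \sim N_{dr}(K \mathbf{P},\; K \Sigma_{\hat{\mathbf{P}}} K') . \tag{3-18}$$

It is now possible to find a $dr \times dr$ matrix $A$ such that $A \hat{\mathbf{M}} = \hat{\mathbf{V}}$, where $\hat{\mathbf{V}}$ is the sample counterpart (3-9) of the vector $\mathbf{V}$ defined by (3-8). Matrix $A$, which is a block-diagonal matrix, has the form

$$A = \begin{bmatrix} D & & & \\ & D & & \\ & & \ddots & \\ & & & D \end{bmatrix} \qquad \text{where} \quad \underset{d \times d}{D} = ((d_{ij})), \quad d_{ij} = \begin{cases} -\dfrac{1}{d} & \text{for } i \ne j \\[2mm] \dfrac{d-1}{d} & \text{for } i = j \end{cases} \tag{3-19}$$

with one block $D$ for each category $m = 1, 2, \ldots, r$.

By Proposition 1, the distribution of $\hat{\mathbf{V}} = A \hat{\mathbf{M}}$ can be approximated by a multivariate normal of the form $N_{dr}(A K \mathbf{P},\; A K \Sigma_{\hat{\mathbf{P}}} K' A')$ when $n$ is large. Because $\hat{\mathbf{V}}$ is an unbiased estimator of $\mathbf{V}$, the null hypothesis $\mathbf{V} = \mathbf{0}$ can be stated as $E(\hat{\mathbf{V}}) = \mathbf{0}$; hence under $H_0$, $\hat{\mathbf{V}}$ is approximately distributed as

$$\hat{\mathbf{V}} \sim N_{dr}(\mathbf{0}, \Sigma_{\hat{\mathbf{V}}}), \qquad \text{with} \quad \Sigma_{\hat{\mathbf{V}}} = A K \Sigma_{\hat{\mathbf{P}}} K' A' . \tag{3-20}$$

The following result from multivariate statistics provides the basis for constructing a statistic to test $H_0: \mathbf{V} = \mathbf{0}$ or equivalently to test $H_0: E(\hat{\mathbf{V}}) = \mathbf{0}$.

Proposition 2. If $\underset{k \times 1}{\mathbf{Y}} \sim N_k(\boldsymbol{\mu}, \Sigma_{\mathbf{Y}})$ and $\Sigma_{\mathbf{Y}}$ is of full rank, then $(\mathbf{Y} - \boldsymbol{\mu})' \Sigma_{\mathbf{Y}}^{-1} (\mathbf{Y} - \boldsymbol{\mu}) \sim \chi^2_k$.

Let $\hat{\Sigma}_{\hat{\mathbf{V}}}$ denote a sample estimate of $\Sigma_{\hat{\mathbf{V}}}$ found by replacing $\mathbf{P}$ in $\Sigma_{\hat{\mathbf{V}}}$ by $\hat{\mathbf{P}}$. Because $\hat{\Sigma}_{\hat{\mathbf{V}}}$ is a consistent estimator of $\Sigma_{\hat{\mathbf{V}}}$, a statistic to test $H_0$ is given by

$$I^2 = (\hat{\mathbf{V}} - \mathbf{0})'\, \hat{\Sigma}_{\hat{\mathbf{V}}}^{-1}\, (\hat{\mathbf{V}} - \mathbf{0}) = \hat{\mathbf{V}}'\, \hat{\Sigma}_{\hat{\mathbf{V}}}^{-1}\, \hat{\mathbf{V}} . \tag{3-21}$$

The statistic in (3-21) relies on (3-20) and the substitution of $\hat{\Sigma}_{\hat{\mathbf{V}}}^{-1}$, a consistent estimator of $\Sigma_{\hat{\mathbf{V}}}^{-1}$.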
If $\hat{\Sigma}_{\hat{\mathbf{V}}}$ were of full rank then $I^2$ would have an approximate $\chi^2$ distribution with $dr$ degrees of freedom under $H_0$. Large values of $I^2$ would provide evidence against $H_0$ and hence the upper tail of $\chi^2$ could serve as the rejection region. The only drawback to the preceding argument is that $\Sigma_{\hat{\mathbf{V}}}$ and $\hat{\Sigma}_{\hat{\mathbf{V}}}$ are both of deficient rank.

The rank of $\Sigma_{\hat{\mathbf{V}}}$ is determined by the number of linearly independent components in the random vector $\hat{\mathbf{V}}$. The components of $\hat{\mathbf{V}}$ are themselves random variables which can be thought of as elements in the inner product space $L_2$, the space of all random variables with finite second moments, with an inner product defined as follows:

$$\forall\, U_i, U_j \in L_2, \qquad (U_i, U_j) = \operatorname{Cov}(U_i, U_j) . \tag{3-22}$$

Claim: There are $(d-1)(r-1)$ linearly independent components in $\hat{\mathbf{V}}$.

The legitimacy of this claim rests on the restrictions introduced in (3-5 and 3-6). The restriction $\sum_{q=1}^{d} (\hat{M}_{qb} - \hat{\bar{M}}_{\cdot b}) = 0$ $\forall b = 1, \ldots, r$ implies that $r$ linearly dependent vectors are introduced, one for each value of $b$. For a fixed value of $b$, once any $d - 1$ terms in $\sum_{q=1}^{d} (\hat{M}_{qb} - \hat{\bar{M}}_{\cdot b})$ are known, the other term is determined and redundancy exists in the definition of $\hat{\mathbf{V}}$. The restriction $\sum_{b=1}^{r} (\hat{M}_{qb} - \hat{\bar{M}}_{\cdot b}) = 0$ $\forall q = 1, \ldots, d$ implies that $d$ linearly dependent vectors are introduced, one for each value of $q$. Knowing any $r - 1$ terms in $\sum_{b=1}^{r} (\hat{M}_{qb} - \hat{\bar{M}}_{\cdot b})$ determines the remaining term. Again, a redundancy is introduced. In all, $d + r - 1$ redundant components are introduced since one component is counted twice. Consequently there will be $dr - (d + r - 1) = (d-1)(r-1)$ linearly independent components in $\hat{\mathbf{V}}$. No information will be lost by deleting any $(d + r - 1)$ redundant components.

The goal is to form a new vector $\hat{\mathbf{V}}^*$ from $\hat{\mathbf{V}}$ by choosing any $(d-1)(r-1)$ linearly independent components from $\hat{\mathbf{V}}$. $\hat{\mathbf{V}}^*$ will be approximately $\hat{\mathbf{V}}^* \sim N_{(d-1)(r-1)}(\mathbf{0}, \Sigma_{\hat{\mathbf{V}}^*})$ under $H_0$ since any marginal distribution of a multivariate normal distribution is itself multivariate normal.
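The restrictions (3-5) and (3-6) that drive this rank argument can be verified on any table. The sketch below is an illustration under stated assumptions, not the author's program: the function names are hypothetical, the cell frequencies are made up, cells are taken in lexicographic order, and exact rational arithmetic is used so that the sums are exactly zero.

```python
from fractions import Fraction
from itertools import product


def marginal_matrix(d, r):
    """0/1 matrix mapping cell proportions to marginals stacked as in (3-17)."""
    cells = list(product(range(1, r + 1), repeat=d))
    return [[1 if cell[q] == b else 0 for cell in cells]
            for b in range(1, r + 1) for q in range(d)]


def centering_matrix(d, r):
    """Block-diagonal matrix A of (3-19)."""
    size = d * r
    a = [[Fraction(0)] * size for _ in range(size)]
    for b in range(r):
        for i in range(d):
            for j in range(d):
                a[b * d + i][b * d + j] = (
                    Fraction(d - 1, d) if i == j else Fraction(-1, d))
    return a


def matvec(matrix, vector):
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]


d, r = 3, 3
counts = [(7 * i) % 5 + 1 for i in range(r ** d)]  # hypothetical frequencies
n = sum(counts)
p_hat = [Fraction(c, n) for c in counts]
m_hat = matvec(marginal_matrix(d, r), p_hat)   # marginal proportions
v_hat = matvec(centering_matrix(d, r), m_hat)  # centered marginals, as in (3-9)

# (3-6): for each category b, the d components sum to zero.
sums_over_measures = [sum(v_hat[b * d:(b + 1) * d]) for b in range(r)]
# (3-5): for each measure q, the r components sum to zero.
sums_over_categories = [sum(v_hat[b * d + q] for b in range(r)) for q in range(d)]
```

For $d = r = 3$ the nine components of the centered vector satisfy $d + r - 1 = 5$ independent restrictions, leaving $(d-1)(r-1) = 4$ free components.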
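The matrix $\Sigma_{\hat{\mathbf{V}}^*}$ is then of full rank and for $n$ large, $\hat{\Sigma}_{\hat{\mathbf{V}}^*}$ will also be of full rank. As a consequence of Proposition 2, a statistic $I^2$ can be defined which, for large $n$, is approximately distributed under $H_0$ as

$$I^2 = \hat{\mathbf{V}}^{*\prime}\, \hat{\Sigma}_{\hat{\mathbf{V}}^*}^{-1}\, \hat{\mathbf{V}}^* \sim \chi^2_{(d-1)(r-1)} . \tag{3-23}$$

The vector $\hat{\mathbf{V}}^*$ is formed from $\hat{\mathbf{V}}$ by choosing any $(d-1)(r-1)$ linearly independent components of $\hat{\mathbf{V}}$. Define for each $b$, $b = 1, 2, \ldots, r$

$$\underset{1 \times (d-1)}{\hat{\mathbf{V}}_b^{*\prime}} = [\hat{M}_{1b} - \hat{\bar{M}}_{\cdot b},\; \ldots,\; \hat{M}_{m_b - 1,\, b} - \hat{\bar{M}}_{\cdot b},\; \hat{M}_{m_b + 1,\, b} - \hat{\bar{M}}_{\cdot b},\; \ldots,\; \hat{M}_{db} - \hat{\bar{M}}_{\cdot b}] \tag{3-24}$$

where $m_b$ is any integer such that $1 \le m_b \le d$ $\forall b$, and

$$\underset{1 \times (d-1)(r-1)}{\hat{\mathbf{V}}^{*\prime}} = [\hat{\mathbf{V}}_1^{*\prime},\; \ldots,\; \hat{\mathbf{V}}_{h-1}^{*\prime},\; \hat{\mathbf{V}}_{h+1}^{*\prime},\; \ldots,\; \hat{\mathbf{V}}_r^{*\prime}] \tag{3-25}$$

for any $h$ such that $1 \le h \le r$. The specific $\hat{\mathbf{V}}^*$ which is formed is determined by the choice of the $m_b$'s and $h$. If ${}_1\hat{\mathbf{V}}^*$ and ${}_2\hat{\mathbf{V}}^*$ represent two different vectors formed by two different choices of $(d-1)(r-1)$ linearly independent components from $\hat{\mathbf{V}}$, then

$$I^2 = {}_1\hat{\mathbf{V}}^{*\prime}\, \hat{\Sigma}_{{}_1\hat{\mathbf{V}}^*}^{-1}\, {}_1\hat{\mathbf{V}}^* = {}_2\hat{\mathbf{V}}^{*\prime}\, \hat{\Sigma}_{{}_2\hat{\mathbf{V}}^*}^{-1}\, {}_2\hat{\mathbf{V}}^* . \tag{3-26}$$

Invariance of $I^2$

The invariance of $I^2$ under mode of formation of $\hat{\mathbf{V}}^*$ is demonstrated in this subsection. The invariance follows as a consequence of the following theorem.

Theorem 3.1. Suppose that $\underset{1 \times t}{\mathbf{W}'} = [W_1, W_2, \ldots, W_t]$ such that $W_i \in L_2$ $\forall i$, $1 \le i \le t$, and also suppose that the $W_i$'s generate a subspace $S$ of $L_2$ such that $\dim(S) = k$ where $k < t$. Now let $\underset{1 \times k}{\mathbf{X}'} = [X_1, X_2, \ldots, X_k]$ where $\mathbf{X}$ is formed by choosing a set of $k$ linearly independent components from $\mathbf{W}$, and let $\mathbf{Y}' = [Y_1, Y_2, \ldots, Y_k]$ where $\mathbf{Y}$ is formed by choosing a different set of $k$ linearly independent components from $\mathbf{W}$. Suppose $\hat{\Sigma}_{\mathbf{X}}$ is a function from $(L_2)^k$ onto the class of $k \times k$ nonsingular variance-covariance matrices such that $\hat{\Sigma}_{T\mathbf{X}} = T \hat{\Sigma}_{\mathbf{X}} T'$ for all $k \times k$ real matrices $T$. Then

$$\mathbf{Y}'\, \hat{\Sigma}_{\mathbf{Y}}^{-1}\, \mathbf{Y} = \mathbf{X}'\, \hat{\Sigma}_{\mathbf{X}}^{-1}\, \mathbf{X} . \tag{3-27}$$

Proof. By definition of a basis, $\{X_1, X_2, \ldots, X_k\}$ and $\{Y_1, Y_2, \ldots, Y_k\}$ each form a basis for $S$. Hence $\mathbf{Y} = T\mathbf{X}$ where $T$ is a $k \times k$ matrix of full rank.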
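This result follows because one basis can be expressed as a nonsingular linear transformation of another basis for the same space $S$. By substitution

$$\mathbf{Y}'\, \hat{\Sigma}_{\mathbf{Y}}^{-1}\, \mathbf{Y} = \mathbf{X}' T'\, \hat{\Sigma}_{\mathbf{Y}}^{-1}\, T \mathbf{X} . \tag{3-28}$$

By hypothesis

$$\hat{\Sigma}_{\mathbf{Y}} = T\, \hat{\Sigma}_{\mathbf{X}}\, T' . \tag{3-29}$$

Substituting (3-29) into (3-28)

$$\mathbf{Y}'\, \hat{\Sigma}_{\mathbf{Y}}^{-1}\, \mathbf{Y} = \mathbf{X}' T' [T \hat{\Sigma}_{\mathbf{X}} T']^{-1} T \mathbf{X} = \mathbf{X}' T' (T')^{-1}\, \hat{\Sigma}_{\mathbf{X}}^{-1}\, T^{-1} T \mathbf{X} . \tag{3-30}$$

From (3-30) the final result

$$\mathbf{Y}'\, \hat{\Sigma}_{\mathbf{Y}}^{-1}\, \mathbf{Y} = \mathbf{X}'\, \hat{\Sigma}_{\mathbf{X}}^{-1}\, \mathbf{X} \tag{3-31}$$

is obtained.

The invariance of $I^2$ under mode of formation of $\hat{\mathbf{V}}^*$ follows directly from Theorem 3.1. In the specific case under consideration, two vectors ${}_1\hat{\mathbf{V}}^*$ and ${}_2\hat{\mathbf{V}}^*$ whose components generate the same subspace $S$ of dimension $(d-1)(r-1)$ are formed from the vector $\hat{\mathbf{V}}$ by choosing any $(d-1)(r-1)$ linearly independent components from $\hat{\mathbf{V}}$. In terms of the more general theorem, $\hat{\mathbf{V}}$, ${}_1\hat{\mathbf{V}}^*$, and ${}_2\hat{\mathbf{V}}^*$ play the roles of $\mathbf{W}$, $\mathbf{X}$, and $\mathbf{Y}$ respectively. The result in (3-31) with the appropriate substitutions made verifies the invariance of the $I^2$ statistic under mode of formation of $\hat{\mathbf{V}}^*$.

Computational Form of $I^2$

In this subsection the necessary formulas needed in the computation of the $I^2$ statistic are provided.

The majority of the work in calculating the $I^2$ statistic given by $\hat{\mathbf{V}}^{*\prime} \hat{\Sigma}_{\hat{\mathbf{V}}^*}^{-1} \hat{\mathbf{V}}^*$ goes into finding the variance-covariance matrix. Instead of calculating $\hat{\Sigma}_{\hat{\mathbf{V}}^*}$ directly, it is easier to write

$$\hat{\mathbf{V}}^* = \underset{(d-1)(r-1) \times dr}{A^*}\, \hat{\mathbf{M}} \qquad \text{and} \qquad \hat{\Sigma}_{\hat{\mathbf{V}}^*} = A^*\, \hat{\Sigma}_{\hat{\mathbf{M}}}\, A^{*\prime} .$$

Associated with each $\hat{\mathbf{V}}^*$ there are numbers $m_1, m_2, \ldots, m_r$ and $h$ defined in (3-24) and (3-25). The matrix $A^*$ associated with the corresponding $\hat{\mathbf{V}}^*$ can be found by deleting rows $m_1$, $[m_2 + d]$, $\ldots$, $[m_r + (r-1)d]$, and rows $[(h-1)d + 1]$ through $dh$ of matrix $A$ defined in (3-19). Row $[m_h + (h-1)d]$ will be included in the rows $[(h-1)d + 1]$ through $dh$ which are deleted.

An illustration of the formation of $A^*$ follows. For $d = 3$ and $r = 3$,

$$\underset{9 \times 1}{\hat{\mathbf{V}}} = \begin{bmatrix} \hat{M}_{11} - \hat{\bar{M}}_{\cdot 1} \\ \hat{M}_{21} - \hat{\bar{M}}_{\cdot 1} \\ \hat{M}_{31} - \hat{\bar{M}}_{\cdot 1} \\ \hat{M}_{12} - \hat{\bar{M}}_{\cdot 2} \\ \hat{M}_{22} - \hat{\bar{M}}_{\cdot 2} \\ \hat{M}_{32} - \hat{\bar{M}}_{\cdot 2} \\ \hat{M}_{13} - \hat{\bar{M}}_{\cdot 3} \\ \hat{M}_{23} - \hat{\bar{M}}_{\cdot 3} \\ \hat{M}_{33} - \hat{\bar{M}}_{\cdot 3} \end{bmatrix}, \qquad A = \frac{1}{3} \begin{bmatrix} 2 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & -1 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & -1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 2 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & -1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 2 & -1 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & 2 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & -1 & 2 \end{bmatrix} .$$

Consider a vector $\hat{\mathbf{V}}^*$ of the form

$$\hat{\mathbf{V}}^* = \begin{bmatrix} \hat{M}_{21} - \hat{\bar{M}}_{\cdot 1} \\ \hat{M}_{31} - \hat{\bar{M}}_{\cdot 1} \\ \hat{M}_{12} - \hat{\bar{M}}_{\cdot 2} \\ \hat{M}_{22} - \hat{\bar{M}}_{\cdot 2} \end{bmatrix}, \qquad h = 3, \quad m_1 = 1, \quad m_2 = 3 .$$

To find the appropriate $A^*$ matrix using the value of $h = 3$, rows 7, 8, and 9 are deleted from $A$. In addition, for $m_1 = 1$ and $m_2 = 3$, rows 1 and 6 are also deleted from $A$. The resulting $A^*$ matrix has the form

$$A^* = \frac{1}{3} \begin{bmatrix} -1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & -1 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & -1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 2 & -1 & 0 & 0 & 0 \end{bmatrix} .$$

It now remains to determine the structure of $\hat{\Sigma}_{\hat{\mathbf{M}}}$. For notational simplicity let $\operatorname{Cov}(\cdot)$ and $\operatorname{Var}(\cdot)$ denote population parameter values and let $C(\cdot)$ and $V(\cdot)$ denote consistent estimators of the former found by replacing the $P_{j_1, \ldots, j_d}$ with $\hat{P}_{j_1, \ldots, j_d}$. The formulas which follow are a consequence of the fact that a multinomial distribution characterizes the cell frequencies.

$$\operatorname{Var}(\hat{P}_{j_1, \ldots, j_d}) = \frac{1}{n} (P_{j_1, \ldots, j_d})(1 - P_{j_1, \ldots, j_d}) \tag{3-32}$$

$$V(\hat{P}_{j_1, \ldots, j_d}) = \frac{1}{n} (\hat{P}_{j_1, \ldots, j_d})(1 - \hat{P}_{j_1, \ldots, j_d}) \tag{3-33}$$

$$\operatorname{Cov}(\hat{P}_{j_1, \ldots, j_d}, \hat{P}_{j_1', \ldots, j_d'}) = -\frac{1}{n} (P_{j_1, \ldots, j_d})(P_{j_1', \ldots, j_d'}) \tag{3-34}$$

where for at least one value of $i = 1, \ldots, d$, $j_i \ne j_i'$, and

$$C(\hat{P}_{j_1, \ldots, j_d}, \hat{P}_{j_1', \ldots, j_d'}) = -\frac{1}{n} (\hat{P}_{j_1, \ldots, j_d})(\hat{P}_{j_1', \ldots, j_d'}) . \tag{3-35}$$

The following estimates of $\operatorname{Var}(\cdot)$ and $\operatorname{Cov}(\cdot)$ are obtained from the respective parameters in a similar fashion:

$$V(\hat{M}_{qb}) = \frac{1}{n}\, \hat{M}_{qb} (1 - \hat{M}_{qb}) \tag{3-36}$$

$$C(\hat{M}_{qb}, \hat{M}_{qb'}) = -\frac{1}{n}\, \hat{M}_{qb} \hat{M}_{qb'} \qquad \text{for } b \ne b' \tag{3-37}$$

$$C(\hat{M}_{qb}, \hat{M}_{q'k}) = \frac{1}{n} \sum_{\substack{j_q = b \\ j_{q'} = k}} \hat{P}_{j_1, \ldots, j_d} - \frac{1}{n}\, \hat{M}_{qb} \hat{M}_{q'k} \qquad \text{for } q \ne q'; \quad b, k = 1, \ldots, r \tag{3-38}$$

where the sum in (3-38) runs over $j_i = 1, \ldots, r$ for $i \ne q, q'$. Formula (3-38) results from writing the marginal proportions $\hat{M}_{qb}$ and $\hat{M}_{q'k}$ in terms of formula (3-3) and using the linearity of the covariance operator to obtain the algebraic simplification.

Define the following:

$$\underset{d \times d}{\hat{\Sigma}_{bb}} = \hat{\Sigma}_{\hat{\mathbf{M}}_b} \quad \text{where } \hat{\mathbf{M}}_b \text{ is defined in (3-17)}, \quad b = 1, \ldots, r$$

$$\hat{\Sigma}_{bb} = ((s_{km})), \qquad s_{km} = C(\hat{M}_{kb}, \hat{M}_{mb}) \;\; (k \ne m), \qquad s_{kk} = V(\hat{M}_{kb}), \qquad k, m = 1, \ldots, d \tag{3-39}$$

$$\underset{d \times d}{\hat{\Sigma}_{ij}} = C(\hat{\mathbf{M}}_i, \hat{\mathbf{M}}_j) \quad \text{for } i \ne j, \qquad C(\hat{\mathbf{M}}_i, \hat{\mathbf{M}}_j) = ((a_{gh})), \qquad a_{gh} = C(\hat{M}_{gi}, \hat{M}_{hj}), \qquad g, h = 1, \ldots, d . \tag{3-40}$$

The variance-covariance matrix $\hat{\Sigma}_{\hat{\mathbf{M}}}$, which is of dimension $dr \times dr$, can be expressed as a block matrix having the form

$$\hat{\Sigma}_{\hat{\mathbf{M}}} = \begin{bmatrix} \hat{\Sigma}_{11} & \hat{\Sigma}_{12} & \cdots & \hat{\Sigma}_{1r} \\ \hat{\Sigma}_{21} & \hat{\Sigma}_{22} & \cdots & \hat{\Sigma}_{2r} \\ \vdots & \vdots & & \vdots \\ \hat{\Sigma}_{r1} & \hat{\Sigma}_{r2} & \cdots & \hat{\Sigma}_{rr} \end{bmatrix} . \tag{3-41}$$

The form given in (3-41) is a consequence of the definition for $\hat{\mathbf{M}}$ given in (3-17).

The computational form of the $I^2$ statistic is then

$$I^2 = \hat{\mathbf{M}}' A^{*\prime} (A^*\, \hat{\Sigma}_{\hat{\mathbf{M}}}\, A^{*\prime})^{-1} A^* \hat{\mathbf{M}} . \tag{3-42}$$

Conceptually the $I^2$ statistic follows the general framework set forth in (3-1). The null hypothesis in (3-7) can equivalently be stated as

$$H_0: \underset{(d-1)(r-1) \times 1}{\mathbf{V}^*} = \mathbf{0} . \tag{3-43}$$

The null hypothesis in (3-43) defines $(d-1)(r-1)$ linearly independent constraints on the parameters $\{P_{j_1, \ldots, j_d}\}$, each of which is independent of the constraint that the cell probabilities sum to one. The $I^2$ statistic has the same form as the more general statistic given in (3-1) with $\hat{\mathbf{V}}^*$ replacing $\hat{\mathbf{f}}$ and $\hat{\Sigma}_{\hat{\mathbf{V}}^*}$ replacing $G$ in (3-1). From the results of Chapter II the $I^2$ statistic is thus algebraically equivalent to both the $X^2_{GSK}$ statistic and the $X_1^2$ statistic used in conjunction with modified minimum chi-square estimators. The null hypothesis of marginal homogeneity is rejected at the $\alpha$ level of significance whenever $I^2$ is greater than the $(1 - \alpha)$ quantile of chi-square with $(d-1)(r-1)$ degrees of freedom.

Asymptotic Power of $I^2$

Bhapkar (1966) conjectured that the results of Mitra (1958) concerning the power of the minimum chi-square statistic based on maximum likelihood estimators could be extended to the $X_1^2$ statistic based on modified minimum chi-square estimators. Because $I^2$ is algebraically equal to a $X_1^2$ statistic based on modified minimum chi-square estimators, Mitra's results will be applied to the $I^2$ statistic.

Let a sequence of alternatives to the null hypothesis be defined as

$$H_A^{(n)}: \mathbf{V}^* = \frac{\boldsymbol{\delta}}{\sqrt{n}} \tag{3-44}$$

where $\boldsymbol{\delta}$ is a vector of $(d-1)(r-1)$ constants not all of which are zero.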
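As an aside on the computational form (3-42), the calculation can be sketched in a few lines. The code below is a hedged illustration and not the Fortran IV program mentioned in the abstract: the function names are hypothetical, it takes $m_b = d$ for every $b$ and $h = r$ (any other choice gives the same value by the invariance result), and it builds the estimated variance-covariance matrix of the marginal proportions from per-cell indicator products, which reproduces (3-36) through (3-38).

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(k for k in range(col, n) if aug[k][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for k in range(n):
            if k != col and aug[k][col] != 0:
                factor = aug[k][col]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[col])]
    return [row[n] for row in aug]


def i_squared(counts, d, r):
    """counts maps each cell (j_1, ..., j_d), 1 <= j_q <= r, to its frequency."""
    n = sum(counts.values())
    size = d * r

    def pos(q, b):
        # Position of M_qb in the stacked vector (3-17).
        return (b - 1) * d + (q - 1)

    m_hat = [Fraction(0)] * size
    joint = [[Fraction(0)] * size for _ in range(size)]
    for cell, freq in counts.items():
        idx = [pos(q, b) for q, b in enumerate(cell, start=1)]
        for i in idx:
            m_hat[i] += Fraction(freq, n)
            for j in idx:
                joint[i][j] += Fraction(freq, n)
    # Estimated covariance of the marginal proportions, (3-36) to (3-38).
    sigma = [[(joint[i][j] - m_hat[i] * m_hat[j]) / n for j in range(size)]
             for i in range(size)]
    # Rows of A in (3-19) kept for measures 1..d-1 and categories 1..r-1.
    a_star = []
    for b in range(1, r):
        for q in range(1, d):
            row = [Fraction(0)] * size
            for q2 in range(1, d + 1):
                row[pos(q2, b)] = Fraction(d - 1, d) if q2 == q else Fraction(-1, d)
            a_star.append(row)
    v_star = [sum(x * m for x, m in zip(row, m_hat)) for row in a_star]
    a_sigma = [[sum(row[k] * sigma[k][j] for k in range(size)) for j in range(size)]
               for row in a_star]
    s_star = [[sum(x * y for x, y in zip(left, right)) for right in a_star]
              for left in a_sigma]
    weights = solve(s_star, v_star)
    return sum(v * w for v, w in zip(v_star, weights))


# Hypothetical 2 x 2 table (d = 2, r = 2): cell (j_1, j_2) -> frequency.
table = {(1, 1): 10, (1, 2): 5, (2, 1): 15, (2, 2): 20}
statistic = i_squared(table, 2, 2)
```

For this hypothetical table the Stuart statistic reduces to $(5 - 15)^2 / (5 + 15) = 5$, and by (2-67) with $n = 50$ the algebraically equivalent value is $5 / (1 - 5/50) = 50/9$, which is what the sketch returns.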
As $n$ tends to infinity the sequence of alternatives $\{H_A^{(n)}\}$ tends to the null hypothesis. For large $n$, the power function of the $I^2$ test can be approximated by a noncentral chi-square with noncentrality parameter given by

$$\mathbf{V}^{*\prime}\, \Sigma_{\hat{\mathbf{V}}^*}^{-1}\, \mathbf{V}^* = \frac{\boldsymbol{\delta}'}{\sqrt{n}}\, \Sigma_{\hat{\mathbf{V}}^*}^{-1}\, \frac{\boldsymbol{\delta}}{\sqrt{n}} \tag{3-45}$$

where $\boldsymbol{\delta}$ is the vector of constants appearing in (3-44). Because $\Sigma_{\hat{\mathbf{V}}^*}^{-1}$ has an $n$ to the first power in it, the expression given in (3-45) does not depend upon $n$.

TECHNIQUES FOR ISOLATING SOURCES OF SIGNIFICANCE

If the $I^2$ statistic is so large as to lead to rejection of the null hypothesis of marginal homogeneity, the next step in the analysis is to identify the sources of significance and estimate the magnitude of the differences. If there are only two correlated distributions it suffices to identify those categories, or combinations of categories, with unequal marginal probabilities. When more than two correlated distributions are involved, the additional problem of locating which of the distributions differ is encountered.

Define a contrast to be a function of the $M_{qb}$ of the form

$$\psi = \sum_{b=1}^{r} \sum_{q=1}^{d} C_{qb} M_{qb} \tag{3-46}$$

where the $C_{qb}$ are known constants subject to the condition that

$$\sum_{q=1}^{d} C_{qb} = 0 \qquad \forall b . \tag{3-47}$$

Let

$$\underset{1 \times d}{\mathbf{C}_b'} = [C_{1b}, C_{2b}, \ldots, C_{db}] \quad \forall b = 1, \ldots, r \qquad \text{and let} \qquad \underset{1 \times dr}{\mathbf{C}'} = [\mathbf{C}_1', \mathbf{C}_2', \ldots, \mathbf{C}_r'] . \tag{3-48}$$

Define

$$\underset{1 \times dr}{\mathbf{M}'} = [\mathbf{M}_1', \mathbf{M}_2', \ldots, \mathbf{M}_r'] \tag{3-49}$$

where $\mathbf{M}_b' = [M_{1b}, M_{2b}, \ldots, M_{db}]$ $\forall b = 1, \ldots, r$.

Any contrast, $\psi$, can then be written in vector notation as

$$\psi = \mathbf{C}' \mathbf{M} . \tag{3-50}$$

For suitably chosen $\mathbf{C}$ vectors, contrasts which provide meaningful information to the experimenter can be formed. Interval estimates of these contrasts can be made which not only help isolate where the differences between the distributions lie, but also provide estimates of the magnitude of these differences. Some examples of contrasts are given in the following illustration from Yoshinaga's (1974) study.
Yoshinaga investigated the constancy of teachers' choices of a disciplinary strategy when increasing increments of information were given about a child involved in a disciplinary incident. The teachers were given a description of the incident and asked to choose a strategy from a group of three strategies to deal with the behavior problem. The three strategies were: (1) positive reinforcement, (2) social modeling, and (3) punishment. The teachers were then given some biographical information about the child and were again asked to choose a strategy. Finally the teachers were presented with further background information about the child and asked again to choose a strategy from the same group of three strategies.

The $\mathcal{L}^2$ procedure provides a test of whether the teachers as a group vary their strategy as increased amounts of information are provided about the child. A significant value of the $\mathcal{L}^2$ statistic provides evidence that the probability of a teacher choosing a given strategy may change as this teacher learns more about the child. In order to identify which of the probabilities change and under what circumstances, the researcher can examine a series of contrasts. The contrasts are typically chosen to reflect the logical comparisons which might be of interest in the given experiment. For this experiment the marginal proportions might be compared according to the contrasting sets of conditions specified below.

(a) strategy based upon no information about the child vs. strategy based upon one source of background information
(b) strategy based upon no information about the child vs. strategy based upon two sources of background information
(c) strategy based upon one source of background information vs. strategy based upon two sources of information
(d) strategy based upon no information about the child vs. strategy based upon some information.

A set of contrasts which reflects comparisons of these types and the corresponding $C$ vectors are listed below.

$\Psi_1 = M_{11} - M_{21}$    $C_1' = [1, -1, 0, 0, 0, 0, 0, 0, 0]$
$\Psi_2 = M_{12} - M_{22}$    $C_2' = [0, 0, 0, 1, -1, 0, 0, 0, 0]$
$\Psi_3 = M_{13} - M_{23}$    $C_3' = [0, 0, 0, 0, 0, 0, 1, -1, 0]$
$\Psi_4 = M_{21} - M_{31}$    $C_4' = [0, 1, -1, 0, 0, 0, 0, 0, 0]$
$\Psi_5 = M_{22} - M_{32}$    $C_5' = [0, 0, 0, 0, 1, -1, 0, 0, 0]$
$\Psi_6 = M_{23} - M_{33}$    $C_6' = [0, 0, 0, 0, 0, 0, 0, 1, -1]$
$\Psi_7 = M_{13} - M_{33}$    $C_7' = [0, 0, 0, 0, 0, 0, 1, 0, -1]$
$\Psi_8 = M_{12} - M_{32}$    $C_8' = [0, 0, 0, 1, 0, -1, 0, 0, 0]$
$\Psi_9 = M_{11} - M_{31}$    $C_9' = [1, 0, -1, 0, 0, 0, 0, 0, 0]$
$\Psi_{10} = M_{11} - \tfrac{1}{2}M_{21} - \tfrac{1}{2}M_{31}$    $C_{10}' = [1, -\tfrac{1}{2}, -\tfrac{1}{2}, 0, 0, 0, 0, 0, 0]$
$\Psi_{11} = M_{12} - \tfrac{1}{2}M_{22} - \tfrac{1}{2}M_{32}$    $C_{11}' = [0, 0, 0, 1, -\tfrac{1}{2}, -\tfrac{1}{2}, 0, 0, 0]$
$\Psi_{12} = M_{13} - \tfrac{1}{2}M_{23} - \tfrac{1}{2}M_{33}$    $C_{12}' = [0, 0, 0, 0, 0, 0, 1, -\tfrac{1}{2}, -\tfrac{1}{2}]$

Contrasts $\Psi_1$ through $\Psi_3$ are based upon comparison type (a), while contrasts $\Psi_4$ through $\Psi_6$ are based upon comparison type (c). Contrasts $\Psi_7$ through $\Psi_9$ reflect comparison type (b) while contrasts $\Psi_{10}$ through $\Psi_{12}$ reflect comparison type (d). Contrasts which contain only two marginal proportions, such as those given by $\Psi_1$ through $\Psi_9$, are called simple or pairwise contrasts. Contrasts containing more than two marginal proportions, as in $\Psi_{10}$ through $\Psi_{12}$, are called complex contrasts.

Hypotheses of the form

$_iH_0: \Psi_i = 0$ (3-51)

can be tested by determining whether the corresponding confidence interval for $\Psi_i$ includes the value of zero. Should $_iH_0$ be rejected, meaning that the confidence interval for $\Psi_i$ does not contain zero, the confidence limits provide an interval estimate of the magnitude of $\Psi_i$. A confidence interval procedure not only locates differences, it provides interval estimates of the magnitude of these differences. For example a confidence interval for $\Psi_1$ not only tests whether the proportion of teachers who choose the strategy of positive reinforcement changes with the first introduction of background information about the child, but also provides an interval estimate of the magnitude of the change, should one exist.
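The layout of the $C$ vectors can be checked mechanically. The following is a minimal sketch, not part of the original study: it assumes the ordering $M' = [M_{11}, M_{21}, M_{31}, M_{12}, \ldots, M_{33}]$ implied by (3-49), and the marginal proportions in `m_hyp` are hypothetical values chosen only so that each condition's proportions sum to one. The helper names are illustrative.

```python
from fractions import Fraction

D, R = 3, 3  # d conditions (information levels), r categories (strategies)

def index(q, b, d=D):
    """Position of M_qb in M' = [M_1', ..., M_r'] with M_b' = [M_1b, ..., M_db]."""
    return (b - 1) * d + (q - 1)

def contrast_vector(terms, d=D, r=R):
    """Build C (length d*r) from a {(q, b): coefficient} mapping."""
    c = [Fraction(0)] * (d * r)
    for (q, b), coef in terms.items():
        c[index(q, b, d)] = Fraction(coef)
    return c

def is_contrast(c, d=D, r=R):
    """Condition (3-47): coefficients sum to zero within every category b."""
    return all(sum(c[b * d:(b + 1) * d]) == 0 for b in range(r))

def psi(c, m):
    """Psi = C'M, equation (3-50)."""
    return sum(ci * mi for ci, mi in zip(c, m))

half = Fraction(1, 2)
c1 = contrast_vector({(1, 1): 1, (2, 1): -1})                      # Psi_1
c10 = contrast_vector({(1, 1): 1, (2, 1): -half, (3, 1): -half})   # Psi_10

# Hypothetical marginal proportions (NOT Yoshinaga's data), ordered as M'.
m_hyp = [Fraction(x, 10) for x in (5, 3, 2, 3, 4, 4, 2, 3, 4)]

print(c1, is_contrast(c1), psi(c1, m_hyp))
print(c10, is_contrast(c10), psi(c10, m_hyp))
```

Exact fractions are used so that the zero-sum condition in (3-47) can be tested without floating-point tolerance.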
A confidence interval for $\Psi_{10}$ tests whether some versus no background information about the child has any effect on the probability that a teacher chooses a positive reinforcement strategy, and provides an interval estimate of the effect, should one exist.

When several hypotheses are tested at one time, the problem of determining the significance level for the experiment as a whole becomes very complicated. If the probability of committing a type I error is set equal to $\alpha$ for each individual confidence interval, the probability of rejecting at least one true null hypothesis, $_iH_0: \Psi_i = 0$, when several confidence intervals are generated becomes considerably larger than $\alpha$. If enough intervals are examined, it is very likely that one or more hypotheses are rejected even though they are all true. The experimentwise error rate refers to the probability that one or more erroneous statements will be made in an experiment, where for the purposes of this discussion an erroneous statement is the rejection of a true null hypothesis. For most data analytic situations the experimentwise error rate needs to be controlled since the researcher typically wants to guard against making any erroneous statements.

In the discussion which follows two methods are considered for forming confidence intervals about the $\Psi_i$. Each technique controls the magnitude of the experimentwise error rate. The theoretical basis for each of the two techniques is now presented.

Scheffé-type Solution

The technique presented in this subsection is a simultaneous confidence interval method based upon the works of Scheffé (1959) and Goodman (1964) and adapted to the design and statistic under consideration in this chapter. Let the set of all possible contrasts of the form $\Psi$ be denoted by $\mathcal{F}$. The unrestricted maximum likelihood estimator of $\Psi$ for any $\Psi \in \mathcal{F}$ is denoted as $\hat{\Psi}$ where

$\hat{\Psi} = C'\hat{M}$. (3-52)

$\hat{\Psi}$ is also an unbiased estimator of $\Psi$ for each $\Psi \in \mathcal{F}$.
Let $S^2(\hat{\Psi})$ denote the consistent estimator of the variance of $\hat{\Psi}$ defined in (3-53):

$S^2(\hat{\Psi}) = C'\,\hat{\Sigma}_{\hat{M}}\,C$. (3-53)

The theorem which follows provides the theoretical basis for the simultaneous procedure described in this subsection.

Theorem 3.2. As n tends to infinity, the probability will approach $1 - \alpha$ that simultaneously for all $\Psi \in \mathcal{F}$

$\hat{\Psi} - S(\hat{\Psi})L \le \Psi \le \hat{\Psi} + S(\hat{\Psi})L$. (3-54)

Here n denotes the sample size, $L$ is the positive square root of the $100(1 - \alpha)$th percentile of the central chi-square distribution with $(d-1)(r-1)$ degrees of freedom, and $S(\hat{\Psi})$ is the positive square root of $S^2(\hat{\Psi})$.

The result of Theorem 3.2 allows the experimenter to examine the confidence intervals for as many contrasts, $\Psi \in \mathcal{F}$, as desired, holding the probability of making one or more erroneous statements at $\alpha$, regardless of the number of confidence intervals examined. Rather than assigning an error rate per contrast, an error rate for the entire family of contrasts, $\mathcal{F}$, is assigned. The technique is quite advantageous to use when many different contrasts are of interest, or when one is merely searching the data to locate some of the sources of significance in the event of rejection of the over-all null hypothesis of marginal homogeneity. The use of such a technique permits the experimenter to control the experimentwise error rate and thereby guard against making one or more erroneous statements, irrespective of the number of statements made.

Proof of Theorem 3.2. Define

$\Delta_{qb} = M_{qb} - M_{.b}$, $q = 2, \ldots, d$, $b = 2, \ldots, r$. (3-55)

The $(d-1)(r-1)$ functions $\Delta_{qb}$ $(q = 2, \ldots, d)$, $(b = 2, \ldots, r)$, are contrasts. The $\{\Delta_{qb}\}$ form a set of $(d-1)(r-1)$ linearly independent estimable functions of the components of the $M$ vector and will span the space of all contrasts $\Psi \in \mathcal{F}$. Define

$\Delta' = [\Delta_2', \ldots, \Delta_r']$ ($1 \times (d-1)(r-1)$)

where (3-56)

$\Delta_b' = [\Delta_{2b}, \ldots, \Delta_{db}]$ ($1 \times (d-1)$).

Any contrast $\Psi \in \mathcal{F}$ can then be written in the form

$\Psi = h'\Delta$ (3-57)

where $h$ is a vector of suitably chosen constants. Let $\hat{D}' =$
$[\hat{D}_2', \ldots, \hat{D}_r']$ ($1 \times (d-1)(r-1)$) (3-58)

where

$\hat{D}_b' = [\hat{M}_{2b} - \hat{M}_{.b}, \ldots, \hat{M}_{db} - \hat{M}_{.b}]$ (3-59)

and $\hat{\Psi} = h'\hat{D}$.

It was shown earlier in the chapter that for large n, $\hat{D}$ is approximately multivariate normal with an expected value of $\Delta$ and with a nonsingular variance-covariance matrix $\Sigma_D$. The $\hat{D}$ vector is a special case of the more general $\hat{D}^*$ vector defined earlier in the chapter. If $S$ denotes a consistent estimator of $\Sigma_D$, where $S$ is $A\hat{\Sigma}_{\hat{M}}A'$ with $\hat{D} = A\hat{M}$, then for large n, $S$ is nonsingular and as n tends to infinity, the probability will approach $1 - \alpha$ that

$(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta) \le L^2$ (3-60)

where $L^2$ is the $100(1 - \alpha)$th percentile of the chi-square distribution with $(d-1)(r-1)$ degrees of freedom. The statement in (3-60) is the result of Proposition 2 of this chapter.

The Cauchy-Schwarz inequality states that if $Y$ and $N$ are any two vectors in the space $R^k$ then

$|Y'N| \le \sqrt{Y'Y}\,\sqrt{N'N}$. (3-61)

The inequality given in (3-61) is used to prove a lemma which aids in the proof of Theorem 3.2.

Lemma 3.1. If $J$ is a $k \times k$ symmetric matrix such that $J = R'R$ where $R$ is a $k \times k$ matrix of full rank, then for all $X, Z \in R^k$

$|X'Z| \le \sqrt{X'JX}\,\sqrt{Z'J^{-1}Z}$. (3-62)

Proof of Lemma 3.1. If $Y$ is taken to be $(R^{-1})'Z$ and $N$ is taken to be $RX$ then it follows that

$Y'N = Z'R^{-1}RX = Z'X$,
$Y'Y = Z'R^{-1}(R^{-1})'Z = Z'J^{-1}Z$,
$N'N = X'R'RX = X'JX$.

If the preceding equalities are used together with the Cauchy-Schwarz inequality stated in (3-61) the desired result

$|X'Z| \le \sqrt{X'JX}\,\sqrt{Z'J^{-1}Z}$

is obtained and Lemma 3.1 is proven.

A second lemma needed in the proof of Theorem 3.2 is now given.

Lemma 3.2. $(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta) \le L^2$ if and only if $|h'(\hat{D} - \Delta)| \le L\sqrt{h'Sh}$ for all $h$ in $(d-1)(r-1)$ Euclidean space.

Proof of Lemma 3.2. The variance-covariance matrix $S$ is of full rank for n large, thus it possesses the properties of matrix $J$ in Lemma 3.1. If $h$ is substituted for $X$, $(\hat{D} - \Delta)$ for $Z$, and $S$ for $J$ in Lemma 3.1 with $k = (d-1)(r-1)$ the result given in (3-63) follows.

$|h'(\hat{D} - \Delta)| \le \sqrt{h'Sh}\,\sqrt{(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta)}$ (3-63)

$\forall h$ in
$(d-1)(r-1)$ Euclidean space.

Now if $(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta) \le L^2$ then from (3-63) it follows that

$|h'(\hat{D} - \Delta)| \le L\sqrt{h'Sh}$ $\forall h$ in $(d-1)(r-1)$ Euclidean space.

This completes the proof of Lemma 3.2 going one way.

Going in the other direction it is assumed that

$|h'(\hat{D} - \Delta)| \le L\sqrt{h'Sh}$ $\forall h$ in $(d-1)(r-1)$ Euclidean space.

Choosing an $h$ equal to $(S^{-1})'(\hat{D} - \Delta)$ will result in the inequality

$|(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta)| \le L\sqrt{(\hat{D} - \Delta)'S^{-1}S(S^{-1})'(\hat{D} - \Delta)}$.

Because both sides of this inequality are nonnegative it is valid to square both sides to obtain an equivalent statement. Squaring both sides of the inequality and using the fact that symmetry of $S$ implies that $S^{-1}$ is also symmetric when the inverse exists, the following result is obtained:

$[(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta)]^2 \le L^2(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta)$. (3-64)

The expression $(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta)$ is nonnegative since it is a quadratic form in the positive definite matrix $S^{-1}$. If the quadratic form happens to be zero, the result that $(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta) \le L^2$ is obvious since $L^2$ is positive. Otherwise $(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta) > 0$ and dividing both sides of (3-64) by this expression yields

$(\hat{D} - \Delta)'S^{-1}(\hat{D} - \Delta) \le L^2$. (3-65)

This completes the proof of Lemma 3.2.

As a consequence of the results in (3-60) and Lemma 3.2 the probability that

$h'\hat{D} - \sqrt{h'Sh}\,L \le h'\Delta \le h'\hat{D} + \sqrt{h'Sh}\,L$ (3-66)

simultaneously for all $h$ in $(d-1)(r-1)$ Euclidean space tends to $1 - \alpha$ as n tends to infinity. But by (3-57), each contrast $\Psi \in \mathcal{F}$ is of the form $h'\Delta$ and conversely every $h'\Delta$ is a contrast $\Psi$; $h'\hat{D}$ is then $\hat{\Psi}$ while $h'Sh$ is the sample variance of $\hat{\Psi}$ which is denoted as $S^2(\hat{\Psi})$ as defined in (3-53). The result given in (3-66) together with these substitutions completes the proof of Theorem 3.2.

The simultaneous confidence interval procedure is typically used as a follow up procedure when the value of the $\mathcal{L}^2$ statistic is large enough to permit rejection of $H_0$, the over-all null hypothesis of marginal homogeneity.
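The equivalence in Lemma 3.2 rests on the fact that the ratio $|h'z| / \sqrt{h'Sh}$, with $z = \hat{D} - \Delta$, never exceeds $\sqrt{z'S^{-1}z}$ and attains it at $h = S^{-1}z$. The sketch below illustrates this numerically for a two-dimensional case. The matrix `S` and vector `z` are hypothetical values chosen for illustration, not quantities from the data example.

```python
import math
import random

S = [[2.0, 0.6], [0.6, 1.0]]  # hypothetical symmetric positive definite matrix
z = [0.8, -0.5]               # hypothetical value of (D_hat - Delta)

def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def matvec(a, v):
    return [a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def ratio(h):
    """|h'z| / sqrt(h'Sh), the quantity bounded in (3-63)."""
    return abs(dot(h, z)) / math.sqrt(dot(h, matvec(S, h)))

# Upper bound sqrt(z' S^{-1} z) from (3-63).
bound = math.sqrt(dot(z, matvec(inv2(S), z)))

# The bound is attained at h = S^{-1} z, the choice used in the proof.
h_star = matvec(inv2(S), z)

# No randomly drawn direction should exceed the bound.
rng = random.Random(0)
worst = max(ratio([rng.uniform(-1, 1), rng.uniform(-1, 1)])
            for _ in range(10000))

print(bound, ratio(h_star), worst)
```

Because the supremum over $h$ equals $\sqrt{z'S^{-1}z}$, requiring the ratio to stay at or below $L$ for every $h$ is the same as requiring $z'S^{-1}z \le L^2$.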
The experimenter is free to generate as many confidence intervals as desired, provided the intervals have endpoints as specified by (3-54). Hypotheses of the form $_iH_0: \Psi_i = 0$ can be tested by seeing if the corresponding confidence interval

$(\hat{\Psi}_i - S(\hat{\Psi}_i)L,\ \hat{\Psi}_i + S(\hat{\Psi}_i)L)$

spans zero. If the interval does not span zero, a source of significance is located and an interval estimate of its magnitude is given by the bounds of the interval. In this manner sources of significance can be located and their magnitudes estimated, while at the same time the experimenter is protected against making one or more erroneous statements because a simultaneous error rate of $\alpha$ is set for the entire family of contrasts $\mathcal{F}$.

The simultaneous confidence interval procedure just described has the added advantage of correspondence to the test criterion associated with the $\mathcal{L}^2$ procedure. The $\mathcal{L}^2$ procedure tests the over-all null hypothesis of marginal homogeneity,

$H_0: \Delta = 0$, (3-67)

and the $\mathcal{L}^2$ statistic is algebraically equal to the quantity $\hat{D}'S^{-1}\hat{D}$. For a test performed at the $\alpha$ level, the test criterion is to reject $H_0$ if and only if

$\hat{D}'S^{-1}\hat{D} > L^2$. (3-68)

But from Lemma 3.2 the condition in (3-68) occurs if and only if there exists at least one $h$ in $(d-1)(r-1)$ Euclidean space for which

$|h'\hat{D}| > \sqrt{h'Sh}\,L$.

The over-all null hypothesis, $H_0$, is thus rejected using the $\mathcal{L}^2$ test criterion if and only if there is at least one contrast $\Psi \in \mathcal{F}$ for which zero is not included in the interval

$(\hat{\Psi} - S(\hat{\Psi})L,\ \hat{\Psi} + S(\hat{\Psi})L)$.

Rejection of the over-all null hypothesis of marginal homogeneity using the $\mathcal{L}^2$ test criterion guarantees the existence of at least one contrast $\Psi \in \mathcal{F}$ which is statistically significant when using the simultaneous confidence procedure.

The ability of the experimenter to examine as many contrasts $\Psi \in \mathcal{F}$ as he desires for a fixed experimentwise error rate is not without drawbacks.
Because the Scheffé-type procedure focuses on the entire family $\mathcal{F}$, it may not be as powerful a procedure as certain other techniques. The confidence intervals generated using the Scheffé-type solution may be wider than the corresponding intervals using other procedures and less sensitive in detecting differences. An alternative procedure to the Scheffé-type technique follows.

Bonferroni-type Solution

If only a specified subset, $\mathcal{S}$, of the contrasts in $\mathcal{F}$ are of interest to the experimenter and if the number of contrasts in $\mathcal{S}$ is sufficiently small, then the technique presented in this subsection may be preferable to the Scheffé-type solution. The Bonferroni-type solution places an error rate on each individual contrast which is examined but the error rate per contrast is determined so as to control the over-all experimentwise error rate.

Let $\mathcal{S}$ be a subset of $\mathcal{F}$, with $\mathcal{S}$ containing the finite number of contrasts which are of interest to the experimenter, denoted as $\Psi_1, \Psi_2, \ldots, \Psi_k$. Let $E_i$ be the event that the null hypothesis

$_iH_0: \Psi_i = 0$ (3-51)

is falsely rejected. Let $\alpha_i = P(E_i)$. Let $E^*$ be the event that one or more null hypotheses of the form given in (3-51) are falsely rejected. Then

$P(E^*) = P(E_1 \cup E_2 \cup \ldots \cup E_k) \le \sum_{i=1}^{k} P(E_i) = \sum_{i=1}^{k} \alpha_i$. (3-69)

The inequality in (3-69) is called the Bonferroni inequality. The quantity $P(E^*)$ is the experimentwise error rate, having as an upper bound the quantity $\sum_{i=1}^{k} \alpha_i$. If k is not too large (for example 5) and the $\alpha_i$ are relatively small (for example .01) the approximation of this upper bound to the quantity $P(E^*)$ is quite good. Typically each $\alpha_i$ is chosen to be $\alpha/k$ where $\alpha$ is a specified value which serves as an upper bound for the experimentwise error rate. By dividing the entire $\alpha$ value among the different contrasts of interest, the experimenter is able to control the probability of making one or more erroneous statements.
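A small numerical sketch of the bound in (3-69) follows. The function names are illustrative, and the comparison value is computed under the assumption that the k rejection events are independent. That assumption is made here only to give a concrete reference point; the contrasts in this chapter are correlated, and the Bonferroni bound itself requires no such assumption.

```python
def bonferroni_levels(alpha, k):
    """Per-contrast levels alpha_i = alpha / k."""
    return [alpha / k] * k

def bonferroni_bound(alphas):
    """Upper bound on the experimentwise error rate, equation (3-69)."""
    return sum(alphas)

def rate_if_independent(alphas):
    """P(at least one false rejection) if the events E_i were independent.

    Independence is an illustrative assumption, not a property of the
    correlated contrasts treated in the text.
    """
    prob_none = 1.0
    for a in alphas:
        prob_none *= 1.0 - a
    return 1.0 - prob_none

# The example values mentioned in the text: k = 5, alpha_i = .01.
alphas = [0.01] * 5
print(bonferroni_bound(alphas))     # upper bound, 0.05 up to rounding
print(rate_if_independent(alphas))  # close to, and below, the bound

# Splitting alpha = .05 over k = 10 contrasts gives .005 per contrast.
print(bonferroni_levels(0.05, 10)[0])
```

For small $\alpha_i$ and moderate k the two printed rates differ only in the third decimal place, which is the sense in which the upper bound is a good approximation.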
Earlier it was shown that for large n, the vector of cell frequencies is approximately multivariate normal. Because any contrast, $\hat{\Psi}_i$, is a linear combination of the components of the cell frequency vector, when n is large $\hat{\Psi}_i$ will be approximately univariate normal. Because $S(\hat{\Psi}_i) \to \sigma(\hat{\Psi}_i)$ as $n \to \infty$, for n large the distribution of

$\dfrac{\hat{\Psi}_i - \Psi_i}{S(\hat{\Psi}_i)}$ (3-70)

is approximately standard normal. An equivalent statement of (3-70) is that as $n \to \infty$ the probability will approach $1 - \alpha_i$ that

$\hat{\Psi}_i - S(\hat{\Psi}_i)Z_{\alpha_i/2} \le \Psi_i \le \hat{\Psi}_i + S(\hat{\Psi}_i)Z_{\alpha_i/2}$ (3-71)

where $Z_{\alpha_i/2}$ is the $100(1 - \alpha_i/2)$th percentile of the standard normal distribution. To test hypotheses of the form $_iH_0: \Psi_i = 0$, determine whether the corresponding confidence interval for $\Psi_i$,

$(\hat{\Psi}_i - S(\hat{\Psi}_i)Z_{\alpha_i/2},\ \hat{\Psi}_i + S(\hat{\Psi}_i)Z_{\alpha_i/2})$,

spans zero.

If all $Z_{\alpha_i/2}$ are set equal to $Z$ where $Z$ is the $100(1 - \alpha/2k)$th percentile of the standard normal distribution, then the confidence intervals using the Bonferroni technique (3-71) will be narrower than the corresponding intervals using the Scheffé-like procedure, (3-54), if and only if

$Z < L$

where $L$ is the positive square root of the $100(1 - \alpha)$th percentile of the chi-square distribution with $(d-1)(r-1)$ degrees of freedom. Typically $Z \le L$ will hold whenever the number of contrasts examined, k, is such that

$k \le 8d(d-1)r$ (3-72)

for the usual values of $\alpha$ (.05 or .01), Goodman (1964).

The Bonferroni technique can be used as a substitute for the omnibus test of homogeneity or as a follow up procedure to locate and estimate sources of significance should the omnibus test be significant. In either case, the choice of the contrasts to comprise set $\mathcal{S}$ must be made before the data are examined, otherwise a serious distortion of the error rate may occur. The experimenter must not use the data in determining which contrasts to include in set $\mathcal{S}$. This is a major point of difference between the Bonferroni and Scheffé-type procedures.
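The comparison of $Z$ with $L$ can be carried out directly for any d, r, k, and $\alpha$. The sketch below uses only the Python standard library: the normal percentile comes from `statistics.NormalDist`, while the chi-square percentile is obtained by bisection on a series expansion of the chi-square distribution function. The helper names (`chi2_cdf`, `chi2_ppf`, `scheffe_L`, `bonferroni_Z`) are illustrative and do not come from the original text or its Fortran program.

```python
import math
from statistics import NormalDist

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= y / (a + n)
        total += term
        n += 1
    return total * math.exp(-y + a * math.log(y) - math.lgamma(a))

def chi2_ppf(p, df):
    """Chi-square percentile found by bisection on chi2_cdf."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def scheffe_L(alpha, d, r):
    """L: positive square root of the 100(1 - alpha)th chi-square percentile."""
    return math.sqrt(chi2_ppf(1.0 - alpha, (d - 1) * (r - 1)))

def bonferroni_Z(alpha, k):
    """Z: the 100(1 - alpha/2k)th percentile of the standard normal."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * k))

# Setting of the later data example: d = 3 samples, r = 4 categories.
L = scheffe_L(0.05, 3, 4)
for k in (1, 5, 10, 20):
    Z = bonferroni_Z(0.05, k)
    print(k, round(Z, 4), round(L, 4), "Bonferroni narrower" if Z < L else "Scheffe narrower")
```

Since both interval types share the same center and standard error, the ratio $Z/L$ is also the ratio of their widths.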
The Scheffé-type procedure allows the experimenter to examine any contrasts in $\mathcal{F}$ for a constant experimentwise error rate, whether the choice of contrasts is dictated by prior hypotheses or inspection of the data.

Although both the Scheffé and Bonferroni type techniques are large sample procedures based upon asymptotic results, the robustness to nonnormality of the $\hat{\Psi}_i$ is probably greater for the Scheffé-type than for the Bonferroni-type technique. The accuracy of the Scheffé-type procedure is based upon the degree to which the quadratic form $\hat{D}'S^{-1}\hat{D}$ approaches its limiting chi-square distribution under the null hypothesis. Such a study is conducted in Chapter IV for tests performed at the nominal .01, .05, and .10 significance levels. A good fit to the limiting distribution at these cutoff values will guarantee a valid Scheffé-type confidence procedure at these levels. The validity of the Bonferroni procedure is dependent upon the degree to which the quantity

$\dfrac{\hat{\Psi}_i - \Psi_i}{S(\hat{\Psi}_i)}$ (3-70)

approaches its standard normal limiting distribution for each contrast $\Psi_i$ examined. The fit of the statistic in (3-70) to its limiting standard normal distribution may not be good in the extreme tails of the normal distribution. Because a separate significance level is assigned to each test of $_iH_0: \Psi_i = 0$, the significance level per test, $\alpha/k$, is typically quite small. For example, if $\alpha = .05$ and $k = 10$, it would be necessary to use the .25th and the 99.75th percentiles of the standard normal distribution as the cutoff values. The fit of the statistic in (3-70) to its limiting distribution in such extremities of the distribution as the .25th and 99.75th percentiles is questionable for small or even moderate sample sizes.
A Monte Carlo investigation to examine such behavior would be extremely costly because of the large number of samples which would have to be generated to produce stable estimates of the actual significance level for tests performed at such small nominal $\alpha$ values. The actual significance level for each test of $\Psi_i = 0$, or equivalently the actual confidence level given to each interval in (3-71), may be quite different from its stated nominal level when n, the sample size, is small or moderate. When similar procedures are employed for the k contrasts, the resulting experimentwise error rate may be quite different from the theoretical rate of $\alpha$, the value set by the experimenter. Asymptotically the relationship between the Bonferroni and Scheffé-type techniques is made explicit in (3-72), but for small or moderate sample sizes the relationship in (3-72) may not be a valid criterion to apply because of the lack of information about the actual significance level of tests using the extreme tails of the distribution, as would be the case in the Bonferroni-type procedure when many hypotheses are tested.

A data example which illustrates the $\mathcal{L}^2$ procedure and the two post hoc techniques presented in this section is now given.

DATA EXAMPLE

The following data example is based upon Laumann's (1973) social interaction data. In Laumann's study of social mobility and social interaction, a husband and the fathers of both the husband and wife were each classified into one of a number of social classes. This design can be considered as consisting of three matched samples. By testing for homogeneity of the three matched samples on the variable of classification, the researcher can make inferences concerning the social interaction of various social classes. Because the three samples are probably not independent, the $\mathcal{L}^2$ procedure is used to test for homogeneity of the marginal distributions.
Laumann's original categories of classification are collapsed into just four categories to make the analysis feasible since $r^d$, the number of cells in the contingency table, would be too large had Laumann's original categories been used. The four categories of classification are given as:

(1) professional, technical, kindred - white collar
    managers, officials
(2) clerical - white collar
    sales - white collar
(3) craftsman - blue collar
(4) operatives - blue collar
    service - blue collar
    laborers (except farm) - blue collar

Let husbands be sample 1, husbands' fathers sample 2, and wives' fathers sample 3. The data are represented in the 4 x 4 x 4 contingency table given in Table 3-1.

TABLE 3-1
Laumann's Social Interaction Data

Sample 1 = (1)                          Sample 1 = (2)
               Sample 2                                Sample 2
Sample 3    (1)  (2)  (3)  (4)          Sample 3    (1)  (2)  (3)  (4)
  (1)        44   17    4   12            (1)        11    2    4    8
  (2)        10    3    6    2            (2)         1    2    2    3
  (3)        29    7   22   22            (3)         6    7    4    9
  (4)        13    8   21   32            (4)         5    1   11    8

Sample 1 = (3)                          Sample 1 = (4)
               Sample 2                                Sample 2
Sample 3    (1)  (2)  (3)  (4)          Sample 3    (1)  (2)  (3)  (4)
  (1)         8    2   19   11            (1)         9    2    9   10
  (2)         0    0    5    5            (2)         0    0    2    6
  (3)        11    2   26   35            (3)        11    0   22   32
  (4)         4    1   21   37            (4)        12    4   28   39

Marginal Totals
             (1)   (2)   (3)   (4)
Sample 1     252    84   187   186
Sample 2     174    58   206   271
Sample 3     172    47   245   245

n = 709

$H_0: M_{1b} = M_{2b} = M_{3b}$ for $b = 1, 2, 3, 4$

$\hat{D}^{*\prime} = [\hat{M}_{11} - \hat{M}_{.1},\ \hat{M}_{21} - \hat{M}_{.1},\ \hat{M}_{12} - \hat{M}_{.2},\ \hat{M}_{22} - \hat{M}_{.2},\ \hat{M}_{13} - \hat{M}_{.3},\ \hat{M}_{23} - \hat{M}_{.3}]$
$\qquad = [.074283,\ -.035731,\ .029619,\ -.007052,\ -.036201,\ -.009403]$

The $\hat{D}^*$ vector is formed by choosing $m_1 = m_2 = m_3$ and $h = 4$, where the m's and h are defined in (3-15). This choice represents one of many possible $\hat{D}^*$ vectors which could be used
in the computation of the $\mathcal{L}^2$ statistic.

$\hat{\Sigma}_{D^*}^{-1} =$
[ 11464.6   6198.0   6501.1   2647.3   5468.9   2420.6 ]
[  6198.0  13478.0   2991.9   6467.3   3445.4   4713.3 ]
[  6501.1   2991.9  22447.7  12705.9   5326.5   2411.8 ]
[  2647.3   6467.3  12705.9  25063.1   2961.5   4452.6 ]
[  5468.9   3445.4   5326.5   2961.5  10192.6   4445.6 ]
[  2420.6   4713.3   2411.8   4452.6   4445.6   8584.4 ]

$\mathcal{L}^2 = \hat{D}^{*\prime}\,\hat{\Sigma}_{D^*}^{-1}\,\hat{D}^* = 71.39$

The null hypothesis is rejected for tests performed at the nominal level of $\alpha = .05$ since the value of $\mathcal{L}^2$ is larger than the 95th percentile of $\chi^2$ with $(d-1)(r-1) = 6$ degrees of freedom. Confidence intervals using both the Scheffé-like and Bonferroni techniques are computed for the following contrasts:

$\Psi_1 = M_{11} - M_{21}$
$\Psi_2 = M_{11} - M_{31}$
$\Psi_3 = M_{21} - M_{31}$
$\Psi_4 = 2M_{11} - M_{21} - M_{31}$
$\Psi_5 = M_{14} - M_{24}$
$\Psi_6 = M_{14} - M_{34}$
$\Psi_7 = M_{24} - M_{34}$
$\Psi_8 = 2M_{14} - M_{24} - M_{34}$
$\Psi_9 = M_{22} - M_{32}$
$\Psi_{10} = M_{23} - M_{33}$

TABLE 3-2
Scheffé-like Confidence Intervals at $\alpha$ = .05 and Bonferroni Confidence Intervals with $\alpha$ = .005 per Contrast

                                      Scheffé-like                 Bonferroni
Contrast   Estimate   S(Estimate)    (Est - S L, Est + S L)       (Est - S Z, Est + S Z)
Psi_1        .1100      .0212        ( .0349,  .1852)*            ( .0504,  .1696)*
Psi_2        .1128      .0228        ( .0320,  .1937)*            ( .0487,  .1769)*
Psi_3        .0028      .0200        (-.0638,  .0740)             (-.0534,  .0590)
Psi_4        .2228      .0270        ( .1270,  .3186)*            ( .1531,  .2925)*
Psi_5       -.1199      .0233        (-.2026, -.0372)*            (-.1854, -.0544)*
Psi_6       -.0832      .0227        (-.1639, -.0025)*            (-.1470, -.0194)*
Psi_7        .0367      .0237        (-.0475,  .1209)             (-.0299,  .1033)
Psi_8       -.2031      .0395        (-.3431, -.0631)*            (-.3141, -.0921)*
Psi_9        .0155      .0137        (-.0332,  .0642)             (-.0230,  .0540)
Psi_10      -.0550      .0245        (-.1418,  .0318)             (-.1238,  .0138)

* Significant contrast

Contrasts $\Psi_3$, $\Psi_7$, $\Psi_9$, and $\Psi_{10}$ each compare the respective family backgrounds of husband and wife. Neither the Scheffé nor the Bonferroni procedure yields statistically significant confidence intervals for these contrasts. This suggests that the variable of social class has the same distribution for both father of husband and father of wife. In a global sense these results indicate that the social class backgrounds are the same for husband and wife.
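Most of the numbers in this data example can be re-derived from the marginal totals of Table 3-1. The sketch below is illustrative only and is not the author's Fortran IV program. It recomputes the contrast estimates and the $\hat{D}^*$ vector from the marginal totals, copies the standard errors from Table 3-2 and the inverse matrix from the text rather than recomputing them, and takes 12.592 as the tabled 95th percentile of chi-square with 6 degrees of freedom. Because the copied inputs are rounded, the quadratic form agrees with the reported 71.39 only approximately.

```python
import math
from statistics import NormalDist

N = 709
TOTALS = {1: (252, 84, 187, 186), 2: (174, 58, 206, 271), 3: (172, 47, 245, 245)}
# M[q, b]: estimated marginal proportion for sample q, category b.
M = {(q, b): TOTALS[q][b - 1] / N for q in TOTALS for b in (1, 2, 3, 4)}

# Estimates of Psi_1 ... Psi_10 as defined in the text.
PSI_HAT = [
    M[1, 1] - M[2, 1],
    M[1, 1] - M[3, 1],
    M[2, 1] - M[3, 1],
    2 * M[1, 1] - M[2, 1] - M[3, 1],
    M[1, 4] - M[2, 4],
    M[1, 4] - M[3, 4],
    M[2, 4] - M[3, 4],
    2 * M[1, 4] - M[2, 4] - M[3, 4],
    M[2, 2] - M[3, 2],
    M[2, 3] - M[3, 3],
]
# Standard errors copied from Table 3-2 (not recomputed here).
SE = [.0212, .0228, .0200, .0270, .0233, .0227, .0237, .0395, .0137, .0245]

L = math.sqrt(12.592)                    # tabled chi-square value, 6 df (assumed)
Z = NormalDist().inv_cdf(1 - 0.005 / 2)  # alpha/k = .005 per contrast

def interval(est, se, mult):
    return (est - se * mult, est + se * mult)

scheffe = [interval(p, s, L) for p, s in zip(PSI_HAT, SE)]
bonferroni = [interval(p, s, Z) for p, s in zip(PSI_HAT, SE)]
significant = [i + 1 for i, (lo, hi) in enumerate(scheffe) if lo > 0 or hi < 0]

# D* vector: samples 1 and 2, categories 1 to 3, centered at the unweighted
# mean over the three samples (equal weights m_1 = m_2 = m_3).
D_STAR = []
for b in (1, 2, 3):
    mean_b = (M[1, b] + M[2, b] + M[3, b]) / 3
    D_STAR += [M[1, b] - mean_b, M[2, b] - mean_b]

# Inverse variance-covariance matrix as printed in the text.
SIGMA_INV = [
    [11464.6, 6198.0, 6501.1, 2647.3, 5468.9, 2420.6],
    [6198.0, 13478.0, 2991.9, 6467.3, 3445.4, 4713.3],
    [6501.1, 2991.9, 22447.7, 12705.9, 5326.5, 2411.8],
    [2647.3, 6467.3, 12705.9, 25063.1, 2961.5, 4452.6],
    [5468.9, 3445.4, 5326.5, 2961.5, 10192.6, 4445.6],
    [2420.6, 4713.3, 2411.8, 4452.6, 4445.6, 8584.4],
]
L2 = sum(D_STAR[i] * SIGMA_INV[i][j] * D_STAR[j]
         for i in range(6) for j in range(6))

print([round(p, 4) for p in PSI_HAT])
print(significant)
print(round(L2, 2))
```

The list `significant` holds the indices of the contrasts whose Scheffé-like interval excludes zero, which can be compared with the starred rows of Table 3-2.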
Contrasts $\Psi_1$ and $\Psi_5$ each examine generational differences in social class. Both contrasts are significant, and the nature of the intervals suggests upward class mobility of the son versus his father. Contrasts $\Psi_2$, $\Psi_4$, $\Psi_6$, and $\Psi_8$ each make generational comparisons as well as comparisons between the husband and the wife's father. Each of these contrasts is significant using both the Bonferroni and Scheffé-like procedures. The confidence intervals again reflect an upward class mobility of the husband relative to his father and/or father-in-law.

For this data example the Bonferroni solution yields shorter confidence intervals and thus greater power than the Scheffé-like procedure. The power of the Scheffé-like procedure relative to the Bonferroni solution would increase as more contrasts are examined and eventually surpass the power of the Bonferroni solution. For this data example the small number of contrasts examined relative to the total number of all possible contrasts is not enough to realize the advantage of a simultaneous procedure such as the Scheffé-like procedure.

The extent to which the asymptotic results of this chapter hold in the case of finite sample sizes is investigated in the next chapter. A simulation study of the behavior of the $\mathcal{L}^2$ statistic for finite sample sizes is reported in Chapter IV. The degree to which the $\mathcal{L}^2$ statistic approaches its limiting distribution both under the null and various forms of the alternative hypothesis is studied in the next chapter.

CHAPTER IV

A MONTE CARLO STUDY OF THE $\mathcal{L}^2$ STATISTIC

In Chapter III it was shown that the $\mathcal{L}^2$ statistic has a limiting central chi-square distribution under the null hypothesis of marginal homogeneity. The nature of the chi-square approximation to the exact distribution of $\mathcal{L}^2$ for finite sample sizes has not as yet been investigated.
The jump from asymptotic distribution theory to the practical use of an asymptotic result cannot be made unless the asymptotic result is examined in a context of practical circumstances. Before the use of $\mathcal{L}^2$ could become a legitimate statistical procedure, the author felt that an investigation of the distributional behavior of $\mathcal{L}^2$ for finite sample sizes was warranted.

The extent of the disparity between the nominal significance level at which a statistical test is performed and the exact level of significance is a major factor in determining the legitimacy of the procedure. In this investigation exact levels of significance associated with the nominal one, five, and ten percent levels using a central chi-square distribution for the $\mathcal{L}^2$ procedure were estimated by Monte Carlo sampling. Samples of 2000 sets of n observations of a given discrete distribution specifying a null hypothesis were generated for each of a number of parameter sets considered and for each of several different values of n.

The ability of a statistical procedure to detect departures from the null hypothesis is a major concern in any hypothesis testing problem. Knowledge of the power of a statistical procedure can help in designing studies which use the procedure. Exact values of power for the $\mathcal{L}^2$ procedure were estimated by Monte Carlo sampling for tests performed at the nominal one, five, and ten percent significance levels. The author was interested in studying the extent to which the empirically determined values of power approximated values which were derived using a noncentral chi-square distribution with the appropriate noncentrality parameter. Interest centered on determining the correspondence of nominal and estimated actual power values for varying sample sizes and different sets of cell probabilities, each set specifying a form of the alternative hypothesis.

The chapter is divided into five sections.
The first serves as a literature review, reporting the results of some related research findings. The remaining four sections describe the investigation. The first of these four sections reports the design parameters used in the investigation. The second section describes the generation procedure for the Monte Carlo sampling. The third section reports as well as discusses the results of the investigation and the final section discusses the conclusions reached as a result of the investigation.

RELATED RESEARCH

Bhapkar (1961) demonstrated that a statistic of the form of $\mathcal{L}^2$ was algebraically equivalent to the modified minimum chi-square statistic of Neyman (1949) denoted as $X_1^2$. The $\mathcal{L}^2$ statistic can be represented in the form

$\mathcal{L}^2 = \sum_{(j_1, j_2, \ldots, j_d)} \dfrac{(n_{j_1, j_2, \ldots, j_d} - n\hat{p}_{j_1, j_2, \ldots, j_d})^2}{n_{j_1, j_2, \ldots, j_d}}$ (4-1)

provided $n_{j_1, j_2, \ldots, j_d} > 0$ for all cells of the contingency table. The $\hat{p}_{j_1, j_2, \ldots, j_d}$ form a set of best asymptotically normal (BAN) estimators which are least squares estimators found by minimizing the quantity given in (4-1) subject to the constraints which specify the null hypothesis of marginal homogeneity (Neyman, 1949). Neyman demonstrated that, under certain regularity conditions, the statistic

$X_1^2 = \sum_{k=1}^{c} \dfrac{(n_k - n\hat{p}_k)^2}{n_k}$

has a limiting central chi-square null distribution with t degrees of freedom where the $\hat{p}_k$ are BAN estimators found by minimizing $X_1^2$ subject to t linearly independent constraints,

$F_i(p) = 0$, $i = 1, 2, \ldots, t$. (4-2)

The statistic given in (4-1) is a special case of the more general result just stated. The constraints which specify the particular null hypothesis of marginal homogeneity possess the appropriate regularity conditions stated by Neyman. Consequently, the statistic in (4-1) has a limiting central chi-square distribution under the null hypothesis of marginal homogeneity.
The degrees of freedom are $(d-1)(r-1)$ since $(d-1)(r-1)$ linearly independent constraints are needed to specify the marginal homogeneity null hypothesis.

Neyman showed the asymptotic equivalence of three statistical methods to test a null hypothesis of the form given in (4-2). Use of either the Pearson $X^2$,

$X^2 = \sum_{k=1}^{c} \dfrac{(n_k - n\hat{p}_k)^2}{n\hat{p}_k}$, (4-3)

the Neyman modified minimum $X_1^2$,

$X_1^2 = \sum_{k=1}^{c} \dfrac{(n_k - n\hat{p}_k)^2}{n_k}$, (4-4)

or the likelihood ratio statistic

$-2 \ln \lambda = 2 \sum_{k=1}^{c} n_k (\ln n_k - \ln(n\hat{p}_k))$ (4-5)

leads to asymptotically equivalent results when the $\hat{p}_k$ are BAN estimators of the $p_k$ subject to the constraints of (4-2). Neyman also established that each of these three statistics has as its limiting null distribution the central chi-square distribution with t degrees of freedom. Bhapkar (1966) verified that the three statistics have the same asymptotic relative efficiency.

Research concerning the behavior of the $X^2$ statistic for finite samples may be applicable to the $X_1^2$ and $\mathcal{L}^2$ statistics because all three statistics belong to the same general class and all three possess the same asymptotic properties. Since much of the work on finite sample behavior has been done with the $X^2$ statistic, some of the more significant results are cited. Some of these results may be relevant to the behavior of the $\mathcal{L}^2$ statistic for finite sample sizes since the $\mathcal{L}^2$ statistic is algebraically equal to the $X_1^2$ statistic whenever the latter is defined and $X_1^2$ is asymptotically equivalent to the $X^2$ statistic. Asymptotic equivalence does not necessarily imply that the statistics will behave in a similar manner for finite sample sizes, but a citation of some of the significant findings in the literature will help in designing a Monte Carlo study to examine the behavior of the $\mathcal{L}^2$ statistic for finite sample sizes.

There has been some disagreement as to how large the theoretical cell frequencies, $np_k$, must be before the $X^2$ statistic is distributed approximately as a chi-square random variable.
Fisher (1941) recommended that no theoretical cell frequency be less than five while Cramér (1946) recommended that expectations should be at least ten. Cochran (1954) suggested that if relatively fewer than 20% of the cell expectations are less than five a minimum expectation of 1 is allowable in using the chi-square approximation to the distribution of the $X^2$ statistic.

In the more recent past several authors have shown that the above restrictions are quite conservative in estimating the degree to which the $X^2$ statistic approximates its limiting chi-square distribution for small sample sizes. Maxwell (1961) stated that for fifteen or more degrees of freedom, the Pearson $X^2$ statistic is well approximated by its limiting chi-square distribution even when most of the expected cell frequencies are as low as 1 or 2. Wise (1963), using a theoretical approach, showed that the $X^2$ statistic is well approximated by its limiting distribution when the expected cell sizes are small but nearly equal. Slakter (1966) obtained the empirical null distributions of the $X^2$ statistic when the $p_i$ each equal 1/k, for n's of 10, 15, and 50, and k's providing expected frequencies from 5 (n = 50, k = 10) to .05 (n = 10, k = 200). His study provided further evidence that the chi-square goodness of fit test is robust with respect to small but equal expected cell frequencies.

Roscoe and Byars (1971) obtained results similar to those of Slakter in a Monte Carlo study which examined the small sample behavior of $X^2$ in tests of fit to a uniform distribution. In addition these authors examined the $X^2$ statistic in the context of tests of fit to nonuniform distributions. Roscoe and Byars found that for nonuniform distributions far removed from the uniform, average expected cell frequencies greater than five were needed in order to achieve a good approximation to the limiting distribution.
Good, Gover and Mitchell (1970) theoretically derived the exact distributions for the Pearson $X^2$ and the $-2\ln\lambda$ statistics in the context of a goodness of fit test for an equiprobable k-category multinomial distribution. The authors found that the chi-square approximation was better for the $-2\ln\lambda$ statistic than the $X^2$ statistic when n > 3k/2 or n > k + 9 for tests of size less than .005, where n is the total sample size. The Pearson $X^2$ statistic was better than the $-2\ln\lambda$ statistic in terms of approaching its limiting chi-square distribution when n/k < 1 for tests of any size.

Yarnold (1970), working in the context of the goodness of fit test, showed that if the number of classes k is three or more, and if r denotes the number of expectations less than five, then the minimum expectation may be as small as 5r/k and still achieve a good approximation to the chi-square distribution. Yarnold found that by using this rule, the empirical lower and upper bounds for the true probability of a type I error would be .006 and .0162 for the nominal .01 level and .0375 and .060 for the nominal .05 level.

The cited results of Wise, Slakter, Roscoe, Good and Yarnold are all restricted to the goodness of fit test in which the parameters are completely specified a priori. The results do not reflect the possible effects of estimating the parameters from the data, as in tests of independence or homogeneity. Because the $I^2$ procedure does estimate parameters from the sample data, some research findings in this context follow.

The small sample properties of the $X^2$ test of homogeneity of independent samples were studied by Roscoe and Byars (1970). Cases of 2 to 5 samples from multinomial distributions of 2 to 5 categories were investigated. Both uniform and skewed multinomial distributions were examined.
For the uniform cases average expected cell sizes of 3 to 5 frequently resulted in empirical estimates of alpha that were within two standard errors of the nominal values of .01 and .05. For the skewed distributions average expected cell sizes of 10 or more were typically needed to meet the same criterion. Use of the Pearson $X^2$ statistic resulted in conservative tests when chi-square was used as the limiting reference distribution.

Lewontin and Felsenstein (1965) looked at the Pearson $X^2$ test for homogeneity for 2 x k tables and found the test to be conservative for most of the cases considered. They found the chi-square approximation to the Pearson $X^2$ test to be good for 5 or more degrees of freedom and all expectations at least of size 1.

Margolin and Light (1974) as part of their study compared the Pearson $X^2$ statistic to the likelihood ratio statistic for testing for homogeneity in the context of 3 x 2 contingency tables (three response categories and two independent groups). The authors found that the Pearson $X^2$ statistic is considerably better approximated by its limiting chi-square distribution under the null hypothesis than is the likelihood ratio statistic. The authors also proved that when the two independent groups are equal in size the likelihood ratio statistic is numerically larger than the Pearson $X^2$ statistic. The authors recommended the use of the more conservative Pearson $X^2$ procedure for small sample sizes.

March's (1970) study, referenced by Tate and Hyer (1973), examined the $X^2$ test for independence in 2 x 3 tables with random margins, sample sizes from 8 to 42 and no expectations less than one. March found that when the average expected cell sizes were 1.5, 2.0, 3.0, 4.0, 5.0, 6.0, and 7.0, the mean absolute percentage errors over the .01 - .10 region of test sizes were 69, 36, 30, 23, 21, 20, and 19 respectively. March concluded that if close approximations to the exact probabilities are needed, the chi-square test may at times be poor.

A special case of the weighted least squares approach as described in Chapter II is the minimum logit $X^2$ approach of Berkson (1955). The Berkson approach is used primarily in the testing of hypotheses involving J independent binomial distributions. Let $P_j$ denote the probability of success in population j. The logit $Y_j$ for population j is defined as

$$ Y_j = \ln\frac{P_j}{1-P_j}, \quad j = 1, 2, \ldots, J, \qquad \mathbf{Y}' = [Y_1, Y_2, \ldots, Y_J]. \tag{4-6} $$

Let a null hypothesis to be tested be defined in terms of the linear model

$$ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} \tag{4-7} $$

where $\mathbf{X}$ is a matrix of constants and $\boldsymbol{\beta}$ is a vector of unknown parameters. Let the observed logits be defined as $\hat{Y}_j = \ln\left[\hat{P}_j/(1-\hat{P}_j)\right]$ where the $\{\hat{P}_j\}$ are the observed relative frequencies. Let $n_j$ denote the size of sample j. The $X^2(\text{logit})$ statistic to test the hypothesis given in (4-7) can be written in the form

$$ X^2(\text{logit}) = \sum_{j=1}^{J} n_j\hat{P}_j(1-\hat{P}_j)(\hat{Y}_j - \tilde{Y}_j)^2 \tag{4-8} $$

where the $\{\tilde{Y}_j\}$ are those values of the $\{Y_j\}$ which minimize the quantity

$$ \sum_{j=1}^{J} n_j\hat{P}_j(1-\hat{P}_j)(\hat{Y}_j - Y_j)^2 $$

with respect to the parameters of the model given in (4-7). Bishop et al. (1974) showed that the statistic in (4-8) could be written in the form

$$ X^2(\text{logit}) = (\hat{\mathbf{Y}} - \tilde{\mathbf{Y}})'\,\mathbf{S}_{\hat{Y}}^{-1}\,(\hat{\mathbf{Y}} - \tilde{\mathbf{Y}}) \tag{4-9} $$

where $\mathbf{S}_{\hat{Y}}$ is the estimated variance-covariance matrix of $\hat{\mathbf{Y}}$. The $X^2(\text{logit})$ statistic is thus seen to be a special case of the weighted least squares approach discussed in Chapter II. Taylor (1953) has shown that the $\{\tilde{Y}_j\}$ belong to the class of BAN estimators of Neyman.

Because the $I^2$ statistic is algebraically equal to a statistic based upon a weighted least squares approach, as has already been noted in Chapter II, the small sample properties of $X^2(\text{logit})$ statistics may provide some information about the behavior of the $I^2$ statistic for finite sample sizes.

Berkson (1968) examined the relationships between the Pearson $X^2$ and $X^2(\text{logit})$ statistics using maximum likelihood and minimum logit $X^2$ estimators respectively for the two statistics.
This study was done in the context of tests for no interaction in higher order contingency tables. Berkson found no appreciable difference between the two methods for the sample sizes examined, and recommended the $X^2(\text{logit})$ because of its computational simplicity.

In another comparative study Odoroff (1970) looked at the small sample properties of 12 goodness of fit tests for interaction in 2 x 2 x 2 and 3 x 2 x 2 contingency tables. The 12 tests were constructed by combining three tests (the minimum logit chi-square test, the Pearson chi-square test and the likelihood ratio test) with four methods of estimation (iterative maximum likelihood estimation and three variations of the minimum logit chi-square estimation). Odoroff found that the $X^2(\text{logit})$ and the Pearson $X^2$ statistic approximated their chi-square limiting distribution better than did the likelihood ratio statistic. In addition Odoroff found that the approximation to the limiting distribution was better when minimum logit chi-square estimation was used in place of estimation by maximum likelihood. Odoroff's was a small sample study which employed contingency tables whose minimum expectations ranged from one to ten observations in a cell.

There still is little agreement as to the requisite sample sizes for use of the chi-square large sample procedures. Some authors were very optimistic as to the applicability of the techniques to small samples while others were more pessimistic. The works cited in this literature review aided the author in designing a Monte Carlo study of the small sample properties of the $I^2$ test. The articles suggested possible sample sizes to examine and distributions, both uniform and nonuniform, to consider. Bhapkar (1966) noted that the $I^2$ technique might be inaccurate in the presence of many empty cells, but he was unsure of the extent of this inaccuracy.
This issue led to the inclusion of distributions which would give rise to many empty cells, since such a phenomenon is not rare in social science data analysis. The design of the Monte Carlo study of the small sample behavior of $I^2$ follows.

DESIGN PARAMETERS USED IN THE INVESTIGATION

Contingency Tables Examined

A design consisting of d matched samples or d repeated measures on the same sample can be represented by an r x r x ... x r contingency table of d dimensions. A multinomial distribution defined by $r^d$ parameters and a sample of size n are assumed to characterize the frequencies of the $r^d$ cells of the contingency table. The number of cells, $r^d$, grows very rapidly as either r or d increases. For many data analysis situations higher order tables are impractical because of the large sample sizes needed to obtain valid statistical tests. For example, a 4 x 4 x 4 x 4 table contains 256 cells. Even at the very low figure of an average expected cell size of three, 768 observations would be needed and simulation would be prohibitively expensive.

In this study, 3 x 3, 4 x 4, 5 x 5, and 3 x 3 x 3 tables are considered. Their relatively modest sample size demands, their frequent use in social science data analysis and the reduced simulation costs all prompted this choice.

Distributions Considered Under the Null Hypothesis

The $I^2$ procedure tests for homogeneity of correlated marginal distributions. Because the marginal distributions are correlated, the joint distribution of the variables is needed to specify the contingency table completely. The choice of which parameter sets to include in the study must be made on the basis of both the marginal distributions generated and the nature of the configuration specified by the joint distribution. There is no way to investigate the $I^2$ procedure for the general case of a set of specified marginal distributions.
The procedure can only be investigated for the marginal distributions in the context of a specific configuration or joint distribution. For example, the contingency table given in Figure 4-1 is incompletely specified until at least four joint probabilities are given although its homogeneous marginal distributions are specified.

FIGURE 4-1 Two Correlated Distributions (Distribution 1 cross-classified with Distribution 2; marginal probabilities .3, .2, .5)

For each of the 3 x 3, 4 x 4, and 5 x 5 contingency tables 5 different parameter sets were considered under the null hypothesis of marginal homogeneity for a total of 15 discrete distributions. The distributions fall into 5 general categories. The 5 general categories or classes of distributions are denoted by the letters A through E and are given in Appendix C.

The five general classes of null distributions considered for study are each characterized by the common property of symmetry. The symmetric contingency table represents an important subset of the class of all contingency tables which specify the null hypothesis of marginal homogeneity. It was believed that systematic investigation of this important subclass would be more enlightening than attempting to study the entire class of contingency tables which specify the null hypothesis, a task which is clearly beyond the scope of any Monte Carlo study of limited finances.

The five general classes of null distributions considered for the two dimensional contingency tables within the subclass of symmetric tables provide a wide range of possible practical settings for the $I^2$ technique. The joint distributions which make up classes A through C have certain common properties which provide the basis for some interesting comparisons among these classes. The joint distributions for each of the classes A through C give rise to uniform marginal distributions.
Secondly, cells off the main diagonal of each contingency table have equal probabilities associated with them and finally, the main diagonal of each contingency table is characterized by cells of equal probability. The classes A through C differ in one major respect and that is the degree to which the probability is concentrated on the main diagonal of the contingency table. The degree of concentration of probability on the main diagonal for the two dimensional tables corresponds to the degree to which subjects do not change their response for the case of the repeated measures design, and corresponds to the degree of agreement between pairs of dependent subjects in the case of two dependent matched samples. It should be noted that marginal homogeneity can occur regardless of the degree of probability concentration on the main diagonal of the contingency table.

The distributions of class-A are each characterized by a uniform distribution of the probability about the contingency table. The distributions of class-B are each characterized by having main diagonal cell probabilities four times those of the off diagonal cells. This class represents a moderate concentration of probability on the main diagonal with the concentration being approximately 67%, 57%, and 50% for the 3 x 3, 4 x 4, and 5 x 5 tables respectively. The distributions of class-C are each characterized by a very heavy concentration of the total probability on the main diagonal of the contingency table. For each of the distributions within this class between 85% and 88% of the total probability is concentrated on the main diagonal.

The last two classes of distributions, class-D and class-E, define contingency tables in which the marginal distributions are not uniform over the categories of the dependent variable and the off diagonal cell probabilities are not all equal.
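
The class-A and class-B configurations described above are fully determined by the ratio of a diagonal cell probability to an off diagonal cell probability, so their stated properties can be checked directly. The Python sketch below is an illustration under that reading; the function names are hypothetical, and the exact cell probabilities of Appendix C are not reproduced here. A ratio of 1 gives the class-A table and a ratio of 4 gives the class-B table.

```python
from fractions import Fraction

def diagonal_table(r, ratio):
    """r x r joint distribution whose diagonal cells are `ratio` times each off diagonal cell."""
    weight_total = r * ratio + r * (r - 1)   # r diagonal cells plus r(r-1) off diagonal cells
    off = Fraction(1, weight_total)
    return [[off * ratio if i == j else off for j in range(r)] for i in range(r)]

def row_margins(table):
    return [sum(row) for row in table]

def column_margins(table):
    return [sum(col) for col in zip(*table)]

def diagonal_mass(table):
    """Total probability concentrated on the main diagonal."""
    return sum(table[i][i] for i in range(len(table)))

if __name__ == "__main__":
    for r in (3, 4, 5):
        table = diagonal_table(r, 4)         # class-B: diagonal four times off diagonal
        print(r, float(diagonal_mass(table)), row_margins(table) == column_margins(table))
```

Exact fractions are used so that equality of the two sets of margins, that is marginal homogeneity, can be tested without rounding error.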
The class-D distributions were chosen to examine the effect of having a small marginal probability for at least one of the categories of the dependent variable. The low marginal probabilities were chosen to be .10 and .07 respectively for the 3 x 3 and 4 x 4 tables and .065 for each of two categories in the 5 x 5 table. The concentration of probability on the main diagonal was chosen to be relatively high for this class of distributions. The concentrations for the 3 x 3, 4 x 4, and 5 x 5 tables were chosen to be approximately 80%, 70% and 68% respectively.

The class-E distributions represent a series of configurations which might arise when there is some meaningful ordering of the categories of the dependent variable. For each of the distributions in this class, cell probabilities decrease in simplex fashion as one moves away from the main diagonal of the contingency table. The $I^2$ procedure can be employed to test homogeneity of the marginal distributions, although such an hypothesis does not make use of the ordinality of the data.

Four different configurations specifying the null hypothesis were considered for the 3 x 3 x 3 contingency table. The distributions are depicted in Appendix D. For the 3 x 3 x 3 table the main diagonal is composed of three cells: (1,1,1), (2,2,2), and (3,3,3). In the case of a repeated measures design, the degree of concentration on the main diagonal corresponds to the degree to which subjects do not change their response over the repeated measures. For the non repeated measures design, the degree of concentration on the main diagonal corresponds to the degree of agreement among the triples of dependent subjects. Just as in the case of the two dimensional tables, marginal homogeneity can occur regardless of the level of the concentration of probability on the main diagonal.

The configurations A and C given in Appendix D are extensions of the corresponding classes in the two dimensional case to three dimensions.
The two distributions have the same similarities previously described for the two dimensional case. The joint distribution A in Appendix D uniformly distributes the total probability among the 27 cells of the contingency table. Configuration C in Appendix D indicates a very high concentration of the total probability in the main diagonal cells with over 85% concentrated on the main diagonal. Configuration D in Appendix D is an extension of the class-D distributions to the three dimensional table.

Configuration B in Appendix D has no counterpart in the two dimensional case. The distribution might arise in the context of a repeated measures design in which the probability of making only one change in response over the three repeated measures is greater than the probability of making two changes, and the probability of making two changes but responding with only two different categories is greater than the probability of using all three categories. The concentration of probability on the main diagonal is 46%.

The class-C distributions for both the two and three dimensional contingency tables have the property that for small samples, many of the cells of the corresponding contingency tables may be empty. This class of distributions was chosen in order to investigate the claim of Bhapkar (1966) that the $I^2$ statistic may be considerably inaccurate in approaching its limiting distribution when many cells are empty in the contingency table.

Each of the configurations mentioned in this subsection was investigated for sample sizes which were chosen to yield average expected cell frequencies of 3, 5, 10, 20, 40, and 60. The average expected cell frequency, $\bar{H}$, is defined as the total sample size divided by the number of cells in the contingency table, $\bar{H} = n / r^d$.
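
Because the average expected cell frequency is the sample size divided by the number of cells, the sample size implied by each design point is the product of that frequency and $r^d$. The short Python sketch below tabulates these sample sizes for the four table sizes studied; the function name is hypothetical and the values are plain arithmetic from the definition, not figures quoted from the appendices.

```python
AVERAGE_CELL_SIZES = (3, 5, 10, 20, 40, 60)

def sample_size(avg_expected, r, d):
    """Total sample size n giving an average expected cell frequency of avg_expected."""
    return avg_expected * r ** d

if __name__ == "__main__":
    for r, d in ((3, 2), (4, 2), (5, 2), (3, 3)):
        label = " x ".join([str(r)] * d)
        sizes = [sample_size(h, r, d) for h in AVERAGE_CELL_SIZES]
        print(label, sizes)
```

The same function reproduces the figure of 768 observations cited earlier for a 4 x 4 x 4 x 4 table at an average expected cell size of three.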
The sample sizes investigated cover a range which extends from the very small samples which might be found in some experimental research studies to the large samples which might be encountered in survey research.

Distributions Considered Under the Alternative Hypothesis

Three different sets of cell probabilities each specifying a form of the alternative hypothesis were investigated for each of the following size contingency tables: 3 x 3, 4 x 4, 5 x 5, 3 x 3 x 3. The configurations are depicted in Appendices E and F.

Estimates of exact power were computed by Monte Carlo sampling for tests performed at the nominal one, five, and ten percent significance levels for each of the parameter sets under consideration for sample sizes yielding average expected cell frequencies of 5, 10, and 20. These values were compared to their corresponding theoretical values of power as computed under a noncentral chi-square distribution with noncentrality parameter given by

$$ \lambda = \mathbf{Y}^{*\prime}\,\mathbf{V}^{*-1}\,\mathbf{Y}^{*} \tag{4-10} $$

where $\mathbf{Y}^{*}$ and $\mathbf{V}^{*}$ are both functions of the distribution specified by the alternative hypothesis and sample size under consideration. The theoretical value of power was found in tables given in Haynam et al. (1970) for a specified noncentrality parameter. Because many of the values of $\lambda$ considered in this study were not given in the Haynam tables, a Lagrange interpolation formula suggested by Haynam was used to compute the associated power values. If $\lambda$ is the given value of the noncentrality parameter which is not tabled, the interpolation procedure consists of taking 6 tabled values of the noncentrality parameter such that 3 of them are above $\lambda$ and 3 are below $\lambda$. Let these values be denoted by $\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6$ such that $\lambda_1 < \lambda_2 < \lambda_3 < \lambda_4 < \lambda_5 < \lambda_6$. Let $Y_1, Y_2, Y_3, Y_4, Y_5, Y_6$ be the corresponding values of power for the respective $\lambda$'s.
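
The six-point scheme set up here is the standard Lagrange interpolating polynomial, whose formula the text states next as (4-11). A minimal Python sketch follows; the function name is hypothetical, and no values from the Haynam tables are reproduced, so the usage example interpolates a known cubic instead of tabled power values.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)   # one factor of the product over j != i
        total += yi * weight
    return total

if __name__ == "__main__":
    # Stand-in for six tabled noncentrality values and their tabled powers.
    tabled_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    tabled_y = [x ** 3 - 2.0 * x + 1.0 for x in tabled_x]
    print(lagrange_interpolate(tabled_x, tabled_y, 3.5))
```

With six nodes the interpolant reproduces any polynomial of degree five or less exactly, which is one way to check an implementation before applying it to tabled power values.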
The value of power Y which corresponds to the noncentrality parameter $\lambda$ is then given by the formula

$$ Y = \sum_{i=1}^{6} Y_i \prod_{\substack{j=1 \\ j \neq i}}^{6} \frac{\lambda - \lambda_j}{\lambda_i - \lambda_j} \tag{4-11} $$

The interpolation procedure typically yields values of power that are within .0001 of the theoretical power, which is far in excess of the degree of accuracy needed for this study.

The distributions chosen for the power investigation were selected to yield most theoretical power values in the range .40 to .80 for the case of an average expected cell size of 10 for tests performed at the nominal .01 and .05 significance levels. Moderate values of power were chosen as reflective of true data analytic situations. For average expected cell sizes of 5 and 20 more extreme values of theoretical power were achieved, thus providing the investigation with a very broad range of situations to be examined.

For the two dimensional tables three general classes of distributions specifying a form of the alternative hypothesis were considered. The configurations are given in Appendix E. The class-F distributions are each categorized by having the sole change in marginal probabilities occur between two categories. The change is concentrated in a single off diagonal cell. For the 3 x 3, 4 x 4, and 5 x 5 tables these off diagonal cells are (1,2), (2,1), and (2,1) respectively. The change in the marginal probabilities is relatively large for this distribution class. The class-G distributions are each categorized by having a large change in one of the marginal proportions accompanied by smaller changes in the other categories. The class-H distributions are each categorized by having small to moderate changes in the marginal proportions for each of the categories.

TABLE 4-1
Cases Investigated Under the Null Hypothesis

Table        Distribution   Average Expected Cell Size
3 x 3        A              3   5   10   20   40   60
             B              3   5   10   20   40   60
             C              3   5   10   20   40   60
             D              3   5   10   20   40   60
             E              3   5   10   20   40   60
4 x 4        A              3   5   10   20   40   60
             B              3   5   10   20   40   60
             C              3   5   10   20   40   60
             D              3   5   10   20   40   60
             E              3   5   10   20   40   60
5 x 5        A              3   5   10   20   40   60
             B              3   5   10   20   40   60
             C              3   5   10   20   40   60
             D              3   5   10   20   40   60
             E              3   5   10   20   40   60
3 x 3 x 3    A              3   5   10   20   40   60
             B              3   5   10   20   40   60
             C              3   5   10   20   40   60
             D              3   5   10   20   40   60

TABLE 4-2
Cases Investigated Under the Alternative Hypothesis

Table        Distribution   Average Expected Cell Size   Noncentrality Parameter (a)
3 x 3        F              5   10   20                  4.5283,  9.0566, 18.1132
             G              5   10   20                  3.4450,  6.8900, 13.7799
             H              5   10   20                  1.4876,  2.9752,  5.9504
4 x 4        F              5   10   20                  4.7059,  9.4118, 18.8235
             G              5   10   20                  4.4063,  8.8126, 17.6252
             H              5   10   20                  5.5920, 11.1840, 22.3680
5 x 5        F              5   10   20                  5.3215, 10.6429, 21.2858
             G              5   10   20                  4.6661,  9.3322, 18.6643
             H              5   10   20                  4.2989,  8.5979, 17.1957
3 x 3 x 3    I              5   10   20                  3.4069,  6.8139, 13.6278
             J              5   10   20                  4.7230,  9.4461, 18.8921
             K              5   10   20                  9.9387, 19.8773, 39.7546

(a) Values of noncentrality parameters correspond to average expected cell sizes of 5, 10, and 20.

For the 3 x 3 x 3 table three distributions specifying a form of the alternative hypothesis were considered. The configurations are given in Appendix F. Distribution I reflects small to moderate changes in the marginal proportions for all of the categories, with these changes occurring exclusively between the first and second measures. Differences of this type characterize teachers' choices of a disciplinary strategy when increasing increments of information were given about the child involved in the disciplinary incident (Yoshinaga, 1974). Although the teacher's initial choice of a disciplinary strategy may not have been a behavior formation strategy, once such a strategy was chosen, it was typically resistant to change on subsequent trials. Distribution J indicates a steady, moderate decrease in the proportion concentrated in a single category accompanied by smaller increases for the other two categories.
Such differences might typify the status of farm-related occupations in a generational study of occupations. Distribution K reflects the same marginal pattern as distribution I; the changes in the marginal proportions for distribution K are each 1.5 times the corresponding changes in distribution I.

DATA GENERATION

For each case which appears in Tables 4-1 and 4-2, 2000 samples were generated. Estimates of the exact significance levels which correspond to the theoretical values of .01, .05, and .10 for each case considered under the null hypothesis were calculated by counting the number of rejections out of 2000 using the respective cutoff points of the central chi-square distribution. Estimates of exact power were found by counting the number of rejections out of 2000 for tests performed at the nominal significance levels of .01, .05, and .10 for each case considered under the alternative hypothesis.

Data were generated from a population which has a discrete distribution. For the general r x r x ... x r contingency table of d dimensions the discrete distribution can be characterized by $r^d$ cell probabilities. Two main steps comprise this generation procedure:

(i) Generating independent random variables which are uniformly distributed between zero and one.

(ii) Converting the uniformly distributed random variables into random variables from the discrete distributions specified in the investigation.

Each step is discussed in the next two subsections.

Random Number Generator

A multiplicative congruential generator was used to obtain the uniform random variates. The generator was described by Naylor (1968) and adapted to the CDC 6500 computer by Sidney Sytsma. Starting with an initial number $n_0$, pseudo-random numbers are generated according to the recursive formula

$$ n_{i+1} = a\,n_i \pmod{P^e}. $$
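
The recursion can be sketched in a few lines of Python. The constants below are the ones this chapter reports for the CDC 6500 adaptation (modulus $2^{48}$, multiplier $2^{24} + 3$, starting value 135791357); the names `MODULUS`, `MULTIPLIER`, `SEED` and `lcg` are hypothetical, and the sketch makes no claim to reproduce the original Fortran implementation.

```python
MODULUS = 2 ** 48            # P ** e with P = 2 and e = 48
MULTIPLIER = 2 ** 24 + 3     # of the form 8t + 3 and close to the square root of the modulus
SEED = 135791357             # odd, hence relatively prime to 2 ** 48

def lcg(seed=SEED, a=MULTIPLIER, m=MODULUS):
    """Yield unit uniform variates from the recursion n_{i+1} = a * n_i (mod m)."""
    n = seed
    while True:
        n = (a * n) % m
        yield n / m          # dividing by the modulus maps n into the unit interval

if __name__ == "__main__":
    gen = lcg()
    print([round(next(gen), 6) for _ in range(5)])
```

Python integers have arbitrary precision, so the product `a * n` cannot overflow here; on a fixed word-length machine the reduction modulo $2^{48}$ is what the hardware word size provides.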
The numbers a, P, e, and $n_0$ are chosen to maximize the period of the sequence of pseudo-random numbers generated and minimize the first-order serial correlation between the pseudo-random numbers. For a binary computer P is chosen to be 2 and e is chosen to be 48, the number of bits in a single word integer constant on the CDC 6500 computer. The maximum period which can then be achieved is of length $2^{46}$. The size of this period is sufficiently large for almost any Monte Carlo study. Naylor (1968) showed that, to achieve the maximum period, a must be of the form $a = 8t \pm 3$, where t is any positive integer. Naylor (1968) recommended choosing a close to $2^{24}$ in order to minimize first-order serial correlation between the pseudo-random numbers. A further requirement suggests that $n_0$, the initial starting point, must be chosen relatively prime to $2^{48}$.

For Sytsma's adaptation of the generator to the CDC 6500, a was chosen to be $a = 2^{24} + 3$, and $n_0$ was chosen as $n_0 = 135791357$. Such choices guarantee a maximum period of length $2^{46}$ and a low first-order serial correlation between the pseudo-random numbers. By dividing each random number produced by the generator by the quantity $2^{48}$, a uniformly distributed variate defined on the unit interval is obtained.

Three different statistical tests were applied to the generator in order to ascertain if the generator has the desired statistical properties. A description of each of these tests follows.

Each of the tests is applied to a sequence, $\langle u_n \rangle = u_0, u_1, u_2, \ldots$, of real numbers produced by the generator, which purports to be uniformly distributed between zero and one in an independent manner. The three tests applied in this study are designed primarily for integer-valued sequences instead of the real-valued sequence $\langle u_n \rangle$.

The first test applied to the generator determines whether the numbers generated are uniformly distributed between zero and one.
In the case of this first test, the auxiliary sequence $\langle Y_n \rangle = Y_0, Y_1, Y_2, \ldots$, which is defined by the rule $Y_n = [100\,u_n]$, where $[x]$ is the greatest integer less than or equal to x, is formed from the sequence $\langle u_n \rangle$, the numbers which are the product of the generator. The sequence $\langle Y_n \rangle$ is a sequence of integers which are uniformly distributed between 0 and 99 if and only if $\langle u_n \rangle$ is uniform between 0 and 1.

A sequence $\langle u_n \rangle = u_0, u_1, \ldots, u_{9999}$ of numbers was generated and the corresponding $\langle Y_n \rangle$ sequence was examined. For each integer r, $0 \le r < 100$, the number of times $Y_j = r$ for $0 \le j < 10{,}000$ was counted. Under the null hypothesis that $\langle u_n \rangle$ is uniform on [0,1] or equivalently that $\langle Y_n \rangle$ is a sequence of integers uniformly distributed between 0 and 99, the expected number of times $Y_j = r$ is 100 for each integer r, $0 \le r < 100$. A chi-square statistic of the form

$$ X^2 = \sum_{r=0}^{99} \frac{(O_r - 100)^2}{100} \tag{4-12} $$

where $O_r$ is the observed number of $Y_j$'s which equal r, was used to test the null hypothesis that $\langle Y_n \rangle$ is uniform between 0 and 99. This process was repeated for 20 different sequences $\langle u_n \rangle$, each of 10,000 observations, and the results combined into an overall chi-square test which retained the null hypothesis that $\langle Y_n \rangle$, and equivalently that $\langle u_n \rangle$, is uniformly distributed. The test was performed at the .10 significance level.

The second test applied to the generator determined whether pairs of successive numbers are uniformly distributed in an independent manner. For this test an auxiliary sequence $\langle Y_n \rangle = Y_0, Y_1, Y_2, \ldots$, which is defined by the rule $Y_n = [10\,u_n]$, is formed from the sequence $\langle u_n \rangle = u_0, u_1, u_2, \ldots$ This test counts the number of times the pair $(Y_{2j}, Y_{2j+1}) = (q,r)$ occurs for $0 \le j < 10{,}000$; these counts are made for each pair of integers (q,r) with $0 \le q, r < 10$.
Each observation $(Y_{2j}, Y_{2j+1})$ can fall into any of 100 categories with equal probability of 1/100. Under the null hypothesis that pairs of successive numbers are uniformly distributed in an independent manner, 100 pairs are expected to fall into any given category. A chi-square statistic of the form

$$ X^2 = \sum_{q=0}^{9} \sum_{r=0}^{9} \frac{(O_{qr} - 100)^2}{100} \tag{4-13} $$

where $O_{qr}$ is the number of observed pairs of the form (q,r), is used to test the null hypothesis that $\langle Y_n \rangle$, and equivalently $\langle u_n \rangle$, has pairs of successive elements uniformly distributed in an independent manner. This process was repeated for 20 different sequences $\langle u_n \rangle$, each consisting of 20,000 numbers or 10,000 pairs of numbers. The results were combined into an overall chi-square test which retained the null hypothesis at a .10 significance level.

The third test was applied to the random number generator to determine whether triples of successive numbers are uniformly distributed in an independent manner. For this test an auxiliary sequence $\langle Y_n \rangle = Y_0, Y_1, Y_2, \ldots$, which is defined by the rule $Y_n = [5\,u_n]$, is formed from the sequence $\langle u_n \rangle = u_0, u_1, u_2, \ldots$ This test counts the number of times the triple $(Y_{3j}, Y_{3j+1}, Y_{3j+2}) = (p,q,r)$ occurs for $0 \le j < 10{,}000$. These counts are made for each triple of integers (p,q,r) with $0 \le p, q, r < 5$. Each observation $(Y_{3j}, Y_{3j+1}, Y_{3j+2})$ can fall into any of 125 categories with equal probability under the null hypothesis that triples of successive numbers are uniformly distributed in an independent manner. For 10,000 triples a chi-square statistic of the form

$$ X^2 = \sum_{p=0}^{4} \sum_{q=0}^{4} \sum_{r=0}^{4} \frac{(O_{pqr} - 80)^2}{80} \tag{4-14} $$

where $O_{pqr}$ is the number of observed triples of the form (p,q,r), is used to test the null hypothesis that $\langle Y_n \rangle$, and equivalently $\langle u_n \rangle$, has triples of successive elements uniformly distributed in an independent manner. This process was repeated for 20 different sequences $\langle u_n \rangle$, each consisting of 30,000 numbers or 10,000 triples.
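
The three tests differ only in the base used to discretize the uniform variates (100, 10 or 5) and in the number of successive values grouped together (1, 2 or 3), so a single routine covers (4-12), (4-13) and (4-14). The Python sketch below is illustrative: the function name is hypothetical, and Python's own `random` module stands in for the generator under test.

```python
import random
from collections import Counter
from itertools import product

def tuple_chi_square(u, base, width):
    """Chi-square statistic for non-overlapping width-tuples of floor(base * u_n)."""
    digits = [int(base * x) for x in u]                  # auxiliary integer sequence Y_n
    n_tuples = len(digits) // width
    observed = Counter(tuple(digits[width * j: width * j + width])
                       for j in range(n_tuples))
    expected = n_tuples / base ** width                  # equal expectation in every cell
    return sum((observed[cell] - expected) ** 2 / expected
               for cell in product(range(base), repeat=width))

if __name__ == "__main__":
    rng = random.Random(12345)                           # stand-in generator, not the CDC 6500 one
    for base, width in ((100, 1), (10, 2), (5, 3)):
        u = [rng.random() for _ in range(10000 * width)]
        print(base, width, round(tuple_chi_square(u, base, width), 2))
```

Each statistic is referred to a chi-square distribution with one fewer degree of freedom than the number of cells (99, 99 and 124 respectively); the critical values are not computed in this sketch.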
The results were combined into an overall chi-square test which retained the null hypothesis at a .10 significance level.

The results of the three statistical tests performed on the random number generator indicate that the generator does what it purports to do; that is, it generates independent uniform random variates on the unit interval. The second step in the generation process converts the uniformly distributed random variates produced by the generator into random variates from a population which has a specified discrete distribution. The next subsection describes the conversion process used.

Generation of Discrete-Valued Random Variables

Let X be a discrete random variable with $P[X = v_i] = p_i$. A method for generating X in a computer is to generate a unit uniform random variable U and put $X = v_i$ if

$$ p_1 + \cdots + p_{i-1} < U \le p_1 + \cdots + p_i \qquad (p_0 = 0). \tag{4-15} $$

Although this method is theoretically sound the method proves to be too time consuming for discrete random variables which can take on a considerable number of values. Norman and Cannon (1972) developed a computer program for the generation of random variables from any discrete distribution which is far less time consuming than the method given in (4-15). The algorithm which these authors developed generates numbers rapidly (many thousands per second of computer central processing unit time) while requiring a modest amount of computer core storage (no more than a few hundred words). The algorithm is based upon previous work by Marsaglia (1963). The algorithm is now described with the aid of a particular example.

Let X be a discrete random variable of the form given in Table 4-3. Suppose that the computer used for the data generation has storage blocks which may be called for by number. The fastest method of generating the discrete random variable X is probably the following: In memory locations 0 - 999, store 23 a's, 38 b's, 74 c's, 103 d's, ..., 11 m's. Then if $U = .d_1 d_2 d_3 \ldots$ is a unit uniform random variable generated in the computer, look up the number in location $d_1 d_2 d_3$ and designate it as X.

While the method just described may be the fastest it is not necessarily the most economical. The method requires 1000 storage locations, and had the probabilities been given to four decimal places, 10,000 storage locations would have been required. The method of Marsaglia and the more general method of Norman and Cannon is an improvement on the fastest method in that it uses much less memory space and takes only slightly longer to execute. This makes the Norman and Cannon procedure less costly than the fastest method just described.

TABLE 4-3
Illustrative Discrete Random Variable

Value of X    Probability
a             .023
b             .038
c             .074
d             .103
e             .148
f             .206
g             .140
h             .101
i             .093
j             .037
k             .026
m             .011

The fastest method has been described by Marsaglia as analogous to choosing at random a ball from an urn composed of different balls which correspond to the values of the discrete random variable, with the number of balls of each type relative to the total the same as the corresponding probabilities given by the discrete distribution.

The method of Marsaglia can also be compared to choosing a ball at random but the sampling scheme is slightly different. Instead of one urn there are three urns. The urns are filled according to the digits of the probabilities of X given in Table 4-3. The composition of urn 1 reflects the digits in the tenths position, that of urn 2 the digits in the hundredths position, and urn 3 reflects the digits in the thousandths position. Thus the contents are:

urn 1: 1 d, 1 e, 2 f's, 1 g, 1 h
urn 2: 2 a's, 3 b's, 7 c's, 4 e's, 4 g's, 9 i's, 3 j's, 2 k's, 1 m
urn 3: 3 a's, 8 b's, 4 c's, 3 d's, 8 e's, 6 f's, 1 h, 3 i's, 7 j's, 6 k's, 1 m          (4-16)

Let $M_i$ denote the sum of the digits in the ith decimal position of all 12 probabilities given in Table 4-3.
Let urn 1 be chosen with probability $10^{-1} \times M_1$, urn 2 be chosen with probability $10^{-2} \times M_2$, and urn 3 be chosen with probability $10^{-3} \times M_3$. For this example these probabilities are .600, .350, and .050 respectively. Urn 1 consists of 6 balls, urn 2 consists of 35 balls, and urn 3 consists of 50 balls, each of which is marked as indicated in (4-16). In all there are 91 balls available for selection. The Marsaglia technique is analogous to a two-step procedure in ball selection. First an urn is selected with probability .600, .350, and .050 for urns 1, 2, and 3 respectively. Once the urn is selected, a ball is randomly chosen. The principle underlying the Marsaglia technique is given in the equality (4-17).

$$P(X = x) = \sum_{i=1}^{3} P(\text{urn } i \text{ is chosen})\, P(X = x \mid \text{urn } i) \quad \text{for each value } x \text{ of } X \tag{4-17}$$

For example, P(X = a) = (.600)(0) + (.350)(2/35) + (.050)(3/50) = .023, which is the probability given in Table 4-3.

The ball and urn approach is now translated into computer terminology and a corresponding generation procedure is given. In a computer the contents of the urns can be stacked in the following manner. Urn 1 occupies storage locations 0 - 5, urn 2, locations 6 - 40, and urn 3, locations 41 - 90. This arrangement is shown in Table 4-4. The source of Table 4-4 is Marsaglia's (1963) article.

TABLE 4-4
Computer Memory Scheme for Discrete Generation

Urn    Locations    Contents
 1        0             d
          1             e
          2 -  3        f
          4             g
          5             h
 2        6 -  7        a
          8 - 10        b
         11 - 17        c
         18 - 21        e
         22 - 25        g
         26 - 34        i
         35 - 37        j
         38 - 39        k
         40             m
 3       41 - 43        a
         44 - 51        b
         52 - 55        c
         56 - 58        d
         59 - 66        e
         67 - 72        f
         73             h
         74 - 76        i
         77 - 83        j
         84 - 89        k
         90             m

The two-step selection process of choosing an urn and then a ball from the selected urn is translated to the following procedure. Let $U = .d_1 d_2 d_3 \ldots$ be a unit uniform random variable. Let $V(n)$ be the contents of memory location n. Then

$$\begin{array}{lll}
 & \text{(a)} & \text{(b)} \\
1) & \text{if } d_1 < 6 & \text{put } X = V(d_1) \\
2) & \text{if } 60 \le d_1 d_2 < 95 & \text{put } X = V(d_1 d_2 - 54) \\
3) & \text{if } 950 \le d_1 d_2 d_3 & \text{put } X = V(d_1 d_2 d_3 - 909)
\end{array} \tag{4-18}$$
In this rule (a) is analogous to determining the urn and (b) to choosing a ball randomly from the urn selected in part (a). Using rule (4-18) together with the equality in (4-17) results in the generation of random variates with the discrete distribution given in Table 4-3. In all only 91 memory locations are needed.

The Marsaglia algorithm was described for the particular distribution given in Table 4-3. Norman and Cannon developed a computer program for the generation of random variables from any discrete distribution. Their procedure follows the specific illustration given, with the exception that the probabilities of the discrete distributions are expressed to four decimal places. The program consists of first setting up an array similar to that given in Table 4-4 to describe the desired discrete probability distribution. Once the array is set up, a rule similar to (4-18) is determined. If $M_1$, $M_2$, $M_3$, and $M_4$ are the sums of the digits in the first, second, third, and fourth positions of the probabilities of the discrete distribution, the number of storage or memory locations needed is given by $M_1 + M_2 + M_3 + M_4$, which is considerably less than the 10,000 locations needed for the computationally faster procedure.

The generation program GEN given in Appendix G is a Fortran IV version of the general program given by Norman and Cannon. The program reads in the probabilities which comprise the desired discrete distribution, creates an array similar to the one given in Table 4-4, generates the unit uniform random numbers using the multiplicative congruential generator, and then converts the uniform random variates to random variates of the required discrete distribution using the previously created array and a general assignment rule similar to that given in (4-18). For each of the discrete distributions considered in Tables 4-1 and 4-2, 4,000 observations were generated using the Norman and Cannon algorithm.
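The urn scheme described above lends itself to a short illustration. The following Python sketch is not the GEN program of Appendix G; it is a minimal reconstruction, assuming probabilities given to three decimal places and stored as integer thousandths, of the stacked array of Table 4-4 and of an assignment rule of the form (4-18). The names PROBS, build_urns, lookup, and draw are hypothetical and are introduced only for this illustration.

```python
import random
from collections import Counter

# Table 4-3 probabilities as integer thousandths (an assumption made here
# to avoid floating-point digit extraction).
PROBS = {"a": 23, "b": 38, "c": 74, "d": 103, "e": 148, "f": 206,
         "g": 140, "h": 101, "i": 93, "j": 37, "k": 26, "m": 11}


def build_urns(probs):
    """Stack urn 1 (tenths), urn 2 (hundredths), urn 3 (thousandths)."""
    table, sizes = [], []
    for divisor in (100, 10, 1):
        start = len(table)
        for value, p in probs.items():
            # Each value appears as often as its digit in this position.
            table.extend([value] * ((p // divisor) % 10))
        sizes.append(len(table) - start)
    return table, sizes


TABLE, (M1, M2, M3) = build_urns(PROBS)


def lookup(d1, d2, d3):
    """Assignment rule of the form (4-18) for the leading digits of U."""
    two = 10 * d1 + d2
    three = 10 * two + d3
    if d1 < M1:                       # urn 1
        return TABLE[d1]
    if two < 10 * M1 + M2:            # urn 2; here the index is d1d2 - 54
        return TABLE[two - 10 * M1 + M1]
    # urn 3: shift so that the first qualifying value maps to location M1 + M2
    return TABLE[three - (100 * M1 + 10 * M2) + M1 + M2]


def draw(rng=random):
    """Generate one variate from the first three digits of a uniform U."""
    n = rng.randrange(1000)
    return lookup(n // 100, (n // 10) % 10, n % 10)


if __name__ == "__main__":
    # Enumerating all 1000 three-digit values recovers Table 4-3 exactly.
    counts = Counter(lookup(n // 100, (n // 10) % 10, n % 10)
                     for n in range(1000))
    print(len(TABLE), dict(counts) == PROBS)
```

Because every three-digit value of U is equally likely, enumerating the 1000 possible values is enough to check the rule without any sampling; the stacked array needs only M1 + M2 + M3 locations.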
A chi-square goodness of fit test was performed in order to determine the adequacy of the fit of the generated data to their respective theoretical distributions. For a contingency table characterized by the set of cell probabilities $\{P_1, P_2, \ldots, P_k\}$ the chi-square goodness of fit test has the form

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - 4000\,P_i)^2}{4000\,P_i} \tag{4-19}$$

where $O_i$ is the number of observations out of 4000 which fall into cell i. A test of the form given in (4-19) was performed for each configuration considered in the Monte Carlo study. The results of these tests are given in Tables 4-5 and 4-6. From these tables it is seen that out of 31 goodness of fit tests only 2 show significant lack of fit for tests performed at the .10 significance level. These results provide strong evidence that the algorithm effectively generates the desired discrete distributions.

TABLE 4-5
Goodness of Fit Tests for Null Distributions

Table        Distribution    χ² Value    Df    p > (a)
3 × 3             A            9.11       8     .30
                  B           11.48       8     .10
                  C            5.53       8     .50
                  D            4.48       8     .80
                  E            9.01       8     .30
4 × 4             A           12.22      15     .50
                  B           12.68      15     .50
                  C           17.27      15     .30
                  D            7.45      15     .90
                  E           19.90      15     .10
5 × 5             A           22.92      24     .50
                  B           24.61      24     .30
                  C           27.46      24     .20
                  D           14.07      24     .90
                  E           21.97      24     .50
3 × 3 × 3         A           22.27      26     .50
                  B           15.73      26     .90
                  C           16.05      26     .90
                  D           36.06      26     .05 (b)

(a) p = P[χ² ≥ critical value]
(b) lack of fit at α = .10

TABLE 4-6
Goodness of Fit Tests for Non-Null Distributions

Table        Distribution    χ² Value    Df    p > (a)
3 × 3             F           12.89       8     .10
                  G            6.65       8     .50
                  H           11.13       8     .10
4 × 4             F           10.60      15     .70
                  G           17.05      15     .30
                  H            7.54      15     .90
5 × 5             F           20.99      24     .50
                  G           19.09      24     .70
                  H           33.90      24     .05 (b)
3 × 3 × 3         I           21.60      26     .70
                  J           20.91      26     .70
                  K           31.55      26     .20

(a) p = P[χ² ≥ critical value]
(b) lack of fit at α = .10

Analysis Routine

Output from the generation program GEN consists of a vector of cell frequencies for each sample. For each case given in Tables 4-1 and 4-2, 2000 such samples were generated. The vector of cell frequencies serves as the input for the analysis routine XSTAT. XSTAT is a program written in Fortran IV which is used to calculate the value of the X² statistic.
As input the program uses the vector of cell frequencies, the number of matched samples, d, the number of categories of the dependent variable, r, and a matrix of constants $A^*$ defined in Chapter III. The program is given in general form for any values of d and r in Appendix H. Special programs were written to adapt the more general XSTAT program to the specific designs used in the study.

The calculation of the X² statistic,

$$X^2 = \hat{\mathbf{v}}^{*\prime}\, \hat{\Sigma}_{v^*}^{-1}\, \hat{\mathbf{v}}^{*} , \tag{4-20}$$

assumes the nonsingularity of $\hat{\Sigma}_{v^*}$, which is an estimator of the variance-covariance matrix of $\hat{\mathbf{v}}^{*}$. For certain types of contingency tables this matrix may be singular when the sample size is small. This may occur when many of the cells in the contingency table are empty. Berkson (1955) suggested replacing the zero cell frequencies with some small number. Grizzle et al. (1969) and Koch et al. (1974) suggested replacing the zero cell frequencies with a number such as ½ in the event that $\hat{\Sigma}_{v^*}$ is singular. Such a procedure was employed in this study. This procedure was effective in producing nonsingular estimators of $\Sigma_{v^*}$. The simulation results, including a discussion of the occurrence of singularities, follow.

SIMULATION RESULTS

This section is divided into five subsections. The first subsection reports the extent to which singularities occur in the $\hat{\Sigma}_{v^*}$ matrix. The second subsection reports the estimates of the actual significance level of the X² procedure and discusses these results. The third subsection reports the estimates of the actual significance level of the X²* statistic, which is a slight variant of X². The fourth subsection reports the estimates of the actual power of the X² procedure and compares these values with the nominal values obtained from a noncentral chi-square distribution. The fifth subsection deals with the power of the X²* statistic.
Occurrence of Singularities

In Chapter III it was stated that $\Sigma_{v^*}$, the variance-covariance matrix of $\hat{\mathbf{v}}^{*}$, is nonsingular and that the corresponding consistent estimator $\hat{\Sigma}_{v^*}$ is asymptotically nonsingular. For small sample sizes and for certain types of data the matrix $\hat{\Sigma}_{v^*}$ may be singular. Singularities were encountered for distributions of classes C and D for small sample sizes. These results are reported in Table 4-7.

TABLE 4-7
Number of Singularities in 2000 Samples

                               Average Expected Cell Size (n̄)
Table        Distribution        3        5       10
3 × 3             C            485      119        4
                  D            217        0        0
4 × 4             C            340       47        1
                  D             19        3        0
5 × 5             C             91        4        0
                  D             38        5        0
3 × 3 × 3         C             23        0        0
                  D             47        1        0

For sample sizes in which n̄ > 10 no singularities were encountered. When n̄ = 10 the singularity problem is nearly nonexistent. At n̄ = 5 the most noticeable problem occurs for the 3 × 3 - C table, in which approximately one out of every twenty of the $\hat{\Sigma}_{v^*}$ generated was singular. The problem of singularities is also noticeable for the 4 × 4 - C table at n̄ = 5, in which approximately one out of every forty of the $\hat{\Sigma}_{v^*}$ generated was singular. The class-C distributions are the most troublesome because of the very small probabilities attached to the off-diagonal cells. When n̄ is 5 the expected cell frequencies for the off-diagonal cells are between .8 and .9 for the 3 × 3 - C, 4 × 4 - C, 5 × 5 - C, and 3 × 3 × 3 - C tables. In the generation process this may cause some of the off-diagonal cells to be empty. This does not appear to cause inversion problems of the $\hat{\Sigma}_{v^*}$ matrix in the 5 × 5 and 3 × 3 × 3 tables. The presence of zeros in the 3 × 3 and 4 × 4 contingency tables will more frequently cause the matrix to be singular.

When the sample size is such that n̄ = 3 the problem of singularities in the estimated variance-covariance matrix becomes decidedly more pronounced. When n̄ = 3 for the class-C distributions the off-diagonal cells each have a cell size expectation of approximately .5.
This will cause many of the off-diagonal cells to be empty in the generation process. The problem of singularity is quite apparent in the 3 × 3 - C and 4 × 4 - C tables and, to a lesser extent, in the 5 × 5 - C table. The presence of singularities is also quite noticeable for the 3 × 3 - D distribution at n̄ = 3. At this sample size, 4 of the 6 off-diagonal cells have expected frequencies of .54. This will cause some of the off-diagonal cells to be empty in the generation process.

From the results given in Table 4-7 it appears that the presence of empty cells in the 3 × 3 and 4 × 4 tables tends to be more serious in terms of causing the $\hat{\Sigma}_{v^*}$ matrix to be singular than a comparable proportion of empty cells in the 5 × 5 and 3 × 3 × 3 tables. It was found that by applying the Berkson procedure of replacing the empty cells with the frequency ½, all singularities were successfully eliminated. The Berkson correction was used only when $\hat{\Sigma}_{v^*}$ was found to be singular. The results in Table 4-7 correspond to the number of times the Berkson procedure had to be employed out of 2000 samples.

Estimates of Actual Alpha for the X² Statistic

The estimates of the actual alpha are presented in Tables 4-8 through 4-10. The standard errors for the empirical estimates of alpha are estimated by

$$\hat{\sigma}_{\hat{\alpha}} = \sqrt{\text{theoretical alpha}\,(1 - \text{theoretical alpha})/N}$$

where N, the number of samples for this investigation, is 2000. For the nominal alpha values of .01, .05, and .10 the corresponding estimates of the standard errors are .00222, .00487, and .00671 respectively. For the purposes of this investigation, an empirical estimate which falls within 1.96 standard errors of its corresponding nominal value will be considered an adequate approximation to the theoretical value. Such estimates are within 95% confidence intervals of the nominal values, and there is insufficient evidence to state that the actual alpha level and the nominal level are different.
The 95% confidence intervals for the nominal values of alpha of .01, .05, and .10 are (.0056, .0143), (.0404, .0595), and (.0869, .1131) respectively. The choice of a 95% level statement is completely arbitrary although many researchers use it as a rule of thumb. Empirical estimates which fall outside the 95% confidence bounds may still be considered adequate approximations of the theoretical alpha values, depending upon the degree of accuracy required by the data analyst. Roscoe and Byars (1970) regarded empirical values which fall within ± 20% of their corresponding theoretical values as excellent and believed that ± 50% would be quite acceptable in most behavioral applications. Roscoe and Byars claimed that these figures are consistent with the notion of robustness as applied to parametric tests of hypotheses about means. The empirical estimates of α are given in Tables 4-8 through 4-10. It is left to the reader to judge which criterion to use, although the author feels that any empirical estimate falling within the appropriate 95% confidence interval should be considered a good approximation to the theoretical value.

TABLE 4-8
Monte Carlo Estimates of Exact Level of X² Test Associated with Nominal 1% Level

                                 Average Expected Cell Size (n̄)
Table        Distribution      3       5      10      20      40      60
3 × 3             A          .0265   .0145   .0180   .0130   .0115   .0105
                  B          .0135   .0115   .0125   .0115   .0100   .0105
                  C          .0010   .0005   .0045   .0090   .0090   .0115
                  D          .0045   .0090   .0100   .0060   .0115   .0110
                  E          .0150   .0150   .0120   .0135   .0120   .0080
4 × 4             A          .0220   .0205   .0160   .0125   .0090   .0070
                  B          .0195   .0145   .0115   .0120   .0120   .0095
                  C          .0000   .0035   .0050   .0120   .0090   .0090
                  D          .0075   .0115   .0125   .0100   .0105   .0075
                  E          .0165   .0060   .0105   .0070   .0080   .0100
5 × 5             A          .0215   .0155   .0130   .0130   .0140   .0105
                  B          .0185   .0155   .0115   .0110   .0095   .0065
                  C          .0015   .0025   .0085   .0110   .0080   .0140
                  D          .0185   .0155   .0115   .0110   .0095   .0065
                  E          .0165   .0125   .0100   .0100   .0075   .0110
3 × 3 × 3         A          .0150   .0215   .0135   .0125   .0155   .0130
                  B          .0225   .0100   .0150   .0085   .0075   .0070
                  C          .0010   .0055   .0095   .0120   .0100   .0100
                  D          .0210   .0085   .0095   .0170   .0075   .0110

TABLE 4-9
Monte Carlo Estimates of Exact Level of X² Test Associated with Nominal 5% Level

                                 Average Expected Cell Size (n̄)
Table        Distribution      3       5      10      20      40      60
3 × 3             A          .0755   .0685   .0650   .0520   .0620   .0470
                  B          .0725   .0735   .0575   .0595   .0495   .0465
                  C          .0160   .0300   .0465   .0475   .0525   .0510
                  D          .0460   .0540   .0480   .0475   .0510   .0530
                  E          .0750   .0650   .0505   .0560   .0595   .0450
4 × 4             A          .0750   .0645   .0585   .0550   .0560   .0525
                  B          .0765   .0690   .0540   .0540   .0535   .0560
                  C          .0105   .0355   .0490   .0535   .0540   .0515
                  D          .0615   .0590   .0645   .0500   .0530   .0490
                  E          .0710   .0630   .0590   .0460   .0405   .0510
5 × 5             A          .0760   .0660   .0565   .0545   .0575   .0555
                  B          .0685   .0640   .0570   .0640   .0500   .0450
                  C          .0265   .0460   .0515   .0515   .0445   .0605
                  D          .0620   .0565   .0505   .0585   .0530   .0585
                  E          .0700   .0685   .0535   .0555   .0475   .0515
3 × 3 × 3         A          .0650   .0685   .0570   .0535   .0535   .0540
                  B          .0655   .0605   .0570   .0510   .0480   .0390
                  C          .0330   .0475   .0500   .0500   .0545   .0490
                  D          .0685   .0510   .0555   .0615   .0505   .0515

TABLE 4-10
Monte Carlo Estimates of Exact Level of X² Test Associated with Nominal 10% Level

                                 Average Expected Cell Size (n̄)
Table        Distribution      3       5      10      20      40      60
3 × 3             A          .1380   .1210   .1120   .0980   .1130   .0980
                  B          .1415   .1260   .1105   .1065   .1060   .0905
                  C          .0560   .0750   .1220   .0985   .1055   .1025
                  D          .1140   .1055   .1050   .1000   .1100   .0990
                  E          .1415   .1220   .1090   .1115   .1090   .0985
4 × 4             A          .1405   .1145   .1135   .1025   .1095   .0965
                  B          .1435   .1250   .1100   .1060   .1050   .1080
                  C          .0515   .1015   .1030   .1045   .1050   .0985
                  D          .1275   .1140   .1220   .0995   .0970   .1065
                  E          .1260   .1190   .1070   .0990   .0980   .1090
5 × 5             A          .1300   .1225   .1155   .0985   .1135   .1160
                  B          .1225   .1165   .1105   .1150   .1095   .0975
                  C          .0800   .1085   .1065   .1055   .0940   .1150
                  D          .1160   .1140   .1055   .1085   .1000   .1105
                  E          .1325   .1190   .1010   .1075   .1055   .0970
3 × 3 × 3         A          .1260   .1260   .1105   .1010   .1040   .1130
                  B          .1225   .1110   .1105   .0985   .0910   .0970
                  C          .0925   .1150   .1055   .1070   .1015   .1065
                  D          .1335   .1115   .1005   .1145   .1000   .1045
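The standard errors and 95% limits used to judge the entries of Tables 4-8 through 4-10 follow directly from the binomial formula for N = 2000 samples. The short Python sketch below recomputes them; the function names are hypothetical and chosen only for this illustration, and the multiplier 1.96 is the one stated in the text.

```python
import math


def standard_error(alpha, n_samples=2000):
    """Standard error of an empirical alpha based on n_samples samples."""
    return math.sqrt(alpha * (1.0 - alpha) / n_samples)


def confidence_limits(alpha, n_samples=2000, z=1.96):
    """Limits alpha +/- z standard errors around the nominal alpha."""
    half_width = z * standard_error(alpha, n_samples)
    return alpha - half_width, alpha + half_width


def within_limits(estimate, alpha, n_samples=2000, z=1.96):
    """True if an empirical estimate lies inside the limits for alpha."""
    lower, upper = confidence_limits(alpha, n_samples, z)
    return lower <= estimate <= upper


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10):
        lower, upper = confidence_limits(alpha)
        print(f"alpha = {alpha:.2f}  se = {standard_error(alpha):.5f}  "
              f"limits = ({lower:.4f}, {upper:.4f})")
    # Example: the 3 x 3, class-A row of Table 4-9 (nominal .05 level).
    row = [0.0755, 0.0685, 0.0650, 0.0520, 0.0620, 0.0470]
    print([within_limits(value, 0.05) for value in row])
```

The computed limits agree with the quoted intervals up to rounding in the fourth decimal place. Applying within_limits to every entry in a column of a table gives the kind of count that a summary by average expected cell size requires.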
Table 4-11 reports the number of estimated actual alphas which are within the limits of a 95% confidence interval for the corresponding nominal alpha levels. These data exhibit the relationship between average expected cell size and degree of approximation to the limiting distribution under the null hypothesis.

TABLE 4-11
Number and Percentage (Out of 19 Different Distributions) of α̂ Within 95% Confidence Limits of the Corresponding Nominal α

Average Expected                Nominal Alpha
Cell Size (n̄)        α = .01        .05          .10
      3               3 (15%)      1 (5%)       1 (5%)
      5               8 (42%)      6 (32%)      5 (26%)
     10              14 (74%)     18 (95%)     15 (79%)
     20              18 (95%)     17 (89%)     17 (89%)
     40              18 (95%)     17 (89%)     18 (95%)
     60              19 (100%)    17 (89%)     17 (89%)

The results given in Table 4-11 are computed across 19 different combinations of distributions and table sizes. The most striking improvement in the chi-square approximation to the distribution of X² occurs when n̄ is increased from 5 to 10. At n̄ = 10 the approximation to the limiting distribution generally appears to be quite good. There is some further improvement in the approximation when n̄ is increased to 20. Beyond n̄ = 20, there is no noticeable improvement in the chi-square approximation to the distribution of the X² statistic.

One of the more striking characteristics of the results in Tables 4-8 through 4-10 is the almost uniform liberalness of the X² procedure for very small sample sizes (n̄ = 3, 5). The one major exception to this rule is found in the behavior of the X² statistic for the distributions of class-C. Each of these distributions is characterized by a very heavy concentration of the total probability on the main diagonal of the contingency table. When the average expected cell size is 10 the expected number of observations within each off-diagonal cell is between 1.60 and 1.75 for the 3 × 3, 4 × 4, 5 × 5, and 3 × 3 × 3 contingency tables. When n̄ = 3 these figures drop to .48 and .52 respectively.
In general the procedure appears to be conservative for the class-C distributions, especially for n̄ = 3. The chi-square approximation is noticeably better for the 5 × 5 and 3 × 3 × 3 contingency tables than it is for the 3 × 3 and 4 × 4 tables. The approximation to the limiting distribution becomes quite good when n̄ = 5 for the 5 × 5 - C and 3 × 3 × 3 - C distributions. For contingency tables of any of the sizes considered in this investigation the approximation to the limiting distribution is quite good when n̄ ≥ 10 for the class-C distributions. These results are quite encouraging because the distributions of this type give rise to contingency tables in which a considerable number of off-diagonal cells may be empty or contain less than two observations when n̄ = 10, and yet the procedure remains quite valid.

In general for the distributions of classes A, B, C, and E for the two dimensional tables and for the distributions of classes A, B, and C for the 3 × 3 × 3 table, the approximation to the chi-square limiting distribution typically does not become good until n̄ is at least 10. The X² statistic behaves quite well, however, for the distributions of class-D at n̄ = 5. At n̄ = 5, 4 out of 9 cells in the 3 × 3 - D table, 6 out of 16 cells in the 4 × 4 - D table, 14 of 25 cells in the 5 × 5 - D table, and 18 of 27 cells in the 3 × 3 × 3 - D table have expectations of size less than one. Yet despite the presence of a large number of tiny cell size expectations the empirical estimates of alpha are quite close to their nominal values. At n̄ = 5, for the distributions of classes B and E for the two dimensional tables and the B distribution for the three dimensional case, the majority of the cells have expectations of sizes between 2 and 3 with none below 1. For the distributions of class-A the cell size expectations are 5. It seems unusual that the behavior of the X² statistic for the distributions of class-D should be better than the behavior under these other classes of distributions. An explanation for these results is offered in the next subsection.

Estimates of Actual Alpha for the X²* Statistic

It was mentioned in Chapter III that the X² statistic,

$$X^2 = \hat{\mathbf{v}}^{*\prime}\, \hat{\Sigma}_{v^*}^{-1}\, \hat{\mathbf{v}}^{*} ,$$

uses $\hat{\Sigma}_{v^*}$, a consistent estimator of $\Sigma_{v^*}$ both under the null and alternative hypothesis. A slightly different statistic, which is a generalization of the Stuart (1955) statistic discussed in Chapter II, can be used to test for marginal homogeneity. The statistic, denoted as X²*, has the form

$$X^{2*} = \hat{\mathbf{v}}^{*\prime}\, \hat{\Sigma}_{0v^*}^{-1}\, \hat{\mathbf{v}}^{*}$$

with $\hat{\Sigma}_{0v^*}$ having the property that

$$\hat{\Sigma}_{v^*} = \hat{\Sigma}_{0v^*} - \frac{1}{n}\, \hat{\mathbf{v}}^{*} \hat{\mathbf{v}}^{*\prime} . \tag{4-21}$$

The matrix $\hat{\Sigma}_{0v^*}$ is a consistent estimator of the variance-covariance matrix of $\hat{\mathbf{v}}^{*}$ under the assumption that the null hypothesis of marginal homogeneity is true. The use of smaller terms in the variance-covariance matrix of the X² statistic, as seen from (4-21), makes the procedure more liberal than the X²* procedure. An explicit relationship between X² and X²* is given by

$$X^{2*} = \frac{X^2}{1 + \frac{1}{n} X^2} . \tag{4-22}$$

This relationship is a generalization of the relationship (2-67) given in Chapter II. When n is small the disparity between the X² and X²* statistics may be large enough to cause considerable differences in the exact significance levels of the two procedures when used to test the null hypothesis.

Empirical studies cited earlier in this chapter indicated that the fit of the small sample chi-square statistics to the limiting chi-square distribution was quite good for uniform data. When the data were markedly nonuniform with certain cell probabilities very small, the tests tended to be quite conservative for small sample sizes.
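The algebraic link between the two statistics can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the XSTAT program of Appendix H: it assumes d = 2 (a single square r × r table), takes the vector of differences between row and column marginal proportions with the last, redundant difference dropped, and uses the usual multinomial covariance for those differences with and without the null-hypothesis restriction. All function names and the example table are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    size = len(m)
    for c in range(size):
        pivot = max(range(c, size), key=lambda i: abs(m[i][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for i in range(size):
            if i != c:
                factor = m[i][c] / m[c][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[c])]
    return [m[i][size] / m[i][i] for i in range(size)]


def quad_form(v, s):
    """Return v' s^{-1} v."""
    return sum(vi * xi for vi, xi in zip(v, solve(s, v)))


def homogeneity_statistics(table):
    """Return (X2, X2_star) for a square table of counts (assumes d = 2)."""
    r = len(table)
    n = sum(sum(row) for row in table)
    p = [[count / n for count in row] for row in table]
    row = [sum(p[i]) for i in range(r)]
    col = [sum(p[i][j] for i in range(r)) for j in range(r)]
    k = r - 1  # the r differences sum to zero, so one is dropped
    v = [row[i] - col[i] for i in range(k)]
    # n times the covariance of v, estimated under marginal homogeneity
    s0 = [[row[i] + col[i] - 2 * p[i][i] if i == j else -(p[i][j] + p[j][i])
           for j in range(k)] for i in range(k)]
    # unrestricted estimate: subtract v v'
    s = [[s0[i][j] - v[i] * v[j] for j in range(k)] for i in range(k)]
    return n * quad_form(v, s), n * quad_form(v, s0)


if __name__ == "__main__":
    counts = [[20, 5, 3], [4, 15, 6], [2, 7, 18]]  # hypothetical data
    n = sum(sum(row) for row in counts)
    x2, x2_star = homogeneity_statistics(counts)
    print(x2, x2_star, x2 / (1 + x2 / n))
```

Under these assumptions the second and third printed values coincide, and the statistic based on the unrestricted covariance estimate is never the smaller of the two, which is the direction of the liberal tendency described above.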
The results in the last subsection showed the X² statistic to be quite liberal for small sample sizes (n̄ = 3, 5) for distributions which were uniform (class-A) and other distributions which did not have extremely small off-diagonal cell probabilities. The approximation to the limiting distribution was quite good for the class-D distributions for small sample sizes, especially for n̄ = 5. From the description of the class-D distributions given in the previous subsection one would expect the X² statistic to be somewhat conservative for this class of distributions. The class-C distributions did produce conservative results, but the results were surprisingly better than would be expected for such an extreme distribution in which the off-diagonal cells have such small probabilities.

The author speculated that a statistic based upon the use of $\hat{\Sigma}_{0v^*}$ as an estimator of $\Sigma_{v^*}$ would produce results consistent with some of the findings in the literature review. Exact levels of significance for the X²* procedure were estimated by Monte Carlo sampling for tests performed at the nominal .01, .05, and .10 levels. These estimates are based upon the same data that were used in estimating the corresponding exact levels for the X² procedure given in Tables 4-8 through 4-10. The results for the X²* statistic are given in Tables 4-12 through 4-14.

TABLE 4-12
Monte Carlo Estimates of Exact Level of X²* Test Associated with Nominal 1% Level

                             Average Expected Cell Size (n̄)
Table        Distribution      3       5      10      20
3 × 3             A          .0060   .0065   .0110   .0100
                  B          .0020   .0035   .0075   .0100
                  C          .0000   .0000   .0025   .0065
                  D          .0000   .0020   .0060   .0035
                  E          .0010   .0060   .0065   .0095
4 × 4             A          .0080   .0095   .0110   .0135
                  B          .0065   .0040   .0085   .0085
                  C          .0000   .0010   .0030   .0095
                  D          .0010   .0050   .0085   .0065
                  E          .0030   .0015   .0060   .0065
5 × 5             A          .0085   .0075   .0120   .0100
                  B          .0060   .0090   .0080   .0100
                  C          .0000   .0015   .0050   .0095
                  D          .0050   .0050   .0070   .0090
                  E          .0065   .0070   .0075   .0080
3 × 3 × 3         A          .0070   .0155   .0105   .0095
                  B          .0075   .0045   .0090   .0080
                  C          .0005   .0030   .0060   .0095
                  D          .0060   .0030   .0075   .0155

TABLE 4-13
Monte Carlo Estimates of Exact Level of X²* Test Associated with Nominal 5% Level

                             Average Expected Cell Size (n̄)
Table        Distribution      3       5      10      20
3 × 3             A          .0440   .0430   .0520   .0470
                  B          .0350   .0470   .0440   .0515
                  C          .0040   .0140   .0365   .0450
                  D          .0155   .0325   .0410   .0420
                  E          .0370   .0410   .0435   .0565
4 × 4             A          .0370   .0510   .0485   .0535
                  B          .0435   .0510   .0435   .0485
                  C          .0025   .0155   .0380   .0490
                  D          .0310   .0455   .0545   .0445
                  E          .0345   .0475   .0465   .0410
5 × 5             A          .0465   .0495   .0525   .0510
                  B          .0465   .0480   .0440   .0580
                  C          .0090   .0270   .0435   .0490
                  D          .0350   .0430   .0385   .0570
                  E          .0430   .0440   .0455   .0525
3 × 3 × 3         A          .0415   .0550   .0495   .0510
                  B          .0415   .0485   .0485   .0475
                  C          .0130   .0315   .0455   .0540
                  D          .0415   .0365   .0495   .0560

TABLE 4-14
Monte Carlo Estimates of Exact Level of X²* Test Associated with Nominal 10% Level

                             Average Expected Cell Size (n̄)
Table        Distribution      3       5      10      20
3 × 3             A          .0995   .0990   .0995   .0940
                  B          .0845   .1025   .1020   .1055
                  C          .0160   .0450   .0955   .0880
                  D          .0505   .0745   .0905   .0910
                  E          .0870   .0935   .0970   .1030
4 × 4             A          .0970   .0960   .1020   .1090
                  B          .1030   .1025   .0985   .1005
                  C          .0245   .0780   .0910   .1015
                  D          .0805   .0895   .1120   .0950
                  E          .0950   .0980   .0970   .0920
5 × 5             A          .1030   .1030   .1065   .0955
                  B          .0880   .0960   .0970   .1120
                  C          .0420   .0855   .0980   .0975
                  D          .0805   .0885   .0925   .1035
                  E          .0935   .0990   .0905   .1040
3 × 3 × 3         A          .0935   .1040   .1050   .0985
                  B          .0900   .0955   .1035   .0940
                  C          .0570   .0925   .0965   .1020
                  D          .0990   .0935   .0825   .1035

The X²* statistic behaves quite well for the class A, B, and E distributions for an n̄ as low as 5. At the nominal .05 level 9 out of 11 empirical estimates of alpha are within the 95% confidence limits for a nominal alpha of .05 while at the .10 nominal level every estimate is within the 95% confidence limits.
The behavior of the X²* statistic is quite good for the uniform distributions of class-A even when n̄ is as small as 3. These values are somewhat conservative in contrast to the corresponding X² values, which were decidedly liberal, especially at n̄ = 3. The empirical estimates of alpha for the cases using the distributions of class-D are quite conservative at n̄ = 3, especially for the 3 × 3 and 4 × 4 tables. The fit of X²* to its limiting chi-square distribution is extremely poor for the distributions of class-C, with the procedure being extremely conservative at n̄ equalling 3 or 5 and somewhat conservative for the 3 × 3 and 4 × 4 tables at n̄ = 10.

Table 4-15 reports the number of estimated actual alphas which are within the limits of a 95% confidence interval for the corresponding nominal alpha values when the X²* statistic is used.

TABLE 4-15
Number and Percentage (Out of 19 Different Distributions) of α̂ Within 95% Confidence Limits of the Corresponding Nominal α for the X²* Statistic

Average Expected                Nominal Alpha
Cell Size (n̄)        α = .01        .05          .10
      3               9 (47%)      8 (42%)     11 (58%)
      5               6 (32%)     13 (68%)     15 (79%)
     10              16 (84%)     16 (84%)     19 (100%)
     20              17 (89%)     19 (100%)    19 (100%)

A comparison of Tables 4-11 and 4-15 reveals that the X²* statistic better approximates its limiting distribution than does the X² statistic when n̄ = 3, 5 and the null distributions are considered as a whole. However, X²* is extremely conservative when computed for class C and D distributions when n̄ = 3, 5 and at least half of the off-diagonal expected values are less than one. X² should be preferred in these cases. The use of a variance-covariance matrix with smaller terms inflates the X² statistic relative to X²* and cancels out some of the conservativeness of the X²* procedure. For the other classes of distributions considered under the null hypothesis (classes A, B, and E), for n̄ = 3 or 5 none of the off-diagonal cells have cell expectations of size less than one.
For such distributions the approximation of the X²* statistic to its limiting distribution is quite good, especially when n̄ = 5. The smaller terms in the variance-covariance matrix of the X² statistic inflate X² relative to X²*, making the X² procedure liberal when a good approximation to the limiting distribution might have been expected. The power of both the X² and X²* procedures is examined in the next two subsections.

Estimates of Actual Power for the X² Procedure

This subsection reports the degree to which the actual power of the X² procedure approximates the power computed using a noncentral chi-square distribution with the appropriate noncentrality parameter. The value of power found from the noncentral chi-square distribution is referred to as the asymptotic power. Tables 4-16 through 4-18 each report estimates of the actual power of the X² procedure together with the corresponding values of the asymptotic power assumed under the noncentral chi-square distribution for tests performed at the nominal significance levels of .01, .05, and .10.

TABLE 4-16
Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the X² Test Associated with the Nominal 1% Level

                                       Average Expected Cell Size (n̄)
                                  5                  10                  20
Table        Distribution    Est.     Asymp.    Est.     Asymp.    Est.     Asymp.
3 × 3             F         .2695    .2402     .5512    .5569     .9275    .9133
                  G         .1960    .1676     .4385    .4095     .8005    .7962
                  H         .0725    .0603     .1615    .1385     .3500    .3417
4 × 4             F         .2310    .2050     .5445    .5062     .9105    .8925
                  G         .2265    .1872     .4875    .4688     .8615    .8641
                  H         .2995    .2598     .6315    .6097     .9525    .9467
5 × 5             F         .2215    .2073     .5535    .5237     .9390    .9097
                  G         .1870    .1720     .4645    .4469     .8670    .8550
                  H         .1640    .1532     .4075    .4025     .8290    .8141
3 × 3 × 3         I         .1335    .1111     .3100    .2939     .6930    .6780
                  J         .2065    .1750     .4685    .4537     .8590    .8607
                  K         .5220    .4829     .9035    .8830     .9990    .9985

Est. = estimated actual power; Asymp. = asymptotic power.

TABLE 4-17
Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the X² Test Associated with the Nominal 5% Level

                                       Average Expected Cell Size (n̄)
                                  5                  10                  20
Table        Distribution    Est.     Asymp.    Est.     Asymp.    Est.     Asymp.
3 × 3             F         .4995    .4629     .7880    .7734     .9820    .9753
                  G         .3990    .3638     .6511    .6480     .9270    .9240
                  H         .2020    .1772     .3445    .3191     .5990    .5802
4 × 4             F         .4450    .4167     .7575    .7327     .9710    .9666
                  G         .4240    .3922     .7080    .7011     .9560    .9552
                  H         .5295    .4871     .8325    .8110     .9880    .9865
5 × 5             F         .4700    .4203     .7780    .7464     .9860    .9735
                  G         .4005    .3709     .7045    .6816     .9700    .9511
                  H         .3565    .3430     .6545    .6406     .9345    .9320
3 × 3 × 3         I         .3140    .2748     .5445    .5274     .8630    .8348
                  J         .4090    .3753     .6925    .6877     .9515    .9536
                  K         .7725    .7130     .9765    .9630     .9995    .9998

Est. = estimated actual power; Asymp. = asymptotic power.

TABLE 4-18
Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the X² Test Associated with the Nominal 10% Level

                                       Average Expected Cell Size (n̄)
                                  5                  10                  20
Table        Distribution    Est.     Asymp.    Est.     Asymp.    Est.     Asymp.
3 × 3             F         .6175    .5893     .8745    .8566     .9905    .9886
                  G         .5040    .4891     .7560    .7569     .9513    .9596
                  H         .2885    .2752     .4570    .4412     .7330    .6983
4 × 4             F         .5705    .5444     .8500    .8255     .9875    .9839
                  G         .5420    .5195     .8045    .8006     .9790    .9777
                  H         .6520    .6132     .8955    .8842     .9960    .9941
5 × 5             F         .5985    .5486     .8670    .8364     .9940    .9876
                  G         .5205    .4981     .7990    .7852     .9885    .9754
                  H         .4750    .4685     .7580    .7512     .9650    .9643
3 × 3 × 3         I         .4305    .3936     .6545    .6515     .9125    .9157
                  J         .5345    .5026     .7920    .7901     .9710    .9768
                  K         .8500    .8103     .9860    .9820    1.0000    .9999

Est. = estimated actual power; Asymp. = asymptotic power.

The asymptotic power understates the actual power of the X² procedure. Of 108 empirical estimates of the actual power, only 9 are below their corresponding asymptotic values. Of these 9, 7 are within one standard error of their corresponding asymptotic values while the other 2 are within the 95% confidence limits of their respective theoretical values.
Table 4-19 reports the number of cases in which the empirical value falls beyond the upper limits of the corresponding 95% interval about the asymptotic power value. The degree to which the asymptotic power understates the actual power decreases as the average expected cell size increases. This trend is expected since the distribution of the non-null X² statistic should better approximate its limiting noncentral chi-square distribution as the sample size increases. The percentages of empirical power estimates falling within the 95% confidence bounds of their respective asymptotic values are approximately 10%, 60%, and 75% for values of n̄ of 5, 10, and 20 respectively.

TABLE 4-19
Number and Percentage (Out of 12 Different Distributions) of Empirical Power Values Beyond the Upper Limits of the 95% Confidence Intervals for Their Respective Asymptotic Values for Nominal α of .01, .05, .10

Average Expected                Nominal Alpha
Cell Size (n̄)        α = .01        .05          .10
      5               9 (75%)     11 (92%)      9 (75%)
     10               5 (42%)      6 (50%)      3 (25%)
     20               3 (25%)      3 (25%)      3 (25%)

Although the results for power are not as theoretically pleasing as the results for actual alpha, the deviation from the theoretical power is in the desirable direction. For those distributions considered under the alternative hypothesis, the asymptotic power generally serves as a lower bound for the actual power of the procedure. The power appears to be at least as good as would be predicted using asymptotic theory, a situation which is highly desirable for practical applications of the statistical procedure.

Although many of the empirical estimates of the actual power of the X² procedure are not within the 95% confidence limits of their respective theoretical values, implying that the actual values of power probably do differ from their corresponding asymptotic values, the actual magnitudes of the differences are quite small for most cases considered.
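One way to summarize the size of these differences is the average absolute percent error of the empirical estimate relative to the asymptotic value. The Python sketch below shows the calculation for one column of results, using the twelve pairs reported in Table 4-16 for the nominal .01 level at an average expected cell size of 5; the function and variable names are hypothetical and serve only this illustration.

```python
def average_absolute_percent_error(pairs):
    """Mean of |estimate - asymptotic| / asymptotic, in percent."""
    errors = [abs(estimate - asymptotic) / asymptotic
              for estimate, asymptotic in pairs]
    return 100.0 * sum(errors) / len(errors)


# (estimated actual power, asymptotic power), Table 4-16,
# nominal .01 level, average expected cell size of 5
TABLE_4_16_SIZE_5 = [
    (0.2695, 0.2402), (0.1960, 0.1676), (0.0725, 0.0603),  # 3 x 3: F, G, H
    (0.2310, 0.2050), (0.2265, 0.1872), (0.2995, 0.2598),  # 4 x 4: F, G, H
    (0.2215, 0.2073), (0.1870, 0.1720), (0.1640, 0.1532),  # 5 x 5: F, G, H
    (0.1335, 0.1111), (0.2065, 0.1750), (0.5220, 0.4829),  # 3 x 3 x 3: I, J, K
]

if __name__ == "__main__":
    value = average_absolute_percent_error(TABLE_4_16_SIZE_5)
    print(f"{value:.1f}%")
```

The same function can be applied to any other column of Tables 4-16 through 4-18 by substituting the corresponding pairs.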
At the .01 nominal significance level the average absolute percent error of the empirical estimate relative to the asymptotic value of power was 13.9% when $\bar{n}$ = 5, 5.1% when $\bar{n}$ = 10, and 1.3% when $\bar{n}$ = 20. When the test was performed at the nominal .05 level, the comparable measures were 9.2%, 2.7%, and 1.0% respectively, and when the test was performed at the .10 nominal level the corresponding averages were 5.3%, 1.5%, and .8% respectively.

Estimates of Actual Power for the $I^{2*}$ Procedure

The $I^{2*}$ statistic is always less powerful than $I^2$ as a consequence of the respective definitions of the variance-covariance matrices. This subsection examines the degree to which the actual power of the $I^{2*}$ procedure approximates the asymptotic power computed with the noncentrality parameter $\lambda^*$ defined in (4-23).

TABLE 4-20
Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^{2*}$ Test Associated with the Nominal 1% Level

                    $\bar{n}$ = 5          $\bar{n}$ = 10         $\bar{n}$ = 20
Distribution      Est.    Asym.      Est.    Asym.      Est.    Asym.
(Est. = estimated actual power; Asym. = tabled asymptotic power)

3 X 3
  F              .1675   .2116      .4905   .5026      .9170   .8786
  G              .1155   .1522      .3750   .3742      .7785   .7575
  H              .0400   .0582      .1235   .1328      .3235   .3281
4 X 4
  F              .1555   .1894      .4905   .4736      .9040   .8705
  G              .1555   .1739      .3630   .3732      .8485   .8421
  H              .2095   .2368      .4665   .4586      .9470   .9301
5 X 5
  F              .1595   .1953      .5060   .4987      .9325   .8940
  G              .1285   .1633      .4260   .4267      .8530   .8373
  H              .1140   .1461      .3685   .3851      .8130   .7959
3 X 3 X 3
  I              .0955   .1074      .2810   .2837      .6795   .6624
  J              .1515   .1667      .4240   .4346      .8440   .8444
  K              .4390   .4424      .8820   .8512      .9990   .9971

TABLE 4-21
Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^{2*}$ Test Associated with the Nominal 5% Level
                    $\bar{n}$ = 5          $\bar{n}$ = 10         $\bar{n}$ = 20
Distribution      Est.    Asym.      Est.    Asym.      Est.    Asym.
(Est. = estimated actual power; Asym. = tabled asymptotic power)

3 X 3
  F              .4270   .4254      .7605   .7305      .9795   .9617
  G              .3340   .3404      .6330   .6137      .9205   .9037
  H              .1540   .1728      .3165   .3101      .5780   .5657
4 X 4
  F              .3830   .3953      .7390   .7053      .9675   .9580
  G              .3630   .3732      .6855   .6751      .9535   .9454
  H              .4665   .4586      .8095   .7813      .9865   .9810
5 X 5
  F              .4110   .4040      .7535   .7261      .9840   .9674
  G              .3495   .3582      .6860   .6633      .9505   .9430
  H              .3110   .3320      .6305   .6237      .9315   .9230
3 X 3 X 3
  I              .2695   .2685      .5195   .5158      .8560   .8461
  J              .3645   .3631      .6750   .6705      .9485   .9463
  K              .7200   .6776      .9760   .9494      .9995   .9996

TABLE 4-22
Monte Carlo Estimates of Actual Power and Tabled Asymptotic Power for the $I^{2*}$ Test Associated with the Nominal 10% Level

                    $\bar{n}$ = 5          $\bar{n}$ = 10         $\bar{n}$ = 20
Distribution      Est.    Asym.      Est.    Asym.      Est.    Asym.

3 X 3
  F              .5675   .5528      .8615   .8237      .9895   .9814
  G              .4610   .4644      .7370   .7276      .9610   .9470
  H              .2540   .2696      .4320   .4312      .7210   .6853
4 X 4
  F              .5320   .5227      .8385   .8040      .9855   .9793
  G              .4990   .4998      .7895   .7796      .9780   .9721
  H              .6145   .5858      .8840   .8625      .9955   .9914
5 X 5
  F              .5590   .5322      .8575   .8207      .9935   .9844
  G              .4750   .4847      .7880   .7701      .9690   .9707
  H              .4360   .4568      .7415   .7369      .9630   .9589
3 X 3 X 3
  I              .3950   .3863      .6385   .6408      .9100   .9089
  J              .4935   .4899      .7805   .7761      .9700   .9726
  K              .8350   .7819      .9855   .9744     1.0000   .9999

The noncentrality parameter $\lambda^*$ is defined as

$$\lambda^* = \mathbf{V}'\,({}_0\boldsymbol{\Sigma}^{*})^{-1}\,\mathbf{V}. \tag{4-23}$$

Values of the asymptotic power were computed from the Haynam tables for each of the distributions and sample sizes given in Table 4-2. Empirical estimates of the corresponding actual power values were calculated using the same data used in calculating the empirical power values for the $I^2$ procedure. The results of these calculations are given in Tables 4-20, 4-21, and 4-22. Table 4-23 provides a summary of the relationship between sample size and the degree to which the actual power of the $I^{2*}$ procedure approximates the asymptotic power.
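The asymptotic power values above were read from the Haynam tables. As a rough illustration of what such a tabled value represents, the following sketch computes the probability that a noncentral chi-square variable exceeds the central chi-square critical value, using the standard Poisson-mixture representation of the noncentral distribution. It is not the author's procedure: the function names are hypothetical, the series-based incomplete gamma routine is a simple stand-in for a library routine, and the noncentrality value in the usage lines is an arbitrary number chosen only for demonstration.

```python
import math


def gamma_p(a, x):
    """Regularized lower incomplete gamma function P(a, x), by its power series."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    k = 1
    while term > total * 1e-16:
        term *= x / (a + k)
        total += term
        k += 1
    return min(total, 1.0)


def chi2_cdf(x, df):
    """Central chi-square distribution function."""
    return gamma_p(df / 2.0, x / 2.0)


def chi2_critical_value(alpha, df):
    """Upper-alpha critical value of the central chi-square, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_power(alpha, df, noncentrality):
    """P(noncentral chi-square(df, noncentrality) > central critical value)."""
    crit = chi2_critical_value(alpha, df)
    half = noncentrality / 2.0
    weight = math.exp(-half)          # Poisson probability of j = 0
    cdf = 0.0
    for j in range(500):              # Poisson mixture of central chi-squares
        cdf += weight * chi2_cdf(crit, df + 2 * j)
        weight *= half / (j + 1)
    return 1.0 - cdf


# Degrees of freedom are (d-1)(r-1); a 3 X 3 table (d = 2, r = 3) gives df = 2.
df = (2 - 1) * (3 - 1)
print(asymptotic_power(0.05, df, 6.0))   # 6.0 is a purely hypothetical noncentrality
```

With a noncentrality of zero the function returns the nominal alpha, which is a convenient sanity check on the routine.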
All of the estimates indicated in Table 4-23, with the exception of the case of $\alpha$ = .01 at $\bar{n}$ = 5, are beyond the upper limits of their corresponding 95% confidence intervals. This indicates that the asymptotic power computed by (4-23) tends to underestimate the actual power of the $I^{2*}$ procedure, with some exceptions in the case of tests performed at the nominal .01 significance level. These results indicate that, in general, the asymptotic power serves as a lower bound for the actual power of the $I^{2*}$ procedure.

TABLE 4-23
Number and Percentage (out of 12 Different Distributions) of Empirical Power Values of $I^{2*}$ Beyond the 95% Confidence Limits of their Respective Asymptotic Values for Nominal $\alpha$ of .01, .05, .10

Average Expected               Nominal Alpha
Cell Size $\bar{n}$      .01         .05         .10
      5               9 (75%)     3 (25%)     3 (25%)
     10               1 (8%)      6 (50%)     5 (42%)
     20               5 (42%)     4 (33%)     5 (42%)

TABLE 4-24
Estimates of Actual Power of the $I^2$ and $I^{2*}$ Procedures for Tests Performed at the Nominal 5% Level

                    $\bar{n}$ = 5          $\bar{n}$ = 10         $\bar{n}$ = 20
Distribution      $I^2$   $I^{2*}$     $I^2$   $I^{2*}$     $I^2$   $I^{2*}$

3 X 3
  F              .4995   .4270      .7880   .7605      .9820   .9795
  G              .3990   .3340      .6511   .6330      .9270   .9205
  H              .2020   .1540      .3445   .3165      .5990   .5780
4 X 4
  F              .4450   .3830      .7575   .7390      .9710   .9675
  G              .4240   .3630      .7080   .6855      .9560   .9535
  H              .5295   .4665      .8325   .8095      .9880   .9865
5 X 5
  F              .4700   .4110      .7780   .7535      .9860   .9840
  G              .4005   .3495      .7045   .6860      .9700   .9505
  H              .3565   .3110      .6545   .6305      .9345   .9315
3 X 3 X 3
  I              .3140   .2695      .5445   .5195      .8630   .8560
  J              .4090   .3645      .6925   .6750      .9515   .9485
  K              .7725   .7200      .9765   .9760      .9995   .9995

Table 4-24 indicates that at $\bar{n}$ = 5 the differences between the actual power values of the $I^2$ and $I^{2*}$ statistics are rather sizable relative to the power values involved. However, for the larger sample sizes the differences are quite small. The following conclusions result from this Monte Carlo study.
CONCLUSIONS AND IMPLICATIONS OF THE SIMULATION RESULTS

The main purpose of this simulation study was to examine the behavior of the $I^2$ statistic for finite sample sizes under both the null and alternative hypotheses. In addition, the behavior of the $I^{2*}$ statistic, a slight variant of $I^2$, was examined under the same conditions. In this section recommendations regarding the use of each of the statistics are made based upon the findings of the study. In the strict sense, generalizations from this study are limited to the 3 X 3, 4 X 4, 5 X 5, and 3 X 3 X 3 contingency tables for the configurations and sample sizes considered. Nevertheless, certain conclusions can be drawn from the general patterns of observed statistical behavior of the $I^2$ and $I^{2*}$ procedures which may be applicable to a larger class of cases than those actually studied.

In choosing between the $I^2$ and $I^{2*}$ procedures, an important question to consider is the relative importance which the experimenter places upon guarding against committing a type I or a type II error. If a type I error is considered by the experimenter to be the more serious, then a procedure which is conservative might be preferred if the sacrifice in power is not too great. On the other hand, if a type II error is considered to be the more serious error, then the experimenter might be willing to tolerate a higher actual alpha level for the procedure if some increase in statistical power can be gained.

For the configurations considered in this investigation, the distributions of both the $I^2$ and $I^{2*}$ statistics are very well approximated by their limiting central chi-square distribution under the null hypothesis for samples of size $\bar{n} \geq 20$. The actual power values for both of the procedures are quite close in value and are well approximated by the respective asymptotic power values. In general the asymptotic power serves as a lower bound for the actual power of the two procedures.
These results indicate that for contingency tables in which $\bar{n} \geq 20$ and all of the cells contain 3 or more observations, one can expect the asymptotic theory to hold quite well. The choice of which statistic to use for this case is a matter of personal preference. The $I^2$ procedure is slightly more powerful, but the $I^{2*}$ procedure may provide a slightly better approximation to the respective theoretical alpha values.

The distribution of the $I^2$ statistic is well approximated by its limiting central chi-square distribution under the null hypothesis for the configurations considered for samples of size $\bar{n}$ = 10. The procedure tends to be slightly liberal when $\bar{n}$ = 10. The actual levels of significance tend to be between .01 and .015, .05 and .06, and .10 and .115 for tests performed at the nominal .01, .05, and .10 levels respectively. The actual power of the $I^2$ procedure tends to be at least as great as the theoretical power for this size sample.

The $I^{2*}$ statistic tends to be slightly conservative when $\bar{n}$ = 10. In general the actual levels of significance tend to be between .005 and .01, .04 and .05, and .095 and .105 for tests performed at the nominal .01, .05, and .10 levels respectively. For data in which 80 percent or more of the observations are on the main diagonal, the $I^2$ procedure tends to provide actual levels of significance which are closer to the respective theoretical values. The use of the $I^2$ procedure also provides a slight increase in power over the $I^{2*}$ procedure. The $I^2$ procedure is therefore recommended for cases in which 80 percent or more of the observations are concentrated on the main diagonal when $\bar{n}$ = 10. The experimenter should expect the asymptotic theory to hold fairly well for both the $I^2$ and $I^{2*}$ procedures for sample sizes of $\bar{n}$ = 10 for contingency tables in which each of the off-diagonal cells contains at least a single observation.

For sample sizes in which $\bar{n} \geq 10$ the behaviors of the $I^2$ and $I^{2*}$ statistics were not radically different.
When $\bar{n}$ is reduced to 5, however, noticeable differences in the two behaviors are evident. When $\bar{n}$ = 5 the $I^2$ procedure is quite liberal for configurations in which most of the off-diagonal cells contain two or more observations. For tests performed at the nominal .05 level the experimenter can expect actual alpha values between .06 and .07 for such configurations. For contingency tables in which most of the off-diagonal cells contain two or more observations the $I^{2*}$ statistic is recommended, especially if the experimenter is most concerned with guarding against type I errors. For such data the $I^{2*}$ procedure will typically yield actual levels of significance between .045 and .05 for tests performed at the nominal .05 level. The actual power of the $I^{2*}$ procedure is somewhat less than the corresponding power of the $I^2$ procedure when $\bar{n}$ = 5. The loss in power from using the $I^{2*}$ procedure may be from 10 to 15 percent of the actual power of the $I^2$ procedure for values of the latter in the range .30 to .70.

When there is a heavy concentration of the observations on the main diagonal (80% or more) for samples of size $\bar{n}$ = 5, most of the off-diagonal cells contain fewer than two observations. If only 20% of the off-diagonal cells are empty, the .05 nominal alpha level is well approximated by $I^2$ for the 5 X 5 and 3 X 3 X 3 tables. The $I^2$ procedure for such configurations at $\bar{n}$ = 5 tends to be slightly conservative for the larger contingency tables (5 X 5 and 3 X 3 X 3) and somewhat more conservative for the smaller contingency tables (3 X 3, 4 X 4). For these smaller tables actual alpha values of .03 to .04 occur for tests performed at the .05 level. Use of the $I^{2*}$ statistic at $\bar{n}$ = 5 for tables in which 80% or more of the observations are concentrated on the main diagonal is not recommended.
For such data the $I^{2*}$ procedure may have an actual significance level as low as .015 to .02 for the smaller tables and as low as .025 to .03 for the larger tables for tests performed at the nominal .05 level. For such data singularities in the variance-covariance matrix may be a problem in the 3 X 3 table and, to a slightly lesser extent, in the 4 X 4 table.

When the sample size is reduced to $\bar{n}$ = 3 the $I^2$ statistic is quite liberal for configurations in which each off-diagonal cell contains at least one observation. For tests performed at the nominal .05 level, actual significance levels of .065 to .075 are to be expected using the $I^2$ statistic. For the same type of data, actual levels of significance for $I^{2*}$ between .035 and .045 occur for tests performed at the nominal .05 level. For such data the $I^{2*}$ statistic is recommended, especially if the experimenter wants protection against making a type I error. However, use of the $I^{2*}$ procedure results in a considerable loss in power.

When $\bar{n}$ = 3 and the majority of the off-diagonal cells are empty, the approximation of the $I^2$ statistic to its limiting distribution is quite poor for the 3 X 3 and 4 X 4 tables but considerably better for the 5 X 5 and 3 X 3 X 3 tables. For tests performed at any of the nominal .01, .05, or .10 levels the $I^2$ statistic is conservative for data of this type. For the 3 X 3 and 4 X 4 tables, tests performed at the nominal .05 level may have an actual alpha level as low as .01 to .02. For the 5 X 5 and 3 X 3 X 3 tables the experimenter can expect actual alpha values between .03 and .04 for tests performed at the nominal .05 level. The corresponding values of actual alpha for the $I^{2*}$ statistic are .004 for the 3 X 3 and 4 X 4 tables and .0165 for the 5 X 5 and 3 X 3 X 3 tables for tests performed at the nominal .05 level. The $I^2$ statistic should be preferred to $I^{2*}$ for data in which the majority of off-diagonal cells are empty.
For the larger tables the experimenter should expect a test which is moderately conservative. The presence of empty cells in the 3 X 3 and 4 X 4 tables results in a relatively frequent occurrence of singularities in the matrix $\hat{\boldsymbol{\Sigma}}^{*}$. Use of either $I^2$ or $I^{2*}$ for the 3 X 3 or 4 X 4 tables is not recommended if the majority of the off-diagonal cells are empty.

By judiciously choosing between the $I^2$ and $I^{2*}$ procedures the experimenter can expect to have a reasonably valid statistical procedure for sample sizes as small as $\bar{n}$ = 5, and in some cases when $\bar{n}$ is as small as 3. When the choice is between a liberal $I^2$ procedure and a conservative $I^{2*}$ procedure, the experimenter must evaluate the relative importance which is attached to guarding against committing a type I error or a type II error. For those situations in which $I^2$ is conservative the choice is made by virtue of the relationship between $I^2$ and $I^{2*}$.

If the $I^{2*}$ procedure leads to a statistically significant test, any post-hoc procedures should involve the matrix $\hat{\boldsymbol{\Sigma}}^{*}$ rather than ${}_0\hat{\boldsymbol{\Sigma}}^{*}$, since the ${}_0\hat{\boldsymbol{\Sigma}}^{*}$ matrix is a consistent estimator of $\boldsymbol{\Sigma}^{*}$ only under the assumption that the null hypothesis is true.

The results of this study emphasize the need to examine the small sample properties of statistics which are based upon asymptotic results. Theoretically one should expect the $I^2$ and $I^{2*}$ statistics to have very similar behaviors, yet it was seen that when the sample sizes are sufficiently small the two behaviors can be quite divergent.

Chapter V provides a summary of the results which have been given in this and the previous three chapters. In addition, suggestions for further research are given.

CHAPTER V

SUMMARY AND SUGGESTIONS FOR FURTHER RESEARCH

SUMMARY

This dissertation has dealt exclusively with the problem of testing for marginal homogeneity in the mixed categorical data model of order-d.
The design implied by the mixed categorical data model precludes the use of such standard techniques as the chi-square test of homogeneity to test for the homogeneity of the marginal distributions. It was demonstrated in Chapter I that a technique which takes into account the correlated nature of the marginal distributions is necessary in such an analysis. The practical utility of such a technique in behavioral science research was also demonstrated in Chapter I.

In Chapter II the probability model associated with the mixed categorical data model was stated and a rationale for its structure was given. The necessity of using an r X r X ... X r contingency table of d dimensions rather than an r X d contingency table for such a model was explained. Four different statistical procedures for dealing with the problem of testing for homogeneity in the mixed categorical data model were then presented in some detail. The four procedures presented were the $X^2_S$ statistic of Stuart (1955); the $X^2_M$ statistic of Madansky (1963), a procedure based on the likelihood ratio criterion; the $X^2_I$ statistic of Ireland et al. (1969), a procedure based on minimum discrimination information estimation used in conjunction with the minimum discrimination information statistic; and the $X^2_{GSK}$ statistic of Koch and Reinfurt (1971), a statistic based on weighted least squares. All four procedures were shown to be asymptotically equivalent.

The $X^2_{GSK}$ statistic was shown to be algebraically equal to the modified chi-square statistic $X^2_1$ of Neyman (1949) used in conjunction with modified minimum chi-square estimators of the cell probabilities computed under the null hypothesis. The $X^2_S$ statistic was shown to be approximately equal to the modified chi-square statistic $X^2_1$ of Neyman used in conjunction with approximate maximum likelihood estimators computed under the null hypothesis.
All four statistics were shown to belong to the same general class of asymptotic chi-square statistics employing BAN estimators of the cell probabilities computed under the null hypothesis. Each of the statistics has a limiting chi-square distribution with $(d-1)(r-1)$ degrees of freedom under the null hypothesis of marginal homogeneity.

In Chapter II it was also demonstrated that the $X^2_{GSK}$ statistic can be put in the form

$$X^2_{GSK} = \hat{\mathbf{f}}'\,\mathbf{G}^{-1}\,\hat{\mathbf{f}}. \tag{5-1}$$

The linear model specifying the null hypothesis of marginal homogeneity using the $X^2_{GSK}$ procedure gives rise to $(d-1)(r-1)$ linearly independent constraint equations of the form

$$f_i(\mathbf{P}) = 0, \qquad i = 1, 2, \ldots, (d-1)(r-1). \tag{5-2}$$

The null hypothesis can then be stated in the form

$$H_0\colon\ \mathbf{f}(\mathbf{P}) = \mathbf{0}, \tag{5-3}$$

where $\mathbf{f}' = [f_1(\mathbf{P}), \ldots, f_{(d-1)(r-1)}(\mathbf{P})]$. A statistic which can be used to test $H_0$ has the form

$$\hat{\mathbf{f}}'\,\mathbf{G}^{-1}\,\hat{\mathbf{f}}, \tag{5-4}$$

where $\hat{\mathbf{f}}' = [f_1(\hat{\mathbf{P}}), \ldots, f_{(d-1)(r-1)}(\hat{\mathbf{P}})]$ and $\mathbf{G}$ is a consistent estimator of the variance-covariance matrix of $\hat{\mathbf{f}}$.

It was also shown that the $X^2_S$ statistic can be written in the form

$$X^2_S = \hat{\mathbf{f}}'\,{}_0\mathbf{G}^{-1}\,\hat{\mathbf{f}}, \tag{5-5}$$

with ${}_0\mathbf{G}$ being a consistent estimator of the variance-covariance matrix of $\hat{\mathbf{f}}$ under the assumption that the null hypothesis is true. The ${}_0\mathbf{G}$ matrix has the property that

$$\mathbf{G} = {}_0\mathbf{G} - \frac{1}{n}\,\hat{\mathbf{f}}\,\hat{\mathbf{f}}'. \tag{5-6}$$

From statements (5-1), (5-5), and (5-6) it was demonstrated that $X^2_{GSK}$ and $X^2_S$ are related by the equation

$$X^2_S = \frac{X^2_{GSK}}{1 + \frac{1}{n}\,X^2_{GSK}}. \tag{5-7}$$

For $n$ large, the numerical difference between the two statistics is small, and as $n$ approaches infinity the two procedures are equivalent.

In Chapter III a statistic of the form given in (5-1) was developed. The statistic does not require knowledge of linear models for its understanding and has more intuitive appeal than the $X^2_{GSK}$ statistic. The statistic, designated as $I^2$, was shown, using asymptotic theory, to have a limiting central chi-square distribution under the null hypothesis and a noncentral chi-square distribution under the alternative hypothesis. The degrees of freedom are $(d-1)(r-1)$.
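The algebraic relation between $X^2_S$ and $X^2_{GSK}$ can be checked numerically. The sketch below assumes that (5-6) has the form $\mathbf{G} = {}_0\mathbf{G} - \frac{1}{n}\hat{\mathbf{f}}\hat{\mathbf{f}}'$, in which case the Sherman-Morrison identity gives $X^2_S = X^2_{GSK}/(1 + X^2_{GSK}/n)$. The sample size, the vector of marginal differences, and the matrix ${}_0\mathbf{G}$ are all hypothetical values chosen only for illustration (two constraints, as in a 3 X 3 table), and the helper names are not taken from the dissertation.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def quadratic_form(f, m_inv):
    """Return f' M^{-1} f for a 2-vector f and an already inverted matrix."""
    return sum(f[i] * m_inv[i][j] * f[j] for i in range(2) for j in range(2))


n = 100                        # hypothetical sample size
f_hat = [0.08, -0.05]          # hypothetical differences of marginal proportions
g0 = [[0.0040, -0.0015],       # hypothetical covariance matrix computed under H0
      [-0.0015, 0.0030]]

# Assumed form of (5-6): G = 0G - (1/n) f f'
g = [[g0[i][j] - f_hat[i] * f_hat[j] / n for j in range(2)] for i in range(2)]

x2_s = quadratic_form(f_hat, inverse_2x2(g0))     # form (5-5)
x2_gsk = quadratic_form(f_hat, inverse_2x2(g))    # form (5-1)

# The last two printed numbers should agree up to rounding error.
print(x2_gsk, x2_s, x2_gsk / (1.0 + x2_gsk / n))
```

In this construction $X^2_{GSK}$ is never smaller than $X^2_S$, which is in line with the liberal and conservative tendencies reported for the two kinds of statistic in the simulation chapter.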
A detailed set of computational formulas for the $I^2$ statistic was derived.

The development of two different techniques for generating confidence intervals for contrasts involving the marginal proportions was also given in Chapter III. One of the procedures, a simultaneous procedure, was developed along the lines of the results of Scheffé (1959) and Goodman (1964). A second technique was developed which is based on the Bonferroni inequality. The simultaneous Scheffé-like procedure was found to yield narrower confidence intervals for the usual values of $\alpha$ (.05 or .01) when the number of contrasts examined, $k$, is such that $k > \tfrac{1}{2}d(d-1)r$.

In Chapter IV the behavior of the $I^2$ statistic in the finite sample situation was examined by the method of simulation. Estimates of both actual alpha and power were found and compared to their respective asymptotic values for a number of different cell probability configurations and sample sizes. In addition, the small sample properties of the $I^{2*}$ statistic were examined. The $I^{2*}$ statistic is related to $I^2$ by the equation

$$I^{2*} = \frac{I^2}{1 + \frac{1}{n}\,I^2}. \tag{5-8}$$

The extent of the disparity between the $I^2$ and $I^{2*}$ procedures for small $n$ was also considered.

For those configurations considered it was found that the asymptotic theory held quite well for sample sizes of $\bar{n} \geq 20$ and in most cases when $\bar{n} \geq 10$. When $\bar{n}$ was reduced to 5, however, a noticeable decrease in the degree of approximation to the asymptotic results was found. The $I^2$ procedure was found to be quite liberal for configurations in which most of the off-diagonal cells contained two or more observations when $\bar{n}$ = 5. The $I^{2*}$ procedure for the same situation behaved quite closely to what asymptotic theory would predict under the null hypothesis. It was found, however, that a considerable loss in power relative to $I^2$ took place when the $I^{2*}$ statistic was used. For the experimenter who is most concerned with controlling the probability of making a type I error, the $I^{2*}$ procedure was recommended.
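Returning to the two interval procedures summarized above from Chapter III, the following sketch compares the multipliers that would scale the standard error of a contrast under each approach. It rests on assumptions not confirmed by this summary: the Scheffé-like multiplier is taken to be the square root of the upper-$\alpha$ chi-square point with $(d-1)(r-1)$ degrees of freedom, and the Bonferroni multiplier is taken to be the standard normal point at $1 - \alpha/(2k)$. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def gamma_p(a, x):
    """Regularized lower incomplete gamma function P(a, x), by its power series."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    k = 1
    while term > total * 1e-16:
        term *= x / (a + k)
        total += term
        k += 1
    return min(total, 1.0)


def chi2_critical_value(alpha, df):
    """Upper-alpha critical value of the central chi-square, found by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_p(df / 2.0, hi / 2.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_p(df / 2.0, mid / 2.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scheffe_multiplier(alpha, d, r):
    """Assumed simultaneous multiplier: sqrt of the chi-square point on (d-1)(r-1) df."""
    return math.sqrt(chi2_critical_value(alpha, (d - 1) * (r - 1)))


def bonferroni_multiplier(alpha, k):
    """Assumed Bonferroni multiplier for k two-sided intervals."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * k))


# Two conditions (d = 2) and three categories (r = 3), alpha = .05
scheffe = scheffe_multiplier(0.05, 2, 3)
for k in range(1, 7):
    bonf = bonferroni_multiplier(0.05, k)
    narrower = "Scheffe-like" if scheffe < bonf else "Bonferroni"
    print(k, round(scheffe, 4), round(bonf, 4), narrower)
```

The smaller multiplier gives the narrower interval for a given contrast, so the loop shows how the preferred procedure changes as the number of contrasts $k$ grows for one particular choice of $d$ and $r$.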
It was also shown that when most of the observations were concentrated on the main diagonal of the contingency table (greater than 80%) for samples of size $\bar{n}$ = 5, both the $I^2$ and $I^{2*}$ procedures were conservative. The $I^2$ statistic did, however, provide a much better approximation to the limiting distribution and was recommended in such situations.

When the sample size was reduced to $\bar{n}$ = 3 it was found that the $I^2$ procedure was quite liberal for configurations in which each of the off-diagonal cells contained at least one observation. The $I^{2*}$ procedure for the same situation was moderately conservative. For such data the $I^{2*}$ procedure was recommended, especially if the experimenter wants protection against making a type I error. The $I^{2*}$ procedure can result, however, in a considerable loss in power relative to the $I^2$ procedure.

When the sample size is $\bar{n}$ = 3 and the data are such that the majority of the off-diagonal cells are empty, it was found that the fit of the $I^{2*}$ statistic to its limiting distribution is extremely poor. It was also found that the $I^2$ statistic is extremely conservative in such situations for the 3 X 3 and 4 X 4 tables, but only moderately conservative for the 5 X 5 and 3 X 3 X 3 tables. The use of the $I^2$ statistic was recommended for such data if the contingency table is either 5 X 5 or 3 X 3 X 3. It was recommended that the experimenter use neither procedure for such data if the table is 3 X 3 or 4 X 4.

The examination of asymptotic results in the context of finite sample sizes is a necessity if the asymptotic results are to be of any use to the applied statistician. The aim of this dissertation has been to explore a problem along the lines of asymptotic theory and then to test the theory in the context of data analytic circumstances. Financial and time limitations have left certain issues unexplored. Suggestions for further research in the area under consideration are now given.
SUGGESTIONS FOR FURTHER RESEARCH

In Chapter II four asymptotically equivalent procedures which test the null hypothesis of marginal homogeneity were presented. The small sample properties of two of these procedures have been investigated by the method of simulation. Although asymptotically equivalent, the behaviors of the $I^2$ and $I^{2*}$ statistics were quite different for small sample sizes. This suggests that a study of the small sample properties of $X^2_M$ (the likelihood ratio statistic) and $X^2_I$ (the minimum discrimination information statistic) would be both meaningful and extremely useful. A study in which all four statistics are compared would help establish additional guidelines for the applied statistician.

Another area for possible research concerns the manner in which singularities of the variance-covariance matrix $\hat{\boldsymbol{\Sigma}}^{*}$ are handled. The problem of singularities results from the presence of a number of empty cells in the contingency table. Rather than replacing zero cell frequencies by some arbitrary constant such as ½, Bishop et al. (1974) suggested replacing the observed table of cell frequencies with a new table of what they called "pseudo-Bayes estimates". The method consists of selecting an a priori set of probabilities $\{P_i\}$, which may be based on external information or on the data themselves. Based on this a priori distribution, cell estimates are computed using a set of formulas which the authors supply. The advantage of this technique is that one can distinguish among zero frequency cells which have varying probabilities of occurrence. The Berkson rule, which was used in this simulation study, treats all empty cells in the same way; it assumes that each of the empty cells is equally likely. The pseudo-Bayes approach can make use of prior knowledge in determining how to deal with the empty cells. Comparisons between the Berkson rule and that suggested by Bishop et al.
can be made, and their effect on the sampling distributions of the large sample statistics such as $I^2$, $I^{2*}$, $X^2_M$, and $X^2_I$ can be determined.

In the recent past authors such as Koch et al. (1974) and Koch and Reinfurt (1971) have adapted the weighted least squares approach discussed in Chapter II to a series of complex designs in which the dependent variable is categorical. The analysis of categorical data from such designs as the split-plot is quite easily handled using the general weighted least squares methodology. The small sample properties of the statistics generated by these authors have not as yet been investigated. Such an investigation would prove of extreme value to the potential users of the techniques in the behavioral sciences.

Much of the data in the behavioral sciences are categorical, but the development of categorical analogues to ANOVA and MANOVA for complex designs is a relatively recent phenomenon. The availability of such techniques and the ability to assess their statistical validity relative to an a priori set of standards will greatly enhance the behavioral scientist's repertoire of categorical data analysis techniques. The findings of this dissertation, together with further research along the lines suggested, should help greatly in the pursuit of this goal.

APPENDICES

APPENDIX A

PROOF OF THEOREM 2.1

THEOREM 2.1

$$\min_{\text{subject to } H_0} X^2_1 = \hat{\mathbf{f}}'\,\mathbf{G}^{-1}\,\hat{\mathbf{f}}$$

The proof consists of finding algebraic expressions for the modified minimum chi-square estimators by minimizing the quantity

$$X^2_1 = \sum_i \frac{(n_i - nP_i)^2}{n_i} = n \sum_i \frac{(\hat{P}_i - P_i)^2}{\hat{P}_i}, \qquad \hat{P}_i = n_i/n, \tag{A-1}$$

with respect to the $\{P_i\}$, subject to the constraints specifying the null hypothesis (2-45) and the additional constraint that $\sum_i P_i = 1$. The estimators $\{\tilde{P}_i\}$ are then substituted into (A-1), with the resulting expression shown to be the required quadratic form.
Minimizing the expression in (A-1) subject to the required constraints can be accomplished by minimizing

$$Q = n \sum_i \frac{(P_i - \hat{P}_i)^2}{\hat{P}_i} - 2\lambda_0 \Big[\sum_i P_i - 1\Big] - 2 \sum_m \mu_m f_m(\mathbf{P}) \tag{A-2}$$

with respect to the $\{P_i\}$, where $\lambda_0, \mu_1, \ldots, \mu_t$ are the required Lagrangian multipliers which correspond to the respective constraints under which the minimization takes place.

Differentiating $Q$ with respect to $P_i$ and equating to zero results in $r^d$ equations of the form

$$\frac{n(P_i - \hat{P}_i)}{\hat{P}_i} - \lambda_0 - \sum_m \mu_m f_{mi} = 0, \tag{A-3}$$

where $f_{mi} = \partial f_m(\mathbf{P})/\partial P_i$. The $r^d$ equations in (A-3) are a consequence of the fact that $r^d$ parameters define the r X r X ... X r contingency table which is the model under consideration. Multiplying equation (A-3) by $\hat{P}_i$, summing over $i$, and using the fact that $\sum_i P_i = \sum_i \hat{P}_i = 1$ leads to the result

$$\lambda_0 = -\sum_m \mu_m \hat{f}_m, \qquad \hat{f}_m = \sum_i f_{mi}\hat{P}_i. \tag{A-4}$$

Putting the result in (A-4) back into (A-3) one obtains

$$\frac{n(P_i - \hat{P}_i)}{\hat{P}_i} + \sum_m \mu_m \hat{f}_m - \sum_m \mu_m f_{mi} = 0. \tag{A-5}$$

Solving for $P_i$ one obtains

$$\tilde{P}_i = \hat{P}_i \left[ 1 + \frac{\sum_m \mu_m (f_{mi} - \hat{f}_m)}{n} \right]. \tag{A-6}$$

The $\{\tilde{P}_i\}$ are the modified minimum chi-square estimators. Replacing the $P_i$ in (A-1) by the corresponding estimates one obtains, after some simplification, the resulting expression

$$X^2_1 = \frac{1}{n} \sum_i \hat{P}_i \Big[ \sum_m \mu_m (f_{mi} - \hat{f}_m) \Big]^2 = \sum_m \mu_m \sum_{m'} \mu_{m'} \sum_i \Big[ \frac{1}{n} (f_{mi} - \hat{f}_m)(f_{m'i} - \hat{f}_{m'}) \hat{P}_i \Big]. \tag{A-7}$$

Lemma 2.1

$$\sum_i \Big[ \frac{1}{n} (f_{mi} - \hat{f}_m)(f_{m'i} - \hat{f}_{m'}) \hat{P}_i \Big] = g_{mm'} \tag{A-8}$$

Proof of Lemma 2.1

The expression in (A-8) can be written as

$$\frac{1}{n} \sum_i [f_{mi} f_{m'i} \hat{P}_i] - \frac{1}{n} \hat{f}_m \sum_i [f_{m'i} \hat{P}_i] - \frac{1}{n} \hat{f}_{m'} \sum_i [f_{mi} \hat{P}_i] + \frac{1}{n} \hat{f}_m \hat{f}_{m'}$$

$$= \frac{1}{n} \sum_i [f_{mi} f_{m'i} \hat{P}_i] - \frac{1}{n} \hat{f}_m \hat{f}_{m'} - \frac{1}{n} \hat{f}_{m'} \hat{f}_m + \frac{1}{n} \hat{f}_{m'} \hat{f}_m$$

$$= \frac{1}{n} \sum_i [f_{mi} f_{m'i} \hat{P}_i] - \frac{1}{n} \hat{f}_m \hat{f}_{m'} = g_{mm'}$$

as defined in (2-47) and (2-49).

Using the result of Lemma 2.1, the second expression in (A-7) can be written as

$$\sum_m \sum_{m'} \mu_m \mu_{m'} g_{mm'} = \boldsymbol{\mu}'\,\mathbf{G}\,\boldsymbol{\mu}.$$

Thus far it has been shown that

$$\min_{\text{subject to } H_0} X^2_1 = \boldsymbol{\mu}'\,\mathbf{G}\,\boldsymbol{\mu}, \tag{A-9}$$

where $\boldsymbol{\mu}' = [\mu_1, \mu_2, \ldots, \mu_t]$.