Michigan State University

This is to certify that the dissertation entitled

Access to Eighth-Grade Algebra: A Bayesian, Multilevel Analysis

presented by Yuk Fai Cheong has been accepted towards fulfillment of the requirements for the Ph.D. degree in Measurement and Quantitative Methods.

Major professor
Date 2-3-97

MSU is an Affirmative Action/Equal Opportunity Institution

ACCESS TO EIGHTH-GRADE ALGEBRA: A BAYESIAN, MULTILEVEL ANALYSIS

By

Yuk Fai Cheong

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

1997

ABSTRACT

ACCESS TO EIGHTH-GRADE ALGEBRA: A BAYESIAN, MULTILEVEL ANALYSIS

By

Yuk Fai Cheong

This dissertation addressed two research objectives: one substantive, the other methodological. The first objective was to examine what school- and state-level factors may influence a public school's decision to offer eighth-grade algebra for high school credits. The analysis employed the data collected under the 1992 Trial State Assessment Program (TSAP) in mathematics of the National Assessment of Educational Progress (NAEP), and the state profile statistics compiled by the Council of Chief State School Officers (CCSSO) (CCSSO, 1993).
The second, related objective was to develop and evaluate a fully Bayesian, multilevel approach that enables the study of public schools as distributors of learning opportunities in advanced mathematics. The Bayesian, multilevel approach developed by Zeger and Karim (1991) was implemented via the Gibbs sampler (Geman & Geman, 1984; Gelfand & Smith, 1990). The algorithm was coded using the Interactive Matrix Language of the Statistical Analysis System (SAS/IML) (SAS Institute, Inc., 1989). The results of a small simulation study, which generated and analyzed data sets similar in structure to the TSAP data, showed that the SAS/IML code performed well and the algorithm yielded reasonable inferences.

The tested code was used to fit two models studying the distribution of learning opportunities in mathematics. The results of the first, unconditional model reveal significant state-to-state variation in the likelihood that public schools offer eighth-grade algebra. The findings of the second model suggest that schools serving minority students and students of low socioeconomic status (SES), small schools, schools located in rural settings, and schools in states that spend less on education are comparatively less likely than other schools to offer algebra. The implication is that the size, composition, and location of a school are linked to inequality in access to this educational resource. In these ways, the schooling system reinforces social, ethnic, and geographic inequalities in the opportunities available for eighth-grade students to study algebra for high school credits.

This dissertation is dedicated to the loving memory of my grandmom, Ah Mah.

ACKNOWLEDGMENTS

This research was supported by a grant from the American Educational Research Association, which receives funding for its "AERA Grants Program" from the National Science Foundation and the National Center for Education Statistics (U.S. Department of Education) under NSF Grant #RED-9255437.
I wish to thank Dr. Richard Shavelson and Ms. Jeanie Oakes for their help in the grant application process and their continued encouragement. Opinions expressed in this work reflect only those of the author and do not necessarily reflect those of the granting agencies.

I wish to express my gratitude to my dissertation advisor, Dr. Stephen Raudenbush, for his guidance and support from the initial stage to the completion of this dissertation, and for his mentoring and the many invaluable teaching and research opportunities he has given me over the years. I wish to also express my appreciation to my doctoral committee members, Dr. Betsy Becker, Dr. Aaron Pallas, and Dr. James Stapleton, whose availability, support, and critical analysis were exceptional. I am thankful to Dr. Michael Seltzer at the University of California, Los Angeles, for generously sharing his expertise in Gibbs sampling.

I also appreciate the support of my colleagues involved with the Longitudinal and Multilevel Methods Project. In particular, I am grateful to our project manager, Marcy Wallace, for her encouragement. Special thanks to Dr. Randall Fotiu, Dr. Rafa Kasim, and Mengli Yang for their assistance in programming, and to Todd D. Harcek for reading and commenting on the manuscript.

Finally, I am especially appreciative of the love, understanding, and patience given to me by my parents, sisters, brother, and in-laws throughout the pursuit of my graduate education.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1 INTRODUCTION
  1.1 Objectives of the Study
  1.2 Substantive Goals
  1.3 Methodological Goals
  1.4 Overview of the Study

CHAPTER 2 BACKGROUND AND SIGNIFICANCE
  2.1 Introduction
  2.2 The Substantive Inquiry
    2.2.1 Background
    2.2.2 Relevant Literature
    2.2.3 Sources of School-to-School Variation in the Availability of Eighth-Grade Algebra
    2.2.4 Sources of State-to-State Variation in the Availability of Eighth-Grade Algebra
    2.2.5 Significance of the Substantive Inquiry
  2.3 The Methodological Work
    2.3.1 Motivations
    2.3.2 An Illustrative Example
    2.3.3 The Maximum Likelihood Approach
    2.3.4 The Approximate Maximum Likelihood Approach/Penalized Quasi-likelihood Approach
    2.3.5 The Fully Bayesian Approach
    2.3.6 Appropriateness of the Fully Bayesian Approach
    2.3.7 Usefulness of the Bayesian Approach to Policy Research

CHAPTER 3 BAYESIAN ESTIMATION OF GENERALIZED LINEAR MODELS WITH RANDOM EFFECTS VIA THE GIBBS SAMPLER
  3.1 Introduction
  3.2 Logic of the Gibbs Sampler
  3.3 Implementation of the Gibbs Sampler
    3.3.1 Full Conditional Distribution of γ--p(γ | τ, u, y)
    3.3.2 Full Conditional Distribution of τ--p(τ | γ, u, y)
    3.3.3 Full Conditional Distribution of u--p(u | γ, τ, y)
    3.3.4 Starting Values and Convergence Diagnostics
  3.4 A Simulation Study and Results

CHAPTER 4 BAYESIAN HIERARCHICAL ANALYSES OF ACCESS TO EIGHTH-GRADE ALGEBRA
  4.1 Introduction
  4.2 Data, Procedures, and Measures
    4.2.1 Data Description
    4.2.2 Procedures
    4.2.3 Measures
  4.3 Results
    4.3.1 Unconditional Model
    4.3.2 Between-School/Within-State and Between-State Model
  4.4 Discussion and Implications

CHAPTER 5 SUMMARY AND CONCLUSIONS
  5.1 Summary and Conclusions
  5.2 Future Research Needs
    5.2.1 Substantive Inquiry
    5.2.2 Methodological Development

APPENDIX A SAS/IML Code for the Gibbs Sampler
APPENDIX B Approval Letter from the U.S. Department of Education for Accessing Individually Identifiable Survey Data Base
APPENDIX C Approval Letter from the University Committee on Research Involving Human Subjects

LIST OF REFERENCES

LIST OF TABLES

Table 2.1: Defining Features for the Three Major Approaches
Table 3.1: CODA Output for Geweke Convergence Diagnostic
Table 3.2: CODA Output for Raftery and Lewis Convergence Diagnostic
Table 3.3: Results of Simulation Study
Table 4.1: Descriptive Statistics
Table 4.2: Results of the Unconditional (No Predictors) Model
Table 4.3: Results of the Within- and Between-State Model

LIST OF FIGURES

Figure 3.1: Illustration of Rejection Sampling (Taken from Gelman et al., 1995, p. 304)
Figure 3.2: Geweke's Convergence Diagnostic
Figure 4.1: Logits Plotted Against the Midpoints of the Categories of Percentages of Minority Enrollment
Figure 4.2: Plot of Logits Against Midpoints of Grouped Categories of Enrollment
Figure 4.3: Marginal Posterior Distribution of τ (The Unconditional Model)
Figure 4.4a: Distribution of Predicted Probabilities of Offering Algebra for the First Half of Individual States
Figure 4.4b: Distribution of Predicted Probabilities of Offering Algebra for the Second Half of Individual States
Figure 4.5: Marginal Posterior Distribution of τ (The Conditional Model)

CHAPTER 1

INTRODUCTION

1.1 Objectives of the Study

This dissertation has two related objectives, one substantive and the other methodological. The first objective is to describe and study unequal learning opportunities in advanced mathematics in Grade 8 among schools across 42 states¹. This objective inspires the development and evaluation of a fully Bayesian treatment of random effects models with binary outcomes, which constitutes the second, methodological objective of this dissertation.

1.2 Substantive Goals

The substantive purpose of the work is to investigate school and state influences on public schools' decisions to offer eighth-grade algebra for high school credits. Eighth-grade algebra is often a "gatekeeping" course that allows students access to advanced high school mathematics.
Data collected under the 1992 TSAP of NAEP and information made available by CCSSO (CCSSO, 1993) are combined to relate school- and state-level factors to the availability of an algebra course. Inquiry into the curriculum distinctions in mathematics among schools across the forty-two states can help us understand the interschool and interstate variation, and some of its sources, in the stratification of learning opportunities in the middle grades mathematics curriculum. It can help us evaluate the effect of eighth-grade algebra on student achievement by offering a more thorough understanding of the selection process. The analysis can inform discussion on school- and state-level policies related to equity and funding. Furthermore, the likelihood of course availability can serve as an opportunity-to-learn indicator (Oakes, 1989; Pelgrum, Voogt, & Plomp, 1995; Porter, 1991) in monitoring the performance of educational systems.

¹ There were a total of 41 states and 1 territory in the sample. For brevity, they will be referred to as states in this dissertation.

The study draws on perspectives primarily from Sorenson and Hallinan's (1977) conception and Gamoran's (1987) model of opportunities for learning. Raudenbush, Fotiu, Cheong and Ziazi's (1996) study on inequality in access to educational resources, and Mullis, Jenkins, and Johnson's (1994) NAEP 1992 report on school effectiveness pertaining to mathematics education, laid the groundwork for this analysis. Major empirical work that guides the formulation of the research hypotheses includes studies on course availability by Becker (1990), MacIver and Epstein (1995), Monk and Haller (1993), Oakes (1990), and Useem (1992).

At the school level, relationships between school composition, setting, and size and the likelihood of offering algebra for high school credits are examined.
Specifically, the analysis evaluates the hypotheses that schools 1) located in rural settings, 2) with greater percentages of African and Hispanic Americans, 3) of lower SES, and 4) with smaller grade 8 enrollment are less likely to offer high school algebra. At the state level, the study examines 1) whether the availability of the course varies with state poverty levels, as indicated by the percent of children in poverty, 2) whether state educational expenditures are related to how likely the algebra course is to be offered, and 3) whether state educational expenditures moderate the relationships between a school's racial and SES composition and the likelihood that it offers algebra. Put differently, the last question asks if there is an interaction effect between state educational spending and school racial and SES composition on opportunities to learn. It is hypothesized that schools located in states with higher poverty rates are, on average, less likely to offer the advanced course. In addition, greater state educational investment efforts are hypothesized to be associated with a greater likelihood of offering algebra. The last state-level hypothesis postulates that the association between school racial and social composition and the probability of offering algebra depends on how much money states allocate per student.

1.3 Methodological Goals

The inquiry has motivated the development and evaluation of a fully Bayesian analytical approach which accommodates 1) the binary nature of the outcome variable, the offering of the algebra course, 2) the hierarchical or nested design of the TSAP in mathematics, with schools nested within states and territories, and 3) the need to incorporate the uncertainty that arises about the variances at the state level.
By representing the state-level variance as the investigator's uncertainty about the processes that produce it, the hierarchical framework enables us to study state-level heterogeneity and the relationships between state-level predictors and outcomes (Raudenbush, Cheong, & Fotiu, 1995), rather than treating the states as fixed strata.

The fully Bayesian estimation approach has several advantages over two other major approaches, the maximum likelihood (ML) approach (e.g., Karim, 1991) and the approximate maximum likelihood (AML) approach (e.g., Goldstein, 1991) or the closely related penalized quasi-likelihood (PQL) approach (e.g., Breslow & Clayton, 1993). First, relative to AML/PQL, Bayes estimates of the level-2 variances are less biased (Breslow & Clayton, 1993; Rodriguez & Goldman, 1995). Second, relative to ML and AML/PQL, Bayes inference about any parameter allows full assessment of uncertainty in the other parameters in the same model. Thus, unlike the other two approaches, Bayes estimates of the regression parameters incorporate uncertainty in the variance estimate. Lastly, Bayesian estimation supplies researchers with fuller inferential information by providing the entire posterior distributions of the parameters, rather than point and interval estimates alone. This is useful here: given the modest number of states, the posterior distribution of the level-2 variance is likely to be skewed, and inferences based solely on the variance of the estimate can be misleading.

Zeger and Karim (1991) developed an algorithm for this approach and implemented it via the Gibbs sampler (Geman & Geman, 1984; Gelfand & Smith, 1990). They showed with a simulation that the algorithm yielded reasonable inferences in finite sample cases in which the number of clusters was large relative to the number of observations within each cluster (100 clusters, each with 7 observations).
Rodriguez and Goldman (1995) recommended the method as the most appealing option for more complicated models in their assessment of estimation procedures for multilevel models with binary responses, despite the intensive computation involved. However, Rodriguez and Goldman noted that no computer program for implementing the algorithm is publicly available. In this thesis, Zeger and Karim's algorithm is coded, evaluated, and implemented to study the distribution of learning opportunities. A small-scale simulation study is carried out to assess the accuracy of the code and to examine how well the algorithm suits the present analysis, where the number of observations per cluster is larger than the total number of clusters (42 clusters, each with 100 observations on average).

The approach can be applied to policy research that seeks to understand how various policies and factors at the macro (e.g., nations, states, districts) and meso levels (e.g., schools and classrooms) of the educational system(s) operate and interact with one another, and whose design has relatively few self-selected macro units. For instance, the approach can be applied to study how the probability of employment of adults is related to one's educational attainment, social and ethnic background, and literacy level, as well as to state policies on continuing education and retraining programs, using the 1992 National Adult Literacy Study (Raudenbush, Kasim, Eamsukkawat & Miyazaki, 1996). It can be applied to panel data to study trends as well. An example, which would be an extension of the inquiry of this thesis, is to utilize the various waves of the TSAP in mathematics data (1990, 1992, 1996) to review trends in the likelihood of offering algebra over time as a function of state reform initiatives.
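
As a rough illustration of the data structure described in this section (binary outcomes for schools nested within 42 states, with about 100 schools per state), the following Python sketch simulates a two-level random-intercept logit model. It is not the SAS/IML code used in the dissertation. The function name, the intercept of 0.0, the level-2 variance of 0.5, and the equal cluster sizes are all assumptions chosen only for illustration.

```python
import math
import random


def simulate_two_level_binary(n_clusters=42, n_per_cluster=100,
                              intercept=0.0, tau=0.5, seed=0):
    """Simulate binary outcomes for units nested within clusters.

    Assumed model (hypothetical values, for illustration only):
        logit P(y_ij = 1) = intercept + u_j,   u_j ~ Normal(0, tau),
    where tau is the between-cluster (level-2) variance.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        # Draw one random effect per cluster (e.g., per state).
        u_j = rng.gauss(0.0, math.sqrt(tau))
        # Convert the cluster-specific logit to a probability.
        p_j = 1.0 / (1.0 + math.exp(-(intercept + u_j)))
        # Draw one Bernoulli outcome per unit (e.g., per school).
        ys = [1 if rng.random() < p_j else 0 for _ in range(n_per_cluster)]
        data.append(ys)
    return data


data = simulate_two_level_binary()
cluster_rates = [sum(ys) / len(ys) for ys in data]
print(len(data), len(data[0]))
print(min(cluster_rates), max(cluster_rates))
```

The spread of the printed cluster-level proportions gives an informal sense of how a nonzero level-2 variance translates into cluster-to-cluster differences in the probability of the outcome.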

1.4 Overview of the Study

Chapter 2 describes the background and significance of the substantive inquiry into learning opportunities as well as the development of a Gibbs sampler for Bayesian hierarchical models with binary outcomes. Chapter 3 outlines the Gibbs sampler technique developed by Zeger and Karim (1991), and presents the results of a simulation study documenting the performance of the SAS/IML code written to implement the Gibbs sampler. Chapter 4 reports and discusses the findings on school-to-school and state-to-state variation in the likelihood of offering algebra, and the relationships between various school- and state-level factors and a school's decision to include algebra in its middle grades mathematics curriculum. Chapter 5 concludes the study and suggests future research needs.

CHAPTER 2

BACKGROUND AND SIGNIFICANCE

2.1 Introduction

This chapter describes the theoretical frameworks and reviews relevant literature for the substantive inquiry and methodological studies. It elaborates the research objectives and hypotheses stated in Chapter 1. It is contended that the inquiry can add to our understanding of how educational institutions function as distributors of learning opportunities, and that the methodological work done in this thesis can be beneficial to policy research that addresses the multilevel structure of social and educational systems.

2.2 The Substantive Inquiry

2.2.1 Background

With the philosophy that access to algebra in middle grades is critical to a broad range of academic, and eventually occupational, pursuits, and the mission to make algebra available to all middle grades students, projects like the Algebra Project (Jetter, 1993; Moses, 1994; Moses, Kamii, Swap, & Howard, 1989; Silva & Moses, 1990) and the Urban School Science and Mathematics Program (Archer, 1993) have made serious efforts to enhance mathematics and science education in the middle grades. Moses et al.
and Archer postulated that, given the sequential nature of the mathematics curriculum, enrollment in algebra in middle grades is important in determining students' subsequent access to college preparatory mathematics in high schools. Empirical support for their premise was provided by Stevenson, Schiller and Schneider's (1994) research on sequences of opportunities for learning mathematics from Grade 8 to Grade 10. They found that, after controlling for achievement, coursework in eighth-grade algebra is associated with a greater likelihood of enrollment in advanced high school mathematics.

The mission of "algebra for all" was initiated in part by the concern for equity in mathematics and science. A large number of inner-city, minority, and poor students are denied access to prerequisite courses in rigorous academic programs. Differential access to those courses is tied to race and class (Oakes, 1990; Raudenbush et al., 1996; Useem, 1992). The creators of the projects posed a challenge to the ability model, which decrees that only mathematically inclined students should be given access to algebra, and its "institutional expressions" (Moses et al., p. 424)--curriculum tracking. They were convinced that, with appropriate pedagogy, capable teachers, high expectations of success, and strong support from school, parents, and community, mastery of seventh- and eighth-grade algebra is within the reach of every student.

MacIver and Epstein (1995) conducted an initial test of their belief that it is beneficial for public school students of all academic tracks to be given access to algebra in the middle grades, notwithstanding their past success in mathematics.
Employing data from the base year of the National Educational Longitudinal Study of 1988 (NELS:88), MacIver and Epstein found that students who had taken eighth-grade algebra scored .3 to .5 standard deviations higher on a standardized mathematics test than those who had not, controlling for school characteristics, such as social and ethnic composition, and for students' track level, past achievements in mathematics, and background variables. The increment in achievement, moreover, was similar for students in different ability groups. Usiskin (1987) shared, to a great extent, the views of Moses et al. and Archer. Citing the schooling experiences of other countries and that of the University of Chicago School Mathematics Project, he contended that algebra can be learned by average students at the eighth-grade level.

The proponents of these mathematics projects stressed the critical role of school as a provider as well as a distributor of educational opportunity (MacIver & Epstein, 1995) in achieving equity as well as preparing the underrepresented non-Asian minorities for careers in science (Kamii, 1990; Oakes, 1990). As Porter (1991) succinctly stated, "Schools provide educational opportunity; they do not directly produce student learning" (p. 33). Components that constitute opportunities to learn include exposure to content, pedagogy, instructional time, adequate institutional resources, and methods of assessment (Porter, 1991, 1994).

Sorenson and Hallinan's (1977) model for the process of learning highlights the impact schools can have through the instructional resources and learning environments they provide. This model was further elaborated by Hallinan (1996), who collaborated with Sorenson in its conceptualization. In the model, two individual attributes of the students--level of ability and effort--together with opportunities for learning provided by schools, constitute three primary determinants of student learning.
The amount of learning is a function of these three interdependent elements. While the three determinants interact with one another in the learning process, the model, as Sorenson and Hallinan stressed, is not an additive one. A high level of ability and effort does not compensate for a lack of educational opportunity. Students cannot acquire the knowledge from schooling if they are not exposed to the appropriate curriculum, no matter how able or diligent the students are. Opportunities to learn can enhance, and the lack thereof can constrain, a student's learning environment. As most low-income minority families cannot afford to supplement school experiences with learning opportunities in the private sector, such as hiring a tutor, the learning resources at school have a much larger impact on the academic achievements of disadvantaged students (Gordon, 1967; Oakes, 1990).

Another important concern of the projects is the immediate need to tackle the problem of the underparticipation of minorities in the fields of science, mathematics, and engineering. With the increasing proportion of minorities and immigrants in the new labor force, Kamii (1990) argued that, in order to maintain the competitiveness and leadership role of the United States in the global economy, enticing the underrepresented groups into the "pipeline" leading to careers in science is particularly urgent.

The present investigation compares public schools across 42 states with respect to the opportunities they provide students to learn advanced mathematics in middle grades through offering eighth-grade algebra. It seeks to understand how the schooling systems in various states distribute to their students access to the critical gatekeeping algebra course. The course availability measure (Monk, 1994) used here is whether a school offers algebra for high school credits.
The measure indexes the upper bounds of student learning in advanced mathematics set by middle and junior high schools (Gamoran, 1987), as the absence of the course precludes most possibilities to learn algebra in the eighth grade. Influences of various school- and state-level characteristics and policies on this limit are considered. Past relevant studies, on which the formulation of research hypotheses in this study is based, are reviewed in the following section.

2.2.2 Relevant Literature

Two studies laid the groundwork for this dissertation: Raudenbush, Fotiu, et al.'s (1996) work on inequality of access to educational opportunity, and Mullis et al.'s (1994) report on effective schools in mathematics. Using the 1992 TSAP in mathematics of NAEP, Raudenbush et al. investigated social and ethnic inequality in access to resources for learning eighth-grade algebra. Of particular relevance here is the school resource--course offerings--that they considered. Using the proportions of students classified according to ethnicity and level of parental education as predictors, the study modeled the probability of attending a school that offers high school algebra for eighth graders. The results indicate that the probability of attending a school with a rigorous mathematics program is positively associated with the level of parental education and that modest ethnicity gaps exist. In addition, there are substantial differences among states in the overall probability of students' access to algebra in grade eight.

Mullis et al. (1994) carried out a school effectiveness study pertaining to mathematics education using the 1992 NAEP 4th, 8th (not the TSAP in mathematics used in this study), and 12th grade mathematics achievement data from 1500 public and private schools.
In the study, three series of analyses were carried out to classify the more effective versus the less effective schools, to compare the characteristics of those groups of schools, and to identify sets of variables associated with higher mathematics achievement in the more effective schools. In the first series of analyses, the researchers used the school mean mathematics proficiency scores to classify the schools participating in the study into three categories: highest-performing, medium-performing, and lowest-performing schools, and did a direct comparison of the contexts for learning mathematics, e.g., students' course enrollment, in the first and last groups of schools. In the second and third series of analyses, hierarchical linear modeling techniques were employed to identify factors associated with school effectiveness, controlling for home background of students and level of school SES. The three series of analyses yielded congruent results. Their findings on school effectiveness as related to opportunities to learn, socioeconomic characteristics of students and schools, and school ethnic composition are reported here.

With course taking as an opportunity-to-learn measure, the study found that the top one-third of schools had more students than their bottom one-third counterparts enrolled in advanced courses. For example, 27 versus 13 percent of ninth graders in the two different groups of schools had enrolled in algebra by eighth grade. Also, more economically disadvantaged, urban, and minority students were in the lower-performing schools. After adjusting for socioeconomic characteristics of students and schools, two of the eleven factors which significantly differentiated the two groups of the most and least effective schools for grade 8 were proportion of students enrolled in Algebra 1 and percentage of non-African Americans. The more effective schools had more students enrolled in Algebra 1 and fewer minority students (non-African Americans).
The last series of analyses investigated predictors of mathematics achievement in more effective schools. In addition to the large influences of socioeconomic factors, planning to enroll in or taking more advanced courses was a powerful predictor of higher mathematics achievement at grades 8 and 12. In sum, their study suggests that opportunities to learn, as measured by course enrollment, are likely to be linked to school achievement and are socially and racially stratified (Gamoran, 1987).

A few studies in addition to the two pieces of groundwork have implications for the formulation of the research hypotheses of this dissertation. The results of Education in the Middle Grades: A National Survey of Practices and Trends (Becker, 1990) show substantial between-school variability in the offering of algebra. The survey found that out of a probability sample of 2,400 public schools that enrolled seventh-graders, 63% of the middle-grade schools offered algebra courses in their curriculums in grade 7, and, more typically, grade 8. More than one third of the schools in the sample did not offer algebra. The finding, together with Raudenbush, Fotiu, et al.'s (1996) report of substantial interstate variation in the probability of the offering of algebra, indicates that there is considerable variance at both the school and state level, and encourages an examination of its sources.

2.2.3 Sources of School-to-School Variation in the Availability of Eighth-Grade Algebra

At the school level, variability in the availability of the algebra course may reflect differences in school composition, urbanicity, and size, as conceptualized in Gamoran's (1987) model of opportunities for learning. Gamoran postulated that the location of a school in an urban, suburban, or rural setting, and school composition may have influences on school offerings, and lead to between-school variation and stratification of learning opportunities.
School offerings in turn may have an impact on students' achievement. Gamoran labeled and tested these indirect "setting effects" on achievement in his study. Oakes (1990), using data from the 1985-1986 National Survey of Science and Mathematics Education (NSSME), studied the percentages of 1,200 junior high schools of different SES (high poverty to high wealth) and Caucasian student populations (different categories of different proportions) that had one or more sections of advanced mathematics, i.e., eighth-grade algebra and ninth-grade geometry. An analysis of variance revealed no significant differences in the percentages of schools that offered accelerated mathematics classes among the SES or racial composition categories. She, however, suggested that meaningful differences may exist. Specifically, low-SES and predominantly minority schools are less likely to offer accelerated mathematics classes. In another analysis, Oakes examined advanced mathematics sections per 100 students and found that high-SES or all-white schools have significantly more of those advanced sections. Useem (1992) reported that district parental education levels were related to the striking district-by-district variability of student enrollment in eighth-grade algebra, which ranged from 13% to 60%.

The present study cross-validates Oakes' findings of non-significant SES and racial composition effects on course offering and evaluates the hypothesis that racial and social stratification exist in the offering of algebra. Given the other related results and that of Raudenbush, Fotiu, et al. (1996), it is expected that schools with fewer economically disadvantaged students and African American and Hispanic minority students are more likely to have rigorous mathematics programs.

Monk and Haller (1993) analyzed the 1980 High School and Beyond (HSB) survey data with 1032 public and private schools in the U.S.
and found a positive significant interaction effect of urban setting and school size on the total number of academic course credits. Monk and Haller argued that, because urban high schools face a less restrictive local teacher labor market, which lifts the human resources constraint on curriculum development, diversity in academic course offerings is favored there. Therefore, all else being equal, the likelihood of offering algebra is hypothesized to be higher in urban versus rural and, similarly, suburban versus rural schools.

Another source of variability among schools in the likelihood of the offering of algebra may originate from a structural factor of schools, that of size. Monk and Haller (1993) contended that greater school size increases efficiency through its potential to generate economies, and their study showed a positive relationship between the enrollment of a high school's graduating class and how many different course credits were offered. Furthermore, Useem (1992), in her study on variability in advanced mathematics course placements across 26 districts, reported that in one K-8 system only some of the larger elementary schools offered algebra. The two studies suggest that greater school size is associated with the offering of a comprehensive and specialized curriculum, as contended by Lee and Smith (1995). It is therefore hypothesized that schools with larger grade 8 enrollments are more likely to offer algebra, net of the effects of school racial and social composition and size.

2.2.4 Sources of State-to-State Variation in the Availability of Eighth-Grade Algebra

Two possible sources of state variation in the availability of algebra are state investment efforts in public education and poverty levels.
As financial support is related to the quality of resource inputs, such as the qualifications of the mathematics teachers and how up to date the instructional materials are, the amount being invested in the educational systems is likely to have an impact on the distribution of scarce educational resources. This is particularly important for schools in economically disadvantaged settings where there are typically fewer educational resources and less qualified mathematics teachers (Archer, 1991; Tate, 1994). Indeed, Tate posited that without fiscal equity for those schools, the chances of successfully implementing the curricular reforms recommended by the well-known Curriculum and Evaluation Standards for School Mathematics (National Council of Teachers of Mathematics (NCTM), 1989) are slim. His views echo those of the proponents of systemic reforms (e.g., O'Day & Smith, 1993). Given that state educational spending accounts for nearly half of total K-12 public school revenues (National Center for Education Statistics, 1994), the present inquiry tests the exploratory hypothesis that different levels of financial resources at the state level are associated with different likelihoods of the offering of algebra.

How much a state spends on education may also influence the availability of the course through its impacts on the relationships between school and ethnicity composition and the likelihood of the course. In other words, state expenditures may interact with the effects of student social and ethnic composition, if any, in influencing a school's decision to include algebra in its mathematics curriculum. The exploratory hypothesis aims at investigating the indirect effects of expenditures on opportunities to learn. Intervening processes, such as the possibility that increased expenditures may help prepare and recruit better qualified teachers and thus may supply the instructional staff necessary for the offering and teaching of advanced courses, are not tested here.
Another exploratory hypothesis pertaining to the poverty level of the state is that schools in states with higher poverty rates, on average, are less likely to offer algebra.

2.2.5 Significance of the Substantive Inquiry

While built upon the existing research, the inquiry of this thesis differs from, and thus adds to, the literature on learning opportunities in several ways. First, unlike most studies, except the one by Raudenbush, Fotiu, et al. (1996), it adopts a multilevel framework and incorporates the clustering effects of states. It explicitly models the state memberships shared by schools, an unprecedented opportunity presented by the TSAP in mathematics. The framework allows one to study interstate and interschool variation appropriately (see Raudenbush, 1988) and how they can be explained by state characteristics and policies.

Whereas both Raudenbush, Fotiu, et al.'s (1996) and the present study investigate access to algebra, they adopt different approaches. The former focused on person-to-person variation and investigated the probability of placement of students in schools which offered algebra as a function of the students' background characteristics. The present work surveys what school- and state-level factors may affect a public school's decision to offer high school algebra. The study focuses on one of the best "school-controlled predictors of student achievement" (Porter, 1994, p. 427): content of instruction.

Another major divergence from the previous research on course availability, such as Oakes' (1990) studies, is the use of more comprehensive models in the present study. The models assessed contain various school compositional, setting, and structural predictor variables. They allow one to investigate the influences of individual variables net of those of the others. For instance, one may be interested in assessing the ethnicity bias while holding constant the influences of the structural characteristic of a school.
The multilevel and better specified models in the study allow a better understanding of the school decision-making process in delivering learning opportunities to their students, and how the process may be related to state financial investment in education.

Furthermore, as state revenue makes up a significant portion of the total school revenue, states can exert influences on how those monies are transformed into educational opportunities. One way to do so, as argued by Monk (1994), is through the explicit statements of the basis of the financial entitlement. A state, for instance, might enumerate the sort of educational opportunities, or the "intended curriculum" (Pelgrum et al., 1995), a district should provide and collect information on how well the district meets the specified details. It is believed that the measure of course availability, operationalized as the likelihood of the offering of algebra, can supply some of the relevant information and act as a production and curriculum indicator (Oakes, 1989; Pelgrum et al., 1995; Porter, 1991).

2.3 The Methodological Work

2.3.1 Motivations

The substantive inquiry has motivated the development and evaluation of a fully Bayesian analytical framework which accommodates 1) the binary nature of the outcome variable of the offering of the algebra course, 2) the hierarchical or nested design of the TSAP in mathematics, with schools nested within states and territories, and 3) the need to incorporate the uncertainty about the variances at the state level. By representing the state-level variance as the investigator's uncertainty about the processes that produce it, the hierarchical framework enables us to study state-level heterogeneity and the relationships between state-level predictors and outcomes (Raudenbush et al., 1995), and not to treat the states as fixed strata.
To illustrate the framework and to establish notation, I use an oversimplified model for the availability of high-school algebra, with two predictor variables: percentages of minorities (Hispanic and African American students) per school and state educational expenditures per student. The fully Bayesian estimation approach, together with the ML and AML/PQL strategies, is then outlined.

2.3.2 An Illustrative Example

Let $y_{ij}$ be an indicator outcome variable which takes on a value of 1 if high-school algebra is offered, and 0 otherwise, for school $i$ in state $j$, where $i = 1$ to $n_j$ and $j = 1$ to $J$. The goal of the analysis is to explore the correlates of the probability, $p_{ij}$, that a particular school offers eighth-grade algebra. At the school level, it is of interest to determine if a higher percentage of minorities in a school, $\mathit{HiMin}_{ij}$, is associated with a lower likelihood of the algebra course being offered, or if the learning opportunities for the gatekeeping course to advanced high school mathematics are racially stratified (Gamoran, 1987). $\mathit{HiMin}_{ij}$ is an indicator variable and is coded 1 for schools with more than 50% Hispanic and African American students, and 0 otherwise. A discussion of the coding scheme of this variable is given in Chapter 4. To constrain the probability to lie within the [0,1] interval, $p_{ij}$ is transformed to $\eta_{ij}$ using the logit link function. The transformed outcome is now the log-odds of the offering of algebra. The model is expressed as:

\[
\log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \eta_{ij} = \beta_{0j} + \beta_{1j}\,\mathit{HiMin}_{ij}, \tag{2.1}
\]

or in vector form,

\[
\eta_{ij} = \begin{pmatrix} 1 & \mathit{HiMin}_{ij} \end{pmatrix} \begin{pmatrix} \beta_{0j} \\ \beta_{1j} \end{pmatrix}. \tag{2.2}
\]

Each coefficient defined in the school-level equation is modeled as an outcome at the state level. The intercept of the school-level model, $\beta_{0j}$, which is the average log-odds of the offering of algebra for schools which are located in state $j$ and have a relatively low percentage of Hispanic and African American students, is estimated to see whether it is related to fiscal equity.
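The logit link in (2.1) can be made concrete with a minimal Python sketch. The coefficient values below are hypothetical and chosen only for illustration; they are not estimates from this study, and the function names are assumptions introduced here.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Map a log-odds value eta back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def school_log_odds(beta0_j, beta1_j, hi_min_ij):
    """Equation (2.1): eta_ij = beta_0j + beta_1j * HiMin_ij."""
    return beta0_j + beta1_j * hi_min_ij

# Hypothetical coefficients for one state j (illustrative values only).
beta0_j, beta1_j = 0.8, -0.5

# Probability of offering algebra for a low-minority (HiMin = 0)
# and a high-minority (HiMin = 1) school in that state.
p_low_minority = inv_logit(school_log_odds(beta0_j, beta1_j, 0))
p_high_minority = inv_logit(school_log_odds(beta0_j, beta1_j, 1))
print(round(p_low_minority, 3), round(p_high_minority, 3))
```

Working on the log-odds scale keeps the linear predictor unbounded while the implied probability always stays inside the unit interval.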
The intercept is modeled as a function of state educational expenditure per student, $\mathit{StateExp}_j$, plus a random state effect according to the model

\[
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathit{StateExp}_j + u_{0j}, \tag{2.3}
\]

where $u_{0j} \sim N(0, \tau)$. In addition, whether or not educational expenditure may be related to possible racial stratification in learning opportunities, as represented by $\beta_{1j}$, is assessed. The following model expresses the relationship:

\[
\beta_{1j} = \gamma_{10} + \gamma_{11}\,\mathit{StateExp}_j. \tag{2.4}
\]

In matrix notation, the combined state-level model can be expressed as

\[
\begin{pmatrix} \beta_{0j} \\ \beta_{1j} \end{pmatrix}
=
\begin{pmatrix} 1 & \mathit{StateExp}_j & 0 & 0 \\ 0 & 0 & 1 & \mathit{StateExp}_j \end{pmatrix}
\begin{pmatrix} \gamma_{00} \\ \gamma_{01} \\ \gamma_{10} \\ \gamma_{11} \end{pmatrix}
+
\begin{pmatrix} u_{0j} \\ 0 \end{pmatrix}, \tag{2.5}
\]

and the combined school- and state-level model can be stated as

\[
\eta_{ij} =
\begin{pmatrix} 1 & \mathit{StateExp}_j & \mathit{HiMin}_{ij} & \mathit{HiMin}_{ij} \cdot \mathit{StateExp}_j \end{pmatrix}
\begin{pmatrix} \gamma_{00} \\ \gamma_{01} \\ \gamma_{10} \\ \gamma_{11} \end{pmatrix}
+ (1)\,(u_{0j}), \tag{2.6}
\]

which can be formulated as a more general two-level mixed model with a logit-normal hierarchy (Searle, 1991),

\[
\eta_{ij} = x_{ij}^{T}\gamma + z_{ij}^{T}u_j, \tag{2.7}
\]

where

$x_{ij}$ is a $p \times 1$ vector of predictors associated with the log-odds of the algebra course being offered; for model (2.6), $p = 4$;

$\gamma$ is a $p \times 1$ vector of regression coefficients for fixed effects;

$z_{ij}$, a subset of $x_{ij}$, is an $r \times 1$ vector of predictors associated with the $r$ random effect(s); for model (2.6), $r = 1$; and

$u_j$ is an $r \times 1$ vector of random effects, normally distributed with mean zero and variance $T$. For (2.6), the variance is a scalar $\tau$.

The parameters of interest in model (2.7) are $\gamma$, $u_j$, and $T$. The marginal likelihood for the parameters $\gamma$ and $T$ is proportional to

\[
L(\gamma, T) \propto \prod_{j=1}^{J} \int \prod_{i=1}^{n_j} f(y_{ij} \mid u_j)\, g(u_j)\, du_j, \tag{2.8}
\]

where

\[
f(y_{ij} \mid u_j) = \left[\frac{1}{1 + \exp \eta_{ij}}\right]^{1 - y_{ij}} \left[\frac{\exp \eta_{ij}}{1 + \exp \eta_{ij}}\right]^{y_{ij}}, \tag{2.9}
\]

and

\[
g(u_j) \propto |T|^{-1/2} \exp\left[-\tfrac{1}{2}\, u_j^{T} T^{-1} u_j\right]. \tag{2.10}
\]

Model (2.7) is a special case of the generalized linear model (GLM) (McCullagh & Nelder, 1989; Nelder & Wedderburn, 1972) with random effects. GLM has unified models with outcomes having a variety of different scales.
Some examples are the linear regression models for approximately Gaussian data (e.g., SAT scores), logit models for dichotomous data (e.g., offering of algebra), log-linear models for count data (e.g., days of absence from school), and proportional hazards models for survival time data (e.g., years taken to attain a doctoral degree). The unification has brought with it a common theoretical framework and estimation procedure (see McCullagh & Nelder, 1989). Model (2.7) is an extension of GLM with an incorporation of random terms $u_j$ to handle clustered data. The conditional distribution of $y_{ij}$ given $u_j$ assumes an exponential family distribution of the form $f(y_{ij} \mid u_j) = \exp$

da . (2.18) j=l i=1

Specifying $g$ and maximizing $l(\gamma, \Omega)$ with respect to the various parameters yield maximum likelihood estimates for $\gamma$ and $\Omega$, and subsequently $T$. The maximization can be accomplished by an iterative procedure such as the Newton-Raphson or Fisher scoring method, and the evaluation of the integral in (2.18) can be carried out by the Gauss-Hermite quadrature technique when $g$ is normal. The quadrature technique approximates an integral by a weighted sum over a certain number of quadrature points, $d$, for each dimension of the integration. As the number of quadrature points $d$ grows, the approximation can be made increasingly precise, and the approach could yield maximum likelihood estimates that have the properties of consistency, asymptotic normality, and efficiency. Using the consistent estimators $\hat{\gamma}$ and $\hat{\Omega}$, an empirical Bayesian point estimator of $\theta_j$ can be computed, which allows the recovery of estimates of $u_j$. The algorithm allows likelihood-ratio tests for comparisons of nested models differing in variance and covariance components in addition to fixed effects.

A program called MIXOR (Hedeker and Gibbons, in press), which employs a direct maximization strategy called marginal maximum likelihood (ML) for estimating random-effects logit regression models, is available. The algorithm is very similar to the one of Fahrmeir and Tutz (1994). The models are basically the same as model (2.7). In their derivation, the models differ from (2.7) in their introduction of an unobservable or latent continuous variable which "trigger[s] the discrete response" (Cramer, 1991, p. 11). A threshold value is assumed and a discrete response occurs if the value of the latent continuous variable exceeds the threshold value. An example of a latent continuous variable is a household's desire to own a car; there is a threshold value or certain critical level of desire beyond which ownership may result (Cramer, 1991).
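To make the quadrature idea concrete, the following Python sketch approximates one state's contribution to the marginal likelihood (2.8) for the random-intercept case, where $u_j \sim N(0, \tau)$. It is an illustration under stated assumptions rather than the algorithm implemented in MIXOR: the function name, the argument `eta_fixed` (standing for the fixed part $x_{ij}^{T}\gamma$), and the data values are all hypothetical.

```python
import numpy as np

def cluster_marginal_likelihood(y, eta_fixed, tau, n_points=20):
    """Gauss-Hermite approximation of
        integral prod_i f(y_ij | u) N(u; 0, tau) du
    for one cluster with a scalar random intercept u.

    y         : 0/1 outcomes for the schools in the cluster
    eta_fixed : fixed part of the linear predictor for each school
    tau       : variance of the random intercept
    n_points  : number of quadrature points d
    """
    y = np.asarray(y)
    eta_fixed = np.asarray(eta_fixed, dtype=float)
    # Nodes and weights for integrals of the form int exp(-x^2) h(x) dx.
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    # Change of variable u = sqrt(2 * tau) * x turns the normal density
    # into the Gauss-Hermite weight function, up to a factor 1 / sqrt(pi).
    u = np.sqrt(2.0 * tau) * nodes
    eta = eta_fixed[:, None] + u[None, :]          # shape (n_j, d)
    p = 1.0 / (1.0 + np.exp(-eta))
    lik = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
    return float(np.sum(weights * lik) / np.sqrt(np.pi))

# Hypothetical cluster of five schools, three of which offer algebra.
y_j = np.array([1, 0, 1, 1, 0])
eta_j = np.array([0.2, -0.1, 0.4, 0.0, -0.3])
print(cluster_marginal_likelihood(y_j, eta_j, tau=0.25))
```

The full marginal likelihood would be the product of such terms over the $J$ clusters; with vector-valued random effects the quadrature grid grows as $d^r$, which is one reason the approach becomes expensive in higher dimensions.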
Karim (1991) developed an indirect maximization strategy to estimate the various parameters using a Monte Carlo implementation of the EM algorithm (MCEM) (Wei & Tanner, 1990; Dempster et al., 1977). In general, the EM algorithm is an iterative procedure for finding maximum likelihood estimates in the presence of unobserved or missing data in probability models. The idea is to augment the observed data with latent data in order to replace one complicated maximization by an iterative series of simple maximizations (Tanner, 1990). Let $y = (y_1, \ldots, y_J)^T$ and $u = (u_1, \ldots, u_J)^T$. For model (2.7), $y$ is the observed or incomplete data, $u$ is the latent data, and $(u, y)$ is considered to be the complete data. The complete data log-likelihood is given by $\log f(u, y \mid \phi)$ and the conditional predictive distribution of the unobserved or missing data is $f(u \mid y, \phi^{(i)})$, where $\phi^{(i)}$ is a current guess of the parameters. The iterative series of the simplified maximizations consists of two steps: the E-step (for expectation) and the M-step (for maximization). In the E-step the expected value of the complete data log-likelihood with respect to the conditional predictive distribution is determined, i.e.,

\[
Q(\phi \mid \phi^{(i)}) = E\left[\log f(u, y \mid \phi) \mid y, \phi^{(i)}\right], \tag{2.19}
\]

which can be obtained by evaluating

\[
Q(\phi \mid \phi^{(i)}) = \int \log f(u, y \mid \phi)\, f(u \mid y, \phi^{(i)})\, du. \tag{2.20}
\]

In the M-step, $Q(\phi \mid \phi^{(i)})$ is maximized by finding the solutions to

\[
\frac{\partial Q(\phi \mid \phi^{(i)})}{\partial \phi} = 0, \tag{2.21}
\]

and $\phi^{(i+1)}$ is thus determined. This completes one cycle and the iterative series continues until convergence. Iterations between E- and M-steps lead ultimately to the maximum likelihood values of the parameters under mild regularity conditions (Dempster et al., 1977; Wu, 1983). The algorithm does not yield any variances and covariances of the estimators, but they can be obtained from the Hessian of the score functions of the various parameters. For model (2.7), however, no closed form exists for the E-step in (2.20).
Karim (1991) adopted a Monte Carlo implementation of the E-step (Wei & Tanner, 1990) as a result. The general scheme of Monte Carlo integration involves generation of random variates, which adds another step to the EM iterative series. The modified EM series begins with the generation of a sample of latent data, $u^{(1)}, \ldots, u^{(M)}$, indexed by $m = 1$ to $M$, from the current approximation to the conditional predictive distribution $f(u \mid y, \phi^{(i)})$. The expected value of the complete data log-likelihood is then obtained as a mixture of the complete data log-likelihood over the generated latent data, i.e.,

\[
Q_{i+1}(\phi \mid \phi^{(i)}) = \frac{1}{M} \sum_{m=1}^{M} \log f(u^{(m)}, y \mid \phi). \tag{2.22}
\]

Then the M-step maximizes $Q_{i+1}(\phi \mid \phi^{(i)})$ and the new maximizer is used to update the conditional predictive distribution.

Karim (1991) used an indirect sampling method called importance sampling (Ripley, 1981) to generate latent data from $f(u \mid y, \phi^{(i)})$. He approximated the posterior with a Gaussian distribution with mean and variance equal to the posterior mode and the posterior curvature of the conditional predictive distribution. Then random variates were sampled from the approximating Gaussian distribution. To adjust for the discrepancy between the true posterior and the approximating distribution, each simulated draw was assigned a weight. Let $g(\cdot)$ be the Gaussian function, $f(\cdot)$ be the conditional predictive function, and $u^*$ be a draw from $g(u)$; the adjustment weight was computed as $f(u^*)/g(u^*)$.

Karim (1991) noted that with MCEM, the estimates fluctuate around the true maximum even after convergence. He recommended various tests, including visual inspections of the traces of the estimates and assessment of variability in consecutive iterations. He implemented a simulation study and showed that the algorithm yielded reasonable inferences. No software is publicly available for this algorithm.
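The importance sampling step described above can be sketched in Python for a single cluster with a scalar random intercept. This is a hedged illustration and not Karim's (1991) code: the function names, the Newton iteration used to locate the mode, and the data values are assumptions made for the example. The Gaussian approximation is centered at the posterior mode with variance given by the inverse curvature, and each draw receives a weight proportional to $f(u^*)/g(u^*)$.

```python
import numpy as np

def log_target(u, y, eta_fixed, tau):
    """Unnormalised log f(u | y, phi) for one cluster: Bernoulli-logit
    likelihood times a N(0, tau) density for the random intercept u."""
    u = np.atleast_1d(u)
    eta = eta_fixed[:, None] + u[None, :]
    loglik = np.sum(y[:, None] * eta - np.logaddexp(0.0, eta), axis=0)
    return loglik - u ** 2 / (2.0 * tau)

def gaussian_approximation(y, eta_fixed, tau, n_iter=50):
    """Posterior mode and inverse curvature, found by Newton steps."""
    u = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(eta_fixed + u)))
        grad = np.sum(y - p) - u / tau
        hess = -np.sum(p * (1.0 - p)) - 1.0 / tau
        u -= grad / hess
    p = 1.0 / (1.0 + np.exp(-(eta_fixed + u)))
    var = 1.0 / (np.sum(p * (1.0 - p)) + 1.0 / tau)
    return u, var

def importance_sample(y, eta_fixed, tau, n_draws, rng):
    """Draw from the Gaussian approximation and attach normalised
    importance weights proportional to f(u*) / g(u*)."""
    mode, var = gaussian_approximation(y, eta_fixed, tau)
    draws = rng.normal(mode, np.sqrt(var), size=n_draws)
    log_g = -0.5 * (draws - mode) ** 2 / var
    log_w = log_target(draws, y, eta_fixed, tau) - log_g
    w = np.exp(log_w - log_w.max())      # stabilise before normalising
    return draws, w / w.sum()

# Hypothetical cluster: eight schools, six of which offer algebra.
y = np.array([1, 1, 0, 1, 1, 0, 1, 1])
eta_fixed = np.zeros(8)
tau = 0.25
rng = np.random.default_rng(0)
draws, weights = importance_sample(y, eta_fixed, tau, 20000, rng)
post_mean = float(np.sum(weights * draws))   # weighted estimate of E[u | y]
print(post_mean)
```

In a Monte Carlo E-step, the same weights would be applied to the terms of (2.22) instead of to $u$ itself. Note that an approximating Gaussian with tails lighter than the target can produce unstable weights; the example values were chosen so that this is not an issue here.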
2.3.4 The Approximate Maximum Likelihood Approach/Penalized Quasi-likelihood Approach

Another main approach to parameter estimation for model (2.7) is the AML (e.g., Goldstein, 1991; McGilchrist, 1994; Schall, 1991; Stiratelli et al., 1984; Wolfinger & O'Connell, 1993), or the closely related PQL approach (e.g., Breslow & Clayton, 1993; Raudenbush, 1993). The essence of the approach is to derive an approximation to the marginal likelihood (2.8) and its partial derivatives, which allows repeated application of normal theory (Kuk, 1995). The approximation can be achieved by the linearization of the model (e.g., Goldstein, 1991), or by the use of penalized quasi-likelihood (e.g., Breslow & Clayton, 1993).

The linearization can be implemented by a Taylor series expansion of $p_{ij}$ in (2.14) around the $i$th current guess of $\gamma = \gamma^{(i)}$ and $u_j = u_j^{(i)}$. $p_{ij}$ is approximated in the region of a point in $\eta_{ij}^{(i)}$. The procedure, which can be implemented by the Newton-Raphson algorithm, yields a set of linearized dependent variables $y_{ij}^{*}$ and weights $w_{ij}$ for each data point (McCullagh & Nelder, 1989; Raudenbush, 1993; Schall, 1991), where

\[
y_{ij}^{*} = \eta_{ij}^{(i)} + \left(\frac{\partial p_{ij}^{(i)}}{\partial \eta_{ij}^{(i)}}\right)^{-1} \left(y_{ij} - p_{ij}^{(i)}\right), \tag{2.23}
\]

with

\[
\eta_{ij}^{(i)} = x_{ij}^{T}\gamma^{(i)} + z_{ij}^{T}u_j^{(i)}, \qquad
p_{ij}^{(i)} = \left(1 + \exp\left(-\eta_{ij}^{(i)}\right)\right)^{-1}, \qquad
\frac{\partial p_{ij}^{(i)}}{\partial \eta_{ij}^{(i)}} = w_{ij}^{(i)} = p_{ij}^{(i)}\left(1 - p_{ij}^{(i)}\right). \tag{2.24}
\]

Lindstrom and Bates (1990) pointed out that this Taylor series expansion can be conceptualized as a step for creating pseudo-data.

Alternatively, the approximation can be achieved by replacing $f(y_{ij} \mid u_j)$ in (2.9) by the exponential of the quasi-likelihood that denotes the deviance measure of fit (Breslow & Clayton, 1993; McCullagh & Nelder, 1989), which depends only on the first two moments of the model. Laplace's method for integral approximation is then applied to the marginal likelihood with the substituted quasi-likelihood. The approximation then allows the use of normal theory to estimate the parameters.
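The linearization in (2.23) and (2.24) amounts to a few lines of arithmetic. The Python sketch below computes the linearized dependent variable and the weights at a current guess of $\gamma$ and $u_j$; the function name and the small example arrays are hypothetical and serve only to show the computation.

```python
import numpy as np

def linearize(y, x, z, gamma, u):
    """Linearized dependent variable y* and weights w, following
    equations (2.23) and (2.24), for the schools of one cluster.

    y     : (n,) array of 0/1 outcomes
    x     : (n, p) matrix of fixed-effect predictors
    z     : (n, r) matrix of random-effect predictors
    gamma : (p,) current guess of the fixed effects
    u     : (r,) current guess of the cluster's random effects
    """
    eta = x @ gamma + z @ u                 # linear predictor
    p = 1.0 / (1.0 + np.exp(-eta))          # inverse logit
    w = p * (1.0 - p)                       # dp / d(eta)
    y_star = eta + (y - p) / w              # working (pseudo) response
    return y_star, w

# Hypothetical two-school cluster evaluated at gamma = 0 and u = 0.
y = np.array([1.0, 0.0])
x = np.array([[1.0, 0.0], [1.0, 1.0]])
z = np.array([[1.0], [1.0]])
y_star, w = linearize(y, x, z, gamma=np.zeros(2), u=np.zeros(1))
print(y_star, w)
```

Once $y^{*}$ and $w$ are available, the problem has the form of a weighted linear mixed model, which is what allows normal-theory estimation to be reapplied at each iteration.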
Iterative algorithms such as generalized least squares (e.g., Goldstein, 1991), Fisher scoring (e.g., Breslow & Clayton, 1993), and an EM-type algorithm (e.g., Raudenbush, 1993) can be applied to the approximated likelihood to obtain approximate maximum likelihood estimates. The EM-type algorithm is related to MCEM in that it is obtained when $M$ in (2.22) is equal to one (Wei & Tanner, 1990). The iterative algorithm consists of forming and maximizing the function $\log f(u_j = \hat{u}_j, y^{*} \mid \phi)$ over $\phi$, where $\hat{u}_j$ is the posterior mode of $f(u_j \mid y, \phi^{(i)})$, to obtain $\phi^{(i+1)}$ and update $\hat{u}_j$. This EM-type algorithm, which relies on posterior modal analysis, coupled with the Newton-Raphson algorithm used for creating pseudo-data, is employed by Stiratelli et al. (1984) and Raudenbush (1993) to estimate model (2.7).

The Taylor expansion for the approximation can be either about $u_j = u_j^{(i)}$ or $u_j = 0$. The former type of model corresponds to the PQL model and the latter to the marginal quasi-likelihood (MQL) model in Breslow and Clayton's (1993) classification of models with and without the incorporation of the random effects terms $z_{ij}^{T}u_j$ in the linear predictor. Goldstein and Rasbash (1996) showed that the PQL estimates with a second order Taylor expansion are less biased than those of the first order MQL.

An option to estimate restricted approximate maximum likelihood estimates for degree-of-freedom adjustment (Breslow & Clayton, 1993) is also available in this approach. In cases where the outcomes are normal, such adjustment can alleviate the downward bias of the maximum likelihood estimates of $T$. To implement the estimation, Stiratelli et al. (1984) and Raudenbush (1993) handled $\gamma$ as random effects, whose variances tend to infinity, in the EM-type routine in their algorithms.

Several software packages for the approximate maximum and penalized quasi-likelihood estimation are available.
HLM2/3 (Bryk, Raudenbush, & Congdon, 1996), ML3 (Prosser, Rasbash, & Goldstein, 1991), and VARCL (Longford, 1988) are some examples.

Several simulation studies were done to assess the estimators of this approach. Rodriguez and Goldman (1995) and Breslow and Clayton (1993) found that the MQL estimators display considerable bias, especially with reference to $T$ and when the binomial denominators are small. Kuk (1995) and Goldstein and Rasbash (1996) applied iterative bootstrap bias correction, and Kuk showed that the procedure yields asymptotically consistent and unbiased parameter estimates.

2.3.5 The Fully Bayesian Approach

The third major approach to parameter estimation is the fully Bayesian approach. Unlike the previous two approaches, which estimate $T$ and/or $\gamma$ as fixed parameters, Bayesian inference treats all parameters as random quantities with prior distributions. Prior distributions have two basic interpretations, the population and the state of knowledge (Gelman, Carlin, Stern & Rubin, 1995). In the first interpretation, "the prior distribution represents a population of possible parameter values, from which the [parameter] of current interest has been drawn" (Gelman et al., 1995). In the second interpretation, which is more subjective, "the guiding principle is that we must express our knowledge (and uncertainty) about the [parameter] as if its value could be thought of as a random realization from the prior distribution" (Gelman et al., 1995).

The main inferential goal of Bayesian analysis can be considered to be updating the prior knowledge of the random quantities in light of the observations. The analysis requires computation of the joint posterior distribution, $p(\gamma, T, u \mid y)$, which is proportional to the product of the likelihood function for the parameters and their prior distributions:

\[
p(\gamma, T, u \mid y) = k^{-1} \prod_{j=1}^{J} \left[\prod_{i=1}^{n_j} p(y_{ij} \mid \gamma, u_j)\right] p(u_j \mid T)\; p(\gamma, T), \tag{2.25}
\]

where

\[
k = \int\!\!\int\!\!\int \prod_{j=1}^{J} \left[\prod_{i=1}^{n_j} p(y_{ij} \mid \gamma, u_j)\right] p(u_j \mid T)\; p(\gamma, T)\; \partial u\, \partial \gamma\, \partial T. \tag{2.26}
\]

In the joint distribution (2.25), the joint prior for $T$ and $\gamma$ is $p(T, \gamma)$, and the prior for $u$ is the product of the $J$ distributions $p(u_j \mid T)$. Through the likelihood function for the parameters $T$, $\gamma$, and $u$, which is

$r(\theta)$, $\theta_0$ is rejected. If the random variate is rejected, another draw from $g(\theta)$ and the uniform distribution will be made, followed by another acceptance/rejection test. The validity of the algorithm lies in the equivalence of $G(\theta_0)$ conditional on $U \le r(\theta)$ to $F(\theta_0)$, as shown by Dagpunar (1988).

The envelope function chosen by Zeger and Karim to cover the full conditional distribution (3.9) is the Gaussian density $N(\hat{\gamma}^{(k)}, 2 \cdot V_{\gamma}^{(k)})$, where $\hat{\gamma}^{(k)}$ maximizes $p(\gamma \mid u^{(k)}, y)$ and $V_{\gamma}^{(k)}$ is the inverse Fisher information at the $k$th iteration of the Gibbs sampler. Given the values of $u^{(k)}$, the task of estimating $\hat{\gamma}^{(k)}$ and $V_{\gamma}^{(k)}$ is simplified because $z_{ij}^{T}u_j^{(k)}$ has become a vector of offsets, or explanatory variables with known coefficients (Gelman et al., 1995). Model (2.7) therefore reduces to a fixed effects generalized linear model, which can be estimated via iteratively weighted least squares with $y_{ij}^{*(k)}$ being the linearized dependent variable and $w_{ij}^{(k)}$ being the weight. How they can be computed is given in equations (2.23) and (2.24). Let $y_j^{*(k)} = (y_{1j}^{*(k)}, \ldots, y_{n_j j}^{*(k)})^{T}$, $X_j = (x_{1j}, \ldots, x_{n_j j})^{T}$, $Z_j = (z_{1j}, \ldots, z_{n_j j})^{T}$, and $W_j^{(k)} = \mathrm{diag}\{p_{ij}^{(k)}(1 - p_{ij}^{(k)})\}$; then at the $k$th iteration of the Gibbs chain,

\[
\hat{\gamma}^{(k)} = \left(\sum_{j=1}^{J} X_j^{T} W_j^{(k)} X_j\right)^{-1} \left(\sum_{j=1}^{J} X_j^{T} W_j^{(k)} y_j^{*(k)}\right), \qquad
V_{\gamma}^{(k)} = \left(\sum_{j=1}^{J} X_j^{T} W_j^{(k)} X_j\right)^{-1}. \tag{3.10}
\]

To ensure that the envelope function covers the full conditional (3.9), Zeger and Karim (1991) inflate the variance $V_{\gamma}^{(k)}$ by a factor of 2 and multiply the envelope function by a constant $c$, which matches the modes of the envelope function and the full conditional (3.9). Let the full conditional (3.9) be denoted by $f(\gamma)$ and the envelope function $N(\hat{\gamma}^{(k)}, 2 \cdot V_{\gamma}^{(k)})$ by $g(\gamma)$; $c$ is then computed as the ratio $f(\hat{\gamma}^{(k)})/g(\hat{\gamma}^{(k)})$.
Using a Cholesky factorization of $V_{\gamma}^{(k)}$, a random variate $\gamma^{*}$ can be generated. A rejection/acceptance test is then implemented by evaluating the full conditional (3.9) and the envelope function at $\gamma^{*}$ and computing the ratio $f(\gamma^{*})/(c \cdot g(\gamma^{*}))$. The decision rule is that if the ratio is greater than or equal to a random variate $U$ generated from the uniform distribution on [0,1], $\gamma^{*}$ is accepted as a random variate belonging to the full conditional (3.9), which happens with a probability equal to the ratio, and $\gamma^{(k+1)} = \gamma^{*}$. Otherwise, another variate is drawn from the envelope function and subjected to the rejection/acceptance test again.

3.3.2 Full Conditional Distribution of T: $p(T \mid \gamma, u, y)$

Assuming the $u_j$'s are independent normal with mean 0 and variance $T$, and a uniform prior for $T$, i.e., $p(T) \propto$ constant, the full conditional distribution is independent of $\gamma$ and $y$. It thus can be expressed as $p(T \mid u^{(k)})$, which is proportional to

\[
p(T \mid u^{(k)}) \propto \prod_{j=1}^{J} |T|^{-1/2} \exp\left[-\tfrac{1}{2}\, u_j^{(k)T}\, T^{-1} u_j^{(k)}\right]. \tag{3.11}
\]

Another choice of prior for $T$, which Zeger and Karim use, is the Jeffreys prior $|T|^{-(q+1)/2}$. More recent work on Bayesian analysis for hierarchical models (e.g., Seltzer, Wong, & Bryk, 1996), however, has found that the choice may cause a problem. The problem can be illustrated with the case when the variance term, $\tau$, is a scalar, that is, $q = 1$. The corresponding Jeffreys prior is $p(\tau) \propto 1/\tau$, $\tau > 0$. When $\tau$ is close to zero, $1/\tau$ becomes infinitely large and "a spike at $\tau = 0$" (Seltzer et al., 1996, p. 139) in the posterior distribution for the parameter may occur. It will result in a standstill in the updating scheme of the Gibbs sampler. I encountered this problem in the simulation runs and the problem went away when I adopted the uniform prior, i.e., $p(T) \propto$ constant.
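The rejection/acceptance test with a variance-inflated Gaussian envelope, as used for the draws of $\gamma$ described above, can be sketched in Python for a scalar parameter. This is an illustrative sketch and not Zeger and Karim's (1991) implementation. The target `log_f` is a hypothetical log-concave density with lighter-than-Gaussian tails, chosen so that the scaled envelope does cover it everywhere; for the full conditional (3.9), coverage would have to be checked and is only assumed in this description.

```python
import numpy as np

def rejection_sample(log_f, mode, var, rng, max_tries=10_000):
    """One draw from the density proportional to exp(log_f), using the
    envelope c * N(mode, 2 * var), where c matches the envelope to the
    target at the mode."""
    env_var = 2.0 * var                       # variance inflated by 2

    def log_g(x):
        # Unnormalised log density of the Gaussian envelope.
        return -0.5 * (x - mode) ** 2 / env_var

    log_c = log_f(mode) - log_g(mode)         # c = f(mode) / g(mode)
    for _ in range(max_tries):
        cand = rng.normal(mode, np.sqrt(env_var))
        log_ratio = log_f(cand) - (log_c + log_g(cand))
        # Accept when U <= f(cand) / (c * g(cand)); the ratio is capped
        # at 1 as a guard in case the envelope fails to cover the target.
        if rng.uniform() <= np.exp(min(log_ratio, 0.0)):
            return cand
    raise RuntimeError("no candidate accepted")

def log_f(x):
    """Hypothetical unnormalised log target with mode 0 and curvature 1."""
    return -0.5 * x ** 2 - 0.25 * x ** 4

rng = np.random.default_rng(1)
sample = np.array([rejection_sample(log_f, 0.0, 1.0, rng) for _ in range(4000)])
print(sample.mean(), sample.var())
```

Inflating the envelope variance lowers the acceptance rate but makes it less likely that the envelope dips below the target away from the mode, which is the trade-off behind the factor of 2.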
To obtain random draws from (3.11), it is convenient to work with the conditional posterior distribution of $T^{-1}$, given $u$, which is a Wishart distribution with parameters

\[
S^{(k)} = \sum_{j=1}^{J} u_j^{(k)} u_j^{(k)T}, \tag{3.12}
\]

and $J - q - 1$ degrees of freedom, where $q$ is equal to the dimension of $T$. As one can use a standard algorithm for sampling from the full conditional of $T^{-1}$ here, rejection sampling is not needed. $T^{(k+1)}$ can then be obtained by inverting the values of $T^{-1}$ sampled. Specifically, $T^{-1}$ is equal to $H^{(k)T} W H^{(k)}$, where $S^{(k)-1} = H^{(k)T} H^{(k)}$ and $W$ is a standardized Wishart variate with $J - q - 1$ degrees of freedom (Odell and Feiveson, 1966).

3.3.3 Full Conditional Distribution of u: $p(u \mid \gamma, T, y)$

Given the values $\gamma^{(k)}$ and $T^{(k)}$, the full conditional distribution $p(u \mid \gamma^{(k)}, T^{(k)}, y)$ is proportional to

\[
p(u \mid \gamma^{(k)}, T^{(k)}, y) \propto \prod_{j=1}^{J} \left\{ \prod_{i=1}^{n_j}
\left[\frac{1}{1 + \exp\left(-(x_{ij}^{T}\gamma^{(k)} + z_{ij}^{T}u_j)\right)}\right]^{y_{ij}}
\left[\frac{\exp\left(-(x_{ij}^{T}\gamma^{(k)} + z_{ij}^{T}u_j)\right)}{1 + \exp\left(-(x_{ij}^{T}\gamma^{(k)} + z_{ij}^{T}u_j)\right)}\right]^{1 - y_{ij}} \right\}
|T^{(k)}|^{-1/2} \exp\left[-\tfrac{1}{2}\, u_j^{T}\, T^{(k)-1} u_j\right]. \tag{3.13}
\]

(3.13) is a non-standard distribution, so Zeger and Karim (1991) again use the rejection sampling algorithm. For each cluster $j$, the envelope function employed is $N(\hat{u}_j^{(k)}, 2 \cdot V_{u_j}^{(k)})$. Using iterative weighted least squares, at the $k$th iteration of the Gibbs sampler, the mode $\hat{u}_j^{(k)}$ for cluster $j$,

\[
\hat{u}_j^{(k)} = \left(Z_j^{T} W_j^{(k)} Z_j + T^{(k)-1}\right)^{-1} Z_j^{T} W_j^{(k)} \left(y_j^{*(k)} - X_j\gamma^{(k)}\right), \tag{3.14}
\]

and its curvature,

\[
V_{u_j}^{(k)} = \left(Z_j^{T} W_j^{(k)} Z_j + T^{(k)-1}\right)^{-1}, \tag{3.15}
\]

can be estimated. To ensure coverage, the variance of the envelope function is inflated by a factor of 2, and the function itself is multiplied by a constant $c_j$, which matches the modes of the full conditional and the envelope function for cluster $j$. For each cluster, a random variate $u_j^{*}$ is drawn from the envelope function. A rejection/acceptance test then computes the ratio $f(u_j^{*})/(c_j \cdot g(u_j^{*}))$, where $f(u_j)$ is the full conditional for cluster $j$ and $g(u_j)$ is the envelope function, and compares it to a random variate $U$ generated from the uniform distribution on [0,1].
If $U \le f(u_j^*)/(c_j \cdot g(u_j^*))$, then $u_j^{(k+1)} = u_j^*$; otherwise the sampling procedure is repeated.

3.3.4 Starting Values and Convergence Diagnostics

The starting values of the Gibbs sampler for the $u_j$'s and $T$ are set to zeros and the identity matrix, respectively, by Karim (1991). No starting values for $\gamma$ are needed, as the Gibbs sampler starts by sampling from the full conditional distribution of $\gamma$. To assess convergence at iteration $k$, Zeger and Karim (1991) employ Q-Q plots in which the last specified number of draws, say $m$, are plotted against the preceding $m$ draws. In the present investigation, the convergence diagnostics developed by Geweke (1992) and Raftery and Lewis (1992) and automated in the software Convergence Diagnosis and Output Analysis Software for Gibbs Sampling Output (CODA) (Best et al., 1995; Cowles, 1994) are used to monitor the convergence of the Gibbs chains. Besides the availability of the computer code, the two diagnostics are chosen because 1) the methods are theoretically motivated, 2) each of them requires only one sampler run, and 3) the conclusions drawn from the two diagnostics can be compared with one another to check the reliability of the diagnostic results.

To briefly explain and illustrate the two approaches, I apply the diagnostics to a chain of 800 simulated values of $\tau$. The parameter value for $\tau$ is 0.250.

The first diagnostic used is Geweke's (1992) convergence diagnostic, which is based on a standard time-series method. For each variable, the chain is divided into two "windows" containing different fractions of the chain, for example, the first 10% and the last 50%. If the whole chain is stationary, the means of the values early and late in the sequence should be similar. Geweke's approach involves computing a convergence diagnostic Z, which is the difference between the two means in the two different windows divided by the asymptotic standard error of their difference.
As the length of the chain $K \to \infty$, the sampling distribution of $Z \to N(0,1)$ if the chain has converged. Thus values of Z that fall in the extreme tails of a standard normal distribution may imply that the chain has not fully converged. Table 3.1 gives the CODA output for Geweke's convergence diagnostic.

Table 3.1: CODA Output for Geweke Convergence Diagnostic

GEWEKE CONVERGENCE DIAGNOSTIC (Z-score):
Iterations used = 1:800
Thinning interval = 1
Sample size per chain = 800
Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5

Variable    Convergence Diagnostic Z
τ           -0.594

As Table 3.1 shows, the number of iterates used is 800. The diagnostic uses every one of the 800 iterates; thus the thinning interval is equal to 1. Sometimes, instead of using every iteration, one may "thin" the chain by choosing to save and use every $t$th iteration, where $t > 1$. The option is useful in runs in which there is high correlation between consecutive iterations (Raftery & Lewis, 1996). Here I follow Zeger and Karim (1991) in using every iteration and a single chain. The two "windows" contain the first 10% and the last 50% of the iterates, respectively. Given that the Z-score is only -0.594, no evidence against convergence is established. According to Geweke (1992), a score in excess of 4 indicates problems.

One can also use plots to illustrate the diagnostic. In essence, the approach computes and plots different Z-scores for different segments of the chain. A large portion of Z-scores falling outside the 95% confidence interval for a N(0,1) distribution suggests possible convergence failure. Figure 3.2 gives the plot of the analysis and the 95% confidence interval. As shown in the figure, most of the scores are within or cluster near the confidence interval (the two broken lines); therefore, no convergence failure is suggested.

Figure 3.2: Geweke's Convergence Diagnostic [Z-score plotted against the first iteration in each segment; broken lines mark the 95% confidence interval]
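To make the idea behind the Z-score concrete, the following Python sketch computes a simplified version of the diagnostic for a single chain. It is only an illustration under a strong assumption: the draws within each window are treated as independent, so the standard error uses the plain sample variance, whereas Geweke (1992) and CODA estimate it from the spectral density at frequency zero. The function name `geweke_z` and both example chains are hypothetical and are not taken from the thesis or from CODA.

```python
import math
import random
import statistics


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-style Z-score for one chain.

    Assumption: draws inside each window are treated as independent, so the
    standard error is based on the ordinary sample variance. The original
    diagnostic uses a spectral density estimate instead.
    """
    n = len(chain)
    early = chain[: int(first * n)]        # e.g., first 10% of the chain
    late = chain[n - int(last * n):]       # e.g., last 50% of the chain
    se = math.sqrt(
        statistics.variance(early) / len(early)
        + statistics.variance(late) / len(late)
    )
    return (statistics.fmean(early) - statistics.fmean(late)) / se


random.seed(1)
# Hypothetical stationary chain of 800 draws centered at 0.25.
stationary = [random.gauss(0.25, 0.1) for _ in range(800)]
# Hypothetical chain with a steady upward drift (not stationary).
drifting = [0.25 + 0.001 * k + random.gauss(0.0, 0.1) for k in range(800)]

z_stationary = geweke_z(stationary)
z_drifting = geweke_z(drifting)
print(round(z_stationary, 3), round(z_drifting, 3))
```

For the stationary chain the score should lie well inside the range expected for a standard normal variate, while the drifting chain should produce a large negative score because its late window has a much higher mean than its early window.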
Raftery and Lewis' (1992) approach, based on two-state Markov chain theory as well as standard sample size formulas involving binomial variance, is an alternative convergence diagnostic. The method detects convergence and also provides a way of bounding the variance of the estimates of quantiles of functions of parameters. Table 3.2 gives a slightly edited CODA output of the analysis of the same chain of 800 simulated values used earlier. Several letter symbols of the original output were changed to avoid confusion with the ones already employed in the thesis.

Table 3.2: CODA Output for Raftery and Lewis Convergence Diagnostic

RAFTERY AND LEWIS CONVERGENCE DIAGNOSTIC:
Sample size per chain = 800
Quantile = 0.025
Accuracy = +/- 0.02
Probability = 0.9
Chain: c:/coda/diagnose

VARIABLE   Thin (t)   Burn-in (B)   Total (N)   Lower bound (Nmin)   Dependence factor (D)
τ          1          5             253         165                  1.53

In the output, the sample size per chain indicates the number of iterates used, which is again 800. This diagnostic specifies the convergence criterion as estimating the 0.025 quantile of the posterior distribution of $\tau$ with a precision of ±0.02 with probability 0.9. The burn-in value indicates how many initial iterations can be discarded. The small burn-in value given in Table 3.2, B = 5, suggests that the chain has converged almost immediately. In order to estimate the 2.5th percentile of the posterior distribution to the specified accuracy and probability, i.e., ±0.02 and 0.9 respectively, the result recommends that an estimated 254 iterations (with a minimum of 165) out of the 800 iterations are needed. The last column reports the dependence factor, which is the ratio of N to Nmin.
Values of D much greater than 1.0 indicate high within-chain correlations and probable convergence failure (Raftery and Lewis (1996) suggest that D > 5.0 often indicates problems). The result here (D = 1.53) indicates that convergence is achieved.

3.4 A Simulation Study and Results

A small simulation study was done to assess the accuracy of the SAS/IML code and how well it would suit the present analysis. The structure of the data sets generated basically followed that of the simulations carried out by Rodriguez and Goldman (1993) and Yang (1995), in which $\tau$ was a scalar. The model specifications of the simulated data were similar to the over-simplified model (2.6) discussed earlier. The model had one predictor at each level, denoted by $lev1pred_{ij}$ and $lev2pred_j$ for the level-1 and level-2 predictors. However, here only the level-1 intercept, and not the regression coefficient, was modeled using $lev2pred_j$. The model could be expressed as:

$$\eta_{ij} = \begin{pmatrix} 1 & lev2pred_j & lev1pred_{ij} \end{pmatrix} \begin{pmatrix} \gamma_{00} \\ \gamma_{01} \\ \gamma_{10} \end{pmatrix} + u_{0j}. \tag{3.16}$$

Two kinds of data sets, each with J = 40 and $n_j$ = 100, were generated. The number of clusters and observations per cluster was chosen to be similar to those of the TSAP in mathematics, which had 42 states and 100 schools per state on average. In both data sets, $\gamma_{00}$ = -.796 and $\gamma_{01} = \gamma_{10}$ = 1. They differed in their value of $\tau$: one was equal to 0.250 and the other was equal to 1. The two different values were chosen to examine how well the algorithm works with moderate as well as large between-cluster variance. As the algorithm is very computationally intensive, the number of replications was chosen to be 50, half of what Zeger and Karim (1991) ran. It is important to note that the goals of this simulation, as stated at the beginning of this section, were different from those of Zeger and Karim, which were to establish the statistical properties of the Gibbs sampler algorithm. The Gibbs sampler was run for 2,800 to 5,000 iterations until convergence.
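The data-generating design just described can be sketched in a few lines of Python. This is a hedged illustration rather than the SAS/IML code used in the study: the thesis does not state how the two predictors were generated, so drawing both from a standard normal distribution is an assumption, and the function name `simulate_dataset` and its arguments are hypothetical.

```python
import math
import random


def simulate_dataset(J=40, n_j=100, g00=-0.796, g01=1.0, g10=1.0,
                     tau=0.25, seed=0):
    """Generate one two-level binary data set following model (3.16).

    Assumption: both predictors are drawn from N(0, 1); the source does not
    specify their distributions.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(J):
        lev2pred = rng.gauss(0.0, 1.0)            # level-2 predictor (assumed)
        u0j = rng.gauss(0.0, math.sqrt(tau))      # random intercept, variance tau
        for _ in range(n_j):
            lev1pred = rng.gauss(0.0, 1.0)        # level-1 predictor (assumed)
            eta = g00 + g01 * lev2pred + g10 * lev1pred + u0j
            p = 1.0 / (1.0 + math.exp(-eta))      # inverse logit
            y = 1 if rng.random() < p else 0      # Bernoulli outcome
            rows.append((j, lev2pred, lev1pred, y))
    return rows


# One data set for each of the two values of tau used in the study.
data_moderate = simulate_dataset(tau=0.25, seed=1)
data_large = simulate_dataset(tau=1.0, seed=2)
print(len(data_moderate), len(data_large))
```

Each call returns 4,000 rows (40 clusters of 100 observations), and repeating the call with different seeds would give the replications needed for a study of this kind.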
Table 3.3 lists the values of the true parameter $\theta$ and reports the mean and standard deviation of $\hat{\theta}$, the square root of the average variance of the estimates, and the coverage of the nominal 90% Bayes interval from the 5th to the 95th percentile, based on 200 additional simulated values generated after convergence. Two hundred iterates were used to facilitate the comparison of the results of this simulation with those of Zeger and Karim (1991), who used the same number of iterates to compute the posterior characteristics of the various parameters.

Table 3.3: Results of Simulation Study

θ =                              γ00       γ01      γ10      τ
Case 1
  θ (true value)                 -0.796    1.000    1.000    0.250
  mean(θ|y)                      -0.763    1.042    0.991    0.287 (mode = 0.250)
  std(θ|y)                        0.187    0.233    0.152    0.104
  (mean(var(θ|y)))^(1/2)          0.163    0.201    0.154    0.096
  Coverage of nominal
  90% interval                    84       82       94       88
Case 2
  θ (true value)                 -0.796    1.000    1.000    1.000
  mean(θ|y)                      -0.778    1.042    1.016    1.211 (mode = 1.070)
  std(θ|y)                        0.302    0.372    0.144    0.359
  (mean(var(θ|y)))^(1/2)          0.270    0.334    0.159    0.349
  Coverage of nominal
  90% interval                    86       84       96       82

The results are similar to those of Zeger and Karim (1991). In their simulation, they reported a slight bias in the intercept estimate, while the other fixed effects were approximately unbiased. Here all fixed effects are approximately unbiased. Zeger and Karim reported that the posterior means for the random effects variances were positively biased by 20%-30%. They attributed the bias to the long tail of the marginal distribution, which was alleviated when posterior modes were used instead of means to indicate central tendency. Here the positive bias ranges from 15% to 22%, and the bias is alleviated by using posterior modes. The coverage probabilities of the nominal 90% Bayes posterior intervals ranged from 80% to 100% in Zeger and Karim's simulation study, and they range from 82% to 96% here. Given that the number of trials was 50, the standard deviation of the proportion of coverage is sqrt(.9(1-.9)/50) = .04.
Thus the range from two standard deviations below to two standard deviations above the nominal 90% coverage extends from .82 to .98. The results obtained here, 82% to 96%, gave another indication that the code and the algorithm were performing well. Both simulations show that the algorithm yields reasonable inferences. This suggests that the algorithm also works well in analyses where the number of clusters, J, is less than the number of observations per cluster, $n_j$, and that it can be applied to the TSAP in mathematics data to study the distribution of learning opportunities. The next chapter reports a use of the tested code to perform a Bayesian analysis of models studying access to opportunities to learn.

CHAPTER 4
BAYESIAN HIERARCHICAL ANALYSES OF ACCESS TO EIGHTH-GRADE ALGEBRA

4.1 Introduction

In this chapter, the estimation approach and algorithm from Chapter 3 are implemented to analyze the data collected under the 1992 TSAP in mathematics and the state background characteristics data made available by CCSSO. The purpose of the analysis is to study the distribution of learning opportunities in advanced mathematics in Grade 8 among schools across 42 U.S. states. I first provide a description of the data sets and the procedures used. Then the selection and the construction of relevant variables and measures are presented in conjunction with the various research hypotheses. The final sections describe and discuss the findings and their implications.

4.2 Data, Procedures, and Measures

4.2.1 Data Description

In TSAP, 44 states volunteered to participate. Within each state, on average, 100 public schools were selected at random. Within each school, approximately 30 eighth graders were then sampled. More specifically, eligible schools were first stratified and selected according to grade 8 enrollment, urbanicity, percentage of African and Hispanic American students, and median household income (for a description of the sample design and selection, see Mohadjer, Rust, Smith and Severynse, 1993).
A systematic sample of 62 63 students was then drawn from each selected school. In the survey, the students were administered a test to assess their mathematics proficiency. Personal and school background information was solicited from the students, their teachers, and their school principals or administrators. The present investigation employed primarily the school response data on school characteristics and policies collected from the principals or other administrators. After listwise deletion of cases with missing values, complete data were available for 3,525 schools in 42 states. About 5% of schools in the total sample (3,725) and two territories were removed as a result. The two territories, each with six schools, were excluded as they had no data on the availability of algebra. A comparison of the descriptive statistics of the total sample and the actual sample used in the analyses revealed little difference. Another data set used in conjunction with TSAP was data on state profile statistics compiled by CCSSO, for which there were no missing data (CCSSO, 1993). In the analysis, the two data files were merged. 4.2.2 Procedures Using the TSAP and state background data, I constructed measures of the central concepts of this study for the selected schools and participating states in the sample. Reliability analyses and the less computationally intensive penalized quasi-likelihood estimation approach implemented via the EM algorithm (Raudenbush, 1993) were used in scale construction. With the log odds of the offering of high school algebra as the outcome, 1 first 64 formulated an unconditional model that included 1) a within-states model in which the outcome for a given school in a given state was seen as varying around the mean of the outcome for that state and 2) a between-states model in which the mean for the state on that outcome was seen as varying randomly around the grand mean for the outcome. 
The model estimated the unconditional variance in the outcome that lies between states. To test hypotheses, I then expanded the model to include school- and state-level predictors, which consisted of school composition, setting, eighth-grade enrollment, state poverty level, and educational expenditure. In the analysis, all coefficients that represented the effects of school-level predictor variables were assumed not to vary randomly among the 42 states. Systematic state-to-state variation of the racial and social composition effects on the offering of algebra, associated with different levels of public educational investment, was postulated. The model specification regarding between-state variation of the effects of the various school-level predictors was guided by the principle of parsimony and based on the reasoning that the effects on the outcome due to school size and urbanicity are similar across the 42 states. Moreover, after controlling for state educational investment and poverty level, it is likely that there will be no significant unique ethnicity and social composition effects associated with individual states. An exploratory run using HLM2/3 (Bryk et al., 1996) found nonsignificant random variation in the ethnicity and social composition effects, controlling for the two state-level predictors. In addition, all predictors in the within- and between-state models were centered around their means. As a result, the mean logits of offering algebra for individual states were adjusted for the various predictors.

4.2.3 Measures

A central focus of this research is to describe and study the availability of high school algebra in the 42 states and the extent to which the availability depends on various demographic composition, setting, structural, and state economic and educational financial factors. To measure the availability of the course, an indicator variable was used. Schools received a score of 1 if the course was offered and a 0 otherwise.
The mean for the variable was .76, indicating that 76% of the 3,525 schools offered the course (the standard deviation was .43). A total of 846 schools in the sample did not include algebra in their eighth-grade mathematics curriculums.

Two measures of the demographic composition were the minority enrollment and the SES of a school. Schools in which African and Hispanic American students made up more than 50% of the enrollment were represented by an indicator variable and coded as high minority enrollment schools. The variable was constructed from the continuous variable of percentage of minority students in a school. The decision to adopt the dummy coding scheme was based on an examination of various plots of predicted logits of the probability of offering algebra against the midpoints of grouped percentages of minority enrollment, as suggested by DeMaris (1992). In addition, an exploratory model was fit with an alternative coding scheme for the independent percent minority variable. Eleven groups of percentages (0, 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, and 91-100) were formed by breaking the range of the percentages of minority students in schools. All but the no minority enrollment group, which was chosen to be the reference group, were used to model the log odds of offering algebra using HLM2/3 (Bryk et al., 1996). Thus the model had ten dummy predictor variables. After the run, I plotted the estimated logits for all eleven categories of percentages against the eleven midpoints of the groups. The estimated logits for all but the reference group were computed by adding the regression coefficient of each group to the intercept, with the intercept being the estimated logit for the no minority group. Figure 4.1 gives the plot. The result shows that a higher percentage of minority enrollment has a generally negative relationship to the logits of offering algebra beyond the 41-50% category.
In addition, I plotted the estimated logits against the logarithm and the logits of the midpoints of the various groups. It was found that the two transformations of the independent minority enrollment variable did not result in linearity in the logit of offering algebra. To explore an alternative coding scheme, I ran a model with percentage of minority enrollment and its quadratic term as predictors. The result of the run showed that the quadratic term did not achieve a conventional level of significance, and no quadratic relationship was suggested. Therefore the analysis used an indicator variable identifying schools with high minority enrollment, which allowed a straightforward interpretation as well.

Figure 4.1. Logits Plotted Against the Midpoints of the Categories of Percentages of Minority Enrollment [y-axis: logits; x-axis: percentage of minority enrollment]

The measure of the SES of schools was a 3-item scale. Two items in the scale covered the home background of students of a school, and the third was the percentage of students who were in the free lunch program. The home background items were based on students' reports about the educational level of their parents and their access to reading materials (receiving a newspaper regularly, an encyclopedia in the home, more than 25 books in the home, and receiving magazines regularly). To create the SES scale, I first averaged the responses of the students within each school to obtain the school means for the two measures of home background. Z scores were then created for the two aggregated home background measures and the percentages of students receiving free lunches. Finally, the scale was constructed as the average of the three standardized scores. The scale had a Cronbach's alpha of 0.76.

Grade 8 enrollment was measured by the actual number (in hundreds) of students enrolled.
Past research suggests that the positive relationship between school size and program comprehensiveness is nonlinear (e.g., Monk & Haller, 1993). The relationship weakens as the size of the school increases. The finding warrants a check of the tenability of the assumption of linearity in the logit of the outcome. To proceed, I broke the range of the enrollment variable into groups, following the approach of DeMaris (1992) and Hosmer and Lemeshow (1987). The values of the variable ranged from 0.01 to 16.07. Grouping based on quartiles of the distribution of the variable yielded four categories with midpoints at 0.585, 1.615, 2.525, and 9.25. I then created three indicator variables to represent three of the four categories and entered them as predictors to model the log odds of offering algebra. To compute the estimated logits for each category, I added successively the regression coefficient for each indicator variable to the intercept, which was the estimated logit for the reference group. Finally, I plotted them against the midpoints of the four categories. Figure 4.2 displays the plot of logits against the various midpoints. It shows an obvious departure from a straight line and reveals nonlinearity in the relationship. As a result, a quadratic term was created to model the trend appropriately.

Figure 4.2: Plot of Logits Against Midpoints of Grouped Categories of Enrollment [y-axis: logit; x-axis: Grade 8 enrollment]

The setting of the school was represented by three indicator variables showing whether a school was located in an urban, suburban, or rural setting. In the sample, 23% were urban schools, 53% were suburban schools, and 24% were rural schools.

At the state level, the two measures employed were the percent of children in poverty and the state educational expenditure per pupil.
CCSSO compiled the two measures for each state for 1990, which was relatively close in time to 1992, the year the data on algebra offering were gathered (CCSSO, 1993). Percent of children in poverty is defined as "the percentage of related children under age 18 who live in families with incomes below the U.S. poverty threshold" (The Annie E. Casey Foundation, 1994, p. 159). The state educational expenditure measures how much money (in hundreds) states allocate for each student, and indexes state efforts in funding public education. Percent of children in poverty is negatively correlated with state educational expenditure per pupil (r = -.41).

Table 4.1 shows the means, standard deviations, ranges, and variable names for each of the variables in this analysis.

Table 4.1: Descriptive Statistics

Variables                                    Range              Mean     SD
School-level Data (3,525 schools)
  Offering 8th Grade Algebra for
    High School Credits                      0 = no, 1 = yes    0.76     0.43
  Schools with Minority (Hispanic and
    African Americans) in Excess of 50%      0 = no, 1 = yes    0.14     0.35
  School SES                                 (-3.42, 2.08)      0.00     0.82
  Schools in Urban Setting                   0 = no, 1 = yes    0.23     0.43
  Schools in Suburban Setting                0 = no, 1 = yes    0.53     0.50
  Grade 8 Enrollment (in hundreds)           (0.01, 16.07)      2.17     1.39
State-level Data (42 states)
  Percent of Children in Poverty             (7, 33.50)         17.72    5.92
  Educational Expenditure per Pupil
    (in hundreds)                            (25.47, 78.27)     45.22    12.35

4.3 Results

4.3.1 Unconditional Model

Table 4.2 gives the posterior characteristics of the marginal distributions of the parameters in the unconditional model. Listed are the means and the standard deviations of the intercept (the grand mean log odds of offering algebra) and the between-state variance, the 90% credibility interval (C.I.) for each parameter, and the mode for the variance estimate. These statistics were computed from 1000 simulated values obtained after convergence.
The results show that on average the logit of offering algebra is 1.312 and that there is statistically significant state-to-state variation in the log odds of offering algebra. The predicted odds are exp{1.312} = 3.714, so schools are on average more likely to offer the course than not. The predicted probability is $(1+\exp\{-1.312\})^{-1}$ = 0.789. Note that this differs from the mean outcome across the whole sample, which is equal to 0.76. The difference arises because the grand mean estimated here is based on the conditional distribution of the outcome given that the random effects are null, rather than on the marginal distribution of the outcome (Zeger, Liang & Albert, 1988; Raudenbush, 1993).

Table 4.2: Results of the Unconditional (No Predictors) Model

                                    Posterior Characteristics
                                                          90% C.I.
Explanatory Variables               Mean     Std. Dev.    From     To
Intercept
  Intercept                         1.312    0.145        0.886    1.810
State-to-state Heterogeneity
  τ (mode = 0.730)                  0.806    0.223        0.492    1.222

Figure 4.3, a histogram that approximates the posterior distribution of the between-state variance of the log odds based on 1000 sampled values of $\tau$, portrays the variability. As Figure 4.3 implies, values of $\tau$ as small as 0.30 and as large as 1.70 are plausible, whereas the 90% C.I. covers from 0.49 to 1.22. A sense of the magnitude of state-to-state differences in the likelihood of offering algebra can be conveyed by computing a plausible range of predicted probabilities with the random effects varying from -1.5 to 1.5 standard deviation units, i.e., $p_{ij} = 1/(1+\exp\{-\beta_{0j}\})$ for $\beta_{0j} \in (\gamma_{00} \pm 1.5\sqrt{\tau})$, where $\beta_{0j}$ is the average log odds of offering algebra for schools in state $j$ (Raudenbush, 1993). When $\tau$ is equal to the mode of the posterior, 0.730, the predicted probabilities range from 0.51 to 0.93; when $\tau$ is at the 10th percentile, 0.48, they range from 0.57 to 0.91; and when $\tau$ is at the 90th percentile, the range is from 0.41 to 0.95.
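The plausible ranges quoted here follow directly from the intercept and a chosen value of $\tau$. The Python sketch below reproduces the first two ranges from the posterior summaries in Table 4.2 (intercept 1.312, posterior mode of $\tau$ 0.730) and the value 0.48 given in the text. The function names are illustrative only and are not part of the original SAS/IML code.

```python
import math


def inv_logit(x):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def plausible_range(intercept, tau, width=1.5):
    """Predicted probabilities at intercept -/+ width * sqrt(tau)."""
    half_width = width * math.sqrt(tau)
    return inv_logit(intercept - half_width), inv_logit(intercept + half_width)


# Intercept from Table 4.2; tau at its posterior mode and at the value 0.48.
low_mode, high_mode = plausible_range(1.312, 0.730)
low_small, high_small = plausible_range(1.312, 0.48)

print(round(low_mode, 2), round(high_mode, 2))    # 0.51 0.93
print(round(low_small, 2), round(high_small, 2))  # 0.57 0.91
```

A larger value of $\tau$ widens the interval on the logit scale, which is why the range of probabilities grows as $\tau$ moves toward the upper end of its posterior distribution.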
The ranges indicate substantial state differences in the predicted probabilities of offering algebra.

Figure 4.3: Marginal Posterior Distribution of τ (The Unconditional Model) [histogram; x-axis: state-level variance; y-axis: proportion per bar]

For individual states, Figures 4.4a and 4.4b display box plots describing the distribution of the predicted probabilities computed using the marginal posterior of $u$. Each box plot summarizes 1,000 probabilities computed from 1,000 corresponding simulated values of mean logits. Inside the box are the middle 75 percent of the values of the probabilities, while the ends of the whiskers delimit the top and bottom 1 percent of those values. Thus, the plots give 75 percent and 98 percent intervals for the predicted probabilities of offering algebra for each state. The figures offer another graphical assessment of between-state variability. For instance, there are quite a few non-overlapping distributions. PA, for example, is shown to differ from OK and TN in its predicted probabilities of offering algebra.

[Figures 4.4a and 4.4b: box plots of the predicted probabilities of offering algebra, by state]

4.3.2 Between-school/Within-state and Between-State Model

The second model consisted of between-school/within-state and between-state predictors.
Table 4.3 summarizes the result and displays posterior characteristics of the marginal distributions of the parameters, based on successive sets of 1000 sampled values collected after convergence, for the school- and state-level predictors and the state-level variance.

Table 4.3: Results of the Within- and Between-State Model

                                          Posterior Characteristics
                                                               90% C.I.
Explanatory Variables                     Mean      Std. Dev.  From      To
Intercept
  Intercept                                1.540    0.121       1.349     1.747
  State Expenditure Per Pupil              0.044    0.012       0.023     0.063
  Percent of Students in Poverty           0.016    0.023      -0.018     0.053
Minority Enrollment
  Intercept                               -0.664    0.161      -0.939    -0.401
  State Expenditure Per Pupil             -0.046    0.013      -0.067    -0.024
SES
  Intercept                                0.383    0.068       0.267     0.496
  State Expenditure Per Pupil              0.008    0.006      -0.001     0.017
Urban
  Intercept                                0.398    0.159       0.131     0.651
Suburban
  Intercept                                0.230    0.117       0.105     0.486
Grade 8 Enrollment
  Intercept                                0.645    0.050       0.562     0.725
Grade 8 Enrollment (Quadratic Term)
  Intercept                               -0.073    0.013      -0.097    -0.053
State-to-state Heterogeneity
  τ (mode = 0.437)                         0.477    0.147       0.286     0.736

At the school level, the log odds of offering algebra is negatively related to percent of minority enrollment and positively related to school SES, urban and suburban setting, and size. A sense of the magnitude of the effects of specific predictors can be conveyed by computing the scale-invariant measure of odds ratio, exp{γ}, which yields the ratio of predicted odds of offering algebra, or by obtaining the predicted probabilities of the existence of the course for schools that differ with respect to particular characteristics.

I first examine the effect of the percent of minority enrollment on the likelihood of offering algebra. The estimated ratio of the odds of offering algebra for schools that have minority enrollment greater than or equal to 50% versus schools that do not is exp(-0.664) = 0.515, after controlling for the effects of other variables in the model.
It indicates that, on average, the odds of offering the course among typical schools with high minority enrollment are about half of those among schools with less minority enrollment. Another way to convey the effect of minority enrollment is to compute the predicted probabilities for two typical schools that differ in their racial composition. The probabilities can be obtained by holding all of the grand-mean centered predictors constant at zero and allowing only the minority enrollment indicator to vary. As Table 4.1 shows, the mean of high minority enrollment is 0.14, indicating that 14% of the sample had minority enrollment in excess of 50%. The grand-mean centered version of minority enrollment takes on a value of 1 - 0.14 = 0.86 for schools with high minority enrollment and 0 - 0.14 = -0.14 otherwise. Given random state effects of zero ($u_{0j}$ = 0), the predicted probabilities of offering algebra are $(1 + \exp\{-[1.540 + (-0.664)(0.86)]\})^{-1}$ = 0.725 for a typical school with high minority enrollment and $(1 + \exp\{-[1.540 + (-0.664)(-0.14)]\})^{-1}$ = 0.837 for a typical school with fewer minority students, other things being equal.

As Table 4.3 implies, school SES composition is associated with the log odds of offering algebra, net of the effects of other variables. The likelihood of offering the advanced mathematics course is higher among schools with higher SES. To gauge the effect, the odds ratio is computed for two typical schools that are two standard deviations apart in their SES composition (2 × 0.82 = 1.64). The odds ratio estimated for the higher SES versus lower SES schools is exp(0.383 × 1.64) = 1.874. Given that the random state effects are null, the predicted probabilities for the two typical schools to offer algebra are $(1 + \exp\{-[1.540 + (0.383)(0.82)]\})^{-1}$ = 0.865 and $(1 + \exp\{-[1.540 + (0.383)(-0.82)]\})^{-1}$ = 0.773.
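The odds ratios and predicted probabilities for minority enrollment and SES can be checked with a short calculation. The Python sketch below plugs the posterior means from Table 4.3 and the descriptive statistics from Table 4.1 into the inverse logit, with all other grand-mean centered predictors and the random state effect held at zero. The variable names are illustrative and are not taken from the original analysis code.

```python
import math


def inv_logit(x):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Posterior means from Table 4.3 and descriptive statistics from Table 4.1.
INTERCEPT = 1.540
B_MINORITY = -0.664
B_SES = 0.383
MEAN_MINORITY = 0.14   # proportion of high minority enrollment schools
SD_SES = 0.82          # standard deviation of the school SES scale

# Minority enrollment: the centered indicator is 1 - mean or 0 - mean.
odds_ratio_minority = math.exp(B_MINORITY)
p_high_minority = inv_logit(INTERCEPT + B_MINORITY * (1 - MEAN_MINORITY))
p_low_minority = inv_logit(INTERCEPT + B_MINORITY * (0 - MEAN_MINORITY))

# SES: two typical schools one SD above and one SD below the mean.
odds_ratio_ses = math.exp(B_SES * 2 * SD_SES)
p_high_ses = inv_logit(INTERCEPT + B_SES * SD_SES)
p_low_ses = inv_logit(INTERCEPT - B_SES * SD_SES)

print(round(odds_ratio_minority, 3), round(p_high_minority, 3),
      round(p_low_minority, 3))
print(round(odds_ratio_ses, 3), round(p_high_ses, 3), round(p_low_ses, 3))
```

Rounded to three decimals, the results agree with the values reported in the text (0.515, 0.725, and 0.837 for minority enrollment; 1.874, 0.865, and 0.773 for SES).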
Rural schools, when compared with their urban and suburban counterparts, are in general less likely to include eighth-grade algebra in their mathematics curricula, controlling for school demographic composition, size, state educational expenditure, and percent of children in poverty. The estimated ratios of odds for an urban versus a rural school and a suburban versus a rural school in offering the advanced mathematics course are 1.488 and 1.259, respectively. The predicted probabilities for typical urban, suburban, and rural schools to offer algebra are 0.864 ($(1 + \exp\{-[1.540 + (0.398)(0.77)]\})^{-1}$), 0.839 ($(1 + \exp\{-[1.540 + (0.23)(0.47)]\})^{-1}$), and 0.784 ($(1 + \exp\{-[1.502 + (0.398)(-0.23) + (0.23)(-0.53)]\})^{-1}$).

The partial effect of school size on the logits of offering algebra, net of those of all other predictors, is positive and statistically significant. The relative odds for schools with eighth-grade enrollments of 356 (one standard deviation above the average) versus schools with average eighth-grade enrollments are exp(0.645 × 1.39 + (-0.073) × 1.39 × 1.39) = 2.129, indicating that the odds that the former schools offer algebra are about twice those of the latter. The predicted probability of offering algebra for a typical school with 356 eighth-grade students is 0.909, whereas it is 0.824 for a typical one with average eighth-grade enrollment. Note that 0.824 here differs from the 0.789 estimated in the unconditional model, as it is conditioned on the random effect after controlling for state educational expenditure and percent of children in poverty.

At the state level, the log odds of offering algebra, adjusted for school composition, urbanicity, and size, are associated with state educational expenditure per pupil but not with percent of children in poverty.
The estimated odds ratio for schools in states with educational expenditure per pupil one standard deviation above the average versus their counterparts in states with average educational expenditure is $\exp(0.044 \times 12.35) = 1.72$, after controlling for the percent of children in poverty and the school-level predictors. The predicted probabilities of offering algebra for two typical schools, one located in a state with educational expenditure one standard deviation above average, the other in a state with average educational expenditure, are 0.889 and 0.824, respectively.

The relationship between minority enrollment and the likelihood of offering algebra depends on state educational expenditure per pupil. The gap in the log odds of offering algebra between predominantly minority schools and schools with fewer minority students widens as state educational expenditure per pupil increases. The estimated odds ratio for high minority schools in states with educational expenditure per pupil one standard deviation above the average versus their counterparts in states with average educational expenditure is $\exp(-0.046 \times 12.35) = 0.568$. The predicted probabilities of offering algebra for two typical schools located in a state with educational expenditure one standard deviation above average, with and without high minority enrollment, are 0.670 and 0.725, respectively. As the result in Table 4.3 suggests, the relationship between school SES and the likelihood that a school offers algebra does not depend on state per-pupil educational expenditure.

Regarding the state-to-state heterogeneity, substantial variance remains to be accounted for after the school- and state-level predictors are entered into the model. The mode of the variance is reduced by about 35.6%. Figure 4.5 gives a histogram of the posterior distribution of the conditional variance for the second model. Note that the 90% C.I.
(0.286, 0.736) of the adjusted, or residual, variance is much narrower than that of the unconditional model, whose 90% C.I. is (0.492, 1.222). Allowing the random effect to vary from -1.5 to 1.5 standard deviation units and holding all the variables constant at their grand means, the adjusted predicted probabilities for individual states range from 0.63 to 0.93 when $\tau$ is equal to its mode, 0.437.

Figure 4.5: Marginal Posterior Distribution of $\tau$ (The Conditional Model) [histogram; horizontal axis: state-level variance; vertical axis: proportion per bar]

4.4 Discussion and Implications

Given that the results of the analyses confirmed most of the hypotheses stated in the study, a question that arises is whether the statistically significant partial effects of the school- and state-level predictors are practically significant as well. Translating the changes in odds or predicted probabilities into opportunities to learn at the student level allows us to assess the practical significance and the magnitudes of the effects of the predictors. It enables us to examine the extent to which factors at one level of the multilevel educational system influence, alter, or limit the events at lower levels.

To proceed, I use the predicted probabilities of offering algebra obtained for two typical schools with and without high minority enrollment. All else held constant, the two predicted probabilities are 0.725 and 0.837, respectively. They differ by 0.112. Thus, for 100 typical schools with high minority enrollment, it is predicted that 11 fewer schools will offer algebra than if the probabilities were equal. As seen in Table 4.1, the average grade eight enrollment is 217. With 11 schools of 217 eighth graders each, approximately 2,400 students will attend schools that do not offer algebra for high school credit as a result. This indicates that the practical significance of the racial composition effect is substantial.
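Two of the figures above can be retraced numerically: the range of state-specific predicted probabilities implied by the modal state-level variance, and the translation of the probability gap into numbers of schools and students. The Python sketch below is a minimal illustration using only values quoted in the text. The names are assumptions made for this example, and rounding the school count to a whole number before multiplying is taken from the wording of the passage.

```python
import math


def inv_logit(eta):
    """Convert a log-odds value (logit) into a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Values quoted in the text; the names are illustrative.
INTERCEPT = 1.540              # logit for a typical school in a typical state
TAU_MODE = 0.437               # posterior mode of the state-level variance
P_HIGH_MINORITY = 0.725
P_LOW_MINORITY = 0.837
MEAN_GRADE8_ENROLLMENT = 217

# State random effect set to -1.5 and +1.5 standard deviations.
sd_state = math.sqrt(TAU_MODE)
p_state_low = inv_logit(INTERCEPT - 1.5 * sd_state)
p_state_high = inv_logit(INTERCEPT + 1.5 * sd_state)

# Practical significance of the racial-composition gap per 100 schools.
gap = P_LOW_MINORITY - P_HIGH_MINORITY
fewer_schools_per_100 = round(100 * gap)
students_affected = fewer_schools_per_100 * MEAN_GRADE8_ENROLLMENT

print(round(p_state_low, 2), round(p_state_high, 2))
print(fewer_schools_per_100, students_affected)
```

The student count produced this way is 11 x 217, which the text rounds to approximately 2,400.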
By the same token, differences in the predicted probabilities of offering algebra among typical schools with different SES composition, size, and setting (rural versus others) could have a significant impact on students' access to the course.

The correlational effect of the denial of access to algebra initiated at a higher level of the educational system can also be illustrated using a unit above schools: the state. For example, all else being equal, a difference of 0.065 in the predicted probabilities of offering algebra brought about by a one standard deviation decrease in state educational expenditure per student translates into the absence of the algebra course in hundreds of typical schools within a state and, on average, into the loss of the opportunity to learn algebra for 217 students in each school.

Whereas the partial effects allow us to gauge the influence of a particular factor net of those of the other factors, it may be useful, and even important, to examine how the outcome varies with more than one variable. One reason is that a school's disadvantage in its capacity to offer algebra may be compounded by other factors. For instance, high minority schools on average have lower SES than low minority schools, and rural schools usually have fewer students than their urban and suburban counterparts. In the 1992 TSAP sample, high minority schools have a mean SES of -0.982, and low minority schools have a mean of 0.163. The two means are more than one standard deviation apart. Thus, other factors remaining equal, the predicted probability of offering the course for a typical high minority school whose SES is equal to -0.982 is $(1 + \exp\{-[1.540 + (-0.664)(0.86) + (0.383)(-0.982)]\})^{-1} = 0.644$. It is 0.081 less than that for a high minority school with average SES (i.e., SES = 0).

After assessing the practical significance of most predictors, I now offer and discuss some tentative explanations as well as the implications of the findings.
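The compounding of disadvantage described above, high minority enrollment combined with below-average SES, can be reproduced by adding both terms to the logit. The Python sketch below is illustrative only: the names are assumptions, and the inputs are the rounded coefficients and group means quoted in the text.

```python
import math


def inv_logit(eta):
    """Convert a log-odds value (logit) into a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Rounded values quoted in the text; the names are illustrative.
INTERCEPT = 1.540
B_MINORITY = -0.664
B_SES = 0.383
CENTERED_HIGH_MINORITY = 0.86      # 1 - 0.14 after grand-mean centering
MEAN_SES_HIGH_MINORITY = -0.982    # mean SES of high minority schools
MEAN_SES_LOW_MINORITY = 0.163      # mean SES of low minority schools
SD_SES = 0.82

# High minority school with average SES (centered SES = 0).
p_average_ses = inv_logit(INTERCEPT + B_MINORITY * CENTERED_HIGH_MINORITY)

# High minority school at the mean SES observed for such schools.
p_compounded = inv_logit(
    INTERCEPT
    + B_MINORITY * CENTERED_HIGH_MINORITY
    + B_SES * MEAN_SES_HIGH_MINORITY
)

drop = p_average_ses - p_compounded
ses_gap_in_sd = (MEAN_SES_LOW_MINORITY - MEAN_SES_HIGH_MINORITY) / SD_SES

print(round(p_compounded, 3), round(drop, 3), round(ses_gap_in_sd, 2))
```

The last quantity expresses the SES gap between the two groups of schools in standard deviation units, which confirms that the two means are more than one standard deviation apart.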
The analyses indicate racial and social stratification in the opportunities to learn algebra among schools in the 42 states examined. The fact that high minority schools are more likely to have lower SES suggests there is a clustering of poor and minority children in schools, which leads to a compounding of a school's disadvantage. The clustering, as suggested by Hawley (1993) and Carter (1995), could have been brought about by residential segregation, policies of neighborhood school assignment, and school district configurations.

Regarding school size, the results support the assertion that greater grade enrollment encourages greater specialization and differentiation of the curriculum (Lee & Smith, 1995). The association, however, is a nonlinear one and it weakens as school size increases.

One possible reason that rural schools are less likely to offer the advanced mathematics course, among other things, may be the limited availability of highly qualified teachers in the rural sector due to the more restrictive markets in certain demographic regions (Monk & Haller, 1993). There may be an inequitable distribution of well-trained teachers across the different sectors.

The association of increased levels of financial investment in public education with greater access to the course lends some support to the conclusions of Hedges, Laine, and Greenwald's (1994) meta-analysis of studies of the effects of differential school inputs on student outcomes. They concluded that financial resource input is likely to be related to school outcomes. The result of the present investigation shows that increased state spending is associated with the offering of a course that is, in turn, associated with higher mathematics achievement (MacIver & Epstein, 1995) and a greater likelihood of enrollment in advanced high school mathematics (Stevenson et al., 1984).
It is important to note here, however, that because no educational expenditure at the district level was entered as a predictor, district-to-district variability in funding was not modeled. In order to model the relationship between the two educational inputs (district educational spending and the availability of the course), two indices of educational investment efforts would be needed. These indices are the educational expenditure per student and the level of special education funding. TSAP does not provide the relevant data for the inclusion of these indices in the analysis. Hence what was examined here was the general effect of overall state public education investment efforts on the availability of the advanced mathematics course.

Whereas higher levels of investment in public education are associated with a greater likelihood of offering the course, they also widen the racial composition gap in access to algebra. This seems to suggest that increased state educational expenditures benefit low-minority schools more than high-minority schools.

In sum, the results suggest that schools serving minority students and low SES students, small schools, and schools located in states spending less on education are comparatively less likely than other schools to offer algebra. The implication of the finding is that the size, composition, and location of a school are linked to inequality in access to this particular educational resource. In these ways, the schooling system reinforces social, ethnic, and geographic inequality in the opportunity to study high-school algebra in the middle grades.

Finally, the analyses show that, in addition to allowing one to analyze the specific effects of variables of interest while holding other factors constant, the fitted or predicted probabilities could act as a production and curriculum indicator (Oakes, 1989; Pelgrum et al., 1995; Porter, 1991) at the state and the school level.
They inform us about the "implemented curriculum" (Pelgrum et al., 1995) at those levels and how it is related to "curriculum antecedents" (Pelgrum et al., 1995) such as the SES composition of the school. The fitted probabilities could serve as an indicator as well as a diagnostic of how well the educational systems perform. For example, they can indicate the presence of the stratification of learning opportunities due to race, SES, and location in terms of student attendance at a school that offers algebra.

CHAPTER 5

SUMMARY AND CONCLUSIONS

5.1 Summary and Conclusions

In this work, I coded Zeger and Karim's (1991) Bayesian posterior analysis for generalized linear models with random effects via the Gibbs sampler. The SAS/IML code performed well and the algorithm gave reasonable inferences. This was documented with a simulation study using datasets with a structure similar to that of the TSAP mathematics data.

Using the algorithm, I implemented a Bayesian, multilevel analysis of access to eighth-grade algebra. The analysis accommodated the hierarchical or nested design of the TSAP in mathematics and the need to incorporate the uncertainty about the variances at the state level.

The results suggest that schools serving minority students and low SES students, small schools, schools located in rural settings, and schools in states spending less on education are comparatively less likely than other schools to offer algebra. The implication is that the size, composition, and location of a school are linked to inequality in access to this educational resource. In these ways, the schooling system reinforces social, ethnic, and geographic inequality in the opportunity to study advanced mathematics.

The analyses demonstrated that the fitted probabilities could be used as an indicator of certain academic opportunities in schools with different demographics, enrollment, setting, and levels of state financial investment efforts.
In addition, the approach can be applied to policy research addressing the multilevel structure of the educational and social systems.

5.2 Future Research Needs

5.2.1 Substantive Inquiry

This study provided a snapshot of the differential availability of eighth-grade algebra among public schools in 42 states in 1992-93. The focus of the inquiry is on an essential component of the opportunities to learn algebra in the middle grades, namely the content of instruction that schools make accessible to students (Porter, 1994). With the current reforms in mathematics that stress a high-quality curriculum for all students (Porter, 1994; Wheelock, 1996), efforts to effect changes in the content of and access to algebra in the mathematics curriculum for the middle grades have been initiated over the past few years (Edwards, 1994; Olson, 1994). Two main and related curricular changes regarding algebra instruction are advocated by the widely circulated Curriculum and Evaluation Standards for School Mathematics (NCTM, 1989), often abbreviated as the Standards.

First, as Jack Price, the president of NCTM, stated, algebra should be seen as "a strand throughout the K-12 curriculum" (Olson, 1994, p. 11), and not as an isolated instructional topic. Indeed, it is recommended not to segregate basic algebraic ideas into a one-year course (NCTM, 1993).

Second, algebraic thinking that emphasizes understanding, reasoning, problem-solving, and real-world applications is to be introduced gradually to all students in a broad and integrated curriculum for the middle grades. Other topics included in the recommended curriculum are measurement, geometry, probability, and statistics. An example of an integrated middle grades mathematics curriculum that reflects the Standards is the Connected Mathematics Project Curriculum (Connected Mathematics Project, 1996). According to the proponents of the Standards, extensive curriculum development is required to guarantee access to algebraic competence for all students (Silver, 1995).
For supporters, the traditional first-year course in algebra is not the "right algebra" for everyone (Chambers, 1994), in part because the course puts undue emphasis on rules and algorithms.

The requirement of eighth-grade algebra in various school districts such as Cambridge, Massachusetts (Olson, 1994) has brought greater access to algebra to students. The district requirement shares the goals of the Standards to engage more students in learning important mathematical ideas and to promote their proficiency in mathematics (Silver, 1995).

Further research can be carried out to study the impact of current reforms on access to algebra. Of particular interest is the availability of algebra instruction in the middle grades amidst the perceived incompatibility between the "problem-based" curriculum advocated by the NCTM and the more conventional "scope-and-sequence" approach (Schoenfeld, 1994). In addition, the great amount of instructional and financial support and community involvement called for by the standards-based reforms, for example, extensive curriculum development and teacher education, adds further difficulties to a reform process meant to guarantee access and opportunity for all students (Bruckerhoff, 1995; Silver, 1996; Tate, 1994). An important research question one may ask is what state- and school-level factors may influence a school's decision to implement curricular reform and offer "problem-based" algebra instruction in the middle grades.

The present inquiry focused on the existence of an algebra course for high school credit in the middle grades. Further research can be carried out to review its "breadth," or the percentage of students within a school who enroll in the course. Becker (1990) found substantial within-school variability in access to algebra.
He found that, of the 63% of the 2,400 middle-grade schools that offered algebra courses in grade 7 or, more typically, grade 8, only 35% provided access to algebra to at least one-quarter of their students. A future study of the student- and school-level correlates of enrollment in the algebra course can complement the present investigation by offering an understanding of the selection process at another level of the hierarchical educational system, that of the student. Using Pelgrum et al.'s (1995) term, such a study will allow one to examine the "attained curriculum" (p. 81) at the micro, or student, level, and will complement the present investigation of the "implemented curriculum" (p. 81) at the meso, or school and state, levels in monitoring the performance of the multilevel educational system.

5.2.2 Methodological Development

The Gibbs sampler was coded in SAS/IML and its execution was very time consuming. In the simulation, for instance, it took approximately 8 to 16 hours to fit a model. Coding in a lower or intermediate-level language such as C will greatly reduce the run times. A prototype written in C is currently being developed. My early prediction is that it will take approximately half an hour to forty-five minutes to execute the same analysis mentioned previously.

The simulations and analyses of the present study focused on cases where $\tau$ is a scalar. With the development of faster computer code, simulations to document the performance of the algorithm for cases where $T$ is a matrix can be carried out. Satisfactory results will allow the code to be applied to analyze random coefficient models. A slight modification of the over-simplified model (2.6) described in Chapter 2 gives us an example of a random coefficient model. By assuming the racial composition effect to be randomly varying across the 42 states after controlling for $\mathrm{StateExp}_j$, i.e.,

$$\beta_{1j} = \gamma_{10} + \gamma_{11}\,\mathrm{StateExp}_j + u_{1j}, \qquad (5.1)$$

model (2.6) becomes a random coefficient model with $u_j$ being a $2 \times 1$ vector of random effects with mean zero and variance $T$.

APPENDICES

APPENDIX A

SAS/IML Code for the Gibbs Sampler

/*******************************************************************/
/* Program name: bayes.sas
   Purpose and Modules:
   The code implements the Gibbs sampler developed by Karim (1991) and
   Zeger and Karim (1991). Its main program consists of the following
   modules:
   1) import  -- reads in the data set,
   2) initial -- enters various parameter values and sets up vectors and matrices,
   3) glimgam -- performs GLIM to obtain gamma-hat and V-hat,
   4) gbgam   -- generates gamma,
   5) gbtau   -- generates Tau,
   6) ridgeuj -- performs ridged regression to obtain uj-hat and its variance,
   7) gbuj    -- generates uj,
   8) output  -- outputs simulated values of gamma, Tau and uj
*/
/*******************************************************************/

/******** formats and directories ******/
Options pagesize = 100 linesize = 80 nocenter number date;
Libname gibbs 'c:\sas\data';
Libname output 'c:\sas\output';
Proc iml;
reset nolog;
/********* formats and directories ******/

/********* module import ************/
start import (yij,lev2var,lev1var,nj);
/* begin module import: get raw data */
/* yij = outcome variable
   lev2var = a level 2 variable
   lev1var = a level 1 variable
   nj = number of level 1 units in cluster j */
use gibbs.rawdata;
read all var {yij lev2var lev1var};
close gibbs.rawdata;
use gibbs.nj;
read all var {nx} into nj;
close gibbs.nj;
finish;

/********* part 2 ************/
start initial (yij,lev2var,lev1var,nj,xij,zij,ngroup,N,nf,nq,inititer,
               niter,nsample,gam,uj,Tau,seed1,outgam,outuj,outTau);
/* xij = predictors for fixed effects
   zij = predictors for random effects
   N = total sample size
   ngroup = number of level-2 units
   nf = number of fixed effects --- beta's
   nq = number of random effects --- z's
   inititer = counter of iteration
   niter = number of maximum iteration
   nsample = sample size of the
   posterior dist.
   gam = fixed effects
   uj = random effects
   Tau = variance of uj's
   seed1 = seed number
   outgam = output beta 0 = no, 1 = yes
   outuj = output bj 0 = no, 1 = yes
   outTau = output Tau 0 = no, 1 = yes */
ngroup = 40;
nf = ;
nq = ;
inititer = 10;
niter = 5000;
nsample = 2000;
seed1 = 12378942;
outgam = ;
outbj = ;
outTau = ;
bj = j(ngroup*nq,1,0);
yij = yij;
N = nrow(yij);
x1 = j(N,1,1);
xij = x1||lev2var||lev1var;
z1 = j(N,nq,1);
zij = z1;
finish;

/******** module glimgam ********/
start glimgam (yij,nj,nf,nq,ngroup,N,xij,zij,gam,uj,vgam,inititer,det,
               loggamx,ofstuj);
crit = 1;
lo = 0;
iterval = j(20,2,0);
loggamx = 0;
ofstuj = j(N,1,0);   /* offset zijuj */
a = 1; c = 1; e = 1; f = 1; g = nq;
Do i = 1 to ngroup;
   ijth = a - 1 + nj[c:e];
   ofstuj[a:ijth] = zij[a:ijth,1:nq]*uj[f:g];
   a = a + nj[c:e];
   c = c + 1;
   e = e + 1;
   f = f + nq;
   g = g + nq;
End;
Do it = 1 to 20 while(crit > 1.0e-8);
   etaij = j(N,1,0);
   loglhood = 0;
   xwx = j(nf,nf,0);
   xwystar = j(nf,1,0);
   ystarij = j(N,1,0);
   wij = j(N,1,0);
   uij = j(N,1,0);
   a = 1; c = 1; e = 1; f = 1;
   Do i = 1 to ngroup;
      ijth = a - 1 + nj[c:e];
      etaij[a:ijth] = xij[a:ijth,1:nf]*gam + ofstuj[a:ijth];
      nunit = ijth - a + 1;
      if inititer <= 10 then do;
         if it <= 3 then do;
            h = a;
            do k = 1 to nunit;
               if .95 <= uij[h:h] then do;
                  uij[h:h] = .95;
                  * print 'large probability';
               end;
               h = h + 1;
            end;
         end;
      end;
      wij[a:ijth] = uij[a:ijth]#(1-uij[a:ijth]);
      ystarij[a:ijth] = etaij[a:ijth] + ((yij[a:ijth] - uij[a:ijth])/wij[a:ijth]);
      xwx = xwx + (xij[a:ijth,1:nf]#wij[a:ijth])`*xij[a:ijth,1:nf];
      xwystar = xwystar + ((xij[a:ijth,1:nf]#wij[a:ijth])`*(ystarij[a:ijth] - ofstuj[a:ijth]));
      loglhood = loglhood + sum(yij[a:ijth]#log(uij[a:ijth])
                 + (1-yij[a:ijth])#log(1-uij[a:ijth]));
      a = a + nj[c:e];
      c = c + 1;
      e = e + 1;
      f = f + nq;
      g = g + nq;
   end;
   gam = solve(xwx,xwystar);
   vgam = inv(xwx);
   crit = abs(loglhood-lo);
   iterval[it,] = it||loglhood;
   lo = loglhood;
end;
loggamx = loglhood;
finish;
/***** module glimgam ****/

/********* module gbgam ******/
start gbgam (yij,nj,nf,nq,ngroup,N,xij,zij,gam,uj,vgam,seed1,loggamx,ofstuj,inititer);
/* ygam = envelope function (multivariate normal distribution)
   xgam = actual function
   c1 = mode-matching constant
   c2 = variance-inflation constant
   loggamx = pi(x) from glimgam */
ygam = 0;
xgam = 0;
gamstar1 = j(nf,1,0);
c1 = 0;
c2 = 2;
etaij = j(N,1,0);
uij = j(N,1,0);
wij = j(N,1,0);
nordev = j(nf,1,0);
ratio = 0;
/* compute the log of ordinate of ygam--the envelope function at the mode */
loggamy = - ((nf/2) * log (c2 * 3.1416)) - ((nf/2) * log (det(c2 * vgam)));
/* compute c1 */
c1 = loggamx - loggamy;
print c1;
/* generate gamstar from ygam */
u = 1;
gamcnt = 0;
Do while (ratio < u);
   a = 1;
   b = 1;
   do i = 1 to nf;
      nordev[a:b] = rannor(seed1);
      a = a + 1;
      b = b + 1;
   end;
   CholeskU = j(nf,nf,0);
   CholeskU = root(c2*vgam);
   gamstar = gam + CholeskU`*nordev;
   /* compute the ordinate */
   loggamy = - ((nf/2) * log (c2 * 3.17)) - ((nf/2) * log (det(c2 * vgam)))
             - ((gamstar-gam)`*inv(c2*vgam)*(gamstar-gam))/2;
   /* compute the ordinate of xgam */
   a = 1; c = 1; e = 1; f = 1; g = nq;
   Do i = 1 to ngroup;
      ijth = a - 1 + nj[c:e];
      nijth = 0;
      nijth = ijth - a + 1;
      etaij[a:ijth] = xij[a:ijth,1:nf]*gamstar + ofstuj[a:ijth];
      uij[a:ijth] = 1/(1+exp(-1*etaij[a:ijth]));
      loggamx = loggamx + sum(yij[a:ijth]#log(uij[a:ijth])
                + (1-yij[a:ijth])#log(1-uij[a:ijth]));
      a = a + nj[c:e];
      c = c + 1;
      e = e + 1;
      f = f + nq;
      g = g + nq;
   end;
   /* calculate the ratio fgam/ggam */
   gamcnt = gamcnt + 1;
   ratio = exp(loggamx - loggamy - c1);
   u = uniform(seed1);
end;
/* updating gam */
gam = gamstar;
end;
finish;
/******* module gbgam *********/

/******* module gbtau ***************/
start gbtau (uj,Tau,nq,ngroup);
seed1 = 344456;
S = j(nq,nq,0);
c = 1;
e = nq;
Do i = 1 to ngroup;
   S = S + (uj[c:e]*uj[c:e]`);
   c = c + nq;
   e = e + nq;
end;
do i = 1 to nq;
   do j = 1 to nq;
      if(i 1.0e-8);
loglhood = 0;
etaij[a:ijth] = ofstgam[a:ijth] + zij[a:ijth,1:nq]*uj[f:g];
uij[a:ijth] = 1/(1+exp(-etaij[a:ijth]));
nunit = ijth - a + 1;
if inititer <= 50 then do;
   if it <= 20 then do;
      h = a;
      do k = 1 to nunit;
         if uij[h:h] <= .01 then do;
            uij[h:h] = .01;
         end;
         else if .99 <= uij[h:h] then do;
            uij[h:h] = .99;
         end;
         h = h + 1;
      end;
   end;
end;
wij[a:ijth] = uij[a:ijth]#(1-uij[a:ijth]);
ystarij[a:ijth] = etaij[a:ijth] + (yij[a:ijth] - uij[a:ijth])/wij[a:ijth];
zwzian = (zij[a:ijth,1:nq]#wij[a:ijth])`*zij[a:ijth,1:nq] + inv(Tau);
zwystaro = (zij[a:ijth,1:nq]#wij[a:ijth])`*(ystarij[a:ijth] - ofstgam[a:ijth]);
loglhdp1[a:ijth] = (yij[a:ijth]#log(uij[a:ijth]))
                   + ((1-yij[a:ijth])#log(1-uij[a:ijth]));
loglhood = sum(loglhdp1[a:ijth]) - (uj[f:g]*inv(D)*uj[f:g])/2;
crit = abs(loglhood-lo);
iterval[it,] = it||loglhood;
lo = loglhood;
uj[f:g] = solve(zwzian, zwystaro);
end;
varuj[f:g,1:nq] = inv(zwzian);
logujx[c:e] = loglhood;
a = a + nj[c:e];
c = c + 1;
e = e + 1;
f = f + nq;
g = g + nq;
end;
finish;
/******* module ridgeuj ***********/

/******** module gbuj ************/
start gbuj (yij,xij,zij,nj,N,ngroup,nf,nq,gam,Tau,uj,varuj,seed1,
            logujx,accuj,ofstgam,inititer);
CholeskU = j(ngroup*nq,nq,0);
u = j(ngroup*nq,1,0);
logujy = j(ngroup,1,0);   /* (y --- the envelope/proposal function) */
ujstar = j(ngroup*nq,1,0);
c1 = j(ngroup,1,0);
c2 = 4;
etaij = j(N,1,0);
uij = j(N,1,0);
wij = j(N,1,0);
logujxp1 = j(N,1,0);
ujcount = j(ngroup,1,0);
/* calculate c1 */
c = 1; e = 1; f = 1; g = nq;
Do i = 1 to ngroup;
   /* compute the ordinate of ggam --- the envelope function at the mode */
   logujy[c:e] = - ((nq/2) * log (2 * 3.14))
                 - ((nq/2) * log (det(c2 * varuj[f:g,1:nq])));
   /* compute the ratio c1 */
   c1[c:e] = logujx[c:e] - logujy[c:e];
   c = c + 1;
   e = e + 1;
   f = f + nq;
   g = g + nq;
end;
/* generate ujstar from yuj --- the envelope function */
a = 1; c = 1; e = 1; f = 1; g = nq; nijth = 0;
Do i = 1 to ngroup;
   ratio = 0;
   u = 1;
   ijth = a - 1 + nj[c:e];
   /* size index */
   nijth = ijth - a + 1;
   CholeskU[f:g,1:nq] = root(c2*varuj[f:g,1:nq]);
   Do while (ratio < u);
      nordev = j(nq,1,0);
      x = 1;
      do k = 1 to nq;
         nordev[x:x] = rannor(seed1);
         x = x + 1;
      end;
      ujstar[f:g] = uj[f:g] + CholeskU[f:g,1:nq]`*nordev[1:nq];
      /* compute the ordinate of ujstar using guj ---- the envelope function */
      logujy[c:e] = - ((nq/2) * log (2 * 3.14))
                    - ((nq/2) * log (det(c2 * varuj[f:g,1:nq])))
                    - ((ujstar[f:g]-uj[f:g])`*inv(c2*varuj[f:g,1:nq])*(ujstar[f:g]-uj[f:g]))/2;
      /* the ordinate of ujstar using fuj ---- the true function */
      etaij[a:ijth] = ofstgam[a:ijth] + zij[a:ijth,1:nq]*ujstar[f:g];
      uij[a:ijth] = 1/(1+exp(-etaij[a:ijth]));
      logujxp1[a:ijth] = ((yij[a:ijth]#log(uij[a:ijth]))
                         + ((1-yij[a:ijth])#log(1-uij[a:ijth])));
      logujx[c:e] = sum(logujxp1[a:ijth]) - (ujstar[f:g]*inv(Tau)*ujstar[f:g])/2;
      /* compute the ratio */
      ujcount[c:e] = ujcount[c:e] + 1;
      ratio = exp(logujx[c:e] - c1[c:e] - logujy[c:e]);
      u = uniform(seed1);
   end;
   uj[f:g] = ujstar[f:g];
   a = a + nj[c:e];
   c = c + 1;
   e = e + 1;
   f = f + nq;
   g = g + nq;
end;
totujcnt = 0;
totujcnt = sum(ujcount);
print totujcnt;
end;
finish;
/****** module gbuj ****************/

/****** module output ***************/
start output (itcount,firstout,ngroup,nf,nq,niter,nsample,
              outgam,outTau,outuj,gam,Tau,uj);
gamt = j(1,nf,0);   /* gam */
gamt = gam`;
uniqTau = nq * (nq + 1)/2;   /* unique elements of Tau */
elemTau = j(1,uniqTau,0);
ujt = j(1,ngroup*nq,0);   /* uj */
ujt = uj`;
if (firstout = itcount) then do;
   if (outgam = 1) then create output.gampost from gamt;
   if (outTau = 1) then create output.Taupost from elemTau;
   if (outuj = 1) then create output.ujpost from ujt;
end;
/* Write out the posterior distribution of the parameters */
if (outgam = 1) then do;
   setout output.gampost;
   append from gamt;
end;
if (outTau = 1) then do;
   index = 1;
   do i = 1 to nq;
      do j = 1 to nq;
         if (i <= j) then do;
            elemTau[1,index] = Tau[i,j];
            index = index + 1;
         end;
      end;
   end;
   setout output.Taupost;
   append from elemTau;
end;
if (outuj = 1) then do;
   setout output.ujpost;
   append from ujt;
end;
finish;

/******* main program **********/
itcount = 1;
run import (yij,lev2var,lev1var,nj);
run initial (yij,lev2var,lev1var,nj,xij,zij,ngroup,N,nf,nq,
             inititer,niter,nsample,
             gam,uj,Tau,seed1,outgam,outuj,outTau);
Tau = 1;
uj = j(ngroup*nq,1,0);
firstout = niter - nsample + 1;
Do while (itcount <= niter);
   run glimgam (yij,nj,nf,nq,ngroup,N,xij,zij,gam,uj,vgam,inititer,det,
                loggamx,ofstuj);
   run gbgam (yij,nj,nf,nq,ngroup,N,xij,zij,gam,uj,vgam,gengam,seed1,
              loggamx,c2gam,cntgam,accgam,ofstuj,inititer);
   run ridgeuj (yij,xij,zij,nj,N,ngroup,nf,nq,gam,Tau,uj,varuj,inititer,
                logujx,ofstgam);
   run gbuj (yij,xij,zij,nj,N,ngroup,nf,nq,gam,Tau,uj,varuj,genuj,seed1,
             logujx,c2uj,cntuj,accuj,ofstgam,inititer);
   run gbtau (uj,Tau,nq,ngroup);
   If (itcount >= firstout) then do;
      run output (itcount,firstout,ngroup,nf,nq,niter,
                  nsample,outgam,outTau,outuj,gam,Tau,uj);
   end;
   print itcount;
   itcount = itcount + 1;
end;
quit;

APPENDIX B

Approval Letter from the U.S. Department of Education for Accessing Individually Identifiable Survey Data Base

U.S. DEPARTMENT OF EDUCATION
OFFICE OF EDUCATIONAL RESEARCH AND IMPROVEMENT
NATIONAL CENTER FOR EDUCATION STATISTICS

Yuk Fai Cheong
Department
of Counseling
College of Education
Michigan State University
East Lansing, Michigan 48824-1034

SEP 5 1994

Dear Mr. Cheong:

I am pleased to inform you that the Department of Education, Michigan State University has met the requirements for accessing the individually identifiable survey data base entitled: "1992 NAEP Trial State Assessment". The following information is enclosed for your use:

o A signed copy of the license and 1 cop(ies) of the Affidavit of Non-Disclosure, for yourself and three project staff;
o One copy of the data base you requested on four 9-track tapes numbered: W-1565, W-01520, W-01483 and W05297; and
o Disk containing documentation.

Please keep the single copy of the W914, GEPA, and the N088 Wm enclosed with your initial licensing application, with the executed license for reference by you and those project staff who will be accessing the data. Also retain a copy of your approved data W with the executed license. Violations of any of the licensing provisions by any member of your research project staff could result in cancellation. MMmonlmeLEWmmwaa period not to exceed 5 years commencing with the date of the NCES Commissioner's signature on the license.

You have been assigned license number: 940830148. Reference this number in all future correspondence. If you have any questions, you may call me at (202) 219-1920.

Alan W. Moorehead
Data Security Officer

Enclosures

WASHINGTON, DC 22235--

APPENDIX C

Approval Letter from the University Committee on Research Involving Human Subjects

MICHIGAN STATE UNIVERSITY

November 20, 1996

TO: Stephen W. Raudenbush
College of Education
461 Erickson Hall

RE: IRB#: 90-‘13
TITLE: CORRELATES OF STATE VARIATION IN THE SOCIAL DISTRIBUTION OF ACE : A BAYESIAN ANALYSIS: METHODOLOGICAL ALTERNATIVES IN THE ANALYSIS OF DATA FROM NAEP
REVISION REQUESTED: 11/05/96
CATEGORY: 1'3
APPROVAL DATE: 09/13/96

The University Committee on Research Involving Human Subjects' (UCRIHS) review of this project is complete.
I am pleased to advise that the rights and welfare of the human subjects appear to be adequately protected and methods to obtain informed consent are appropriate. Therefore, the UCRIHS approved this project and any revisions listed above.

RENEWAL: UCRIHS approval is valid for one calendar year, beginning with the approval date shown above. Investigators planning to continue a project beyond one year must use the green renewal form (enclosed with the original approval letter or when a project is renewed) to seek updated certification. There is a maximum of four such expedited renewals possible. Investigators wishing to continue a project beyond that time need to submit it again for complete review.

REVISIONS: UCRIHS must review any changes in procedures involving human subjects, prior to initiation of the change. If this is done at the time of renewal, please use the green renewal form. To revise an approved protocol at any other time during the year, send your written request to the UCRIHS Chair, requesting revised approval and referencing the project's IRB # and title. Include in your request a description of the change and any revised instruments, consent forms or advertisements that are applicable.

PROBLEMS/CHANGES: Should either of the following arise during the course of the work, investigators must notify UCRIHS promptly: (1) problems (unexpected side effects, complaints, etc.) involving human subjects or (2) changes in the research environment or new information indicating greater risk to the human subjects than existed when the protocol was previously reviewed and approved.

If we can be of any future help, please do not hesitate to contact us at (517)355-2180 or FAX (517)432-1171.

Sincerely,

David E. Wright, Ph.D.
UCRIHS Chair

DEW:bed

cc: Yuk Fai Cheong

REFERENCES

LIST OF REFERENCES

Annie E. Casey Foundation. (1994). Kids count data book: State profiles of child well-being.
Washington, DC: Center for the Study of Social Policy.
Anderson, D. A., & Aitkin, M. (1985). Variance component models with binary response: Interviewer variability. Journal of the Royal Statistical Society, Series B, 47, 203-210.
Archer, E. (1993). [title illegible]. New York, NY: Academy for Educational Development.
Becker, H. J. (1990). Curriculum and instruction in middle-grade schools. Phi Delta Kappan, 71, 450-457.
Besag, J., Green, P., Higdon, D., & Mengersen, K. (1995). Bayesian computation and stochastic systems. Statistical Science, 10(1), 3-66.
Best, N., Cowles, M. K., & Vines, K. (1995). CODA: Convergence diagnosis and output analysis software for Gibbs sampling output. Cambridge: MRC Biostatistics Unit.
Box, G. E. P., & Tiao, G. C. (1973). Bayesian inference in statistical analysis. New York: Wiley.
Breslow, N. E., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9-25.
Bruckerhoff, C. (1995). Life in the bricks. Urban Education, 30(3), 317-336.
Bryk, A. S., Raudenbush, S. W., & Congdon, R. T. (1996). Hierarchical linear and nonlinear modeling with the HLM/2L and HLM/3L programs. Chicago, IL: Scientific Software International, Inc.
Carter, R. L. (1995). The unending struggle for equal educational opportunity. Teachers College Record, 96(4), 619-626.
Casella, G., & George, E. I. (1992). Explaining the Gibbs sampler. The American Statistician, 46(3), 167-174.
Chambers, D. L. (1994). The right algebra for all. Educational Leadership, 51(6), 85-86.
Council of Chief State School Officers. (1993). [title illegible]. Washington, DC: Author.
Cowles, M. K. (1994). [title illegible]. Unpublished doctoral dissertation, University of Minnesota.
Dagpunar, J. (1988). Principles of random variate generation. Oxford: Clarendon Press.
DeMaris, A. (1992). Logit modeling: Practical applications (Sage University Paper series on Quantitative Applications in the Social Sciences, series no. 07-086). Newbury Park, CA: Sage.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977).
Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39, 1-38.
Dellaportas, P., & Smith, A. F. M. (1993). Bayesian inference for generalized linear and proportional hazards models via Gibbs sampling. Applied Statistics, 42, 443-459.
Draper, D. (1995). Inference and hierarchical modeling in the social sciences. Journal of Educational and Behavioral Statistics, 20(2), 115-147.
Epstein, J. L., & McPartland, J. M. (1988). Education in the middle grades: A national [remainder of title illegible]. Baltimore: Johns Hopkins University, Center for Research on Elementary and Middle Schools.
Edwards, T. G. (1992). Current reform efforts in mathematics education. Washington, DC. (ERIC Document Reproduction Service No. ED 372 969).
Fahrmeir, L., & Tutz, G. (1994). Multivariate statistical modelling based on generalized linear models. New York: Springer-Verlag.
Gamoran, A. (1987). The stratification of high school learning opportunities. Sociology of Education, 60, 135-155.
Gelfand, A. E. (1994). Gibbs sampling. To appear in the Encyclopedia of Statistical Sciences.
Gelfand, A. E., & Smith, A. F. M. (1990). Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398-409.
Gelfand, A. E., Hills, S., Racine-Poon, A., & Smith, A. F. M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. Journal of the American Statistical Association, 85, 972-985.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (1995). Bayesian data analysis. New York: Chapman & Hall.
Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.
Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics 4 (pp. 169-193). Oxford: Clarendon Press.
Gibbons, R. D., & Hedeker, D. (1994).
Application of random-effects probit regression models. Journal of Consulting and Clinical Psychology, 62, 285-296.
Gilks, W. R. (1996). Full conditional distributions. In W. R. Gilks, S. Richardson, & D. J. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 75-88). London: Chapman & Hall.
Gilks, W. R., Best, N. G., & Tan, K. K. C. (1995). Adaptive rejection Metropolis sampling within Gibbs sampling. To appear in Applied Statistics.
Gilks, W. R., & Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 41, 337-348.
Gilks, W. R., Clayton, D. G., Spiegelhalter, D. J., Best, N. G., McNeil, A. J., Sharples, L. D., & Kirby, A. J. (1993). Modeling complexity: Applications of Gibbs sampling in medicine. Journal of the Royal Statistical Society, Series B, 55, 39-52.
Goldstein, H. (1991). Nonlinear multilevel models, with an application to discrete response data. Biometrika, 78(1), 45-51.
Goldstein, H., & Rasbash, J. (1995). Improved approximations for multilevel models with binary response. Journal of the Royal Statistical Society [series and volume illegible], 39-52.
Gordon, B. W. (1967). Equalizing education opportunity in the public school. IRCD Bulletin, [volume illegible](5), 1-3.
Haller, E. J., Monk, D. H., Bear, A. S., Griffith, H., & Moss, P. (1990). School size and program comprehensiveness: Evidence from High School and Beyond. Educational Evaluation and Policy Analysis, 12(2), 109-120.
Hallinan, M. T. (1992). The organization of students for instruction in the middle school. Sociology of Education, 65, 114-127.
Hallinan, M. T. (1996). Educational processes and school reform. In A. C. Kerckhoff (Ed.), Generating social stratification: Toward a new research agenda (pp. 153-169). Boulder, CO: Westview Press.
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97-109.
Hawley, W. D. (1983). Achieving quality integrated education--with or without federal help. Phi Delta Kappan, 64(5), 334-338.
Hedeker, D. R., & Gibbons, R. D. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics, 50, 933-944.
Hedeker, D. R., & Gibbons, R. D. (in press).
MIXOR: A computer program for mixed-effects ordinal regression analysis. To appear in Computer Methods and Programs in Biomedical Research.
Hedges, L. V., Laine, R. D., & Greenwald, R. (1994). Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher, 23(3), 5-14.
Heidelberger, P., & Welch, P. (1983). Simulation run length control in the presence of an initial transient. Operations Research, 31, 1109-44.
Hosmer, D. W., & Lemeshow, S. (1989). Applied logistic regression. New York: John Wiley.
Jetter, A. (1993). Mississippi learning. New York Times Magazine, 21, 28-72.
Kamii, M. (1990). Opening the algebra gate: Removing obstacles to success in college preparatory mathematics courses. Journal of Negro Education, 59(3), 392-405.
Karim, M. R. (1991). Generalized linear models with random effects. Unpublished doctoral dissertation, Johns Hopkins University.
Karim, M. R., & Zeger, S. L. (1992). Generalized linear models with random effects: Salamander mating revisited. Biometrics, 48, 631-644.
Kuk, A. Y. C. (1995). Asymptotically unbiased estimation in generalized linear models with random effects. Journal of the Royal Statistical Society, Series B, 57, 395-407.
Lee, V. E., & Smith, J. B. (1995). Effects of high school restructuring and size on early gains in achievement and engagement. Sociology of Education, 68, 241-270.
Lindstrom, M. J., & Bates, D. M. (1990). Nonlinear mixed effects models for repeated measures data. Biometrics, 46, 673-687.
Educational Testing Service.
Longford, N. T. (1994). Logistic regression with random coefficients. Computational Statistics & Data Analysis, 17, 1-15.
Longford, N. T. (1995). Random coefficient models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (pp. 519-577). New York: Plenum Press.
MacIver, D. J., & Epstein, J. L. (1995). Opportunities to learn: Benefits of algebra [remainder of title illegible]. Baltimore: The Johns Hopkins University, Center for the Social Organization of Schools.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman & Hall.
McGilchrist, C. A. (1994). Estimation in generalized mixed models. Journal of the Royal Statistical Society, Series B, 56, 61-69.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., & Teller, A. H. (1953). Equations of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087-1091.
Mohadjer, L. Y., Rust, K. F., Smith, V., & Severynse, J. (1993). Sample design and selection. In E. G. Johnson, J. Mazzeo, & D. L. Kline (Eds.), Technical report of the NAEP 1992 Trial State Assessment Program in mathematics (pp. 35-86). Washington, DC: National Center for Education Statistics.
Monk, D. H. (1994). Incorporating outcome equity standards into extant systems of educational finance. In R. Berne & L. O. Picus (Eds.), Outcome equity in education (pp. 224-246). Thousand Oaks, CA: Corwin Press.
Monk, D. H., & Haller, E. J. (1993). Predictors of high school academic course offerings: The role of school size. American Educational Research Journal, 30(1), 3-21.
Moses, R. P. (1994). Remarks on the struggle for citizenship and math/science literacy. Journal of Mathematical Behavior, 13, 107-111.
Moses, R. P., Kamii, M., Swap, S. M., & Howard, J. (1989). The Algebra Project: Organizing in the spirit of Ella. Harvard Educational Review, 59, 423-443.
Mullis, I. V. S. (1991). [title illegible] (Princeton, NJ: Educational Testing Service, National Assessment of Educational Progress, 1990 and 1992).
Mullis, I. V. S., Jenkins, F., & Johnson, E. G. (1994). [title illegible]. Washington, DC: National Center for Education Statistics.
National Center for Education Statistics. (1994). [title illegible]. Washington, DC: United States Government Printing Office.
National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: NCTM.
National Council of Teachers of Mathematics. (1993). Algebra for the [remainder of title illegible]. Reston, VA: NCTM.
Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370-84.
Oakes, J. (1989). What educational indicators?
The case for assessing the school context. Educational Evaluation and Policy Analysis, 11(2), 181-199.
Oakes, J. (1990). Multiplying inequalities: The effects of race, social class, and tracking on opportunities to learn math and science. Santa Monica, CA: RAND.
O'Day, J. A., & Smith, M. S. (1993). Systemic reform and educational opportunity. In S. Fuhrman (Ed.), Designing coherent education policy (pp. 250-312). San Francisco: Jossey-Bass Publishers.
Odell, P. L., & Feiveson, A. H. (1966). A numerical procedure to generate a sample covariance matrix. Journal of the American Statistical Association, 61, 198-203.
Olson, L. (1994, May 18). Algebra focus of trend to raise stakes. Education Week, pp. 1, 11.
Pelgrum, W. J., Voogt, J., & Plomp, T. (1995). Curriculum indicators in international comparative research. In [title illegible] systems (pp. 80-102). Washington, DC: OECD Publications and Information Center.
Porter, A. C. (1991). Creating a system of school process indicators. Educational Evaluation and Policy Analysis, 13(1), 13-30.
Porter, A. C. (1994). National standards and school improvement in the 1990s: Issues and promise. American Journal of Education, 102, 421-449.
Prosser, R., Rasbash, J., & Goldstein, H. (1991). ML3: Software for three-level analysis, users' guide. London: Institute of Education.
Raftery, A. L., & Lewis, S. (1992). How many iterations in the Gibbs sampler? In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics 4 (pp. 763-773). Oxford, UK: Clarendon Press.
Raftery, A. L., & Lewis, S. (1996). Implementing MCMC. In W. R. Gilks, S. Richardson, & D. J. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 115-130). London: Chapman & Hall.
Raudenbush, S. W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13, 85-116.
Raudenbush, S. W., Cheong, Y. F., & Fotiu, R. P. (1995). Synthesizing cross-national classroom effect data: Alternative models and methods. In M. Binkley, K. Rust, & M. Winglee (Eds.), Methodological issues in comparative international studies: The case of reading literacy (pp. 243-286). Washington, D.
C.: National Center for Educational Statistics.
Raudenbush, S. W., Fotiu, R. P., Cheong, Y. F., & Ziazi, Z. M. (1996). Inequality of access to educational resources: A national report card for eighth grade math. Unpublished manuscript now under review.
Raudenbush, S. W., Kasim, R. M., Eamsukkawat, S., & Miyazaki, Y. (1996). Social [remainder of title illegible]. Unpublished manuscript now under review.
Ripley, B. (1987). Stochastic simulation. New York: John Wiley and Sons.
Rodriguez, G., & Goldman, N. (1995). An assessment of estimation procedures for multilevel models with binary responses. Journal of the Royal Statistical Society, Series A, 158, 73-89.
SAS Institute, Inc. (1989). [title illegible]. Cary, NC: SAS Institute, Inc.
Schall, R. (1991). Estimation in generalized linear models with random effects. Biometrika, 78, 719-727.
Seltzer, M. H., Wong, W. H., & Bryk, A. S. (1996). Bayesian analysis in applications of hierarchical models: Issues and methods. Journal of Educational and Behavioral Statistics, 21(2), 131-167.
Searle, S. R., Casella, G., & McCulloch, C. E. (1992). Variance components. New York: John Wiley & Sons.
Silva, C. M., & Moses, R. P. (1990). The Algebra Project: Making middle school mathematics count. Journal of Negro Education, 59(3), 375-391.
Silver, E. A. (1995). Rethinking "algebra for all". Educational Leadership, 52, 30-33.
Sorenson, A. B., & Hallinan, M. T. (1977). A reconceptualization of school effects. Sociology of Education, 50, 273-289.
Spiegelhalter, D., Thomas, A., Best, N., & Gilks, W. (1994). BUGS: Bayesian inference using Gibbs sampling. Cambridge: MRC Biostatistics Unit.
Steen, L. A. (1992). Does everybody need to study algebra? Basic Education, 37(4), 9-13.
Stevenson, D. L., Schiller, K. S., & Schneider, B. (1994). Sequences of opportunities for learning. Sociology of Education, 67, 184-198.
Stiratelli, R., Laird, N., & Ware, J. H. (1984). Random-effects models for serial observations with binary response. Biometrics, 40, 961-971.
Tanner, M. A. (1993). Tools for statistical inference: Methods for the exploration of posterior distributions and likelihood functions (2nd ed.). New York: Springer-Verlag.
Tate, W. F. (1994). Mathematics standards and urban education: Is this the road to recovery? The Educational Forum, 58, 380-390.
Tierney, L. (1994).
Markov chains for exploring posterior distributions. Annals of Statistics, 22(4), 1701-1762.
Useem, E. L. (1992). Getting on the fast track in mathematics: School organizational influences on math track assignment. American Journal of Education, 100(3), 325-393.
Usiskin, Z. (1987). Why elementary algebra can, should, and must be an eighth-grade course for average students. Mathematics Teacher, 80, 423-437.
Wei, G. C. G., & Tanner, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. Journal of the American Statistical Association, 85, 699-704.
Wolfinger, R., & O'Connell, M. (1993). Generalized linear mixed models: A pseudo-likelihood approach. Journal of Statistical Computation and Simulation, 48, 233-243.
Yang, M. (1994). [title illegible]. Unpublished manuscript.
Zeger, S. L., Liang, K. Y., & Albert, P. S. (1988). Models for longitudinal data: A generalized estimating equation approach. Biometrics, 44, 1049-1060.
Zeger, S. L., & Karim, M. R. (1991). Generalized linear models with random effects: A Gibbs sampling approach. Journal of the American Statistical Association, 86, 79-86.