This is to certify that the thesis entitled

GENETIC PARAMETER ESTIMATION FROM SINGLE AND MULTIPLE TRAIT ANALYSES

presented by Terri Lynne Moore has been accepted towards fulfillment of the requirements for the M.S. degree in Animal Science.

Michigan State University

GENETIC PARAMETER ESTIMATION FROM SINGLE AND MULTIPLE TRAIT ANALYSES

By

Terri Lynne Moore

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Department of Animal Science

1990

ABSTRACT

The estimation of genetic parameters from unbalanced data with information on many traits poses severe computing problems. Subsets of traits may be repeatedly selected such that multitrait models are computationally manageable and biases in estimates are minimized.
Such biases, if they exist, may depend on the magnitude of the underlying true parameters. The current method of choice for estimating (co)variance components is derived from the normal density function. Its statistical properties may not hold if the data analyzed are not normally distributed. This work examined, by simulation, three potential sources of bias in estimating genetic parameters: the number of traits in the analysis, the magnitude of the underlying genetic parameters, and violation of normality assumptions. Results indicate that both accuracy and precision of genetic correlation estimates were dependent upon all three sources examined. The exception was that precision of heritability estimates was dependent only on the true underlying heritability value.

ACKNOWLEDGEMENTS

I'd like to thank my committee members for their guidance and support: Dr. Ivan Mao, Dr. Ted Ferris, and Dr. John Gill. They have all been very patient and understanding throughout the years I have been here.

Many of the good friends I have made here have encouraged and supported me over the past three years. In particular, Florah, Gwang-Joo, and Gustavo: they have all made the hard times a little easier to get through and could always put a smile on my face.

I would especially like to thank Just Jensen. He has more patience and understanding than anyone I know. We have been through a lot together and I hope one day I can give to him all that he has given to me.

I would like to thank Sheryl Hulet for helping in preparing this thesis, but more importantly for being a good friend. She has a way of making the work place a lot more bearable and almost even fun.

I would also like to thank my family for all their support. They have always been there when I have needed them, and I will definitely need them in the next four years!

TABLE OF CONTENTS

LIST OF TABLES
1. Introduction
   1.1 Number of Traits
   1.2 Underlying Population Parameters
   1.3 Normality Assumptions
2. Objectives
3. Review of Literature
   3.1 Introduction
   3.2 (Co)variance Component Estimation
      3.2.1 History
      3.2.2 EM-REML Method of Estimation
      3.2.3 Models
   3.3 Advantages of Multiple Trait Analysis
      3.3.1 Improved Accuracy and Reduction in Selection Bias
      3.3.2 Evaluation of All Animals
      3.3.3 Estimation of Covariance With No Crossproducts
   3.4 Limitations of Multiple Trait Analysis
      3.4.1 Computational Requirements
      3.4.2 Degree of Correlation Among Traits
4. Comparison of Genetic Parameter Estimates From Single and Multiple Trait Analyses
   4.1 Abstract
   4.2 Introduction
   4.3 Materials and Methods
      4.3.1 Simulation Procedure
      4.3.2 Statistical Analysis
   4.4 Results
      4.4.1 Biases
      4.4.2 Mean Square Errors
      4.4.3 Correlations
      4.4.4 Grouping of Traits
   4.5 Conclusions
5. Comparison of Genetic Parameter Estimates From Single and Multiple Trait Analyses When Underlying Distribution is
   Skewed
   5.1 Abstract
   5.2 Introduction
   5.3 Materials and Methods
      5.3.1 Simulation Procedure
      5.3.2 Statistical Analysis
   5.4 Results
      5.4.1 Biases
      5.4.2 Mean Square Errors
      5.4.3 Correlations
   5.5 Conclusions
6. Summary
   6.1 Heritability
   6.2 Genetic Correlations
   6.3 Sampling Subsets of Traits
Bibliography

LIST OF TABLES

Table
1. Underlying parameter structures investigated
2. Percent biased heritability and genetic correlation estimates
3. Average root mean square errors for heritability estimates
4. Average root mean square errors for genetic correlation estimates
5. Average root mean square errors of genetic correlation estimates falling in the range (.08-.10) and (.26-.48)
6. Correlations between estimates of heritability for two extreme values (.1, .8) for multiple trait and single trait analyses
7. Correlations between estimates of genetic correlations for all possible pairs of multiple trait analyses
8. Underlying parameter structures investigated for skewed traits
9. Number of analyses run and number of estimates obtained from each replicate of each parameter combination
10.
Percent biased estimates for those situations having a greater than expected number (P<.05)
11. Average root mean square errors for heritability estimates from skewed underlying distributions
12. R² values and standardized partial regression coefficients for three multiple regression models with standard error of heritability estimate as the dependent variable
13. Average root mean square errors of genetic correlation estimates from skewed underlying distributions
14. R² values and standardized partial regression coefficients for three multiple regression models with standard error of genetic correlation estimates as the dependent variable
15. Correlations of heritability estimates for two extreme values (.1, .8) between multiple and single trait analyses
16. Correlations of genetic correlation estimates when one trait is skewed between different multiple trait analyses

1. INTRODUCTION

Genetic parameter estimates, such as heritabilities and genetic and phenotypic correlations, are obtained from estimates of (co)variance components, usually from large unbalanced data sets with information on many traits. The use of all data on all traits often leads to models that are computationally demanding. To alleviate this difficulty, the data may be sampled by including only a subset of the traits of interest. The current method of choice for the estimation of genetic parameters is Restricted Maximum Likelihood (REML), which is derived from the normal density function. If the data to be analyzed are not normally distributed, estimation results may not have the desirable statistical properties of REML. These are two of a number of factors that may potentially affect the accuracy and precision of genetic parameter estimates from multiple trait analyses (MTA).
An understanding of these possible sources of bias should enable one to develop strategies for sampling subsets of traits that yield high estimation accuracy and precision while minimizing computational requirements.

1.1 Number of Traits

Studies in the literature have indicated that the accuracy of genetic parameter estimates may depend on the number of traits in an MTA. Buttazzoni and Mao (1989) examined the estimates of sire and residual variance components and heritability estimates from both single and multiple trait analyses of the same data set and found that while residual components from MTA were slightly greater than those from single trait analyses, sire variance components were consistently much greater. Lin and Lee (1986) compared estimates from single trait and multiple trait analyses of the same data set, in addition to looking at the effects of sequentially adding traits to a mixed model on parameter estimation by MTA. Their results suggested that parameter estimates may vary depending on the type of analysis (single or multiple trait) and upon the other traits included in an MTA. They found that heritability estimates of a given trait, or genetic correlation estimates of two given traits, change as additional traits are added to or deleted from an MTA. Walter and Mao (1985) found similar results for genetic correlations from single trait and two-trait analyses. This suggests that differences in parameter estimates reflect the joint contribution of other correlated traits which are omitted in subset MTA with a smaller number of traits. This is a direct consequence of using different (co)variance matrices. Therefore, it appears that genetic and phenotypic parameter estimates are conditional upon the other traits included in simultaneous analyses. Schaeffer and Wilton (1981) reported similar findings, in that differences they found in the sign of the correlations between sire proofs depended upon the number of traits in an MTA.
1.2 Underlying Population Parameters

The magnitude of the population parameters from which the sample was obtained may affect the accuracy and precision of genetic parameter estimates. Schaeffer (1984) studied reductions in prediction error variances (PEV) from two-trait models over single trait models using various combinations of genetic and residual correlations. He found that the percentage increase in accuracy was dependent on the difference between genetic and residual correlations, implying that the ability of MTA to increase accuracy of estimation is dependent on the levels of correlations used. Walter and Mao (1985) compared (co)variance REML estimates under various genetic and residual correlations in simulated populations. Results indicated that while estimates of residual variances were consistent across different levels of genetic and residual correlations, estimates of sire variances tended to decrease as genetic correlation increased.

1.3 Normality Assumptions

A violation of normality assumptions may have an effect on biasedness in (co)variance component estimates. Both Maximum Likelihood (ML) and REML procedures require random effects contributing to the observation vector to be random samples from underlying normal populations. This may not be the case in many instances. Traits such as calving difficulty, litter size, and conformation are either subjectively scored or categorized due to the discrete nature of the units of measurement. The symmetry of the distribution is, therefore, dependent upon the frequencies within each class. However, such traits were assumed to be normally distributed when applying ML or REML. The effects of selection may also cause the distribution of a random factor to be skewed (Banks and Mao, 1985). Cows are culled at various stages in their lifetime for a variety of reasons. Thus, the population of older cows is more likely a selected population. This could cause skewness of residuals in the model.
Intense selection in the male population and groups of half siblings could cause skewness in the sire distribution. Buttazzoni and Mao (1989) indicated that the discrepancies found between single trait and multiple trait estimates appeared to be inversely related to the magnitude of the estimates and directly related to the skewness of the residuals. Banks and Mao (1985) examined the dispersion and asymptotic biasedness properties of variance component estimates through REML for single trait analyses when sire and residual variances were skewed. Results indicated that the method of estimation appeared robust to skewed distributions in terms of accuracy, but not in terms of precision, as sampling variances of the estimates were greater in skewed distributions.

2. OBJECTIVES

The objectives of this work were to examine, by simulation, three possible sources of bias in genetic parameter estimates from single and multiple trait analyses: 1) the number of traits included in an analysis, 2) the magnitude of the underlying parameters, and 3) the effect of violation of normality assumptions. If patterns of bias can be found, guidelines for sampling subsets of traits that would yield estimates with optimal properties while minimizing computational requirements may be developed.

3. REVIEW OF LITERATURE

3.1 Introduction

Multiple trait analysis (MTA) utilizes information from all traits to estimate (co)variances and evaluate animals through genetic and environmental correlations between traits. Advances in computer technology and concern about effects of selection on traits measured have resulted in increased interest in multiple trait models. Simplifications in computing often can be made for specific types of multiple trait models, such as when every animal is measured for each trait or if the same model can be used for all traits (Meyer, 1986).
Estimates of genetic parameters obtained through MTA have been studied by several workers (Schaeffer and Wilton, 1981; Walter and Mao, 1985; Lin and Lee, 1987). There is little known, however, about properties of these estimates, in terms of accuracy and precision, and factors affecting these two properties. Factors which may have a varying influence in REML estimation on the accuracy and precision of the estimates include the magnitude of correlations of genetic elements of traits, the number of traits in an analysis, and the robustness of the REML procedure against severe violations of distribution assumptions.

3.2 (Co)variance Component Estimation

3.2.1 History

Genetic parameter estimates such as heritabilities and genetic and phenotypic correlations are obtained from estimates of (co)variance components. Several methods for (co)variance component estimation exist. Searle (1989) provides an extensive review of variance component estimation methods. The methods reviewed include Analysis-of-Variance (ANOVA) methods for balanced data, and Henderson's methods 1, 2 and 3, Minimum Norm Quadratic Unbiased Estimation (MINQUE), Maximum Likelihood (ML) and Restricted Maximum Likelihood (REML) for use on unbalanced data.

For many years analysis-of-variance (ANOVA) and analysis-of-covariance (ANCOVA) estimations were the standard procedures to estimate genetic and environmental (co)variances for both balanced and unbalanced data. For the balanced case, ANOVA estimators are translation invariant, minimum variance quadratic unbiased, and can be used regardless of distributional properties. But they can also yield negative estimates. When normality is assumed, the estimators are not just minimum variance quadratic unbiased but are minimum variance unbiased. For the unbalanced case, however, the only known properties of ANOVA and ANCOVA methods were translation invariance, i.e.
invariant to the fixed effects in the model, and unbiasedness, but even the latter property no longer holds for populations under selection.

MINQUE of Rao (1971) was the first attempt to minimize sampling variances, i.e. the variances of the estimates, in the class of translation-invariant quadratic unbiased estimators. The smaller the sampling variance of an estimator, the more efficient it is.

Hartley and Rao (1967) developed the first general method for dealing with unbalanced data in ML estimation, which demands the assumption of some known form of distribution function for the data vector. The ML procedure of Hartley and Rao yields simultaneous estimation of both the fixed effects and the variance components. ML estimators are derived by maximizing the likelihood over the parameter space, which is non-negative as far as variance components are concerned. ML estimators of variance components, however, do not lead to the estimators derived from ANOVA methods, since the latter can take negative values. The difference concerns divisors of certain mean squares, resulting in some of the solutions being biased estimators, i.e., the ML estimators of the variance components take no account of the loss in degrees of freedom (d.f.) resulting from the estimation of the fixed effects.

Patterson and Thompson (1971) extended this procedure to Restricted Maximum Likelihood (REML), modifying the ML procedure of Hartley and Rao by adapting a transformation which partitions the likelihood under normality into two parts, one due to the fixed effects, and one due to error contrasts free of the fixed effects, i.e., contrasts with expectation independent of the fixed effects. Maximizing this latter likelihood yields what are called REML estimators. Patterson and Thompson (1971) described this procedure for the univariate case. Thompson (1973) subsequently extended it to the multivariate case.
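The degrees-of-freedom point made for ML and REML can be illustrated in the simplest possible setting: a model with fixed effects only and a single residual variance. In that case the ML estimator divides the residual sum of squares by n, while the REML estimator divides by n minus the rank of X. The sketch below is only an illustration of that special case; the function name and the toy data are hypothetical and are not taken from the thesis.

```python
import numpy as np

def ml_and_reml_residual_variance(X, y):
    """Residual variance estimates for y = Xb + e with fixed effects only.

    ML divides the residual sum of squares by n; REML divides by
    n - rank(X), accounting for d.f. used to estimate the fixed effects.
    """
    n = X.shape[0]
    b_hat = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares fixed effects
    rss = float(np.sum((y - X @ b_hat) ** 2))      # residual sum of squares
    rank = np.linalg.matrix_rank(X)
    return rss / n, rss / (n - rank)

# Hypothetical toy data: four records and an overall mean as the only fixed effect.
X = np.ones((4, 1))
y = np.array([1.0, 2.0, 3.0, 4.0])
ml, reml = ml_and_reml_residual_variance(X, y)
```

With these toy numbers the residual sum of squares is 5, so the ML estimate is 5/4 and the REML estimate is 5/3; the ML value is smaller because it ignores the one degree of freedom spent on the mean.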
A comprehensive review of ML approaches to variance component estimation, their properties, and problems of application was presented by Harville (1977). Searle (in Paper BU-673-M, Biometrics Unit, Cornell University, 1979) gave a detailed account of ML and related procedures, summarizing and comparing the algebra for alternative approaches in the univariate case. Both authors emphasized REML. Thompson (1982) discussed the use of REML to estimate genetic parameters in animal breeding. A number of studies reported REML algorithms for specific analyses in this field (Thompson, 1977; Schaeffer et al., 1978; Lin and Lee, 1986).

3.2.2. EM-REML Method of Estimation

The use of REML in the estimation of (co)variance components has become increasingly popular due to its desirable statistical properties. REML, like ML, has the property of invariance under translation. Furthermore, REML estimators have the additional property of reducing to ANOVA estimators for many, if not all, cases of balanced data, unlike the ML estimators of Hartley and Rao (1967) (Corbeil and Searle, 1976). This additional property is a useful one because of the optimal properties of ANOVA estimators from balanced data, particularly minimum variance properties.

A second additional property of REML, as indicated by Meyer and Thompson (1982) and Henderson (1987), is that it may have considerable power to eliminate selection bias due to culling. However, all data used in selection decisions must be included in the analysis if the selection bias is to be eliminated.

The REML sets of equations, as with ML, are non-linear in the variance component estimators and must be solved numerically, usually by iteration. Hence, computational requirements are extensive. Several algorithms for obtaining REML estimates of variance components exist and can be classified according to the information from derivatives of the likelihood function that is utilized: first and second derivatives, first only, and derivative-free.
Procedures such as Fisher's method of scoring and Newton-Raphson require expected values of second derivatives. Expectation-Maximization (EM)-type algorithms exploit first derivative information. Derivative-free approaches obtain REML estimates by direct maximization of the likelihood function using standard optimization procedures.

The EM algorithm (Dempster et al., 1977) has been the most frequently used due to the relative ease of programming required, expressions that are intuitively easy to understand, and the guarantee that estimates are in the parameter space. The latter property is not a feature of other algorithms for ML or REML.

The EM algorithm is iterative in nature and is directed at finding values of the parameter vector, $\phi$, which maximize the density $g(y \mid \phi)$ given an observed y, but it does so by making use of the associated family $f(x \mid \phi)$ that is thought to represent the population from which the sample comes. Each iteration of the algorithm consists of an expectation step followed by a maximization step. The expectation step assumes that the current estimate is equal to the true parameter and, given the observation vector, computes the expectations of quadratic forms. The maximization step then computes the new estimates by simply dividing the expectations of the quadratic forms by the suitable d.f.

However, the EM algorithm, in general, converges slowly. Several attempts have been made to speed up convergence, such as the common intercept approach (Schaeffer, 1979), the use of relaxation factors, or non-linear adjustment (Misztal and Schaeffer, 1986), which reduce the number of iteration rounds required for convergence. Other approaches to reduce the amount of computation in each round of iteration involve the use of canonical, Householder, and Cholesky transformations applied to different elements of the EM algorithm.

3.2.3 Models

For any statistical analysis there is an assumed or implied model.
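Linear models are extensively used because they are easily applied. Consider t traits and let the model for each be denoted as:

$$ y_i = X_i b_i + Z_i u_i + e_i \qquad (i = 1, \ldots, t) \qquad [1] $$

where $y_i$ is an observation vector for trait i of length $n_i$; $b_i$ is an unknown vector of fixed effects for the ith trait of length $p_i$; $u_i$ is an unknown vector of random effects for the ith trait of length $q_i$; $e_i$ is an unknown vector of random residuals corresponding to $y_i$; and $X_i$ and $Z_i$ are observed incidence matrices of order $n_i \times p_i$ and $n_i \times q_i$, respectively.

If the design matrices are equal for all t traits, i.e. if $X_i = X$ and $Z_i = Z$, then the model for all t traits simultaneously can be written as a direct extension of [1]:

$$ y = (I_t * X)\,b + (I_t * Z)\,u + e \qquad [2] $$

where $y = \mathrm{vec}\,Y$, "*" denotes the direct product operation (Searle, 1982), $b' = [b_1', b_2', \ldots, b_t']$, $u' = [u_1', u_2', \ldots, u_t']$, and $e' = [e_1', e_2', \ldots, e_t']$.

The expectations, E( ), and (co)variances, V( ), are:

$$ E(y) = (I_t * X)\,b, \qquad E(u) = 0, \qquad E(e) = 0 $$

$$ V(u) = V\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_t \end{bmatrix} = G = \begin{bmatrix} G_{11} & G_{12} & \cdots & G_{1t} \\ & G_{22} & \cdots & G_{2t} \\ & & \ddots & \vdots \\ \text{symmetric} & & & G_{tt} \end{bmatrix} $$

With one random factor per trait with the same number of classes, then

$$ G = \begin{bmatrix} g_{11} I_s & g_{12} I_s & \cdots & g_{1t} I_s \\ & g_{22} I_s & \cdots & g_{2t} I_s \\ & & \ddots & \vdots \\ \text{symmetric} & & & g_{tt} I_s \end{bmatrix} = I_s * G_0, \qquad \text{where } G_0 = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1t} \\ & g_{22} & \cdots & g_{2t} \\ & & \ddots & \vdots \\ \text{symmetric} & & & g_{tt} \end{bmatrix} $$

Similarly,

$$ V(e) = V\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_t \end{bmatrix} = R = \begin{bmatrix} r_{11} I_n & r_{12} I_n & \cdots & r_{1t} I_n \\ & r_{22} I_n & \cdots & r_{2t} I_n \\ & & \ddots & \vdots \\ \text{symmetric} & & & r_{tt} I_n \end{bmatrix} = I_n * R_0, \qquad \text{where } R_0 = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1t} \\ & r_{22} & \cdots & r_{2t} \\ & & \ddots & \vdots \\ \text{symmetric} & & & r_{tt} \end{bmatrix} $$

Then

$$ V(y) = V = Z (I_s * G_0) Z' + I_n * R_0 $$

The model is therefore characterized by the assumed structures of G and R. Under these assumptions, the mixed model equations that would yield the best linear unbiased estimator (BLUE) of the fixed effects and the best linear unbiased predictor (BLUP) of the random effects can be written as (Henderson, 1973):

$$ \begin{bmatrix} X'X * R_0^{-1} & X'Z * R_0^{-1} \\ Z'X * R_0^{-1} & Z'Z * R_0^{-1} + I_s * G_0^{-1} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} (X' * R_0^{-1})\,y \\ (Z' * R_0^{-1})\,y \end{bmatrix} \qquad [3] $$

where G and R contain a priori estimates in [3].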
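The construction of the mixed model equations in [3] can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the program used in this work: the function name `solve_multitrait_mme` is hypothetical, NumPy is assumed, and records are ordered trait by trait (all records for trait 1, then trait 2, and so on), so that with NumPy's `kron` convention the covariance matrices are formed as `kron(G0, I_s)` and `kron(R0, I_n)`. That ordering differs from the order of the direct-product factors written in the text only by a permutation of rows and columns.

```python
import numpy as np

def solve_multitrait_mme(X, Z, Y, G0, R0):
    """Solve multiple-trait mixed model equations with equal design matrices.

    X (n x p) and Z (n x s) are shared by all t traits; Y is n x t.
    G0 and R0 are t x t (co)variance matrices among traits for the
    random factor and the residuals (a priori values).
    Assumed ordering: trait-major, i.e. y = vec(Y) stacks the columns of Y.
    """
    n, t = Y.shape
    s = Z.shape[1]
    y = Y.reshape(-1, order="F")                  # vec(Y)
    Xt = np.kron(np.eye(t), X)                    # I_t (x) X
    Zt = np.kron(np.eye(t), Z)                    # I_t (x) Z
    R_inv = np.kron(np.linalg.inv(R0), np.eye(n))
    G_inv = np.kron(np.linalg.inv(G0), np.eye(s))

    W = np.hstack([Xt, Zt])
    p_total = Xt.shape[1]
    lhs = W.T @ R_inv @ W                         # coefficient matrix
    lhs[p_total:, p_total:] += G_inv              # add G^{-1} to the random block
    rhs = W.T @ R_inv @ y

    # lstsq also returns a solution when the fixed-effect part is not full rank.
    sol = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
    return sol[:p_total], sol[p_total:]
```

The dense Kronecker products are used only for readability. For data sets of realistic size one would avoid forming them explicitly, which is the motivation for the transformations discussed in Section 3.4.1.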
Let C, a generalized inverse of the coefficient matrix in [3], be partitioned such that $C_{ij}$ denotes the submatrix of C corresponding to the ith and jth subvectors of u. Utilizing expressions by Dempster et al. (1977), the EM algorithm to estimate the ijth elements in G and R is:

$$ g_{ij}^{(k+1)} = \left[ \hat{u}_i^{(k)\prime} \hat{u}_j^{(k)} + \mathrm{tr}\left(C_{ij}^{(k)}\right) \right] / q \qquad [4] $$

$$ r_{ij}^{(k+1)} = \left[ \hat{e}_i^{(k)\prime} \hat{e}_j^{(k)} + \mathrm{tr}\left(B_{ij}^{(k)}\right) \right] / n \qquad [5] $$

for the kth round of iteration, where $B_{ij}$ is the submatrix of $WCW'$ corresponding to the ijth pair of traits, and $W = [X : Z]$. These expressions are due to Henderson (1984), and correspond to the REML estimators given by Patterson and Thompson (1971) and Harville (1977), but extended to multiple traits.

3.3 Advantages of Multiple Trait Analysis

3.3.1. Improved Accuracy and Reduction in Selection Bias

In the statistical analysis of animal breeding data, traits are often considered one at a time without considering other traits measured on the same individual. Usually one is interested, however, not only in the mode of inheritance of a particular trait, but also in its relationships with other traits and the expected changes in the latter when selecting on the particular trait analyzed. For these cases multivariate analyses are required to obtain estimates of genetic and phenotypic correlations between traits.

Moreover, while univariate analyses implicitly assume that all correlations are zero, joint analyses of correlated traits utilize information from all traits to obtain estimates for a specific trait and are thus likely to yield more accurate results. This is of particular relevance when data are not a random sample. For animal breeding data this is often the case since, typically, data originate from selection experiments or are field records from livestock improvement schemes which select animals on the basis of performance. Usually one or more traits have undergone selection resulting in missing observations for some traits.
This is particularly true in sequential culling, where observations on one trait are used for selection, and the selected group of animals then is measured for a subsequent trait. In these situations, univariate analyses are expected to be biased while multivariate analyses may account for selection.

Pollack et al. (1984) examined the ability of multiple trait methods to reduce or eliminate selection bias, arising either from sequential selection or from selection on a correlated trait. Results indicated that in both cases, bias in the single trait evaluation was eliminated by multiple trait procedures. Henderson (1975) showed that multiple trait models account for selection bias if selection is described as a translation invariant function of the traits that had been used to make selection decisions.

Analysis-of-variance methods have been used widely to estimate genetic and phenotypic correlations. These require records for all traits for all individuals. If there are missing records, this implies that part of the relevant information is ignored. If the lack of records is the outcome of selection based on some criterion correlated to the trait(s) under analysis, estimates are likely to be biased by selection (Meyer, 1989). In contrast, ML estimation procedures utilize all records available and, under certain conditions, account for selection. Essentially, it is required that all information on which selection decisions have been based, unless totally uncorrelated, be included in the analysis. Even if these conditions are only partially fulfilled, ML estimates are often considerably less biased by selection than their ANOVA counterparts (Meyer and Thompson, 1984).

3.3.2. Evaluation of All Animals

A second advantage of MTA that has been noted by several authors (Schaeffer and Wilton, 1981; Schaeffer, 1984) is that it allows every animal to be evaluated for all traits without actually being observed for all traits.
This is due to the non-zero genetic and residual covariances among traits that are incorporated into the analysis. Therefore, in multiple trait analysis, the evaluation of an animal for a trait is composed of contributions from all traits in the analysis. For example, a sire that has no progeny recorded for calving ease can have a calving ease evaluation which is based upon genetic correlations of calving ease with the other traits in the analysis. This cannot be accomplished with single trait analyses unless the relationship matrix is used (Schaeffer, 1984).

The correlation between error effects for different traits will have a direct effect on the contribution from an observation on a trait (Schaeffer, 1984). As the absolute value of the error correlation increases, the weight on observations from other traits also increases. Therefore, in some instances, multiple trait evaluations could be greatly different from single trait evaluations because of correlation among traits. As reported by Schaeffer and Wilton (1981), the accuracy of the evaluation will also be dependent upon the number of progeny with observations on the other traits.

3.3.3. Estimation of Covariance with No Crossproducts

A third advantage of MTA is the estimation of covariance components when crossproducts or sums of two traits do not exist and the same linear model is not possible for both traits. Usually, covariance components are estimated between traits measured on the same individual with the same linear model being assumed for each trait. Consider, for example, yearling weights on male and female beef calves. A different model is appropriate for each sex. Also, to estimate the covariance between male and female yearling weights, a procedure using crossproducts and sums on each individual is not possible. If an interaction of sire-by-sex of calf is present, the sire component of covariance would yield a genetic correlation less than unity.
This would mean that sires need to be evaluated from their female and male progeny separately. The estimation of such a covariance would require a multiple trait procedure.

3.4 Limitations of Multiple Trait Analysis

3.4.1. Computational Requirements

One limitation of applying multiple trait analysis is the increased number of equations to be solved. Computations are usually cumbersome, time consuming, and costly. One would need to weigh the gain in accuracy, with special reference to the elimination of selection bias, against the extra computing effort. There are situations, however, in which models for different traits cannot be the same and/or different traits cannot be measured on the same individuals; multiple trait methods then become a necessity.

Restricted maximum likelihood estimation requires the inversion of the entire coefficient matrix of the mixed model equations. By direct inversion of the coefficient matrix, the time required becomes proportional to the cube of the order of the coefficient matrix (n): CPU ~ $cn^3$, where c is a constant. Strategies have been developed that alleviate this problem. One of them is to eliminate the fixed effects in the model by absorption, so that solutions are required only for the random factors in the model.

Other shortcuts applied to calculations have been developed that depend on the particular model being used. Many of these shortcuts involve the use of transformations applied to different elements in the EM algorithm. In the case where the model is assumed to contain only one random factor and the same model is used for all traits, canonical transformation can be applied. The purpose of canonical transformation is to obtain a set of canonical variates, between which all covariances are zero, without loss of any information contained in the original variables. Variance components are estimated for each canonical trait using single trait analysis, and then transformed back to the original scale.

3.4.2.
Degree of Correlation Among Traits

The utilization of MTA would seem to be most advantageous when the absolute values of the correlations between traits are high, so that information on each trait would contribute more to the accuracy of estimation and/or prediction in other traits. However, as indicated by Meyer (1985), Hill and Thompson (1978), Seal (1966) and others, in the analysis of highly correlated traits there is a strong chance, depending on the method of estimation and the amount of data, that the estimated (co)variance matrices are not within the allowable parameter space. For the estimated (co)variance matrix to be in the allowable parameter space, it must be positive definite. In genetic applications, (co)variance matrices must be positive definite. The probability of obtaining non-positive definite (co)variance matrices increases with the number of traits.

4. Comparison of Genetic Parameter Estimates From Single and Multiple Trait Analyses

4.1 ABSTRACT

Discrepancies between estimates of genetic parameters from single and multiple trait analysis of the same data set were examined by simulation. Two possible causes were studied: the magnitude of the underlying genetic parameters, and the number of traits in the analysis. Different situations were simulated to cover a range of heritabilities as well as genetic correlation structures. A model with fixed management group effects and random sire effects was used to simulate records for four traits. All animals were recorded for all traits and the same model was used for each trait. Genetic parameters were estimated using an EM-REML algorithm with canonical and Householder transformations. Overall, no significant biases were found in heritability estimates. Biases in genetic correlations tended to occur more when the underlying correlation was negative.
Inclusion of correlated traits in a multiple trait analysis did not increase accuracy of heritability estimates but did increase accuracy of estimates of genetic correlations. Correlations between the estimates for both heritability and genetic correlations for different analyses were all high (.78 and greater). Correlations between the estimates for heritability were highest between two trait and single trait analyses and lowest between four trait and single trait analyses. Correlations between the estimates of genetic correlations were highest between four trait and three trait analyses and lowest between four trait and two trait analyses. Correlations for both heritability and genetic correlation estimates increased as the underlying heritability increased and were highest under a positive genetic correlation.

4.2 INTRODUCTION

Unlike single trait analysis (STA), multitrait analysis (MTA) takes into account all correlations among traits in estimating genetic and phenotypic parameters. The advantage of MTA over STA is the increase in estimation accuracy of components of (co)variance, especially from populations undergoing selection (Walter and Mao, 1985). Differences in estimates resulting from the two methods can be expected. Some studies (Schaeffer and Wilton, 1981; Lin and Lee, 1986; Buttazzoni and Mao, 1989) report large changes in the magnitude and even the sign of the estimates when applying the two methods to the same data set.

Biases such as these may result from several factors. Estimates of genetic parameters may be biased by selection if not all the data used in selection decisions are included in the analysis. Biases may also occur in estimates of the parameters if some or all of the assumptions of the model are violated, for example, normality. The true population parameters may also affect the accuracy of the estimates, for example, the degree of association between traits.
Finally, the number of traits one decides to include in a multitrait analysis may also have an effect on whether the estimates are biased or not.

Even though estimation and/or prediction accuracy through MTA is greater, one limitation of applying MTA is the increased complexity and computing requirements. The gain in accuracy relative to the extra computing effort needs to be considered. It would be desirable to group traits within an analysis that would be computationally feasible and yet still achieve a high degree of accuracy.

The objective of this study was to examine causes of discrepancies between genetic parameter estimates from mixed models using EM-REML methods of estimation in simulated unselected populations. The causes examined in this study were the magnitude of the underlying genetic parameters and the number of traits included in the analysis. As a result of examining these two causes, it was also possible to study the effect of grouping traits for computational feasibility so that the highest accuracy possible could be achieved.

4.3 MATERIAL AND METHODS

4.3.1 Simulation Procedure

Records were generated for four traits using the same model for each trait. For each trait, the model was:

$y_{ijk} = m_i + s_j + e_{ijk}$

where $y_{ijk}$ was a record on the kth progeny of sire j in the ith management group; management group effect (m) was fixed whereas sire effect (s) was random, and $e_{ijk}$ was the random residual. All animals had records for all four traits. Sire and residual components were generated from a multivariate normal distribution with expected values $E(s_j) = 0$ and $E(e_{ijk}) = 0$ [...]

Table 4. Average root mean square errors of genetic correlation estimates.

[...] analysis. Trends found in RMSE's for genetic correlation estimates can be seen more easily when values of similar magnitude are grouped together as indicated in Table 5, where the values have been separated into three ranges (.08-.10), (.17-.20), and (.26-.48). Only those values in the two extreme ranges (.08-.10) and (.26-.48) are shown.

RMSE's were smallest when traits were highly correlated (.8) and moderately to highly heritable (.3,.6,.8). Within this group, RMSE's tended to decrease as the heritability value increased, indicating that precision of covariance estimation between strongly associated traits increases as the heritability value of the two traits increases. Also, RMSE's stayed almost constant as the number of traits in the analysis changed, indicating that precision of covariance estimation between strongly correlated, highly heritable traits does not depend on the amount of extra information obtained when more traits are included in the analysis.

RMSE's for genetic correlations were largest (.26-.48) in those combinations where either one or both traits had a low heritability value and traits were weakly correlated. RMSE's under these conditions tended to decrease as the heritability value of one of the traits increased, again indicating that precision of covariance estimation, in general, increases as heritability increases. Also, RMSE's tended to decrease as the number of traits in the analysis increased, opposite to what was observed for strongly correlated traits. The same trends observed in the RMSE's for the two extreme ranges of genetic correlations can also be seen for those RMSE's falling in the medium range.

Table 5. Average root mean square errors of genetic correlation estimates falling in the ranges (.08-.10) and (.26-.48).

4.4.3 Correlations

Table 6 shows correlations between estimates of heritability for two extreme values (.1,.8) for all possible pairs of multiple and single trait analyses. Correlations were high for all levels of heritabilities, indicating high repeatability in the estimation of heritability from different analyses. Slight trends were found in the correlations according to the level of heritability, the underlying genetic correlation, and the number of traits in the analyses. As expected, correlations were highest for analyses differing by only one trait. For heritability estimates, correlations were highest between a two trait analysis and a single trait analysis and lowest between a four trait analysis and a single trait analysis. Correlations tended to be slightly higher for large heritabilities. Under a positive genetic correlation, correlations between heritability estimates also tended to be higher.

Table 7 shows correlations between estimates of genetic correlations for all pairs of multiple trait analyses. Trends were similar to those mentioned above for heritabilities. For all levels of genetic correlations, correlations tended to be highest between four trait and three trait analyses and lowest between four trait and two trait analyses. Correlations increased as the heritability of the traits increased, and they tended to be higher under positive genetic correlations than under negative genetic correlations.
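The summary statistics discussed in Sections 4.4.2 and 4.4.3 can be illustrated with a minimal sketch. It assumes that RMSE is taken over replicate estimates against a reference parameter value and that the correlation between two analyses is a Pearson correlation over estimates paired by replicate. The function names, the replicate count of 50, and the synthetic numbers are hypothetical and are not taken from the study.

```python
import numpy as np


def summarize(estimates, reference_value):
    """Return (mean bias, root mean square error) of replicate estimates."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - reference_value
    rmse = np.sqrt(np.mean((est - reference_value) ** 2))
    return bias, rmse


def between_analysis_correlation(est_a, est_b):
    """Pearson correlation between two analyses' estimates, paired by replicate."""
    return float(np.corrcoef(est_a, est_b)[0, 1])


# Synthetic illustration only: two analyses of the same replicates share most
# of their sampling error, so their estimates are highly correlated.
rng = np.random.default_rng(0)
reference_h2 = 0.3                                # hypothetical parameter value
shared_error = rng.normal(0.0, 0.08, size=50)     # common to both analyses
h2_single = reference_h2 + shared_error + rng.normal(0.0, 0.01, size=50)
h2_two_trait = reference_h2 + shared_error + rng.normal(0.0, 0.01, size=50)

bias, rmse = summarize(h2_single, reference_h2)
r = between_analysis_correlation(h2_single, h2_two_trait)
print(f"bias={bias:.3f}  rmse={rmse:.3f}  correlation={r:.3f}")
```

Under this reading, a high correlation between analyses together with similar RMSE's means that the analyses rank replicates alike and are about equally precise, which is the sense in which the text calls estimates repeatable.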
Table 6. Correlations between estimates of heritability for two extreme values (.1,.8) for multiple trait and single trait analyses.

Table 7. Correlations between estimates of genetic correlations for all possible pairs of multiple trait analyses.

4.4.4 Grouping of traits

Trends found in correlations between the estimates from different analyses and trends in RMSE's for both heritabilities and genetic correlations can be used to evaluate the types of traits that, when grouped together, would result in higher estimation accuracy of the genetic parameters. However, results can only apply to those situations where traits are equally correlated. For highly correlated traits, inclusion of additional traits in a multiple trait analysis does not seem to increase accuracy of genetic parameter estimates.
For weakly correlated or negatively correlated traits, inclusion of additional traits, especially with larger heritabilities, may increase accuracy and repeatability of the estimates of covariance. The use of a multiple trait model over a single trait model for the estimation of heritability does not seem to provide any increase in accuracy or precision of the estimates.

4.5 CONCLUSIONS

Results indicate that heritability estimates of a given trait do not vary from single trait to multitrait analysis and do not vary across different levels of positive and negative genetic correlations when all traits are equally correlated. Therefore, STA was as accurate as MTA for estimating heritabilities. For genetic correlation estimates, biases were most often found for negative correlations. Under these conditions, results indicate that when all traits are equally correlated, estimates of genetic correlations could vary depending on the number of traits in the analysis and the magnitude of the underlying genetic correlation. In terms of absolute values, genetic correlation estimates were biased downwards for stronger negative correlations and biased upwards when traits were weakly correlated (negatively).

Root mean square errors of heritability estimates increased as the underlying heritability increased but did not vary from single trait to multiple trait analysis. Therefore, precision in heritability estimation was not dependent on the type of analysis, STA or MTA. RMSE's for genetic correlation estimates had a much wider range than RMSE's for heritabilities, the magnitude of which depended on the underlying genetic correlation, the heritability of the two traits involved, and the number of traits in the analysis. For all levels of genetic correlations (positive and negative), precision in covariance estimation tended to increase as the heritability values of the traits increased.
Under weak genetic correlations, precision of estimation of covariance tended to increase as the number of traits in the analysis increased.

Correlations between the estimates of heritability and genetic correlations from different analyses were all high. Correlations for heritability estimates tended to be higher than those for genetic correlation estimates. Correlations between estimates of heritability were highest between two trait and single trait analyses and increased as the heritability increased. Correlations between estimates of genetic correlations were highest between four trait and three trait analyses and also tended to increase as the heritability values of the traits involved increased. Correlations of the estimates for both heritability and genetic correlations were lowest under a negative genetic correlation.

For weakly correlated or negatively correlated traits, inclusion of additional traits in a multitrait model seems to increase estimation accuracy of the covariances, especially if heritability values of the traits are large. For highly correlated traits, no gain in accuracy of genetic parameter estimates occurred when additional traits were added to the model. In order to be able to choose the best grouping or sampling strategy of traits, additional work needs to be done in cases where traits are not equally correlated.

5. Comparison of Genetic Parameter Estimates From Single and Multiple Trait Analyses When Underlying Distribution is Skewed

5.1 ABSTRACT

Discrepancies between single trait and multiple trait estimates of genetic parameters when the underlying distribution is skewed were examined by simulation. A model with fixed management group effects and random sire effects was used to simulate records for four traits. Residual variance effects for one of the four traits were skewed for five different levels of heritability. A total of 24 situations with varied levels of genetic correlations were examined.
Genetic parameters were estimated using an EM-REML algorithm with canonical and Householder transformations. The degree of skewness used had no effect on heritability estimates for either single trait or multiple trait analyses, but did affect the number of biased estimates of genetic correlations when the underlying heritability value of the skewed trait was small. Degree of skewness had no effect on the magnitudes of mean square errors for either heritability or genetic correlation estimates. Correlations of the estimates for heritability and genetic correlations between single trait and different multiple trait analyses were at least .78 for both skewed and non-skewed traits. For all heritability levels, correlations were highest between two trait and single trait analyses and lowest between four trait and single trait analyses for both skewed and non-skewed traits. Correlations for the estimates of genetic correlations were highest between four trait and three trait analyses and lowest between four trait and two trait analyses, for both skewed and non-skewed traits.

5.2 INTRODUCTION

Estimates of (co)variance components are required for estimation of genetic parameters such as heritabilities and genetic correlations. Numerous methods for variance component estimation exist and there is no method that is considered universally best. The distinction among the methods depends on the properties of the estimators that a particular method provides. The Maximum Likelihood (ML) and Restricted Maximum Likelihood (REML) estimators have become popular, in part, because they can be derived readily from Henderson's mixed model equations (MME) and also due to their desirable statistical properties, such as non-negativity, consistency, and asymptotic normality. REML estimates also have the advantage, for balanced data, of reducing to the standard Analysis of Variance (ANOVA) estimates, which are known to have minimum variance properties.
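As an illustration of the balanced-data case mentioned above, the following minimal sketch computes the standard ANOVA estimates of sire and residual variance for a balanced one-way sire design, together with the half-sib heritability $h^2 = 4\sigma_s^2 / (\sigma_s^2 + \sigma_e^2)$. It is a simplification under stated assumptions: fixed management group effects are omitted, the function name and design sizes are hypothetical, and the equivalence with REML holds only when the ANOVA estimates are non-negative.

```python
import numpy as np


def anova_sire_components(records):
    """ANOVA estimates for a balanced one-way sire model.

    records: 2-D array with one row per sire and one column per progeny.
    Returns (sire variance, residual variance, heritability).
    """
    y = np.asarray(records, dtype=float)
    n_sires, n_progeny = y.shape
    sire_means = y.mean(axis=1)
    grand_mean = y.mean()
    # Mean squares between and within sires.
    ms_between = n_progeny * np.sum((sire_means - grand_mean) ** 2) / (n_sires - 1)
    ms_within = np.sum((y - sire_means[:, None]) ** 2) / (n_sires * (n_progeny - 1))
    var_e = ms_within
    var_s = (ms_between - ms_within) / n_progeny
    h2 = 4.0 * var_s / (var_s + var_e)
    return var_s, var_e, h2


# Hypothetical design: 200 sires with 20 progeny each, simulated with a
# heritability of 0.3 (sire variance 0.075, residual variance 0.925).
rng = np.random.default_rng(1)
sire_effects = rng.normal(0.0, np.sqrt(0.075), size=(200, 1))
residuals = rng.normal(0.0, np.sqrt(0.925), size=(200, 20))
var_s, var_e, h2 = anova_sire_components(sire_effects + residuals)
print(f"var_s={var_s:.3f}  var_e={var_e:.3f}  h2={h2:.3f}")
```

With unbalanced data or additional fixed effects, these closed-form estimates no longer apply directly and an iterative procedure such as EM-REML is needed.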
The REML procedure, however, requires random effects contributing to the observation vector to be random samples from underlying normal populations. This may not be the case in some instances for certain traits. Some traits, such as calving difficulty or litter size, which are subjectively scored or categorized, have observations that follow a skewed distribution. The asymmetry of the distributions will depend upon the frequencies within each class. The effects of selection may also cause the distribution of a random factor to be skewed.

The objective of this study was to compare results of (co)variance estimation for single and multiple trait REML estimators in simulated populations under a false assumption of normality of the residual effects under varying conditions of heritabilities and genetic correlation values.

5.3 MATERIALS AND METHODS

5.3.1 Simulation Procedure

Records on three traits were generated according to the model:

$y_{ijk} = m_i + s_j + e_{ijk}$

where $y_{ijk}$ was a record on the kth progeny of sire j in the ith management group; management group effect (m) was fixed whereas sire effect (s) was random, and $e_{ijk}$ was the random residual. Sire and residual components were generated from a multivariate normal distribution with expected values $E(s_j) = 0$ and $E(e_{ijk}) = 0$ and specified sire and residual covariances. For a fourth trait, records were generated according to the same model, except for $e_{ijk}$, such that the observation vector, y, followed a log-normal distribution. Random $s_j$ and $e_{ijk}$ have the same expectations and (co)variances as listed above. All animals had records for all four traits.

For each set of data generated, the number of progeny per sire, number of management groups, and number of sires per management group were identical to that described in Section 4. A total of 24 underlying parameter structures were designed by varying levels of heritability, genetic correlations, and the underlying distribution of one trait.
The situations investigated are summarized in Table 8. Residual correlations were kept constant at .5 across all 24 situations and, within each situation, genetic correlations were the same for all pairs of traits.

The degree of skewness for the log-normal distributed traits was kept constant at 1.0 across all situations. The coefficient of skewness for the log-normal distribution is $(e^{\sigma_e^2} + 2)(e^{\sigma_e^2} - 1)^{.5}$ (Hastings and Peacock, 1975). Setting this equal to one gives a residual variance on the normal scale ($\sigma_{en}^2$) of approximately .09876 for all combinations. With this value and using a median value of 1.0, the residual variance on the log scale ($\sigma_{eL}^2$) can be determined to be approximately .11457 (Hastings and Peacock, 1975). Therefore, to simulate a specific heritability value for a trait on the log scale ($h_L^2$), the following formulae were used to determine the heritability value for the trait on the normal scale ($h_n^2$):

$\sigma_{sn}^2 = h_L^2 \, \sigma_{eL}^2 / (4 - h_L^2)$

where $\sigma_{sn}^2$ is the sire variance on the normal scale. Once $\sigma_{sn}^2$ is determined for a specific heritability value on the log scale, the heritability value on the normal scale can be determined by

$h_n^2 = 4 \sigma_{sn}^2 / (\sigma_{sn}^2 + \sigma_{en}^2)$.

Table 8. Underlying parameter structures investigated for skewed traits. (Underline indicates skewed trait.)

5.3.2 Statistical Analysis

Each parameter combination was replicated 50 times. Single and multiple-trait EM-REML methods using canonical and Householder transformations as described by Jensen and Mao (1988) were applied in order to estimate genetic parameters. Table 9 shows the total number of different multiple and single trait analyses that were run and the number of estimates that were obtained from each replicate. Iteration of the EM-REML procedure was stopped when the absolute relative difference between the Euclidean norm of the estimated sire and
residual (co)variance matrices in consecutive rounds was less than $10^{-5}$ or until 5000 rounds of iteration had been reached. Those estimates that did not converge were not used in summaries. The EM-REML estimators correspond to those given by Patterson and Thompson (1971) and Harville (1977) but extended to multiple traits as described by Jensen and Mao (1988).

Sample parameters of sire and residual (co)variances were computed for each data set based on true transmitting abilities of sires and residuals generated. Converged estimates from each of the 50 replicates were compared to the average of the sample parameters. The non-parametric Sign Test (Daniel, 1978) was used to test for biases, where the exact probability level was determined by the number of replicates that converged for each parameter.

Table 9. Number of analyses run and number of estimates obtained from each replicate of each parameter combination

                        Number of traits in analysis
                        4      3      2      1
  no. of analyses:      1      4      6      4

                        Type of estimate
                        Skewed        Normal
                        h2     rg     h2     rg
  no. of estimates:     8      12     24     12

5.4 RESULTS

5.4.1 Biases

Table 10 lists the percent of biased estimates found for only those situations which had a significant number. In only 8 of the 24 combinations was the number of biased estimates greater than expected by chance when using a significance level of .05; however, only 59% of all the biased estimates, for both heritabilities and genetic correlations, were for a skewed trait. Thirty-six percent of the biased estimates were heritability estimates, and only one third of these pertained to skewed traits, all with low heritabilities. Overall, low heritability levels tended to produce more biases, whether the distribution was skewed or not, than higher heritability levels. Ninety-eight percent of the biased heritability estimates occurred from populations with low underlying heritability values (.1,.2,.3).
Sixty-four percent of the biased estimates were genetic correlations, and 72% of the biased correlation estimates involved a skewed trait. Of this 72%, more biases occurred when traits with low heritability values (.1,.2,.3) were skewed.

Table 10. Percent of biased estimates for those situations having a greater than expected number (P<.05)

                                       % Biased Estimates
  True Parameters                      Skewed          Normal
  h2                  rg      Total    h2      rg      h2      rg
  .1,.3,[.6],.8       .2      12.5     0.0     0.0     7.1     5.4
  .1,.3,.6,[.8]       .8      19.7     1.8     5.4     5.4     7.1
  [.1],.3,.6,.8      -.2      23.3     8.9     5.4     5.4     3.6
  [.1],.3,.6,.8      -.3      12.5     1.8     7.1     0.0     3.6
  .1,[.3],.6,.8      -.3      17.8     1.8     8.9     0.0     7.1
  [.2],.2,.2,.2       .2      30.4    12.5     0.0    16.1     1.8
  [.2],.2,.2,.2       .8      14.3     0.0     5.4     0.0     8.9
  [.2],.2,.2,.2      -.3      17.9     7.1     5.4     0.0     5.4

  Brackets indicate the skewed trait.

5.4.2 Mean Square Errors

Table 11 shows average root mean square errors (RMSE) of heritability estimates from skewed distributions for four levels of heritability (.1,.3,.6,.8) according to levels of genetic correlations and number of traits in the analysis. Values ranged from .06 for small heritabilities to .18 for large heritabilities. Skewness did not seem to affect the magnitudes of the RMSE's when compared to those for non-skewed traits. Mean square errors tended to increase as underlying heritability increased and remained constant over different genetic correlation levels and number of traits in the analysis.

Table 12 shows the R2 values and the standardized partial regression coefficients from three different multiple regression models that were run where the standard error of the heritability estimate was the dependent variable in each model. The first model run contained independent variables of number of traits, the underlying heritability value, the underlying genetic correlation value, and whether the trait was skewed or not. As expected, the underlying heritability was highly significant and explained the majority of the variation in the standard errors.
The underlying genetic correlation was also significant, whereas the number of traits and skewness were not. The second multiple regression was run within level of genetic correlation, with the number of traits, underlying heritability value, and skewness of the trait as the independent variables. The underlying heritability value was significant for all levels of genetic correlation. Skewness of the trait was significant under two levels of genetic correlation; however, both coefficients were essentially zero. The third multiple regression was run within level of heritability. The number of traits was significant only under the smallest heritability value of .1. The underlying genetic correlation was significant in three out of the four heritability levels. Skewness was significant in two out of the four heritability levels.

Table 11. Average root mean square errors for heritability estimates from skewed underlying distributions.

Table 13 shows average RMSE's for genetic correlation estimates when one trait was skewed according to number of traits in the model and the heritability values of the traits involved. Magnitudes of the values did not seem to depend on the level of heritability of the skewed trait. Trends in the values were similar to those found in Section 4, where magnitudes depended on the number of traits in the analysis, the heritability value of the traits involved, and the underlying genetic correlation.
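For readers unfamiliar with standardized partial regression (beta) coefficients, the following minimal sketch shows one common way to obtain them: every variable is scaled to zero mean and unit standard deviation before an ordinary least squares fit, so that coefficients are comparable across predictors measured in different units. The function name and the synthetic data are hypothetical; the sketch does not reproduce the software or the data used for Tables 12 and 14.

```python
import itertools

import numpy as np


def standardized_betas(X, y):
    """Return (beta coefficients, R2) from a least squares fit on z-scores.

    X: 2-D array of predictors (rows are observations); no column may be constant.
    y: 1-D array holding the dependent variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # All variables are centered, so no intercept term is needed.
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    resid = yz - Xz @ betas
    r2 = 1.0 - (resid @ resid) / (yz @ yz)
    return betas, r2


# Synthetic illustration: the "standard errors" below are constructed to
# depend mainly on heritability; they are not results from the study.
grid = list(itertools.product((1, 2, 3, 4),            # number of traits
                              (0.1, 0.3, 0.6, 0.8),    # underlying h2
                              (0.2, 0.8, -0.2, -0.3),  # underlying rg
                              (0, 1)))                 # skewed trait indicator
X = np.array(grid, dtype=float)
rng = np.random.default_rng(2)
se = 0.05 + 0.15 * X[:, 1] + rng.normal(0.0, 0.01, size=len(X))
betas, r2 = standardized_betas(X, se)
print(np.round(betas, 3), round(float(r2), 3))
```

Because the predictors are on a common scale after standardization, the relative sizes of the coefficients indicate which factor accounts for most of the variation in the dependent variable.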
Table 14 shows the R2 values and standardized partial regression coefficients from three different multiple regression models that were run where the standard error of the genetic correlation estimate was the dependent variable in each model. The first model contained the five independent variables of number of traits, the smaller heritability value of the two traits, the larger heritability value of the two traits, the underlying genetic correlation, and whether the trait was skewed or not. All factors were highly significant except for skewness of the trait.

The second model was run within level of genetic correlation, with the number of traits, the smaller and larger heritability values, and the skewness of the trait as the independent variables. The number of traits was significant when the underlying genetic correlation was negative. The smaller heritability value was significant for all levels of genetic correlation, as was the larger heritability value, except under the correlation of .8. Skewness was significant in only one of the four levels of genetic correlation.

The third model was run within levels of heritability of the two traits involved, with the number of traits, the underlying genetic correlation, and the skewness of the trait as the independent variables. The number of traits was significant when either one or both traits had a low heritability value of .1. The underlying genetic correlation was significant for all pairs of heritability values whereas skewness was not.

Table 12. R2 values and standardized partial regression coefficients for three multiple regression models with standard error of heritability estimate as the dependent variable.

                       Beta Coefficients
           R2      NT         h2        rg        skew
           .58    -.042      .749**    .117**    -.017
  rg
   .2      .93    -.007      .972**              -.063*
   .8      .38    -.060      .622**              -.031
  -.2      .92    -.035      .958**               .078**
  -.3      .85    -.057      .918**              -.056
  h2
   .1      .08    -.302**              -.078      .047
   .3      .12    -.149                 .251**   -.230**
   .6      .21    -.063                 .437**    .190*
   .8      .05    -.063                 .255**   -.054

Table 13. Average root mean square errors of genetic correlation estimates from skewed underlying distributions.

5.4.3 Correlations

Table 15 shows the correlations of the heritability estimates for the two extreme values (.1,.8) when a trait was skewed for all possible pairs of multiple and single trait analyses. Correlations were high for all levels of heritabilities, indicating that heritability estimates from different analyses were highly repeatable. Trends in the correlations were similar to those found in Section 4 according to the level of heritability, the underlying genetic correlation, and the number of traits involved. As expected, correlations were highest for analyses differing by only one trait; for example, estimates from a four trait and a three trait analysis were more highly correlated than those from a four trait versus a single trait analysis. Correlations for heritability estimates were highest between a two trait analysis and a single trait analysis and tended to be lowest between a four trait analysis and a single trait analysis. Correlations tended to be slightly higher for large heritabilities. Across levels of genetic correlations, correlations for the heritability estimates were highest under strong positive genetic correlations and lowest under strong negative correlations. However, with a large heritability value, this trend was not as noticeable.
Table 16 shows correlations of the genetic correlation estimates when one trait was skewed, for all possible pairs of multiple trait analyses, according to the underlying genetic correlations and heritability values of the traits involved. All correlations were high but slightly lower than those for heritability, as expected. Magnitudes of the correlations did not seem to depend on the heritability value of the skewed trait. As with heritability, trends in the values were found according to the number of traits in the analysis, the heritability values of the traits involved, and the underlying genetic correlation. Across all levels of genetic correlations and heritabilities, correlation values were highest between four trait and three trait analyses and lowest between four trait and two trait analyses. This trend differed from that found for heritabilities: correlations for heritability estimates were highest between a two trait and a single trait analysis, whereas correlations for genetic correlation estimates were consistently highest between a four trait and a three trait analysis. Correlations were highest when both traits were highly heritable, with little variation across different levels of genetic correlations. When heritability values were low, correlations tended to be lower and more variable across different levels of genetic correlations; they were highest under a genetic correlation of .8 and lowest under a genetic correlation of -.3.

Table 14. R2 values and standardized partial regression coefficients for three multiple regression models with standard error of genetic correlation estimate as the dependent variable.

                          Beta Coefficients
               R2      NT         h2L        h2H        rg         skew
              .73    -.112**    -.468**    -.233**    -.568**      .027
  rg
    .2        .81     .080       .560**     .503**                 .129
    .8        .62     .081       .788**     .005                   .035
   -.2        .71     .130**     .619**     .333**                -.004
   -.3        .81     .289**     .619**     .357**                 .022
  h2
   (.1,.3)    .54     .158**                           .731**      .052
   (.1,.6)    .70     .242**                           .234**     -.002
   (.1,.8)    .54     .226**                           .713**      .092
   (.3,.6)    .73     .088                             .854**     -.110
   (.3,.8)    .78     .110                             .878**      .074
   (.6,.8)    .74     .030                             .865**     -.007

Table 15. Correlations of heritability estimates for the two extreme values (.1,.8) between multiple and single trait analyses. [Table body not recoverable from the scan.]

Table 16. Correlations of genetic correlation estimates when one trait is skewed between different multiple trait analyses. [Table body not recoverable from the scan.]

5.5 CONCLUSIONS

Results indicate that, in terms of biases, the degree of skewness used had no effect on heritability estimates from either single or multiple trait analyses. However, more biases were found for traits with small heritability values, whether skewed or not, than for traits with large heritability values. The degree of skewness seemed to have a small effect on the estimates of genetic correlations, especially when the heritability value of the skewed trait was small.
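The skewed trait in this study had records following a log-normal distribution with a skewness of 1.0. As a rough sketch of how residuals with that skewness could be generated, the Python code below uses the standard result that a log-normal variable exp(N(mu, sigma^2)) has skewness (w + 2) * sqrt(w - 1), where w = exp(sigma^2), and solves for sigma by bisection. The function names, the centering and rescaling step, and the residual standard deviation are assumptions for illustration; the simulation actually used in the study may have been parameterized differently.

```python
import math
import random


def lognormal_sigma_for_skewness(target_skew, tol=1e-12):
    """Find sigma of the underlying normal so that exp(N(mu, sigma^2))
    has the requested skewness.

    The skewness (w + 2) * sqrt(w - 1), w = exp(sigma^2), does not depend
    on mu and increases with sigma, so bisection on sigma is sufficient.
    """
    def skew(sigma):
        w = math.exp(sigma ** 2)
        return (w + 2.0) * math.sqrt(w - 1.0)

    lo, hi = 0.0, 5.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if skew(mid) < target_skew:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def skewed_residuals(n, residual_sd, target_skew=1.0, seed=None):
    """Draw n log-normal residuals, shifted to mean 0 and scaled to residual_sd.

    Shifting and positive scaling leave the skewness unchanged.
    """
    rng = random.Random(seed)
    sigma = lognormal_sigma_for_skewness(target_skew)
    w = math.exp(sigma ** 2)
    mean = math.exp(0.5 * sigma ** 2)   # mean of exp(N(0, sigma^2))
    sd = mean * math.sqrt(w - 1.0)      # standard deviation of exp(N(0, sigma^2))
    return [
        (rng.lognormvariate(0.0, sigma) - mean) / sd * residual_sd
        for _ in range(n)
    ]


sigma = lognormal_sigma_for_skewness(1.0)
print(round(sigma, 4))                  # sigma of the underlying normal
residuals = skewed_residuals(5, residual_sd=2.0, seed=1)
```

A skewness of 1.0 corresponds to a fairly small sigma (about 0.31), which indicates how mild this departure from normality is.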
In terms of MSE's, the degree of skewness had no effect on the magnitude of values for either heritability or genetic correlation estimates. Trends in MSE's for skewed traits were similar to those for non-skewed traits for both heritabilities and genetic correlations.

Only one degree of skewness was examined in this study. The direction of the skewness was also held constant in the situations examined. The form of skewness examined here seemed to have no effect on the accuracy and precision of genetic parameter estimates. Thus, REML appears rather robust in terms of expectation and sampling variances of the estimates for this type of skewness. Asymmetric sire effects, as opposed to the skewed residual effects that were examined in this study, may prove to have more of an effect on the accuracy and precision of heritability and genetic correlation estimates. Also, different levels of residual correlation need to be examined, as well as skewing more than one trait.

6. SUMMARY

Variance component estimation methods for unbalanced data are plentiful and there is no universally best method. Different methods will give different estimates from the same set of data. The current method of choice is REML due to its desirable statistical properties. Once the method of estimation has been determined, there are a number of factors that will affect the properties of the resulting estimates.

Three sources of potential bias in (co)variance component estimation by EM-REML were examined by simulation: the number of traits in the analysis; the magnitude of the underlying parameters; and violation of normality assumptions. Understanding these possible sources of bias should enable one to develop strategies for selecting subsets of traits that yield high estimation accuracy and precision while minimizing computational requirements.

Different populations were simulated to cover a range of heritabilities as well as genetic correlation structures.
A model with fixed management group effects and random sire and residual effects was used to simulate records for four traits. Each population was replicated 50 times. Single and multiple trait EM-REML methods were applied to estimate genetic parameters. Converged estimates from each of the 50 replicates were compared to the average sample parameters. The Sign test was used to test for biases in the estimates. To study the effect of violating normality assumptions, residual effects for one of the four traits were skewed such that the records generated followed a log-normal distribution with a constant degree of skewness of 1.0. The above procedures were then repeated.

Estimates for both heritabilities and genetic correlations were examined. Results were summarized in terms of accuracy, or amount of bias in the estimates; precision, or magnitude of mean square errors in the estimates; and correlation between estimates from analyses involving different numbers of traits.

6.1 Heritability

From analyses involving different numbers of traits, none of the heritability estimates were significantly biased. Also, the accuracy of heritability estimates did not appear to depend on the degree of association between traits in a multiple trait setting. Heritability estimates of weakly correlated traits were as accurate as those of strongly correlated traits. Across all levels of underlying genetic correlations and from analyses involving different numbers of traits, more biases in estimates tended to occur when underlying heritability values were small than when heritability values were high.

Correlations of heritability estimates between single, two, three, and four trait analyses were high for all levels of underlying heritability. For low levels of heritability, where the majority of the biases in heritability estimates were found, high correlations among the estimates indicate consistency in the direction of the bias.
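The Sign test mentioned in the summary above can be sketched as follows: for each replicate, only the sign of the difference between the estimate and its reference value is kept, and the count of positive signs is compared with a binomial distribution with probability one half. This is a minimal sketch in Python; the function name, the handling of ties, and the example values are assumptions for illustration, not details taken from the study.

```python
from math import comb


def sign_test(estimates, reference_values):
    """Two-sided Sign test of H0: median(estimate - reference) = 0.

    estimates:        one estimate per replicate.
    reference_values: the value each estimate is compared with
                      (one per replicate; repeat a single value if needed).
    Returns (number of positive differences, number of negative
    differences, two-sided p-value). Zero differences are dropped.
    """
    diffs = [e - r for e, r in zip(estimates, reference_values) if e != r]
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    n_neg = n - n_pos
    k = min(n_pos, n_neg)
    # Exact binomial tail probability P(X <= k) with X ~ Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n_neg, min(1.0, 2.0 * tail)


# Hypothetical example: 10 replicates, every estimate above the reference.
est = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 0.12, 0.18, 0.11, 0.13]
ref = [0.10] * 10
print(sign_test(est, ref))  # (10, 0, 0.001953125)
```

Because the test uses only the direction of each difference, it indicates whether estimates fall consistently above or below the reference value but says nothing about the size of the bias.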
For all levels of heritability, mean square errors did not change as the number of traits in the analysis changed, or as the genetic correlation among the traits changed. This suggests that STA was as precise as MTA for estimating heritability. In general, results suggested that little is gained through MTA for estimating heritability.

The degree of skewness used had no effect on the amount of bias in heritability estimates from single or multiple trait analyses. Correlations between estimates of heritabilities from skewed underlying distributions were as high as correlations between heritability estimates arising from normal distributions. Skewness did not seem to affect the magnitudes of the MSE's of heritability estimates when compared to those of non-skewed traits.

6.2 Genetic Correlations

Estimates of genetic correlations tended to be biased when the underlying genetic correlations were negative. The direction of the bias seemed to depend on the number of traits included in the analysis. In a four trait analysis, biased estimates tended to be less strongly negative (underestimated), whereas in the two and three trait analyses they tended to be more strongly negative (overestimated). Biases also tended to occur under strong positive correlations; however, these biases were much smaller than those under negative correlations and could be considered negligible.

Mean square errors for genetic correlation estimates indicate that precision of estimation depended on the underlying genetic correlation, the underlying heritabilities of the two traits involved, and the number of traits in the analysis. Precision in covariance estimation was highest when traits were highly and positively correlated and moderately to highly heritable. Within this parameter setting, mean square errors stayed constant as more traits were included in the analysis, indicating that no gain in precision is made through MTA.
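The accuracy and precision measures used in this summary can be illustrated with a short Python sketch that computes bias, sampling variance, and mean square error of a set of replicate estimates about a reference value. The function name and the example numbers are hypothetical, and the divisor n (rather than n - 1) is an assumption chosen so that the identity MSE = bias^2 + variance holds exactly.

```python
from math import sqrt


def bias_and_mse(estimates, reference_value):
    """Summarize replicate estimates of one parameter.

    Returns bias (mean estimate minus reference), the variance of the
    estimates about their own mean, the mean square error about the
    reference value, and its square root.
    """
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - reference_value
    # Divisor n keeps mse == bias**2 + variance exact.
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - reference_value) ** 2 for e in estimates) / n
    return {"bias": bias, "variance": variance, "mse": mse, "rmse": sqrt(mse)}


# Hypothetical genetic correlation estimates with a reference value of -.30.
summary = bias_and_mse([-0.42, -0.25, -0.38, -0.31, -0.45], -0.30)
print(summary)
```

Separating the two components shows whether a large mean square error comes mainly from systematic bias or from sampling variation among replicates.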
Mean square errors for correlation estimates were largest when the traits were weakly correlated and had small heritability values. Under this parameter setting, precision in covariance estimation seemed to depend on the heritability values of the traits and the number of traits in the analysis. Results indicated that precision increased as the heritability values increased and as the number of traits in the analysis increased. This suggests that a gain in precision can be obtained through MTA.

The degree of skewness used had a small effect on estimates of genetic correlations. Correlations between estimates of genetic correlations when a trait was skewed were as high as those of non-skewed traits. Mean square errors for genetic correlation estimates did not seem to be affected by the amount of skewness. Trends similar to those for non-skewed traits were found in terms of underlying heritabilities, underlying genetic correlations, and the number of traits in the analysis.

6.3 Sampling Subsets of Traits

Trends found in the accuracy and precision of the estimates can be used as guidelines for selecting subsets of traits when computer resources are limited. These results apply only to the situations examined here, in which traits were equally correlated.

For heritability estimation, results indicated that estimates are consistent across different levels of parameter values and numbers of traits in an analysis.

There was no gain in accuracy or precision of genetic correlation estimates from adding or deleting traits in an MTA when traits were highly correlated. The greatest increase in accuracy and precision occurred when traits were negatively correlated and had small heritability values. Under these conditions, adding more traits to the analysis continued to improve the properties of the estimates.
In general, when traits were negatively correlated, for all levels of heritability, adding more traits to the analysis continued to increase accuracy and precision of genetic correlation estimates.