Q .— ‘ t . ' In”: . .‘qu avg—L... -Ifi.'-.' uro- r - ,3 ‘ - If; I J ”‘5“ fl ' " IIC'r.‘ 'I I I I: I I ' .~. "'UIQ.-. 9 ' | .I I III. 4» - l . ,, a: U ,, . (II... ’I '3‘. III _“|‘.'-. .I s: [ell'l'l' :Ig‘ {I -{ .'y" I. 1., {ht-I‘m?“ .II.- ‘I-Qlfv .. ' 'l‘l“ ".'n.‘ ‘.".I..{y' E}: .' ’l III “.1 ,Ilfl , “.43.. I“ "39".: 5.4"}. U NIH“ !:l¢ WK)“ 1'“ ‘71}? fng‘. I, I- 3'. ,J u MK“ Na") ' . < 3‘. " .' .fi: '2'“ {'5 R.’ ‘ 1'35: ;,. € 1' "V‘ I '1 . .1 3’ I :5:th '\II 1? .'gl'l’ I( '3 ..j 3 . -up. . . 'h‘é 'I ." . '.4I _'.IR_ ' u \I II v v ‘l'I ‘n I I‘It ' ‘i' ' é ' ~ ' " .‘. _ . , Ifinfiwfl‘é ' ., H l ‘53:! -1“ I '1 . ' " ” WI": ‘“ (t "” "Ill-I'M“: nl'” :‘YQ log}; . ‘(VE ' ' v' _ -. v E a ‘ ,',.. ~JIII- _I;.I s“l""" ‘ . I. - . ., ...-II . IOr ... .. .. .3,” .I w-I 52‘: *nfl'r _. '. In, --.u-I- . {.1 , , I. .313 ,.I‘ [A {.1 .umu ,. .vI M. ... v . .. I . ...I.','...-' v.4 . .,3 L m ."W'Qy .‘I‘fylfil ': “'3‘; . 1'35: “J _,'I 5;. 3.3., ."a a “It...” In." > _ II. ,f'. .5.‘ .J ' . I, Mm. ". ’ A". l' "I.‘- '.'I .-‘I“ I. I‘ I.'.-‘III-' I1”; I, I,“ ‘ ‘ u ‘_I.’ ".' ‘ "IlIfI" , . 'y ". .' 33, \‘I'. - .I; I ‘1 t, I, “pad ‘f'. ‘3 V" I,“ n V'.‘ mI';""""'II I'll-{INF .3) MIL". m“, "’5 . "j 3 .',.IrI ‘ III I» .. .3 , I 0" .5 I, 3 5.. d ‘ -u” I .t . 3'. .7 I“ l - It ~. 1 H I .. l '. L. bi. «HI. 'LL-‘I' n '1 ' .- "5" ‘67 "75' 'l I 'I.." 1'7' In J . 1'" '. . , I ‘I ”2*... .I "I "II . I .. ‘ "2:? 0" ' 'f." . U‘ "I I, ' v|l I ‘ ‘ " u" _ .. .. I. . . . V . ., uh «I I . .» mu. ‘II, ')" ' '.'.II..,.nI‘II 3'., " 'I I'.' -.I‘—. . "- I. I." I. MI'I‘! :i' ,I u‘ -_ ,I ..-' ‘. 'I “ I\. ‘0‘.) . S “I .l I‘, '»._.r-.'I’I.llv} u". “HI -.-'.',‘."l'.". . ‘v ’ ‘wfi III This is to certify that the dissertation entitled ANALYSIS OF THE ERROR VARIANCE-COVARIANCE MATRIX OF FIRST, SECOND AND THIRD LACTATIONS IN MEXICAN HOLSTEINS presented by Manuel Villarreal has been accepted towards fulfillment of the requirements for Ph.D. degree in Animal Science L w:\.\ { \ix17&3\n36\4\ Major professor //9 /2 f/fi/ 7 / M30 is an Affirmative Action/Equal Opportunity Institution 0-12771 MSU LIBRARIES .- V RETURNING MATERIALS: Place in book drop to remove this checkout from your record. FINES will be charged if book is returned after the date stamped below. ANALYSIS OF THE ERROR.VARIANCE-COVLRIANCE NAIRIX OF FIRST, SECOND AND THIRD LACTATIONS IN'HEXICAN HOLSTEINS By Manuel Villarreal A,DISSERIATION Submitted to ‘Hichigan State university in partial fulfillment of the requerilente for the degree or DOCTOR OF PHILOSOPHY Dapartnent or Aninal Science 1985 ABSTRACT ANALYSIS OF THE EH01! VARIANCE-COVARIANCE KATRIX OF FIRST, SECOND AND THIRD LACTATIONS IN MEXICAN HOISTEINS by Hanuel Villarreal Errors from each record of the same cow have been treated as independent of each other when analyzing later lactations. Need to incorporate more flexibility to this relation by allowing errors from each lactation to be correlated with each other and to vary from lactation to lactation in accordance with a first order non-stationary autoregressive process has been suggested. Use of recursive estimation techniques would be possible if residual errors follow this pattern. More accurate estimates of a cow's real ability could be obtained from these relationships. This study tests the assumptions that there is homogeneous error variance-covariance among lactations: and examines the hypothesis that errors might follow a first order non-stationary autoregressive process. Milk yield records of 3999 Mexican Holsteins with three consecutive lactations were used. The model applied to each lactation included random effects of sire, fixed effects of herd-year-season, and days in lactation fitted as a covariable. Residuals were computed from solutions to the mixed model equations. The val 1:85 the er: hi; pr: orc‘ the Va: was ref co: COW variance-covariance matrix s was produced from a vector of residuals for each lactation from each cow. The 3 matrix is not homogeneous (p < 0.01) according to the Box test for homogeneous variance. In the 8, matrix error variance increases with time and error covariances are higher in adjacent lactations.‘Using a maximum likelihood procedure to test the hypothesis that errors follow a first order autoregressive model was rejected (p < .01). A model that incorporated a parameter representing the possible variance increment caused by the maturing effect of the cow was also rejected (p < .01). Although these models were rejected they fitted 3 better than a model with no correlation of errors. .Justifications would.be that if a cow produces an unusually'good or bad.yield initially'her caretaker would have a high or low expectation of her ability managing her accordingly, also cumulative effects of accidents or diseases could be carried over lactations. To model the error structure of repeated lactations, a first order non-stationary process should be used. DEDICATION I would like to dedicate this thesis to the following people: to my parents Maria Estela c. de Villarreal and Fernando Villarreal who instilled in me the importance of education: to my brothers and sisters for their example; to aunt Emma and uncle Fernando Riou remembering their support and friendship: to my American family Carlitos, Leticia and.Gumecindo Salas for helping me to be me again, and finally, to the many friends who make life so worthwhile. R- Am major SUgge ACKNOWLEDGMENTS The author wishes to express his gratitude to Dr. Clyde R. Anderson for his guidance, encouragement and patience as major professor, and to Dr. James Stapleton for his many suggestions during the course of this reseach. The author also wishes to thank the members of his guidance committee: Drs. J. L. Gill, L. D. McGilliard, and W. T. Magee for their support and suggestions. Appreciation goes to all the members of the Animal Breeding group, especially to J. P. Walter for sharing with me some of his programming expertise. A very special thanks to Jim Liesman for all his help and support and to Alejandro Villa for great moral support. Financial support of CONACYT (Mexican Council of Science and Technology) for the first two years of the graduate program is acknowledged. The opportunity to work for Dr. R. S. Emery while finishig the program is deeply appreciated. ii TABLE OF CONTENTS Page LIST OF TABLES e e e e e e e e e e e e e e e e e e e V I momma“ O O I O O O O O O O O I O O I 1 II LITERATURE REVIEW . . . . . . . . . . . . . 9 11.1 Studies on Later Lactations .. .. .. .. 9 11.1.1 Overview: First vs. Later Lactations . 9 11.1.2 Studies Comparing Sire Values Using First and Later Lactations . . 17 11.1.3 Genetic Parameters of Milk Yield for First and Later Lactations . . . . . 22 11.2 Analysis of Sequential or Repeated Observations . . . . . . . . . . 30 11.2.1 Repeated Measurements Experiments . . 30 11.2.2 Analysis of Times Series Data . . . . 42 III MATERIALS AND METHODS . . . . . . . . . . 51 I I I O 1 Data 0 O O O O O O O O O O I O O O O O O O 5 1 111.1.1 Source of the Data . . . . . . . . . . 51 111.1.2 Data Screening and Editing Procedures . . . . . . . . . . 52 111.2 Strategy to Calculate the Error Variance-Covariance Matrix . . . . . . . . 53 III 0 3 MOdel O O O O 0 O O 0 O O O O O O O O O 0 57 iii 111.3.1 Equations and Characteristics of the Model . . . . . . . . . . . . 57 111.3.2 Numerical Example . . . . . . . . . . 62 111.3.3 Programming Strategy . . . . . . . . 74 111.4 Variance Components Estimation . . . . . . 87 111.5 Heritability Estimates . . . . . . . . . . 94 111.6 Ranking of the Sires . . . . . . . . . . . 95 111.7 Methods for the Analysis of the Error Variance-Covariance Structure . . . . . . 96 111.7.1 Method to Test for Homogeneous Variance . . . . . . 97 111.7.2 Method to Test for the Hypothesis that Errors Follow a First Order Autoregressive Model . . . . . . . . 99 IV RESULTS AND DISCUSSION . . . . . . . . . 103 1V.1 Basic Parameter of the Model for each Lactation . . . . . . . . . . . 103 IV.2 Variance Components and Heritability Estimates . . . . . . . . . 104 1V.3 Effect of Covariable Days in Milk . . . . 110 1V.4 Sire Solutions for each Lactation . . . . 111 1V.5 Analysis of the Variance-Covariance Matrix . . . . . . . 114 1V.5.1 Test for Homogeneous Variance . . . 115 1V.5.2 Test for The Hypothesis that Errors Follow a First Order Autoregressive Model . . . . . . . . 119 V SUMMARY AND CONCLUSIONS . . . . . . . . . 126 LI ST or REFERENCES 0 O O O O O O O O O O O O O 0 O l 3 1 MIX A O O O O O O O O O O O O O O O O O O O O 138 LIST OF TABLES Table Page 3.1- Data for example problem . . . . . . . . . . . 62 3.2- Sire by herd-year-season (HYS) distribution (example data) . . . . . . . . . 63 3.3- Example data sorted by HYS and sire sorted within HYS . . . . . . . . . . . . 78 4.1- Basic parameters of the model for each lactation . . . . . . . . . . . . . . 103 4.2- Variance components, heritabilities, L ratios, means, and coefficients of variation for milk yield in three lactations . . . . . . . . 105 4.3- Heritabilities of milk yield of the first three lactations in Holstein cattle . . . . . 109 4.4- Range and standard deviations of BLUP sire solutions (n-193) in each lactation . . . . . 111 4.5- Rank correlation of BLUP sire solutions for the three lactations . . . . . . 113 4.6- Sample 8 and estimated variance-covariance matrices for first order autoregressive plus cow's maturing effect models SIGMATp, and SIGMATG . . . . . . . . . . . . . 124 A - FORTRAN program for mixed model analysis . . . 138 lact lact the. made Effie: those recoz SQaSc to th I INTRODUCTION The amount of milk that a cow produces varies from one lactation to another. The variation in milk.yield from lactation to lactation may be attributed to variations in the environment prevailing at the time the observation is made. A goal of animal breeders has been to estimate effects of environment and correct the cow's production for those effects. For dairy cattle, it is common to correct records for age, times milked per day, days in.milk, and season of freshening. This process corrects or standardize to the average of the whole population. Each adjusted or corrected record can be considered to represent the real ability of the cow under those "standard conditions" plus or minus some random error. The corrected record of the same cow in the next lactation has the same real ability of the animal plus or minus another error for environmental conditions. As pointed out by Lush (1945): "So far as temporary environmental conditions are concerned, these errors remaining in the correc- ted records will be independent of each other. Hence if all the records of the animal are averaged together, some of these will have posi- tive errors, and others will have negative errors which will tend to cancel the positive ones." The average lifetime production of a cow can be expressed as: l Ave Th anin The more vali assun anima lacta Real ability Average Lifetime Production 8 of the animal + SUM (errors)/n where: SUM is the summation of all n errors. The average of n observations is the real ability of the animal and the average of the errors that did not cancel. The amount of error in this average becomes smaller with more observations if errors were really random. Animal breeders have considered this assumption to be valid and random nature of the errors is a standard assumption in most of the research in animal breeding. A linear model for estimating breeding value of dairy animals and predicting future milk production from repeated lactations follows: Yijkn - hysi + sj + cijk+ eijkn where: Yijkn is the milk production for the nth lactation of the kth cow in the 1th herd-year-season sired by the jth sire, hysi is the fixed effect of the ith herd-year-season in which this lactation was initiated, sj is the random effect of the jth sire, cijk is the random effect of kth cow, daughter of the 3th sire in the 1th herd, be re rasic Ve aim real lifet eijkm is the residual or random error. From genetic theory, the cow effect in a given lactation, cijkn' can be expressed as: cijkn " aijkn + dijkn with aijkn being viewed as the genetic and unseparable permanent environmental contribution and dijkn as a temporary environmental contribution. The sj sire, the aijkn and, dijkn effects are taken to be random having zero means and variances Vs, Va, Vd. The residual effects eijkn have zero mean and common variance Ve and are uncorrelated with each other and with sj and aijkn' Inherent to this model are the assumptions that a cow's real producing ability remains constant over her'productive lifetime, that the variance of the milk yields is constant for‘all lactations and the covariances for all pairs of milk yields are constant. The assumptions that the cow's effect remains the same over time has been challenged. Harville (1979) pointed out that a cow might incur an ailment, such as acute mastitis that would affect adversely her producing ability in the lactation of initial occurence and in all sequent lactations. He suggested that more flexibility could be incorporated into the model by letting the temporary env lac pro: environmental part of the cow effect vary from lactation to lactation in accordance with a first order autoregressive process, i.e., by replacing dijkn by dijknt‘ dijknt ‘ A dijkn,t-l + Vijknt where: D = autogregressive coefficient dijkn,t-1 a temporary environmental effect in previous observations Vijknt = uncorrelated random variable with zero:mean and variance vv. Mansour, Nordheim, and Rutledge (1985) presented a model to estimate variance components in a repeated measurements design, assuming non-stationary autoregressive errors. Their rationale was that in many sets of data the assumptions of constant variance and constant covariance do not appear warranted. They presented as an example, the variance-covariance matrix of lactational yield of milk fat for the first three lacta- tions of 11,613 cows reported by Butcher and Freeman (1969): 6398 3322 2623 3323 7003 3514 2623 3514 7467 Note ‘ incre. he 01 relax In this factOrs meters C ‘OIlOVS eij Note the increasing value along the principal diagonal, the increasing value 3323, 3514 on the first minor diagonal, and the decreasing value of the covariances in the first row. Mansour, Nordheim, and Rutledge (1985) proposed to relax the assumption of independence of the error terms. An example is a two way mixed linear model: yij 3 M + ai + tj + eij where: Yij is the milk yield recorded for the ith subject at the jth time, M is the overall fixed mean, a1 is the random effect of the 1th animal, tj is the fixed effect of the jth time and eij is a random error term, eij and ai are independently and identically' distributed (i.i.d.) with N(0,Ve ) and N(0,Va). In this model Ve represents the temporary environmental factors and Va the genetics and permanent environmental factors. For the same conditions on ai but now assuming that eij fellows a first order autoregressive model: °ij ' 4 e1 (j-1> + Vij 31' A” where: °i(j-1) is error term in time j-l, Vij is uncorrelated random variable i.i.d. N(0,Vv), p is autoregressive coefficient. The symmetric variance-covariance structure of three milk yields associated with the model is: Va+Ve Va+pVe Va+¢2Ve Va+ (1+;a2 ) Ve Va+¢ (14132 )Ve Symmetric Va+ (1+¢2+p4)Ve . With simulated data, the authors calculated. maximum likelihood estimates of Va ,Ve, and ,3. They found substancial improvement of the autoregressive model in the Butcher and Freeman (1969) data over the model that did not use autoregressive statistics. These findings could have a biological justification. The authored arguments for the positive association among errors are: First, if a cow initially'produces an unusually'good or ‘bad.yield, it would be natural to think that her caretaker would have a high or low expectation of her ability and would provide sequent management, accordingly. Although a portion of the initial yield is due to genetic and permanent environmental effects, when Va is large relative to Va, the major impact of the differential management would.be reflected better by autoregression. .A pathogenic infection carried over from one lactation to the next might cause a low production in both lactations. Rothchild and Henderson (1979b) also relaxed the assumption of error covariance being set to zero. They explained that in a sire evaluation model with no cow component, to estimate variances and covariances for first and second lactations an error covariance must be included because the same cow has both records, and covariances between errors must exist. If errors are correlated and if this correlation has a first order autoregressive pattern, there would.be the potential to use recursive estimation techniques in the sire evaluation for later lactations case. These techniques are such that it is unnecessary to store old data to process new data. .Also, more accurate estimates of the real ability of the cow could be obtained from these relationships. Objectives of this study are: --Estimate the error variance-covariance structure of a series of three lactations. --Test the assumptions commonly made by animal breeders that there is homogeneous error variance and constant covariance between each pair of lactations. --Examine the hypothesis presented by Harville (1979) and Mansour, Nordheim, and Rutledge (1985) that errors might follow a first order non-stationary autoregressive process. Because of the computational simplicity of the follo- wing tasks, they too were performed: --Estimate sire variance components for each lactation. --Estimate heritability of milk yield for each lactation. --Estimate sire solutions for each lactation. --Rank the sires for each lactation. --Calculate the rank correlation among sires. k" II LITERATURE REVIEW 11.1 Studies 93 £3223 Lactations 11.1.1 Overview: First 25; Later Lactations ig Sire Evaluation Milk yield is and will continue to be the most important trait selected for in dairy cattle. Currently; dairy cattle improvement depends on the emphasis placed upon increasing milk yield. Cows are ranked by Dairy Herd Improvement Associations according to criteria such as actual production, production ability, or cow index. Bulls are ranked on milk yield of their daughters. These methods have excelled for selecting those cows and bulls to maximize milk output of the herd. Sire evaluation methods have evolved rapidly. Up to 25 years ago most methods of evaluating bulls used dam-daughter comparisons. The spread of artificial insemination, which gave the possiblity of having daughters of a bull in.many herds, facilitated the introduction of methods comparing daughters with their herdmates. Onset of the computer made it possible to implement recent advanCes of variance component estimation on which.most sire evaluation methods depend. One important development has been the introduction Of'the Best Linear Unbiased Prediction (BLUP) by Henderson (1975)- This methodology could compensate for the effect of 10 selection and relationships between sires. Because milk yield of the dairy cow'is measured through her productive life, several records of a daughter could be used to evaluate a dairy bull. The first, second or any or all the lactations of daughters would be adequate to estimate the genetic value. However, developers of each sire evaluation system.have chosen to use one lactation.over the others. In the progeny test of dairy bulls, selection decisions are based principally on production by daughters in first lactation. The rationale behind the use of first lactation was summarized by Cassell and McDaniel (1983), who said: "Yield of first lactation offers advantages to dairy breeding researchers. Results are available earlier, and measurements exist for more cows than from later records. Extra- neous sources of variation such as injury, days dry following lactation, and preferential treatment are less likely’to influence yield of first lactationJ' Mao (1982) pointed out other arguments for the use of first lactation records. "-that the first lactation is an adequate indicator of a cow's lifetime performance: -that the computation is much simpler without having had to include a cow effect in the model: -that the supposition of different genes and genie actions for each lactation can be avoided: -that the assumption of no sire by age of cow interaction can be avoided; -that the potential bias due to selection can ll be also avoided, since not all cows are allowed by farmers to continue for more lacta- tions, and those who did make later lactations records are superior for one reason or the other to those cows that were cul led; and -that the inclusion of all lactations records would only increase the accuracy of evaluation on those bulls who already had plenty of daughters, but the accuracy of their evaluation is of primary interest." There are advantages for using later lactations for sire evaluation. Shaeffer (1982) summarized some of these: ”-increase the number of records; -increase the number of records per fixed effect e.g. per like herd-year-seasons: -reduces the standard error of prediction for sire" Let us document these arguments. An implicit condition for using first lactations is that the same genes influence production in first and later lactations. A genetic correlation less than unity would mean that some genetic control of later lactations is independent of genetic merit for yield of first lactation. The range of estimates of genetic correlation between first and second lactations is from .75 to .92 (Cassell and McDaniel, 1983). The inclusion of later lactations in sire evaluations systems has some advantages over first lactation evaluation systems. One of the advantages of the use of all lactations is the increase in accuracy in evaluation of sires for milk yield. Ufford et al. (1979) in a study comparing first versus all lactations by Best Linear Unbiased Prediction (BLUP) found that the use of all lactation records decreased 12 the variance of prediction error of sire solutions so that 15 daughters per sire with all lactations gave accuracy of sire values, equivalent to 25 daughters with only first lactations. IHe concluded that the genetic progress per year from selection of bulls to sire daughters would be expected to be 10 to 15% greater for all lactation records than for only first lactation records. A major criticism of evaluation systems employing first lactation records as estimates of performance is that the working life of dairy cattle may be reduced by placement of too much emphasis on early lactations. If a cow lasted in the herd for several lactations, it means that on several occasions the milk producer has assessed her and found her satisfactory; The length of life of a cow in a herd, or longevity, does not only depend upon milk.yield but on other characteristics including reproductive efficiency and health. Robertson and Barker (1966) said: "We can then look on longevity as the ulti- mate character in.the evaluation of different ways of selecting dairy sires, not for its own sake but as a numerical indicator of the judgements of milk producersJ' Lifetime performance of the cow is a determining factor of her profitability in a dairy operation, and later records may contain useful information relative to lifetime profitability; A positive correlation between yield of first la 51. p1; by ge SE Vi haj Se. in' an: em im a 1 311] Dr: Dre lie: Stu dam l3 lactation and traits of lifetime performance such as survival, length of herdlife, total performance, total lifetime yield and lifetime profitability are found in all published work concerning these matters, the remaining ques- tions to be answered are: how accurate are they predicted by yield of first lactation and will optimum progress for genetic gain in economically important traits be reached by selecting only on first yield? (Cassell and McDaniel, 1983). There are just as many literature references concerned with disadvantages of later records in sire evaluation. Per- haps a valid criticism of later lactation is that cows with second or later lactations have been selected. This introduces selection bias into the sire evaluation. 'Wickham and Henderson (1977) suggested that large biases seem unlikely because a small percentage of cows is culled at the end of the first lactation. Some other authors have investigated whether survivors of first lactations are.not a random sample of all first lactation cows with respect to milk yield. Van Vleck and Henderson (1963) approached this problem by studying daughters of sires in six classes of production (based on their proofs). They found that a higher proportion of daughters of lower production bulls were culled at the end of the first lactation. _ In their study, cows not having second records were 11 to 18% of all daughters of high sires versus 27 to 30% of all the daughters of low production sires. Reown et al. (1976) 3e 14 studying effects of selection bias on sire evaluation in 1969 records, found that those cows culled following second records exceeded those culled after first by 675 kg in first lactation.yield. A similar'positive relationship between first lactation deviation and sequent performance and survival was found by Hinks (1966). He reported that whereas the starters had a positive deviation from herd average of +51, the survivors to second lactation had a average deviation of +363. These results indicate that survivors of first lactations are not a random sample of all first lactation cows with respect to milk yield. Another difficulty for the use of later records is genetic differences in rate of maturity. Hargrove (1974) studying the rate of maturity of dairy females, examined differences between first and later lactations from 19,000 records. Deviations were not subject to influences of genetic trend and selection on first yield. He reported heritabilities of differences in rate of maturity of .1, approximately one-fourth the heritability for milk yield. Hickman and Henderson (1955) studied the milk.yield increase from first to second lactation records and age at first freshening for data from 4,000 cows. They also found the heritability for rate of maturity was one-third to one- fourth the heritability of milk.yield. Hillers and Freeman.(l965) took another approach to measure differences between sires in rate of maturity. They 15 used the regression within actual production on age at first calving as a measurement of rate of maturity; Data were from 76 California herds with freshenings between 23 and 35 months of age. Sire differences in rate of maturity were significant. .A general conclusion of this study indicates that differences among sires in rate of maturity are real, but heritabilities are low enough to discourage direct selection. Freeman (1973) in a review article on age adjustment of production records concluded that sire differences in rate of maturity imply that age factors to adjust records to some base by a common curve may obscure genetic differences between sires. This will cause unequal ranking for sires on first and later lactation. Cassell and McDaniel (1983) concluded to the discussion on this matter: "Thus we face the problem of knowing that age adjustment factors may obscure genetic diffe- rences yet to make no adjustment for age in sire evaluation places such evaluation at the mercy of progeny age distribution. Effective removal of only those effects of age common to all cows across all herds and.progeny groups is a challenge for future workJ' There are sire evaluation systems that use all lacta- tions (USDA-Modified Contemporary Comparison), only first lactations (Canadian Record of Performance, Northeastern Sire Evaluation Program) and a combination of two separate evaluations, for first and first and second combined (Israel sire evaluation system). Although it may be safer to avoid potential problems of 16 including later lactation records in sire evaluation, there are advantages such as cow evaluations generated as a by- product of all records of a cow in a sire evaluation system. The use of all records in sire evaluation may be even neces- sary for progeny testing programs of small populations or populations with fewer dairy records, as for less developed countries like Mexico (McDowell, 1983). II of s lact vali usin Selec They . PrOce. HOlst were . The in re. eVal U Pa ’1’ "h 01 17 11.1.2 Studies Comparing Sire Values Using Eggs; Egg Later Lactations Considerable effort has been directed to study results of sire evaluation methods using first, second, or later lactations. One purpose of these studies has been to 'validate the assumption of equality of the ranking of bulls using different lactations. Tomaszewski et a1. (1975) from data from two samples of daughters of 133 Holstein bulls for only first lactations and only second lactations, reported correlations of .64 between sire values for milk.yields. They also found that a sire value for first lactation can.predict with 85% accuracy the sire value for second lactation. Wickham and Henderson (1977) studied the effect of selection in evaluating sires by second lactation data. They developed a BLUP procedure to estimate biases. This procedure was used to obtain estimates of biases for 1109 Holstein sires. Evaluations for first and second lactations were estimated with a set of equations for a mixed model. The authors found that the magnitude of the bias was small in relation to the rank correlation of .8 between sire evaluations by first and second lactation. First, second and third lactation Ontario Records of Performance were analyzed separately by Nicholson et al. (1978) to evaluate 246 Holstein sires. These authors also ana wit; metl yie. rec: laC' inc: The laci Howe Nich lact and 18 analyzed second and third records, correcting for selection with the method suggested by Lush and Shrode (1950). This method consisted of subtracting the average of the first yield to the yield of those that made a second lactation record times the regression of second lactation on first lactation yield. This adjustment for selection tends to increase the variation known to be reduced by selection. The correlation among proofs was higher for second and third lactations adjusted for selection. This range was.71 to .75. However, there were differences in the ranking of the sires. Nicholson et al. (1978) did not see any need to use later lactation for sire evaluation. Bar-Anar (1975) studied the relationship between first and second lactations using the Israeli method «ammulative difference) and data representing 106 sires with two lacta- tions from each daughters. For each sire there were at least 60 daughters in each lactation group. The author found the correlation between tests to be of .76. He suggested that for the sire test program in Israeland because no extra cost is incurred by evaluating the second lactation a gain of 7% in accuracy in the estimation of sires will give an estimated yield increase of 7.5 kg milk cow/year in daughters of proven bulls. Modified Contemporary Comparisons for first and second lactations in the same and different herds were analyzed by Cassell et al. (1983a). Data were from 200 Holstein sires 19 with 500 or more daughters in each data set. They also used an adjustment to account for effects of selection on later lactations for evaluations by the same method used by Nicholson (1978) and originally'proposed by Lush and Shrode (1950). Correlations between first lactation evaluations in one independent data set and second lactation evaluations in the other were .87 and..84. The correlation between first and adjusted second evaluation was .87 for both independent data sets. First lactation evaluations accounted.for only about 75% of the variation in second record evaluation convinced the authors of a need for some form of later record sire evaluation at least for selection of sires of sons. Lofgren et a1. (1983) studied effects of culling on sire evaluation using the same data of 677,800 daughters of 200 widely used Holstein sires, those used by Cassell et al. (1983a). From these data, 10‘, 20, and 30% of second records were eliminated based upon least yield of milk in first lactation. Evaluations by BLUP of sires were obtained separately for both records and for culled.groups. They used a model that included effects of cow when more than one lactation was used. Second lactations were adjusted for selection following the method of Lush and Shrode (1950). All correlations between evaluations of first and.both‘were .95 and were unaffected by increased culling. Because these data are the same as those used by Cassell et al. (1983a), 20 authors were able to compare two methods of sire evalua- tions. Evaluations by BLUP and MCC procedures were affected similarly by culling. In both cases, culling reduced standard deviations of evaluations by second records and re- duced correlations between evaluations by first and second records. The two sire evaluation procedures also ranked bulls nearly identically. Cassell et al. (1983b) in another study compared the impact of culling or selection on sire evaluation by three mixed model procedures: 1) single trait evaluation of first and second lactations, 2) evaluation of both lactations together including a random component for cow effect, and 3) a multiple trait procedure where first and second lactation evaluations were calculated simultaneously. Data were first and second lactation records from the dairy herd at North Carolina State University where each of 130 females produced second records regardless of first lactation yield. The relationship matrix of the 45 sires represented in the data was included in all models. Culling was simulated at intensities of 10, 20, and 30% on deviations of first lactation from population means. The main conclusion of this study favor the use of the multiple trait approach because of its reduction of standard errors of prediction. The use of both lactations, according to the authors, appears to be a reasonable alternative for sire evaluation until multiple trait procedures become computationally 21 feasible. One can summarize the results of these studies comparing the sire evaluations using first and later lactation. The correlation among them is high but not perfect. Rankings of sires change from one evaluation to the other. Sire values for first lactation account for only from 75 to 85% of the sire evaluations for the second indicating a need for the evaluation for later lactations at least for sires of sons. These studies also indicate that selection bias cannot be completely responsible for differences in sire evaluations. II.l tere firsi condL paran (1960 milk lacta‘ third that h unity, lactat Ba animal£ nested .35, .24 lactati Ones av. average But on rec: pairs or first th were obt 22 11.1.3. Genetic Parameters 3g Milk Yield for First and Later Lactations. Emphasis on selection on first lactation yield has cen- tered attention on the same genes influencing production in first and later lactations. A number of authors have conducted studies with the purpose of estimating genetic parameters for first, second, and later lactations. Freeman (1960) found by daughter-dam regression heritabilities for milk of .36, .24, and .26 for first, second, and third lactations. Genetic correlations among first, second, and third lactations were between .93 and .98. He suggested that because these correlations are consistently less than unity, different sets of genes influence milk in different lactations. Barker and Robertson (1966) using records of 10,965 animals, 80 bulls, and a model with all random effects nested within fixed effects reported heritabilities of .35, .24, and .23 for milk yield of first, second, and third lactations. Genetic correlations of first yield with later ones averaged .80 but between second and third yields averaged .91. Butcher and Freeman (1968) using 12,500 Holstein lacta- tion records estimated the relationship between various Peirs of lactations of the same cow and heritabilities of first through fourth lactations. Heritability estimates Ware obtained by intrasire regression of daughter on dam. 23 Heritabilities were .37, .25, .35, and .40 for first, second, third, and fourth lactations. Repeatabilities estimated by intracow correlation were from..54 to .50 for the first through fifth lactation. The same authors in another study (Butcher and Freeman, 1969) used the same data to estimate genetic parameters and relationships between.pairs of lactation by five procedures. 1) Intraclass correlations for all cows that have the first record of the pair, regardless of whether the second record was present. 2) Intraclass correlations using only those cows with both records of the pair under consideration. 3) The regression of the second record of the pair on the first record of the pair. 4) A Maximum Likelihood procedure described by Curnow (1961). 5) A procedure to estimate parameters free of the effects of selection on the independent variable developed by the author. The maximum likelihood procedure (Curnow, 1961) works as if the only factor that determines if a cow has a second record is the magnitude of her first one,‘with no culling, variances of all records are equal, and all first records are available, whether the cow has a second record. The method derived by Butcher and Freeman (1969) constructs estimates of variances and covariances free of effects of selection by regression techniques, and then these relationships are computed by simple correlations. we: inc The par to pro aCC: how: effe Proc 24 Ranges for heritabilities for first to third lactations were .17 to .39, .11 to .35 and .11 to .40. These results indicate differences in methods of computing heritabilities. The authors suggested that the method used to estimate these parameters should be determined by how well the data conform to conditions necessary for unbiased estimates. The procedure derived to remove effects of selection seemed to accomplish the desired results. The authors suggested, however, that to establish firmly that the procedure removes effects of selection would have to be done by simulation procedures. Perhaps the most comprehensive attempt to investigate variance and covariance components, heritabilities, and genetic and phenotypic correlations between first and second lactations milk records was the study of Rothschild and Henderson (1979a). They used data of 423,314 first records and 339,182 selected second records. The authors extended the Maximum Likelihood.(M1) algorithm presented by Henderson and Quaas (1977) to estimate sire and error variance and covariances for the two trait model when records on one trait were missing. The authors included the error covari- ance because in a sire evaluation model, no cow component is included. Therefore, to estimate variances and covariances for first and second lactations an error covariance must be included because the same cow has both records, and therefore covariances between errors must exist. From result. mates ( select practi< requiri hours : ties f. Correl. author: first . T. °°rre1 to CHI daught Maximu rait editin third herd~y e‘feCt in thi 25 results with simulated data the authors found that ML esti- mates of variances and covariances were unaffected by 50% selection within fixed effects. The authors discussed the practicality of ML estimation, and.they mentioned that time required for one round of iteration was between 28 and 30 hours for an IBM 370-138 with 512 K of memory; Heritabili- ties for first and second records were .41 and .35. Genetic correlation between first and second records was :92. The authors estimate matrix of error variance-covariance for the first two lactations is as follows: 900,561 507,430 507,430 1,032,391 . Tong et a1. (1979) estimated heritabilities and genetic correlations for the three lactations from records subject to culling. Lactation records on 13,544 cows representing daughters of 90 sires. The authors used a Restricted Maximum Likelihood (REML) procedure derived from a multiple trait model discribed by Schaeffer et al. (1978). After editing, the data yielded 5,036 second lactations and 1,500 third lactations. Their model contained fixed effects of herd-year-season and age of cow at calving and random effects of sire and error. Error covariances were ignored in this model. After ten rounds of iterations 26 heritabilities of milk yield for first, second, and third lactations were .26, .19 and .17. The genetic correlations were between first and second .89, first and third .85, and second and third..89. This study shows that genetic correlations between adjacent lactations are higher than between nonadjacent lactations. Meyer (1983) developed a REML procedure for estimation of genetic parameters for later lactations of dairy cattle. She derived an algorithm for a two way mixed model allowing for missing observations while taking account of residual covariances. The author presented an example of data consisting of 2,247 first lactations, 1701 second lactations, and 1,186 third lactations of British Friesians. With small herd- year-season (HYS) subclasses, the data structure was improved by adding records for daughters of proven sires calving in the same herd-year-season. .As variation between proven sires is expected to be reduced by selection of proved sires, these sires were treated as fixed effects (i.eq, their daughters contributed information to the components within but not between sires). The model was also extended further by adding four regression coefficients per lactation to correct for the systematic effects of lactation length (linear), month of calving within season (linear), and calving age (linear and quadratic). Heritabilities of milk yields for first, second, and third lactai correl betwee The au merely variat mate 0 lactat 27 lactations were .33, .34, and .28. Estimates of genetic correlations were 1. between first and second lactation .98 between second and third, and.83 between first and third. The author mentioned that the data set considered was chosen merely to illustrate the procedure: therefore, sampling variation was too large to allow inferences. Authors' esti- mate of error variance covariance matrix for the three lactations is as follows: 557.31 371.67 405.91 371.67 928.51 565.45 405.91 565.45 1,008.67 . The papers of Rothschild and Henderson (1979), Tong et al. (1979), and Meyer (1983) are attempts to apply Maximum Likelihood and Restricted Maximum Likelihood procedures to account for bias from cow selection. The REML algorithm used by Tong et al. was for residual covariances of zero. Because they are repeated measures on the same cow, this is not appropriate. The authors argued however, that the error covariances are likely to be small relative to error variances. The algorithm derived by Rothschild and Henderson ( 1979) is a ML procedure, which means that variance components estimates were biased upwards by loss of degrees of freedom from estimating herd-year-season effects. This bias from ML is greatest for error components because the 28 denominator is the number of observations. For REML the degrees of freedom are the number of observations minus the number of fixed effects that is used to estimate variance components. Estimation of genetic parameters for later lactations has been an important subject of study for animal breeders. A purpose has been to try to demonstrate that different lactations are genetically the same trait. Failing to demonstrate this would justify the use of first only records or multiple trait procedures for sire evaluation for milk yield. Estimation procedures often are subject to bias from selection because cows that remain in the herd are likely to have been selected for higher yields. Later lactations have been used as typical example for analysis of data under selection (Henderson, 1975). The genetic correlation between lactations is greater than.80 and genetic correlations between adjacent lactations are higher than between nonadjacent lactations. Several authors have speculated that the significant difference of the genetic correlation between lactations may have something to do with a interaction of sire by age or sire by parity. Sire variance across all three lactations also differed (Tong et al.,1979). This may indicate that the sire effect is different for each lactation. The difference of genetic correlations could be explained by differ be use is gen H for 1a Possib M: mental and man 29 differences in sire effect in rate of maturity. This could be used to support that milk yield in different lactations is genetically the same trait. Heritabilities for milk yield are consistently lower for later lactations. Butcher and Freeman (1968), listed possible explanations for these differences as they said: "a) If all genes have equal effects, first lactations are controlled by more pairs of genes than second lactation or, if the same number or genes control both lactations, they have larger effects on first lactation. b) The presence of a genetic maternal effect that gradually decreases in importance could cause the estimate of heritability of first lactation to be larger. c) The presence of constant genetic effects on second lactations would lower estimates of heritability of second lactationsJ‘ Most authors seem to favor the opinion that environ- mental factors in the lifetime of the cow, such as diseases and.management, will.increase variation of later records.This could obscure genetic variation resulting in different estimates of genetic parameters. II.2. measu may are u once 1 treaty inV951 treatm a sequ includ design: Ir methods rat 10“ a Pointed a block to t study t L Tim. IR: 1‘“ ch er: 30 11.2 Analysis 2; Sequential 9; Repeated Observations 11.2.1 Repeated Measurements Experiments It is practice in a wide range of experiments to repeat measurements over time on the same experimental units which may be a plot, animal, or plant. Experimental designs which are used in these situations are: (1) Those in which experimental units are treated once only during the experiment or receive the same treatment repeatedlyu These include those experiments that investigate growth curves and profile studies like block.and treatment designs. (2) Those in which each experimental unit receives a sequence of treatments over successive times. These include rotational trials, change over and superimposed designs. In animal experiments, analysis by repeated measurement methods has some advantages. Gill (1978) reviewed the rational for performing experiments in this manner. He pointed out three major reasons, (1) to use each subject as a block to conserve resources and reduce experimental error, (2) to test for residual effects of treatment, and (3) to study trends of individual responses to treatments. Time is a factor in such experiments and the issue which arises relates to possible correlation between 31 successive observations. Gill and Hafs (1971) referring to this problem said: "the structure of underlying causes of ‘variation often is over-simplified in analysis of such data. Simplifications usually’fall into one or two categories: 1) Analyzing factorial experi ments as if completely randomized, when the structure is really a split plot and 2) Ignoring the correlations of errors induced by repeated measurements." Recognition that measurements within animals are likely to vary less than those among animals leads to the study of the random effect of individual subjects. This characterizes the split plot design where the error term no longer represents variation among homogeneous subjects treated alike but measures variation within subjects treated alike. Constancy of error variance and independence of errors, the underlying assumptions for univariate analysis of variance, often are violated in the analysis of repeated measurements. In the analysis of repeated observations, error terms from different times are correlated. When this occurs, it is said that the error terms are autocorrelated or serially correlated. This happens when the errors associated with observations in a given time carry over into future times. When error terms are functionally related to the mean, for example, periods with larger means also have larger varia: will 1 over c occurs ment c are tr: monitoj whe lease her a $pr 32 variances, and constant error variance or homoscedasticity ‘will be unreasonable. ‘When the variance is not constant over observations, it is said that heteroscedasticity occurs. To illustrate these conditions in the repeated measure- ment case, let us examine a simple model in which n animals are treated alike and each animal is measured r times to monitor changes of variable y. Let us write a model describing this relatonships. Y1 1 x1,t X1,t-1 x1,t-2 - ° ° x1,r b0 61 Y2 1 x2,t x2,t-1 x2,t-2 ° ° - x2,r b1 82 e e e e e e e e e b2 e - + yn l xn,t xn,t-1 xn,t-2 ° ' ' Xn,r br 3n where: y a column vector of observations for the dependent variable y, x =- matrix giving r repeated measurements of variable x, the first column of 1's representing the intercept term, b - column vector of unknown parameters bl,...,br, e s column vector of n error terms associated with each observation. . The previously mentioned underlying assumption of least squares methods on the error term can be represented more explicitly by deriving the error variance by rules of expecta We Pe 33 expectations. We can write: e2 (e1, e2 . E (e e') a E . Performing the multiplication we obtain: 3191 e182...elen E. 3231 e232...ezen en 61 en e2 en en the pre W11 othemi: E( e Rh 34 Applying the expectation operator E to each element of the preceding matrix we obtain: E(elel) E(elez) . . . E(elen) E (e e') - E(ezel) E(ezez) E(eze n) E(enel) E(enez) . . . E(enen) With homoscedasticity and no correlation serial or otherwise, the matrix reduces to: Iv.0 o! E(e e') 0 Ve . . . 0 0 0 . Ve because E(etet) - Va and E(et er) - cov et e s 0 where r f t . This is a homogeneous variance-covariance matrix. In other repl ic observ terms Ve To situatio gr” dis E(e 35 other words, each error term, observed across all possible replications, has the same variance regardless of the observations and there is no correlation among the error terms associated with different observations. E(e e') also can be written by an identity matrix: 1 o o O O O 0 Vs 0 1 0 . . . 0 II VeI . 0 0 . . . 1 To visualize the heteroscedastic (heterogeneous) situation where the error term for each observation is drawn from distributions with different variances but without correlation among error terms, we write E(e e') as: V81 0 0 e e e 0 E(. a.) - o V62 e e e o 0 0 . . Ven matri) the er 9t asa 36 To illustrate a heterogeneous variance-covariance matrix let the error term in period t, et be a function of the error term in period t-l plus a random component Vt et - p et_1 + Vt . By extending this equation backwards one can write each et as a function of only previous Vt °t ' ”2 Vt-z + p Vt-l + Vt ' p3 Vt-3 + p2 Vt-z + R Vt-1 + vt - SUM(i- 1,t) p1 vt-i‘ This is if that the series extends prior to the beginning of our sample of observations, so that even our first observation corresponds to a high value for t. As t becomes large with p<1, the series becomes: (1 + p2 + p3 + . . .) vt - 1/(1-p2) vt The variance of the error term in autocorrelation can be obtain from the expectation, this can be written: E(ezt) - E (ezt) + p2 E(e2t_1) + p4 E(e2t_2) + . . a Ve (1 + p2 + g4 + . . .) - Ve/(l-B) Tl expects Conse correla VCV a 37 E(et e't-s) - Vep. This expressions for E(etz) and E(e e') imply that the expected correlation between 3t and et_1 is p. Consequently, E(et et-l) for first order serial correlation the variance covariance matrix could be written: 1 p n2 . . . pn'l vcv - Ve p 1 p . . . en'2 ”11“]. fill-2 ”11-3 There are several ways in which the errors can be asso- ciated: periods of this association also could extend more than a previous observation. The number of periods of this association is known as the order: i.e., a second order process assumes that observations made 2 and 1 periods before the present observation are correlated with the present observation. Later in this review some other types of relationships will be presented. Whatever the extent or number of lags influencing the current observation, there is a correlation of repeated observations. This is a major problem in analyzing repeated measurements. Gill and Hafs(l97l) indicated that experience calls for examin decidi: univar varian typica; the in} illust to spl; mals. having Standaz Pmcedu Th measure “Sic a U-H-r'f‘HmdnL—Irf :1: O 38 examination of the variance covariance matrix before deciding the analytical approach because the validity of the univariate split-plot analysis depends on the homogenous variance-covariance matrix. The authors in presenting the typical split-plot case to animal scientists, pointed out the importance of random assigment of animals. They also illustrated the severe errors of interpretation if one fails to split residual errors into portions among and within ani- mals. It seems to be a concensus that for balanced data having a homogeneous variance-covariance matrix the standard univariate analysis of variance is the best procedure for analysis of repeated measurements. There are other procedures used to analyze repeated measurements. Multivariate procedures do not require the basic assumptions of homogeneity of variance and independence of errors because they deal with simultaneous variation of two or more variables (Sokal and Rohlf,l969). By this method repeated observations are treated as if they were different variables. Gill(l981) listed some of the advantages of multivariate procedures: "It offers opportunity to examine not only the original data but multivariate sets of linear functions (usually'contrast) of observational units without requiring equal variances and covariances among these unitsJ' He also added: "The principal advantage of multivariate analysis is that it permits valid tests of the repeated factor(e.g. time) and its interactions with treatments without justifications." However, Morrison (1967) demonstrated that unless the seri abso than race} data been that repea there are t leasu combi each mEasu 39 serial correlation between two periods exceeds .5 in absolute value, univariate procedures are more sensitive than the multivariate T2 test (Hotelling's T). It has been recommended (Gill,l978) that lowly but uniformly correlated data should be analyzed by univariate methods. Anotherdrawback of the multivariate procedures has been pointed out by Gupta and Perlman (1974). They found that the power of the test declined as the number of repeated measurements in one animal (unit) increased: therefore, one has to be careful with inferences when there are fewer animals than sampling points. There are other approaches to the analysis of repeated measurements, Gill (1979) presented a procedure that combines significance levels of correlated univariate tests, each performed on data from a different periods of measurement. This method is based in intratreatment correlations between data from all pairs of periods. Another approach to deal with the problem of serial correlation is to adjust for serial correlation or to use time methods such as autoregression analysis. Chistian et a1. (1978) opted to remove serial correlation by a moving average time series process of first order as described by Box and Jenkins (1970). The data generated by the time series analysis later were analyzed by conventional methods. The authors fail to provide theoretical support to the basis of their analysis.‘ Gill (1981) mentioning this method cement value b Ga. study 1: efficie An roduct Curves Parana: those ¢ was as: Correl The au- Vere 3 Value 40 commented that users of this procedure were skeptical of its 'value because of its low power. Gallant and Goebel (1976) demostrated in a monte carlo study that the autoregressive estimators are the most efficient linear unbiased estimators in finite samples. Anderson (1981) in a study of the Tribolium egg production curves as a model for dairy cattle lactation curves used this approach. He compared estimates of parameters obtained with the autoregressive process with those obtained with standard regression. In this study, it was assumed that only three previous egg counts were correlated with the current count, i. e., only three lags. The autoregression coefficients of each of the three lags were solved by Yule-Walker equations. To determine the value of the autoregression coefficients, autocorrelation from ordinary least squares was estimated. The original data then were adjusted by the coefficients to remove autocorrelation. The author postulated that autoregression, which is based on the autocorrelation between ordinary least squares residuals may be considered a transformation of the data. Estimates of parameters were similar for each method (autoregressive and ordinary least squares). Therefore, the author concluded, "Autoregression analysis improved the accuracy of the estimates of error variance and not the accuracy of estimates the model parame- ters.” 1 induce cattle to cor neasur hetero for a hetero< Correc1 5P1 it-} way the 41 Kesner et al. (1981) in a study to show that estradiol inducespreovulatory Luteinaizing Hormone.(LH) surge in cattle, used a similar approach to that of Anderson (1981) to correct for autocorrelations caused by the repeated measurement nature of the experiment. Facing a heterogeneous variance-covariance matrix the authors opted for a logarithmic transformation to correct for heterogeneous variance and an autoregressive process to correct for serial correlation. Further, they performed a split-plot analysis with transformed data improving in this way the accuracy of the estimate of error. St observe princid researc linear standar multiva °Pinion linear models. Particu 42 11.2.2 Analysis 9; Time Series Data Studies in which the responses of each individal are observed on two or more occasions represent one of the principle research strategies in animal and medical research. Much of the literature on specifying and fitting linear models for serial measurements use methods based on standard regression, analysis of variance (ANOVA) and multivariate linear models (Ware, 1985). Harville's (1977) opinion on this matter is that it is a mistake to think of linear models only in terms of ordinary regression and ANOVA models.'To do so is to miss many potential applications. In particular, not all the observations may’ have been taken at the same time so that the observations are regarded best as time series. Such time series data are common in many fields, such as economics. Ware (1985) indicated that analysis of serial measurements should be viewed as a univariate regression analysis of responses with correlated errors. Such a formulation suggests new and more flexible approaches to modeling and estimation of parameters For example, Wilson et a1. (1981) presented the methodology for estimation of biological parameters when the within individual observations errors are autocorrelated. It is timely now to discuss approaches to the study of change or behavior of a variable throughout time. This can be derived intuitively from.the study of repeated measurements. As an example, consider a series of mease be re the h may h varia to fa EXpla Varia varial it is Varial 43 measurements of the variable y throughout time. This could be represented as a time function y(t), which may represent the historical performance of some biological variable: it may have moved up or down partly in response to other variables. However, much of its movement may have been due to factors that one simply may have not been able to explain. It may be difficult to relate to other biological variables, or data are not available for those explanatory variables that are believed to affect y(t). In this case, it is no longer possible to predict future movements in a variable by relating it to a set of other variables: instead the basis of prediction depends solely on past behavior of the variable and that variable alone. The study of a single sequence of observations is called time series analysis. This is the study of repeated observations of a single variable, e.g., the study of a hormone during the estrous cycle. If some kind of, systematic behavior like a trend or a cyclical pattern is in the variable to study, one can attempt to construct a model for the time series which does not offer an explanation for its behavior in terms of other variables but does replicate its past behavior in a way that might help to forecast its future behavior. The choice of a time series model usually'will result in cases where little information is known of the determinants of the variable of primary concern, and a suf: thrc move out the 44 sufficiently large amount of data is available to construct a time series of reasonable length (number of measurements throughout time). A time series model accounts for patterns in past movements of a particular variable. Durbin (1960) pointed out the purposes for studyng time series was to 1) describe the data, 2) explain the behavior of the data, 3) forecast the behavior, 4) control the system, and 5) study simultaneous variation of several series. A simple view of the time series model is to think of it as a sophisticated method of extrapolation: however, difference arises because time series analysis presumes that the series has been generated by a random process: it is assumed that each y1, y2, Yt in the series is drawn randomly from a probability distribution. In this way, it is possible to infer something about the probabilities associated with alternative future values of the series. Theoretically, it is assumed that the observed series y1“.yt is drawn from a set of jointly distributed random variables, i.e., that there exists some probability distribution function, guy1.uyt) that assigns probabilities to all possible combinations of Yt' Because complete specification of the probability distribution for a time series is almost always difficult, especially if the series has more than six data points, it is necessary to constuct a simplified model of the time ser. be; acc: The but Nels actu 45 series which explains its randomness. For example, it may be assumed that the yd,uu,yt are correlated with each other according to a simple first order autoregressive process. The actual distribution of y's may be much more complicated, but this simple model may be a reasonable approximation. Nelson (1973) talking about the problem of finding the actual joint distribution concluded: "In practice, of course, we must first attempt to infer from the data what mechanism it is that generated the data. Thus, the past history of the time series is called upon to do double duty: First it must inform us about the particular mechanism which describes its evolution.through.time and.second, it allows us to put that mechanism to use in forecasting the future" Another characteristic of time series that one should know is whether the underlying random process that generated the series is invariant with respect to time. Such a series is called stationary: this requires the series to have a constant mean and to fluctuate about that mean with a constant variance. This is because if the random process is fixed in time, it is possible to model the process with an equation with fixed coefficients that can be estimated from past data. A non-stationary series will be one that has been chan- ging in average over time, presenting a trend: as an example we have some economical variables like.the.Gross National Product which has been growing steadily: for this reason alone in 193 P few of static] encount one or or vi], 1 Th differs: Order 0: 46 alone its random properties in 1980 are different from those in 1933. Pindyck and Rubinfeld (1981) mentioned that probably’ few of the time series that one meets in practice are stationary. However, many of the non-stationary time series encountered have the property that if they are differenced one or more times, the resulting series will be stationary or will be called homogeneous. The number of times that the original series must be differenced before a stationary series results is called the order of homogeneity. A first order homogeneous non- stationary time series (wt)‘will be: wt ' Yt ' Yt-i ' DYt where: wt - observations t of the w stationary series resulting from the subtracting of the observations Yt - Yt-l of the nonstationary series. If a series happened to be second order homogeneous, the series would be stationary by subtracting the first differences. It can be shown in this way wt ' DzYt ‘ DYt ' DYt-1° Time series data often are analyzed on the basis of line: proce proce (Box analy formul °rthee and Con 47 linear models in which errors are generated by stochastic processes like moving average processes, autoregressive processes, or mixed autoregressive moving average processes (Box and Jenkins,l970) Let discuss some of the basic models in time series analysis. Each observation can be represented with this formula: yt - mt + St + at where: Yt - observation in time "t" mt - trend or trend cycle effect St - periodic or seasonal effect °t - random error. As we discussed earlier, by differencing we can eliminate mt or the trend or trend cycle. The formula will be reduced and could be written: yt I St + at. The seasonal variation (St) in a time series can be removed by seasonal adjustment techniques that basically’consist of a complex averaging of past estimates. National economic data in the United States usually'is adjusted by the Bureau of Census of the U.S. Department of Commerce (X-ll Variant of the Census Method 11 Seasonal Adjustment Program, 1967) by thi hehavi cycles other produc decrea curren M averag antoree I] a weig) More. past Va A Obsel'v; “Ito 48 by this method of adjustment. Seasonality is a cyclical behavior that occurs on a regular calendar basis, i.e., cycles with a periodicity that is annual, monthly or any other unit of calendar time. For example, Peruvian anchovy production shows seasonal once every 7 years in response to decreased supply brought about by cyclical changes in ocean currents. Models used to study time series are: 1) moving average, 2) autogregressive, and 3) mixed model, the autoregressive-moving'averagetmodel. In a moving average model, Yt is derived completely by a weighted sum of current and lagged random errors. In the autoregressive model, Yt depends on a weighted sum of its past values and a random error term. A moving average model of order q where each observation Yt is generated by a weighted average of random error going back q periods (lags) can be written in the form of an equation and denoted by MA(q): yt ‘ M + at - 0 et_1 - 92 et_2 - eee Sq Bt_q where: 91 . . . eq a fixed parametes et_1 . . . et-q a error terms times t to q M = mean of the series. An autoregressive model of order p where each observation y is ge going the C1 denot where: 49 is generated by a weighted average of t past observations going back p periods, together with a random disturbance in the current period can be written with this equation and denoted as AR (p): Yt - pl Yt-l + 82 Yt-Z + . . . + 0p yt-p + M + at where: p1 . . . pp - weights of 1 to q past observations M a constant associated with the mean of the TS et - error term for time t. An example of a simple autoregressive model would.be the AR(1) or first order autoregressive process that can be written in the equation form: Yt ' A Yt-i 1 et' The number of past periods used in these models is referred to as the order, the number of lags, or the memory of the series. Identifying the order of the underlying process is a major problem. The knowledge of the autocorrelations will indicate the appropriate order. Many time series cannot be modelled.aSjpurely’moving average or as purely autoregressive. For these cases a logical extension of the models AR and MA is used. (M) hm , It 3 I The cc I differ a mode autore denote indica T in ani: them 1 Series RESQar SerieS lactat Amthe. satisr. animal 50 A mixed autoregressive moving average process of order (p,q) can be written with this equation and denoted as an ARMA.(p,q) mOdel: The components of this model already have been defined. If a series is non-stationary, it can be homogenized by differencing the series to produce the stationary series wt: a model using this series is said to be integrated. .An autoregressive model with a differenced series will be denoted as ARI (p, d) were the number of differences used is indicated by the letter d. Time series models presented have not been used widely in animal breeding. Perhaps one of the reasons for not using them is that traditionally the traits of interests are short series, i.g., total lactation yield, annual wool production. Researchers have not thought of these observations as time series data. But other important traits like growth and lactation curves have been considered time series data. Another deterent for the use of these models is because satisfactory methods for analysis of serial measurements in animal breeding are not applied easily. IIIiIMATERIALS AND METHODS 111.1 Data 111.1.1 Source 2; Data The "Asociacion de Criadores Holstein-Friesian de Mexico" sponsors a DHIA.(Dairy Herd Improvement Association) program. Records are processed by the regional Center at Provo, Utah, and completed milk records are forwarded to the United States Department of Agriculture (USDA) and Cornell University. Fat test are not recorded in Mexico. The herds consist mainly of cattle registered with the association plus grade Holsteins and crossbreds of other breeds. A number of cows were imported from the United States and Canada. Semen used to breed these cows is from either the United States, Canada, or Mexico. The data bank contained 152,331 records for calvings from 1969 to 1980. These data are 45,655 first lactations, 34,472 second lactations, 26,345 third lactations, and the remaining are fourth and later lactations. The record format is similar to that adopted at the 1966 National Computing Workshop and presented in the Dairy Herd Improvement Letter ARS 44-22 (1970). This population may not be a true random sample of the Mexican dairy cattle population because animals are identi- fied. They should constitute a superior population because 51 weDE in ten III.1. Restric houp w COanuh 52 the DHI herds in Mexico are under better management and are in temperate areas of the country. 111.1.2 Data Screening and Editing Procedures Records were deleted for several reasons including: Breed other than Holstein, Incorrect Identification, Unidentified sire, and Abnormal termination of lactation. For the purpose of this study it was necessary to form a subset of data with the records of cows that completed the first three lactations. There were 5259 cows with three consecutive lactations. Also no cow in this subset changed herds during the three first lactations. In addition to the previous requirements, more records were deleted for the following'reasons: Sires having fewer than six records, Lactations following in.a single herd-year-season group, and Records of daughters of sires disconnected. Restrictions on the number of daughters per sire to six was arbitrary. Lactations falling in a single herd-year-season group were deleted because all other effects would be confounded with herd-year-season and would have not contri- buted number Al the ii the th clinat througi Mexico WM 0; temper. Tl each 1 357, a. lactat 53 buted to estimation. For this reason there are different number of cows in each lactation. After screening and editing there were 3989 cows for the first lactation, 3985 cows for the second, and 3990 for the third lactation. They were distributed in 9 years, 2 climatic seasons, and 80 herds. Seasons are dry (October through March) and rainy ( April through September). In Mexico seasonal management changes are determined by the type of feeding and feedstuffs available, not by temperature. This is reason for only two seasons. There were different numbers of herd-year-seasons for each lactation also. Number of of herd-year-seasons are 361, 357, and 371 for the models for first, second, and third lactations, respectively. 111.2 Strate to Calculate the Error Variance-Covariance Matrix To understand methods of this study an overview of the steps needed to construct the error variance-covariance matrix of three lactations is necessary. In analyzing dairy records there have been three approaches taken by researchers. When they studied all or several lactation of cows, they used.the so-called.multiple lactation model. This model incorporated the cow effect and could be described by: where When lacta where In th, and e: multi} mixed that c a“ltiw c°lums yi to ; 54 yi - Fixed and Random Effects + Cow + ei where: e1 - random error. When only one lactation was studied at a time, a single lactation model is used, and it could be described by: Yi - Fixed and Random Effects + “i where: u. 1 ' COW' 1+el' In the latter case the residual “i contains the cow effect and the error effect e1. Another approach to study multiple lactations is multiple trait analysis. This procedure is an extension of mixed procedures designed to analyze simultaneously traits that can.be explained by similar'model equations. Unlike multivariate analysis where Yi are lined up side by side as colums of a matrix Y, the multiple trait operation requires Yi to stack up on top of the next to become a long column vector, y. Therefore, a basic multiple trait model is: where: 55 yl x1 0 o bl 21 o o l sl n1 0 - O O O O + O O O O + I where: Y1 is vector of trait 1 (lactation 1) equation being X fixed effects and Z random effects for traits 1 to t, b1 is vector of solutions for fixed effects, s1 is vector of solutions for random sire effects, “1 is vector residuals lactation 1. Each lactation is a different trait instead of repeated observations of the same cow. There are several multiple trait algorithms that have been used in the later lactation model. Walter and Mao, (1985) compared assumptions and statistical properties of each of them. Tong et a1. (1979), treated each lactation as a different trait with error covariances zero. Rothschild and Henderson (1979) and Meyer (1983) developed algorithms that do not require zero residual covariances. The approach used in this study to calculate the variance-covariance matrix was to fit a single lactation model to each of three lactations to obtain solutions. Estimates Yij were obtained by each observation Yij adjusted for so by sub form a residu as $1.10 56 for solutions. The individual residuals then were computed by subtracting Yij from observation Yij' Estimated residuals “ij of ith lactation in the jth cow form a vector of residuals. With the three vectors of residuals, one for each lactation, the matrix E was formed as suggested by Morrison (1967k “11 “21 “31 E ‘ “12 “22 “32 “in “2n “3n The E matrix then is premultiplied by E' (i.e., 15']: is pro- duced) to obtain the sum of squares and products SSCP matrix, which divided by the degrees of freedom becomes variance-covariance matrix VCOV named S. Steps to estimate the variance covariance matrix S is summarized: -Obtain solutions from fitting a single lactation model to each of the three lactations. -Estimate Yij from solutions. -Subtract Yij - Yij to obtain residuals “ij' -Form matrix E with the vectors of residuals uin' III.3. applie Where: 57 -Produce the SSCP matrix by multiplying E' with E. -Divide SSCP by degrees of freedom to obtain 8. 111.3 Medel 111.3.1 Equations and Characteristics 9; Model A model describing each of the variables of interest was applied to each lactation. Ytijk ' Hut + Sti + HYSij + ftl DIMtijk + °tijk where: Ytijk is the unadjusted milk yield for the tth lactation (t - 1,2,3) for the 1th sire in the jth herd-year-season from a population of registered Holsteins in Mexico DHI, having their sires identified and lactating from 1969 to 1980. Nut is the tth mean of the named fixed effects. sti is the random effect of the ith sire in the tth lactation: i - i, ..., 193: HYStj is the fixed effect of the jth herd-year-season in the tt'h lactation in which a cow freshened. The j has andifferent number of factor depending of the lactation 58 being fitted j = 361, j = 357, and j = 371 for t = 1,2,3. ftl is the regression coefficient of Ytijk on DIM Dn‘tijk is the effect of days in milk in the tijkth subclass, and etijk is the error effect associated with Ytijk' Factors HYStj and DIMtijk are fixed effects and Ytijk' stir and etijk are random. E(y) - Xb, and the variance of y is ‘V - 2 G 2' + R. Because the objective of this study is to examine the variance-covariance structure of the residuals, no specific assumptions about errors are set other than multivariate normal distribution of the three dimensional vector of errors corresponding to the three lactations: Cov (s, e):- 0 is no correlation between error and sire: (Vs) - G is that there is no covariance between 3's, in other words, no additive genetic relationship and independent sampling between s's: and that each sire effect is drawn from the same population with mean zero and variance Vs. Sire effect, 3, is normally distributed: no correlation between the ranking of sti and the number of observations for sti’ interactions of sire with herd-year-season effects are 59 trivial and negligible: and sire and residual effects are of primary interest whereas herd-year-season and days in milk.are nuisance factors. Presenting the model in matrix form one obtains: y - Kb + Zs + e where: Y X is the observation vector on either the first, second, or third unadjusted lactation is an n x p incidence matrix where n - number of cows and p is the number of parameters to estimate, number of herd-year-seasons (HYS) and one for column Mu. It contains 1's and 0's correspondingto the presence or absence of observations in the HYS: and for observation a 1in the column for Mu and the observation for the covariate. is a vector of length nHYS+l containing the unknown constants of the fixed effects b'-[ u, HYSleee HYSn’ b1] is a n x 193 incidence matrix containing 1's and.0's corresponding'to the presence or absence of observations within each sire. is an 193 by 1 vector of unobservable random effects for s, 31 . . . s193 is a n by 1 vector of unobservable random residuals corresponding to y. Characteristics of the model also can be expressed in matrix form: 30?) ' Kb E(s) - 0 wher« Vari; 6O COV(s,e) - 0 vy - V'- 262' + R where: G - Vs - E(s's) ' EIS' E(3)] [5 "' E(SH' $1 [31 82 e e e 519310 8193 For s all covariances are zero and both.have homogeneous variance i.e., 1 Va. Va 0 . . 0 0 Ve . . 0 G - e e e e e a 1193ve 0 0 . . Ve R is then R - Ve - E(ee') = E[6 ‘ E(3)] [e ' E(e)]' 61 31 [31 32 e e e $4076] . E3 32 For each independent model it is assumed that R has homogeneous variance, i.e., IVe. Ve 0 . . . 0 0 Ve . . . 0 RE 0 O O O O O -Inve O O O 0 ve In this way the variance-covariance matrix for all random factors of each model can be written y vn ZVS InVe Var s - 119 3V3 0 e Sym InVe 62 111.3.2 Numerical Example To describe methods used for setting up and solving mixed model equations, it is often beneficial to do so in an example. A hypothetical example of ten cows in three herd-year- seasons (HYS) with two covariates is illustrated. Tables 3.1 and 3.2. Table 3.1 - Data for example problem. Data are in Milk Sire Yield Cow Identification HYS kg/100 Cov Cov 2 1 l l 600 396 305 2 l 2 700 401 380 3 l 3 800 304 280 4 l 1 580 400 370 5 2 2 420 365 305 6 2 2 560 426 360 7 2 3 910 550 410 8 3 l 1010 370 320 9 3 l 340 400 340 10 3 3 580 410 380 63 Table 3.2 - Sire by Herd-year-season Distribution. Sire HYS 1 2 3 SW1 1 2 0 2 4 2 1 2 0 3 3 1 1 1 3 $0141 4 3 3 10 The model can be written in matrix notation: y - Xb + 23 + e where: y is the observation vector of milk yield x is an n x p incidence matrix, where n - 10 number of cows, p is number of parameters to estimate, and one for columnMu. It contains 1's and 0's corresponding to the presence or absence of observations in the HYS-, and for each observation a.l in the column for Mu.and observations for each covariate. b is a vector of length 6 containing the unknown constants of the fixed effects 2 is a 10 x 3 incidence matrix containing 1's and 0's corresponding to the presence or absence of observations within each sire. s is a 3 x 1 vector of unobservable random effects for s, [Y]= 600 700 800 580 420 560 910 1010 340 580 is a 10x 1 vector of unobservable random residuals corresponding to y. I HHHHHHHHI—‘H 64 OOOHI-‘O O HHOHOOOHO 396 401 304 400 355 426 550 370 400 410 305 380 280 370 305' 360 410 320 340 380 I [b] Mu H521 HYS3 covl cov2 U) ...n OOOOOOOHPH O OHI—‘H O HHI—‘HOOOO 65 The normal equation are then: X'X Z'X X'Z Z'Z and become for the example: X'y Z'y “111 “122 “133 “114 “225 “226 “237 “318 “329 “3210 '66 [ x'x z'z 1 10 4 3 3 4022 3454 g 4 3 3 4 4 0 0 1566 1335 g 2 o 1 3 0 3 0 1192 1045 g 0 2 1 3 o 0 3 1264 1070 g 2 o 1 4022 1566 1192 1264 1652234 140666531501 1341 1180 .3.:.-‘:2..:39.§.....1.24.5. 107° .3.19.6.9.9:.34939332333354.3273.-1922. 4 2 o 2 1501 1335 g 4 0 o 3 1 2 0 1341 1075 g 0 3 0 3 1 1 1 1180 1040 i 0 0 3 covl COVz heesee E X'Y: Z'Y 1'- 2,530 1,680 2,290 2,633,360 2,249,600 2,680 1,890 1,930 67 The variance of y a vy = ZGZ' + R [Z I I G J [ 2' 100 100 Veoo'11100 vysioo OVeO 00011 010 OOVe 00000 010 3x3 010 001 001 001 001 10x3 j R Ve Ve Ve Ve 0 Ve + 0 Ve Ve Ve Ve Ve 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 3x10 68 = [ VY J Vs+Ve Vs Vs 0 0 0 0 0 0 0 Vs+Ve 0 Vs Vs Vs 0 0 0 0 Vs+Ve 0 0 0 Vs Vs Vs Vs Vs+Ve 0 0 0 0 0 0 Vs+Ve 0 0 0 0 Symmetric Vs+Ve 0 0 0 0 0 Vs+Ve 0 0 0 Vs+Ve 0 0 0 Vs+Ve Vs+Ve Henderson (1975) derived mixed model equations (MME) by a maximum likelihood approach assuming normality. He proved that fixed constants from MME are equivalent to those from normal equations of generalized least squares and that predictors from MME are best linear unbiased predictor (BLUP) with or without normality assumption. Prediction refers to the likely outcome that one would expect of a future occurrence from a class, which is randomly out of a population of classes defined by a factor. The mixed model equation MME of best linear unbiased predictors are (Mao, 1983): 69 X'RTlX x'R‘lz b X'Rle = z'nflx Z'Rle +c‘1 s z'nfly Then,“ x'n'lx develops from: I X' 1 1 1 1 1 1 1 1 1 1 0 0 1 o o o 1 o o 1 0 0 0 0 1 1 0 o 1 0 1 1 0 1 0 0 o 1 0 o 410 400 370 550 426 355 400 304 401 396 380 340 320 410 360 305 370 280 380 305 e R‘1 l/Ve o 0 0 o 0 0 o 0 l/Ve o 0 o 0 0 o 0 l/Ve 0 0 o 0 o 0 l/Ve o o o 0 0 l/Ve 0 0 0 0 l/Ve 0 0 0 l/Ve 0 0 Sym l/Ve 0 l/Ve 6x10 .4 H r: 14 H F‘ r4 14 H Similiarly: OH Because 6'1 is: l/Vs O O 0 l/Vs 0 2 1501 1335 70 x o 395 o 401 1 304 o 400 o 355 1 550 o 370 1 400 1 410 x'R'lz 3 o 2 o 1341 1075 o o l/Vs 1 1180 1040 305 380 280 370 305 410 320 340 380 . l/Ve. 71 [ z'R.‘l z + 6'1 1 4 + l/Vs O 0 0 3 + l/Vs 0 l/Ve i0 0 3 + l/Vs [ x'R‘l x 1 10 4 3 3 4022 3454 4 4 0 0 1566 1335 3 0 3 0 1192 1045 l/Ve 3 0 0 3 1264 1070 4022 1566 1192 1264 1652234 1406665 3450 1335 1045 1070 1406665 1205850 In the same way: [ z'R‘lx 1 4 2 o 2 1501 1335 3 1 2 0 1341 1075 l/Ve 3 1 1 1 1180 1040 The right hand side is: 2,530 1,680 2,290 2,633,360 2,249,600 2,680 1,890 1,930 . 72 The mixed model equations are multiplied by R, and 6'1 is multipled by Ve/Vs, which is the variance ratio for error and sire components. In doing this 12'”1 is cancelled from both sides leaving a ratio of Ve/Vs - L in the diagonal of the z'z portion. The final MME is this: 10 4 3 3 4022 3454 g 4 3 3 4 4 0 0 1566 1335 E 2 0 1 3 0 3 0 1192 1045 g 0 2 1 3 0 0 3 1264 1070 E 2 0 1 4022 1566 1192 1264 1652234 1406665 g 1501 1341 1180 3450 1335 1045 1070 1406665 1205850 g 1335 1075 1040 .....2......3..."...6........2........I;6.1..........1.;.;5...§.;:;;}V;...m......6..."...........(.).. V 3 1 2 o 1341 1075 g 0 3+Ve/Vs 0 3 1 1 1 1180 1040 g 0 0 3+Ve/Vs HYSl 2,530 Hys2 1,680 HYS3 - 2,290 COVl 2,633,360 cov2 2,249,600 ...;I... ........2.:.6..:;). 52 1,890 53 1,930 . To solve for these equations a L ratio of 15, derived from estimates of heritability for milk yield was used. The 73 heritability of milk yield in dairy cattle has been estimated to be around .25. The paternal half-sisters heritability is: h2 - .25 a 4Vs / (Ve + Vs), set Vs = 1, then: .25 = 4(1) / (l + Ve), L - Ve/Vs - 15/1 - 15. then: L = 15. Solutions are: Hys1 559 HYSZ 492 HYS3 684 COV 1 = .64 COV 2 -.54 31 7.37 82 -3.46 53 -3.91 74 III.3.3 Programming Strategy The actual labor of setting up the mixed model equations requires understanding matrix manipulations. In programming mixed model equations one cannot read the records and create the incidence ( 0's and 1's ) matrix X and premultiply it to create X'X. This would require much computer memory, making this alternative inefficient. Therefore, it was required to set the mixed model equations, directly. To reduce further the number of equations within manageable limits and still obtain the exact same solutions as it all the equations were solved, a mathematical procedure called absorption was used. (Mohammad et al. 1985). For example, in this case one may have thousands of HYS's, and solutions for HYS's in this study are not of great importance, then one may absorb HYS into sire and covariate group. As described by Searle (1971), it is possible to reduce the mixed model equation by partitioning the XWX matrix into four submatrices and the XEy vector into two subvectors. This procedure is called.absorption by partition or block absorption because it involves reduction of. equations as a block after the normal equation is partitioned. 'To illustrate this process we will absorb the HYS submatrix into the Sire submatrix. 75 For this example, only the Sire, and HYS portion of the normal equation will be used, the covariate‘will ignored for simplicity porpuses. The row and column for Mu will be ig- nored, because the sum of the effects equals the Mu row and column. The normal equation will be of this form: z'z z'x Z'y x'z x'x 7 X'y 4 0 0 2 1 1 2680 0 3 0 0 2 1 1890 o 0 3 2 0 1 _ 1930 2 0 2 4 0 0 - 2530 1 2 0 0 3 0 1680 1 1 1 0 0 3 2290 _ To absorb HYSs submatrix into sire submatrix*we*will use the following formula z'z = z'z - z'x (x'X)"1 x'z . . .(1) x'x absorbed [ 2'2 J-[ 2'1!) [ (x'X>‘1 J [ x'2 4 0 0 2 1 1 1/4 0 0 2 0 2 z'zA - 0 3 0 - 0 2 1 0 1/3 0 1 2 0 0 0 3 2 0 1 0 0 1/3 1 1 1 76 4 0 0 .5 .33 .33 2 0 2 0 3 0 - 0 .66 .33 1 2 0 0 0 3 .5 0 .33 1 1 1 1.66 1 1.33 1 1.66 .33 1.33 .33 1.33 4 0 3 1.66 1 1.33 2.33 -1 1.33 Z'ZA - 0 3 0 - 1 1.66 .33 3 -1 1.33 -.33 0 0 3 1.33 .33 1.33 1.33 -.33 1.66 . To absorb the HYS subvector 3"}! into the sire subvector y'z we use the following formula Z'YX'yabsorbed a Z'y - z'x (x'X)"1 X'y . . . (2) my 1 [ z'x amt)"1 1 WY 1 2680 .5 .33 .33 2530 y'zA - 1890 - 0 .66 .33 1680 1930 .5 0 .33 2290 2680 2588.33 91.67 1890 - 1883.33 - 6.67 1930 2028.73 -98.33 77 The new normal equations with the nuisance effect absorbed will be: 2.33 -1 -1.33 51 91.67 -1 1.33 - 33 $2 a 6.67 -1.33 -.33 1.66 83 -98.33 This procedure is, nevertheless, inadequate for large data sets and computers because it will require set up and storage of the entire matrix. An alternative method is to use a row and column technique. This absorbs one class at a time while setting up equations pertaining to the effects of interest. This technique requires that data be sorted by classes of the factor being absorbed. Additional sorting by classes of the factors being adjusted reduces the programming task, but it is not required. The mixed model equation will be constructed from all records of one HYS group at a time. After all of its records are read, it will be absorbed immediately into the rest of the equation. In this way, it never*will be necessary to store more than one HYS at a time in computer memory nor will execution time be spent to obtain solutions for HYS. This process will be illustrated with.the same data sample. The algorithm is as follows. First, data are sorted by HYS; then sires are sorted within HYS. The data 78 sorted in this manner are presented in Table 34% Table 3.3- Example data sorted by HYS and sire sorted within HYS. Milk Yield HYS Sire Cow kg/loo Covl Cov2 l 1 1 600 396 305 1 1 4 580 400 370 1 3 8 1010 370 320 1 3 9 340 400 340 2 1 2 700 401 380 2 2 5 420 365 305 2 2 6 560 426 360 3 1 3 800 304 280 3 2 7 910 550 410 3 3 10 580 410 380 Because we are going to absorb HYS into the matrix con- taining sires and the covariates,let us call this matrix 2'2 and the elements of this matrix A(i row, j column). The right hand side vector will be RES and its elements ri. For our example, a memory array of n(number of sires) + 2(number of covariate) squared for 2'2 and a vector space of n+2 for RHS. We will read the first HYS into z'z, and element Aij represents the number of daughters of the sire Aij' the ri elements of R38 are the totals of milk yield of the daughters of sire i. For the covariates part, the Aij elements are totals of the covariate for sire as well as the sums of squares and crossproducts of the covariates in the RHS, the ri are crossproducts of the covariates with milk yield. As the first HYS is read, it will accumulate in a vector the total number of observation for each sire as well as totals for the covariates in a vector (VectHYS) of order n+2 x 1. We also will store the total number of observation in each HYS in the scalar nHYS and the total milk yield in scalar YHYS' A15 z'zA A55 RHS] After reading all the records contained in each HYS we will have VectHYS and the scalars “ms and YHYS' 8O veCtHYS V1 V2 [hays] [YHys] where: nHYS - number of records in HYSi V5 YHYS . Z'Y'S in HYSi' The new elements for matrix z'z and vector RHS wil 1 be gene- rated by: Aij - Aij - Vecti x Vectj/nHYS . . . (3) r1- r1 ' VECti x YHYS/nHYS e e e (4). After absorption of one HYS, the vector‘Vect.and scalars “Hys and YHYS are zeroed out for the next HYS. With this procedure only one pass of the data is required to complete the absorption and setup the normal equations. This procedure will be illustrated with the data used previously. Remember that herd-year-seasons will be absorbed into sires and covariates reducing our matrix by three to a 5 x 5 which equals the number of sires plus the number of covariates. The first Z'ZA matrix is formed after four records falling in the first HYS class are read. 81 [ z'zA [ 51 s2 s3 Covl 2 0 0 796 0 0 0 0 0 0 2 0 796 0 770 613716 675 0 660 523180 YHys = [21530] J Cov2 J 675 0 0 523180 447925 To form the absorb matrix Z'ZA and vector formulas (3) and (4), i.e., A11 - 2 - (2x2)/4 - 1 rl A13 - 0 - (2x2)/4 - -1 A31 - 0 - (2x2)/4 - 1 r2 A33 - 2 - (2x2)/4 - -1 A14 - 796 - (2x1566)/4 - 13 r3 A41 - 796 - (2x1566)/4 - 13 A43 - 770 - (2x1566)/4 - -13 r4 A34 - 770 - (2x1566)/4 - -13 A15 - 675 - (2x1335)/4 = 7.5 A51 - 675 - (2x1335)/4 - 7.5 A35 - 660 - (2x1335)/4 - 7.5 r5 A53 . 660 - (2x1335)/4 - -7.5 A44 - 613,716-(1566x1566)/4 - 627 A55 - 447,925-(1335xl335)/4 - 2368.75 A45 - 523,180-(1335x1566)/4 - 527.5 A54 - N " " = 527.5 [ Vect ] [ RHS 1 2 1180 0 o 2 1350 1566 979300 1335 836400 nHYS = [4]. RES we use the - 1180-(2x2530)/4 = -85 - 0 - 0 - 0 - 1350-(2x2530)/4 - 85 979,300-(1566x2530)/4 -11,195 835,400-(1335X2530)/4 “7,987.5 82 E ”A J E ans 1. 1 O -l 13 7.5 -85 O 0 O O 0 0 -1 O l -13 -7.5 85 13 0 -13 627 527.5 -ll,195 7.5 0 -‘7.5 527 2368.75 -7,98‘7.5 The following HYS class is accumulated into Z'ZA and R38 [ z'zA 1 [Vect] [ ans 1: 1+1 0 -1+0 13+401 7.5+380 1 700-85 0 2 0 791 665 2 980+0 -1 0 1 -13 -7.5 0 85 13+401 791-13 -13 627 527.5 672,560 +475,065 +417,065 1192 -11,195 7.5+38 665 -7.5 527.5 2368.7 595,700.0 +417,065 367,025. 1045 -7,987.5 “Hys - [3] yHYS = [1680]. New 2'2; and R38 are then calculated with formulas (3) and (4), i.e., All - 2 -(lxl)/3 = 1.66 A12 - 0 -(1+2)/3 - -.66 r5 - 5,877,125.5 - (1045x1680)/3 = 2512.5. The second Z'ZA and RES is then obtained: 83 [ z'zA 1.66 -.66 -1 16.66 -.66 .66 0 -3.66 -1 0 1 -13 -16.66 -3.66 -13 2,507.66 39.16 -31.66 -13 2,379.16 1 39.16 -31.66 -7.5 2,379.16 5,385.41 E RES 1. 55 -140 85 6155 2512.5 This example only has three HYS classes to absorb. Thus, the information in the last HYS group is accumulated now. I 1+1.55 0-.56 0-066 1+066 0‘1 0 16.55+304 '3.56+550 39.16+280 '31.56+410 z'zA 0-1 16.66+304 o -3.66+550 1+1 -13+410 -13+410 +2507 563016 -7.5+38 +2379 466420 YHYS - [1680]. 39.16+280 1 -31.66+410 1 -705+308 +2,379 466420 +5385 390900 ] [Vect] [RES ] 800+55 910+l40 1 580+85 6155 1264 +981500 2512 +817500 107q The last Z'ZA and RES then are calculated with algorithm formulas (3) and (4), All I 2.66 - A12 I -.66 - r5 8 820,012 - i.e., (lxl)/3 = 2.33 (lxl)/3 a -.99 ~ 1 (lO70x2290)/3 = 3245.85. 84 The final absorbed matrix Z'ZA and vector RES are: ( z'zA J [ RES ]. 2.33+15 -1 -1.33 -100.67 -37.50 91.67 -1 1.33+15 -.33 125 21.67 6.67 41.33 -.33 1.66+15 -24.33 15.83 -98.33 100.67 125 -24.33 32958 17972 10491.6 -37.50 21.67 15.83 17972 14652 3245.83 Before 2'2; is inverted.and.s and covariate are solved.the ratio 1:. was added to the diagonal elements of random or sire components as one can see in the underlined elements of Z'ZA. This is to follow the MME procedures described. The solutions obtained by these two methods of absorption are the same as those from the unabsorbed matrix. However, no solutions are obtained for the absorbed fixed effects. The solution vector is: 81 70375 A 32 -30462 83 - -3.913 cov~1 .649 C0V~2 -0546 e The solutions for HYS can be obtained.by back solving. This is using the solutions, the accumulated Vect of each HYS, and nHYS and yHYS. This either requires storing the Vect, nHYS and yHYS information or rereading the data. Generally' 85 the data are reread as this reduces storage requirements significantly. For the present example one has for HYSl. Solutions Vectl nHYS YEYSl 7.375 2 [4] [2580] -3.462 0 -3.913 2 .649 1566 -.546 1335 Hys1 - [2580 - 2(7.375) - 0(7.375) - O(-3.462) - 2(-3.913) -(.649) (1566) - {-.546) (1335) 1/4 - [2235.65]/4 s 558.91. For EYSZ similarly; Vect2 nHYs2 YHYSz 1 [3] [1680] 2 0 1192 1045 HYSZ = [1680 - 1(7.375) - 2(-3.462) - .649(ll92) - -.546 (1045)]/3 8 [1476.51]/3 a 492.17. Finally, for HYS3 we have: Vect3 nHYS3 YHYS3 1 [3] [2290] 1 1 1264 1070 86 HYS3 - [2290 - (7.375) - (-3.462) -(2.913) -.649 (1264) - -.546(lO70)]/3 - [2053.881/3 - 684.62. In this way the vector of solutions is identical to that obtained with the direct inversion approach. This also could be used to check the two methods. Hvsl 559 HY82 492 HYS3 684 31 __ 7.37 32 " -3.46 83 -3.91 covl .64 cov2 -.54 An in-house Fortran program to accomplish this task was developed. The International Mathematical and Statistical Library (IMSL) of Fortran subroutines was used in part to do the matrix manipulations. The matrix inversion algorithm used was developed by Healy (1968). The in-house program is in Appendix A. 87 III.4 Variance Components Estimation There are many methods for estimation of variance components from.unbalanced data. These methods vary in degrees of computational complexity. There is no method that can be said to be best for all situations (Mao, 1981). The appropriate method is determined by the statistical properties and characteristics of the model studied. The iterative restricted maximum likelihood (REML) procedure using solutions from mixed model equations (MME) was used to compute variance components in this study. The most desirable characteristics of REMI.are (Mao, 1981): 1) It renders non-negative estimates of variance compo- nents when MME solutions are used in maximum likelihood equation. 2) It maximizes the random portion of the likelihood which is invariant to the fixed effects in a mixed model and does not require that the fixed effects are known, as in maximum likelihood (ML). 3) REML can be used in iterative computations. Let us continue using the sample of data to illustrate the iterative restricted maximum likelihood procedure. The solutions from.mixed model equation 5 and 3 will be computed for each lactation by: U!) 5 X'Z Z'z + G'lL z'x Z'y x'x X'y where: L is a diagonal matrix of the variance ratio of Ge/Gs, and 6'1 is the inverse of the relationship matrix for sires. Let C be generalized inverse of the coefficient matrix: Z'z + G"1 X'Z Z'X X'X 88 8X cxs Cxx The generalized inverse of A.absorbed matrix from sample data with the ratio L of 15 is the following: .0590 0.0017 c’ - 0.0054 0.0003 -.0002 L1 =3 15. 0.0017 0.0649 0.00001 -.0005 0.0006 0.0054 0.0003 -.0002 0.00001 -0.005 .0006 0.0612 0.0002 -0.0003 0.0002 0.00009 -.0001 -.0003 -.0001 0.0002 89 The RHS are: 91.67 6.67 -98.33 1049.16 3245.83 The solutions are: A 51 7.376 A 82 -3.463 A 52 -.546 Where 31 corresponds to the random effects, sire effects, and 5i are solutions for covariates. The REML estimates of the variance components are: 9e - (y'yc - b'X'y - s'Zy - Ls's)/[n- r(x)]- Because the absorption of some of the fixed effects HYSi took place prior to solving the equations, no explicit solutions of those absorbed.fixed classes are available. The computation of residual variances required some adjustments. During the absorption process, sums of squares pertaining to the HYS were collected. These sums of squares are to be subtracted from the total sums of squares, y‘yc. Because the program was designed to obtain the EYSi solutions with the back up procedure previously illustrated, the appropriate sum of 90 squares was calculated, therefore, with the HYS solutions; both techniques give identical results. The product of L ss'ss also had to be substracted for the computation of the residual variance. This corresponds to the L variance ratio added to the random portion of the MME as demonstrated by Thompson (1969). Because only one random factor is involved as in the present model the sire variance is computed under the assumption: V(38) ' GVs’ Therefore: 68 3 [8.88.8 + (tress)] / qs where: trc is the trace of the c generalized inverse corresponding to sires qs is the number of classes in the random sire effects. Let us continue with the numerical example. Y'Yc a Y'Y " SSHys where: SSHYS is the sum of squares of absorbed EYS. 91 y'yc - 4,616,600 - 25303 - 16802 - 22903 4 3 3 I 4,616,60 - 4,289,058 I 32,7582 b'x'y - (0.649) (10491.6) + (-.546) (3246) I 5037 s'Z'y - (7.37) (91.67) + (-3.46) (6.67) + -3.91 1039 3'3 - 7.372 + -3.462 + -3.912 I 81.8 A Ls'S - (15) (81.84) - 1228. trC I 0.059 + 0.0649 + 0.0612 I 0.1851 Ge - (327,582 - 5037 - 1039 - 1228)/8 - 40,035 Gs - [81.84 + 90.1851) (40,035)]/3 - 2,497. With the estimate of Va and Gs, a new ratio can be computed. L2 . Ge / Gs - 40,035/2497 - 16 92 As one can see, 3 and 8 depend on L; Ve relies on a, Ve, and L: 9e relies on Q, 5, and L; and 9e and 9s are needed to compute a new estimate of L. These relationships lend REMI. to iterative computation. The heritabilities in the literature determined the first L. The process of solving for 6 and 3, then computing Ve and Vs is continued by replacing L with the new ratio of 9e,/ Vs and recomputing the REML estimators until the current and new ratio converge ( convergence was defined by when the difference between the current and new ratio was less than.2). From the first round of iteration in our numerical example, the new L ratio calculated with the variance esti- mates was 16. This ratio was added to the sire portion of the MME before inversion, and a new process then was started. The statistics for the computation are: .0755 .0015 .0047 .0002 -.002 .0015 .0603 .000 -.0005 .0005 C- .0047 .000 .0570 .0002 -.0003 -.0002 -.0005 .0002 .0000 -.0001 -.0002 0.0005 -.0003 -.0001 .0002 L2 ‘ 16. 93 The new solutions are A 51 6.92 A 82 -3.22 A 83 = -3.70 52 -0544 y'yb - 32,7582 tress = .1928 b'xy - 5001 3'3 - 85.63 s'Zy - 976 LQ'S - 1370 Va - 40,029 A Vs - 2,601 L3 - 15032’ If one were to continue the new ratio to be used would be 15.32. For the purpose of this example it was chosen to stop here. 94 III.5 Heritabity Estimates With the estimates of residual and sire components of variance an estimate of heritability for milk yield was calculated. As defined by Lush (1945) heritability in the "narrow sense" is the proportion of the total variance in a trait that is attributed to the average of additive effects of genes. Heritability in the "broad sense" is the fraction of total variance attributable to genetic variance, which not only contains the variance due to additive effects but includes variance due to dominance and epistatic effect. Heritability in the "narrow sense" is the one usually' referred in the literature and in this study. Genetic theory indicates that in a randomly mated popu- lation the half-sibs Vs is equal to one-quarter of the additive genetic variance (with zero epistatic variance), and Vs + Ve expresses the total phenotypic variance. Therefore, heritability in the narrow sense can be expressed as: h2 - 4 Vs /(Vs+Ve). The denominator is the phenotypic variance after the variance due to named fixed effects in the model have been 95 removed. The approximate standard error of the ratio of variance components will be used to compute the standard errors of heritability (Dickerson, 1969). 55(2/9) - 4/9 [ son V(fi) ]. III.6 Ranking 9; Sires Sire solutions for each lactation were compared. Spearman's rank correlation analysis was used (Gill, 1978). Spearman's rs is defined as the sum of the squared differences in the paired ranks for two variables over all cases, divided by a quantity which can be described as what the sum of the squared differences in ranks would have been had the two sets of rankings been totally independent. rs - 1 - [6 SUM{i-l,n} dizj / [ n3-n 1. In animal breeding rank correlations are used to compare methods of sire evaluations. 96 111.7 Methods for the Analysis of the Error Variance-Covariance Structure Procedures to construct the variance-covariance matrix of the three lactations are outlined in III.2. Solutions for the model were stored on disk. Estimates of Yij were obtained for milk yield of lactation it11 in jth cow ( i.e., Yij ) from the stored solutions. Individual residuals were estimated by subtracting the estimates from the observations: “:3 ’ Yij ' Yij where: uij is the estimated residual of lactation 1th in the jth cow. The estimated residuals for each lactation were stored in vectors. The matrix ( a vector for each of the three lactations) of residuals was squared and multiplied directly to form a sum of squares and cross products matrix (SSCP). The SSCP matrix was divided by degrees of freedom, which is the number of observations minus the number of fixed effects fitted into the model, that is, covariable days in milk and corresponding herd-year-seasons. Once divided the matrix SSCP becomes the variance-covariance 97 matrix S. Direct multiplication and squaring was a shortcut to avoid matrix multiplication of the three vectors of the residuals matrix. 111.7.1 Test for Homogeneous Variance A procedure to examine the assumption of homogeneous variance and common covariance is described by Gill (1978). This procedure was developed to study repeated measurement designs. One can think of the three lactational milk yields per cow as repeat measurements. To better understand the procedure let's follow a formal explanation. Among r animals measured, one may compute p different variances (S2 =- Sii' i - 1, 2, ..., p), one for each of the p periods of measurements. One also may compute a covariance ($11., ifi') for any two periods. Because $12 =- 821, $13 - 831, there are only p(p-l)/2 different covariances. The p variances must be estimates of a common population variance "V” and the p(p-l)/2 covariances must be estimates of a common population covariance "VV". If the population variances are equal, any population covariance is equal to the common correlation in the population "rho" multiplied by the common variance, i.e., "rhoV". From the sample of data, one may estimate V from the average sample variance, 3 - SUM szi/p, and rhoV from the average sample covariance, sij - SUM SUM sii./(p(p-1)/2). The p x p matrix 98 of common variances and covariances of the population is an "ideal" matrix SIGMO to which the real population variance- covariance matrix 816)! may or may not conform to the hypothesis Ho:SIGM=SIGMO. The sample variance-covariance matrix S is an estimate of 8161!, and the matrix of averaged sample variance- covariance So is an estimate of SIGMO if the hypothesis is true: V1 W12 0 e e Wlp V rhOV e e e rhOV VVbl VVPZ V3 rhoV rhoV V And the sample matrices would be: 81 512 e e o 31p Si Siil e e e Siil S B 521 52 e e e 82p 50 a 811' Si 0 e e 811. spl SP2 0 e 0 SP SiiI SiiI e e 0 Si The test for the Ho:SIGM - SIGMAO used in this study was developed by Box (1949, 1950). The statistics for an experiment with r subjects, each measured for y at p times 99 are: hl - tp loge (|S|/ISOI) where |S| and |So| are determinants of the matrices S and so. For p 5 the test statistics is q - (1-h1)h3 versus the table Chi Square statistic with v degrees of freedom, v1 - (192 + p-4)/2- The hypothesis is rejected if this statistic exceeds the critical value for alpha, value of the test. III. 7. 2 Method to Test for the Hypothesis that Errors 53113; a —FI—§E Order Autoregressive Mod31——_- A maximum likelihood procedure was used to test the hypothesis of Mansour, Nordheim, and Rutledge (1985). They stated that the errors of consecutive lactation milk yields follow a first order non-stationary autoregressive process. The objective of the test is to obtain from the variance-covariance matrix calculated from.data S, estimates of ‘Va, Va, 0 the model parameters. The estimates were used recalculate a matrix SIGMAT. This was used to test the hypothesis ( i.e., Ho:SI<§Mu =- SIGMAT ) with Chi Square procedure. The estimate of a the autoregressive parameter and R s 100 ‘Ga/Ve were fitted with an iterative routine used to maximize this three statistics simultaneously. The derivation of the method is as follows; the autoregressive model can be expressed: Let the errors e1, e2, e3 be independent and each normally'distributed with mean zero and variance Ve ~N(0,Ve). Let the observed residuals be: 111-91‘1'3 u2 - 0 ul + e2 + a where: a is the random effect of the cow and is assumed independently and identically distributed (iid). N(o, Va), Va variance attributed to genetic and permanent environmental factor. Ve ‘variance attributed to temporary environmental factors. then: u - u2 ~ N(O,SIGHu). 101 The vector of residuals corresponding to the three lactation periods is assumed to be.multivariate normally'distributed with mean zero and variance SIGMu, the variance-covariance assuming first order autoregression of error where: Va+Ve Va+¢Ve Va+pZVe srsuh - Va+pVe Va+Ve+pZVe Va+pVe+p3Ve Va+¢2Ve Va+pVe+¢3Ve Va+pZVe+p4Ve . Let R - Va/Ve, then: R + 1 R + p R + p? SIGHu-VeR-bp R+02+1 R+p+p3 avup. R + 02 R + p + p3 R + 02 + 04 +1 The log of the likelihood function for n independent observations “1' . ., un on u is Under Ho: that Mansour model holds log L a - l/2N log | sxcnul - 1/2 SUM ui SIGM'lui a - l/2N [log |A¢| + 3 log Ve + {trmp-l S)}/Ve] 102 This is maximized for Ve - 9e - 1/3 traifl,’l S), where: p and R are chosen to maximize: log L - -l/2N [loglAg | + 3 log {1/3 tr(A¢'l 8)} +3] - -l/2N [loglAgl + 3 log tr (Afl’lsn - l/ZN [-3 16g3+3]. Then test with Chi square statistic: 0 - N [ 16ga lApl + 3 16ge tr(A¢'l 5) - (16ge |S| + 3 loge 3 ) ]. is asymptotically distributed as Chi Squaren,3 - Chi Square3 under the null hypothesis, Ho: SIG)!u - SIGMAT. The hypothesis is rejected if the appropiate test statistics exceeds the critical value for alpha value of the test. IV RESULES AND DISCUSSION 1V.1 Basic Parameters of the Model for each Lactation The first results of this study were those from the fitting of the model to each lactation. lResults, reported in Table 4.1, indicate the characteristics of the model for each lactation. Table 4.1 Basic parameters of the model for each lactation. Lactation Parameter 1 2 3 Number of records 3989 3985 3990 Number of sires 193 193 193 Number of herd-year-seasons 361 357 371 R2 .97 .97 .97 Numbers of records used in the analysis of the three lactations differ because observations belonging to a sin- gle herd-year-season group were deleted during the program execution. This difference is small, therefore, numbers of 103 104 observations were equal for the purpose of this study. The number of sires in the analysis was the same in each lactation. This indicates that the number of daughters per sire was enough to make up for the loss of those single herd-year-season observations. The number of herd-year- seasons absorbed into the sire and covariate effect was also similar for each lactation. Because the number of observa- tions and classes are similar for each lactation, it was assumed that there was not a major difference in the of the analysis for each of the lactations. Similar size of incidence matrices and equations to be absorbed seemed to support this assumption. The coefficient of determination R2 which measures the proportion of reduction of the variation of milk yield achieved by the model was high and identical for each lacta- tion. This indicates the adequacy of the model in descri- bing causes of variation for each lactation. IV.2 Variance Components and Heritability Estimates Means and coefficients of variation for milk.yield for the three lactations as well as the iterative REMI.estimates of variance components and heritability are in Table 4.2. 105 Table 4.2 ‘Variance components, heritabilities, L ratios, means and coefficients of variation for milk yield in three lactations. Lactation 1 2 3 Mean of unadjusted milk yield (kg) 4,890 5,520 5,520 Coefficient of variation for unadj. milk yield .27 .26 .34 Sire Variance (Vs) 837 664 703 Error Variance (Ve) 7,416 10,054 11,469 Final L ratio Ve/Vs 8.86 15.56 16.10 Heritability (hz) .40 .23 .23 Standard error h2 .04 .03 .03 Means for unadjusted milk.yields were 4,890 kg for first lactation and 5,520 kg for second and third lactation; this 12.5% increment in milk yield between first and later lactation is expected from the normal rate of maturing of the cows. Production is approximately the same for the Holstein dairy herd in Mexico as reported by McDowell et al. (1976b) and Cisneros et a1. (1980). Variability of milk yield expressed as the coefficient of variation was similar for the first and second lactation and higher for the third: this also was expected because as 106 a cow gets older the probability that environmental factors affecting its performance is higher. Another explanation could be that because of the relative higher cost of registered cattle in Mexico, dairy breeders tend to retain relatively lower producing cows in the herd for the purpose of getting an extra valuable offspring out of them. The latter would explain both the similarity of milk yield from the second and third lactation, as well as the higher variability for milk yield in the third one. Sire variances were lower for second and third lacta- tions. The explanation that environmental variance increases throughout the life of a cow overshadowing its genetic makeup (sire variance), as suggested by Barker and Robertson (1966) does not hold. Sire variance for the second lactation is lower than that of third lactation. The same authors indicated that analyses of Swedish data showed a decline of sire variance from first to second lactation, followed by an increase from second to third lactation (Hansson and During, 1961), in exactly the same manner as in this study. This would be expected if selection had been based on factors having a higher environmental than genetic connection with milk yield; for instance, accidents, diseases, or low fertility. Another explanation for these results could be increase of milk yield with age at calving. Sire differences for rate of maturity and therefore, higher milk yield in later lactations also could cause sire 107 variance to increase in later lactations. However, Hansson and During (1961) suggested that environmental disturbances may have a cumulative.effect over succeeding lactations causing this difference in sire variances. Differences in sire variance are believed to be caused by the cumulative effect of environment over succeeding lactations. Heritabilities of milk yield for first, second, and third lactations agree with the literature. Table 4.3 summarizes the heritabilities for three lactations in Holstein cattle. The similarity of heritabilities estimates with those reported is not only in their magnitude but also in their trend. Heritabilities for'milk.yield in first lactation is higher than those of later lactations. Most authors favor that environmental factors in the lifetime of the cow, such as diseases and management, increase varia- tion of later records. There are other'possible explana- tions for the decline of heritability such as larger effect of genes controlling first lactation than second lactation. The presence of a genetic maternal effect that gradually' decreases in importance also might explain the decline. Yet another possible explanation is the presence of constant genetic effect on second lactations which would lower estimates of heritability of second lactations (Butcher and Freeman, 1968). Recent heritabilities are from maximum likelihood (ML) Rothschild and Henderson (1979), Meyer (1983), and from res- 108 tricted maximum likelihood (REML) Tong et al. (1979). The study of Powell et al. (1981) uses a set of data similar to that of the present study because the daughter-dam regression procedure used required that both dam and daughter have three lactations. Heritabilities can not be compared because they are drawn from different populations which are under different climate, management, and selection practices. They also vary in the analytical procedure used to estimate heritabilities. And the trait, i.e., milk yield, was sometimes actual and sometimes adjusted. Adjustments were not always the same. McDowell et al. (1976b) using a subset of the same data used in this study reported a heritability estimate of.12 for milk yield. To estimate this heritabili- ty, they used a sire component within herd from Henderson's method 1 (an alternative analytical procedure) Conclusion from this study is that because one is able to observe an increase of environmental variation, heritabilities would tend to be lower in later lactations. Cumulative effects of injury or mastitis would be consistent with this viewpoint. 109 Table 423 IHeritabilities for milk yield of the first three lactations in Holstein cattle. Lactation Source and Method 1 2 3 Freeman (1960) daugher-dam regression .36 .24 .26 Molinuevo and Lush (1964) daughter-dam regression .30 .09 .08 Barker and Robertson (1966) paternal half sisters .35 .24 .23 Butcher and Freeman (1968) intrasire regression of daughter on dam' .37 .25 .35 Butcher and Freeman (1969) Fiveprocedures .17 to .39 .11 to .35 .11 to .40 Maijala and Hanna (1974) paternal half sisters .26 .20 .17 Rothschild and Henderson (1979b) paternal half sisters .41 .35 -- Tong et a1. (1979) paternal half sisters .26 .19 .17 Powell et a1. (1981) daughter- dam regression on Modified Contemporary deviations .36 .31 .26 Meyer (1983) paternal half sisters .33 .34 .28 This study .40 .23 .23 paternal half sisters 110 1V.3 Effect of Covariable Days in Milk Data were not adjusted to the customary 305 days lacta- tion but actual days in lactation were fit linearly as a covariate with milk yield. Fitting one regression coefficient per lactation to correct for the systematic effect of lactation length allowed full expression of variation. Regression coefficients of days in milk.on:milk yield for first, second, and third lactations were 1.36 i .06, 2.33 i .08 and 2.16 $.03. The magnitude of the coefficients is associated with milk production in each lactation; the coefficients for the second and third lactation are almost a unit greater than that of the first lactation Meyer (1983) also fitted linearly lactation length as one of the regression coefficients in her model: however; she did not report the regression coefficients. 111 IV.4 Sire Solutions for Each Lactation Solutions for the 193 sires were computed for each lactation. Summary of these solutions is in Table 44L The mean of BLUP sire solutions are zero by definition. Table 4.4 Range and standard deviation of BLUP sire solutions (n - 193) each lactation. Lactation 1 2 3 Range -90 to 77 -46 to 42 ~46 to 45 Standard deviation 20.88 16.00 16.51 The range of the BLUP sire solutions is wider for first lactation than for second and third. This variability also could be appreciated through the standard deviations. The standard deviation for the BLUPs for first lactation is almost 25% higher than for the others. Lofgren et a1. (1983) also reported a 5% higher standard deviation of BLUPs for first lactation. Results of the Spearman's Rank correlation for the sire solutions are in Table 4.5. The range of rank correlations between sires with daughters in different lactations, i.en only first lactation or only second lactation, reported in the literature goes from .64 (Tomaszewski et al., 1975) to 112 .95 (Lofgren et al., 1983). Most of the estimates are in the upper part of this range. Rank correlations from this study are low (Table 4.5). Schneeberger (1982) reported a low rank correlation between first and second lactation BLUP sire values for Holsteins in Venezuela. One of his explanations for this was that when the environmental fluctuations are larger and number of progeny per sire is low, variation of sire values is high. McDowell (1983) also agreed with this explanation. Because the primary objective of this study was not to estimate sire values, sires with as few as six progeny re- mained in the data. If the objective of this study had been to estimate sire values accurately, a larger number of progeny per sire would have been required. In the data set, the majority of bulls had between 6 and 14 progeny. The low number of progeny per sire is probably the explanation for the low rank correlation and the large standard deviation and range the sire values. 113 Table 4.5 Rank correlation* of BLUP sire solutions for the three lactations. Lactation First Second Second Lactation .58 Third Lactation .53 .59 *Spearman's "r" 114 IV.5 Analysis 2; the Variance-Covariance Matrix The variance-covariance structure from this study is: 7,416 3,991 3,114 (3989) (3950) (3954) s - 3,991 10,541 4,813 (3950) (3985) (3941) 3,114 4,813 11,469 (3985) (3941) (3990) Numbers within parenthesis correspond to the number of observations. As explained earlier, not all cows were in each lacta- tion because of elimination of single herd-year-seasons during the program execution. Therefore, the degrees of freedom were calculated from each model for each element of the S matrix. For the case of diagonal elements, these are the same as for the error variance component, that is, the total number of observations minus the number of fixed effects fitted into each model. In this case, fixed effects are covariable days in milk, and number of herd-year- seasons. For the off-diagonal elements, degrees of freedom were calculated from the number of products minus the average number of herd-year-seasons in the two involved lactations. The difference of degrees of freedom was small. The degrees of freedom were 3,627 for first and second llactations and 3,618 for third lactations. For statistical 115 testing purposes, it was decided to use 3,985 as the number of observations because it represented a midpoint between the several numbers of observations. IV.5.1 Test for Homogeneous Variance The purpose of this procedure was to test the null hypothesis H:SIGM -SIGM° by the method of Box (1949, 1950), or in this case H: S - so. The test consisted in obtaining an estimate so by separately averaging variances and covariances in S. Then the likelihood ratio criterion,i.e., of the ratio of the determinants of matrices S and 80, L - |S| / |So|,was used. The average variance-covariance matrix So from our data was: 9,809 3,973 3,973 so - 3,973 9,809 3,973 3,973 3,973 9,809 - repeated measurements p - 3 - total of animals r - 3,985 measured 51 - [3 (42) (3)1/[6 (3984) 2 (8)] n1 - .0004 h3 . [-3984 log 3 {I 5.59 x 1011 | / | 6.04 x 1011 |) h3 - 308 116 q I (1 - .0004) (308) I 307 q - 307 which by far exceeds Chi square 0.01, 4 - 13.27. Therefore, one should conclude that variance covariance matrix s is heterogeneous. Examining the estimated variance-covariance structure one can note increasing values along'principal diagonal (7416, 10541, 11469), increasing values on the first diagonal (3991, 4813), and decreasing values in the first row (7416, 3991, 3114). It appears that the error variance increased with time and that the error covariances are larger in adjacent lactations. This pattern is similar to that of the error variance-covariance matrix of lactational yield of milk fat reported by Butcher and Freeman (1968: 1969) and used by Mansour, Northeim, and Rutledge (1985) in a first order autoregressive model. Other estimates of error variance (Ve) for first, second, and third lactations also show an increase for second and third lactations. 'Tong et al. (1979) reported first, second, and third lactation (REML) Ve estimates of 543,531; 715,826 and 824,750. This is an increase of 30% for the second and of 50% increase for the third with respect to the first. Rothschild.and Henderson (1979a) reported (ML) Ve estimates of 900,581 and 1,032,391 for first and second lactation this 14% increase is the smallest of those reported. Meyer (1983) reported (REML) Ve estimates of 557.31, 928.51 and 1008.67: the 117 percent increase in variance was 60% from first to second and 80% from first to third. The percent Ve increase in this study was 42% from first to second and 52% from first to third. These increases are larger than those reported by Tong et.aJ. (1979) and smaller than those reported by Meyer (1983). The Ve increase from second to third lactation is less than from first to second lactation. The percentage increase from second lactation to third was 8% from Meyer (1983) and this study, and 15% from Tong (1979). This smaller difference in Ve between second and third lactation compared to Ve between first and later lactations seems to indicate that environment may have a different influence in these lactations than in the first. Deaton and McGilliard (1964) Mansour, Nordheim, and Rutledge (1985), and Nicholson et a1. (1978) have suggested treating first lacta- tions differently from later lactations. The Box test for homogeneous variance was not used to test the other estimates of variance-covariance matrices because they involved different cows and different numbers of cows at each lactation. Because Rothschild and Henderson (1979a) did not study third lactations and Tong et a1. (1979) assumed that no error covariance existed in their’multiple trait model it was not possible to compare their results with this study directly. Results could be compared only with those of Meyer(1983). The error covariance was highest between 118 second and third lactation in both studies, and they are almost the same between first and second, and first and third for Meyer (1983). The present study error covariances are greater for adjacent lactations. The variance-covariance matrix estimated in this study is heterogeneous. It also appears that other variance- covariance matrices in the literature follow a similar pattern, even those from studies in which the selection effect has been removed (Tong et al., 1979: Rothschild and Henderson, 1979a; Meyer, 1983). Implications of heterogeneous variance-covariances for the three lactations are several: (1) First lactations should be treated differently from later lactations for example by the use of an index weighting lactations by their heritabilities: (2) Later lactations should be used in sire evaluation systems to account fully for error variation of daughters: (3) Transformation of the data to reduce the heterogeneity should be used to be able to use all the records of daughters of a sire. This is important for example, for small populations such as in less developed countries where all records available should be used to improve the accuracy of the sire evaluations. The pattern followed by the error variance-covariance matrix from this study may induce one to think that the errors follow a non-stationary autoregressive random process as assumed by Mansour, Nordheim, and Rutledge (1985). One 119 of the could be modeled after this pattern. 1V.5.2 Test for the Hypothesis That Errors Follow A First Order Autoregressive Model The purpose of this procedure is to test the hypothesis that the errors of consecutive lactational milk yield follow a first order non-stationary autoregressive process. For: 7,416 3,991 3,114 8 I 3,991 10,541 4,813 3,114 4,813 11,469 The tr (Ap'IS) is minimized for 9 =- 6 =- .414 and R - Va/Ve - .14 being tr (Afi'IS) - 23,377. Ge - l/3(tr(A¢'1S) a 7,792; therefore, 9a - R 9e = (.14) (7,792) = 1,090. R+1 R+p R+p2 sxcu - Ve R+p R+g+1 R+p+p3 = VeAp R+p2 R+¢+p3 R+02+p4+1 120 1.14 0.554 0.311 Ag I .554 1.3113 0.624 0.311 0.624 1.340 |Afi| - 1.235. In this case the estimate of SIGMu is SIGMAT - 9e A9. 8,883 4,317 2,426 sxsuam - 4,317' 10,218 4,869 2,426 4,869 10,447 The best fit is the one that makes SIGMAT as close as possible to S. The Chi square test for our estimates 9e, 9a and a is 0 - N [loge|Ag| + 3 loge tr (Ag-18) - (loge |S| + 3 16ge3)] N = 3985 I 3985 - [0.211 + 30.17 - (27.05 + 3.295)] I 3985 [30.38 - 30.28] I 404.47 Q I 404.47 for 3 df. Therefore, reject the Ho : SIGMu - SIGMAT at any reasonable a1Pha. 121 Before rejecting the assumption.that errors follow a first order autoregressive process, one should compare the error variance-covariance matrix from this study S with that estimated with 9a, Ve, and 2 obtained from the likelihood procedure SIGMAT. In general, SIGMAT resembles S because elements of the main diagonal increase with time and covariances from adjacent lactations are higher than those from nonadjacent lactations. The difference between these two matrices is that elements in SIGMAT corresponding to first lactation seem to be higher and elements corresponding to the third lactation are lower than in S. The test statistic rejects the hypothesis that errors follow’a non- stationary first order autoregressive process; however, given the large number of observations it is expected to reject statistically nearly any hypothesis. One should look.for a model that could explain.better the pattern for the estimated variance-covariance matrix S. The reason the elements corresponding to the third lactation were underestimated may have had something to do with a change of the real ability of the cow in later lactations. Age affects milk yield. The age or maturing effect usually' is corrected in most sire evaluation programs to enable comparisons of yields of cows at a common age. IIt is usual to express the production of a cow in mature equivalent units. For this reason, it was decided to incorporate an adjustment parameter into the model. This parameter 9 122 which accounts for 1 degree of freedom represents the possible increment in Va caused by the maturing effect of the cow. The hypothesis to be tested was that errors follow a first order non-stationary autoregressive model and that the Va in later lactations is also affected by 0 or the maturing effect of the cow. Let the observed residuals be: ‘12 - ”111 + 32 ‘1' 6a u-¢u+e+82a-pze+pe+82a+e 3 2 3 1 2 3 where: 8 - parameter associated to the cow effect, representing the increase in milk yield in later lactations or maturing parameter, SIGMS will have this form: Va+Ve pVe+eVa 02 Ve+e2 Va sxsxe- Ve + 82 Va + 02 Ve p3 Ve + me + 93 Va Sym Ve(l+,02 + 04) + 84Va Let R = Va/Ve smug-Va R 8 1+82 92 83 1+94 123 92 83 + o 9 92 9 92 9+93 92 9+93 ¢+¢4 The tr (A'%GS) is maximized for €= 1.27, [3 - .23, A A Ve I 6,598, Va I 1,715. Estimate of SIGM is SIGMATG - Ve Age 8,313 SIGHATG - 3,696 3,116 Q - 95.72 for 2 df. 3,696 9,714 5,112 3,116 5,112 11,428 Reject the Ho: SIGMu -SIGMAT6 at any reasonable alpha. R==.26 The model that incorporates the 8 parameter to account for the maturing effect of the cow seems to model the pattern of the variance-covariance matrix better. The test statistic to reject this hypothesis is much lower than that to reject the first order autoregressive assumption(404.47 vs 95.72). This indicates that SIGMAT from the 8 models is considerably closer to S (specially in the covariances), 124 SIGMAT from 8 model has larger estimates for elements corresponding to the third lactation. The matrices S and SIGMAT from both models are in Table 4.6 . Table 4.6 Sample S and Variance-Covariance matrices from first order autoregressive plus cow's maturing effect models. SIGMATp and SIGMATG. 7,416 3,991 3,114 8 I 3,991 10,541 4,813 3,114 4,813 11,469 8,883 4,317 2,426 SIGMA?” I 4,317 10,218 4,869 2,426 4,869 10,447 8,313 3,696 3,116 SIGHATG I 3,696 9,714 5,112 3,116 5,112 11,428 Though, the 8 model is a better approximation to the variance-covariance matrix from this study, it is not possible to obtain a perfect fit without incorporating other ‘Parameters and saturating the model. These models fit the llactation data better than a model assuming no correlations Of errors. A positive association of errors exists. A biological 125 justification would be that if a cow produces an unusually' good or bad yield initially, it would.be natural to think that her manager would have a high or low expectation of her ability and provide sequent care accordingly. Because Ve is large relative to Va, the major impact of the differential management would. be reflected by autoregression..Another argument is that effects of accidents or diseases such as an infection could be carried over from one lactation to the next. The two models presented in this study could be considered adequate, the autoregressive process models the effects for the different lactations: the arbitrary 9 model, that included the maturing effect, models the data better but has the disadvantage of the number of unknown parameter (unknown variances and covariances) that would be added to the model. V’ SUMMARY AND CONCLUSIONS Errors from each record of the same cow have been treated as independent of each other when analyzing later lactations. .Also a new and equal set of environmental effects affected the performance of the cow during each productive cycle or lactation. Need to incorporate more flexibility to these relation by allowing errors from each lactation to be correlated with each other and to vary from lactation to lactation in accordance with a first order non-stationary autoregressive process has been suggested. The potential to use recursive estimation techniques later lactations would be possible if residual errors follow a first order autoregressive pattern. :More accurate estimates of parameters and real ability of the cow could be obtained from these relationships. Purposes of this study were to estimate the error variance-covariance structure of first, second, and third lactations; test the assumptions that there is homogeneous error variance and constant covariance among each pair of lactations: and to examine the hypothesis that errors might follow a first order non-stationary autoregressive process. Milkzyield records of cows with three consecutive lactations were used to estimate the variance-covariance 126 127 matrix S. Data were from registered Holstein cows in Mexican DHI. The model applied.to each lactation included variables representing random effects of sire, fixed effect of herd- year-season, and days in lactation fitted as a covariable. Normal equations for sire and covariate days in milk were set up while herd-year-seasons were absorbed by a looping process. The normal equations then were transformed into mixed model equations by incorporating the ratio L (Ve/Vs) in the submatrix for sires. Solutions from the mixed model equations were used to back solve for herd-year-season solutions. Then all solutions were used to compute residuals. Sire and fixed effects solutions, the trace of the sire submatrix, the product of the ratio L times the squared sire solutions, and the trace from the generalized inverse were used to obtain restricted maximum.likelihood (REML) variance components.The process was iterated until the estimates converged. 4A FORTRAN program to perform this task is in Appendix A. With the REMI.estimates of sire and residual variance components, milk yield heritabilities were computed. Heritabilities and standard errors wer .40 i_.04, .23 i .03, and .23 i .03 for first, second, and third lactations. These results agree with those in the literature in magnitude and trend. 'The decline of heritability is due to the increasing environmental factors in the lifetime of a 128 cow. This causes an increment in variation that reduces the importance of the genetic make-up of a cow i.e., this reduces heritabilities for later lactation. Solutions for 193 sires were computed for each lactation. The range and standard deviations of the solutions were -90 to 77 (i 21), -46 to 42 (4_-_ 16), and -46 to 45 (i 16) for first, second, and third lactations. Spearman's Rank correlations for sire solutions were computed to compare the value of sires in each lactation. The rank correlation was .58 for sire values in first and second, .53 for sire values of first and third and..59 for sire values of second and third lactations. There is large variability in sire solutions and a low rank correlation of sire values. This is due to the number of progeny per sire. The 8 matrix is not homogeneous (p < 0.01) according to the Box test for homogeneous variance. In the S matrix, increasing principal diagonal (7416, 10541, 11469), increa- sing first diagonal (3991, 4813), and decreasing first row (7416, 3991, 3114) are noted. It appears that the error variance increases with time and that the error covariances are higher in adjacent lactations. This pattern induces one to conclude that the errors might follow a first order non- stationary autoregressive process. A maximum likelihood procedure was developed to test the hypothesis that errors follow a first order autoregressive model. Estimates that 129 maximized the Log likelihood were a - .414, 9e - 7,792, and Va - 1,090. These estimates were used to recalculate a matrix SIGMATg. The H: S - SIGMATfl was tested with a Chi square statistic. The hypothesis that errors follow a first order autoregressive process (Mansour, Nordheim, and Rutledge, 1985) was rejected (p < .01). The autoregressive model underestimated the error variance in later lactations. A model that incorporated a parameter 8 representing the possible increment in Va caused by the maturing effect of the cow was tested. The test is that the errors follow a first order non-stationary autoregressive model and that the Va (genetic and permanent environmental contribution of the cow) in later lactations is also affected by a or the maturing effect of the cow. The maximum likelihood estimates were 9 - .23 6 1.27, Ve - 6,598, and Va - 1,715. The hypothesis ms :- SIGMATG was also tested. Model was rejected (p < .01). SIGMATG has larger estimates for the elements corresponding to the third lactation. Though the 8 model is a better approximation to the variance-covariance matrix 8, it is not possible to obtain a perfect fit. One may conclude that even though these models were rejected statistically, they fitted 8 better than a model with no correlation of errors i.e., 0:0. A biological justification would be that if a cow produces an unusually good or bad yield initially, it would be natural to think 130 that her caretaker would.have a high or low expectation of her ability and provide sequent husbandry accordingly. Because Ve is large relative to Va, the major impact of the different management would be reflected by autoregression. Another argument is that cumulative effects of accidents or diseases would.be carried over from one lactation to the next. These two models could be considered adequate, although not in compliance with the assumptions. The autoregressive process models the effects for the different lactations. The arbitrary 8 better describes the data but has the disadvantage of the number of unknown parameters (unknown variances and covariances) that would be added to the model. A first order non-stationary process to model the error structure should be used for analises of repeated lactations to reduce variances of estimates of parameters. LIST OF REFERENCES LIST OF REFERENCES Anderson, C.R. 1981. A biometrical and genetic study of Tribolium egg production curies as a model for lactation curves. Ph.D. thesis, University of Illinois at Urbana-Champaign. Bar-Anan, R. 1975. Relations between first and second lactation characteristics of progeny groups and effects of tandem selection on yield improvement. Anim. Prod. 21:121. Barker, J.S.F., and A. Robertson. 1966. Genetic and phenotypic parameters for the first three lactations in Friesian cows. Anim. Prod. 8:221. Box, G.E.P. 1949. A general distribution theory for a class of likelihood criteria. Biometrica 36: 317-46. . 1950. Problems in the analysis of growth and wear curves. Biometrics 6:362-89. Box, G.'E.P. , and G.M. Jenkins. 1970. Time Series Analysis, Holden Day, San Francisco. Butcher, D.F. , and A.E. Freeman. 1968. Heritabilities and repeatabilities of milk fat production by lactations. J. of Dairy Sci. 51: 1387. . 1969. Estimation of heritability and repeatability of milk and milk fat with selection effects removed. J. Dairy Sci. 52: 1259. Cassell, B.G. and, B.T. McDaniel. 1983. Use of later records in diary sire evaluation: A review. J. Dairy Sci. 66:1. Cassell, B.G., B.T. McDaniel, and H.D. Norman. 1983a. Modified contemporary comparison sire evaluation from first, all, and later lactations. J. Dairy Sci.66:140. Cassell, B.G., B.T. McDaniel, and H.D. Norman. 1983b. Impact of culling on modified contemporary comparison sire evaluations. J. Dairy Sci. 66:1359. 131 132 Cassell, B.G., J.S. Clay and, H.D. Norman. 1985. Differences in modified contemporary comparison sire evaluations from first and later lactation by breed. J. Dairy Sci. 68:1778. Christian, L.E., D.O. Everson, and S.L. Davis. 1978. A statistical method for detection of hormone secretory spikes. J. Anim. Sci. 46:699. Cisneros, P.G., G.R. Wiggans and, G.J. King. 1980. USDA summary of 1980 Mexican. cow herds averages. Mimeo of Animal Improvement Programs Laboratory. Beltsville, MD. Curnow, R.N. 1961. The estimation of repeatability and heritability from records subject to culling. Biometrics 17:553. Deaton, O.W., and L.D. McGilliard. 1964. First, second and third record of a cow to estimate superiority of her daughters. J. Dairy Sci. 47:1004. Dickerson, G.E. 1969. Techniques for research in quantitative animal genetics. Techniques and procedures in animal science research. In Monograph of Amer. Soc. Anim. Sci. Chempaign, IL. pp. 36. Durbin, J. 1960. Estimation of parameters in time-seriesregression models. J.R. Statis. Soc., B, 22. Freeman, A.E. 1960. Genetic relationships among the firstthree lactations of Holstein cows. J. Dairy Sci. 43:876. ' . 1973. Age adjustments of production records: History and basic problems. J. Dairy Sci. 56:941. Gallant, A.R., and J.J. Goebel. 1976. Nonlinear regressionwith autocorrelated errors. J. Am. Stat. Assoc. 71:961. Gill, J.L., and H.D. Hafs. 1971. Analysis of repeated measurements of animals. J. Anim. Sci. 33:331. . 1978. Design and Analysis of Experiments in the Animal and Medical Sciences. Vols. 1 and 2 Iowa State University Press, Ames, IA. 133 . 1979. Combined significance of non-inde- pendent test for repeated measurements. J. Anim. Sci. 48:1979. . 1981. Evolution of statistical design and analysis of experiments. J. Dairy Sci. 64:1494. Gupta, S.D., and M.D. Perlman. 1974. Power of the noncentral F-test: Effect of additional variables on Hotelling's T2-test. J. Am. Stat. Assoc. 69:174. Hansson, A., and T. During. 1961. Reliability of heifer records in progeny testing. K. Lantbr Hbgskol Annlr. 27:351. Hargrove, G.L. 1974. Rate of maturity of dairy females. J. Dairy Sci. 57:328. Harville, D.A. 1977. Maximum likelihood approaches to variance components estimation and to related problems. J. Am. Stat. Assoc. 72:320. Harville, D.A. 1979. Recursive estimation using mixed linear models with autoregressive random effects. Proceedings of a conference in honor of C.R. Henderson Variance components and animal breeding. L.D. Van Vleck and S.R. Searle (eds.). 157-179. Ithaca: Cornell University Press. Healy, M.J.R. 1968. Inversion of a positive semi-definite symmetric matrix. J. Appl. Stat. 17:199. Henderson, C.R. 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics. 31: 423 . Henderson, C.R., and R.L. Quaas. 1977. A comparison of method 3 and maximum likelihood estimators of heritabilities and genetic correlations. Res. ASAS 1977 Mtg. Hickman, G.G., and C.R. Henderson. 1955. Components of the relationship between level of production and rate of maturity in dairy cattle. J. Dairy Sci. 38:883. Hillers, J.R., and A.E. Freeman. 1965. Differences between sires in rate of maturity of their daughters. J. Dairy Sci. 48:1680. 134 Hinks, C.J. 1966. Selection practices in dairy herds. IIselection patterns in later lactations. Anim. Prod. 8:467. Keown, J.R., H.D. Norman, and R.L. Powell. 1976. Effects of selection bias on sire evaluation procedures. J. Dairy Sci. 59:1808. Kesner, J.S., E.M. Convey, and C.R. Anderson. 1981. Evidence that estadiol induces the preovulatory LH surge in cattle by increasing pituitary sensitivity to LHRH and then increasing LHRH release. Endocrinology 108:1386. Lofgren, D.L., B.G. Cassell, H.D. Norman, and B.T. McDaniel. 1983. Effects of culling on sire evalua- tions by mixed models. J. Dairy Sci. 66:2418. Lush, J.L. 1945. Animal Breeding Plans. The Iowa State College Press, Ames, Iowa. pp. 171. Lush, J.L., and Shrode, R.R. 1950. Changes in milk production with age and milking frequency. J. Dairy Sci. 33:338. Maijala, K., and M. Hanna. 1974. Reliable phenotypic and genetic parameters in dairy cattle. Pages 541-563 in Proc. lst World Congr. Genet Appl. to Livestock Prod. Madrid. Mansour, H., E.V. Nordheim, and J.J. Rutledge. 1985. Maximum likelihood estimation of variance components in repeated measures designs assuming autoregressive errors. Biometrics 41:287. ' Mao, LL. 1981. Lecture notes on multi-factor unbalanced data. Michigan State University, E. Lansing. . 1982. Modeling and data analysis in animal breeding. Notes for an inter-nordic post-graduate course. Swedish University of Agricultural Sciences. Uppsala, Sweden. Meyer, K. 1983. Maximum likelihood procedures for estimating genetic parameters for later lactations of dairy cattle. J. Dairy Sci. 66:1983. Mohammad, W.A., M. Grossman, and R.D. Shanks. 1985. Algebraic equivalence of matrix inversion, elimina- tion and absorption for use in animal breeding. Am. Stat. 39:124. 135 Molinuevo, H.A., and J.L. Lush. 1964. Reliability of first, second, and third records for estimating the breeding value of dairy cows. J. Dairy Sci. 47:890. Morrison, J.L. 1967. Multivariate Statistical Methods. McGraw-Hill, New York, NY. McDowell, R.E. 1983. Strategy for improving beef and dairy cattle in the tropics. Cornell Intern.Agric. Mimeo. 100. McDowell, R.E., J.R. Camoens, L.D. Van Vleck, E. Christensen, and E. Cabello Frias. 1976. Factors affecting performance of Holsteins in subtropical regions of Mexico. J. Dairy Sci. 59:722. Nicholson, H.H., L.R. Schaeffer, E.B. Burnside, and M.G. Freeman. 1978. Use of later records in dairy sire evaluation. Can. J. Anim. Sci. 58:615. Nelson, C.R. 1973. Applied Time Series Analysis. Holden-Day, Inc., San Francisco. Pindyke, R.S. and D.L. Rubinfeld. 1981. Econometric models and economic forecast. McGraw-Hill, Inc. New York. Powell, R.L., H.D. Norman, and R.M. Elliot. 1981. Different lactations for estimating genetic merit of dairy cows. J. Dairy Sci. 64:321. Robertson, A. , and J.S.F. Barker. 1966. The correlation between first lactation milk production and longevity in dairy cattle. Anim. Prod. 8:241. Rothschild, M.F. , and C.R. Henderson. 1979a. Maximum likelihood estimates of parameters of first and second lactation milk records. J. Dairy Sci. 62:990. Rothschild, M.F., and C.R. Henderson. 1979b. Effects of selection on variances and covariances of simulated first and second lactations. J. Dairy Sci. 62:996. Schaeffer, L.R. 1982. Notes on linear model theory and Henderson's mixed model techniques. University of Guelph, Guelph Canada. 136 Schaeffer, L.R., J.W. Wilton, and R. Thompson. 1978. Simultaneous estimation of variance aand covariance components from multitrait mixed model equations. Biometrics 34:199. Schneeberger, C.P. 1982. Factors affecting the performance of Holstein herd in Maracay, Venezuela. Ph.D. thesis, Cornell University, Ithaca, NY. Searle, S.R. 1971. Linear Models. John Wiley, New York. Sokal, R.R., and F.J. Rohlf. 1969. Biometry. W.H. Freeman and Company, San Francisco. Thompson, R. 1969. Iterative estimation of variance components for non-orthogonal data. Biometrics 25, 767. Tomaszewski, M.A., B.T. McDaniel, H.D. Norman, and F.N. Dickinson. 1975. Relations between sire summaries of first and second lactations. J. Dairy Sci. 58:116. Tong, A.R.W., B.W. Kennedy, and J.E. Moxley. 1979. Heritabilities and genetic correlations for the first three lactations from records subject to culling. J. Dairy Sci. 62:1784. Ufford, G.R., C.R. Henderson, J.F. Keown, and L.D. Van Vleck. 1979. Accuracy of first lactation versus all lactations sire evaluations by best linear unbiased prediction. J. Dairy Sci. 62:603. Van Vleck, L.D., and C.R. Henderson. 1963. Bias in sire evaluation due to selection. J. Dairy Sci. 46:976. Walter, J.F., and LL. Mao. 1985. Multiple and single trait analyses for estimating genetic parameters in simulated populations under selection. J. Dairy Sci. 68:91. Ware, J.E. 1985. Linear models for the analysis of longitudinal studies. Am. Stat. 39:95. Wickham, B.W. and C.R. Henderson. 1977. Sire evaluation by second lactation records of daughters. J. Dairy Sci. 60:96. 137 Wilson, P.D., J.R. Hebel, and R. Sherwin. 1981. Screening and diagnosis when within-individual observations are Markow-dependent. Biometrics 37:553. APPENDIX A OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 000 EQUIVALENCE 138 TABLE A . FORTRAN program for mixed model analysis. PROGRAM REML(TAPE6,TAPE3,TAPE4,TAPE5,TAPE7,OUTPUT,TAPE2IOUTPUT) ***RESTRICTED MAXIMUM LIKELIHOOD PROGRAM*** PROGRAM ABSORSB HYS EQUATIONS USING THE ROW LINE TECHNIQUE VARIABLE LIST VARIABLE NAME SIRE HYS MY CI DIM LHYS NREC TMYHYS INDICE(I,J) A(INDICE(I,J) AINV(INDICE(I,J) RHS(I) RHSl(I) VECT(I) ms NOHYS WK ( I) D(I) C(I) TOT(I) DESCRIPTION SIRE HERD-YEAR-SEASON GROUP MILK YIELD CALVING INTERVAL DIM LAST HERD-YEAR-SEASON DUMMY RECORD COUNTER TOTAL MILK YIELD EACH HYS FUNCTION THAT CONVERTS ROW, COLUMN REFERENCES TO HALF-STORED VECTOR REFS VECTOR CONTAINING HALE-STORED Z'Z MATRI VECTOR CONTAINING HALE-STORED Z'Z INVER RIGHT HAND SIDE NON-ABSORBED OR ORIGINAL RIGHT HAND SID VECTOR OP SIRE PROGENY AND TOTAL COVARIATES NUMBER OF RECORDS IN EACH HYS NUMBER OF HERD-YEAR-SEASONS WORKING AREA A VARIABLE NEEDED BY SUBRO LINV4P. DIMENSION TO RANK OP MATRIX BE INVERTED. STANDARD ERRORS OP SIRE SOLUTIONS SIRE SOLUTIONS TOTALS VECTOR OE COVARIATES PLUS DEFEND I.-READ VARIABLES I.A -DECLARE AND DEFINE VARIABLES INTEGER HERD,YEAR,SEA,SIRE,LHERD,LYEAR,LSEA,NREC,NHYS,EOREC ,COW REAL MY,DIM,CI DIMENSION C(3000,1),AINV(19110),A(19110),RHS(195,1),RH51(195,1), + VECT(195),WK(195),D(195),TOT(3),TOTBAR(3),CONT(3) (MY,CONT(1)),(CI,CONT(3)),(DIM,CONT(2)) I.B -INITIALIZING TO ZERO 00 000 0 0 000 00000 0 0 139 DATA A/19110*0.0/ DATA RHS/l95*0.0/ DATA RHSl/195*0.0/ DATA C/3000*0.0/ DATA AINV/19110*0.0/ DATA VECT/195*0.0/ DATA D/195*0.0/ BOREC - 0 TMYHYS I 0.0 NHYS I 0 TYY I 0.0 NREC I 1 COREAC I 0.0 NOHYS I O SUMSIR I 0.0 RSQ I 0.0 VARDEP I 0.0 RATIOl SETDIF NSIRES NCOV I 15.6 00.2 193 HIII xx - ucov + 1 Do 67 I-l,NN TOT(I) - 0.0 CONTINUE NRANR - NSIRES + ucov LSIRE I NSIRES*(NSIRES+1)/2 LHALF I NRANK*(NRANK+1)/2 ACCUMULATOR OF TOTAL DEP OP FIXED FACTORS BEING ABSORBED TOTAL SUMS OP SQUARES FOR DEPEND CORRECTION EACTOR USED TO CALCULA HERD-YEAR’SEASON SUMS OP SQUARES COEFFICIENT OP DETERMINATION VARIANCE OP DEPENDENT VARIABLE I.C -PARAMETERS TO BE SET SET UP DUMMY VARIABLE EQUAL TO NUMBER OF COV PLUS DEPENDENT I.D -VARIABLES READ(6,99,ENDISS)HERD,CID,MY,SIRE,CI,DIM,YEAR,SEA,COW GO TO 22 I.E -READ INPUT START OF INFINITE LOOP TO READ ALL INPUT DATA 0 ”00000 0 0000 0 9000 68 000 140 READ(6,99,ENDI55)HERD,CID,MY,SIRE,CI,DIM,YEAR,SEA,COW WRITE(2,99)HERD,MY,SIRE,CI,DIM,YEAR,SEA,COW PORMAT(1X,I6,I4,9X,F5.0,I5,F4.0,P4.0,I3,I2,1X,I5) NREC I NREC+1 IP(HERD.NE.LHERD)GO TO 5 IF(YEAR.NE.LYEAR)GO TO 5 IF(SEA.NE.LSEA)GO TO 5 II.-PORM INCIDENCE MATRIX II.A -ACCUMULATE PROCESS A(INDICE(SIRE,SIRE)) - A(INDICE(SIRE,SIRE))+1.0 A(INDICE(NSIRES+1,SIRE)) - A(INDICE(NSIRES+1,SIRE) )+DIM A(INDICE(NSIRES+2,SIRE)) - A(INDICE(NSIRES+2,SIRE))+CI A(INDICE(NSIRES+1,NSIR£S+1)) - A(INDICE(NSIRES+1,NSIRES+1))+ DIM**2 A(INDICE(NSIRES+2,NSIRES+2)) - A(INDICE(NSIRES+2,NSIRES+2))+ c1992 A(INDICE(NSIRES+2,NSIRES+1)) - A(INDICE(NSIRES+2,NSIRES+1))+ DIM*CI RESl(SIRE,l) - msusmuwm RESl(NSIRES+l,l) - RESl(NSIRES+l,l)+MY*DIM RESl(NSIRES+2,l) - RHSl(NSIRES+2,l)+MY*CI RES(SIRE,1) - RES(SIRE,1)+MY RHS(NSIRES+1,1) - RHS(NSIRES+1,1)+MY*DIM RES(NSIRES+2,1) - RES(NSIRES+2,1)+MY*CI WRITE(2,70) A(INDICE(SIRE,SIRE)),A(INDICE(NSIRES+2,NSIRES+2)) ,RES(SIRE,1) FORMAT(1X,3£16.5) VECT(NSIRES+1) - VECT(NSIRES+1)+DIM VECT(NSIRES+2) - VECT(NSIRES+2)+CI VECT(SIRE) - VECT(SIRE)+1.0 NEYS - NHYS+1 TMYEYS - TMYEYS+MY SETTING FLAGS FOR NEXT READ LEERD - HERD LTSAR - YEAR LSEA - SEA TYY - MY**2+TYY Do 68 I-1,NN TOT(I) - TOT(I) + coNT(I) CONTINUE x - NOHYS + 1 WRITE(2,91)HERD,MY,SIRE,CI,DIM,YEAR,SEA,COW,K WRITE(5,91)HERD,MY,SIRE,CI,DIM,YEAR,SEA,COW,K FORMAT(lx,I6,6X,F6.0,1X,IS,1X,2(F5.0,lX),2(I4),2(IS)) so TO 100 II.B -ABSORBING HYS H 0000 3000 0 3001 0000|-‘ 00000 0000 702 141 EOREC I 1 CONTINUE PRINT*,' ABSORBING HYS INTO SIRE AND COVARIATES ' DO 1 I I 1,NRANR DO 2 J I 1,I A(INDICE(I,J)) I A(INDICE(I,J))-(VECT(I)*VECT(J))/NHYS WRITE(2,98)A(INDICE(I,J)) FORMAT(1OX,E16.5) CONTINUE RHS(I,1) I RHS(I,1)-VECT(I) *TMYHYS/NHYS CONTINUE COREAC I CORPAC+((TMYHYS**2)/NHYS) II.C ISAVE VECT NHYS AND TMYHYS T COMPUTE SOLUTIONS FOR HYS WRITE(3,*) NHYS,TMYHYS FORMAT(3X,I8,1X,E15.5) WRITE(3,*)(VECT(I),II1,NRANR) FORMAT(1X,8E15.5) II.D- ZEROING VECT,NHYS,TMYHYS BEFORE START READING NEW HYS GROUP NHYS I O TMYHYS I 0.0 DO 10 I I 1,NRANK VECT(I) I 0.0 CONTINUE II.E- COUNTING THE NUMBER OP HERD YEAR SEASONS NOHYS I NOHYS+1 IF(EOREC.NE.1)GO TO 22 PRINT*,' END OP THE ABSORBING PHASE ' ppINTe,0eeeeeeeeeeeeeeseeeeeeeeeaeeeaseeeeeeeeeeeeeeeeeeeeee0 II.P- END OF ABSORBING PHASE WRIT A AND RES PRINT*,'*****NO. OE RECORDS*‘*RECORDS USED***COUNT HYS***‘ WRITE(2,702) NREC,K PRINT*,’********NUMBER OE HERD-YEAR-SEASONS*****' WRITE(2,702)NOHYS FORMAT(5X,18) TOTBAR(1) I TOT(1)/NREC VARDEP I (TYY-TOT(1)*TOT(1)/NREC)/(NREC-l) PRINT*,' ROW VARIANCE ' 0‘ 0 00000000 ”2 no tum FAQ 0 (10¢5r)0¢1 0 130 C1004 C735 142 WRITE(2,*)VARDEP Do 59 I-2,NN TOTBAR(I)ITOT(I)/NREC CONTINUE PRINT*,' MEANS or DEEEND AND COVAR"S ' WRITE(2,70) (TOTEAR(I),I-I,NN) PRINT*,' FINAL z"z MATRIX AFTER ABSORBING HYS ' DO 3 I - 1,NRANK NEITE(2,94)(A(INDICE(I,II)).II-I,NEANE) FORMAT(1X,8£15.5) CONTINUE PRINT*,' FINAL Ens AFTER ABSORBING Eys ' Do 410 I - 1,NRANK NEITE(2,97) REE(I,I),REs1(I,1) PORNAT<3X,2E16.5) CONTINUE III.-START REML PROCESS III.AI OBTAIN SOLUTIONS ADD RATIO TO SIRE DIAGONALS NDIAG I 0 ITERA I 0 PRINT*,‘ SIRE DIAGONALS + RATIO ' DO 150 I I 1,NSIRES IF(A(INDICE(I,I)).EQ.0.0) THEN NDIAG I NDIAG+1 END IF A(INDICE(I,I)) I A(INDICE(I,I))+RATI01 WRITE(2,70) A(INDICE(I,I)) CONTINUE PRINT*,'**** MUM OF SIEEs NO PRESENT** ',NDIAG CONTINUE III.E- INVEET A INTO AINv PRINT*,' VECTOR OE SYMETRIC MATRIX ' DO 401 I - 1,LHALF WRITE(2,104)A(I) FORMAT(10X,EIG.5) CONTINUE x - 1 DO 735 I - 1,NRANK J - I*(I+1)/2 PRINT*,' ROW ',I PRINT 1004,(A(L),L - K,J) K - 3+1 FORMAT(1X,6216.5) CONTINUE N - NRANK CALL LINV4P(A,N,AINv,wx,IEE) 143 PRINT*,' ZZ +RATIO INVERTED MATRIX HALF STORED. ' DO 134 I I 1,LHALF . WRITE (2,34)AINV(I) FORMAT (1X,E16.5) ' CONTINUE III.B- RESTORE ORIGINAL A MATRIX 0000U000 Pfi U ab DO 151 I I 1,NSIRES A(INDICE(I,I)) I A(INDICE(I,I))-RATI01 151 CONTINUE C C III.C- NULTIPLY AINV BY RES AND C OBTAIN SOLUTIONS C N I 1 IE I NRANK IC I NRANK C CALL VNULSF(AINV,N,RES,N,IE,C,IC) C C PRINT*,' SOLUTIONS ' C DO 5001 I I 1,NRANK C NRITE(2,34)C(I,1) C5001 CONTINUE C C IV.A- CALCULATE ERROR VARIANCE C C CALCULATE UPU C UPU I 0.0 DO 1290 I I 1,NSIRES UPU I UPU+C(I,1)**2 1290 CONTINUE PRINT*,' U PRIME U ' WRITE(2,290)UPU 290 FORNAT(1X,E16.5) CALCULATE TKONPSON FACTOR THOFACIRATI01*UPU 0 0000 PRINT*,' THOMPSON FACTOR ' WRITE(2,290)THOFAC IV.A.1- NULTIPLY SOLUTIONS BY RHS 0000 BPXY I 0.0 BPXYl I 0.0 DO 1950 I I 1,NRANK 1950 950 750 0000 600 890 000 292 000 000M 144 EPXY I BPXY+RHS(I,1)*C(I,1) BPXYl I BPXY1+RHSI(I,1)*C(I,1) CONTINUE PRINT*,' SOLUTIONS TIMES RHS BPXY AND BPXYI ' WRITE (2,950)BPXY,BPXY1 FORMAT(1X,2E16.5) ERRVAR I TYY-CORFAC-EPXY-TROEAC DFERR I NREC-NCOV-NOHYS ERRVAR I ERRVAR/DFERR PRINT*,'ERRVAR,TOTSS,BPXY,CORFAC' WRITE(2,750)ERRVAR,TYY,EPXY,COREAC FORMAT(1X,4E16.5) RSQI(CORFAC+EPXY+TROFAC)/TYY PRINT*,' RSQ ' WRITE(2,*)RSQ IV.A.2- CALCULATE SIRE VARIANCE IV.A.3- CALCULATE TRACE TRACE I 0.0 II I 1 I I 0 I I I+II TRACE I TRACE+AINV(I) II I II+1 IF(I.EQ.LSIRE)GO TO 890 GO TO 600 CONTINUE PRINT*,' TRACE ' WRITE(2,290)TRACE IV.A.4- CALCULATE SIRE VARIANCE SIRVAR - TRACE*ERRVAR EIRVAR - SIRVAR+UPU SIRVAR - SIRVAR/NSIREE PRINT*,' SIRE VARIANCE ERROR VARIANCE ' WRITE(2,292)SIRVAR,ERRVAR FORNAT(5X.216.5,5X,£16.5) EER - (4*SIRVAR)/(SIRVAR+ERRVAR) wRITE(2,291) MER PRINT.’ 'fifiitiiiiittiiifii*i ' FORMAT ( ' HERITABILITY ' ,P9.4) IV.A.5- CALCULATE NEW RATIO RATIO2 I ERRVAR/SIRVAR IV.A.6- CALCULATE RATIO DIFFERENC RATDIF I ABS(RATI01-RATI02) PRINT*,' RATIOI NEW RATIO RATIO DIFFERENCE ' 768 999 0000000 000 145 WRITE(2,768)RATI01,RATIOZ,RATDIP FORMAT(1X,3E16.5) IF(RATDIF.LE.SETDIF)GO TO 999 ITERA I ITERA+1 PRINT*,'******* END OF ITERATION *** ',ITERA RATIOl I RATI02 GO TO 130 CONTINUE PRINT*,'* *' PRINTi,'itittiiiiit*iititiittttttiiiiitiitiittttifl' PRINT*,' TEE ITERATIVE PROCESS HAS ENDED ' PRINTi,Uit.t*tiittttitittti*iiitttiiitititttittiii' PRINT*,' ' PRINT*,'** TEE RATIO DIFFERENCE IS LESS TEAN** ',SETDIF PRINT*,' ' PRINT*,' THERE WERE ',ITERA,' ROUNDS 0F ITERATION ' PRINT*,' ' PRINT*, 'SIRE VARIANCE ' WRITE(2,290) SIRVAR PRIflTt,'tiiiitiflttiittiiitiflt' PRINT*,' ' PRINT*, 'ERROR VARIANCE' WRITE(2,290) ERRVAR PRINTC"ttttiitfliitittfititifiifi' PRINT*,' ' PRINT*,'EERITAEILITY' NRITE(2,290) HER PfiIRTi"ifitiiittttittflitttfiii' PRINT*,‘ ' PRINT*,' ' DO 728 I I 1,NCOV WRITE(2,827) I,A(INDICE(NSIRES+I,NSIRES+I)) FORMAT(5X,'SSCOV ',IS,'I ',E16.5) vfirflTt"*iititttttififiiitiiiiiitt0 CONTINUE WRITE(2,729) A(INDICE(NSIRES+2,NSIRES+1)) FORMAT(5X,'SCPCOV1 COV2 I ',216.5) pnINTO,I*OOOOOOOOOOOOOOOOOOOOOOO0 IV.E- CALCULATE STANDARD ERRORS 0F SOLUTIONS IV.B.1- CREATE A VECTORS WITH INVERSE DIAGONALS J I 0 DO 7355 I I 1,NRANK J I J+I D(I) I AINV(J) IV.B.2- MULTIPLY DIAG * ERROR VAR 000;: 000 U! 0'! 2005 5002 3005 5003 9090 0000 C404 567 146 D(I) I D(I)*ERRVAR IV.E.3- CALCULATE STANDARDS ERROR 0(1) - SORT(D(I)) CONTINUE V.- PRINT FINAL RESULTS PRINT*, ' SOLUTIONS AND STANDARD ERRORS ' PRIM'. I ' *fiitiiiiiiiiii*ii*iiifiiflflitiiitiiifiifiii ' DO 5002 I - 1,NSIRES SUMSIR - SUMSIR . SUMSIR+C(I,1) WRITE(2,2005)I,C(I,1),D(I) E0RMAT(Sx,' SIRE',IS,‘I ',E9.4,sx,E9.4) CONTINUE DO 5003 I - NSIRES+1,NSIRES+NCOV WRITE(2,3005)I,C(I,1),D(I) FORMAT(5X, 'COVAR',I5, '- ',29.4,sx.r9.4) CONTINUE WRITE(2,9090) SUMSIR FORMAT(SX,'THE SIRE SOLUTIONS SUM TO ',r12.4) IV.- START THE BACK UP PROCESS TO OBTAIN EYS SOLUTIONS REWIND 3 SUM I 0.0 NSOL I NOEYS+NRANR HERE I 0.0 DO 1001 J I NRANK+1,NSOL READ(3,*) NMYS,TMYHYS READ(3,*)(VECT(I),II1,NRANR) WRITE(2,3000) NHYS,TMYHYS WRITE(2,3001)(VECT(I),II1,NRANK) DO 567 I I 1,NRANK SUM I SUM+(C(I,1)*(I1.0*VECT(I))) WRITE(2,404)C(I,1),VECT(I) FORMAT(2X,2E15.5) CONTINUE C(J,1) I (TMYHYS+SUM)/NEYS WRITE(2,66)SUM,TMYHYS,C(J,1) FORMAT(SX.3E15.5) SUM I 0.0 BPKY I EPHY+(TMYHYS*C(J,1)) CONTINUE ERRVARZ I TYY-BPHY-BPXYl-THOFAC ERRVAR2 I ERRVARZ/DFERR ERRDIFIERRVARZIERRVAR PRINT*,' ERROR VAR HYS SOLUTIONS AND ERROR VAR CORR FACTOR ' 147 WRITE(2,750) ERRVAR2 ,ERRVAR,ERRDIF C 8081 PRINT*,' ** ALL SOLUTIONS INCLUDING THE EYS ** ' PRINT*,' ** WERE WRITEN IN TAPE 4 ** ' D0 76 I I 1,NSOL ' C WRITE(2,34)C(I,1) WRITE(4,*)C(I,1) 76 CONTINUE STOP END c *fltti END op REML pggcnnnitaaittt Ci‘tfi'ktiflfiitifiiifismomns**O****i********i** SUEROUTINE VMULSE(A,N,E,M,IE,C,IC) DIMENSION A(I), E(IE,M), C(IC,M) DO 51 J - 1,M Do 51 I - I,N C(I,J) - 0. INDEX - I*(I-1)/2 D0 52 x - 1,: INDEX - INDEX+1 52 C(I,J) I C(I,J)+A(INDEX)*B(K,J) IF (I.EQ.N) GOTO 51 K I 1+1 INDEX I INDEX+I DO 53 RE I R,N C(I,J) I C(I,J)+A(INDEX)*B(RR,J) 53 INDEX I INDEX+NR 51 CONTINUE RETURN END C ALGORITHM AS7---J. APPL. STAT. 17,199---EEALY SUEROUTINE LINV4P(A,N,AINV,W,IER) DIMENSION A(1),AINV(1),U(1) NRow - N IER - 130 IF(NROW.LE.O)GOTO 100 IER - 0 CALL LUDECP(A,AINV,NROW,DI,DZ,IER) IF(IER.NE.O)GOTO 100 NN - NR0w*(NRON+1)/2 IROW - NROW NDIAC - NN 16 IE(AINV(NDIAC).EQ.0)OOTO 11 L - NDIAG DO 10 I - IRON,NROW N(I) - AINV(L) L - L+I 10 CONTINUE ICOL - NRow JCOL - NN MDIAC - NN 15 L - JCOL 13 12 11 17 14 100 99 12 13 148 X I 0. - IF(W(IROW).EQ.0)GOTO 99 IF(ICOL.EQ.IROW) X I 1./W(IROW) KINROW IF(X.EQ.IROW)GOTO 12 X I X-W(K)*AINV(L) K I XII L I L-1 IF(L.GT.MDIAG)L I LIK+1 GOTO 13 IF(W(IROW).EQ.0)GOTO 99 AINV(L) I X/W(IROW) IF(ICOL.EQ.IROW)GOTO 14 MDIAG I MDIAG-ICOL ICOL I ICOL-1 JCOL I JCOL-l GOTO 15 L I NDIAG DO 17 J I IROW,NROW AINV(L) I 0. L I L+J NDIAG I NDIAG-IROW IROW I IROW-l IF(IROW.NE.0)GOTO 16 RETURN IER I 129 END III-IALGORITHM ASG-I-J. APPL. STAT 17,195-I-HEALY SUEROUTINE LUDECP(A,UL,N,DI,D2,IER) -----01,D2 NOT CURRENTLY IMPLEMENTED DIMENSION A(1),UL(1) DATA ETA/1.E-9/ IER I 1 IF(N.LE.0)GOTO 100 IER I 129 J I 1 K I 0 DO 10 ICOL I 1,N L I 0 DO 11 IROW I 1,ICOL R I K+1 W I A(K) M I J DO 12 I I 1,IROW L I L+1 IF(I.EQ.IROW)GOTO 13 W I W-UL(L)*UL(M) M I M+1 IF(IROW.EQ.ICOL)GOTO 14 IF(UL(L).EQ.0)GOTO 21 UL(K) I W/UL(L) GOTO 11 149 21 UL(K) I 0. 11 CONTINUE 14 IF(AES(W).LT.AES(ETA*A(K)))GOTO 20 IF(W.LT.0.)GOTO 100 UL(K) I SQRT(W) GOTO 15 20 UL(K) I 0. 15 J I J+ICOL 10 CONTINUE IER I 0 100 RETURN END FUNCTION INDICE(IROW,ICOL) INDICE I IROW*(IROW-1)/2+ICOL RETURN END *Eosoo Linc-584 Soc-1 OK-