This is to certify that the
thesis entitled

GENETIC PARAMETER ESTIMATION FROM SINGLE
AND MULTIPLE TRAIT ANALYSES

presented by

Terri Lynne Moore

has been accepted towards fulfillment
of the requirements for

M.S. degree in Animal Science

 

 

MSU is an Affirmative Action/Equal Opportunity Institution

GENETIC PARAMETER ESTIMATION FROM SINGLE AND MULTIPLE TRAIT
ANALYSES

By

Terri Lynne Moore

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF SCIENCE

Department of Animal Science

1990

ABSTRACT

GENETIC PARAMETER ESTIMATION FROM SINGLE AND MULTIPLE TRAIT
ANALYSES

By

Terri Lynne Moore

The estimation of genetic parameters from unbalanced data with
information on many traits poses severe computing problems. Subsets of
traits may be repeatedly selected such that multitrait models are
computationally manageable and biases in estimates are minimized. Such
biases, if they exist, may depend on the magnitude of the underlying true
parameters. The current method of choice for estimating (co)variance
components is derived from the normal density function. Its
statistical properties may not hold if the data analyzed are not
normally distributed. This work examined, by simulation, three
potential sources of bias in estimating genetic parameters: the number
of traits in the analysis, the magnitude of the underlying genetic
parameters, and violation of normality assumptions. Results indicate
that both accuracy and precision of genetic correlation estimates were
dependent upon all three sources examined. The exception was that
precision of heritability estimates was dependent only on the true
underlying heritability value.

ACKNOWLEDGEMENTS

I'd like to thank my committee members for their guidance and
support: Dr. Ivan Mao, Dr. Ted Ferris, and Dr. John Gill. They
have all been very patient and understanding throughout the years
I have been here.

Many of the good friends I have made here have encouraged and
supported me over the past three years. In particular, Florah,
Gwang-Joo, and Gustavo: they have all made the hard times a little
easier to get through and could always put a smile on my face.

I would especially like to thank Just Jensen. He has more
patience and understanding than anyone I know. We have been through
a lot together and I hope one day I can give to him all that he has
given to me.

I would like to thank Sheryl Hulet for helping in preparing
this thesis, but more importantly for being a good friend. She has
a way of making the work place a lot more bearable and almost even
fun.

I would also like to thank my family for all their support.
They have always been there when I have needed them, and I will
definitely need them in the next four years!

TABLE OF CONTENTS

LIST OF TABLES

1. Introduction
   1.1 Number of Traits
   1.2 Underlying Population Parameters
   1.3 Normality Assumptions

2. Objectives

3. Review of Literature
   3.1 Introduction
   3.2 (Co)variance Component Estimation
       3.2.1 History
       3.2.2 EM-REML Method of Estimation
       3.2.3 Models
   3.3 Advantages of Multiple Trait Analysis
       3.3.1 Improved Accuracy and Reduction in Selection Bias
       3.3.2 Evaluation of All Animals
       3.3.3 Estimation of Covariance With No Crossproducts
   3.4 Limitations of Multiple Trait Analysis
       3.4.1 Computational Requirements
       3.4.2 Degree of Correlation Among Traits

4. Comparison of Genetic Parameter Estimates From Single and
   Multiple Trait Analyses
   4.1 Abstract
   4.2 Introduction
   4.3 Materials and Methods
       4.3.1 Simulation Procedure
       4.3.2 Statistical Analysis
   4.4 Results
       4.4.1 Biases
       4.4.2 Mean Square Errors
       4.4.3 Correlations
       4.4.4 Grouping of Traits
   4.5 Conclusions

5. Comparison of Genetic Parameter Estimates From Single and
   Multiple Trait Analyses When Underlying Distribution is
   Skewed
   5.1 Abstract
   5.2 Introduction
   5.3 Materials and Methods
       5.3.1 Simulation Procedure
       5.3.2 Statistical Analysis
   5.4 Results
       5.4.1 Biases
       5.4.2 Mean Square Errors
       5.4.3 Correlations
   5.5 Conclusions

6. Summary
   6.1 Heritability
   6.2 Genetic Correlations
   6.3 Sampling Subsets of Traits

Bibliography
LIST OF TABLES

Table

1. Underlying parameter structures investigated

2. Percent biased heritability and genetic correlation
   estimates

3. Average root mean square errors for heritability estimates

4. Average root mean square errors for genetic correlation
   estimates

5. Average root mean square errors of genetic correlation
   estimates falling in the range (.08-.10) and (.26-.48)

6. Correlations between estimates of heritability for
   two extreme values (.1, .8) for multiple trait and
   single trait analyses

7. Correlations between estimates of genetic correlations
   for all possible pairs of multiple trait analyses

8. Underlying parameter structures investigated for skewed
   traits

9. Number of analyses run and number of estimates obtained
   from each replicate of each parameter combination

10. Percent biased estimates for those situations having a
    greater than expected number (P<.05)

11. Average root mean square errors for heritability estimates
    from skewed underlying distributions

12. R² values and standardized partial regression
    coefficients for three multiple regression models with
    standard error of heritability estimate as the
    dependent variable

13. Average root mean square errors of genetic correlation
    estimates from skewed underlying distributions

14. R² values and standardized partial regression
    coefficients for three multiple regression models with
    standard error of genetic correlation estimates as the
    dependent variable

15. Correlations of heritability estimates for two extreme
    values (.1, .8) between multiple and single trait
    analyses

16. Correlations of genetic correlation estimates when one
    trait is skewed between different multiple trait
    analyses

1. INTRODUCTION

Genetic parameter estimates, such as heritabilities and genetic
and phenotypic correlations, are obtained from estimates of
(co)variance components, usually from large unbalanced data sets with
information on many traits. The use of all data on all traits often
leads to models that are computationally demanding. To alleviate this
difficulty, data may be sampled by including only a subset of the traits
of interest. The current method of choice for the estimation of
genetic parameters is Restricted Maximum Likelihood (REML), which is
derived from the normal density function. If data to be analyzed are
not normally distributed, estimation results may not have the
desirable statistical properties of REML.

These are two of a number of factors that may potentially affect
the accuracy and precision of genetic parameter estimates from multiple
trait analyses (MTA). The understanding of these possible sources of
bias should enable one to develop strategies in sampling subsets of
traits that yield high estimation accuracy and precision while
minimizing computational requirements.

1.1 Number of Traits

Studies in the literature have indicated that the accuracy of genetic
parameter estimates may depend on the number of traits in an MTA.
Buttazzoni and Mao (1989) examined the estimates of sire and residual
variance components and heritability estimates from both single and
multiple trait analyses of the same data set and found that while
residual components from MTA were slightly greater than those from
single trait analyses, sire variance components were consistently much
greater.

Lin and Lee (1986) compared estimates from single trait and
multiple trait analyses of the same data set, and also examined
the effects of sequentially adding traits to a mixed model on parameter
estimation by MTA. Their results suggested that parameter estimates
may vary depending on the type of analysis (single or multiple trait)
and upon the other traits included in an MTA. They found that heritability
estimates of a given trait or genetic correlation estimates of two
given traits change as additional traits are added to or deleted from an
MTA. Walter and Mao (1985) found similar results for genetic
correlations from single trait and two-trait analyses. This suggests
that differences in parameter estimates reflect the joint contribution
of other correlated traits which are omitted in subset MTA with a smaller
number of traits. This is a direct consequence of using different
(co)variance matrices. Therefore, it appears that genetic and phenotypic
parameter estimates are conditional upon the other traits included in
simultaneous analyses. Schaeffer and Wilton (1981) reported similar
findings in that differences they found in the sign of the correlations
between sire proofs depended upon the number of traits in an MTA.

1.2 Underlying Population Parameters

The magnitude of the population parameters from which the sample
was obtained may affect the accuracy and precision of genetic parameter
estimates. Schaeffer (1984) studied reductions in prediction error
variances (PEV) from two-trait models over single trait models using
various combinations of genetic and residual correlations. He found
that the percentage increase in accuracy was dependent on the
difference between genetic and residual correlations, implying that the
ability of MTA to increase accuracy of estimation is dependent on the
levels of correlations used.

Walter and Mao (1985) compared (co)variance REML estimates under
various genetic and residual correlations in simulated populations.
Results indicated that while estimates of residual variances were
consistent across different levels of genetic and residual
correlations, estimates of sire variances tended to decrease as genetic
correlation increased.

1.3 Normality Assumptions

A violation of normality assumptions may bias (co)variance
component estimates. Both Maximum
Likelihood (ML) and REML procedures require random effects contributing
to the observation vector to be random samples from underlying normal
populations. This may not be the case in many instances. Traits such
as calving difficulty, litter size, and conformation are either
subjectively scored or categorized due to the discrete nature of the
units of measurement. The symmetry of the distribution is, therefore,
dependent upon the frequencies within each class. However, such traits
are assumed to be normally distributed when ML or REML is applied.

The effects of selection may also cause the distribution of a
random factor to be skewed (Banks and Mao, 1985). Cows are culled at
various stages in their lifetime for a variety of reasons. Thus, the
population of older cows is more likely a selected population. This
could cause skewness of residuals in the model. Intense selection in
the male population and groups of half siblings could cause skewness in
the sire distribution.
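
The way culling can skew a distribution may be illustrated with a small simulation. This is only a sketch under stated assumptions: the records are drawn from a standard normal distribution, the culling rule (keep the top 50%) is hypothetical, and the seed, the sample size and the helper `sample_skewness` are illustrative choices that do not come from the studies cited above.

```python
import numpy as np

def sample_skewness(x):
    """Moment-based sample skewness: m3 / m2**1.5."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

rng = np.random.default_rng(1990)              # arbitrary seed
records = rng.normal(0.0, 1.0, size=100_000)   # unselected, normally distributed records

# Hypothetical culling rule: only records above the median survive.
cutoff = np.quantile(records, 0.5)
survivors = records[records > cutoff]

skew_before = sample_skewness(records)
skew_after = sample_skewness(survivors)
print(f"skewness before culling: {skew_before:.3f}")
print(f"skewness after culling:  {skew_after:.3f}")
```

Under these assumptions the unselected records are close to symmetric, while the surviving records show clear positive skewness, which is the kind of departure from normality discussed in this section.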

Buttazzoni and Mao (1989) indicated that the discrepancies found
between single trait and multiple trait estimates appeared to be
inversely related to the magnitude of the estimates and directly
related to the skewness of the residuals. Banks and Mao (1985)
examined the dispersion and asymptotic biasedness properties of
variance component estimates through REML for single trait analyses
when sire and residual variances were skewed. Results indicated that
the method of estimation appeared robust to skewed distributions in
terms of accuracy, while it was not in terms of precision, as sampling
variances of the estimates were greater in skewed distributions.

2. OBJECTIVES

The objectives of this work were to examine, by simulation, three
possible sources of bias in genetic parameter estimates from single and
multiple trait analyses: 1) the number of traits included in an
analysis, 2) the magnitude of the underlying parameters, and 3) the
effect of violation of normality assumptions. If patterns of bias can
be found, guidelines in sampling subsets of traits that would yield
estimates with optimal properties while minimizing computational
requirements may be developed.

3. REVIEW OF LITERATURE

3.1 Introduction

Multiple trait analysis (MTA) utilizes information from all traits
to estimate (co)variances and evaluate animals through genetic and
environmental correlations between traits. Advances in computer
technology and concern about effects of selection on traits measured
have resulted in increased interest in multiple trait models.
Simplifications in computing often can be made for specific types of
multiple trait models, such as when every animal is measured for each
trait or if the same model can be used for all traits (Meyer, 1986).

Estimates of genetic parameters obtained through MTA have been
studied by several workers (Schaeffer and Wilton, 1981; Walter and Mao,
1985; Lin and Lee, 1987). Little is known, however, about
properties of these estimates, in terms of accuracy and precision, and
factors affecting these two properties. Factors which may have a
varying influence in REML estimation on the accuracy and precision of
the estimates include the magnitude of correlations of genetic
elements of traits, number of traits in an analysis, and robustness of
the REML procedure against severe violations of distribution
assumptions.

3.2 (Co)variance Component Estimation

3.2.1 History

Genetic parameter estimates such as heritabilities and genetic and
phenotypic correlations are obtained from estimates of
(co)variance components. Several methods for (co)variance component
estimation exist. Searle (1989) provides an extensive review on
variance component estimation methods. The methods reviewed include
Analysis-of-Variance (ANOVA) methods for balanced data, Henderson's
methods 1, 2 and 3, Minimum Norm Quadratic Unbiased Estimation (MINQUE),
Maximum Likelihood (ML) and Restricted Maximum Likelihood (REML) for
use on unbalanced data.
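
As an illustration of how genetic parameters follow from (co)variance components, the sketch below assumes a half-sib sire model with unrelated sires, in which the sire variance is one quarter of the additive genetic variance. That assumption, the function names and the numerical values are illustrative only and are not taken from the analyses in this thesis.

```python
import math

def heritability_sire_model(var_sire, var_resid):
    """h2 = 4 * sire variance / phenotypic variance (half-sib sire model assumed)."""
    return 4.0 * var_sire / (var_sire + var_resid)

def genetic_correlation(cov_sire_12, var_sire_1, var_sire_2):
    """r_g = sire covariance / sqrt(product of the two sire variances)."""
    return cov_sire_12 / math.sqrt(var_sire_1 * var_sire_2)

# Made-up component estimates for two traits.
h2 = heritability_sire_model(var_sire=0.25, var_resid=3.75)
rg = genetic_correlation(cov_sire_12=0.1, var_sire_1=0.25, var_sire_2=0.16)
print(round(h2, 3), round(rg, 3))
```

Because both parameters are ratios of components, any bias or sampling error in the component estimates carries over to the heritability and correlation estimates.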

For many years analysis-of-variance (ANOVA) and analysis-of-covariance
(ANCOVA) estimations were the standard procedures to
estimate genetic and environmental (co)variances for both balanced and
unbalanced data. For the balanced case, ANOVA estimators are
translation invariant and minimum variance quadratic unbiased, and can be
used regardless of distributional properties, but they can also yield
negative estimates. When normality is assumed, the estimators are not
just minimum variance quadratic unbiased but minimum variance
unbiased. For the unbalanced case, however, the only known properties
of ANOVA and ANCOVA methods were translation invariance, i.e. invariance
to the fixed effects in the model, and unbiasedness, but even the
latter property no longer holds for populations under selection.
MINQUE of Rao (1971) was the first attempt to minimize sampling
variances, i.e. the variances of the estimates, in the class of
translation-invariant quadratic unbiased estimators. The smaller the sampling
variance of an estimator, the more efficient it is.
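
The possibility of negative ANOVA estimates can be seen in a minimal sketch for the balanced one-way random model. The function name `anova_one_way` and the records are made up for illustration; the between-group component is estimated as (MSB - MSW)/k, which becomes negative whenever the between-group mean square falls below the within-group mean square.

```python
import numpy as np

def anova_one_way(groups):
    """Balanced one-way random model: return (var_between, var_within)."""
    data = np.asarray(groups, dtype=float)   # shape: (s groups, k records each)
    s, k = data.shape
    group_means = data.mean(axis=1)
    msb = k * ((group_means - data.mean()) ** 2).sum() / (s - 1)
    msw = ((data - group_means[:, None]) ** 2).sum() / (s * (k - 1))
    return (msb - msw) / k, msw

# Made-up records: three groups whose means happen to coincide.
var_between, var_within = anova_one_way([[1, 5], [2, 4], [3, 3]])
print(var_between, var_within)
```

Here the group means are identical, so MSB is zero and the estimated between-group variance is negative even though a variance cannot be.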

Hartley and Rao (1967) developed the first general method for
dealing with unbalanced data in ML estimations which demand the
assumption of some known form of distribution function for the data
vector. The ML procedure of Hartley and Rao yields simultaneous
estimation of both the fixed effects and the variance components.

ML estimators are derived by maximizing the likelihood over the
parameter space, which is non-negative as far as variance components
are concerned. ML estimators of variance components, however, do not
coincide with the estimators derived from ANOVA methods, since the
latter can take negative values. The two also differ in the divisors
of certain mean squares, which makes some of the ML solutions
biased: the ML estimators of the variance components
take no account of the loss in degrees of freedom (d.f.) resulting from
the estimation of the fixed effects.
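
The degrees-of-freedom point can be made concrete in the simplest special case, a linear model with fixed effects only, where both estimators of the residual variance have closed forms. The following is a sketch with simulated data; the design, the coefficients and the seed are arbitrary assumptions. ML divides the residual sum of squares by n, while REML divides by n minus the rank of X.

```python
import numpy as np

rng = np.random.default_rng(7)                 # arbitrary seed
n, p = 20, 4                                   # records and fixed-effect columns
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 0.5, -0.5, 2.0]) + rng.normal(scale=2.0, size=n)

# Least-squares solutions for the fixed effects and the residual sum of squares.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = float(((y - X @ b_hat) ** 2).sum())

var_ml = rss / n          # ignores the d.f. used to estimate b
var_reml = rss / (n - p)  # accounts for the p estimated fixed effects
print(f"ML: {var_ml:.3f}  REML: {var_reml:.3f}")
```

With few records relative to the number of fixed effects, the ML estimate is noticeably smaller than the REML estimate; the two converge as n grows with p held fixed.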

Patterson and Thompson (1971) extended this procedure to
Restricted Maximum Likelihood (REML), modifying the ML procedure of
Hartley and Rao by adapting a transformation which partitions the
likelihood under normality into two parts, one due to the fixed
effects, and one due to error contrasts free of the fixed effects,
i.e., contrasts with expectation independent of the fixed effects.
Maximizing this latter likelihood yields what are called REML
estimators. Patterson and Thompson (1971) described this procedure for
the univariate case. Thompson (1973) subsequently extended it to the
multivariate case. A comprehensive review of ML approaches to variance
component estimation, their properties, and problems of application was
presented by Harville (1977). Searle (in Paper BU-673-M, Biometrics
Unit, Cornell University, 1979) gave a detailed account of ML and
related procedures, summarizing and comparing the algebra for
alternative approaches in the univariate case. Both authors emphasized
REML. Thompson (1982) discussed REML to estimate genetic parameters in
animal breeding. A number of studies reported REML algorithms for
specific analyses in this field (Thompson, 1977; Schaeffer et al.,
1978; Lin and Lee, 1986).

3.2.2. EM-REML Method of Estimation

The use of REML in the estimation of (co)variance components has
become increasingly popular due to its desirable statistical
properties. REML, like ML, has the property of invariance under
translation. Furthermore, REML estimators have the additional property
of reducing to ANOVA estimators for many, if not all, cases of balanced
data, unlike the ML estimators of Hartley and Rao (1967) (Corbeil and
Searle, 1976). This additional property is a useful one because of the
optimal properties of ANOVA estimators from balanced data, particularly
minimum variance properties. A second additional property of REML, as
indicated by Meyer and Thompson (1982) and Henderson (1987), is that it
may have considerable power to eliminate selection bias due to culling.
However, all data used in selection decisions must be included in the
analysis if the selection bias is to be eliminated.

The REML sets of equations, as with ML, are non-linear in the
variance component estimators and must be solved numerically, usually
by iteration. Hence, computational requirements are extensive.
Several algorithms for obtaining REML estimates of variance components
exist and can be classified according to the information from
derivatives of the likelihood function that is utilized: first and
second derivatives, first only, and derivative-free. Procedures such
as Newton-Raphson and Fisher's method of scoring require second
derivatives (their expected values in the case of Fisher's method of
scoring). Expectation-Maximization (EM)-type
algorithms exploit first derivative information. Derivative-free
approaches obtain REML estimates by direct maximization of the
likelihood function using standard optimization procedures.

The EM algorithm (Dempster et al., 1977) has been the most
frequently used due to the relative ease of programming required,
expressions that are intuitively easy to understand, and the guarantee
that estimates are in the parameter space. The latter property is not
a feature of other algorithms for ML or REML.

The EM algorithm is iterative in nature and is directed at finding
values of the parameter vector, φ, which maximize the density g(y|φ)
given an observed y, but it does so by making use of the associated
family f(x|φ) that is thought to represent the population from which
the sample comes. Each iteration of the algorithm consists of an
expectation step followed by a maximization step. The expectation step
assumes that the current estimate is equal to the true parameter and,
given the observation vector, computes the
expectations of quadratic forms. The maximization step then computes
the new estimates by simply dividing the expectations of quadratic
forms by the suitable d.f.

However, the EM algorithm, in general, converges slowly. Several
attempts have been made to speed up convergence, such as the common
intercept approach (Schaeffer, 1979), the use of relaxation factors, or
non-linear adjustment (Misztal and Schaeffer, 1986), which reduce the
number of iteration rounds required for convergence. Other approaches
to reduce the amount of computation in each round of iteration involve
the use of canonical, Householder, and Cholesky transformations applied
to different elements of the EM algorithm.

3.2.3 Models

For any statistical analysis there is an assumed or implied
model. Linear models are extensively used because they are easily
applied.
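Consider t traits and let the model for each be denoted as:

\[ y_i = X_i b_i + Z_i u_i + e_i \qquad (i = 1, \ldots, t) \qquad [1] \]

where

$y_i$ is an observation vector for trait i of length $n_i$;
$b_i$ is an unknown vector of fixed effects for the ith trait of
length $p_i$;
$u_i$ is an unknown vector of random effects for the ith trait of
length $q_i$;
$e_i$ is an unknown vector of random residuals corresponding to $y_i$;
and
$X_i$ and $Z_i$ are observed incidence matrices of order $n_i \times p_i$ and
$n_i \times q_i$, respectively.

If the design matrices are equal for all t traits, i.e. if
$X_i = X$ and $Z_i = Z$, then the model for all t traits simultaneously can
be written as a direct extension of [1]:

\[ y = (I_t * X) b + (I_t * Z) u + e \qquad [2] \]

where $y = \mathrm{vec}\, Y$, "*" denotes the direct product operation (Searle,
1982),

\[ b' = [b_1', b_2', \ldots, b_t'], \]
\[ u' = [u_1', u_2', \ldots, u_t'], \text{ and} \]
\[ e' = [e_1', e_2', \ldots, e_t']. \]

The expectations, E( ), and (co)variances, V( ), are:

\[ E(y) = (I_t * X) b, \quad E(u) = 0, \quad E(e) = 0 \]

\[
V(u) = V \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_t \end{bmatrix} = G =
\begin{bmatrix}
G_{11} & G_{12} & \cdots & G_{1t} \\
       & G_{22} & \cdots & G_{2t} \\
       &        & \ddots & \vdots \\
\text{symmetric} & & & G_{tt}
\end{bmatrix}
\]

With one random factor per trait with the same number of classes then

\[
G = \begin{bmatrix}
g_{11} I_s & g_{12} I_s & \cdots & g_{1t} I_s \\
           & g_{22} I_s & \cdots & g_{2t} I_s \\
           &            & \ddots & \vdots \\
\text{symmetric} & & & g_{tt} I_s
\end{bmatrix}
= I_s * G_0
\quad \text{where} \quad
G_0 = \begin{bmatrix}
g_{11} & g_{12} & \cdots & g_{1t} \\
       & g_{22} & \cdots & g_{2t} \\
       &        & \ddots & \vdots \\
\text{symmetric} & & & g_{tt}
\end{bmatrix}
\]

Similarly,

\[
V(e) = V \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_t \end{bmatrix} = R =
\begin{bmatrix}
R_{11} & R_{12} & \cdots & R_{1t} \\
       & R_{22} & \cdots & R_{2t} \\
       &        & \ddots & \vdots \\
\text{symmetric} & & & R_{tt}
\end{bmatrix}
= \begin{bmatrix}
r_{11} I_n & r_{12} I_n & \cdots & r_{1t} I_n \\
           & r_{22} I_n & \cdots & r_{2t} I_n \\
           &            & \ddots & \vdots \\
\text{symmetric} & & & r_{tt} I_n
\end{bmatrix}
= I_n * R_0
\]

\[
\text{where} \quad
R_0 = \begin{bmatrix}
r_{11} & r_{12} & \cdots & r_{1t} \\
       & r_{22} & \cdots & r_{2t} \\
       &        & \ddots & \vdots \\
\text{symmetric} & & & r_{tt}
\end{bmatrix}
\]

Then

\[ V(y) = V = Z (I_s * G_0) Z' + I_n * R_0 \]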
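
The (co)variance structure of y can be assembled numerically with a Kronecker (direct) product. The sketch below is illustrative only: the values of `G0` and `R0`, the sire assignment and the variable names are made up, and it assumes that records are ordered by animal with traits nested within animal, so that NumPy's `np.kron(np.eye(s), G0)` places G0 on the diagonal blocks. Under that ordering the single-trait incidence matrix must itself be expanded across traits before use.

```python
import numpy as np

t, s, n = 2, 3, 6                          # traits, sires, animals (all recorded for both traits)
G0 = np.array([[1.0, 0.3],
               [0.3, 0.5]])                # sire (co)variances among traits (made up)
R0 = np.array([[4.0, 1.0],
               [1.0, 3.0]])                # residual (co)variances among traits (made up)

sire_of_animal = np.array([0, 0, 1, 1, 2, 2])
Z1 = np.eye(s)[sire_of_animal]             # n x s single-trait incidence matrix
Z = np.kron(Z1, np.eye(t))                 # expanded to all traits: (n*t) x (s*t)

G = np.kron(np.eye(s), G0)                 # I_s * G0
R = np.kron(np.eye(n), R0)                 # I_n * R0
V = Z @ G @ Z.T + R                        # (co)variance matrix of all records

print(V[:4, :4])                           # block for the two progeny of sire 0
```

In the printed block, the two records of one animal share G0 + R0, records of paternal half sibs share only G0, and records of progeny of different sires are uncorrelated.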

The model is therefore characterized by the assumed
structures of G and R. Under these assumptions, the mixed model
equations that would yield the best linear unbiased estimator (BLUE) of
the fixed effects and the best linear unbiased predictor (BLUP) of the
random effects can be written as (Henderson, 1973):

\[
\begin{bmatrix}
X'X * R_0^{-1} & X'Z * R_0^{-1} \\
Z'X * R_0^{-1} & Z'Z * R_0^{-1} + I_s * G_0^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} (X' * R_0^{-1}) y \\ (Z' * R_0^{-1}) y \end{bmatrix}
\qquad [3]
\]

where G and R contain a priori estimates in [3].

Let C, a generalized inverse of the coefficient matrix in [3],
be partitioned by denoting the ith and jth submatrix of C corresponding
to the ith and jth subvectors of u as $C_{ij}$.

Utilizing expressions by Dempster et al. (1977), the EM
algorithm to estimate the ijth elements in G and R are:

\[ g_{ij}^{(k+1)} = \left[ \hat{u}_i'^{(k)} \hat{u}_j^{(k)} + \mathrm{tr}\left(C_{ij}^{(k)}\right) \right] / q \qquad [4] \]

\[ r_{ij}^{(k+1)} = \left[ \hat{e}_i'^{(k)} \hat{e}_j^{(k)} + \mathrm{tr}\left(B_{ij}^{(k)}\right) \right] / n \qquad [5] \]

for the kth round of iteration, where $B_{ij}$ is the submatrix of $WCW'$
corresponding to the ijth pair of traits, and where $W = [X : Z]$. These
expressions are due to Henderson (1984), and correspond to the REML
estimators given by Patterson and Thompson (1971) and Harville (1977)
but extended to multiple traits.
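
The iteration in [4] and [5] can be sketched for the single-trait case (t = 1), where $G_0$ and $R_0$ reduce to scalars. This is a minimal illustration and not the program used in this work: the function `em_reml`, the starting values, the stopping rule and the simulated balanced half-sib data are all assumptions, and X is taken to have full column rank. Since the data are balanced, the converged values should agree with the ANOVA estimates, which gives a simple check.

```python
import numpy as np

def em_reml(y, X, Z, var_u=1.0, var_e=1.0, max_rounds=1000, tol=1e-12):
    """Single-trait EM-REML for y = Xb + Zu + e, V(u) = I*var_u, V(e) = I*var_e."""
    n, p, q = len(y), X.shape[1], Z.shape[1]
    W = np.hstack([X, Z])
    for _ in range(max_rounds):
        # Mixed model equations built from the current variance estimates.
        lhs = W.T @ W / var_e
        lhs[p:, p:] += np.eye(q) / var_u
        C = np.linalg.pinv(lhs)                    # (generalized) inverse
        sol = C @ (W.T @ y / var_e)
        u, e = sol[p:], y - W @ sol
        # Single-trait analogues of [4] and [5].
        new_u = (u @ u + np.trace(C[p:, p:])) / q
        new_e = (e @ e + np.trace(W @ C @ W.T)) / n
        change = abs(new_u - var_u) + abs(new_e - var_e)
        var_u, var_e = new_u, new_e
        if change < tol:
            break
    return var_u, var_e

# Simulated balanced half-sib data (all values are illustrative).
rng = np.random.default_rng(42)
s, k = 30, 10                                      # sires, progeny per sire
sire = np.repeat(np.arange(s), k)
Z = np.eye(s)[sire]                                # sire incidence matrix
X = np.ones((s * k, 1))                            # overall mean as the only fixed effect
y = 10.0 + Z @ rng.normal(0, 1.0, s) + rng.normal(0, 1.0, s * k)

var_u_hat, var_e_hat = em_reml(y, X, Z)
print(f"sire variance: {var_u_hat:.4f}  residual variance: {var_e_hat:.4f}")
```

Each round solves the mixed model equations with the current variances (expectation step) and then divides the expected quadratic forms by q and n (maximization step), so the estimates remain within the parameter space as long as the starting values are positive.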
3.3 Advantages of Multiple Trait Analysis
3.3.1. Improved Accuracy and Reduction in Selection Bias

In the statistical analysis of animal breeding data, traits are
often considered one at a time without considering other traits
measured on the same individual. Usually one is interested, however,
not only in the mode of inheritance of a particular trait, but also in
its relationships with other traits and expected changes in the latter
when selecting on the particular trait analyzed. For these cases
multivariate analyses are required to obtain estimates of genetic and
phenotypic correlations between traits. Moreover, while univariate
analyses implicitly assume that all correlations are zero, joint
analyses of correlated traits utilize information from all traits to
obtain estimates for a specific trait and are thus likely to yield more
accurate results. This is of particular relevance when data are not a
random sample. For animal breeding data this is often the case since,
typically, data originate from selection experiments or are field
records from livestock improvement schemes which select animals on the
basis of performance. Usually one or more traits have undergone
selection resulting in missing observations for some traits. This is
particularly true in sequential culling, where observations on one
trait are used for selection, and the selected group of animals then is
measured for a subsequent trait.

In these situations, univariate analyses are expected to be biased
while multivariate analyses may account for selection. Pollak et al.
(1984) examined the ability of multiple trait methods to reduce or
eliminate selection bias, arising either from sequential selection or
from selection on a correlated trait. Results indicated that in both
cases, bias in the single trait evaluation was eliminated by multiple trait
procedures. Henderson (1975) showed that multiple trait models
account for selection bias if selection is described as a translation
invariant function of the traits that had been used to make selection
decisions.

Analysis-of-variance methods have been used widely to estimate
genetic and phenotypic correlations. These require records for all
traits for all individuals. If there are missing records, this implies
that part of the relevant information is ignored. If the lack of
records is the outcome of selection based on some criterion correlated
to trait(s) under analysis, estimates are likely to be biased by
selection (Meyer, 1989). In contrast, ML estimation procedures utilize
all records available and, under certain conditions, account for
selection. Essentially, it is required that all information, unless
totally uncorrelated, on which selection decisions have been based be
included in the analysis. Even if these conditions are only partially
fulfilled, ML estimates are often considerably less biased by selection
than their ANOVA counterparts (Meyer and Thompson, 1984).

3.3.2. Evaluation of All Animals

A second advantage of MTA that has been noted by several authors
(Schaeffer and Wilton, 1981; Schaeffer, 1984) is that it allows every
animal to be evaluated for all traits without actually being observed
for all traits. This is due to the non-zero genetic and residual
covariances among traits that are incorporated into the analysis.
Therefore, in multiple trait analysis, evaluation of an animal for a
trait is composed of contributions from all traits in the analysis.
For example, a sire that has no progeny recorded for calving ease can
have a calving ease evaluation which is based upon genetic correlations
of calving ease with the other traits in the analysis. This cannot be
accomplished with single trait analyses unless the relationship matrix
is used (Schaeffer, 1984). The correlation between error effects for
different traits will have a direct effect on the contribution from an
observation on a trait (Schaeffer, 1984). As the absolute value of the
error correlation increases, the weight on observations from other
traits also increases. Therefore, in some instances, multiple trait
evaluations could be greatly different from single trait evaluations
because of correlation among traits. As reported by Schaeffer and
Wilton (1981), the accuracy of the evaluation will also be dependent
upon the number of progeny with observations on the other traits.

3.3.3. Estimation of Covariance with No Crossproducts

A third advantage of MTA is the estimation of covariance
components when crossproducts or sums of two traits do not exist and
the same linear model is not possible for both traits. Usually,
covariance components are estimated between traits measured on the same
individual with the same linear model being assumed for each trait.
Consider, for example, yearling weights on male and female beef calves.
A different model is appropriate for each sex. Also, to estimate the
covariance between male and female yearling weights, a procedure of
using crossproducts and sums on each individual is not possible. If an
interaction of sire-by-sex of calf is present, the sire component of
covariance would yield a genetic correlation less than unity. This
would mean that sires need to be evaluated from their female and male
progeny separately. The estimation of such a covariance would require
a multiple trait procedure.

3.4 Limitations of Multiple Trait Analysis

3.4.1. Computational Requirements

One limitation of applying multiple trait analysis is the
increased number of equations to be solved. Computations are usually
cumbersome, time consuming, and costly. One would need to weigh the
gain in accuracy, with special reference to the elimination of
selection bias, against the extra computing effort. There are
situations, however, in which models for different traits cannot be the
same and/or different traits cannot be measured on the same
individuals; in these cases multiple trait methods become a necessity.

Restricted maximum likelihood estimation requires the inversion of
the entire coefficient matrix of the mixed model equations. With direct
inversion, the time required is proportional to the cube of the order
of the coefficient matrix (n): CPU time is approximately $cn^3$, where
c is a constant. Strategies have been developed that
alleviate this problem. One of them is to eliminate the fixed effects
in the model by absorption, so that solutions are required only for the
random factors in the model.
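
As an illustration of absorption, consider a single trait sire model with unrelated sires. This is a simplifying assumption made only for this sketch, and the symbols X, Z, b, s, M and $\lambda$ are introduced here rather than taken from the text. Under that assumption, the mixed model equations and their absorbed form can be written as:

```latex
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda I \end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{s} \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \end{bmatrix},
\qquad \lambda = \sigma^2_e / \sigma^2_s

% Absorb the fixed effects using M = I - X (X'X)^{-} X'
(Z'MZ + \lambda I)\,\hat{s} = Z'My
```

The absorbed system has order equal to the number of sires rather than the number of sires plus fixed effect levels, so the matrix to be inverted is smaller.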

Other shortcuts applied to calculations have been developed that
depend on the particular model being used. Many of these shortcuts
involve the use of transformations applied to different elements in the
EM algorithm. When the model is assumed to contain only
one random factor and the same model is used for all traits, a
canonical transformation can be applied. The purpose of canonical
transformation is to obtain a set of canonical variates, between which
all covariances are zero, without loss of any information contained in
the original variables. Variance components are estimated for each
canonical trait using single trait analysis, and then transformed back
to the original scale.
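
A minimal sketch of a canonical transformation for two traits is given here, assuming a sire (co)variance matrix G and a residual (co)variance matrix R with hypothetical values. The function name and the route used (a Cholesky factor of R followed by a rotation) are illustrative choices and are not taken from the methods cited in the text.

```python
import math


def matmul(a, b):
    """Multiply two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def canonical_transform_2x2(G, R):
    """Return (T, d) such that T R T' = I and T G T' = diag(d).

    G and R are 2x2 symmetric matrices; R must be positive definite.
    """
    # Cholesky factor of R, so that R = L L'.
    l11 = math.sqrt(R[0][0])
    l21 = R[1][0] / l11
    l22 = math.sqrt(R[1][1] - l21 * l21)
    # Inverse of the lower-triangular factor L.
    l_inv = [[1.0 / l11, 0.0],
             [-l21 / (l11 * l22), 1.0 / l22]]
    # M = L^-1 G L^-T is symmetric; a plane rotation makes it diagonal.
    m = matmul(matmul(l_inv, G), transpose(l_inv))
    theta = 0.5 * math.atan2(2.0 * m[0][1], m[0][0] - m[1][1])
    c, s = math.cos(theta), math.sin(theta)
    q_t = [[c, s], [-s, c]]
    T = matmul(q_t, l_inv)
    D = matmul(matmul(T, G), transpose(T))
    return T, [D[0][0], D[1][1]]


# Hypothetical sire (G) and residual (R) (co)variance matrices.
G = [[0.050, 0.020], [0.020, 0.100]]
R = [[1.000, 0.500], [0.500, 1.000]]
T, d = canonical_transform_2x2(G, R)
TRT = matmul(matmul(T, R), transpose(T))  # identity on the canonical scale
TGT = matmul(matmul(T, G), transpose(T))  # diagonal on the canonical scale
```

On the canonical scale the residual variances are one and all covariances are zero, so each canonical variate can be analysed as a single trait. Estimates are returned to the original scale by applying the inverse of T.
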

3.4.2. Degree of Correlation Among Traits

The utilization of MTA would seem to be most advantageous when the
absolute values of the correlations between traits are high so that
information on each trait would contribute more to the accuracy of
estimation and/or prediction in other traits. However, as indicated by
Meyer (1985), Hill and Thompson (1978), Seal (1966) and others, in the
analysis of highly correlated traits there is a strong chance,
depending on the method of estimation and the amount of data, that the
estimated (co)variance matrices are not within the allowable parameter
space. To be within the allowable parameter space, an estimated
(co)variance matrix must be positive definite. The probability of
obtaining non-positive definite (co)variance matrices increases with
the number of traits.

4. Comparison of Genetic Parameter Estimates From Single and
Multiple Trait Analyses

4.1 ABSTRACT

Discrepancies between estimates of genetic parameters from single
and multiple trait analysis of the same data set were examined by
simulation. Two possible causes were studied: the magnitude of the
underlying genetic parameters, and the number of traits in the
analysis. Different situations were simulated to cover a range of
heritabilities as well as genetic correlation structures. A model with
fixed management group effects and random sire effects was used to
simulate records for four traits. All animals were recorded for all
traits and the same model was used for each trait. Genetic parameters
were estimated using an EM-REML algorithm with canonical and
Householder transformations.

Overall, no significant biases were found in heritability
estimates. Biases in genetic correlations tended to occur more when
the underlying correlation was negative. Inclusion of correlated traits
in a multiple trait analysis did not increase accuracy of heritability
estimates but did increase accuracy of estimates of genetic
correlations.

Correlations between the estimates for both heritability and
genetic correlations for different analyses were all high (.78 and
greater). Correlations between the estimates for heritability were
highest between two trait and single trait analyses and lowest between
four trait and single trait analyses. Correlations between the
estimates of genetic correlations were highest between four trait and
three trait analyses and lowest between four trait and two trait
analyses. Correlations for both heritability and genetic correlation
estimates increased as the underlying heritability increased and were
highest under a positive genetic correlation.

4.2 INTRODUCTION

Unlike single trait analysis (STA), multitrait analysis (MTA)
takes into account all correlations among traits in estimating genetic
and phenotypic parameters. The advantage of MTA over STA is the
increase in estimation accuracy of components of (co)variance,
especially from populations undergoing selection (Walter and Mao,
1985). Differences in estimates resulting from the two methods can be
expected. Some studies (Schaeffer and Wilton, 1981; Lin and Lee, 1986;
Buttazzoni and Mao, 1989) report large changes in magnitude and even
signs of the estimates when applying the two methods to the same data
set. Differences such as these may result from several
factors. Estimates of genetic parameters may be biased by selection if
not all the data used in selection decisions are included in the
analysis. Biases may also occur in estimates of the parameters if some
or all the assumptions of the model are violated, for example,
normality. The true population parameters may also affect the accuracy
of the estimates, for example, the degree of association between
traits. Finally, the number of traits one decides to include in a
multitrait analysis may also have an effect on whether the estimates
are biased or not.

Even though estimation and/or prediction accuracy is greater with
MTA, one limitation of applying MTA is the increased complexity and
computing requirements. The gain in accuracy relative to the extra
computing effort needs to be considered. It would be desirable to
group traits into analyses that are computationally feasible and yet
still achieve a high degree of accuracy.

The objective of this study was to examine causes of discrepancies
between genetic parameter estimates from mixed models using EM-REML
methods of estimation in simulated unselected populations. The causes
examined were the magnitude of the underlying genetic parameters and
the number of traits included in the analysis. As a result of examining
these two causes, it was also possible to study the effect of grouping
traits for computational feasibility so that the highest accuracy
possible could be achieved.

4.3 MATERIAL AND METHODS

4.3.1 Simulation Procedure

Records were generated for four traits using the same model for
each trait. For each trait, the model was:

    $y_{ijk} = m_i + s_j + e_{ijk}$

where $y_{ijk}$ was a record on the kth progeny of sire j in the ith
management group; management group effect (m) was fixed whereas sire
effect (s) was random, and $e_{ijk}$ was the random residual. All animals
had records for all four traits. Sire and residual components were
generated from a multivariate normal distribution with expected values
$E(s_j) = 0$ and $E(e_{ijk}) = 0$ and specified sire and residual (co)variances.

Each set of data generated contained progeny records from a total
of 40 sires distributed across 22 management groups. All sires were
assumed unrelated and unselected. The number of progeny per sire
followed a Poisson distribution with a mean of 20. Progeny of sires
were assigned randomly to management groups according to a uniform
discrete distribution from one to four. Thus, each sire could have
progeny in at most four different management groups. Progeny records of
the first four sires were randomly assigned to management groups 1-4,
and those of sires 5-8 were randomly assigned to management groups 3-6,
etc. The design pattern of sires across management groups was
constructed such that the data generated were unbalanced yet connected
over management groups.
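
The design described above can be sketched as follows. This is a simplified illustration for a single trait, whereas the study generated four traits jointly from a multivariate normal distribution. The phenotypic variance of 1, the range of the management group effects, the seed, and all function names are assumptions introduced for the sketch.

```python
import math
import random

N_SIRES = 40
N_GROUPS = 22
MEAN_PROGENY = 20


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def groups_for_sire(sire):
    """Management groups (1-based) open to a 1-based sire number.

    Sires 1-4 use groups 1-4, sires 5-8 use groups 3-6, and so on,
    which keeps the design unbalanced but connected.
    """
    block = (sire - 1) // 4
    first = 2 * block + 1
    return list(range(first, first + 4))


def simulate_trait(h2, seed=1):
    """Simulate (group, sire, record) tuples for one trait."""
    rng = random.Random(seed)
    sigma2_s = h2 / 4.0          # assumes a phenotypic variance of 1
    sigma2_e = 1.0 - sigma2_s
    # Arbitrary fixed management group effects (assumed values).
    group_effect = [rng.uniform(-1.0, 1.0) for _ in range(N_GROUPS)]
    records = []
    for sire in range(1, N_SIRES + 1):
        s_j = rng.gauss(0.0, math.sqrt(sigma2_s))
        for _ in range(poisson(rng, MEAN_PROGENY)):
            group = rng.choice(groups_for_sire(sire))
            e_ijk = rng.gauss(0.0, math.sqrt(sigma2_e))
            records.append((group, sire, group_effect[group - 1] + s_j + e_ijk))
    return records


records = simulate_trait(h2=0.3)
```

Because consecutive blocks of four sires share two management groups, every group is linked to its neighbours through common sires.
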

A total of 12 underlying parameter structures were designed by
varying levels of heritabilities and genetic correlations. The
situations investigated are summarized in Table 1. Residual
correlations were kept constant at .5 across all 12 combinations, and
within each combination, genetic correlations were the same for all
pairs of traits. In the positively correlated situations, genetic
correlations of .2 and .8 were chosen to examine the effect of
weakly associated versus strongly associated traits, respectively, on
the genetic parameter estimates. In the negatively correlated
situations, genetic correlations of -.2 and -.3 were chosen to examine
the effect of a weak versus a stronger correlation on the genetic
parameter estimates. The value -.3 was chosen because of the constraints
on a covariance in relation to the variances of the other
variables. Under an equal correlation structure with four traits, the
correlation must exceed -1/3 for the (co)variance matrices to remain
positive definite, so -.3 is close to the strongest negative
correlation possible.
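
The limit on a common negative correlation can be checked directly. A matrix with ones on the diagonal and a common correlation r elsewhere has only two distinct eigenvalues, 1 + (n - 1)r and 1 - r, so it is positive definite only when r lies between -1/(n - 1) and 1. The sketch below uses that standard result; the function names are illustrative.

```python
def equicorrelation_eigenvalues(n_traits, r):
    """Distinct eigenvalues of an n x n matrix with 1 on the diagonal and r elsewhere."""
    return (1.0 + (n_traits - 1) * r, 1.0 - r)


def is_positive_definite(n_traits, r):
    """True when every eigenvalue of the equicorrelation matrix is positive."""
    return min(equicorrelation_eigenvalues(n_traits, r)) > 0.0


# Lower limit for four equally correlated traits.
lower_bound = -1.0 / (4 - 1)
```

With four traits, `is_positive_definite(4, -0.3)` holds while `is_positive_definite(4, -0.34)` does not.
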

4.3.2 Statistical Analysis

Each of the parameter combinations was replicated 50 times.
Single and multiple trait EM-REML methods using canonical and
Householder transformations as described by Jensen and Mao (1989) were
applied in order to estimate genetic parameters. Iteration of the
EM-REML procedure was stopped when the absolute relative difference
between the Euclidean norm of the estimated sire and residual
(co)variance matrices in consecutive rounds was less than $10^{-5}$ or
when 5000 rounds of iteration had been reached. Those estimates that did
not converge were not used in summaries. The EM-REML estimators
correspond to those given by Patterson and Thompson (1971) and Harville
(1977) but extended to multiple traits as described by Jensen and
Mao (1988).

Table 1. Underlying parameter structures investigated

                                        No. analyses run/    Total no. of
                h2 of trait             no. traits in set     estimates
Situation    1    2    3    4    rg     4    3    2    1       h2    rg
    1       .1   .3   .6   .8    .2     1    4    6    4       32    24
    2       .1   .3   .6   .8    .8     1    4    6    4       32    24
    3       .1   .3   .6   .8   -.2     1    4    6    4       32    24
    4       .1   .3   .6   .8   -.3     1    4    6    4       32    24
    5       .2   .2   .2   .2    .2     1    4    6    4       32    24
    6       .2   .2   .2   .2    .8     1    4    6    4       32    24
    7       .2   .2   .2   .2   -.2     1    4    6    4       32    24
    8       .2   .2   .2   .2   -.3     1    4    6    4       32    24
    9       .8   .8   .8   .8    .2     1    4    6    4       32    24
   10       .8   .8   .8   .8    .8     1    4    6    4       32    24
   11       .8   .8   .8   .8   -.2     1    4    6    4       32    24
   12       .8   .8   .8   .8   -.3     1    4    6    4       32    24
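
The counts of analyses and estimates for four traits follow from simple combinatorics: each subset of traits defines one analysis, each analysis yields one heritability estimate per trait and one genetic correlation estimate per pair of traits. A short sketch, with illustrative function and variable names:

```python
from math import comb

N_TRAITS = 4


def analyses_and_estimates(n_traits=N_TRAITS):
    """Rows of (traits in analysis, no. analyses, h2 estimates, rg estimates)."""
    rows = []
    for size in range(n_traits, 0, -1):
        n_analyses = comb(n_traits, size)           # subsets of this size
        h2_estimates = n_analyses * size            # one h2 per trait
        rg_estimates = n_analyses * comb(size, 2)   # one rg per pair of traits
        rows.append((size, n_analyses, h2_estimates, rg_estimates))
    return rows


rows = analyses_and_estimates()
total_h2 = sum(row[2] for row in rows)
total_rg = sum(row[3] for row in rows)
```

For four traits this gives 1, 4, 6 and 4 analyses with four, three, two and one trait, and totals of 32 heritability and 24 genetic correlation estimates per replicate.
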

Sample parameters of sire and residual (co)variances and
heritabilities were computed for each data set based on true
transmitting abilities of sires and residuals generated. Converged
estimates from each of the 50 replicates were compared to the averages
of the sample parameters and not to the true parameters that were
simulated, because in some cases simulated parameters deviated from
the underlying parameters.

The estimates of genetic parameters were tested for normality. A
majority of the parameters estimated showed non-normality and,
therefore, the non-parametric Sign test (Daniel, 1978) was used to test
for biases. The exact significance level was determined by the number
of estimates that converged for each parameter.
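
A minimal sketch of an exact two-sided sign test is given below, assuming that each converged estimate is compared with a single reference value and that ties are dropped. These details and the function name are assumptions made for illustration and are not stated in the text.

```python
from math import comb


def sign_test_p_value(estimates, reference):
    """Two-sided exact sign test of H0: median(estimates) == reference."""
    n_above = sum(1 for x in estimates if x > reference)
    n_below = sum(1 for x in estimates if x < reference)
    n = n_above + n_below            # ties are dropped (assumption)
    if n == 0:
        return 1.0
    k = min(n_above, n_below)
    # Binomial(n, 1/2) probability of a count as extreme as k in one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Because the binomial distribution depends on n, the attainable significance levels change with the number of converged estimates.
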

4.4 RESULTS

4.4.1 Biases

Table 2 lists the percentage of biased estimates over the 50 replicates
for heritabilities and genetic correlations in each of the 12
combinations. In only four combinations (1, 4, 7, 8) was the number of
biased estimates greater than expected by chance when using a
significance level of .05. In combination 1, 14.3% of the estimates
were biased; combination 4 had 23.2%, combination 7 had 12.5%, and
combination 8 had 10.7% of the estimates biased. For all combinations,
heritability estimates differed from the sample parameter by no more
than .03, indicating that no important bias in heritability
estimation occurred.

Biases found in genetic correlation estimates occurred mainly when the
underlying correlations were negative. Biases occurring in the four
trait analyses tended to be farther away from the edge of the parameter
space (weaker negative), whereas biases occurring in the two and three
trait analyses tended to be closer to the edge of the parameter space
(stronger negative).

4.4.2 Mean Square Errors

Table 3 shows average root mean square errors (RMSE) of
heritability estimates for four levels of heritabilities according to
the four levels of genetic correlations and the number of traits in the
analysis. Values ranged from .06 for a low heritability to .20 for
large heritabilities. For each level of heritability, RMSE's did not
change as the number of traits in the analysis changed. This indicates
that STA was as precise as MTA in the estimation of heritability. For
any specific MTA, RMSE's for a particular heritability value remained
relatively constant over the four levels of genetic correlations. This
result suggests that precision of heritability estimation is not
dependent on the level of association between traits in a MTA.
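
For reference, a root mean square error of a set of replicate estimates about a reference value can be computed as in the following sketch. The function name and the use of a single reference value per parameter are assumptions made for illustration.

```python
import math


def rmse(estimates, sample_parameter):
    """Root mean square error of estimates about a reference value."""
    squared = [(est - sample_parameter) ** 2 for est in estimates]
    return math.sqrt(sum(squared) / len(squared))
```

For example, two estimates of .2 and .4 about a reference of .3 give an RMSE of .1.
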

Table 4 shows average RMSE's for the four levels of genetic
correlations according to the number of traits in the model and the
heritability values of the traits involved. RMSE's for correlation
estimates showed much more variation than those for heritability
estimates, with values ranging from .08 to .48. The magnitude of these
values seemed to depend on the underlying genetic correlation, the
underlying heritability values of the two traits involved, and to a
minor extent, when heritabilities were low, the number of traits in the
analysis.

Table 2. Percent biased heritability and genetic correlation estimates

   True parameters              % Biased estimates
h2                  rg        Total      h2       rg
.1, .3, .6, .8      .2        14.3     14.3      0.0
.1, .3, .6, .8      .8         0.0      0.0      0.0
.1, .3, .6, .8     -.2         5.3      5.3      0.0
.1, .3, .6, .8     -.3        23.2     10.7     12.5
.2, .2, .2, .2      .2         1.8      1.8      0.0
.2, .2, .2, .2      .8         5.4      0.0      5.4
.2, .2, .2, .2     -.2        12.5      7.1      5.4
.2, .2, .2, .2     -.3        10.7      0.0     10.7
.8, .8, .8, .8      .2         5.4      5.4      0.0
.8, .8, .8, .8      .8         0.0      0.0      0.0
.8, .8, .8, .8     -.2         0.0      0.0      0.0

Table 3. Average root mean square errors for heritability estimates.
[Table values are illegible in the source.]

Table 4. Average root mean square errors of genetic correlation estimates.
[Table values are illegible in the source.]

Trends found in RMSE's for genetic correlation estimates can be
seen more easily when values of similar magnitude are grouped together,
as in Table 5, where the values have been separated into three
ranges: (.08-.10), (.17-.20), and (.26-.48). Only those values in the
two extreme ranges (.08-.10) and (.26-.48) are shown. RMSE's were
smallest when traits were highly correlated (.8) and moderately to highly
heritable (.3, .6, .8). Within this group, RMSE's tended to decrease as
the heritability value increased, indicating that precision of covariance
estimation between strongly associated traits increases as the
heritability value of the two traits increases. Also, RMSE's stayed
almost constant as the number of traits in the analysis changed,
indicating that precision of covariance estimation between strongly
correlated, highly heritable traits does not depend on the amount of
extra information obtained when more traits are included in the
analysis. RMSE's for genetic correlations were largest (.26-.48) in
those combinations where either one or both traits had a low
heritability value and traits were weakly correlated. RMSE's under
these conditions tended to decrease as the heritability value of one of
the traits increased, again indicating that precision of covariance
estimation, in general, increases as heritability increases. Also,
RMSE's tended to decrease as the number of traits in the analysis
increased, opposite to what was observed for strongly correlated
traits. The same trends observed in the RMSE's for the two extreme
ranges of genetic correlations can also be seen for those RMSE's
falling in the medium range.

Table 5. Average root mean square errors of genetic correlation
estimates in the ranges (.08-.10) and (.26-.48).
[Table values are illegible in the source.]

4.4.3 Correlations

Table 6 shows correlations between estimates of heritability for
two extreme values (.1,.8) for all possible pairs of multiple and
single trait analyses. Correlations were high for all levels of
heritabilities indicating high repeatability in the estimation of
heritability from different analyses. Slight trends were found in the
correlations according to the level of heritability, the underlying
genetic correlation, and the number of traits in the analyses. As
expected, correlations were highest for analyses differing by only one
trait. For heritability estimates, correlations were highest between a
two trait analysis and a single trait analysis and lowest between a
four trait analysis and a single trait analysis. Correlations tended
to be slightly higher for large heritabilities. Under a positive
genetic correlation, correlations between heritability estimates also
tended to be higher.
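
The correlations discussed here are computed across replicates between the estimates of the same parameter from two different analyses. A minimal sketch, using hypothetical estimates and an illustrative function name:

```python
import math


def pearson_correlation(x, y):
    """Product-moment correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heritability estimates of one trait from five replicates.
h2_single_trait = [0.28, 0.35, 0.31, 0.22, 0.40]
h2_two_trait = [0.27, 0.36, 0.30, 0.24, 0.41]
r = pearson_correlation(h2_single_trait, h2_two_trait)
```

A value of r close to one means that the two analyses rank and scale the replicate estimates almost identically.
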

Table 7 shows correlations between estimates of genetic
correlations for all pairs of multiple trait analyses. Similar trends
were found as those mentioned above for heritabilities. For all levels
of genetic correlations, correlations tended to be highest between four
trait and three trait analyses and lowest between four trait and two
trait analyses. Correlations increased as the heritability of the
traits increased, and they tended to be higher under positive
genetic correlations than under negative genetic correlations.

4.4.4 Grouping of traits

Trends found in correlations between the estimates from different
analyses and trends in RMSE's for both heritabilities and genetic
correlations can be used to evaluate the types of traits that, when
grouped together, would result in higher estimation accuracy of the
genetic parameters. However, results can only apply to those
situations where traits are equally correlated. For highly correlated
traits, inclusion of additional traits in a multiple trait analysis
does not seem to increase accuracy of genetic parameter estimates. For
weakly correlated or negatively correlated traits, inclusion of
additional traits, especially with larger heritabilities, may increase
accuracy and repeatability of the estimates of covariance. The use of
a multiple trait model over a single trait model for the estimation of
heritability does not seem to provide any increase in accuracy or
precision of the estimates.

Table 6. Correlations between estimates of heritability for two
extreme values (.1, .8) for multiple trait and single trait analyses.
[Table values are illegible in the source.]

Table 7. Correlations between estimates of genetic correlations for
all possible pairs of multiple trait analyses.
[Table values are illegible in the source.]

4.5 CONCLUSIONS

Results indicate that heritability estimates of a given trait do
not vary from single trait to multitrait analysis and do not vary
across different levels of positive and negative genetic correlations
when all traits are equally correlated. Therefore, STA was as accurate
as MTA for estimating heritabilities. For genetic correlation
estimates, biases were most often found for negative correlations.
Under these conditions, results indicate that when all traits are
equally correlated, estimates of genetic correlations could vary
depending on the number of traits in the analysis and the magnitude of
the underlying genetic correlation. In terms of absolute values,
genetic correlation estimates were biased downwards for stronger
negative correlations and biased upwards when traits were weakly
correlated (negatively).

Root mean square errors of heritability estimates increased as
the underlying heritability increased but did not vary from single
trait to multiple trait analysis. Therefore, precision in heritability
estimation was not dependent on the type of analysis, STA or MTA.
RMSE's for genetic correlation estimates had a much wider range than
RMSE's for heritabilities; their magnitude depended
on the underlying genetic correlation, heritability of the two traits
involved, and the number of traits in the analysis. For all levels of
genetic correlations (positive and negative), precision in covariance
estimation tended to increase as the heritability values of the traits
increased. Under weak genetic correlations, precision of estimation of
covariance tended to increase as the number of traits in the analysis
increased.

Correlations between the estimates of heritability and genetic
correlations from different analyses were all high. Correlations for
heritability estimates tended to be higher than those for genetic
correlation estimates. Correlations between estimates of heritability
were highest between two trait and single trait analyses and increased
as the heritability increased. Correlations between estimates of
genetic correlations were highest between four trait and three trait
analyses and also tended to increase as the heritability values of the
traits involved increased. Correlations of the estimates for both
heritability and genetic correlations were lowest under a negative
genetic correlation.

For weakly correlated or negatively correlated traits, inclusion
of additional traits in a multitrait model seems to increase estimation
accuracy of the covariances, especially if heritability values of the
traits are large. For highly correlated traits, no gain in accuracy of
genetic parameter estimates occurred when additional traits were added
to the model. In order to be able to choose the best grouping or
sampling strategy of traits, additional work needs to be done in cases
where traits are not equally correlated.

5. Comparison of Genetic Parameter Estimates From Single and
Multiple Trait Analyses When Underlying Distribution is Skewed

5.1 ABSTRACT

Discrepancies between single trait and multiple trait estimates of
genetic parameters when the underlying distribution is skewed were
examined by simulation. A model with fixed management group effects
and random sire effects was used to simulate records for four traits.
Residual effects for one of the four traits were skewed at
five different levels of heritability. A total of 24 situations with
varied levels of genetic correlations were examined. Genetic
parameters were estimated using an EM-REML algorithm with canonical and
Householder transformations. The degree of skewness used had no effect
on heritability estimates for either single trait or multiple trait
analyses, but did affect the number of biased estimates of genetic
correlations when the underlying heritability value of the skewed trait
was small. Degree of skewness had no effect on the magnitudes of mean
square errors for either heritability or genetic correlation estimates.
Correlations of the estimates for heritability and genetic correlations
between single trait and different multiple trait analyses were
.78 or higher for both skewed and non-skewed traits. For all
heritability levels, correlations were the highest between two trait
and single trait analyses and lowest between four trait and single
trait analyses for both skewed and non-skewed traits. Correlations for
the estimates of genetic correlations were highest between four and
three trait analyses and lowest between four trait and two trait
analyses, for both skewed and non-skewed traits.

5.2 INTRODUCTION

Estimates of (co)variance components are required for estimation
of genetic parameters such as heritabilities and genetic correlations.
Numerous methods for variance component estimation exist and there is
no method that is considered universally best. The distinction among
the methods depends on the properties of the estimators that a
particular method provides. The Maximum Likelihood (ML) and Restricted
Maximum Likelihood (REML) estimators have become popular, in part,
because they can be derived readily from Henderson's mixed model
equations (MME) and also due to their desirable statistical properties,
such as non-negativity, consistency, and asymptotic normality. REML
estimates also have the advantage, for balanced data, of reducing to
the standard Analysis of Variance (ANOVA) estimates, which are known to
have minimum variance properties.

The REML procedure, however, requires random effects contributing
to the observation vector to be random samples from underlying normal
populations. This may not be the case for certain traits. Some
traits, such as calving difficulty or litter size, which are
subjectively scored or categorized, have observations that follow a
skewed distribution. The asymmetry of
the distributions will depend upon the frequencies within each class.
The effects of selection may also cause the distribution of a random
factor to be skewed.

The objective of this study was to compare results of (co)variance
estimation for single and multiple trait REML estimators in simulated
populations under a false assumption of normality of the residual
effects under varying conditions of heritabilities and genetic
correlation values.

5.3 MATERIALS AND METHODS

5.3.1 Simulation Procedure

Records on three traits were generated according to the model:

    $y_{ijk} = m_i + s_j + e_{ijk}$

where $y_{ijk}$ was a record on the kth progeny of sire j in the ith
management group; management group effect (m) was fixed whereas sire
effect (s) was random, and $e_{ijk}$ was the random residual. Sire and
residual components were generated from a multivariate normal
distribution with expected values $E(s_j) = 0$ and $E(e_{ijk}) = 0$ and specified
sire and residual covariances.

For a fourth trait, records were generated according to the same
model, except for $e_{ijk}$, such that the observation vector, y, followed a
log-normal distribution. Random $s_j$ and $e_{ijk}$ have the same expectations
and (co)variances as listed above. All animals had records for all
four traits.

For each set of data generated, the number of progeny per sire,
number of management groups, and number of sires per management group
were identical to those described in section 4. A total of 24
underlying parameter structures were designed by varying levels of
heritability, genetic correlations, and the underlying distribution of
one trait. The situations investigated are summarized in Table 8.

Residual correlations were kept constant at .5 across all 24 situations
and within each situation, genetic correlations were the same for all
pairs of traits.

The degree of skewness for the log-normal distributed traits was
kept constant at 1.0 across all situations. The coefficient of
skewness for the log-normal distribution is
$(e^{\sigma^2_e} + 2)(e^{\sigma^2_e} - 1)^{.5}$ (Hastings and Peacock, 1975).
Setting this equal to one gives a residual
variance on the normal scale ($\sigma^2_{en}$) of approximately .09876 for all
combinations. With this value and using a median value of 1.0, the
residual variance on the log scale ($\sigma^2_{eL}$) can be determined to be
approximately .11457 (Hastings and Peacock, 1975). Therefore, to
simulate a specific heritability value for a trait on the log scale
($h^2_L$), the following formulae were used to determine the heritability
value for the trait on the normal scale ($h^2_n$):
$\sigma^2_{sn} = h^2_L \sigma^2_{eL} / (4 - h^2_L)$, where $\sigma^2_{sn}$
is the sire variance on the normal scale. Once $\sigma^2_{sn}$ is
determined for a specific heritability value on the log scale, the
heritability value on the normal scale can be determined by
$h^2_n = 4\sigma^2_{sn} / (\sigma^2_{sn} + \sigma^2_{en})$.
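
The two variances quoted above can be reproduced numerically. The sketch below solves the skewness equation by bisection and then applies the standard log-normal variance formula with a median of 1.0. The function names, the bisection bounds, and the example heritability of .3 are assumptions made for illustration.

```python
import math


def lognormal_skewness(sigma2):
    """Coefficient of skewness of a log-normal variable with log-variance sigma2."""
    w = math.exp(sigma2)
    return (w + 2.0) * math.sqrt(w - 1.0)


def solve_sigma2_for_skewness(target, lo=1e-9, hi=5.0, tol=1e-12):
    """Bisection search; skewness increases monotonically with sigma2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lognormal_skewness(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lognormal_variance(sigma2, median=1.0):
    """Variance of a log-normal variable with the given median."""
    w = math.exp(sigma2)
    return median ** 2 * w * (w - 1.0)


def sire_variance(h2, sigma2_e):
    """Sire variance implied by h2 = 4 s / (s + e) for residual variance e."""
    return h2 * sigma2_e / (4.0 - h2)


def heritability(sigma2_s, sigma2_e):
    """Heritability from sire and residual variances in a sire model."""
    return 4.0 * sigma2_s / (sigma2_s + sigma2_e)


sigma2_en = solve_sigma2_for_skewness(1.0)    # approximately .09876
sigma2_eL = lognormal_variance(sigma2_en)     # approximately .11457
sigma2_sn = sire_variance(0.3, sigma2_eL)     # .3 is an example value only
h2_n = heritability(sigma2_sn, sigma2_en)
```

The last two lines follow the two formulae in the text: the sire variance is obtained from the target heritability and $\sigma^2_{eL}$, and the heritability on the normal scale is then computed with $\sigma^2_{en}$.
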

5.3.2 Statistical Analysis

Each parameter combination was replicated 50 times. Single and
multiple-trait EM-REML methods using canonical and Householder
transformations as described by Jensen and Mao (1988) were applied in
order to estimate genetic parameters. Table 9 shows the total number
of different multiple and single trait analyses that were run and the
number of estimates that were obtained from each replicate. Iteration
of the EM-REML procedure was stopped when the absolute relative
difference between the Euclidean norm of the estimated sire and
residual (co)variance matrices in consecutive rounds was less than
10^-5 or until 5000 rounds of iteration had been reached. Those
estimates that did not converge were not used in summaries. The EM-REML
estimators correspond to those given by Patterson and Thompson (1971)
and Harville (1977) but extended to multiple traits as described by
Jensen and Mao (1988).

Table 8. Underlying parameter structures investigated for
skewed traits

[Table body not recoverable from the scanned source. Column headings:
Situation; h2 of trait. Footnote: underline indicates skewed trait.]
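
The stopping rule for the EM-REML iterations in this section can be
sketched as follows. This is an illustration only, written under two
assumptions that the text does not settle: the criterion is applied to
the sire and residual matrices separately, and the relative difference
is taken with respect to the previous round. The names `has_converged`,
`iterate`, and the `update` callback (standing in for one EM-REML round,
which is not implemented here) are hypothetical.

```python
import math


def euclidean_norm(matrix):
    """Euclidean (Frobenius) norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def has_converged(previous, current, tol=1e-5):
    """Compare the norms of each (co)variance matrix in consecutive rounds.

    `previous` and `current` are lists such as [sire_matrix, residual_matrix].
    Assumes every matrix in `previous` has a non-zero norm.
    """
    for prev, curr in zip(previous, current):
        norm_prev = euclidean_norm(prev)
        if abs(euclidean_norm(curr) - norm_prev) / norm_prev >= tol:
            return False
    return True


def iterate(update, start, tol=1e-5, max_rounds=5000):
    """Run `update` until convergence or until max_rounds is reached.

    Returns (estimates, rounds_used, converged).
    """
    current = start
    for round_no in range(1, max_rounds + 1):
        new = update(current)
        if has_converged(current, new, tol):
            return new, round_no, True
        current = new
    return current, max_rounds, False
```

Replicates for which the third returned value is false would be the ones
excluded from the summaries.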

Table 9. Number of analyses run and number of estimates obtained from
each replicate of each parameter combination

                        Number of traits in analysis
                        4       3       2       1
no. of analyses:        1       4       6       4

                        Type of estimate
                        Skewed          Normal
                        h2      rg      h2      rg
no. of estimates:       8       12      24      12

Sample parameters of sire and residual (co)variances were computed
for each data set based on true transmitting abilities of sires and
residuals generated. Converged estimates from each of the 50
replicates were compared to the average of the sample parameters. The
non-parametric Sign Test (Daniel, 1978) was used to test for biases,
where the exact probability level was determined by the number of
replicates that converged for each parameter.
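
For illustration, an exact two-sided Sign Test of this kind can be
sketched in a few lines of Python. The function name and the handling
of ties (dropped) are assumptions made for this sketch, not details
taken from the analysis itself.

```python
from math import comb


def sign_test_p_value(estimates, reference):
    """Exact two-sided sign test of H0: estimates are centred on `reference`.

    `estimates` holds the converged estimates of one parameter;
    `reference` is the average sample parameter they are compared to.
    """
    diffs = [e - reference for e in estimates if e != reference]  # drop ties
    n = len(diffs)
    if n == 0:
        return 1.0
    n_positive = sum(1 for d in diffs if d > 0)
    k = min(n_positive, n - n_positive)
    # Binomial(n, 1/2) probability of a split at least as uneven as k.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Because `n` is the number of converged replicates, the attainable
probability levels change with the convergence rate, which is why the
exact level has to be determined separately for each parameter.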

5.4 RESULTS

5.4.1 Biases

Table 10 lists the percent of biased estimates found for only
those situations which had a significant number. In only 8 of the 24
combinations was the number of biased estimates greater than expected
by chance when using a significance level of .05; however, only 59% of
all the biased estimates, for both heritabilities and genetic
correlations, were for a skewed trait.

Thirty-six percent of the biased estimates were heritability
estimates, and only one third of these pertained to skewed traits, all
with low heritabilities. Overall, low heritability levels tended to
produce more biases, whether the distribution was skewed or not, than
higher heritability levels. Ninety-eight percent of the biased
heritability estimates occurred from populations with low underlying
heritability values (.1,.2,.3).

Sixty-four percent of the biased estimates were genetic
correlations, and 72% of the biased correlation estimates involved a
skewed trait. Of this 72%, more biases occurred when traits with low
heritability values (.1,.2,.3) were skewed.

Table 10. Percent of biased estimates for those situations having a
greater than expected number (P<.05)

                                  % Biased Estimates
True Parameters                   Skewed          Normal
h2               rg     Total     h2      rg      h2      rg
.1,.3,.6*,.8     .2     12.5      0.0     0.0     7.1     5.4
.1,.3,.6,.8*     .8     19.7      1.8     5.4     5.4     7.1
.1*,.3,.6,.8    -.2     23.3      8.9     5.4     5.4     3.6
.1*,.3,.6,.8    -.3     12.5      1.8     7.1     0.0     3.6
.1,.3*,.6,.8    -.3     17.8      1.8     8.9     0.0     7.1
.2*,.2,.2,.2     .2     30.4     12.5     0.0    16.1     1.8
.2*,.2,.2,.2     .8     14.3      0.0     5.4     0.0     8.9
.2*,.2,.2,.2    -.3     17.9      7.1     5.4     0.0     5.4

* indicates skewed trait.

5.4.2 Mean Square Errors

Table 11 shows average root mean square errors (RMSE) of
heritability estimates from skewed distributions for four levels of
heritability (.1,.3,.6,.8) according to levels of genetic correlations
and number of traits in the analysis. Values ranged from .06 for
small heritabilities to .18 for large heritabilities. Skewness did not
seem to affect the magnitudes of the RMSE's when compared to those for
non-skewed traits. Mean square errors tended to increase as underlying
heritability increased and remained constant over different genetic
correlation levels and number of traits in the analysis.
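
As a reminder of how the two summary measures relate, the sketch below
computes bias, sampling variance, and RMSE for one parameter from its
converged replicate estimates. It is a minimal illustration with a
hypothetical function name; it uses the identity MSE = bias^2 +
variance, so an estimator can be unbiased and still have a large RMSE.

```python
import math


def bias_and_rmse(estimates, reference):
    """Return (bias, variance, rmse) of estimates around a reference value.

    `reference` plays the role of the average sample parameter.
    Variance uses divisor n so that mse == bias**2 + variance holds exactly.
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    bias = mean_estimate - reference
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / n
    mse = sum((e - reference) ** 2 for e in estimates) / n
    return bias, variance, math.sqrt(mse)
```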

Table 12 shows the R2 values and the standardized partial
regression coefficients from three different multiple regression models
that were run where the standard error of the heritability estimate was
the dependent variable in each model. The first model run contained
independent variables of number of traits, the underlying heritability
value, the underlying genetic correlation value, and whether the trait
was skewed or not. As expected, the underlying heritability was highly
significant and explained the majority of the variation in the standard
errors. The underlying genetic correlation was also significant,
whereas the number of traits and skewness were not.

The second multiple regression was run within level of genetic
correlation, with the number of traits, underlying heritability value,
and skewness of the trait as the independent variables. The underlying
heritability value was significant for all levels of genetic
correlation. Skewness of the trait was significant under two levels of
genetic correlation; however, both coefficients were essentially zero.

The third multiple regression was run within level of
heritability. The number of traits was significant only under the
smallest heritability value of .1. The underlying genetic correlation
was significant in three out of the four heritability levels. Skewness
was significant in two out of the four heritability levels.
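
Standardized partial regression coefficients of the kind reported for
these models can be obtained by regressing z-scores on z-scores. The
sketch below is a generic, dependency-free illustration and not the
software used for the analyses; all function names are hypothetical,
and significance testing is omitted.

```python
import math


def zscores(values):
    """Centre and scale a column to mean 0 and standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_betas(predictors, response):
    """Return (betas, r_squared); `predictors` is a list of columns."""
    zx = [zscores(col) for col in predictors]
    zy = zscores(response)
    n = len(response)

    def corr(u, v):
        return sum(p * q for p, q in zip(u, v)) / (n - 1)

    rxx = [[corr(u, v) for v in zx] for u in zx]   # predictor correlations
    rxy = [corr(u, zy) for u in zx]                # predictor-response correlations
    betas = solve(rxx, rxy)
    r_squared = sum(b * r for b, r in zip(betas, rxy))
    return betas, r_squared
```

Because every variable is on the same scale, the coefficients can be
compared directly across predictors such as number of traits,
heritability, genetic correlation, and skewness.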

Table 13 shows average RMSE's for genetic correlation estimates
when one trait was skewed according to number of traits in the model
and the heritability values of the traits involved. Magnitudes of the
values did not seem to depend on the level of heritability of the
skewed trait. Similar trends were found in the values as those found
in Section 4 where magnitudes depended on the number of traits in the
analysis, the heritability value of the traits involved, and the
underlying genetic correlation.

Table 11. Average root mean square errors for heritability estimates
from skewed underlying distributions

[Table body not recoverable from the scanned source.]

Table 14 shows the R2 values and standardized partial regression
coefficients from three different multiple regression models that were
run where the standard error of the genetic correlation estimate was
the dependent variable in each model. The first model contained the
five independent variables of number of traits, the smaller
heritability value of the two traits, the larger heritability value of
the two traits, the underlying genetic correlation, and whether the
trait was skewed or not. All factors were highly significant except
for skewness of the trait.

The second model was run within level of genetic correlation, with
the number of traits, the smaller and larger heritability values, and
the skewness of the trait as the independent variables. The number of
traits was significant when the underlying genetic correlation was
negative. The smaller heritability value was significant for all
levels of genetic correlation, as was the larger heritability value,
except under the correlation of .8. Skewness was significant in only
one of the four levels of genetic correlation.

The third model was run within levels of heritability of the two
traits involved, with the number of traits, the underlying genetic
correlation, and the skewness of the trait as the independent
variables. The number of traits was significant when either one or
both traits had a low heritability value of .1. The underlying genetic
correlation was significant for all pairs of heritability values
whereas skewness was not.

Table 12. R2 values and standardized partial regression
coefficients for three multiple regression models
with standard error of heritability estimate as the
dependent variable.

                        Beta Coefficients
             R2      NT         h2        rg        skew
Overall     .58     -.042      .749**    .117**    -.017

Within rg
  .2        .93     -.007      .972**              -.063*
  .8        .38     -.060      .622**              -.031
 -.2        .92     -.035      .958**               .078**
 -.3        .85     -.057      .918**              -.056

Within h2
  .1        .08     -.302**              -.078      .047
  .3        .12     -.149                 .251**   -.230**
  .6        .21     -.063                 .437**    .190*
  .8        .05     -.063                 .255**   -.054

Table 13. Average root mean square errors of genetic correlation
estimates from skewed underlying distributions

[Table body not recoverable from the scanned source.]

5.4.3 Correlations

Table 15 shows the correlations of the heritability estimates for
the two extreme values (.1,.8) when a trait was skewed for all possible
pairs of multiple and single trait analyses. Correlations were high for
all levels of heritabilities indicating that heritability estimates
from different analyses were highly repeatable. Similar trends were
found in the correlations as were found in Section 4 according to the
level of heritability, the underlying genetic correlation, and the
number of traits involved. As expected, correlations were highest for
analyses differing by only one trait, for example, estimates from a
four trait and a three trait analysis were correlated higher than those
from a four trait versus a single trait analysis. Correlations for
heritability estimates were highest between a two trait analysis and a
single trait analysis and tended to be lowest between a four trait
analysis and a single trait analysis. Correlations tended to be
slightly higher for large heritabilities. According to different
levels of genetic correlations, correlations for the heritability
estimates were highest under strong positive genetic correlations and
lowest under strong negative correlations. However, with a large
heritability value, this trend was not as noticeable.
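
The correlations discussed here pair the estimates of one parameter
from two analyses, replicate by replicate. A minimal sketch follows; it
assumes, as a choice made for this illustration, that only replicates
that converged in both analyses are used, and the function names and
dictionary layout (replicate number mapped to estimate) are
hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_between_analyses(estimates_a, estimates_b):
    """Correlate estimates from two analyses over shared converged replicates.

    Each argument maps replicate number -> estimate; replicates that did
    not converge are simply absent from the mapping.
    """
    shared = sorted(set(estimates_a) & set(estimates_b))
    return pearson([estimates_a[r] for r in shared],
                   [estimates_b[r] for r in shared])
```

For example, `estimates_a` might hold heritability estimates from the
four trait analysis and `estimates_b` those from a single trait
analysis of the same trait.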

Table 16 shows correlations of the genetic correlation estimates
when one trait was skewed for all possible pairs of multiple trait
analyses according to underlying genetic correlations and heritability
values of the traits involved. All correlations were high but slightly
less than those for heritability, as expected. Magnitudes of the
correlations did not seem to depend on the heritability value of the
skewed trait. As with heritability, trends in the values were found
according to number of traits in the analysis, the heritability values
of the traits involved, and the underlying genetic correlation. Across
all levels of genetic correlations and heritabilities, correlation
values were highest between four trait and three trait analyses and
lowest between four trait and two trait analyses. This trend differed
from that found in heritabilities in that correlations for heritability
estimates were highest between a two trait and a single trait analysis,
whereas for genetic correlations, correlations were consistently
highest between a four trait and a three trait analysis. Correlations
were highest when both traits were highly heritable, with little
variation across different levels of genetic correlations. When
heritability values were low, correlations tended to be lower and more
variable across different levels of genetic correlations; they were
highest under a genetic correlation of .8 and lowest under a genetic
correlation of -.3.

Table 14. R2 values and standardized partial regression
coefficients for three multiple regression models with
standard error of genetic correlation estimate as the
dependent variable.

                        Beta Coefficients
             R2      NT         h2L        h2H        rg         skew
Overall     .73     -.112**    -.468**    -.233**    -.568**     .027

Within rg
  .2        .81      .080       .560**     .503**                .129
  .8        .62      .081       .788**     .005                  .035
 -.2        .71      .130**     .619**     .333**               -.004
 -.3        .81      .289**     .619**     .357**                .022

Within h2
 (.1,.3)    .54      .158**                           .731**     .052
 (.1,.6)    .70      .242**                           .234**    -.002
 (.1,.8)    .54      .226**                           .713**     .092
 (.3,.6)    .73      .088                             .854**    -.110
 (.3,.8)    .78      .110                             .878**     .074
 (.6,.8)    .74      .030                             .865**    -.007

Table 15. Correlations of heritability estimates for the extreme
values (.1,.8) between multiple and single trait analyses.

[Table body not recoverable from the scanned source. Column heading:
No. of traits in analysis.]

Table 16. Correlations of genetic correlation estimates when one trait
is skewed between different multiple trait analyses.

[Table body not recoverable from the scanned source. Column heading:
No. of traits.]

5.5 CONCLUSIONS

Results indicate that in terms of biases, the degree of skewness
used had no effect on heritability estimates from either single or
multiple trait analyses. However, more biases were found when
heritability values were small, whether skewed or not, than for those
with large heritability values. The degree of skewness seemed to have
a small effect on the estimates of genetic correlations, especially
when the heritability value of the skewed trait was small.

In terms of MSE's, the degree of skewness had no effect on the
magnitude of values for either heritability or genetic correlation
estimates. Similar trends in MSE’s for skewed traits were found as
those of non-skewed traits for both heritabilities and genetic
correlations.

Only one degree of skewness was examined in this study. The
direction of the skewness was also held constant in the situations
examined. The form of skewness examined here seemed to have no effect
on the accuracy and precision of genetic parameter estimates. Thus,
REML appears rather robust in terms of expectation and sampling
variances of the estimates for this type of skewness.

Asymmetric sire effects, as opposed to the skewed residual
effects that were examined in this study, may prove to have more of an
effect on the accuracy and precision of heritability and genetic
correlation estimates. Also, different levels of residual correlation
need to be examined, as well as skewing more than one trait.

6. SUMMARY

Variance component estimation methods for unbalanced data are
plentiful and there is no universally best method. Different methods
will give different estimates from the same set of data. The current
method of choice is REML due to its desirable statistical properties.
Once the method of estimation has been determined there are a number of
factors that will affect the properties of the resulting estimates.
Three sources of potential bias in (co)variance component estimation by
EM-REML were examined by simulation: the number of traits in the
analysis; the magnitude of the underlying parameters; and violation of
normality assumptions. The understanding of these possible sources of
bias should enable one to develop strategies in selecting subsets of
traits that yield high estimation accuracy and precision while
minimizing computational requirements.

Different populations were simulated to cover a range of
heritabilities as well as genetic correlation structures. A model with
fixed management group effects and random sire and residual effects was
used to simulate records for four traits. Each population was
replicated 50 times. Single and multiple trait EM-REML methods were
applied in order to estimate genetic parameters. Converged estimates
from each of the 50 replicates were compared to the average sample
parameters. The Sign test was used to test for biases in the
estimates. In studying the effect of violation of normality
assumptions, residual effects for one of the four traits were skewed
such that the records generated followed a log-normal distribution with
a constant degree of skewness of 1.0. The above procedures were then
repeated.

Estimates for both heritabilities and genetic correlations were
examined. Results were summarized in terms of accuracy or amount of
bias in the estimates; precision or magnitude of mean square errors in
the estimates; and correlation between estimates from analyses
involving different number of traits.

6.1 Heritability

From analyses involving different number of traits, none of the
heritability estimates were significantly biased. Also, the accuracy
of heritability estimates did not appear to be dependent on the degree
of association between traits in a multiple trait setting.
Heritability estimates of weakly correlated traits were as accurate as
those of strongly correlated traits. Across all levels of underlying
genetic correlations and from analyses involving different number of
traits, more biases in estimates tended to occur when underlying
heritability values were small than when heritability values were high.

Correlations of heritability estimates between single, two, three,
and four trait analyses were high for all levels of underlying
heritability. For low levels of heritability, where the majority of the
biases in heritability estimates were found, high correlations among
the estimates indicate consistency in the direction of the bias.

For all levels of heritability, mean square errors did not change
as the number of traits in the analysis changed, or as the genetic
correlation among the traits changed. This suggests that STA was as
precise as MTA for estimating heritability.

In general, results suggested that little is gained through MTA
for estimating heritability.

The degree of skewness used had no effect on the amount of
bias occurring in heritability estimates for single or multiple
trait analyses. Correlations between estimates of heritabilities from
skewed underlying distributions were as high as correlations between
heritability estimates arising from normal distributions.

Skewness did not seem to affect the magnitudes of the MSE's of
heritability estimates when compared to those of non-skewed traits.

6.2 Genetic Correlations

Estimates of genetic correlations tended to be biased when the
underlying genetic correlations were negative. The direction of the
bias seemed to depend on the number of traits included in the analysis.
Biases occurring in a four trait analysis tended to be weaker negative
or underestimated, whereas biases occurring in the two and three trait
analyses tended to be stronger negative or overestimated. Biases also
tended to occur under strong positive correlations; however, these
biases were much smaller than those under negative correlations and
could be considered negligible.

Mean square errors for genetic correlation estimates indicate
precision of estimation depended on the underlying genetic correlation,
the underlying heritabilities of the two traits involved, and the
number of traits in the analysis. Precision in covariance estimation
was the highest when traits were highly correlated positively and
moderate to highly heritable. Within this parameter setting, mean
square errors stayed constant as more traits were included in the
analysis, indicating no gain in precision is made through MTA. Mean
square errors for correlation estimates were largest under conditions
where the traits were weakly correlated with small heritability values.
Under this parameter setting, precision in covariance estimation seemed
to depend on the heritability values of the traits and the number of
traits in the analysis. Results indicated that precision increased as
the heritability values increased and as the number of traits in the
analysis increased. This suggests that a gain in precision can be
obtained through MTA.

The degree of skewness used had a small effect on estimates of
genetic correlations. Correlations between estimates of genetic
correlations when a trait was skewed were as high as those of non-
skewed traits.

Mean square errors for genetic correlation estimates did not seem
to be affected by the amount of skewness. Similar trends were found as
those for non-skewed traits in terms of underlying heritabilities,
underlying genetic correlations, and the number of traits in the
analysis.

6.3 Sampling Subsets of Traits

Trends found in the accuracy and precision of the estimates can be
used as guidelines for selecting subsets of traits when computer
resources are limited. Results found should only apply to those
situations examined here where traits are equally correlated. For
heritability estimation, results indicated that estimates are
consistent across different levels of parameter values and number of
traits in an analysis. There was no gain in accuracy or precision of
genetic correlation estimates by adding or deleting traits from an MTA
when traits were highly correlated. The greatest increase in accuracy
and precision occurred when traits were negatively correlated and with
small heritability values. Under these conditions, adding more traits
to the analysis continued to improve the properties of the estimates.
In general, when traits were negatively correlated, for all levels of
heritability, adding more traits to the analysis continued to increase
accuracy and precision of genetic correlation estimates.

7. BIBLIOGRAPHY

Banks, B.D., I.L. Mao, and J.P. Walter. 1985. Robustness of the
restricted maximum likelihood estimator derived under normality
as applied to data with skewed distributions. J. Dairy Sci. 68:
1785-1792.

Buttazzoni, L. and I.L. Mao. 1989. Genetic parameters of estimated net
energy efficiencies for milk production, maintenance, and body
weight change in dairy cows. J. Dairy Sci. 72:671-677.

Corbeil, R.R., and S.R. Searle. 1976. Restricted maximum likelihood
(REML) estimation of variance components in the mixed model.
Technometrics 18:31-38.

Corbeil, R.R., and S.R. Searle. 1976. A comparison of variance
component estimators. Biometrics. 32:779-791.

Daniel, W.W. 1978. Applied nonparametric statistics. Houghton Mifflin
Company.

Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. Maximum likelihood
from incomplete data via the EM algorithm. J. Royal Stat. Soc.
Series B 39:1-38.

Graser, H.U., S.P. Smith, and B. Tier. 1987. A derivative-free
approach for estimating variance components in animal models by
Restricted Maximum Likelihood. J. Anim. Sci. 64:1362-1370.

Hartley, H.O. and J.N.K. Rao. 1967. Maximum likelihood estimation for
the mixed analysis of variance model. Biometrika 54:93.

Harville, D.A. 1977. Maximum likelihood approaches to variance component
estimation and to related problems. J. Amer. Stat. Assoc.
72:320-338.

Hastings, N.J., and J.B. Peacock. 1975. Statistical distributions. John
Wiley and Sons.

Henderson, C.R. 1973. Sire evaluation and genetic trends. In: Proc.
Anim. Breeding Genet. Symp. in Honor of Dr. J.L. Lush p 10-41.
Am. Soc. Anim. Sci. and Am. Dairy Assoc., Champaign, IL.

Henderson, C.R. 1976. Multiple trait sire evaluation using the
relationship matrix. J. Dairy Sci. 59:769.

Henderson, C.R. 1978. Simulation to examine distributions of estimators
of variances and ratios of variances. J. Dairy Sci. 61:267-273.

Henderson, C.R. 1984. Applications of Linear Models in Animal Breeding.
p160. University of Guelph Press, Guelph, Can.

Henderson, C.R. 1984. Estimation of variances and covariances under
multiple trait models. J. Dairy Sci. 67:1581-1589.

Henderson, C.R. 1985. MIVQUE and REML estimation of additive and
nonadditive genetic variances. J. Anim. Sci. 61:113-121.

Henderson, C.R. 1986. Recent developments in variance component
estimation. J. Anim. Sci. 63:208-216.

Henderson, C.R. ANOVA, MIVQUE, REML, and ML algorithms for estimation
of variances and covariances. Iowa State University 50th
anniversary of statistics book.

Henderson, C.R. 1987. Progress in statistical methods applied to
quantitative genetics. Proceedings of the 2nd international
conference on quantitative genetics.

Henderson, C.R., and R.L. Quaas. 1976. Multiple trait evaluation using
relatives records. J. Animal Sci. 43:1188.

Hill, W.G. and R. Thompson. 1978. Probabilities of nonpositive definite
between group and genetic covariance matrices. Biometrics.
34:429-439.

Jennrich, R.I., and P.F. Sampson. 1976. Newton-Raphson and related
algorithms for maximum likelihood variance component estimation.
Technometrics. 18:11.

Jensen, J., and I.L. Mao. 1988. Transformation algorithms in analysis
of single trait and multitrait models with equal design matrices
and one random factor per trait. J. Dairy Sci. 66:2750-2761.

Lin, C.Y., and A.J. Lee. 1986. Sequential estimation of genetic and
phenotypic parameters in multitrait mixed model analysis. J. Dairy
Sci. 69:2696-2703.

Meyer, K. 1983. Maximum likelihood procedures for estimating genetic
parameters for later lactations of dairy cattle. J. Dairy Sci.
66:1988-1997.

Meyer, K. 1985. Maximum likelihood estimation of variance components
for a multivariate mixed model with equal design matrices.
Biometrics. 41:153-165.

Meyer, K. 1987. Restricted maximum likelihood to estimate variance
components for mixed models with two random factors. Genet. Sel.
Evol. 19:49-68.

Meyer, K. 1989. Estimating variances and covariances for multivariate
animal models by REML. Genet. Sel. Evol. (submitted).

Meyer, K. and R. Thompson. 1984. Bias in variance and covariance
component estimators due to selection on a correlated trait. J.
Anim. Breeding and Genetics. 101:33-50.

Patterson, H.D., and R. Thompson. 1971. Recovery of inter-block
information when block sizes are unequal. Biometrika 58:545-554.

Pollack, E.J., and R.L. Quaas. 1981. Monte Carlo study of genetic
evaluations using sequentially selected records. J. Animal Sci.
52:257.

Pollack, E.J., J. van der Werf, and R.L. Quaas. 1984. Selection bias
and multiple trait evaluation. J. Dairy Sci. 67:1590-1595.

Rao, C.R. 1971. Minimum variance quadratic unbiased estimation of
variance components. J. Multivariate Analysis 1:445-456.

Rothschild, M.F., C.R. Henderson, and R.L. Quaas. 1979. Effects of
selection on variances and covariances. J. Dairy Sci. 62:996.

Schaeffer, L.R. 1983. Notes on linear model theory, best linear
unbiased prediction and variance component estimation. Dept. of
Anim. and Poult. Sci., Univ. of Guelph, Ontario, Can.

Schaeffer, L.R. 1984. Sire and cow evaluation under multiple trait
models. J. Dairy Sci. 67:1567-1580.

Schaeffer, L.R. 1985. Maximum likelihood method for multiple traits for
two traits, one breed, sire model. Summary.

Schaeffer, L.R. 1986. Estimation of variances and covariances within
the allowable parameter space. J. Dairy Sci. 69:187-194.

Schaeffer, L.R., and J.W. Wilton. 1981. Comparison of single and multiple
trait beef sire evaluations. Can. J. Anim. Sci. 61:565-573.

Schaeffer, L.R., J.W. Wilton, and R. Thompson. 1978. Simultaneous
estimation of variance and covariance components from multitrait
mixed model equations. Biometrics 34:199-204.

Seal, H.L. 1966. Multivariate Statistical Methods for Biologists.
London, Methuen.

Searle, S.R. 1971. Topics in variance component estimation. Biometrics.
27:1-76.

Searle, S.R. 1982. Matrix Algebra Useful for Statistics. John Wiley and
Sons, Inc. New York.

Searle, S.R. 1989. Variance components - some history and a summary
account of estimation methods. J. Anim. Breed. Genet. 106:1-29.

Smith, S.P., and H.U. Graser. 1986. Estimating variance components in a
class of mixed models by restricted maximum likelihood. J. Dairy
Sci. 69:1156-1165.

Sorensen, D.A., and B.W. Kennedy. 1984. Estimation of genetic
variances from unselected and selected populations. J. Animal Sci.
59:1213-1223.

Thompson, R. 1969. Iterative estimation of variance and covariance
components for non-orthogonal data. Biometrics. 41:153-165.

Thompson, R. 1973. The estimation of variance and covariance components
when records are subject to culling. Biometrics. 22:527-550.

Thompson, R. 1977. The estimation of heritability, with unbalanced
data. I. Observations available on parents and offspring. II. Data
available on more than two generations. Biometrics. 33:485-504.

Thompson, R. 1982. Methods of estimation of genetic parameters. In
Proceedings of the Second International Congress on Genetics
Applied to Livestock Production, Madrid. Vol 5, 95-103.

Walter, J.P., and I.L. Mao. 1985. Multiple and single trait analyses
for estimating genetic parameters in simulated populations under
selection. J. Dairy Sci. 68:91-98.

 

   
