This is to certify that the

dissertation entitled
GENETIC DIVERSITY FOR RESTRICTION FRAGMENT LENGTH
POLYMORPHISM (RFLP) MARKERS WITHIN SOYBEAN
(GLYCINE MAX L. MERR.) GERM PLASM AND ITS USE AS
A SELECTION CRITERION FOR PARENTS IN A BREEDING PROGRAM.

presented by

Theodore J. Kisha

has been accepted towards fulfillment
of the requirements for

Doctoral degree in Plant Breeding & Genetics -

Crop & Soil Sciences

Major professor

Date August 21, 1996

MSU is an Affirmative Action/Equal Opportunity Institution

 


GENETIC DIVERSITY FOR RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)
MARKERS WITHIN SOYBEAN (GLYCINE MAX L. MERR.) GERM PLASM AND ITS USE AS
A SELECTION CRITERION FOR PARENTS IN A BREEDING PROGRAM.

By

Theodore James Kisha

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Crop and Soil Sciences

1996

ABSTRACT
GENETIC DIVERSITY FOR RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)
MARKERS WITHIN SOYBEAN (GLYCINE MAX L. MERR.) GERM PLASM AND ITS USE AS
A SELECTION CRITERION FOR PARENTS IN A BREEDING PROGRAM.

By

Theodore James Kisha

Genetic diversity is limited in soybean in the US because only a
few early plant introductions formed the original breeding pool. This
study examined RFLP markers among samples of ancestral plant
introductions, more recent plant introductions, and cultivars and elite
lines from the northern US. Markers uniquely identified all lines
examined. Cluster analysis grouped ancestors according to area of
origin, while other lines formed groups in agreement with their
pedigrees. Genetic distances among lines determined with RFLP, random
amplified polymorphic DNA (RAPD), and coefficient of parentage data were
compared. Correlations between genetic distance and genetic variance of
several agronomic traits were examined in two population sets over two
years. Distance measures were generally positively correlated with
genetic variances. There was a negative correlation with yield variance
in one population set in one year. A multiple regression model using
mid-parent yield and marker genetic distance predicted the highest
yielding progeny. The relationship to mid-parent yield was always
positive, but highest yielding progeny were negatively associated with
genetic distance for one population set. The data herein suggest that
using RFLP distance estimates for parent selection can increase the
probability of producing transgressive segregates for yield.

This work is dedicated to the memory of my father

George Kisha

ACKNOWLEDGEMENTS

I would like to express my appreciation to Dr. Brian Diers, whose
guidance and friendship made my education a truly enjoyable experience.
I would also like to acknowledge the assistance of my graduate committee
members: Dr. Jim Kelly, Dr. Jim Hancock, and Dr. Mike Thomashow. Their
guidance and discussion have proven invaluable to my education. I would
also like to express my appreciation for the support given by friends
and colleagues, especially Dr. Bob Olien, during some difficult moments.
Finally, and above all, I would like to thank my wife, Linda, whose
strength and love during these trying years have been phenomenal.


TABLE OF CONTENTS

LIST OF TABLES ........................................................ vi
LIST OF FIGURES ..................................................... viii
GENERAL INTRODUCTION ................................................... 1

SECTION ONE
RESTRICTION FRAGMENT LENGTH POLYMORPHISM RELATIONSHIPS
AMONG SOYBEAN LINES IN THE NORTHERN UNITED STATES ..................... 14
Introduction ....................................................... 15
Materials and Methods .............................................. 18
Results and Discussion ............................................. 24
Conclusions ........................................................ 40

SECTION TWO
THE RELATIONSHIP BETWEEN GENETIC DISTANCE AND GENETIC VARIANCE
Introduction ....................................................... 43
Materials and Methods .............................................. 48
Results ............................................................ 54
Discussion ......................................................... 76
Conclusions ........................................................ 82
GENERAL CONCLUSIONS ................................................... 83
APPENDIX .............................................................. 86
LIST OF REFERENCES .................................................... 92

LIST OF TABLES

Table 1.1 - Soybean cultivars and lines analyzed ...................... 20

Table 1.2 - Contribution of alleles from parent cultivars to
selected progeny of crosses Williams by Essex and
Williams by Ransom ........................................ 37

Table 2.1 - Cultivars and lines used as parents ....................... 49

Table 2.2 - Primers used in RAPD analysis ............................. 49

Table 2.3 - Parents, genetic distance estimates, and genetic
variances for several agronomic traits for the
1993 single-row plots ..................................... 55

Table 2.4 - Parents, genetic distance estimates, and genetic
variances for several agronomic traits for the
1994 two-row plots ........................................ 56

Table 2.5 - Parents, genetic distance estimates, and genetic
variances for several agronomic traits for the
1994 single-row plots ..................................... 57

Table 2.6 - Parents, genetic distance estimates, and genetic
variances for several agronomic traits for the
1995 two-row plots ........................................ 58

Table 2.7 - Correlations and P-values among genetic distance
measures for the parents of population sets ............... 59

Table 2.8 - Correlation coefficients and P-values of genetic
distance estimates between parents with genetic
variances of several agronomic traits for population
set A ..................................................... 62

Table 2.9 - Correlation coefficients and P-values of genetic
distance estimates between parents with genetic
variances of several agronomic traits for population
set B ..................................................... 63

Table 2.10 - Correlation coefficients and P-values of genetic
distance estimates between parents with genetic
variances of several agronomic traits for population
set B. Population 17 omitted ............................. 64

Table A.1 - Allele frequencies and polymorphism information
content (PIC) per locus for clone/enzyme combinations
among all lines and cultivars or within groups ............ 87

LIST OF FIGURES

Figure 1.1 - Agricultural areas associated with soybean
production in China (Committee for the World Atlas
of Agriculture, 1973) ................................... 26

Figure 1.2 - Phenogram showing the relationships of 20 ancestral
plant introductions, based on RFLP analysis ............. 27

Figure 1.3 - Phenogram showing the relationships of a sample of
soybean lines, based on RFLP analysis ................... 30

Figure 1.4 - Phenogram showing the relationships of a sample of
soybean lines, based on coefficient of parentage
analysis ................................................ 33

Figure 1.5 - Scatterplot of the correlation between genetic
distances based on RFLP and coefficient of
parentage analyses ...................................... 34

Figure 1.6 - Phenogram showing the relationships of the parents
and selected progeny of the cross Williams by Essex ..... 38

Figure 1.7 - Phenogram showing the relationships of the parents
and selected progeny of the cross Williams by Ransom .... 39

Figure 2.1 - Scatterplot of yield genetic variance versus RFLP
genetic distance for population set A in the 1994
two-row plots ........................................... 65

Figure 2.2 - Scatterplot of yield genetic variance versus RAPD
genetic distance for population set A in the 1994
two-row plots ........................................... 65

Figure 2.3 - Scatterplot of yield genetic variance versus genetic
distance from the combined analysis of RFLP and RAPD
data for population set A in the 1994 two-row plots ..... 66

Figure 2.4 - Scatterplot of yield genetic variance versus RFLP
genetic distance for population set B.
a) 1994 single-row plots b) 1995 two-row plots .......... 67

Figure 2.5 - Scatterplot of yield genetic variance versus genetic
distance from the combined analysis of RFLP and RAPD
data for population set B.
a) 1994 single-row plots b) 1995 two-row plots .......... 68

Figure 2.6 - Scatterplot of maturity genetic variance versus RFLP
genetic distance for population set B in the 1994
single-row plots ........................................ 69

Figure 2.7 - Scatterplot of maturity genetic variance versus
genealogical distance for population set B in the
1994 single-row plots ................................... 69

Figure 2.8 - Scatterplot of maturity genetic variance versus RAPD
genetic distance for population set B in the 1995
two-row plots ........................................... 70

Figure 2.9 - Scatterplot of maturity genetic variance versus genetic
distance from the combined analysis of RFLP and RAPD
data for population set B in the 1995 two-row plots ..... 70

Figure 2.10 - Scatterplot of maturity genetic variance versus
genealogical distance for population set B in the
1995 two-row plots ...................................... 71

Figure 2.11 - Regression of population mean yield on mid-parent
yield for the two-row plots. a) Population set A
b) Population set B ..................................... 72

Figure 2.12 - Multiple regression model for the prediction of the
top five yielding progeny of the 1994 two-row plots
as a function of mid-parent yield and RFLP genetic
distance ................................................ 73

Figure 2.13 - Multiple regression model for the prediction of the
top five yielding progeny of the 1995 two-row plots
as a function of mid-parent yield and RFLP genetic
distance ................................................ 74

Figure 2.14 - Multiple regression model for the prediction of the
top five yielding progeny of the 1995 two-row plots
as a function of mid-parent yield and genetic distance
from the combined analysis of RFLP and RAPD data ........ 75

Figure 2.15 - Correlation between yield genetic variance from
the single-row plots to the two-row plots.
a) Population set A b) Population set B ................. 78

GENERAL INTRODUCTION

The genetic distance between individuals is a quantitative
estimate of the difference of their genetic makeup. Genetic distance can
be measured in terms of probability, using coefficients of parentage
(CP) where pedigrees are known (Falconer, 1989), indirectly by measuring
differences in expressed genetic traits, more directly by measuring
differences in gene products such as isozymes, or directly by analysis
of DNA. Indirect measurements may be qualitative, such as flower color,
hairy versus glabrous stems, hilum color; or quantitative, such as
differences in plant height, leaf size, and days to maturity. Since any
distance measurement must be related to differences in genes, no
characters should be used which are not a reflection of differences in
genes. Sneath and Sokal (1973) list as inadmissible characters that are
environmentally determined and characters that are to any degree
correlated. The former are not related to the genetic makeup and the
latter bias the distance by summing multiple measurements on the same
character.

Genetic distance based on quantitative characters can be expressed
geometrically in n dimensions of Euclidean hyperspace, where n is the
number of characters measured. The Euclidean distance between
individuals (Sneath and Sokal, 1973) is given as:

    d_{jk} = \left[ \Delta_{jk}^{2} / n \right]^{1/2}

where:

    \Delta_{jk} = \left[ \sum_{i=1}^{n} (X_{ij} - X_{ik})^{2} \right]^{1/2}

The n characters are assumed independent and normally distributed and
are standardized by giving them a mean of zero and a variance of unity.
Equal weighting of characters may introduce an indeterminable amount of
error when characters are a result of different numbers of segregating
genes. Error will also occur if different combinations of genes result
in the same phenotypic effect. When correlations exist among a set of n
characters, distance can be expressed as a function of a subset of m < n
principal components (Sneath and Sokal, 1973). Euclidean distances can
then be calculated on the basis of m orthogonal axes in hyperspace.
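
As a rough illustration of the average Euclidean distance described above, the following Python sketch standardizes each character to a mean of zero and a variance of unity, then takes the square root of the mean squared difference between two lines over the n characters. The trait values and the function names (standardize, average_distance) are hypothetical, and standardizing with the population rather than the sample variance is an assumption.

```python
import math
from statistics import mean, pstdev


def standardize(characters):
    """Scale each character (one list per character) to mean 0, variance 1."""
    scaled = []
    for values in characters:
        m, s = mean(values), pstdev(values)  # assumes the character varies (s > 0)
        scaled.append([(x - m) / s for x in values])
    return scaled


def average_distance(characters, j, k):
    """Square root of the mean squared difference between individuals j and k."""
    n = len(characters)
    delta_sq = sum((values[j] - values[k]) ** 2 for values in characters)
    return math.sqrt(delta_sq / n)


# Hypothetical data: each inner list is one character scored on three lines.
raw = [
    [90.0, 100.0, 110.0],   # plant height (cm)
    [120.0, 125.0, 130.0],  # days to maturity
]
z = standardize(raw)
d01 = average_distance(z, 0, 1)
d02 = average_distance(z, 0, 2)
print(round(d01, 4), round(d02, 4))
```

Because both characters are standardized before use, height measured in centimeters and maturity measured in days contribute equally, which is the equal weighting whose limitations are noted above.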

Relationships among individuals can also be based on the
correlation of standardized quantitative characters between two
individuals (Sneath and Sokal, 1973). The distance is given by the
complement of the correlation coefficient (1 - r).
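
A minimal sketch of this correlation-based distance follows, assuming that r is the Pearson correlation computed across the standardized characters of two individuals. The scores and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(x, y):
    """Distance taken as the complement of the correlation, 1 - r."""
    return 1.0 - pearson_r(x, y)


# Hypothetical standardized scores for four characters on three individuals.
line_j = [-1.2, 0.3, 0.9, -0.4]
line_k = [-1.0, 0.5, 1.1, -0.2]   # same profile as line_j, shifted by 0.2
line_m = [1.2, -0.3, -0.9, 0.4]   # opposite profile to line_j

d_jk = correlation_distance(line_j, line_k)
d_jm = correlation_distance(line_j, line_m)
print(round(d_jk, 6), round(d_jm, 6))
```

The measure ranges from 0 for perfectly correlated profiles to 2 for perfectly opposed ones, and it ignores a constant shift between individuals, which a Euclidean distance would not.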

Genetic distance based on qualitative characters begins with
expressing the data in the form of an association coefficient (Sneath
and Sokal, 1973), which is a measure of character matches relative to
the number of possible matches. These pair-wise comparisons take the
form of a 2 X 2 matrix for each line in the overall n X m data matrix in
which n individuals are compared over m possible character states:

                        Individual j
                          1     0
    Individual k    1     a     b
                    0     c     d

The row or column corresponding to the number 1 indicates a character is
present, while the row or column corresponding to the number 0 indicates
a character is absent. In the case of a two-state character, a match may
be defined by either a or d and a mismatch by either b or c, but for
multi-state characters, d provides no useful information, since it gives
no indication whether the individuals are similar or different for the
other character states. In this case, an association coefficient which
ignores d would be appropriate. The coefficient of Jaccard (Sneath and
Sokal, 1973; Rohlf, 1992), for example, does not consider matches based
on mutual lack of a trait (d). Similarity is based on a/(a+b+c), and
distance is determined by the complement of similarity. Sneath and Sokal
(1973), as well as Rohlf (1992), provide lists of a number of
association coefficients which differ in the way the results a, b, c, and
d are handled.
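
The counts a, b, c, and d and two distances derived from them can be sketched as follows. The band scores are hypothetical. The Jaccard form a/(a+b+c) is the one given above; the second function, which counts both a and d as matches, is shown only for contrast, and calling it a simple matching coefficient is an assumption about terminology rather than something stated here.

```python
def match_counts(j, k):
    """Return (a, b, c, d) for two equal-length lists of 0/1 scores."""
    a = b = c = d = 0
    for sj, sk in zip(j, k):
        if sj == 1 and sk == 1:
            a += 1   # present in both individuals
        elif sj == 0 and sk == 1:
            b += 1   # present in individual k only
        elif sj == 1 and sk == 0:
            c += 1   # present in individual j only
        else:
            d += 1   # absent from both individuals
    return a, b, c, d


def jaccard_distance(j, k):
    """1 - a/(a+b+c); mutual absences (d) are ignored."""
    a, b, c, _ = match_counts(j, k)
    return 1.0 - a / (a + b + c)   # assumes at least one band is present


def simple_matching_distance(j, k):
    """1 - (a+d)/(a+b+c+d); mutual absences count as matches."""
    a, b, c, d = match_counts(j, k)
    return 1.0 - (a + d) / (a + b + c + d)


# Hypothetical presence/absence scores for six character states.
line_j = [1, 1, 0, 0, 1, 0]
line_k = [1, 0, 0, 1, 1, 0]

counts = match_counts(line_j, line_k)
dist_jaccard = jaccard_distance(line_j, line_k)
dist_matching = simple_matching_distance(line_j, line_k)
print(counts, dist_jaccard, round(dist_matching, 4))
```

The two distances differ for the same pair because the two shared absences raise the matching similarity but leave the Jaccard similarity unchanged.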

Plant breeders have used some of the distance measures defined
above in an attempt to predict the outcome of matings. Generally, the
goal is to predict which crosses will have the greatest genetic variance
of progeny, or the highest performing transgressive segregants or
hybrids. Cowen and Frey (1987a) examined the relationship of
genealogical distance between parents with progeny performance in oat
(Avena sativa L.) using a diallel mating design without reciprocals.
They evaluated progeny populations for generalized genetic variance and
transgressive segregation for bundle weight, grain yield, straw yield,
harvest index, height, and heading date. The generalized genetic
variance (Goodman, 1968) was calculated from the genetic variance-
covariance matrices of mean squares and cross products for genotype and
genotype X location interaction for bundle weight, grain yield, and
harvest index. Significant positive correlations were found for
genealogical distance with generalized genetic variances and with
transgressive segregates for height.

The same populations were later used to examine the relationships
between several other distance measurements and progeny performance
(Cowen and Frey, 1987b). A Euclidean distance was calculated using the
first five principal components based on the correlation matrix of 12
quantitative agronomic traits measured for the nine parents. This
distance proved to be negatively correlated with both transgressive
segregation and generalized genetic variances. The second distance Cowen
and Frey used was calculated from the 9 X 9 matrix of parental and
population mating means for grain yield. This distance is based on the
assumption that heterotic effects are proportional to diversity (Hanson
and Casas, 1968). These distances were positively correlated with
transgressive segregation in one year and with generalized genetic
variance in both years. The third distance measure used by Cowen and
Frey was calculated using the correlation of general combining ability
(GCA) effects (Cervantes et al., 1978) over all the traits measured. The
distance was taken as 1 - r. This distance measure was positively
correlated with mid-parent heterosis in one year.

Souza and Sorrells (1991a) used the first six principal components
from the correlation of 13 quantitative traits and the covariance of 15
discrete qualitative traits (1991b) to estimate genetic distance among
oat genotypes. They found that classification using quantitative traits
was according to area of adaptation. This method of classification did
not agree well with that taken by coefficient of parentage, while
classification using qualitative traits clustered lines according to
common ancestors in their pedigrees. Genetic distances between parents
based on either quantitative or qualitative traits were poor predictors
of progeny genetic variance (Souza and Sorrells, 1991c). Only distance
based on coefficient of parentage was significantly related to genetic
variance and, for all agronomic traits measured except biomass, this
relationship was negative.

A common factor in the estimations of genetic distance in the
above examples is the complexity of their calculation and the time and
effort required to collect the necessary data. A method of estimating
genetic distance which requires no a priori knowledge of the genetic
effects of the parents is the ideal goal. Molecular markers can provide
these estimates.

Isozymes (Lewontin and Hubby, 1966) are molecular markers whose
variants can be used in a qualitative estimation of genetic distance.
Non-denatured proteins are separated by electrophoresis and visualized
using specific staining techniques. Isozyme variants are theoretically
categorized on the basis of charge:mass ratio, so their detection relies
heavily on amino acid substitutions which result in net loss or gain in
charge. Substitutions not resulting in charge differences or large
changes in mass should be more difficult to detect. Ramshaw et al.
(1979) found cryptic differences within electrophoretic variants of
isozymes of hemoglobin. Twenty known variants were separated into only
eight electromorphs under "standard" conditions of pH 8.9 and 4.5%
acrylamide. Further manipulations of pH, acrylamide concentration, and
increased running time eventually were able to discriminate 17 classes
for an efficiency of 85%. Chemically similar substitutions in different
parts of the protein were discriminated 77% of the time under standard
conditions. Further manipulation increased this efficiency to 90%. Four
out of five chemically different but charge-equivalent substitutions at
the same location on the protein were distinguishable under standard
conditions, but one was not distinguishable under any of the conditions
used. These results show that while several stringent analyses may
separate isozymes with an acceptable degree of reliability, results
using only a single protocol may lead to errors due to isozymes scored
as identical but which are merely alike in state.

Cox et al. (1985) found significant correlations between
genealogical distance and isozyme distance using 11 enzymes in groups of
soybean (Glycine max L. Merr.). The correlation was higher for groups
with lower mean genealogical distance. Lamkey et al. (1987) estimated
genetic distance among 35 maize lines using isozyme differences at 9
loci. They found that isozyme genetic distance between parents was
unable to predict hybrid performance. Damerval et al. (1987) tested the
hypothesis that quantitative differences in gene products could be more
important sources of genetic variability in maize than qualitative
differences based on the presence or absence of a particular gene
product variant. They found that quantitative differences in enzymes in
maize were more related to Mahalanobis distances (Mahalanobis, 1936)
than were qualitative differences. The Mahalanobis distances were
calculated on the basis of general combining ability for 14 heritable
quantitative characters. This suggested that regulatory processes may
play an important role in genetic diversity. If this is the case, direct
qualitative analysis of differences in DNA sequences would provide more
useful information than qualitative analysis of gene products, because
differences in regulatory regions of DNA would be randomly sampled along
with differences in coding regions. Direct analysis of DNA increases the
extent of genome sampling by including introns and flanking sequences
which may include promoters or enhancers. Additionally, direct DNA
analysis, compared with isozyme analysis, does not rely on changes
solely within coding regions which result in amino acid substitutions.

Differences between individuals at the DNA level can be estimated
using restriction fragment length polymorphism (RFLP) (Southern, 1975).
Genomic DNA is digested with a restriction enzyme, separated by size on
an agarose gel, denatured, and transferred to a nylon membrane. The DNA
on the membrane can then be probed with a radioactively labelled
(Feinberg and Vogelstein, 1984) DNA clone, and the fragment to which the
probe hybridizes visualized on x-ray film. Qualitative differences in
RFLP banding patterns coded as one of a number of available association
coefficients (Sneath and Sokal, 1973; Rohlf, 1992), as discussed
earlier, are used to calculate genetic distance.

Genetic distances estimated using RFLP markers may be subject to
error. Size differences of the genomic DNA to which the clone hybridizes
may be due to point mutations which either eliminate or create new
restriction sites, or DNA rearrangements (Borst and Greaves, 1987).
These rearrangements may be inversions, deletions or insertions.
Polymorphism that arises from DNA rearrangement is a macromolecular
difference which may be superimposed over micromolecular differences.
Roth et al. (1989) propose that genetic variation may be generated
within inbreeding plants by rearrangements due to specific
recombinational processes in response to stress. They found that tissue
culture of soybean root resulted in changes in RFLP markers arising from
DNA rearrangement. Genetic alterations in plants regenerated from tissue
culture are well documented (Mein, 1983; Evans et al., 1984). The
surprising aspect of the results of this work was that the
rearrangements resulted in previously characterized RFLP fragments. The
majority of RFLP alleles characterized in soybean are dimorphic (Keim et
al., 1989; Keim et al., 1992) and are due to rearrangements of DNA
(Apuya et al., 1988). Instead of generating unique alleles, the
rearrangements which occurred during tissue culture resulted in
conversion from one allele to the other previously characterized allele.
Such rearrangements arising in whole plants would result in errors in
genetic distance estimates if alleles alike in state are assumed to be
identical by descent.

In general, RFLPs have proven superior to isozymes for the
estimation of genetic diversity. McGrath and Quiros detected nearly
three times as many alleles at RFLP loci as at isozyme loci in
Brassica campestris L. (syn. B. rapa Metz.). Messmer et al. (1991)
detected polymorphism at 94% of RFLP loci examined compared with 68% of
isozyme loci. The maximum number of isozyme alleles at a given locus was
three compared with a maximum of eight alleles at a given RFLP locus.
The level of RFLP diversity was also twice that for isozyme diversity of
common bean (Phaseolus vulgaris L.) (Velasquez and Gepts, 1994).

Genetic distance estimated using RFLP data has been tested
extensively as a predictor of progeny performance in maize. Smith et al.
(1990) showed a close relationship between hybrid performance and RFLP
distance of parents in maize using parents representing a wide range of
related and unrelated elite corn belt germ plasm. Lee et al. (1989)
found significant correlations of RFLP distance with both hybrid grain
yield (r = .46) and specific combining ability (SCA) (r = .74) in maize
(Zea mays L.). Godshalk et al. (1990), however, found no such
relationship. Whereas Lee's group tested crosses both within and among
heterotic groups, Godshalk’s group selected crosses which minimized
matings within heterotic groups. Melchinger et al. (1990) found only
moderate relationships between RFLP distance and hybrid grain yield (r =
.32) and SCA (r = .39). They concluded that RFLPs have only limited use
in predicting progeny performance in maize, especially among unrelated
lines.

Genealogical distance was significantly correlated with RFLP
distance in oat (Avena sativa L.), but not with a distance calculated
using the first five principal components of the parental correlation
matrix for 12 agronomic traits (Moser and Lee, 1994). There were no
correlations of RFLP distance between parents with progeny genetic
variance for grain yield, biological yield, harvest index, height, or
heading date. There was a small but significant (r = .32) correlation of
RFLP distance with straw yield genetic variance in one year. Parental
distance based on RFLP markers was unable to predict either heterosis or
population genetic variance for grain yield in oats.

Another type of molecular marker for estimating differences at the
DNA level is random amplified polymorphic DNA (RAPD) (Williams et al.,
1990; Welsh and McClelland, 1990; Rafalski et al., 1991). These markers
are DNA fragments arising from a mixture of short oligodeoxynucleotide
primers of a single randomly chosen sequence mixed with genomic DNA and
subjected to the polymerase chain reaction (Mullis and Faloona, 1987).
The RAPD estimation of genetic distance is simpler than that using RFLP
markers because it requires no development of specific clones to be used
as probes.

Although RAPD markers are easy to generate, genetic distances
estimated from RAPD markers may be subject to error. Primer binding
sites on the genomic DNA template at a distance that can be overlapped
during the extension phase of the PCR reaction should result in
amplification of the intervening DNA sequence; however, Williams et al.
(1990) have shown that the final amplification products may be a result
of competition among binding sites rather than the actual number of
available sites. Thus, template and primer DNA concentrations must be
identical for each reaction mixture for reliable comparison of the
resulting markers.

Smith et al. (1994), in a phylogenetic analysis of bacterial
strains, found that presence or absence of a RAPD phenotype arose from
either the absence of the primer binding site or competition from a
preferred alternative RAPD product. They also detected co-migrating RAPD
products from unrelated loci, as well as multiple, related products
within a given reaction mixture. F1 hybrids from crosses between maize
inbreds did not always reveal simple inheritance of a dominant RAPD
marker (Heun and Helentjaris, 1993). This indicates that amplification
of a given RAPD product could be dependent upon the genetic background,
rather than the presence or absence of the DNA segment corresponding to
the actual RAPD product. The problems encountered above should not
preclude the use of RAPD markers to measure intraspecific genetic
distance among inbred lines, however, provided that reaction conditions are
carefully controlled (Ellsworth et al., 1993).

Genetic relationships using RAPD markers have been estimated in
rice (Oryza sativa L.) (Yu and Nguyen, 1994), Brassica species (Mailer
et al., 1994; dos Santos et al., 1994; Jain et al., 1994; Thormann et
al., 1994; Hallden et al., 1994), tomato (Lycopersicon esculentum
Mill.) (Williams and St. Clair, 1993), wild oat (Avena sterilis L.)
(Heun et al., 1994), and barley (Hordeum vulgare L.) (Tinker et al.,
1993). Heun et al. (1994) compared RAPD markers to isozymes for
determining relationships among wild oat accessions. Both isozyme and
RAPD markers were able to distinguish all 24 of the wild oat accessions
studied. Cluster analyses produced similar groupings among the
accessions, but overall correlation of distance estimates was only
moderate (r = .36). Principal component analysis resulted in more
definitive groupings for the RAPD markers. A comparison of RAPD and RFLP
markers in Brassica oleracea (L.) genotypes (dos Santos et al., 1994)
gave equal coefficients of variation (CV) of the genetic distance
estimates for equal sample size for both marker types. Both marker types
identified distinct groupings for the sub-species cabbage, broccoli, and
cauliflower. The observed differences in genetic distance estimates were
concluded to be the result of sampling error rather than inherent
DNA-based differences in how RAPDs and RFLPs reveal polymorphism. Thormann
et al. (1994) estimated genetic relationships within and among
cruciferous species using RAPDs and RFLPs based on either genomic DNA
(gDNA) or cDNA clones. The number of markers required for a CV of 10%
was approximately 300 for each marker type. The correlations between
distances among the three marker types were all high (r > .90).
Dendrograms were compared using matrices based on cophenetic values and
the Mantel test for matrix correspondence (Mantel, 1967). The
correlation between the gDNA dendrogram and the cDNA dendrogram was
higher than either correlation between RFLP dendrograms with the RAPD
dendrogram. Although all three correlations were high (r = approximately
.90) for intraspecific comparisons, the correlations between RFLP-based
and RAPD-based dendrograms were low (r < .37) for interspecific
comparisons. Hybridization tests using the RAPD fragments as probes
demonstrated that some of the fragments scored as identical were not
actually homologous at the interspecific level.
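
The Mantel test for matrix correspondence mentioned above can be sketched as a permutation test on two symmetric distance matrices. The version below is a generic illustration under stated assumptions, not the procedure of Thormann et al. (1994): it correlates the off-diagonal elements directly rather than cophenetic values, it uses a one-sided permutation P-value, and the matrices, the function names, and the number of permutations are all hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(m):
    """Off-diagonal elements above the diagonal, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation P-value."""
    n = len(a)
    r_obs = pearson_r(upper_triangle(a), upper_triangle(b))
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel the individuals of matrix a; rows and columns move together.
        shuffled = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        r_perm = pearson_r(upper_triangle(shuffled), upper_triangle(b))
        if r_perm >= r_obs - 1e-12:
            hits += 1
    return r_obs, hits / (permutations + 1)


def distances(points):
    """Pairwise absolute differences between positions on a line."""
    return [[abs(p - q) for q in points] for p in points]


# Hypothetical distance matrices for the same five genotypes from two marker types.
matrix_rflp = distances([0.0, 1.0, 3.0, 6.0, 10.0])
matrix_rapd = distances([0.0, 1.2, 2.9, 6.3, 9.8])

r_obs, p_value = mantel(matrix_rflp, matrix_rapd)
print(round(r_obs, 3), p_value)
```

Permuting the labels of whole rows and columns, rather than shuffling individual distances, is what distinguishes this test from an ordinary correlation test, because the entries of a distance matrix are not independent of one another.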

Jain et al. (1994) examined the use of RAPD genetic distance
estimates to predict heterosis among crosses of Indian mustard (Brassica
juncea (L.) Czern. and Coss.). They tested 12 Indian and 11 exotic B. juncea
genotypes. Although they found no direct relationship between RAPD
genetic distance and hybrid performance, RAPD analysis was able to
classify the genotypes into two distinct groups comprised almost
exclusively of the Indian and exotic genotypes, respectively. Crosses
between groups exhibited more overall heterosis than crosses within
groups.

Soybean is a self-pollinated crop with limited genetic diversity
in the elite germ plasm used by applied breeders in North America
(Delanney et al., 1983). This limited genetic diversity makes research
to exploit the existing diversity very important for continued
improvement of the crop. Delanney et al. (1983) calculated that ten
ancestors contributed more than 80% of the gene pool for the northern
soybean germ plasm. Continued improvement of soybean yield could be
facilitated by identification of diverse parents within adapted
germplasm for making cross pollinations, or the identification of unique
diversity from among more recent plant introductions. Molecular markers
could provide the necessary tools to make this identification.

The lack of diversity in soybean assumed by genealogical analysis
is reflected in the low number of RFLP alleles found. Most RFLP loci
have only two alleles and, in some cases, the second allele is rare
(Keim et al., 1989; Keim et al., 1992). Despite this, enough RFLP
diversity has been found to uniquely identify and establish
relationships among large numbers of soybean lines (Skorupska et al.,
1993). The large degree of relatedness among elite soybean lines may
actually increase the effectiveness of molecular distance estimates
among parents in predicting progeny performance. Some studies (Smith et
al., 1990; Lee et al., 1989) have indicated that there is a high
correlation of molecular genetic distance with progeny performance among
closely related parents. The work presented here was undertaken to
examine 1) the relationship between molecular markers and coefficients
of parentage and 2) the relationship between parent genetic distance and
progeny performance in soybean. Because of the close relationships among
soybean lines in the Northern U.S., parent genetic distance may predict
progeny genetic variance. Additionally, since pedigree information is
not available for the early ancestral lines from which North American
lines were developed, molecular marker distance may be more accurate
than genealogical distance for this purpose.

SECTION ONE

RESTRICTION FRAGMENT LENGTH POLYMORPHISM RELATIONSHIPS AMONG SOYBEAN
LINES IN THE NORTHERN UNITED STATES

INTRODUCTION

The continued improvement of soybean (Glycine max L. Merr.) yield
in the northern United States may be limited by lack of genetic
diversity. Only a few of the plant introductions brought from eastern
Asia in the early twentieth century were suitable for seed production in
the U.S., and these formed the original gene pool from which present
soybean cultivars have been derived (Committee on Genetic Vulnerability
of Major Crops, 1972). Delanney et al. (1983) calculated that ten
ancestors contributed more than 80% of the gene pool for northern
soybean germplasm. The genetic base does not appear to have changed in
recent years (Gizlice et al., 1994), even with the inclusion of
proprietary cultivars (Sneller, 1994). St. Martin (1982) compared 50
years of soybean breeding in the U.S. to a program of recurrent
selection. He estimated the effective number of lines recombined each
cycle to be between 11 and 15. This suggests that there has been a loss
of genetic variability in soybean through selection in breeding programs
and random drift. Gizlice et al. (1994) estimated that the genetic
diversity in public cultivars was down 21% from that of the original
ancestral plant introductions.

Relatedness of soybean genotypes can be estimated using pedigrees
to calculate coefficient of parentage, or by analyzing each genotype for
morphological or molecular markers. Cox et al. (1985) compared genetic
distance estimates among soybean lines calculated using coefficient of
parentage, morphological characters, and isozyme markers. Rank
correlation coefficients of estimated genetic distances among all types
of measurements, including a combination of both isozyme and
morphological traits were statistically significant, but ranged from
0.15 to 0.60. This wide range may have been a result of the few isozymes
or morphological traits used to estimate distance.

Keim et al. (1989) compared 58 soybean accessions using 17
restriction fragment length polymorphism (RFLP) loci. These included 48
accessions from the species G. max, 8 from G. soja Sieb. and Zucc., and
2 from "Glycine gracilis" Skvortz. The G. max accessions included 18
cultivars, 10 plant introductions, and 20 ancestral lines. Polymorphic
loci generally had only two alleles, and for one-third of these loci,
the second allele was rare, occurring in only one or two of the
accessions characterized. On average, any two cultivars differed at
only 16% of the loci. Seven of the cultivars were identical at all 17
RFLP loci. The average within group diversity was greatest among the G.
max plant introductions.

Keim et al. (1992) screened 16 ancestral and 22 adapted lines of
G. max at 128 RFLP marker loci. Seventy percent of the clones were
polymorphic, and their average polymorphism information content (PIC)
was 0.30. Only one in five markers was informative between any two
soybean genotypes. The polymorphism frequency among adapted lines was
lower using clones selected by screening interspecific germ plasm than
when using clones selected using intraspecific germ plasm.

Skorupska et al. (1993) characterized 108 genotypes of G. max
using 83 molecular probes. These included ancestral genotypes, breeding
lines, and elite cultivars encompassing maturity groups V-IX. The
majority of the probes were uninformative, and only 35% detected
polymorphism between any two lines with a frequency greater than 0.30.
The greatest genetic distances were among the ancestral genotypes, while
recently developed lines had a relatively narrower range of diversity.
Genotypes within maturity groups were associated by principal component
analysis, suggesting that molecular diversity was diminished through
selection within geographical regions.

The studies outlined above included probes which had not
previously been screened for levels of polymorphism revealed in adapted
germ plasm. While the average marker diversity was low, some probes
revealed no polymorphism, while others revealed above average marker
diversity. In this study, only clones which had previously been
determined to reveal high levels of polymorphism within elite soybean
germ plasm were used as probes. The RFLP markers from these probes were
used to 1) determine the relationships among ancestral plant
introductions, 2) estimate genetic distances among Northern soybean
genotypes, 3) assess whether genetic relationships based on RFLP data are
related to those based on known pedigree relationships, 4) determine
whether RFLP allelic diversity has been lost in modern, elite lines from
the Northern U.S. compared with the ancestral plant introductions,
5) examine more recent plant introductions as a source of exploitable
genetic diversity, and 6) estimate the effect of selection on the
contribution of alleles from parents compared to that expected from the
coefficient of parentage.

MATERIALS AND METHODS

One hundred and three soybean cultivars and lines (Table 1.1) were
evaluated using 57 RFLP markers. Seventy cultivars or elite lines from
the northern U.S. (referred to hereafter as northern elites) were
evaluated because they were important regional cultivars, or because
they were parents in the Michigan State University breeding program. The
20 ancestral plant introductions (referred to hereafter as ancestors)
were evaluated because they contributed approximately 80% of both the
Northern and Southern soybean germ plasm parentage (Delanney et al.,
1983; Gizlice et al., 1994; Sneller, 1994). A sample of 13 plant
introductions (PI’s) was selected because they performed well as
parents when crossed with adapted genotypes from the northern U.S.
(Nelson, 1994). The 70 cultivars and lines included ’Williams’, ’Essex’,
and ’Ransom’, 10 cultivars selected from the cross Williams by Essex,
and 5 cultivars selected from the cross Williams by Ransom. The progeny
of these crosses were not included in the estimates of genetic distance
mean and variance for the northern elites because these closely related
lines would have biased the results. Some of the lines were not analyzed
at all 57 marker loci.

Soybean DNA was extracted from greenhouse grown plants according
to Keim and Shoemaker (1988) with modifications. Ten seeds were sown for
each genotype, but, in some cases, tissue was collected from as few as
four plants because of poor seed germination. Freeze-dried leaf tissue
was pulverized using a paint shaker modified to hold 50 ml disposable
polypropylene centrifuge tubes. The dry tissue was placed in the tube
along with 5 ml of glass beads and shaken for two minutes. Pulverized
tissue was incubated for one hour at 65°C with CTAB extraction buffer
(2% CTAB, 1.4 M NaCl, 0.2 M EDTA, 0.1 M Tris-HCl pH 8.0, 1%
2-mercaptoethanol). The aqueous phase was then extracted twice with
chloroform:isoamyl alcohol (24:1) and the nucleic acid precipitated with
ice-cold isopropanol. DNA that proved difficult to cut with restriction
enzyme was dissolved in a high salt solution and precipitated again to
remove bound carbohydrate (Fang et al., 1991). Restriction enzyme
digestions, electrophoresis, Southern blotting and hybridizations were
done according to Maniatis et al. (1982) with adaptations described by
Diers and Osborn (1994).

Table 1.1 Soybean cultivars and lines analyzed.

Cultivars and Elite Lines

Asgrow
    A2234(II)           A3127†(III)
    A2396(II)           A3860†(III)
    A2543(II)           A3966†(III)
    A2943(II)           A4268†(IV)
    A5308†(V)

Agripro
    AP 1989(I)

Iowa State Univ.
    A81-356022§(III)    AC89-241029(II)
    A84-185032(II)      AC90-115043§(I)
    A85-293033(II)      IA 2007(II)
    A86-103027(II)      IA 2008(II)
    A88-221013(II)
    AC89-145013(I)

Michigan State Univ.
    E90006(II)          E90012(II)
    E90009(II)          E90013(III)
    E90010(II)          E87223(II)

Northrup King
    NKB-335†(III)       NKS 20-26(II)
    NKC-393†(III)       NKS 23-12(II)
    NKS 13-46(I)        NKS 25-99(II)
    NKS 19-90(I)        NKS 42-40†(IV)
    NKS 20-20§(II)      NKS 48-84(IV)

Pioneer HiBred
    P9273(II)           P9441†(IV)
    P9341§(III)         P9471†(IV)

Univ. of Minn.
    M82-946(I)

Ohio St. Univ.
    HC84-2001(II)

Univ. of Ill.
    LN86-983(II)

Purdue Univ.
    C1786(II)           C1817(II)
    C1797(II)

Public Cultivars
    Archer(I)           Hack(II)
    Beeson 80(II)       Haroson(I)
    Bert(I)             Hobbit‡(III)
    Brock(I)            Hoyt(II)
    Burlison(II)        Kenwood(II)
    Century 84(II)      Pella 86(III)
    Conrad(II)          Pixie‡(IV)
    Dimon(II)           Ransom(VII)
    Elf‡(III)           RCAT Angora(II)
    Elgin 87(II)        Sibley(I)
    Essex(V)            Sprite‡(III)
    Gnome‡(I)           Williams(III)

Plant Introductions

Ancestral Introductions                  Other Plant Introductions
    AK(Harrow)(III)¶    Mejiro(IV)       PI 68508(II)    PI 427099(I)
    Biloxi-3(VII)       Mukden(II)¶      PI 297515(II)   PI 445830(I)
    CNS(VII)            Palmetto(VII)    PI 297544(II)   PI 391594(II)
    Dunfield(III)¶      Patoka(IV)¶      PI 361064(II)   PI 68522(II)
    Flambeau(00)        Richland(II)¶    PI 54610(III)   PI 384474(II)
    Lincoln(III)¶       Roanoke(VII)¶    PI 407710(I)    PI 90566-1(III)
    Manchu(III)         S-100(V)¶        PI 68658(II)    PI 290126-b(II)
    Mandarin(I)         Seneca(II)¶
    Mandarin(Ottawa)(0) Tokyo(VII)
    Manitoba Brown(00)

† Progeny of Williams by Essex; ‡ Progeny of Williams by Ransom; §
Analyzed using only 38 marker loci; ¶ Ancestral lines which contributed
parentage to the cultivars and elite lines examined in this study; #
Ancestral lines which did not contribute to northern soybean germ plasm.
Maturity groups are given in parentheses.

The soybean genotypes were evaluated by RFLP analysis using 50
clones as hybridization probes. The clones (Table A.1) were obtained
from Iowa State University and the University of Utah (Keim and
Shoemaker, 1988). The clones were selected because they were previously
shown to reveal a high frequency of polymorphism in elite germplasm
(Webb, 1992; Skorupska et al., 1993). Each polymorphic RFLP fragment was
scored as present or absent and genetic distance (RD) among the
genotypes was calculated using the complement of the simple matching
coefficient (1 - (n’/n), where n’ is the number of alleles two lines have
in common and n is the total number of alleles scored in each
comparison). Cluster analysis was performed on the similarity matrix
using the unweighted pair-group method, arithmetic average (UPGMA).
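
As an illustration of the distance measure described above, the
following Python sketch computes RD = 1 - n'/n for lines scored for
fragment presence (1) or absence (0). It is not the NTSYS-pc procedure
used in the study. The function names, the use of None for fragments
that were not scored, and the example profiles are assumptions made for
illustration only. Shared absences are counted as matches, as in the
usual simple matching coefficient.

```python
from itertools import combinations


def rflp_distance(profile_a, profile_b):
    """Return 1 - n'/n, the complement of the simple matching coefficient.

    Each profile is a list of 1 (fragment present), 0 (fragment absent),
    or None (not scored). Only positions scored in both lines count
    toward n; n' is the number of those positions with the same score.
    """
    scored = [(a, b) for a, b in zip(profile_a, profile_b)
              if a is not None and b is not None]
    if not scored:
        raise ValueError("no fragments were scored in both lines")
    matches = sum(1 for a, b in scored if a == b)
    return 1 - matches / len(scored)


def pairwise_distances(profiles):
    """Return {(name_1, name_2): RD} for every pair of named profiles."""
    return {
        (p, q): rflp_distance(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }


# Hypothetical profiles; these are not data from this study.
example = {
    "line_a": [1, 0, 1, 1],
    "line_b": [1, 1, 1, 0],
    "line_c": [1, 0, 1, 1],
}
distances = pairwise_distances(example)
```

A table of this kind could then be passed to any UPGMA (average linkage)
clustering routine.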

Principal component analysis was done by first calculating a
correlation matrix of alleles from the RFLP data. Genotypes were then
plotted using eigenvectors calculated from the correlation matrix.
Genetic similarity calculations, cluster analyses, and principal
component analyses were done using NTSYS-pc software (Rohlf, 1992).

Polymorphism information content (PIC) at each locus was computed
using the formula $1 - \sum_j p_{ij}^2$, where $p_{ij}$ is the frequency
of the jth RFLP allele at the ith locus (Anderson et al., 1993). PIC is a
measure of genetic diversity. PIC increases with both the number of
alleles at a locus and the equality of frequency of those alleles.
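
The PIC formula can be sketched in a few lines of Python. The function
name and the example frequencies below are hypothetical and serve only
to show how the value responds to allele number and evenness.

```python
def pic(allele_frequencies):
    """Return 1 minus the sum of squared allele frequencies at one locus."""
    total = sum(allele_frequencies)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies at a locus must sum to 1")
    return 1 - sum(p * p for p in allele_frequencies)


# Hypothetical loci, not data from this study.
two_even_alleles = pic([0.5, 0.5])            # two equally frequent alleles
one_rare_allele = pic([0.9, 0.1])             # second allele is rare
four_even_alleles = pic([0.25, 0.25, 0.25, 0.25])
```

With only two alleles the value cannot exceed 0.50, which is reached
when both alleles are equally frequent. Values above 0.50 require a
third or fourth allele.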

Genealogical distance (GD) was calculated as the complement of the
coefficient of parentage (CP). GD values used in clustering and
correlation analyses were calculated with the assumed relations among
ancestors as described by Carter et al. (1993). Other ancestors were
assumed to be unrelated, each parent was assumed to contribute equally
to all progeny, and all lines were assumed to be completely inbred. The
CP between any line and a line derived from a random mating population
was calculated as:

$$r_{x,RM} = \frac{1}{n} \sum_{i=1}^{n} r_{x,i}$$

where $r_{x,RM}$ is the CP between line x and a line from a particular
random mating population, n is the number of parents used to form the
population, and $r_{x,i}$ is the CP between x and the ith parent of the
population. All CP values were calculated with SAS programs (Sneller,
1994b).
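
Reading the formula above as the average of the CP values between line x
and each of the n parents of the population, the calculation can be
sketched as follows. This is not the SAS program cited; the function
names and the CP values are hypothetical.

```python
def cp_with_random_mating_line(cp_with_each_parent):
    """Average the CP between line x and each parent of the population."""
    if not cp_with_each_parent:
        raise ValueError("the population needs at least one parent")
    return sum(cp_with_each_parent) / len(cp_with_each_parent)


def genealogical_distance(cp):
    """GD is the complement of the coefficient of parentage."""
    return 1 - cp


# Hypothetical CP values between line x and four population parents.
cp_x_rm = cp_with_random_mating_line([0.5, 0.25, 0.0, 0.25])
gd_x_rm = genealogical_distance(cp_x_rm)
```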

RESULTS AND DISCUSSION

Fifty clones were hybridized onto the soybean DNA (Table 1.2).
Seven of the clones revealed two independent polymorphic loci, whereas
the remainder revealed only one polymorphic locus. Thus, a total of 57
marker loci were scored. Fifty-three marker loci had only two alleles,
two loci had three alleles, and two loci had four alleles. Where three
or four alleles were present, the least common allele(s) was observed
only in the ancestral lines and/or the plant introductions. The allelism
of fragments was readily identified because of the predominance of only
two alleles at any locus and the inbred nature of the genotypes.
Previous studies with soybean have shown a similar number of alleles for
polymorphic markers (Keim et al., 1989; Keim et al., 1992; Skorupska et
al., 1993).

The mean and range of PIC for loci in this study were 0.39 and
0.10-0.61 for the ancestors, 0.29 and 0.00-0.57 for the PI’s, 0.37 and
0.0-0.50 for the northern elites, and 0.39 and 0.04-0.54 overall (Table
A.1). This is an increase over average PIC values previously reported
for soybean of 0.28 (Keim et al., 1989), 0.30 (Keim et al., 1992), and
0.24 (Skorupska et al., 1993). The greater PIC values in our study were
probably the result of prior screening for high values within elite germ
plasm.

According to the Committee for the World Atlas of Agriculture
(1973), the soybean production region of China is found within three
agricultural areas defined by climate (Figure 1.1). These are the
Northeast Cold Temperate Area (NECTA), the North Temperate Area (NTA),
and the Central Subtropical Area (CSA).

Cluster analysis (Figure 1.2) grouped the ancestors according to
place of origin as listed by Bernard et al. (1987a). ’Palmetto’, ’CNS’,
and ’Biloxi-3’, are ancestors from the CSA near the Yangtze delta (below
32N latitude) and clustered apart from all the other ancestors examined.
These three ancestors and ’Mejiro’ (PI 80837, from the Rikuu AES, Japan)
have the ’Arksoy’ cytoplasm (Grabau et al., 1992; Hanlon and Grabau,
1995). The remaining ancestors have ’Bedford’ cytoplasm, except for
Lincoln, whose cytoplasm is unique among the ancestors in this study.
Most ancestors from the NECTA of China, which includes the Heilungjiang
and Jirin provinces between 42N and 49N latitude, were clustered
together (PI 54610, ’Dunfield’, ’Manchu’, ’Patoka’, and ’Richland’). This
cluster also includes ’Flambeau’, an introduction from Russia whose
origin is likely from near this region, ’A.K.(Harrow)’ and ’S-100’,
which are selections from ’A.K.’, which probably originated from within
the NECTA, and ’Lincoln’, whose parents are unknown. Although Mandarin
was introduced from Sui Hua, a town in the Heilungjiang province (NECTA)
near 47N latitude, it and the selection ’Mandarin(Ottawa)’ are clearly
separated from other ancestors from the NECTA. These two ancestors are
more closely associated with those originating from latitudes between
32N and 42N, which form separate clusters. ’Tokyo’ (Yokohama, Japan, 36N
latitude) and ’Roanoke’ (a rogue from ’Nanking’) are loosely associated
with ancestors of the NECTA. ’Mukden’ (from the NTA in the Liaoning
Province, 42N latitude), ’Seneca’ (origin unknown), Mejiro (37N
latitude) and ’Manitoba Brown’ (origin unknown) are clustered and
loosely associated with Mandarin and Mandarin(Ottawa).

Figure 1.1 Agricultural areas associated with soybean production in
China (Committee for the World Atlas of Agriculture, 1973). 1. Northeast
Cold Temperate Area; 2. North Temperate Area; 3. Central Subtropical
Area.

Figure 1.2 Phenogram showing the relationships of 20 ancestral plant
introductions, based on RFLP analysis.

Lincoln was believed to be a selection from the cross Mandarin by
Manchu (Bernard, 1987), but molecular evidence disputes this. Lincoln
cytoplasm differs from that found in either Mandarin or Manchu (Grabau
et al., 1989). Our study provides further evidence that Lincoln is not a
progeny of Mandarin by Manchu. We found that Lincoln has alleles for 17
markers not found in either Mandarin or Manchu. Given a mean value of
61% shared alleles within the ancestral introductions, the average value
of common alleles between parents and progeny from crosses among these
lines would be 81% (0.5 + 0.5 x 0.61 = 0.805, since a progeny line is
expected to carry one parent's alleles at half of its loci and to match
that parent at 61% of the remainder). Lincoln and Mandarin shared only
36% of their RFLP alleles and were widely separated in a three
dimensional principal component analysis (data not shown). Manchu shared
65% of its alleles with Lincoln; however, Manchu was a heterogeneous
introduction which gave rise to numerous pure line selections.
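
The 81% expectation can be checked with a short Python sketch. It
assumes, as the pedigree analysis in this study does, that each parent
contributes half of the alleles of an inbred progeny line; the function
name is hypothetical.

```python
def expected_parent_progeny_sharing(parent_similarity):
    """Expected proportion of alleles a progeny line shares with one parent.

    Half of the loci are expected to carry that parent's allele (always a
    match); the other half carry the other parent's allele, which matches
    with probability equal to the similarity between the two parents.
    """
    if not 0.0 <= parent_similarity <= 1.0:
        raise ValueError("similarity must be a proportion between 0 and 1")
    return 0.5 + 0.5 * parent_similarity


# Mean proportion of shared alleles among the ancestors, from the text.
expected = expected_parent_progeny_sharing(0.61)  # about 0.805
```

The 36% of alleles shared by Lincoln and Mandarin falls well below this
expectation.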

The close relationship between AK(Harrow) and Lincoln provides a
possible clue to the origin of Lincoln. The cultivars ’Illini’ and
AK(Harrow) are selections from A.K., are phenotypically
indistinguishable (Carter et al., 1993), and were found by Keim et al.
(1992) to differ at only 1 in 129 RFLP loci. In this study, Lincoln and
AK(Harrow) shared common alleles at 83% of the RFLP loci examined.
Because both Illini and Lincoln were released by C. M. Woodworth at the
Illinois Agricultural Experiment Station (Bernard et al., 1987), it is
possible that Illini or another selection from A.K. could be a parent
of Lincoln.

The marker information provided insight into other relationships
among ancestors. Bernard et al. (1987) listed CNS as probably equivalent
to Nanking, and Roanoke as a rogue from Nanking. CNS and Roanoke shared
only 40% of the RFLP alleles examined. If CNS is equivalent to Nanking,
the markers indicate that Roanoke is probably unrelated to Nanking.

The average RD among the ancestral lines Lincoln, AK(Harrow), PI
54610, S-100, and Dunfield was 0.24. These five ancestors contributed
38.5% and 35.4% of the elite parentage of soybean lines in the northern
and southern US, respectively (Gizlice et al., 1994). Ancestral lines
are typically assumed to be unrelated when GD is calculated among lines.
When GD estimates among the northern elite lines in this study were
adjusted by replacing a GD of one with the calculated RD among the
ancestors, the average GD was reduced from 0.82 to 0.43.

The RFLP markers distinguished all lines evaluated, and clustering
was generally in agreement with known genealogical relationships (Figure
1.3). The selection Mandarin(Ottawa) is closely paired with its
ancestral line Mandarin. ’A2234’, ’A2543’, ’Century 84’, and ’Burlison’
are clustered together and each share the cultivar ’Century’ as a
parent. ’NKS20-26’ and ’NKS19-90’ are linked through their common parent
’Pride 8152’, which is a progeny of a cross with ’NKS13-46’, also in the
cluster. ’C1797’ and ’C1786’ are half-sibs and are paired together.
’E90012’ and ’E90013’, full-sibs, are paired and are clustered with
their half-sib ’E90010’.

RFLP distance (RD) analysis resulted in association of genotypes
into clusters previously defined by their ancestors. Biloxi-3 (maturity
group (MG) VIII), Palmetto (MG VII), CNS (MG VII), and Manitoba Brown
(MG 00) formed a cluster separate from all of the other lines examined
(Group 5, Figure 1.3). Biloxi-3, Palmetto, and CNS were the only
ancestral introductions evaluated from southern China and were among the
latest maturing lines in this study. Manitoba Brown may have clustered
with the southern ancestors only by virtue of its differences with all
the other lines.

Figure 1.3 Phenogram showing relationships of a sample of soybean lines,
based on RFLP analysis. Axis: RFLP distance coefficient. Cluster labels:
Group 1, Group 2 (NTA and Other), Group 3 (Mandarin), Group 4 (Other),
Group 5 (CSA).

Another prominent cluster (Group 3) is defined by Mandarin and
Mandarin(Ottawa). This cluster is divided into three smaller clusters.
Cluster 3a included Mandarin and Mandarin(Ottawa) along with three
Michigan lines (E90010, E90012, E90013) and ’AP 1989’, whose pedigrees
traced back to Mandarin(Ottawa). It also included PI 290126b, introduced
in 1963 from Hungary. Cluster 3b includes lines which trace back to
Mandarin(Ottawa) through cultivars and breeding lines from Minnesota.
Cluster 3c included five lines from Northrup King, ’IA 2007’, and
’Brock’, which all have PI 257435, an introduction from West Germany, in
their pedigree. Cluster 3c also includes ’Hoyt’, which traces back to
Mandarin(Ottawa).

The majority of the northern elites clustered with the group of
ancestors originating within the NECTA (Group 1). This group can be
broken down further into associations with Lincoln (Group 1a), S-100
and AK(Harrow) (Group 1b), a cluster of six lines from Iowa State and
Purdue Universities along with an Asgrow cultivar (Group 1c), and the
remainder of the ancestors from this geographical area (Group 1d). Group
1b is comprised almost entirely of the parents and progeny of the
crosses Williams by Essex and Williams by Ransom. Other lines within
this group have Williams in their pedigree. The close relationship among
these lines sets them apart from other lines within Group 1 and separates
Group 1c from Group 1a. The separation of Group 1b from 1a is not
consistent with the close RD between Lincoln and AK(Harrow) of 0.17.
Cluster analysis without the progeny of Williams, Essex, and Ransom (not
shown) merges the members of Group 1c with Group 1a and places Lincoln
close to AK(Harrow), S-100, and Williams. Cluster 1d includes most of
the PI’s, which are grouped with the ancestors Manchu, Patoka, Dunfield
and Richland, and a Michigan breeding line, E90009. Ancestors whose
origins are intermediate to the extreme northern and southern ancestors
are associated outside of the groupings defined above.

Both RFLP and pedigree analysis grouped the progeny of the crosses
Williams by Essex and Williams by Ransom with one or the other of the
parents (Figures 1.3 and 1.4). However, pedigree relationships failed to
account for the close RD relationship (RD = 0.30) between Williams and
Essex. The CP between Williams and Essex is near zero, but Lincoln and
S-100, ancestors of Williams and Essex, respectively, are closely
related by RFLP analysis. RFLP data (Figure 1.3) associated Williams and
Essex, while genealogical data (Figure 1.4) places Williams apart from
Essex. GD analysis also failed to account for unequal allele
contributions from the parents. GD analysis grouped most of the progeny
of Williams by Essex with Essex, while RD analysis showed that they were
more related to Williams. GD analysis was unable to associate lines of
unknown pedigree, such as the ancestors, and account for allele
contributions deviating from probability estimates. These are the likely
reasons for disagreement between cluster relationships based on the two
types of distance information.

The correlation between genealogical distances (GD) and RFLP
distances (RD) among elite lines is highly significant (P<0.001, R =
0.68) (Figure 1.5). The low correlation coefficient may be a result of
the downward bias of GD associated with alleles alike in state as well
as the upward bias of GD from pre-existing relationships among the
ancestors. The ability of RD analysis to cluster the ancestors by area
of origin is evidence that associations among the ancestors are real and
likely to account for alleles identical by descent among ancestors
previously believed unrelated. The effects of these biases should be
greatest as GD nears 1.0.

Figure 1.4 Phenogram showing relationships of a sample of soybean lines,
based on coefficient of parentage analysis. The groups listed correspond
to those in Figure 1.3 and are listed in order of the greatest
contribution to the cluster represented here. The PI’s from Group 1d
have no known pedigree and cannot be grouped in this analysis. Groups 4
and 5 consisted of ancestors which are not in the pedigrees of the lines
in this analysis.

Figure 1.5 Scatterplot of the correlation between genetic distances
based on RFLP and coefficient of parentage analyses. Horizontal axis:
genealogical distance (1 - CP); fitted line labeled Y = 0.41X.

The mean genetic distance was 0.39 among the ancestors and 0.36
among the cultivars and elite lines. This represents a statistically
significant decrease in diversity of 8% in the cultivars and elite lines
compared to the ancestors. There were five alleles (A4,pA59;
A3/A4,pB142; A3,pK258; A3,pR92; Table A.1) present in the ancestors that
were not present in the cultivars or elite lines examined. However,
these ancestors were not in the pedigrees of this sample of northern
elites. The 20 ancestral lines we examined contributed 81% of the
parentage within northern elite germ plasm according to Gizlice et al.
(1994). However, pedigree analysis of the northern elites examined in
this study revealed that the 20 ancestors comprised only 74.4% of their
parentage, and that several ancestors not in the pedigrees of these
northern elites were included (Table 1.1).

There were no unique RFLP alleles present in the PI’s. Since these
lines were selected on the basis of their performance as parents when
crossed with adapted lines, they should not be considered a random
sample of diversity within the available gene pool of PI’s. The smaller
distances (average = 0.30) within this group of PI’s may be the result
of their selection for good performance as parents when crossed with
elite lines from the northern US. Although these lines were acquired
from China (PI 68508, PI 68522, PI 68658, PI 90566-1, PI 391594,
PI 407710, PI 427099), Hungary (PI 297515), Russia (PI 297544,
PI 384474), Yugoslavia (PI 361064), and Romania (PI 445830), the
diversity implied by the range of
source countries may be deceiving. The Chinese lines are all from the
northeast provinces of Heilungjiang and Jirin (Bernard et al., 1989a,b),
the Russian lines come from the far east region bordering Northeast
China, and the lines from eastern Europe were developed from imported
germplasm which likely has origins in northeast China (Nelson, 1995).

Ten cultivars from the cross Williams by Essex and five cultivars
from the cross Williams by Ransom were evaluated in this study because
these crosses were so productive in generating new cultivars. Both
crosses were between a Northern and a Southern cultivar. The
coefficients of parentage between the parents were near zero for each
cross (Carter et al., 1993); however, the RD’s did not reflect this for
either cross. Williams differed from Essex at 17 of 55 loci examined (RD
= 0.31) and Williams differed from Ransom at 21 of 55 loci (RD = 0.38).
The cultivars ’NKB-335’, ’P9471’, ’A3127’, and Pixie differed
significantly from an equal contribution of alleles from each parent
(Table 1.3). When cultivars are grouped by company or university of
origin, all groups, except for cultivars developed by Asgrow, are
significantly different from that expected had there been an equal
contribution of alleles from each parent. The Asgrow lines span maturity
groups III, IV, and V, and may represent a broader range of adaptation
than do the Williams by Essex lines in the other breeding programs.

Cluster analysis of parents and progeny of the cross Williams by
Essex (Figure 1.6) shows an association of the majority of progeny with
Williams. Grouping of lines implies common alleles are shared among
them. The five maturity group (MG) III lines out of the cross Williams
by Essex shared a common allele at 5 out of 17 loci (pA89, pK14, pK385,
pA203, and pR201). Four out of five of these were Williams (MGIII)
alleles. At the one locus where they shared an Essex allele (pA203),
that allele was found in all the Williams by Essex progeny examined. No
common alleles were shared by all group IV lines except the Essex allele
of pA203. In 21 out of 51 possible cases (3 breeding programs by 17
clones), Williams by Essex lines within a breeding program shared the
same allele. Within breeding programs, common alleles were shared by all
lines at 5 (Northrup King), 11 (Pioneer HI-BRED), and 5 (Asgrow) loci
out of 17. Although too few lines and alleles were examined for precise
frequency estimates, generally, contributions of alleles from the parents
were either bimodal or skewed toward a greater contribution from a
single parent within both breeding programs and maturity groups (data
not shown). This suggests that there likely had been selection during
breeding for traits associated with specific alleles.

Table 1.2. Contribution of alleles from parent cultivars to selected
progeny of the crosses Williams by Essex and Williams by Ransom.

Williams by Essex
                        Williams alleles    Essex alleles
Progeny name            Number  Percent     Number  Percent    Prob.†
NKC-393                   11      65           6      35        0.09
NKB-335                   12      71           5      29        0.05*
NKS42-40                  10      59           7      41        0.15
Total Northrup King       33      65          18      35        0.01**
P9441                      6      35          11      65        0.09
P9471                      5      29          12      71        0.05*
Total Pioneer             11      32          23      68        0.02*
A3127                      5      29          12      71        0.05*
A3860                      8      47           9      53        0.19
A3966                     10      59           7      41        0.15
A4268                     10      59           7      41        0.15
A5308                     11      65           6      35        0.09
Total Asgrow              44      52          41      48        0.08

Williams by Ransom
                        Williams alleles    Ransom alleles
Progeny name            Number  Percent     Number  Percent    Prob.
Gnome                      9      43          12      57        0.14
Elf                        9      43          12      57        0.14
Pixie                      5      24          16      76        0.01**
Sprite                     8      38          13      62        0.10
Hobbit                     8      38          13      62        0.10
Total Ohio St. Univ.      39      37          66      63       <0.01**

† The probability value is for the allele distributions given and is
calculated from the binomial frequency distribution assuming the null
hypothesis of allele frequencies of 0.5.
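
The per-cultivar probability values reported in Table 1.2 can be
reproduced as binomial point probabilities, as the following Python
sketch illustrates. The function name is hypothetical. The value
computed is the probability of exactly the observed split of alleles
under equal parental contribution; it is not a cumulative (tail)
probability.

```python
from math import comb


def binomial_point_probability(k, n, p=0.5):
    """Probability of exactly k alleles from one parent out of n loci."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Allele counts taken from Table 1.2.
nkc_393 = binomial_point_probability(11, 17)  # 11 Williams alleles of 17
nkb_335 = binomial_point_probability(12, 17)  # 12 Williams alleles of 17
pixie = binomial_point_probability(5, 21)     # 5 Williams alleles of 21
```

Rounded to two decimals, these give 0.09, 0.05, and 0.01, matching the
tabulated entries for NKC-393, NKB-335, and Pixie.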

Progeny of the cross Williams by Ransom were selected for high
yield, lodging resistance, and determinate growth habit, with specific
adaptation to highly productive environments (Cooper, 1995). Lines were
selected using a modified early generation testing procedure (Cooper,
1990), which resulted in selection from within inbred lines. 'Elf',
'Gnome', and 'Pixie' were selected from the same F2 line, as were
'Sprite' and 'Hobbit' (Carter et al., 1993). Elf and Gnome were
selections from the same F3 line (Cooper and Martin, 1981). Cluster
analysis of parents and progeny of the Williams by Ransom cross (Figure
1.7) shows greater association of progeny with Ransom than with Williams.

 

 

[Phenogram. Axis: RFLP distance coefficient. Lines shown, in plotted
order: Williams, NKB-335, A5308, NKC-393, A4268, A3127, A3860, A3966,
NKS42-40, Essex, P9441, P9471.]

Figure 1.6 Phenogram showing the relationships of the parents and
selected progeny of the cross Williams by Essex.

This was not expected because the progeny (MG I-IV) were selected for
adaptation to an early maturing environment more amenable to Williams
(MG III) than to Ransom (MG VII). The lines were similar in yield to
Williams when grown in this environment. This shows that favorable gene
combinations can be introgressed from gene pools outside areas of
adaptation, especially in conjunction with specific traits, such as
determinate growth habit.

 

 

[Phenogram. Axis: RFLP distance coefficient. Lines shown, in plotted
order: Williams, Ransom, Pixie, Elf, Sprite, Hobbit, Gnome.]

Figure 1.7 Phenogram showing the relationships of the parents and
selected progeny of the cross Williams by Ransom.

CONCLUSIONS

RFLP fingerprinting can be a valuable tool for cultivar
identification. The 50 clones revealed 57 independently segregating loci
that completely distinguished the 95 lines evaluated. The lower
diversity exhibited by RFLP alleles in soybean necessitates sampling
more loci for identification by genetic fingerprinting or estimation of
genetic distance than would be required for crops such as maize, which
exhibits PIC values closer to 0.80 (Smith et al., 1990). Uniquely
identifying closely related lines using molecular markers further
increases the number of loci required. The number of markers required to
differentiate two lines for at least two loci is increased by a factor
of 1/(1-F) for any comparison, where F is the coefficient of parentage
between the lines in question. The probability of detecting differences
between lines with a given marker set is a function of the allelic
diversity (PIC) revealed by the markers used. For example, finding a
marker difference 99.99% of the time in two unrelated (coefficient of
parentage = 0) lines would require analysis at 14 loci with a PIC of 0.5
(p = (0.5)^14 < 0.0001). However, if the two lines were related (e.g.,
F = 0.8), the number of loci required to distinguish the two lines with
the same probability would be 14/(1-F) = 70. Thus, genetic
fingerprinting must be defined on the basis of both the marker set and
the probability of relatedness among the individuals being examined.
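
The arithmetic in the example above can be sketched in Python. This is
an illustration under stated assumptions, not the procedure used in the
study: it assumes that two unrelated lines match at a locus with
probability (1 - PIC), which equals 0.5 in the example, and it applies
the 1/(1-F) scaling described in the text. The function name
`loci_required` is hypothetical.

```python
import math

def loci_required(pic, confidence, f=0.0):
    """Return (loci for unrelated lines, loci after scaling by 1/(1 - F))."""
    # Assumption: unrelated lines match at one locus with probability 1 - PIC.
    p_match = 1.0 - pic
    p_miss = 1.0 - confidence
    # Smallest n with p_match ** n <= p_miss (small tolerance for exact cases).
    unrelated = math.ceil(math.log(p_miss) / math.log(p_match) - 1e-9)
    # Rounding before ceil guards against floating-point noise in the division.
    related = math.ceil(round(unrelated / (1.0 - f), 9))
    return unrelated, related

print(loci_required(0.5, 0.9999))         # unrelated lines, F = 0
print(loci_required(0.5, 0.9999, f=0.8))  # related lines, F = 0.8
```

With PIC = 0.5 and a 99.99% detection target, the sketch returns 14
loci for unrelated lines and 70 loci for F = 0.8, in agreement with the
figures given in the text.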

Although selection practiced in breeding programs and random
fixation may have reduced variability among elite lines, we found that
few RFLP alleles have been lost in elite germplasm. At the same time,
the PIs evaluated contained no alleles not already present in the elite
gene pool or available in the original ancestral lines. It is possible
that these results are an artifact of the clones selected for use in
this study. Because RFLP markers seldom reveal more than two alleles,
even across diverse soybean types, and we used clones that were
previously shown to be polymorphic in elite germplasm, there would be
only a limited chance of finding new alleles in ancestral lines or PIs.
Perhaps clones that were monomorphic in elite material would reveal new
alleles in the ancestral lines or PIs.

The relationships given by cluster analysis using RFLP data are
generally in agreement with known genetic relationships estimated by
pedigree. Cultivars and elite lines were associated with ancestor(s),
which in turn were clustered according to geographical area of origin.
However, RDs should be more accurate than GDs since they account for
unknown relationships among primary breeding parents. A wide range of
genetic distances was obtained using RFLPs. If RFLP genetic distances
are accurate measures of true genetic differences, significant progress
in performance from recombinant inbreds should still be possible from
crosses using parents selected from within elite germplasm. The
usefulness of either RFLP genetic distance estimates or coefficients of
parentage for selecting parents in breeding programs remains to be
determined by the relative successes of the two methods in the
production of transgressive segregants for release as new cultivars.

Selection within breeding programs for adaptation to particular
growing areas or specific traits can result in a significant deviation
from genetic relationships estimated by the coefficient of parentage.
RFLP molecular distance coefficients are probably more accurate measures
of genetic relationship than coefficients of parentage, even when
pedigree information is available and accurate.

The association of progeny with a parent outside their maturity
group in selection for specific traits such as semi-dwarf, determinate
genotypes shows that unadapted germplasm can be a source for new,
favorable gene combinations. The clustering of ancestors or cultivars
and elite lines associated with a defined geographical area may be a
result of random fixation during long term recurrent selection within
adapted gene pools. It is not necessarily indicative of selection of
alleles required for that environment.

The northern elites used in our study were a sample of lines from
breeding programs in the northern US, and many of these were selected
for use as parents in our breeding program. If the limited parentage of
the lines we used is representative of the kind of selection occurring
in breeding programs in general, it indicates a trend toward reduction
in diversity within soybean germ plasm in the northern US.

SECTION TWO

THE RELATIONSHIP BETWEEN GENETIC DISTANCE AND GENETIC VARIANCE

INTRODUCTION

Plant breeders who develop inbred cultivars take advantage of
genetic diversity between parents to produce agronomically superior
cultivars through new combinations of available genes. According to
quantitative theory, population genetic variance of a metric trait, such
as yield, is the result of simultaneous segregation of many genes
affecting that trait. Assuming no epistasis, a random population of
inbred lines resulting from a cross between two highly inbred lines has
genetic variance equal to $\sum a_i^2$, where $a_i$ is the genotypic
value of the homozygote at locus i, a quantitative trait locus affecting
yield (Falconer, 1989). Assuming that genes affecting yield are many and
randomly distributed throughout the genome, crossing high-performing
lines from distinct genetic backgrounds should provide the greatest
chance of pyramiding genes in combinations which result in progeny that
outperform either parent. Accurate genetic distance measurements would
then aid breeders in selecting diverse parental combinations. However,
genetic distance calculated from random sampling of the genome will fail
to account for either epistasis or loci with relatively large effects on
yield.
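
The variance expression above can be illustrated numerically. The
following is a minimal sketch, assuming unlinked loci, no epistasis, no
selection, and equal frequency of the two homozygotes at every
segregating locus; the effect sizes and function names are hypothetical
and chosen only for illustration.

```python
from itertools import product

def ril_variance_by_enumeration(effects):
    """Variance among all equally likely inbred genotypes (+a or -a per locus)."""
    values = [sum(sign * a for sign, a in zip(signs, effects))
              for signs in product((1, -1), repeat=len(effects))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def ril_variance_formula(effects):
    """Sum of squared homozygote effects over loci."""
    return sum(a * a for a in effects)

effects = [2.0, 1.0, 0.5]  # hypothetical homozygote effects at three loci
print(ril_variance_by_enumeration(effects), ril_variance_formula(effects))
```

Under these assumptions the enumerated variance and the sum of squared
effects agree, which is why parents differing at more loci, or at loci
with larger effects, are expected to give more variable progeny.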

Souza and Sorrells (1991) found that variance for biomass among
igjderived families in oat (Avena sativa L.) initially increased with
genetic distance between the parents of the crosses based on coefficient

of parentage, but decreased as distances increased beyond a certain

point. Genetic variance for grain yield, test weight, heading date,
maturity date, and grain filling period all decreased with increasing
genetic distance. They suggested this negative relationship might have
been a result of including parents unadapted for the region in which the
experiments took place.

Restriction Fragment Length Polymorphism (RFLP) (Southern, 1975)
and Random Amplified Polymorphic DNA (RAPD) (Williams et al., 1990)
molecular markers can provide estimates of genetic distance if they
relate to average differences in coding or regulating sequences.
Distance estimates between parents using RFLPs have successfully
predicted progeny performance in some cases. Lee et al. (1989) showed
that RFLP genetic distance between parents was correlated with grain
yield (R = 0.46) and specific combining ability (R = 0.74) in resulting
maize hybrids. Smith et al. (1990) found r² values from regressing
hybrid grain yield and grain yield heterosis on RFLP genetic distances
between maize parent lines to be 0.87 and 0.77, respectively.

Other studies have shown little association between RFLP genetic
distance and progeny performance. Melchinger et al. (1990) found that
the correlation of parent distances with F1 performance in maize was
positive and significant, but too small to be of predictive value,
especially between crosses of unrelated lines. Godshalk et al. (1990)
investigated the relationship between hybrid performance and
RFLP-derived genetic distances using inbred maize lines crossed with
four testers. They found that while RFLP markers could be used to assign
maize inbreds to heterotic groups, there was no relationship between
RFLP genetic distance and hybrid performance. Moser and Lee (1994) found
no significant relationship between marker genetic distances of parents
and hybrid grain yield in oats. The only significant relationship
between RFLP distance among parents and progeny genetic variance was for
straw yield in one of two years. Martin et al. (1995) examined the
relationship between molecular marker diversity and hybrid yield in
wheat using sequence tagged sites (Olson et al., 1989). Genealogical
distance was significantly correlated with marker genetic distance (r =
0.68), but they found no significant association for either of the
genetic distance estimates with F1 grain yield, SCA effects, or
heterosis.

Thormann et al. (1994) showed RFLP and RAPD markers to be very
similar for estimating genetic distances within cruciferous species (r =
0.96), although the number of markers required for a coefficient of
variation (CV) of the distance estimate of 10% was 327 for RAPD markers
and 294 and 288 for RFLP markers selected from a genomic and a cDNA
library, respectively. Comparison of genetic relationships among
Brassica oleracea L. genotypes by dos Santos et al. (1994) also showed
that RFLP and RAPD markers provide equal resolution. Bootstrap estimates
of the CV of either marker type showed no significant differences for
either the slope or intercept of the plot of CV vs number of markers.
Jain et al. (1994) showed no direct correlation of RAPD genetic
distances of parents with heterosis in Brassica juncea (L.) Czern. and
Coss., but cluster analysis was useful in identifying heterotic groups.

There is no published information to date on the relationship of
the genetic distance between parents of crosses and the genetic
variation in the progeny for soybean (Glycine max L. Merr.). The
objective of this research was to study this relationship. The extent of
relatedness among elite soybean lines may increase the effectiveness of
genetic distance estimates among parents in predicting progeny
performance.

MATERIALS AND METHODS
Distance Analysis

Genetic distances were estimated for forty-six soybean cultivars
and lines (Table 2.1) using RFLP, RAPD, a combination of RFLP and RAPD
markers, and pedigree analyses. The cultivars and lines evaluated had
previously been used as parents in the Michigan State University soybean
breeding program. Fifty-seven polymorphic RFLP loci were obtained by
hybridizing each of 50 clone/restriction enzyme combinations (Table 1.2)
to total genomic DNA digested with one of five restriction enzymes.
Detailed protocols are given in the Materials and Methods of section one.

RAPD analyses were performed using 43 decamer primers (Table 2.2)
obtained from Operon Technologies Inc., Alameda, CA (kits AA-AZ).
Primers were screened prior to use for ability to reveal polymorphism
among a sample of eight soybean lines from various breeding programs
in the northern US. Reactions were performed in 25 µl volumes
containing 50 mM Tris, pH 8.5, 3 mM MgCl2, 200 µM each dNTP, 2 units
Stoffel fragment (Perkin-Elmer, Norwalk, CT), and 25 ng each of primer
and template DNA. The reactions were loaded into 200 µl thin-walled
reaction tubes and placed in a GeneAmp 9600 thermal cycler (Perkin-Elmer
Cetus Corp., Norwalk, CT). DNA was amplified using a cycling profile of
4 min at 94° C followed by 3 cycles of 15 s/94° C, 15 s/35° C, 45 s ramp
to 72° C, 75 s/72° C; 34 cycles of 15 s/94° C, 15 s/40° C, 45 s ramp to
72° C, 75


Table 2.1 Cultivars and lines used as parents.

Population set A        Population set B

A2234                   A2234¹
A2943                   A2543
A84-185032              A2936
A85-293033              A88-221013
A86-103027              AC89-145013
AP 1989                 AC89-221013
ARCHER                  BERT
BEESON 80               BROCK
BURLISON                C1786
CENTURY 84              C1797
CONRAD                  C1817
E90009                  E90006
E90012                  E90010
E90013                  HAROSON
E87223                  IA 2007¹
ELGIN 87                IA 2008
HC84-2001               LN86-983
HACK                    NKS19-90¹
HOYT                    NKS20-26
IA 2007                 P9273
KENWOOD                 RCAT-ANGORA
M82-946
NKS19-90
NKS23-12
PELLA 86
SIBLEY

¹ Parent was used in both populations.

Table 2.2 Primers¹ used in RAPD analysis.

Primer   Sequence      Primer   Sequence      Primer   Sequence
number   5' to 3'      number   5' to 3'      number   5' to 3'

AA 01    AGACGGCTCC    AD 05    ACCGCATGGG    AI 11    ACGGCGATGA
AA 02    GAGACCAGAC    AD 08    GGCAGGCAAG    AI 12    GACGCGAACC
AA 15    ACGGAAGCCC    AD 11    CAATCGGGTC    AI 15    GACACAGCCC
AA 17    GAGCCCGACT    AE 03    CATAGAGCGG    AI 16    AAGGCACGAG
AA 18    TGGTCCAGCC    AE 05    CCTGTCAGTG    AI 19    GGCAAAGCTG
AB 01    CCGTCGGTAG    AE 09    TGCCACGAGG    AJ 02    TCGCACAGTC
AB 04    GGCACGCGTT    AE 19    GACAGTCCCT    AJ 06    GTCGGAGTGG
AB 09    GGGCGACTAC    AG 04    GGAGCGTACT    AJ 09    ACGGCACGCA
AB 20    CTTCTCGGAC    AG 08    AAGAGCCCTC    AJ 11    GAACGCTGCC
AC 02    GTCGTCGTCT    AH 06    GTAAGCCCCT    AJ 12    CAGTTCCCGT
AC 05    GTTAGTGCGG    AH 08    TTCCCGTGCC    AJ 15    GAATCCGGCA
AC 06    CCAGAACGGA    AH 09    AGAACCGAGG
AC 08    TTTGGGTGCC    AH 14    TGTGGCCGAA
AC 12    GGCGAGTGTG    AH 17    CAGTGGGGAG
AC 19    AGTCCGCCTG    AH 18    GGGCTAGTCA
AD 01    CAAAGGGCGG    AI 09    TCGCTGGTGT

¹ Primers were obtained from Operon Technologies Inc., Alameda, CA.
Operon primer numbers are given, followed by their nucleotide sequence
from the 5' to 3' direction.

s/72° C; and a final extension period of 7 min at 72° C. Reactions were
kept at 4° C overnight, and 20 µl of the completed amplification
reaction mixture was run in 1.4% agarose gels.
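
For readers reproducing the amplification, the cycling profile
described above can be written down as a small data structure. This is
only a sketch of one possible encoding: the variable and function names
are hypothetical, and the total counts programmed step times only
(instrument ramping other than the stated 45 s ramp is ignored).

```python
# Each step is (label, temperature in degrees C, duration in seconds).
def cycle(anneal_temp):
    return [("denature", 94, 15),
            ("anneal", anneal_temp, 15),
            ("ramp to 72", 72, 45),
            ("extend", 72, 75)]

# The profile is a list of (number of repeats, steps) stages.
PROFILE = [
    (1, [("initial denaturation", 94, 4 * 60)]),
    (3, cycle(35)),    # 3 low-stringency cycles, 35 C annealing
    (34, cycle(40)),   # 34 cycles with 40 C annealing
    (1, [("final extension", 72, 7 * 60)]),
]

def total_seconds(profile):
    """Sum of programmed step durations over all stages."""
    return sum(repeats * sum(step[2] for step in steps)
               for repeats, steps in profile)

print(total_seconds(PROFILE) / 60, "minutes of programmed time")
```

Encoding the two annealing temperatures through one `cycle` helper
keeps the only difference between the 3-cycle and 34-cycle stages
explicit.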

Each polymorphic RFLP or RAPD fragment was scored as present or
absent, and genetic distance among the genotypes was calculated using
the complement of the simple matching coefficient (1 - n'/n, where n' is
the number of alleles two lines have in common and n is the total number
of alleles scored in each comparison). Combined distances were
calculated from a matrix of all RFLP and RAPD marker data. Because the
RAPD markers were mostly dominant and the RFLP markers were mostly
codominant, the RFLP markers were scored as either present or absent for
one allele per locus to give equal weight to each marker type. Where
both RFLP alleles were present in a heterogeneous mixture, the marker
was scored as present. This occurred in 31 out of a total of 2668 cases,
and resulted in a small amount of error compared with RFLP analysis
where both alleles were scored. This same error is inherent in RAPD
markers that are dominant. Genealogical distance (GD) was calculated as
the complement of the coefficient of parentage as previously described
in section one.
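
The distance measure described above can be sketched as follows. This
is a minimal illustration, assuming that each line is represented by a
list of 1 (present) and 0 (absent) scores in the same marker order, and
that "alleles in common" means marker scores that agree. Skipping
markers with a missing score (`None`) is an assumption of this sketch,
not a documented step of the analysis.

```python
def simple_matching_distance(line_a, line_b):
    """Complement of the simple matching coefficient, 1 - n'/n."""
    # Keep only markers scored in both lines (assumed handling of missing data).
    pairs = [(a, b) for a, b in zip(line_a, line_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no markers scored in both lines")
    n_common = sum(1 for a, b in pairs if a == b)  # n': matching scores
    return 1.0 - n_common / len(pairs)             # n: markers compared

# Hypothetical presence/absence scores for two lines at four markers
print(simple_matching_distance([1, 0, 1, 1], [1, 1, 1, 0]))
```

Because shared absence counts as a match, this measure treats both
states of a dominant marker symmetrically, which is consistent with
scoring one allele per RFLP locus to weight both marker types equally.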

Field Evaluation

Two sets of single seed descent populations were evaluated in
field tests. The populations were all derived from two-parent crosses.
Set A included 22 populations of Ft, lines evaluated in 1993 and a
subgroup of fourteen populations evaluated as F,,5 lines in 1994.

Subgroup populations were selected to provide a wide range in genetic

distance and germ plasm diversity. For each population, 28 lines and the
two parents were tested in each year. The tests were blocked by
population and lines were randomized within each population. In 1993,
the populations were sown on May 20 at the Michigan State University
farm near Mason, MI. Thirty seeds of each line were sown in plots 91 cm
long with a 76 cm row spacing and a 91 cm alley between ranges. Rows of
plots were bordered on each side with a continuous row of ’Dimon’. The
test was replicated 3 times using a randomized complete block design.
Plots were harvested for yield measurement over a period of several
weeks beginning in the middle of October. In 1994, populations were
evaluated at 2 locations; at the Michigan State University farm in East
Lansing, MI and near Britton, MI with 2 replications at each location.
The planting dates were May 13 for Britton and May 16 for East Lansing.
Plots consisted of two 2.74 m rows with 91 cm between ranges and row
spacing of 76 cm. Both rows were harvested to estimate yield. Harvest
dates were October 13 for Britton and October 18 and October 22 for East
Lansing.

Set B included 25 populations of F,,5 lines evaluated in 1994 and a
subgroup of ten populations evaluated as F“, lines in 1995. For each
population, 48 lines and the two parents were tested in each year. In
1994, the populations were sown on May 24 at the Michigan State
University farm in East Lansing, MI, using the same experimental design
as the 1993 test for set A. Plots were harvested over a period of
several weeks, beginning in the middle of October. In 1995, populations
were evaluated at the Michigan State University farm near Mason, MI and
near Britton, MI with 2 replications at each location. The experimental
design and plot layout were the same as those for the 1994 test for set A.

Planting dates were May 22 near Britton and June 1 at Michigan State
University, and harvest dates were November 8 and October 13,
respectively.

In all plots, fertilizer rates per ha were 6.7 kg N, 26.9 kg P,
and 26.9 kg K. All plots except Mason, MI received 0.56 kg/ha Lexone®
(4-amino-6-(1,1-dimethylethyl)-3-(methylthio)-1,2,4-triazin-5(4H)-one)
and 4.7 l/ha Lasso® (2-chloro-2',6'-diethyl-N-(methoxymethyl)acetanilide)
incorporated into the soil prior to planting. Basagran®
(3-(1-methylethyl)-1H-2,1,3-benzothiadiazin-4(3H)-one 2,2-dioxide)
(1.2 l/ha), Concentrated Crop Oil (1.2 l/ha), and Assure®
(2-[4-[(6-chloro-2-quinoxalinyl)oxy]phenoxy]propionic acid, ethyl ester)
(0.37 l/ha) were applied post-emergence. The plots at Mason, MI were
treated as above, except 2.3 l/ha Dual®
(2-chloro-N-(2-ethyl-6-methylphenyl)-N-(2-methoxy-1-methylethyl)acetamide)
was applied in place of Lasso.

Maturity date was recorded as the day on which 95% of the pods had
reached mature pod color (R8) (Fehr and Caviness, 1977). Plant height
was recorded as inches from the ground to the average terminal node of a
group of plants randomly chosen toward the center of the plot. Lodging
index was a subjective score of 1 through 5, where 1 indicated that
plants were almost completely vertical, and 5 indicated that the main
stem was lying flat on the ground. Both plant height and lodging were
measured at full maturity, just prior to harvest.

Genetic variances of populations were estimated from algebraic
combinations of mean squares (MS) (Johnson et al., 1955). The algebraic
estimate of genetic variance for the single-row plots at one location
was (MS(genotype) - MS(error))/3. Where the genotype by environment
interaction was significant, the estimate taken over two locations for
the two-row plots was (MS(genotype) - MS(genotype by location))/4;
otherwise, (MS(genotype) - MS(error))/4 was used. Negative estimates of
genetic variance are listed, but were assumed to be zero for correlation
and regression analyses. Non-significant, but positive genetic variances
were analyzed at their calculated value. The standard error of the
genetic variance was estimated using the formula

$$SE = \left[ \frac{2\,MS_1^2/(df_1+2) + 2\,MS_2^2/(df_2+2)}{(rl)^2} \right]^{0.5},$$

where $MS_1$ and $MS_2$ are the mean squares used in the algebraic
determination of the genetic variance, and $df_1$ and $df_2$ are the
degrees of freedom for those mean squares. The terms in the overall
denominator are replications (r) and locations (l).
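
The estimates described above can be sketched in Python. This is a
minimal illustration of the stated formulas; the function names and the
mean squares and degrees of freedom in the example are hypothetical and
do not come from the study.

```python
import math

def genetic_variance(ms_genotype, ms_other, reps, locs=1):
    """(MS1 - MS2) / (r * l); MS2 is MS(error) or MS(genotype by location)."""
    return (ms_genotype - ms_other) / (reps * locs)

def genetic_variance_se(ms1, df1, ms2, df2, reps, locs=1):
    """Standard error: sqrt(2*MS1^2/(df1+2) + 2*MS2^2/(df2+2)) / (r * l)."""
    total = 2 * ms1 ** 2 / (df1 + 2) + 2 * ms2 ** 2 / (df2 + 2)
    return math.sqrt(total) / (reps * locs)

def for_regression(estimate):
    """Negative estimates are treated as zero in correlation and regression."""
    return max(0.0, estimate)

# Hypothetical single-location example with 3 replications
var_g = genetic_variance(10.0, 4.0, reps=3)
se_g = genetic_variance_se(10.0, 27, 4.0, 54, reps=3)
print(var_g, round(se_g, 4))
```

Keeping the clipping step in a separate helper mirrors the text:
negative estimates are still reported as calculated and are set to zero
only when they enter the correlation and regression analyses.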

RESULTS

Fifty clones revealed fifty-seven independent polymorphic RFLP
loci (Table 1.2). The clones mapped to 14 of 23 linkage groups (Lorenzen
et al., 1995). Nine linkage groups were not covered (I, N, O, Q, S, U,
V, W, X), although 6 of the 9 (Q, S, U, V, W, X) were small and had only
2-4 markers per group. Forty-three decamer primers revealed 78
polymorphic RAPD loci. The map locations of the RAPD markers are not
known because they have not been incorporated into a map of the soybean
genome. Genetic distances between the parents and genetic variances for
several agronomic traits are given in Table 2.3 through Table 2.6.

For the parents of population set A, RFLP genetic distances (RFD)
averaged 0.38 with a range of 0.24 to 0.54, RAPD distances (RPD)
averaged 0.31 with a range of 0.23 to 0.39, while distance estimates
from a combination of both RFLP and RAPD marker data (CMD) averaged 0.36
with a range of 0.27 to 0.43 (Table 2.3). Genealogical distances (GD)
between parents of populations averaged 0.84 and ranged from 0.73 to
0.95. Genealogical distance was not correlated with either RFD or RPD,
although it was significantly correlated with CMD (Table 2.7). RPD was
significantly correlated with RFD, and CMD was significantly correlated
with both RFD and RPD.

The parents of population set B were more closely related than the
parents of population set A. The RFD between parents of population set B
averaged 0.36 with a range of 0.20 to 0.48, RPD averaged 0.26

[Tables 2.3 through 2.6: genetic distances between the parents and
genetic variances for several agronomic traits, by population set and
year. The table bodies are not legible in this copy.]


Table 2.7 Correlations and P-values among genetic distance measures¹ for
the parents of population sets.

Population set A

         RFD      RPD      CMD      GD
RFD      -        .55**    .93**    .42
                  <.009    <.001    <.053
RPD               -        .78**    .41
                           <.001    <.06
CMD                        -        .44*
                                    <.04

Population set B

         RFD      RPD      CMD      GD
RFD      -        .42*     .88**    .79**
                  <.04     <.001    <.001
RPD               -        .70**    .50*
                           <.001    <.02
CMD                        -        .75**
                                    <.001

*, ** Significant at the 0.05 and 0.01 levels, respectively.
¹ RFD = RFLP distance, RPD = RAPD distance, CMD = combined RFLP and RAPD
distance, GD = genealogical distance (1 - coefficient of parentage).

with a range from 0.13 to 0.36, and CMD averaged 0.28 with a range of
0.18 to 0.39 (Table 2.4). GD between parents of population set B
averaged 0.81 with a range of 0.58 to 0.94. In contrast to parents in
set A, all distances calculated between parents in set B were
significantly correlated with one another (Table 2.7).

Two populations in set B were not included in the analysis.
Population 6 was excluded from the 1995 analysis because, at one
location, 14 of the 48 progeny lines along with the parent 'NKS20-26'
were devastated by an undiagnosed disease. The algebraic estimate of the
yield genetic variance using the remaining progeny in population 6 fit
well within the linear regression model of yield genetic variance versus
RFLP distance (data not shown), but the variance was non-significant
according to the F-test. This could have been a result of the loss of
degrees of freedom from the reduced number of progeny included in the
analysis. Also, analysis of variance showed significant genotype by
environment interaction among the remaining progeny.

The yield genetic variance of population 17 was almost twice that
of any other population in set B in both 1994 and 1995 (Table 2.4 and
Table 2.6), although the genetic distance between the parents was
moderate. Because of its disproportionately large yield genetic
variance, population 17 was tested as an outlier according to the
procedures given by Snedecor and Cochran (1967), using the standard
error of the individual estimate:

$$s_{\hat{Y}} = s_{y \cdot x}\left[1 + \frac{1}{n-1} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right]^{1/2}$$

and:

$$t = (Y_0 - \hat{Y}_0)/s_{\hat{Y}}$$

where $s_{y \cdot x}$ is the standard deviation from regression, n is
the number of data points including the outlier, $\bar{X}$ is the mean
genetic distance, $\sum x^2$ is the sum of squared deviations of the
distances from their mean, $X_0$ is the distance associated with the
outlier, and $Y_0$ and $\hat{Y}_0$ are its observed and predicted
genetic variances. The P-value associated with t is set to nP. In all
cases where the regression of yield genetic variance on genetic distance
was significant (population 17 omitted), population 17 was a significant
outlier (nP < 0.05). Correlations were calculated with and without
population 17.
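
The outlier test can be sketched in Python. This is an illustration
under stated assumptions rather than the exact computation used here:
it assumes the regression is fitted to the n - 1 points that remain
after omitting the suspect population, so that t has n - 3 degrees of
freedom, and it leaves the P-value lookup (and its multiplication by n)
to a t-table. The function name and the data are hypothetical.

```python
import math

def outlier_t(xs, ys, idx):
    """t statistic and df for testing point idx against the other points."""
    n = len(xs)
    rx = [x for i, x in enumerate(xs) if i != idx]
    ry = [y for i, y in enumerate(ys) if i != idx]
    m = n - 1                                    # points used in the fit
    xbar, ybar = sum(rx) / m, sum(ry) / m
    sxx = sum((x - xbar) ** 2 for x in rx)       # sum of squared deviations
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(rx, ry))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(rx, ry))
    df = m - 2                                   # equals n - 3
    s_yx = math.sqrt(sse / df)                   # SD from regression
    x0, y0 = xs[idx], ys[idx]
    s_ind = s_yx * math.sqrt(1 + 1 / m + (x0 - xbar) ** 2 / sxx)
    return (y0 - (intercept + slope * x0)) / s_ind, df

# Hypothetical distances (x) and genetic variances (y); last point is suspect
t, df = outlier_t([0, 1, 2, 3, 1.5], [0, 1.1, 1.9, 3.0, 5.0], idx=4)
print(round(t, 2), df)
```

Multiplying the resulting P-value by n, as described in the text,
accounts for the suspect point having been chosen after inspection as
the most extreme of the n populations.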

In all the experiments, most populations exhibited significant
genetic variance for all traits measured (Table 2.3 through Table 2.6).
The exception was yield genetic variance in population set A in 1994,
when only 5 out of 14 populations had significant yield genetic variance
in the two-row plots.

There were no significant correlations between any of the
distances and genetic variance estimates from the populations for set A
in the 1993 single-row plots (Table 2.8). In the 1994 evaluation of the
set A populations in two-row plots, RFD (Figure 2.1), RPD (Figure 2.2),
and CMD (Figure 2.3) were all negatively correlated with yield genetic
variance.

Yield genetic variance of populations in set 8 was significantly
related to RFD in the 1994 single-row plots, with an r of 0.41 (Table
2.9). Maturity genetic variance was also significantly correlated with
both RFD and GD for these populations in 1994. In the 1995 two-row
plots, there were no significant correlations between genetic distance
and genetic variance for any trait.

When population 17 was excluded from the analysis of set B
populations, the correlations between genetic distance and genetic

variance for yield and maturity generally increased (Table 2.10). The

Table 2.8 Correlation coefficients and P-values of genetic distance estimates between parents with genetic variances of several agronomic traits for population set A. Rows: RFD, RPD, CMD, and GD; columns: genetic variance for yield, height, maturity, and lodging.

Table 2.9 Correlation coefficients and P-values of genetic distance estimates between parents with genetic variances of several agronomic traits for population set B. Rows: RFD, RPD, CMD, and GD; columns: genetic variance for yield, height, maturity, and lodging.

Table 2.10 Correlation coefficients and P-values of genetic distance estimates between parents with genetic variances of several agronomic traits for population set B. Population 17 is omitted. Rows: RFD, RPD, CMD, and GD; columns: genetic variance for yield, height, maturity, and lodging.

Footnotes to Tables 2.8 through 2.10:
*, ** Significant at the 0.05 and 0.01 levels, respectively.
RFD = RFLP distance, RPD = RAPD distance, CMD = combined RFLP and RAPD distance, GD = genealogical distance (1 - coefficient of parentage).

[Plot: yield genetic variance versus RFLP genetic distance; R = -.62, P < .02]

Figure 2.1 Scatterplot of yield genetic variance versus RFLP genetic
distance for population set A in the 1994 two-row plots.

 

 

 

 

[Plot: yield genetic variance versus RAPD genetic distance; R = -.56, P < .04]

Figure 2.2 Scatterplot of yield genetic variance versus RAPD genetic
distance for population set A in the 1994 two-row plots.

[Plot: yield genetic variance versus combined genetic distance; R = -.74, P < .003]

Figure 2.3 Scatterplot of yield genetic variance versus genetic distance
from the combined analysis of RFLP and RAPD data for population set A in
the 1994 two-row plots.

correlations of yield genetic variance with RFD (Figure 2.4) and CMD
(Figure 2.5) were now significant in 1994 and 1995. Maturity genetic
variance remained significantly correlated with RFD (Figure 2.6) and GD
(Figure 2.7) for the 1994 tests, and was significantly correlated with
RPD (Figure 2.8), CMD (Figure 2.9), and GD (Figure 2.10) in the 1995
tests.
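The effect of omitting a single population on a correlation can be illustrated with a short Python sketch. The data below are hypothetical and the helper names (`pearson_r`, `r_without`, `t_from_r`) are ours; the sketch only shows the mechanics of computing r with and without selected populations and the t statistic used to test r.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def r_without(x, y, omit):
    """Correlation after dropping the observations whose indices are in `omit`."""
    keep = [i for i in range(len(x)) if i not in omit]
    return pearson_r([x[i] for i in keep], [y[i] for i in keep])

def t_from_r(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical parental genetic distances and progeny genetic variances;
# the last population has an unusually large variance for its distance.
distance = [0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.28]
variance = [10.0, 13.0, 14.0, 18.0, 19.0, 23.0, 45.0]

r_all = pearson_r(distance, variance)
r_omit = r_without(distance, variance, omit={6})
print(f"all populations: r = {r_all:.2f}, t = {t_from_r(r_all, 7):.2f}")
print(f"one omitted:     r = {r_omit:.2f}, t = {t_from_r(r_omit, 6):.2f}")
```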

While the genetic variance of a population may be dependent on the
allelic difference between the two parents, the population mean is
usually a function of the parent means. Regression of mean yield of each
population with its mid-parent yield was positive and significant for
parents of both population sets in 1994 and 1995 two-row plots (Figure
2.11). This relationship was not evaluated for the 1993 and 1994 1-row

[Plot, panel a) 1994 1-row plots: yield genetic variance versus RFLP genetic distance; R = .65, P < .0001 (Pop. 17, A2234/P9273, omitted)]
[Plot, panel b) 1995 2-row plots: yield genetic variance versus RFLP genetic distance; R = .72, P < .05 (Pop. 6 and 17 omitted)]

Figure 2.4 Scatterplot of yield genetic variance versus RFLP genetic
distance for population set B. a) 1994 single-row plots b) 1995 two-row
plots.

[Plot, panel a) 1994 1-row plots: yield genetic variance versus combined genetic distance; R = .47, P < .03 (Pop. 17, A2234/P9273, omitted)]
[Plot, panel b) 1995 2-row plots: yield genetic variance versus combined genetic distance; R = .75, P < .04 (Pop. 6 and 17 omitted)]

Figure 2.5 Scatterplot of yield genetic variance versus genetic distance
from the combined analysis of RFLP and RAPD data for population set B.
a) 1994 single-row plots b) 1995 two-row plots

[Plot: maturity genetic variance versus RFLP genetic distance; R = .52, P < .01 (Pop. 17, A2234/P9273, omitted)]

Figure 2.6 Scatterplot of maturity genetic variance versus RFLP genetic
distance for population set B in the 1994 single-row plots.

 

[Plot: maturity genetic variance versus genealogical distance; P < .002 (Pop. 17, A2234/P9273, omitted)]

Figure 2.7 Scatterplot of maturity genetic variance versus genealogical
distance for population set B in the 1994 single-row plots.

[Plot: maturity genetic variance versus RAPD genetic distance; R = .72, P < .05 (Pop. 6 and 17 omitted)]

Figure 2.8 Scatterplot of maturity genetic variance versus RAPD genetic
distance for population set B in the 1995 two-row plots.

 

 

[Plot: maturity genetic variance versus combined genetic distance; R = .77, P < .03 (Pop. 6 and 17 omitted)]

Figure 2.9 Scatterplot of maturity genetic variance versus genetic
distance from the combined analysis of RFLP and RAPD data for population
set B in the 1995 two-row plots.

[Plot: maturity genetic variance versus genealogical distance (Pop. 6 and 17 omitted)]

Figure 2.10 Scatterplot of maturity genetic variance versus genealogical
distance for population set B in the 1995 two-row plots.

plots because of poor germination of the parent seed. A multiple
regression model including both mid-parent yield and genetic distance
was tested as a predictor of the mean of the top five yielding progeny
(MYS) within each population using the 1994 and 1995 2-row plots. The
model was a significant predictor of MYS for the populations in both
years. The relationship between RFD and yield potential was negative for
parent set A after the effects of mid-parent yield were removed (Figure
2.12). The combination of RFD and mid-parent yield (Figure 2.13) and the
combination of CMD and mid-parent yield (Figure 2.14) both provided a
predictive model in which yield potential was directly proportional to
both mid-parent yield and genetic distance for population set B.
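A two-predictor regression of this kind can be sketched as follows. This is an illustrative least-squares fit only, under the assumption of an ordinary linear model with an intercept; the function name `fit_two_predictors` and all data values are hypothetical and do not reproduce the populations analyzed here.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Corrected sums of squares and cross products
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the predictors are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Hypothetical mid-parent yields (Bu/A), parental genetic distances,
# and means of the top five yielding progeny (Bu/A).
mid_parent = [40.0, 44.0, 48.0, 52.0, 46.0, 50.0]
distance = [0.20, 0.35, 0.25, 0.40, 0.45, 0.30]
top_five = [43.5, 50.0, 51.0, 58.5, 55.0, 54.5]

b0, b1, b2 = fit_two_predictors(mid_parent, distance, top_five)
predicted = [b0 + b1 * a + b2 * b for a, b in zip(mid_parent, distance)]
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
```

A positive b2 in such a fit would correspond to yield potential increasing with genetic distance after mid-parent yield is accounted for, and a negative b2 to the reverse.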

[Plot, panel a) 1994, population set A: population mean yield (Bu/A) versus mid-parent yield (Bu/A)]
[Plot, panel b) 1995, population set B: population mean yield (Bu/A) versus mid-parent yield (Bu/A)]

Figure 2.11 Regression of population mean yield on mid-parent yield for
the two-row plots. a) Population set A b) Population set B


Figure 2.12 Multiple regression model for the prediction of the top five
yielding progeny of the 1994 two-row plots as a function of mid-parent
yield and RFLP genetic distance.


Figure 2.13 Multiple regression model for the prediction of the top five
yielding progeny of the 1995 two-row plots as a function of mid-parent
yield and RFLP genetic distance.


Figure 2.14 Multiple regression model for the prediction of the top five
yielding progeny of the 1995 two-row plots as a function of mid-parent
yield and genetic distance from the combined analysis of RFLP and RAPD
data.

DISCUSSION

Genetic distance estimates were significant predictors of genetic
variation for set B populations but not for set A populations. Poor
estimates of genetic variances for set A populations, especially in
1994, and the use of parents with 50% plant introduction (PI) in
their pedigree for set A are possible explanations for this
inconsistency.

The coefficients of variation (CV) from the statistical analyses
of population set A and population set B were not greatly different. The
CVs averaged 15.6% for population set A in 1993 and 14.4% for
population set B in 1994 in the analyses using single-row plots. The
CVs averaged 11.5% for population set A in 1994 and 10.1% for
population set B in 1995 in the analyses using two-row plots.
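For reference, the coefficient of variation of an experiment is conventionally computed from the error mean square of the analysis of variance and the grand mean. The sketch below uses hypothetical values, not mean squares from this study.

```python
import math

def coefficient_of_variation(error_mean_square, grand_mean):
    """CV (%) = 100 * sqrt(error mean square) / grand mean."""
    return 100.0 * math.sqrt(error_mean_square) / grand_mean

# Hypothetical two-row plot analysis: error mean square in (Bu/A)^2
# and grand mean yield in Bu/A.
cv = coefficient_of_variation(error_mean_square=36.0, grand_mean=50.0)
print(f"CV = {cv:.1f}%")
```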

The standard errors of the yield genetic variance (SEYV) did not
differ greatly from one population set to the other. In fact, SEYV were
somewhat higher for population set B than for population set A. For the
single-row plots (yield given in grams per plot), the 1993 SEYV for
population set A was 607, while the 1994 SEYV for population set B was
635. However, the corresponding mean yield genetic variance was 1204 for
population set A and 2049 for population set B. Only 18 of 22
populations from set A showed significant genetic variance for yield in
1993 (Table 2.3), while all but one of the 25 populations in set B
showed significant yield genetic variance in 1994 (Table 2.4). The
situation was similar for the two-row plots (yield given in Bu/A). The
1994 SEYV for population set A averaged 4.1, while the 1995 SEYV for
population set B averaged 5.3. The corresponding mean yield genetic
variance for population set A was only 3.5, while that for population
set B was 15.4 (11.4 without population 17). Yield genetic variance was
significant for only 5 out of 14 populations in set A in 1994 (Table
2.5), while for set B, all but population 6, which was severely
affected by disease, exhibited significant yield genetic variance in
1995 (Table 2.6). There was no correlation between yield genetic
variance from single-row to two-row plots for population set A, but the
correlation was high for population set B (Figure 2.15).

Six out of 14 populations in set A exhibited significant genotype
by environment variance (GEV) in 1994, while the only two populations
from set B that had significant GEV in 1995 were population 17, whose
yield genetic variance was almost twice that of the other populations in
set B, and population 6, which was severely affected by disease at one
of the two locations. The genotype by environment interactions may have
reduced the accuracy of the genetic variance estimates.

The inclusion of unadapted germplasm in the pedigrees of the
population sets tested may have reduced performance among the progeny.
Six of the 14 populations tested in two-row plots from parent set A
contained 25% plant introduction germ plasm in their pedigree, while
only 2 out of 10 two-row plot populations from parent set B contained 25%
plant introduction germ plasm. Schoener and Fehr (1979) showed that as
little as 25% plant introduction germ plasm within a population can

significantly reduce population performance. Souza and Sorrells (1991)

[Plot, panel a): yield genetic variance in 1994 versus yield genetic variance in 1993]
[Plot, panel b): yield genetic variance in 1995 versus yield genetic variance in 1994]

Figure 2.15 Correlation between yield genetic variance from the single-row
plots to the two-row plots. a) Population set A. b) Population set B.

showed declining variability in progeny with an increase in parent
genetic distance for grain yield, test weight, heading date, and
maturity date in oat. Populations in that study were a result of an
adapted parent crossed with an unadapted parent. The greater the genetic
distance between the two parents, the less well adapted one of the
parents was. Variance for biomass was positively correlated with parent
genetic distance, but the effects of distance lessened as large
distances were approached. They suggested that plant biomass may not
have been as environmentally sensitive as the other traits. While the
use of PI germ plasm in the pedigree of parents of population set A may
have been a contributing factor to high GEV and less precise estimates
of yield genetic variance, it should be noted that set B populations
containing 25-50% PI germ plasm did quite well in 1994 and 1995. In
1995, both set B populations containing PI germ plasm had significant
yield genetic variance, while neither had significant GEV. One of these
populations (23) was a cross between two parents with 50% PI germ plasm.

The mean yields of populations in set A averaged 347 g in the 1993
single-row plots and 44.0 Bu/A in the 1994 two-row plots, while those of
population set B averaged 356 g in the 1994 single-row plots and 49.6
Bu/A in the 1995 two-row plots. Because these means are from separate
years, the two population sets cannot be directly compared. However, the
lower yield potential exhibited by population set A in 1994, whether
from environmental or genetic causes, may have been a contributing
factor to the inability to accurately determine yield genetic variance
in that year.

The range of genealogical distances of the parents in set B was
0.62 to 0.94, while the range for parent set A was 0.73 to 0.95.
Among genealogically distant parents, a greater proportion of shared
alleles are alike in state than among parents with a close genealogical
relationship, whose shared alleles are primarily identical by descent.
The effects of identity by descent versus alikeness in state on genetic
distance are
unknown. Melchinger et al. (1990) found that there was no significant
relationship between parent genetic distance and hybrid performance when
only crosses between unrelated lines were considered. Marker distance
among unrelated lines is based entirely on alleles alike in state.

Field conditions for the 1994 two-row plots were less than ideal.
One location suffered early drought, causing uneven germination,
followed by heavy rains and hail. The other location had standing water
for a period of several days, and later exhibited substantial levels of
brown stem rot (Phialophora gregata). These conditions likely
contributed to genotype by environment interactions within the
populations, further reducing the amount of genetic variance calculated
among lines within the populations.

The population ’A2234’ by ’P9273’ (No. 17, Set B) exhibited yield
genetic variance disproportionate to the marker distance between the two
parents. This may have been because more of the markers were linked to
alleles of quantitative trait loci for yield that were different between
the two parents. It may also have been the result of greater epistatic
variation in this population than the others.

Other factors which may have contributed to lower precision of
genetic variance estimates in population set A compared with population
set B were the degree of inbreeding and the difference in progeny number
used in the two population sets. The populations developed from parent

set A were F,-derived, while populations from parent set B were F,-

derived. This would have resulted in a 14% smaller ratio of
among-line:within-line variance in parent set A populations compared to parent
set B populations (Falconer, 1989). Set B populations each contained 48
progeny, while set A populations contained only 28 progeny. The lower
number of entries in set A populations should have resulted in greater

error in the variance estimates.
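The dependence of precision on progeny number can be illustrated with a common approximation for the standard error of a variance component estimated from two mean squares. The sketch assumes a single-environment randomized complete block analysis in which the genetic variance is estimated as (MS lines - MS error) / r; this design, the function names, and the mean squares are assumptions for illustration and are not taken from the analyses reported here.

```python
import math

def genetic_variance_and_se(ms_lines, ms_error, df_lines, df_error, reps):
    """Estimate genetic variance and its approximate standard error.

    var_g = (MS_lines - MS_error) / r
    SE    = sqrt( (2 / r^2) * (MS_lines^2 / (df_lines + 2)
                               + MS_error^2 / (df_error + 2)) )
    """
    var_g = (ms_lines - ms_error) / reps
    se = math.sqrt((2.0 / reps ** 2) * (ms_lines ** 2 / (df_lines + 2)
                                        + ms_error ** 2 / (df_error + 2)))
    return var_g, se

def for_population(n_lines, reps, ms_lines, ms_error):
    """Degrees of freedom for n_lines entries in `reps` complete blocks."""
    df_lines = n_lines - 1
    df_error = (n_lines - 1) * (reps - 1)
    return genetic_variance_and_se(ms_lines, ms_error, df_lines, df_error, reps)

# Hypothetical mean squares, identical for both population sizes,
# so that only the number of progeny differs.
var_28, se_28 = for_population(n_lines=28, reps=2, ms_lines=60.0, ms_error=30.0)
var_48, se_48 = for_population(n_lines=48, reps=2, ms_lines=60.0, ms_error=30.0)
print(f"28 progeny: variance = {var_28:.1f}, SE = {se_28:.2f}")
print(f"48 progeny: variance = {var_48:.1f}, SE = {se_48:.2f}")
```

With the same mean squares, the population with fewer progeny has fewer degrees of freedom and therefore a larger standard error for the same variance estimate.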

CONCLUSIONS

Our data suggest that marker genetic distance estimates can
assist soybean breeders in choosing parents which will increase the
probability of transgressive segregation for yield in their progeny.
Population set B exhibited significant yield genetic variance within
almost every population, and this variance was significantly correlated
between years. Genetic distance from RFLP markers was positively and
significantly correlated with yield genetic variance in both years,
while genealogical distance was not. Marker data alone, however, will
not take the place of accuracy in field testing of both putative parents
and progeny. A multiple regression model based on RFLP or CMD marker
distance and performance data of parents was able to predict which
populations had the highest yielding progeny across a wide range of
marker distances and mid-parent yields for parents of population set B.
Strict adherence to this model, however, may exclude some parent
combinations whose specific combining ability will exceed expectations
based on the data. Population 17 of set B had the highest yield genetic
variance of all the populations in that set, yet its parents’ RFD was

lower than the average RFD between parents of populations in set B.


GENERAL CONCLUSIONS


Despite the limited genetic base in soybean germ plasm in the
northern United States, there was enough RFLP diversity to distinguish
among cultivars and lines and establish genetic relationships. Ancestral
soybean introductions were clustered according to area of origin, which
indicates that shared alleles are likely identical by descent. Genetic
distances calculated using sufficient RFLP markers are probably more
accurate than those taken from pedigree relationships, since marker data
can account for selection practiced in breeding programs. There was
little diversity lost within modern germ plasm compared with the
ancestors, and no new alleles were found unique to a set of selected
newer plant introductions.

Genetic distance between parents was generally positively
correlated with progeny genetic variance among lines with good yield
potential. In a population set with lower yields, whether due to
environmental conditions or limited genetic yield potential,
correlations were low and sometimes negative. A multiple regression
approach using RFLP genetic distance and mid-parent yield to predict the
highest yielding progeny shows promise, but was not consistent between
the two population sets examined. More data are required before a general
conclusion can be drawn, but the data presented here suggest that
genetic distance estimates based on markers, especially those obtained
with RFLP markers, can assist soybean breeders in choosing parents with
the greatest probability of producing transgressive segregates for
yield.

APPENDIX

Table A1. Allele frequencies and polymorphism information content (PIC) per locus for clone/enzyme combinations among all lines and cultivars or within groups.

LIST OF REFERENCES


Anderson, J. A., G. A. Churchill, J. E. Autrique, S. D. Tanksley, and M.
E. Sorrells. 1993. Optimizing parental selection for genetic
linkage maps. Genome. 36:181-186.

Bernard, R. L., G. A. Juvik, R. L. Nelson. 1987a. USDA soybean germplasm
collection survey. Vol. 1. International Soybean Program. Urbana,
IL BOpp.

Bernard, R. L., G. A. Juvik, R. L. Nelson. 1987b. USDA soybean germplasm
collection survey. Vol. 11. International Soybean Program. Urbana,
IL 203pp.

Borst, P. and D. R. Greaves. 1987. Programmed gene rearrangements
altering gene expression. Science. 235:658-667.

Carter, Jr., T. E., Z. Gizlice, and J. H. Burton. 1993. Coefficient-of-
parentage and genetic-similarity estimates for 258 North American
soybean cultivars released by public agencies during 1945-1988. U.
S. Department of Agriculture, Technical Bulletin No. 1814, 169 pp.

Cervantes, T., M. M. Goodman, E. Casas, and J. O. Rawlings. 1978. Use of
genetic effects and genotype by environment interactions for the
classification of Mexican races of maize. Genetics. 90:339-348.

Cowen, N. M. and K. J. Frey. 1987a. Relationship between genealogical
distance and breeding behavior in oats (Avena sativa L.).
Euphytica. 36:413-424.

Cowen, N. M. and K. J. Frey. 1987b. Relationships between three measures
of genetic distance and breeding behavior in oats (Avena sativa
L.). Genome. 29:97-106.

Committee on Genetic Vulnerability of Crops. 1972. Genetic vulnerability
of major crops. Natl. Acad. Sci. Washington D.C.

Cooper, R. L. 1990. Modified early generation testing procedure for
yield selection in soybean. Crop Sci. 30(2):417-419.

Cooper, R. L. and R. J. Martin. 1981. Registration of Gnome soybean.
Crop Sci. 21:634.


Cox, T. S., Y. T. Kiang, M. B. Gorman, and D. M. Rodgers. 1985.
Relationship between coefficient of parentage and genetic
similarity indices in the soybean. Crop Sci. 25:529-532.

Damerval, C., Y. Hébert, and D. de Vienne. 1987. Is the polymorphism of
protein amounts related to phenotypic variability? A comparison of
two-dimensional electrophoresis data with morphological traits in
maize. Theor. Appl. Genet. 74:194-202.

Delannay X., D. M. Rodgers, and R. G. Palmer. 1983. Relative genetic
contributions among ancestral lines to North American soybean
cultivars. Crop Sci. 23:944-949.

Diers, B. W. and T. C. Osborn. 1994. Genetic diversity of oilseed Brassica
napus germ plasm based on restriction fragment length
polymorphisms. Theor. Appl. Genet. 88:662-668.

dos Santos, J. B., J. Nienhuis, P. Skroch, J. Tivang, and M. K. Slocum.
1994. Comparison of RAPD and RFLP genetic markers in determining
genetic similarity among Brassica oleracea L. genotypes. Theor.
Appl. Genet. 87:909-915.

Ellsworth, D. L., K. D. Rittenhouse, and R. L. Honeycutt. 1993.
Artifactual variation in randomly amplified polymorphic DNA
banding patterns. BioTechniques. 14(2):214-217.

Falconer, D. S. 1989. Introduction to quantitative genetics. John Wiley
and Sons, Inc. pp. 264-270.

Fang, G., S. Hammar, and R. Grumet. 1992. A quick and inexpensive method
for removing polysaccharides from plant genomic DNA. Biofeedback.
13(1):52-55.

Frei, O. M., C. W. Stuber, and M. M. Goodman. 1986. Use of allozymes as
genetic markers for predicting performance in maize single cross
hybrids. Crop Sci. 26:37-42.

Gizlice, Z., T. E. Carter, Jr., and J. H. Burton. 1994. Genetic base for
North American soybean cultivars released between 1947 and 1988.
Crop Sci. 34(5):1143-1151.

Godshalk, E. B., M. Lee, and K. R. Lamkey. 1990. Relationship of
restriction fragment length polymorphisms to single-cross hybrid
performance in maize. Theor. Appl. Genet. 80:273-280.

Goodman, M. M. 1968. A measure of ’overall variability’ in populations.
Biometrics. 24:189-192.

Grabau, E. A., W. H. Davis, and B. G. Gengenbach. 1989. Restriction
fragment length polymorphism in a subclass of the ’Mandarin’
soybean cytoplasm. Crop Sci. 29:1554-1559.


Grabau, E. A., W. H. Davis, N. D. Phelps, and B. G. Gengenbach. 1992.
Classification of soybean cultivars based on mitochondrial DNA
restriction fragment length polymorphism. Crop Sci. 32:271-274.

Hallden, C., N. O. Nilsson, I. M. Rading, and T. Sall. 1994. Evaluation
of RFLP and RAPD markers in a comparison of Brassica napus
breeding lines. Theor. Appl. Genet. 88:123-128.

Hanlon, R. and E. A. Grabau. 1995. Cytoplasmic diversity in old domestic
varieties of soybean using two mitochondrial markers. Crop Sci.
35:1148-1151.

Hanson, W. D. and E. Casas. 1968. Spatial relationship among eight
populations of Zea mays L. utilizing information from a diallel
mating design. Biometrics. 24:867-880.

Heun, M., J. P. Murphy, and T. D. Phillips. 1994. A comparison of RAPD
and isozyme analyses for determining the genetic relationships
among Avena sterilis L. accessions. Theor. Appl. Genet. 87:689-696.

Heun, M., and T. Helentjaris. 1993. Inheritance of RAPDs in F1 hybrids
of corn. Theor. Appl. Genet. 85:961-968.

Jain, A., S. Bhatia, S. S. Banga, S. Prakash, and M. Lakshmikumaran.
1994. Potential use of random amplified polymorphic DNA (RAPD)
technique to study the genetic diversity in Indian mustard
(Brassica juncea) and its relationship to heterosis. Theor.
Appl. Genet. 88:116-122.

Johnson, H. W., H. F. Robinson, and R. E. Comstock. 1955. Estimates of
genetic and environmental variability in soybeans. Agron. J.
47:314-318.

Keim, P., and R. C. Shoemaker. 1988. Construction of a random
recombinant DNA library that is primarily single copy sequence.
Soybean Genet. Newsl. 15:147-148.

Keim, P., R. C. Shoemaker, and R. G. Palmer. 1989. Restriction fragment
length polymorphism diversity in soybean. Theor. Appl. Genet.
77:786-792.

Keim, P., W. Beavis, J. Schupp, and R. Freestone. 1992. Evaluation of
soybean RFLP marker diversity in adapted germplasm. Theor. Appl.
Genet. 85:205-212.

Lamkey, K. R., A. R. Hallauer, and A. L. Kahler. 1987. Allelic
differences at enzyme loci and hybrid performance in maize.
Journal of Heredity. 78:231-234.

Lee, M., E. B. Godshalk, K. R. Lamkey, and W. L. Woodman. 1989.
Association of restriction fragment length polymorphisms among
maize inbreds with agronomic performance of their crosses. Crop
Sci. 29:1067-1071.

Mahalanobis, P. C. 1936. On the generalized distance in statistics.
Proc. Natl. Inst. Sci. India. 2:49-55.

Mailer, R. J., R. Scarth, and B. Fristensky. 1994. Discrimination among
cultivars of rapeseed (Brassica napus L.) using DNA polymorphisms
amplified from arbitrary primers. Theor. Appl. Genet. 87:697-704.

Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a
laboratory manual. Cold Spring Harbor Laboratory. Cold Spring
Harbor, New York. 545 pp.

Martin, J. M., L. E. Talbert, S. P. Lanning, and N. K. Blake. 1995.
Hybrid performance in wheat as related to parental diversity. Crop
Sci. 35:104-108.

McGrath, J. M. and C. F. Quiros. 1992. Genetic diversity at isozyme and
RFLP loci in Brassica campestris as related to crop type and
geographical origin. Theor. Appl. Genet. 83:783-790.

Melchinger, A. E., M. Lee, K. R. Lamkey, and W. L. Woodman. 1990.
Genetic diversity for restriction fragment length polymorphisms:
relation to estimated genetic effects in maize inbreds. Crop Sci.
30:1033-1040.

Melchinger, A. E., J. Boppenmaier, B. S. Dhillon, W. G. Pollmer, and R.
G. Herrmann. 1992. Genetic diversity for RFLPs in European maize
inbreds: relation to performance of hybrids within versus between
heterotic groups for forage traits. Theor. Appl. Genet. 84:672-
681.

Melchinger, A. E., M. Lee, K. R. Lamkey, A. R. Hallauer, and W. L.
Woodman. 1990a. Genetic diversity for restriction fragment length
polymorphisms and heterosis for two diallel sets of maize inbreds.
Theor. Appl. Genet. 80:488-496.

Melchinger, A. E., M. Lee, K. R. Lamkey, and W. L. Woodman. 1990b.
Genetic diversity for restriction fragment length polymorphisms:
relation to estimated genetic effects in maize. Crop Sci. 30:1033-
1040.

Melchinger, A. E., M. M. Messmer, M. Lee, W. L. Woodman, and K. R.
Lamkey. 1991. Diversity and relationships among U. S. maize
inbreds revealed by restriction fragment length polymorphisms.
Crop Sci. 31:669-678.

Messmer, M. M., A. E. Melchinger, M. Lee, W. L. Woodman, E. A. Lee, and
K. R. Lamkey. 1991. Genetic diversity among progenitors and elite
lines from the Iowa stiff stalk synthetic (BSSS) maize population:
comparison of allozyme and RFLP data. Theor. Appl. Genet. 83:97-
107.

Moser, H. and M. Lee. 1994. RFLP variation and genealogical distance,
multivariate distance, heterosis, and genetic variance in oats.
Theor. Appl. Genet. 87:947-956.

Mullis, K. B., and F. Faloona. 1987. Specific synthesis of DNA in vitro
via a polymerase-catalyzed chain reaction. Meth. Enzymol. 155:335-350.

Olson, M., L. Hood, C. Cantor, and D. Botstein. 1989. A common language
for physical mapping of the human genome. Science 245:1434-1435.

Rafalski, J. A., S. V. Tingey, and J. G. K. Williams. 1991. RAPD markers
- a new technology for genetic mapping and plant breeding.
AgBiotech News and Information. 3(4):645-648.

Ramshaw, J. A. M., J. A. Coyne, and R. C. Lewontin. 1979. The
sensitivity of gel electrophoresis as a detector of genetic
variation. Genetics. 93:1019-1037.

Rohlf, F. J. 1992. NTSYS-pc: Numerical taxonomy and multivariate
analysis system (version 1.7). Exeter Software. Setauket, NY. pp
7.5-7.7.

Roth, E. J., B. L. Frazier, N. R. Apuya, and K. G. Lark. 1989. Genetic
variation in an inbred plant: variation in tissue cultures of
soybean [Glycine max (L.) Merrill]. Genetics. 121:359-368.

Skorupska, H. T., R. C. Shoemaker, A. Warner, E. R. Shipe, and W. C.
Bridges. 1993. Restriction Fragment Length Polymorphism in soybean
germplasm of the Southern USA. Crop Sci. 33:1169-1176.

Smith, J. J., J. S. Scott-Craig, J. R. Leadbetter, G. L. Bush, D. L.
Roberts, and D. W. Fulbright. 1994. Characterization of random
amplified DNA (RAPD) products from Xanthomonas campestris and some
comments on the use of RAPD products in phylogenetic analysis.
Mol. Phylogenet. Evol. 3(2):135-145.

Sneller, C. H. 1994a. Pedigree analysis of elite soybean lines. Crop
Sci. 34:1515-1522.

Sneller, C. H. 1994b. SAS programs for calculating coefficient of
parentage. Crop Sci. 34:1679-1680.

St. Martin, S. K. 1982. Effective population size for the soybean
improvement program in maturity groups 00 to IV. Crop Sci. 22:151-
152.

Schoener, C. S. and W. R. Fehr. 1979. Utilization of plant introductions
in soybean breeding populations. Crop Sci. 19:185-188.

Southern, E. M. 1975. Detection of specific sequences among DNA
fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-
517.

Souza, E. and M. E. Sorrells. 1991a. Relationships among 70 North
American oat germplasms: I. Cluster analysis using quantitative
characters. Crop Sci. 31:599-605.

Souza, E. and M. E. Sorrells. 1991b. Relationships among 70 North
American oat germplasms: II. Cluster analysis using qualitative
characters. Crop Sci. 31:605-612.

Souza, E. and M. E. Sorrells. 1991. Prediction of progeny variation in
oat from parental genetic relationships. Theor. Appl. Genet.
82:233-241.

Smith, O. S., J. S. C. Smith, S. L. Bowen, R. A. Tenborg, and S. J.
Wall. 1990. Similarities among a group of elite maize inbreds as
measured by pedigree, F1 grain yield, heterosis, and RFLPs. Theor.
Appl. Genet. 80:833-840.

Snedecor, G. W. and W. G. Cochran. 1967. Statistical Methods. The Iowa
State University Press, Ames, Iowa. pp. 157-158.

Talbert, L. E., N. K. Blake, P. W. Chee, T. K. Blake, and G. M. Magyar.
1994. Evaluation of "sequence-tagged-site" PCR products as
molecular markers in wheat. Theor. Appl. Genet. 87:789-794.

Thormann, C. E., M. E. Ferreira, L. E. A. Camargo, J. G. Tivang, and T.
C. Osborn. 1994. Comparison of RFLP and RAPD markers to estimating
genetic relationships within and among cruciferous species. Theor.
Appl. Genet. 88:973-980.

Tinker, N. A., M. G. Fortin, and D. E. Mather. 1993. Random amplified
polymorphic DNA and pedigree relationships in spring barley.
Theor. Appl. Genet. 85:976-984.

Velasquez, V. L. B., and P. Gepts. 1994. RFLP diversity of common bean
(Phaseolus vulgaris L.) in its centres of origin. Genome. 37:256-263.

Welsh, J., and M. McClelland. 1990. Fingerprinting genomes using PCR
with arbitrary primers. Nucleic Acids Res. 19(2):303-306.

Williams, J. G. K., A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S.
V. Tingey. 1990. DNA polymorphisms amplified by arbitrary primers
are useful as genetic markers. Nucleic Acids Res. 18:6531-6535.

Williams, C. E. and D. A. St. Clair. 1993. Phenetic relationships and
levels of variability detected by restriction fragment length
polymorphism and random amplified polymorphic DNA analysis of
cultivated and wild accessions of Lycopersicon esculentum. Genome.
36:619-630.

Yu, L. X. and H. T. Nguyen. 1994. Genetic variation detected with RAPD
markers among upland and lowland rice cultivars (Oryza sativa L.).
Theor. Appl. Genet. 87:668-672.

OTHER REFERENCES

Cooper, R. L., 1995. USDA/ARS, OSU/OARDC, Wooster, Ohio. Personal
Communication.

Nelson, R. 1994. USDA/ARS, Urbana, Illinois. Personal Communication.

Webb, D. 1992. Pioneer Hi-Bred International. Personal Communication.