Michigan State University

This is to certify that the dissertation entitled NOVEL STUDIES OF SPONTANEOUS MUTATION: MEASUREMENTS OF FITNESS IN THE FIELD AND GENE EXPRESSION IN THE LAB presented by ANGELA JENNIFER ROLES has been accepted towards fulfillment of the requirements for the Ph.D. degree in Zoology and Ecology, Evolutionary Biology & Behavior.

NOVEL STUDIES OF SPONTANEOUS MUTATION: MEASUREMENTS OF FITNESS IN THE FIELD AND GENE EXPRESSION IN THE LAB

By Angela Jennifer Roles

A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Zoology
Program in Ecology, Evolutionary Biology and Behavior
2007

ABSTRACT

Spontaneous mutation provides the raw material for the evolutionary process. Understanding the rate at which mutations occur and the distribution of their effects on fitness is thus of fundamental importance. However, these parameters remain uncertain and have been characterized in only a few model organisms. In addition, all published studies of spontaneous mutation have been performed under laboratory conditions. The application of these results to field conditions is unknown.
While we are most interested in mutations with phenotypic effects that are visible to natural selection, mutation occurs at the molecular level. Recently developed molecular tools may allow us to better understand the linkage between spontaneous mutations in the DNA and resulting changes in fitness. In my dissertation research, I used mutation accumulation in Raphanus raphanistrum (wild radish) and Arabidopsis thaliana (mouse-ear cress) to explore the effects of mutations under field conditions in comparison to laboratory results. I asked whether mutations have similar effects on fitness in the field versus the laboratory and whether mutations have similar effects on fitness in different field environments. In addition, I used A. thaliana mutation-accumulation lines to assay the effects of spontaneous mutations on gene expression. I asked, what is the distribution of mutational effects on gene expression? Is there an average bias in effect (up- versus down-regulation)? Are there any parallel changes in gene expression across mutation-accumulation lines? I found that mutations have similar effects between field and greenhouse for both R. raphanistrum and A. thaliana. In R. raphanistrum, fitness was reduced in mutation-accumulation lines relative to the ancestor, proportionally more reduced in the field than in the greenhouse. In A. thaliana, average fitness was higher in the mutation-accumulation lines (though not significantly). For A. thaliana, I found that there are substantial differences in genotypic fitness between environments (genotype by environment interaction, GEI), highlighting the importance of context for new spontaneous mutations. In the gene expression studies, I found that there is a slight bias in mutational effects toward up-regulation though most of the genes which were expressed significantly differently were down-regulated.
I found little evidence of parallel change in gene expression (a single gene), supporting the existence of many avenues by which mutations can affect fitness.

ACKNOWLEDGMENTS

I could not have completed this thesis without my advisor, Jeff Conner. Jeff has been the best advisor I could ask for, offering support from my time as a teaching assistant through to my move (pre-PhD) to Oberlin College with my husband. Jeff is also largely responsible for the improvement in my writing skills. I would also like to thank my committee members: Jim Hancock, Rich Lenski and Doug Schemske. All have been wonderful throughout my graduate years. My committee always provided excellent advice and I am grateful to have such engaged and knowledgeable mentors. There are often other graduate students who help to make it possible for one to survive those years and I am lucky to have had a number of wonderful friends going through the process with me, especially my cohort: Meghan Duffy, Sarah Emery, Erica Garcia and Heather Sahli. I have also greatly valued interactions with my lab and office mate Frances Knapczyk. From East Lansing, I am grateful for the friendship and support of Jay Sobel, Anna Fiedler and Jake McCarthy. KBS is a fabulous place to do graduate research and I enjoyed the years that I spent with a great group of folks, only a few mentioned here. I have also had much assistance in the field and lab with my research projects. For that, I thank Teddi Bearman, Jeff Conner, Meghan Duffy, Sarah Emery, Sarah Hodgson, Frances Knapczyk, Cindy Mills, Heather Sahli, Christy Stewart, Trent Thompson, and Kevin Woods. Ruth Shaw was kind enough to provide the seeds for Chapter 3, Mark Hammond helped me with cultivation of Arabidopsis plants, and Charlie Fenster and Matt Rutter collaborated with Jeff and me on that project, making it much more interesting than it would otherwise have been.
Jeff Landgraf at the MSU RTSF, David Yoder and Katherine Osteryoung were very helpful in my learning the techniques for the array analysis in Chapter 4. Lauren McIntyre taught me (and continues to teach me) how to do the analysis of that array data in Chapter 4! The support staff at KBS have always been wonderful. Nina Consolatti, Alice Gillespie, John Gorentz, Janell Lovan, Mike Martin, Sally Shaw and Melissa Yost have all been quite helpful. And of course, I would like to thank my wonderful family. I’m sure they wondered when I would finally finish and are glad that day is here. My parents, Nick and Trudy Roles, have been very supportive of everything I have sought to accomplish and I would not be the woman I am today without them. My sister, Jacki Roles, is a great friend and I’m so lucky to have her to call when I’m in need of some counsel! My brother, Eric Roles, inspires me through his own commitment to a cause. My puppy, Indiana, has caused me a great deal of heartache over the years but also helps me maintain perspective; his constant joy has lifted my spirits on numerous occasions. Finally, and most of all, my husband, Kevin Woods. You are my other half and I’m so glad that we are finally together!

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1
INTRODUCTION
    Dissertation Overview

CHAPTER 2
FITNESS EFFECTS OF MUTATION ACCUMULATION IN A NATURAL OUTBRED POPULATION OF WILD RADISH (RAPHANUS RAPHANISTRUM): COMPARISON OF FIELD AND GREENHOUSE ENVIRONMENTS
    Abstract
    Introduction
    Methods
        Study species
        Generation and propagation of mutation accumulation populations
        Field assay of MA
        Greenhouse assay of MA
    Results
        Field assay of mutation accumulation
        Greenhouse assay of mutation accumulation
    Discussion
        Potential problems with MCN designs
        Comparison of MCN designs to single-offspring descent
        Measurements of MA in multiple environments
        Conclusions

CHAPTER 3
FIELD MEASUREMENTS OF GENOTYPE BY ENVIRONMENT INTERACTION FOR FITNESS CAUSED BY SPONTANEOUS MUTATIONS IN ARABIDOPSIS THALIANA
    Introduction
    Materials and methods
        Overview
        Mutation accumulation
        Kellogg Biological Station, Michigan
        Blandy Experimental Farm, Virginia
        Data preparation
    Results
    Discussion
        Effects of mutation accumulation on mean and variance
        Challenges to MA studies
        Mutational variability
        GEI for new mutations

CHAPTER 4
LINKING GENE EXPRESSION AND SPONTANEOUS MUTATIONS: MICROARRAY STUDIES OF ARABIDOPSIS THALIANA MUTATION ACCUMULATION LINES
    Introduction
    Methods
        Selection of Arabidopsis MA lines
        Growth of seeds
        RNA isolation
        RNA labeling, purification and dye coupling
        Slide preparation
        Hybridization and array scanning
    Analysis and Results
    Discussion

CHAPTER 5
CONCLUSIONS
    Future directions

LITERATURE CITED

LIST OF TABLES

CHAPTER 3

Table 3.1. Trait means (s.e.) in the Ancestor and MA lines. The site means are significantly different from each other for all traits (P < 0.0001). P-values are from the Linetype term of the model. Total fitness is fruit number for all planted individuals.

Table 3.2. REML among-line variance estimates in the Ancestor (VAnc) and MA lines (VL). There were no significant differences in variance between MA lines and the Ancestor. Bold values denote variance estimates that are significantly different from zero (P < 0.05). Mutational variance (VM), environmental variance (VE), mutational heritability (h²M) and mutational coefficient of variation (CVM) are reported with standard errors in parentheses.

CHAPTER 4

Table 4.1. Dye-swap array design. All pairings include one MA line (L-5, L-39, L-40) and one Parent line (P-26, P-61, P-98). Each line was labeled twice with each dye for four total biological replicates.

Table 4.2. Least-square mean estimates of centered background-subtracted mean intensity for 4 nominally significant genes in the contrast L05 minus Parent. Direction of change in expression relative to the Parent is indicated in the last column: (+) increased expression in L05, (-) decreased expression in L05.

Table 4.3. Least-square mean estimates of centered background-subtracted mean intensity for 7 nominally significant genes in the contrast L39 minus Parent. Direction of change in expression relative to the Parent is indicated in the last column: (+) increased expression in L39, (-) decreased expression in L39.

Table 4.4. Least-square mean estimates of centered background-subtracted mean intensity for 28 nominally significant genes in the contrast L40 minus Parent. Direction of change in expression relative to the Parent is indicated in the last column: (+) increased expression in L40, (-) decreased expression in L40.

LIST OF FIGURES

CHAPTER 2

Figure 2.1. Hermaphrodite middle-class neighborhood crossing design. Each individual is used as a male once and a female once, contributing two offspring to the next generation.

Figure 2.2. Field assay least-square (LS) means (+/- two standard errors) of fitness components from individual ANOVAs. (A) seed weight, (B) germination success, (C) days from planting to germination, (D) survival to flowering, (E) reproductive lifespan (number of days in flower), (F) flowering stems produced per day, (G) flowers per stem, (H) proportion of flowers setting fruit, (I) average number of seeds per fruit, (J) total lifetime seed production. P values are from planned contrasts of the ancestor to the mean of the two MA populations.

Figure 2.3. Field assay additive genetic variation (VA; +/- one standard error) for fitness characters. Traits as in Figure 2.2. Note that some Y-axes do not start at zero. P values are from the test of differences in VA among the three groups (see Methods). Asterisks indicate individual variances that are significantly greater than zero (P < 0.05).

Figure 2.4. Greenhouse assay least-square (LS) means (+/- two standard errors) of fitness characters.
(A) pollen viability, (B) ovule number, (C) flower number, (D) fruits per flower, (E) seeds per fruit, (F) seed number. Note that the Y-axes do not start at zero. P values are from contrasts of the ancestor to the mean of the MA populations.

Figure 2.5. Greenhouse assay additive genetic variances (VA; +/- one standard error) for fitness characters. Traits as in Figure 2.4. Note that some Y-axes do not start at zero. P values are from the test for differences in VA among the three groups (see Methods). Asterisks indicate individual variances that are significantly greater than zero (P < 0.05).

CHAPTER 3

Figure 3.1. Among-line variation in mean total fitness by site. Gray distribution = MA lines. Black distribution = Ancestor. Values were averaged by line. NMA = 50 families, NAnc = 6 families. (A) VA, (B) MI.

Figure 3.2. Genotype by environment interaction for all traits. Reaction norms of relative trait values were calculated by dividing by the mean within each environment. Each point represents a line mean. (A) germination rate, MA, (B) germination rate, Ancestor, (C) survival to flowering, MA, (D) survival to flowering, Ancestor, (E) fruit number, MA, (F) fruit number, Ancestor, (G) total fitness, MA, (H) total fitness, Ancestor. Chi-square and P values for significance of GEI are shown. Correlation coefficients and P values for the significance test for a non-zero correlation are given for all MA plots.

CHAPTER 4

Figure 4.1. Distribution of mean difference L05 minus Parent for 14,424 oligonucleotides. The distribution is truncated at both ends excluding 66 genes with a difference larger than -1 and 8 genes with a difference larger than +1.05. The mean difference is 0.005.

Figure 4.2. Distribution of mean difference L39 minus Parent for 14,410 oligonucleotides. The distribution is truncated at both ends excluding 28 oligonucleotides with a difference larger than -1 and 29 with a difference larger than +1.05. The mean difference is -0.032.

Figure 4.3. Distribution of mean difference L40 minus Parent for 14,426 oligonucleotides. The distribution is truncated at both ends excluding 25 oligonucleotides with a difference larger than -1 and 4 with a difference larger than +1.05. The mean difference is 0.079.

CHAPTER 1

INTRODUCTION

As the ultimate source of new genetic variation in natural populations, the importance of the spontaneous mutational process to evolutionary biology is undisputed. In spite of this, our knowledge of the process is limited to a few species studied under laboratory conditions. In fact, the rate of spontaneous mutation and the fraction of those mutations that are deleterious may have important implications for human health (Crow 2000). The studies to date mostly imply that mutations are either neutral or deleterious with respect to fitness. Deleterious mutations are of particular interest in understanding how spontaneous mutation influences evolution (Lynch et al. 1999). For example, the rate of genomic spontaneous deleterious mutation (U) is of central importance in the mutation-selection balance hypothesis for the maintenance of genetic variation (Houle et al. 1996). Hypotheses explaining the evolution of mating systems also may hinge on U (i.e., a high enough rate of deleterious mutation mitigates the two-fold cost of sex; Keightley & Eyre-Walker 2000). Most studies of spontaneous mutation have involved animal species, primarily Drosophila melanogaster. While this is an excellent model species, there are limitations to the system.
It is nearly impossible to study the fitness of individual flies under natural or partially-controlled field conditions. The use of a plant system makes this a far more tractable problem. In addition, no studies of spontaneous mutations under field conditions have been reported though the importance of understanding mutations in natural populations and natural settings is acknowledged (Bataillon 2003). Testing for the effects of spontaneous mutations in a field setting was a main goal of my dissertation. Furthermore, spontaneous mutation studies have focused strongly on the effects of mutations on fitness components such as lifetime fecundity or survival. Until recently, no studies had examined spontaneous mutation at a lower-level phenotype, such as gene expression. The new technology of microarrays, which allows one to assay the expression of most or all known genes of an organism, presents the opportunity to measure the effects of spontaneous mutations much earlier in the phenotypic path to fitness. Gene expression is the first phenotype of a gene while total fitness is the ultimate phenotype. Examining the effects of mutations at the level of gene expression was the second main goal of my dissertation.

Dissertation overview: Chapter 2 of my thesis asks: how do spontaneous mutations affect fitness measured in the field versus the greenhouse for wild radish? To date, all published studies of spontaneous mutation have been performed in the laboratory. Thus, I decided to test for the effects of spontaneous mutation under field conditions. Several laboratory studies have found that the deleterious effects of mutations on fitness are only visible under stressful conditions. This has led to the belief that mutations may be more strongly deleterious under natural or field conditions, which are likely to be harsher than laboratory environments.
By assaying lifetime fitness of wild radish families (with and without accumulated mutations) under field conditions and in the greenhouse, I was able to directly address this question. I found that fitness is indeed lower in the field than in the greenhouse, indicating that the field is the more stressful environment. In addition, fitness was lower in the mutation accumulation populations than in the ancestor in both the field and greenhouse (only significant in the greenhouse due to larger sample size). The decline in fitness was proportionally larger in the field than in the greenhouse. This study supported the contention that mutations are harsher under field conditions and suggests that laboratory studies of spontaneous mutation are underestimating the effects of mutation in nature. In Chapter 3, I pursued a similar question with a twist: is there genotype-environment interaction for new mutations in Arabidopsis thaliana measured in two field sites? In collaboration with Charles Fenster and Matthew Rutter, I assayed the effects of accumulated mutations on fitness in the field in Michigan and in Virginia. Laboratory studies of genotype-environment interaction (GEI) for new mutations have reported mixed results. Most studies that do not find GEI do detect an effect of environment but no crossing of reaction norms. In this study, we planted the seeds of A. thaliana MA lines and the Ancestor into two field sites in the fall of 2004, allowing germination to occur under field conditions. In the spring of 2005, we measured fitness components of the surviving plants (survival to flower, biomass, fruit number, and total fitness). We found that there is a strong environmental effect (Michigan is a much harsher environment for these plants) and also extensive GEI. This implies that the effects of mutation are not uniform; thus, the assumption of uniform mutational effects may lead to under- or over-estimates of mutational impact.
Finally, in Chapter 4, I asked: can we detect the effects of accumulated spontaneous mutations on gene expression in A. thaliana? In this study, I chose mutation accumulation lines of A. thaliana (provided by Ruth Shaw) which exhibited very low fitness for fruit number relative to the ancestor. The phenotypic difference is underlain by mutations which may affect gene expression, as the first, lowest-level phenotype of a gene. I assayed the expression of these lines for all known (or hypothesized) A. thaliana genes alongside the expression of the Parent (with no accumulated mutations). I found significantly different expression for a number of genes in each MA line relative to the Parent. These genes are candidate genes for the pathway from DNA to fitness and will serve as a starting point for future studies. However, mutations impacting gene expression do not necessarily impact fitness. I also found that the distribution of mutational effects in each line is slightly biased toward up-regulation, while the majority of genes showing differential expression are down-regulated relative to the Parent. Overall, I have found that measurement of spontaneous mutation under field conditions is important and differs from measurements in the laboratory. The field environment used is also very important, perhaps ideally representing the natural habitat of the organism, as mutations have different effects under different environmental conditions. In addition, the study of multiple genotypes and the impact of genetic variation on the spontaneous mutation process are also important factors which should be considered in future studies. Mutational effects on gene expression are detectable but difficult to link to phenotypic change. On average, spontaneous mutations in the measured MA lines up-regulate gene expression, though the overall distribution is fairly symmetric.
CHAPTER 2

FITNESS EFFECTS OF MUTATION ACCUMULATION IN A NATURAL OUTBRED POPULATION OF WILD RADISH (RAPHANUS RAPHANISTRUM): COMPARISON OF FIELD AND GREENHOUSE ENVIRONMENTS

with Jeffrey K. Conner

Abstract

Spontaneous deleterious mutation has been measured in a handful of organisms, always under laboratory conditions and usually employing inbred species or genotypes. We report the results of a mutation accumulation experiment with an outbred annual plant, Raphanus raphanistrum, with lifetime fitness measured in both the field and the greenhouse. This is the first study to report the effects of spontaneous mutation measured under field conditions. Two large replicate populations (Ne ≈ 600) were maintained with random mating in the greenhouse under relaxed selection for nine generations before the field assay was performed and ten generations before the greenhouse assay. Each generation, every individual was mated twice, once as a pollen donor and once as a pollen recipient, and a single seed from each plant was chosen randomly to create the next generation. The ancestral population was maintained as seeds at 4°C. Declines in lifetime fitness were observed in both the field (1.7% per generation; P = 0.27) and the greenhouse (0.6% per generation; P = 0.07). Significant increases in additive genetic variance for fitness were found for stems per day, flowers per stem, fruits per flower and seeds per fruit in the field as well as for fruits per flower in the greenhouse. Lack of significance of the fitness decline may be due to the short period of mutation accumulation, the use of outbred populations, or both. The percent declines in fitness are at the high end of the range observed in other mutation accumulation experiments and give some support to the idea that mutational effects may be magnified under harsher field conditions, although the harsher conditions are also novel, as they are in many similar mutation accumulation experiments.
Thus, measurement of mutational parameters under laboratory conditions may underestimate the effects of mutations in natural populations.

Introduction

As the source of all new genetic variation, spontaneous mutation is one of the most fundamental processes in evolution. Theoretical predictions concerning the maintenance of genetic variation (Houle et al. 1996), the evolution of sex (Keightley & Eyre-Walker 2000), the evolution of aging (Rose 1991) and the persistence of small populations (Lande 1994; Lynch et al. 1995) depend on the rate and fitness effects of spontaneous mutation in nature. Spontaneous mutation is, however, very difficult to study empirically because mutations are rare and most are thought to have small effects on the phenotype. Most spontaneous mutations are assumed to be deleterious, for two reasons. First, there are many more ways to dismantle an existing adaptation than there are ways to improve it. Second, data from molecular studies comparing the observed per-site rate of nonsynonymous amino acid substitutions (KA; mutations which change the amino acid) to the per-site rate of synonymous amino acid substitutions (KS; mutations which do not change the amino acid) indicate that the majority of nonsynonymous substitutions are deleterious (Keightley & Lynch 2003). The ratio KA/KS averages 0.3 or less in all taxa for which estimates are available (Ohta 1995; Eyre-Walker et al. 2002), suggesting that at least 70% of all nonsynonymous mutations are eliminated by selection. Mutation accumulation (MA) is the most common method used to study the rate and effects of spontaneous deleterious mutation on fitness. In this technique selection is reduced and often drift is maximized so that deleterious mutations are more likely to be fixed. This regime of reduced selection is repeated over multiple generations in independent lines to allow mutations to accumulate.
After multiple generations of MA, fitness is estimated simultaneously in the MA lines and a control (ideally the ancestral state of no new mutations accumulated). The expectation is that fitness will be decreased in the MA lines relative to the ancestor due to the accumulation of spontaneous deleterious mutations. Using MA experiments, the genomic spontaneous deleterious mutation rate for fitness has been estimated in a handful of model organisms including Escherichia coli (Kibota & Lynch 1996), yeast (Wloch et al. 2001; Zeyl & DeVisser 2001), Caenorhabditis elegans (e.g., Keightley & Caballero 1997; Vassilieva & Lynch 1999), Drosophila melanogaster (e.g., Shabalina et al. 1997; Fry et al. 1999), and Arabidopsis thaliana (Schultz et al. 1999; Shaw et al. 2000). Most of these studies have found decreased mean fitness after mutation accumulation but a few have not. Shaw et al. (2000) and Keightley & Caballero (1997) did not report decreased mean fitness, although they did detect an increase in fitness variance, indicating that mutations had in fact accumulated. In yeast, Zeyl and DeVisser (2001) detected a significant decline in growth rate in DNA repair-deficient yeast but not in DNA repair-competent yeast. A potential explanation for the finding of no decrease in mean fitness in some MA studies may be the environments in which fitness is estimated (Kondrashov 1998). Most studies of MA have utilized laboratory populations and all have assayed fitness in the lab or greenhouse, usually under benign conditions. Several studies of D. melanogaster that compared the effects of MA under stressful and benign environments have found greater declines under stress for some fitness components (e.g., Shabalina et al. 1997) and/or increases in among-line variance of MA lines (e.g., Fry & Heinsohn 2002).
In these studies of MA in multiple environments the harsh or stressful environments are also often novel, that is, different from the environment under which mutations were accumulated. During accumulation, those mutations having particularly large deleterious effects in the environment of accumulation may be removed by selection. This will reduce the observed magnitude of fitness reduction due to new mutation when measured in the “benign” mutation accumulation environment. This downward bias of estimates of mutational rates from MA experiments is expected (Lynch et al. 1999) due to the inability to completely remove selection, though the extent of the bias remains unknown. However, when MA lines or populations are assayed under other, novel, conditions, mutations that have a small effect in the accumulation environment may express larger deleterious effects in the novel environment, thus making it appear that the novel environment is “harsh” relative to the accumulation environment. Xu (2004) found support for this hypothesis in an MA study of the fungus Cryptococcus neoformans in which performance was better under the conditions experienced during MA than under novel conditions (altered temperature and growth medium). This suggests that mutations with large deleterious effects are removed during MA and some mutations that were neutral (or nearly so) in the MA environment were deleterious in a novel environment. Thus, both the environment experienced during MA and the environment of the fitness assay are important considerations in interpreting MA experiment results. We have assayed the effects of MA on fitness in two novel and harsh environments: in the field and under stress in the greenhouse in wild radish (Raphanus raphanistrum). Two replicate populations of 300 individuals collected from the same natural population were propagated under relaxed selection for nine (field assay) or ten (greenhouse assay) generations to allow mutations to accumulate.
To our knowledge, this is the first study to examine the effects of spontaneous mutations on fitness under field conditions.

Methods

Study species

Wild radish, R. raphanistrum (Brassicaceae), is a self-incompatible annual weed that grows in highly disturbed habitats such as agricultural fields. R. raphanistrum is a model system in ecology and evolution, including many studies on plant-insect interactions (e.g., Agrawal 1998; Strauss et al. 2001), natural selection and genetic correlations (e.g., Stanton et al. 1986; Mazer 1987; Conner 2002) and adaptation to global climate change (e.g., Tevini et al. 1983; Kostkarick & Manning 1993; Case et al. 1998).

Generation and propagation of mutation accumulation populations

Seeds for the experimental populations were collected from a natural population of wild radish in an alfalfa field near Binghamton, NY in 1988 as described in Conner & Via (1993) and stored as seeds at 5°C. Two populations (designated MA1 and MA2) of 300 individuals each were created in 1991 and maintained in the greenhouse for nine generations under relaxed selection, using a middle-class neighborhood (MCN) crossing design modified for use in hermaphrodites. In this design, each individual is mated twice, once as a male and once as a female (Figure 2.1); that is, each individual contributed two offspring to the next generation, one through seed and the other through pollen. This design maintained a large effective population size (Ne ≈ 600) because there was no variation in family size (Crow & Kimura 1970). A single seed produced by each individual was chosen randomly to be planted for the next generation. An MCN design minimizes the opportunity for selection and thus allows mutations to accumulate. This also minimizes adaptation to the greenhouse environment. The large effective population size used in this study also minimizes the effects of genetic drift.
There may have been some selection on germination, as about 3% of mothers had no offspring germinate in the next generation, but this was likely due mainly to pests and diseases in the greenhouse rather than genetic differences (Conner 2002). Still, mutations that affected germination success may have experienced some selection. In addition, while most wild radish plants in this population germinate quickly and flower within 4-6 weeks, much more time was allowed in order to eliminate selection for early germination and flowering, so that the nine generations took about nine years to complete. Ancestral populations were maintained as seeds at 4°C. For further details see Conner (2002).

[Figure 2.1. Crossing design used during mutation accumulation; the figure and its caption are not recoverable from the scan.]

Field assay of MA

Experimental design. All three populations (Ancestor, MA1, MA2) were germinated, grown and crossed in the greenhouse for one generation prior to planting in the field (common garden generation ten). The field generation corresponds to nine generations of mutation accumulation, because the common garden generation included the ancestor and therefore the MA populations did not accumulate additional mutations relative to the ancestor in this generation. Populations were regularly interspersed on the greenhouse benches to eliminate average maternal environmental differences among populations. Families were created using a nested half-sib mating design with 75 sires and three dams per sire, creating 225 full-sib families. Due to failed germination and crosses, the final numbers were 75 ancestral sires (207 dams), 70 MA1 sires (201 dams) and 68 MA2 sires (193 dams). Two MA1 sires were only crossed with two dams but all others were crossed with three. One offspring from each full-sib family was grown in the field trial.
Seeds from the common garden generation were germinated in the greenhouse to ensure high germination success; germination was recorded daily. All seeds were weighed prior to planting. Initially one haphazardly chosen seed per dam was planted. If the first seed did not germinate within one week, a second seed was planted. If the second seed did not germinate, two more were planted and this was continued until a seed germinated or no more seeds were available. A maximum of seven seeds were planted for a single dam, and 25 dams failed to have any offspring germinate, leaving germinated offspring from 201 ancestral, 193 MA1, and 182 MA2 dams (576 total). The first seed planted germinated for 90% of the dams and more than four seeds were planted for only 2.5% of dams.

Seedlings were planted in May 2001 at Kellogg Biological Station (KBS) in southwestern Michigan. Planting took place before seedlings had their first true leaves, within five days of germination. Individuals from each population were randomly assigned to one of seven blocks in the field with 90 individuals per block (30 from each population). The populations were regularly interspersed in the field within blocks, in ten rows by nine columns, with one-meter spacing. Mortality and date of first flowering were recorded. A large number of plants (227 out of 576) were lost early in the experiment to rabbit herbivory due to a hole in the fence surrounding the field. Once the hole was repaired, mortality was low until the end of the season (90% survived to flowering). All stems, flowers and fruits were collected and counted for each plant. The number of seeds in each fruit was also counted on all fruits, allowing the calculation of seeds per fruit and total number of seeds (lifetime female fitness) for each plant.

Study site. The field used in this study is located at the Plant Ecology Field Lab of KBS.
This site was an agricultural field before its addition to the field station; thus it represents the type of habitat in which wild radish might be found naturally. In order to further simulate the natural agricultural habitat we tilled the field before planting.

Analysis. The measured traits were divided into parental and offspring groups. Each group was analyzed first with a MANOVA and then with individual ANOVAs for each trait using SAS (SAS Institute 2001). Parental traits were average seed weight, germination success (proportion of seeds germinated), and average days from planting to germination. The parental traits, including germination success, represent averages of all of the multiple seeds planted per dam; therefore, they represent traits of the parents, not traits of any single offspring. Traits of individual offspring were partitioned into multiplicative fitness components (survival to flowering, reproductive lifespan (number of days in flower), flowering stems produced per day, flowers produced per stem, the proportion of flowers that set fruit, and the average number of seeds per fruit); their product is total lifetime female fitness (number of seeds). The plants that died before flowering (those eaten by rabbits plus 34 that died for unknown reasons) were not included in the last four fitness components but were included in total fitness. The multiplicative fitness components are largely independent (r ≤ 0.19), while the raw variables are highly correlated. Population was a fixed effect in the model. Block and sire (nested within population) were modeled as random effects. A planned contrast comparing the mean of the two MA populations to the ancestral mean was performed for each trait. Residual plots showed no signs of serious heteroscedasticity.

This portion of the study was designed to test for a change in mean, but the inclusion of multiple offspring per sire allows us to test for significant additive genetic variance as well.
Additive genetic variance within each population was estimated as four times the sire variance component in a model estimating separate variances for each population. The presence of significant sire variance was tested by performing a one-tailed Chi-square test comparing the two-times log likelihood of the model including sire to that of the model without sire (Littell et al. 1996). The hypothesis of greater variance in the MA populations was tested by comparing a model estimating one sire variance across all three populations (equal variance model) to one estimating separate (unequal) variances for each population. Significance was tested by performing a one-tailed Chi-square test comparing the two-times log likelihood of the full model (unequal variance) to that of the reduced model (equal variance). This test has two degrees of freedom because the models differ by two parameters (the equal model estimates one variance and the unequal model estimates three variances). Significance of this test indicates different variances among the three groups but does not specifically indicate significantly greater variance in the MA groups relative to the Ancestor.

Greenhouse assay of MA

Experimental design. The fitness assay was repeated in the greenhouse using a larger half-sibling mating design to more powerfully test for increases in additive genetic variance (VA) as well as declines in mean fitness due to mutation accumulation. Stored seeds were planted to create the ancestor half-sibling families. For the MA populations, seeds from the common garden generation ten (prior to the field assay) were planted to create the half-sibling families for the greenhouse assay. This planting represents ten generations of mutation accumulation for the MA populations because new ancestral seeds were chosen; thus, the common garden generation was an additional generation of mutation accumulation for the MA populations relative to the ancestor.
In each population, fifty plants were chosen randomly to be sires and three unique dams were assigned randomly to each sire to generate 150 full-sibling families nested within 50 paternal half-sibling families. Due to space constraints, the 50 half-sibling families were split into two blocks of 25 that were grown and crossed at different times. Due to failure to set seed, six full-sibling families were lost, resulting in 148 ancestral full-sibling families, 149 MA1 full-sibling families and 147 MA2 full-sibling families.

Seeds from these families were grown in the greenhouse under water and nutrient stress. For the ancestor, four seedlings per dam were grown (592 seedlings; 577 survived to produce fruit). Two seedlings per dam were grown for each MA population (298 MA1, 297 survived to fruiting; 294 MA2, 290 survived to fruiting). This design uses the same number of ancestral plants as MA plants, increasing the power to detect differences in mean and variance between the ancestor and the MA populations. Two offspring per dam provided reasonable power to detect changes in additive genetic variance for two reasons. First, the number of sires is the primary determinant of the power to detect additive variance. Second, environmental variance was minimized by periodically rotating plant location in the greenhouse (see below).

Seeds were planted in 3-inch pots (to produce water stress) with Metro-mix 360 and fertilized with a total of 5 g Osmocote Plus 15-9-12 (NPK) controlled release pellets (nutrient stress). Fertilizer was applied gradually: 1.25 g was applied just after planting, 1.25 g just before flowering and 2.5 g during peak flowering. A regular dose is 5 g of Osmocote pellets applied at planting rather than gradually over the life of the plant. To compensate for failures in germination, six seeds were planted for each ancestral dam and four seeds for each MA1 and MA2 dam.
Extra seedlings were thinned or transplanted to pots from the same dam that did not have seedlings. When necessary, additional seeds were planted until the desired numbers of offspring were achieved. During germination, pots from the three populations were regularly interspersed in flats of 33 plants (11 from each population) to eliminate average environmental differences between populations. Once plants had germinated they were transferred to new flats containing individuals from one population only, with one offspring from each of the 25 sires in a time block (25 plants from different families per flat). There were 24 ancestor flats, 12 MA1 flats and 12 MA2 flats. Flats were spaced at least 29 cm apart to minimize accidental cross-pollination between groups. Flat order on the greenhouse bench was randomly assigned within each population and each flat bordered two other flats, one from each of the other two populations. Flat position was re-randomized twice a week prior to flowering to minimize environmental differences among flats. Within-flat pot position was randomized initially. Two greenhouse rooms were used and assignment of flats to rooms was random.

Germination and first day of flowering were recorded daily. Once flowering began, mass pollination was performed within each population (primarily within a flat) two to three times per week until flowering ceased. Flowers were pollinated by sweeping a paintbrush haphazardly across the tops of all open flowers in a flat for five to seven minutes per flat.

Pollen viability was assayed on one newly opened flower of two haphazardly chosen offspring per sire for the ancestral population (n = 100) and on one offspring per sire for MA1 (n = 50) and MA2 (n = 50). Viability was assessed by the Heslop-Harrison fluorochromatic reaction (FCR) test (modified from Kearns & Inouye 1993 and Thomson et al. 1994) as follows.
A single newly opened flower was collected from the focal plant and the cut pedicel placed into 10% sucrose for a maximum of three hours. A sample of pollen (about 100 grains) was removed from one anther with a pin and placed in a single drop of fluorescein diacetate (FDA) solution on a glass slide. The sample was incubated for 5-10 minutes at room temperature in the dark before a cover slip was added. All strongly fluorescent (viable) grains were counted under epifluorescent illumination. A count of all grains was then performed under visible-light illumination.

Ovule number was counted for one newly opened flower from two offspring for each ancestral dam (n = 296) and one offspring for each MA1 dam (n = 149) and MA2 dam (n = 144, 3 not collected). Fruits were collected as they ripened and all seeds were counted. Once plants senesced, all shoots were collected and all flowers and fruits were counted.

Analysis. Results for pollen viability, ovules per flower, flowers produced, fruits per flower, seeds per fruit and total seeds produced were analyzed using the MIXED procedure in SAS (SAS Institute 2004). Population and block were fixed effects and sire (nested within population and block) and dam (nested within sire, population, and block) were random effects. A priori contrasts were constructed comparing the ancestor mean to the mean of the two MA lines to test for decreased fitness of the MA lines. The multiplicative fitness components, pollen viability and ovules per flower were log(Y+1) transformed to reduce differences in scale between variables and analyzed with MANOVA. The untransformed data, including total seeds produced, were also analyzed with univariate ANOVAs. Residual plots showed no signs of serious heteroscedasticity. Additive genetic variance was estimated from untransformed data as previously described for the field study.
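The variance tests described in the Methods reduce to two small computations: VA is four times the sire variance component, and nested models are compared through the difference in their two-times log likelihoods. The following is a minimal Python sketch of that arithmetic, not the SAS analysis itself. The function names and the example likelihood values are hypothetical, the inputs are assumed to be -2 log likelihood values (smaller means better fit), and the sketch reports the plain upper-tail Chi-square probability, which may differ from how the one-tailed test was computed in SAS.

```python
import math


def additive_variance(sire_variance):
    """Estimate V_A as four times the sire variance component."""
    return 4.0 * sire_variance


def chi2_upper_tail(stat, df):
    """Upper-tail Chi-square probability, closed forms for df = 1 and df = 2 only."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(neg2ll_reduced, neg2ll_full, df):
    """Compare nested models from their -2 log likelihoods (assumed inputs)."""
    stat = neg2ll_reduced - neg2ll_full
    return stat, chi2_upper_tail(stat, df)


# Hypothetical numbers, for illustration only (not values from the study).
va_example = additive_variance(2.5)

# Sire vs. no-sire model: the models differ by one variance parameter (df = 1).
stat_sire, p_sire = likelihood_ratio_test(1204.3, 1200.1, df=1)

# Equal vs. unequal sire variances: one variance vs. three variances (df = 2).
stat_var, p_var = likelihood_ratio_test(3610.0, 3604.01, df=2)

print(f"V_A = {va_example}")
print(f"sire test: stat = {stat_sire:.2f}, P = {p_sire:.3f}")
print(f"equal vs. unequal variances: stat = {stat_var:.2f}, P = {p_var:.3f}")
```

The two degrees of freedom in the second comparison follow the text: the unequal-variance model estimates three sire variances and the equal-variance model estimates one.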
Results

Field Assay of Mutation Accumulation

The means did not differ significantly between populations for any trait (Figure 2.2; MANOVA parental traits F2300 = 0.38, P = 0.7; MANOVA offspring traits F232.” = 1.83, P = 0.16), and while lifetime fitness (number of seeds produced) declined in the MA populations relative to the ancestor, this decline was not statistically significant (Figure 2.2J). The estimated fitness difference between the Ancestor and the MA lines was 7.4 seeds (s.e. = 5.7), which represents a 17.9 percent fitness decline due to the nine generations of mutation accumulation. The fitness component that declined the most was survival to flowering (Figure 2.2C). Germination success was high and similar across populations (Figure 2.2B), suggesting that little or no inadvertent selection on this trait occurred.

There was evidence for increases in VA in the MA populations. Significant sire variance was found for survival to flowering, stems per day, and fruits per flower in population MA1, for flowers per stem and seeds per fruit in population MA2, but not for any trait in the Ancestor (Figure 2.3). All traits with significant sire variance also showed evidence of significant differences in sire variance among populations (Figure 2.3).

Greenhouse Assay of Mutation Accumulation

The MANOVA was significant (F2,141 = 3.49, P = 0.03), indicating differences in fitness between the populations. The planned contrast was also significant (F1,115 = 3.78, P = 0.05), demonstrating that the significance of the MANOVA is due to a decrease in mean fitness of the MA populations relative to the ancestor. The estimated difference of the MA lines from the Ancestor was 5.1 (s.e. = 2.9); that is, fitness of the MA lines is 6.6 percent lower than that of the Ancestor.
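To compare the two assays on a common scale, the total declines reported above (17.9 percent after nine generations in the field, 6.6 percent after ten generations in the greenhouse) can be expressed per generation. The snippet below is a minimal sketch that assumes the decline accrues linearly across generations; the function and variable names are hypothetical.

```python
def per_generation_decline(total_percent, generations):
    """Per-generation decline, assuming a linear accumulation of the total decline."""
    return total_percent / generations


field = per_generation_decline(17.9, 9)        # field assay, nine generations of MA
greenhouse = per_generation_decline(6.6, 10)   # greenhouse assay, ten generations of MA
ratio = field / greenhouse

print(f"field: {field:.2f}% per generation")
print(f"greenhouse: {greenhouse:.2f}% per generation")
print(f"field / greenhouse: {ratio:.1f}")
```

Under this assumption the field decline is roughly three times the greenhouse decline per generation, even though only the greenhouse contrast reached statistical significance.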
Univariate ANOVAs showed that the significance of the MANOVA contrast was likely due to two individual traits: fruits per flower showed a significant decrease in the MA populations relative to the ancestor (Figure 2.4D), and the decrease in pollen viability approached significance (Figure 2.4A). The univariate ANOVA for number of seeds produced showed a nearly significant decrease (Figure 2.4F) in the MA populations relative to the ancestor.

There was evidence for increased VA due to mutation accumulation for fruits per flower in population MA2 (Figure 2.5B). There were no statistically significant differences in additive variance among populations for any other trait. However, there were non-significant trends for at least one MA population to have a larger estimate of VA than the ancestor for all traits except seeds per fruit. Significant VA was found for only three traits: flowers and fruits per flower in MA2 (Figure 2.5A, D) and number of seeds in the ancestor (Figure 2.5F). All other populations and traits had estimates of VA that were not significantly different from zero.

Figure 2.2. Field assay least-square (LS) means (+/- two standard errors) of fitness components from individual ANOVAs. (A) seed weight, (B) germination success, (C) days from planting to germination, (D) survival to flowering, (E) reproductive lifespan (number of days in flower), (F) flowering stems produced per day, (G) flowers per stem, (H) proportion of flowers setting fruit, (I) average number of seeds per fruit, (J) total lifetime seed production. P values are from planned contrasts of the ancestor to the mean of the two MA populations. [Plots not recoverable; each panel compares Ancestor, MA1 and MA2.]

Figure 2.3. Field assay additive genetic variances (VA; +/- one standard error) for fitness characters. Traits as in Figure 2.2. Note that some Y-axes do not start at zero. P values are from the test for differences in VA among the three groups (see Methods). Asterisks indicate individual variances that are significantly greater than zero (P < 0.05). [Plots not recoverable; each panel compares Ancestor, MA1 and MA2.]

Figure 2.4. Greenhouse assay least-square (LS) means (+/- two standard errors) of fitness characters. (A) pollen viability, (B) ovule number, (C) flower number, (D) fruits per flower, (E) seeds per fruit, (F) seed number. Note that the Y-axes do not start at zero. P values are from contrasts of the ancestor to the mean of the MA populations. [Plots not recoverable; each panel compares Ancestor, MA1 and MA2.]

Figure 2.5. Greenhouse assay additive genetic variances (VA; +/- one standard error) for fitness characters. Traits as in Figure 2.4. Note that some Y-axes do not start at zero. P values are from the test for differences in VA among the three groups (see Methods). Asterisks indicate individual variances that are significantly greater than zero (P < 0.05). [Plots not recoverable; each panel compares Ancestor, MA1 and MA2.]
Discussion

No mean differences in fitness were detected in the field, although a trend for decreased fitness in the MA populations relative to the ancestor was present for four of ten traits (seed weight, survival to flowering, flowers per stem and lifetime fitness). The greenhouse assay did reveal significant decreases due to MA in the overall MANOVA and for fruits per flower, and a marginally significant decline in lifetime female fitness (number of seeds). However, the percent decline in fitness was almost three times greater in the field than in the greenhouse (1.99% versus 0.66% per generation); the higher statistical significance in the greenhouse results from lower variance in the fitness estimates due to larger sample sizes and lower environmental variance. These estimates of fitness decline are for heterozygous mutations and are similar to the highest heterozygous estimates of 2% per generation in Drosophila (Lynch et al. 1999). Assuming h ~ 0.15-0.3 for mildly deleterious mutations (Lynch & Walsh 1998), our estimates correspond to a homozygous decline of ~6-11% per generation in the field and ~2-4% per generation in the greenhouse. Declines such as these could present a substantial challenge to a small population in nature. Alternatively, they could be biased upward by selection or recombination (see below).

One possible explanation for not observing stronger statistical support for a decline in fitness is an inherently low rate of mutation combined with a relatively short period of mutation accumulation (nine or ten generations). Of the MA studies that assayed fitness at or near ten generations, three found a decline in mean under some conditions (Shabalina et al. 1997; Schultz et al. 1999; Schoen 2005) while two found no difference from the control (Vassilieva & Lynch 1999; Shaw et al. 2000). These latter two studies continued and were assayed at later times: Shaw et al. (2000) still reported no significant decrease in fitness in A.
thaliana after 17 generations, but Vassilieva & Lynch (1999) found a significant decrease for some traits in C. elegans after 50 generations. These studies demonstrate that it is possible to statistically detect the effects of mutation accumulation after ten generations, though it is not guaranteed.

Potential problems with MCN designs

Shabalina et al. (1997) used a breeding design similar to ours with two large populations (Ne ≈ 400 per population, smaller than ours) of Drosophila melanogaster and detected a significant decrease in fitness after ten generations. Several concerns that were raised by Keightley et al. (1998) about the interpretation of the MCN experiment of Shabalina et al. (1997) could potentially apply to our experiment.

First, adaptation in the MA lines to the greenhouse environment is possible. Within-family selection, due to inviable seeds, would result in adaptation to the greenhouse during mutation accumulation. We minimized this source of selection by randomly choosing a single seed per individual for the next generation, thus allowing genetic drift to dominate within-family selection. Adaptation to the MA environment could have operated in Shabalina et al.’s (1997) experiment, where 20% mortality was experienced each generation in the MA populations. In contrast, our populations experienced much lower mortality, losing only three percent of dams per generation. There was a large decline in fitness per generation in the greenhouse, and no significant difference between Ancestor and MA lines in germination success in the greenhouse; both observations argue against a major effect of selection during MA, because adaptation to the greenhouse should make the decline in fitness in the MA lines relative to the ancestor smaller. Thus, the greenhouse comparisons are conservative in the event of selection during MA.
Second, if selection occurs in the seeds of the Ancestor population during storage, then lower fitness of MA populations relative to the Ancestor could be due to increased Ancestor fitness rather than decreased MA fitness. However, in our study germination success was high (93-96%) and not significantly different between Ancestor and MA lines. This is in contrast to the low recovery from cryopreservation of the controls (8-18%) in Shabalina et al. (1997). Thus, adaptation of the control is unlikely to be a problem in this study due to low opportunity for selection in the Ancestor.

Finally, Keightley et al. (1998) suggest that inbreeding depression could cause a decrease in fitness similar in magnitude to that observed by Shabalina et al. (1997). Shabalina et al. (1997) maintained populations of Ne ≈ 400, leading to an increase in the inbreeding coefficient of 1/800 per generation or 4% by the end of the experiment (Crow & Kimura 1970). Our populations were maintained with Ne ≈ 600, resulting in an increase in the inbreeding coefficient of 1/1200 per generation or 0.8% by the end of the experiment. A previous study of inbreeding depression in the closely related R. sativus found no reduction in total fitness after inbreeding at F = 0.0315 (3.15%) in a natural population (Nason & Ellstrand 1995). These data suggest that our level of inbreeding would not produce effects on fitness similar to those we have observed.

Comparison of MCN Designs to Single-Offspring Descent

The inability of many sexual organisms to reproduce by selfing presents a challenge for mutation accumulation experiments. The time (20 generations or more) required to create genetic homogeneity through inbreeding before the start of an MA experiment is impractical in most cases, and would require strong selection against deleterious recessives in an obligate outcrosser like Raphanus, so that the lines would no longer be representative of natural populations.
One method used in Drosophila (but not available in Raphanus) is balancer chromosomes (where crossing-over is reduced), which protect one chromosome from selection (Mukai 1964; Mukai et al. 1972). Another approach is the MCN design, which was used by Shabalina et al. (1997) and in this experiment.

The main difference between the single-offspring descent selfing method and MCN is the probability of fixation of a new mutation. In selfing lines, 75% of the offspring will carry a new mutation, with one-quarter homozygous for it. So there is a 25% probability of fixation and a 25% probability of loss of a new mutation each generation due to chance. In our MCN design, each individual has two offspring, so the genetic contribution is the same as in a selfing experiment (two genomic complements). Half of the offspring will carry the new mutation but never in a homozygous form. However, there is a 25% probability that both offspring will carry a copy of the mutation and thus two copies will be passed on to the next generation. Similarly, there is a 25% probability that neither offspring will carry the mutation (it will be lost from the population). While fixation of a new deleterious mutant is unlikely in an MCN design, the probabilities of loss and transmittal of new mutations are the same in inbred lines and MCN designs.

An advantage of outbred MCN designs is that they allow the accumulation of more deleterious recessive mutations, such as homozygous lethals. These mutations will be eliminated from an inbred design but are much more likely to accumulate in heterozygotes in an MCN design. Thus, the main difference between inbred and MCN mutation accumulation experiments is in the expression of new recessive mutants. If most new mutations are partially recessive, their effects will be more difficult to observe in MCN populations, where nearly all mutations will be heterozygous.
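The transmission probabilities quoted above for selfing and MCN designs can be checked by enumerating the two gametes contributed by a parent that is heterozygous for a new mutation. The sketch below is illustrative only: it assumes Mendelian segregation with no selection, and the function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def copies_transmitted():
    """Distribution of mutant copies among two gametes of a heterozygous parent.

    Each gamete carries the new mutation with probability 1/2, independently.
    """
    dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for first, second in product((0, 1), repeat=2):
        dist[first + second] += Fraction(1, 4)
    return dist


dist = copies_transmitted()

# Selfing with single-offspring descent: both gametes unite in ONE offspring,
# so two copies means that offspring is homozygous for the mutation.
p_loss = dist[0]
p_homozygous_selfing = dist[2]
p_carried_selfing = dist[1] + dist[2]

# MCN: the two gametes go to TWO different offspring (one via seed, one via
# pollen), so two copies means both offspring are heterozygous carriers.
p_both_offspring_mcn = dist[2]
expected_copies = sum(k * p for k, p in dist.items())

print("P(loss) in either design:", p_loss)
print("P(selfed offspring carries the mutation):", p_carried_selfing)
print("P(selfed offspring is homozygous):", p_homozygous_selfing)
print("P(both MCN offspring carry the mutation):", p_both_offspring_mcn)
print("Expected copies passed on:", expected_copies)
```

The same enumeration serves both designs; what differs is whether the two gametes end up in one offspring (selfing) or in two (MCN), which is why loss and transmittal probabilities match while homozygosity arises only under selfing.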
Partially recessive mutations are likely: the dominance of deleterious alleles has been estimated as h ~ 0.1 overall and h ~ 0.15 to 0.3 for mildly deleterious alleles excluding lethal alleles (h = 1 is completely dominant and h = 0 completely recessive; Lynch & Walsh 1998). Note that partially recessive mutations will have an effect on fitness in a heterozygous form, so while new mutations will have stronger phenotypic effects in an inbred design they will still affect fitness in an MCN design.

There are two additional disadvantages of the MCN design compared to inbred line studies. First, declines in fitness could be due to recombination breaking up adaptive gene complexes rather than the accumulation of mutations; this problem is difficult to deal with. Second, declines in fitness could be due to rare deleterious mutations that were present in the natural population and that increase in frequency due to reduced natural selection in the lab. This is not a problem in the inbred line studies that are initiated with isogenic base populations. However, using large, outcrossed populations as in our experiment will minimize the effects of genetic drift and inbreeding in increasing the frequency of these pre-existing mutations when selection is relaxed. Thus, we assume that the frequency of pre-existing deleterious mutations in the Ancestor approximated their frequency in the MA populations at the end of our experiment.

Measurements of MA in Multiple Environments

Dependence of mutational effects on the assay environment has been studied for several systems. Studies in Drosophila illustrate the variety of results of such studies. Fry & Heinsohn (2002) found no interaction of mutations and environment on viability in Drosophila measured under high and low densities, normal and low temperatures, and presence/absence of ethanol. Two other studies have found an effect of environment on mean decline in fitness. Shabalina et al.
(1997) measured competitive ability (primarily survival), motility, and longevity under benign and harsh competitive conditions (low and high density), and found little to no decline in mean under benign conditions but did find a decline in mean viability under competitive conditions. Kondrashov & Houle (1994) studied genotype-environment interactions for accumulated mutations in Drosophila with three factors: parental density, dilution of the growth medium, and temperature (crossed with medium dilution), and found a larger decline in mean fitness (productivity) under harsh conditions. Our results are similar to the latter study since we found a trend toward decreased fitness that was proportionally larger under harsher field conditions compared to more benign greenhouse conditions (overall seed production was greater in the greenhouse: compare Figures 2.2J and 2.3F). Thus, our study does provide some support for the magnification of mutational effects under stress. However, some of this magnification could be due to the fact that the field was not the environment under which mutations accumulated (but it is the recent natural environment of this species).

Conclusions

The low rate of mutation and small average mutational effect on fitness require large sample sizes and controlled conditions for the estimation of mutational parameters, making studies of fitness effects of spontaneous mutations challenging. While it is more difficult, we have shown that the effects of mutation can be detected with outbred populations (also demonstrated by Shabalina et al. 1997 and Schoen 2005) and, for the first time, under field conditions. We found a trend for decreased fitness and evidence of increased additive genetic variance after nine or ten generations of MA.
In addition, we found support for the idea that harsher environments exacerbate the deleterious effects of mutations, resulting in a decline in fitness in the field that is nearly three times the decline found in the less stressful greenhouse environment.

CHAPTER 3

FIELD MEASUREMENTS OF GENOTYPE BY ENVIRONMENT INTERACTION FOR FITNESS CAUSED BY SPONTANEOUS MUTATIONS IN ARABIDOPSIS THALIANA

with Jeffrey K. Conner, Charles Fenster, and Matthew Rutter

Introduction

Spontaneous mutation is one of the most fundamental processes in evolution, important in maintaining genetic variation for quantitative traits (Houle et al. 1996). The genetic load caused by slightly deleterious mutations may favor the evolution of sexual reproduction (Keightley & Eyre-Walker 2000) or threaten the persistence of small populations (Lande 1994; Lynch et al. 1995). Mutational pressure may also be involved in the evolution of senescence (Partridge & Barton 1993). Many recent laboratory studies have addressed the rate of mutation (e.g. Fry 2004), the distribution of mutational effects on fitness (e.g. Shaw et al. 2000) as well as the extent of environmental dependence of the fitness effects of new mutations (e.g. Fry & Heinsohn 2002; Chang & Shaw 2003; Kavanaugh & Shaw 2005). However, nothing is known about these parameters in field environments.

Two approaches are often used in estimating mutation rates and effects. The first relies on either estimates of inbreeding depression in natural populations (Charlesworth et al. 1990; Johnston & Schoen 1995) or on patterns of base substitution across species (Ohta 1995; Eyre-Walker et al. 2002; Wright et al. 2002). The second, and more common, method is mutation accumulation (MA), which can estimate both the rate of genomic spontaneous deleterious mutation (U) and the distribution of effects on fitness.
MA studies are conducted by relaxing selection for many generations, thus allowing mutations to accumulate, using either isogenic lines or large random mating populations. Selection may be reduced by minimizing population size to a single individual and allowing genetic drift to dominate (in isogenic lines; e.g. Shaw et al. 2000) or by equalizing fitness among individuals in large random mating populations (e.g. Shabalina et al. 1997). The fitness of the advanced generation MA lines is then assayed alongside the ancestor, which has no new mutations. Under the expectation that most mutations are deleterious, mean fitness is expected to decrease in MA lines or populations relative to the ancestor. The addition of new mutations is also expected to increase genetic variance among the MA lines or within the MA populations relative to that of the ancestor. From these two quantities, change in mean and variance, the whole genome spontaneous deleterious mutation rate (U) can be estimated in isogenic designs (Bateman-Mukai method; Lynch et al. 1999).

Published studies to date have only estimated the rate and effects of new spontaneous mutations under laboratory conditions so that the effects of MA on fitness under field conditions remain unknown. Stress may increase the negative fitness impacts of mutations (Kondrashov 1998); thus, we might expect that mutations which increase fitness in the lab or greenhouse may not do so under the harsher conditions of the field. A number of studies have shown that the magnitude and direction of mutational effects on fitness may vary with environmental conditions (Kondrashov & Houle 1994; Fry et al. 1996; Shabalina et al. 1997; Vassilieva et al. 2000; Szafraniec et al. 2001; Xu 2004). When the relative fitness of MA lines changes across environments, there is genotype by environment interaction (GEI) for the new mutations.
If GEI is common for new mutations then it is important to measure mutational effects under multiple environments. Of particular interest would be mutations that are deleterious in one environment and beneficial or neutral in another (Lynch et al. 1999). In addition, understanding the pervasiveness of GEI is important in applying the results of laboratory studies to evolutionary processes in natural populations.

While it is not practical to conduct MA fitness assays of many model organisms (e.g. Drosophila or C. elegans) under natural conditions, it is feasible to conduct studies in a field setting with plant species such as Arabidopsis thaliana. We measured the effects of MA on the lines initiated by Shaw et al. (2000) under field conditions. Only two studies that we are aware of have attempted to measure MA under field conditions (Conner & Roles, in prep; Rutter & Fenster, in prep). We expand upon this work here, with two field sites, in Michigan and Virginia, studying germination through fruit production of A. thaliana MA lines. The use of two field sites allows us to study GEI for new mutations. GEI for new mutations has been examined in two laboratory studies of these A. thaliana MA lines (Chang & Shaw 2003; Kavanaugh & Shaw 2005). Each study analyzed the effects of a single manipulated environmental variable (nutrients and light, respectively) and neither found evidence of GEI; however, field habitats will rarely vary in only a single aspect. The use of two field sites, which will differ in many environmental variables as do natural habitats, should better reflect the differences that might be found between two natural populations.

Materials & Methods

Overview. A. thaliana fills many of the ideal requirements for an organism in which to study spontaneous mutation: short life cycle, high fecundity, and reproduction by selfing.
In addition, it is possible to maintain viable seeds for long periods, allowing us to directly compare fitness with and without newly accumulated mutations. Native to Europe, A. thaliana is widely distributed across the continent (Ratcliffe 1965) and has now invaded many other habitats, including North America. In this study, we planted seeds derived from the Columbia type in the field during fall 2004 and measured germination, biomass and reproduction through the spring 2005 season. Here we estimate the effects of mutations on overall fitness in the field. One of the field sites used (MI, see below) has natural populations of A. thaliana while the other site (VA, see below) does not normally contain A. thaliana.

The Columbia type of A. thaliana is derived from the original collection of F. Laibach's Landsberg seed, which originated in northwestern Poland (Landsberg/Warthe; Robbelen 1965) and was sent to G. Redei in Columbia, Missouri, where a single plant was chosen to found the Columbia ecotype. It should be noted that a subset of Laibach's Landsberg seeds, which founded the Landsberg erecta type, were irradiated but the seeds used to found the Columbia type were not irradiated (NASC; http://seeds.nottingham.ac.uk/Nasc/detail/2005/bglines.lasso). Neither of our field sites is the collection site of the original A. thaliana Columbia type, and thus neither reflects precisely the environment to which this genotype is expected to be adapted. However, A. thaliana is widely distributed and these sites do represent environments commonly experienced by this species.

Mutation accumulation. The mutation accumulation lines were developed and maintained in the lab of R. Shaw at the University of Minnesota as described in Shaw et al. (2000). The 120 MA lines are derived from the seeds of a single individual of the Columbia type. The founder was highly homozygous, so any average differences between MA lines are due only to new mutational variance.
The lines were maintained in the greenhouse by single-seed descent ($N_e$ = 1), which reduces selection and maximizes genetic drift. The founder genotype was maintained as seed stored at 4°C. Due to the expected large environmental variance in the field, we randomly chose a subset of 50 generation 17 MA lines for the field study. Prior to the field trial, five replicate sublines were created of each of the 50 MA lines and each of 6 Ancestor lines to generate seed (common garden generation). Ancestor lines were created from the generation zero seed and six such plants were chosen to represent Ancestor lines in the common garden generation. Offspring of the common garden generation were planted in the field for this experiment.

The entire design was replicated at two sites, Kellogg Biological Station in Hickory Corners, Michigan (MI) and Blandy Experimental Farm in Boyce, Virginia (VA). At each site, we planted 140 replicates of each MA line (28 seeds from each of five sublines × 50 lines = 7000 MA individuals). The Ancestor genotype was represented by 84 (VA) or 90 (MI) replicates of each of six Ancestor lines (18 seeds from each of five sublines × six lines = 540 Ancestor individuals at MI; 14 seeds from each of six sublines × six lines = 504 Ancestor individuals at VA). Individuals were planted in the field as seeds and germination was recorded in the fall. In the spring, survival, flowering and fruit production were recorded; entire plants were harvested upon cessation of flowering. Methods and layout for each site are detailed below.

Kellogg Biological Station, Michigan. The field site was a fenced enclosure on the site of an old agricultural field. Single seeds were planted in peat pots (size 4.45 cm by 5.08 cm deep) using topsoil originally from the site. This soil was used to line an artificial pond for ten years before being used in this experiment, which greatly reduced the viable seed bank.
Blank pots containing no seeds (N = 650) were planted to control for natural recruitment of A. thaliana in the field at MI. Of blank pots, 2.3% germinated a seedling and 0.9% produced flowers, while 26.4% of experimental pots germinated seedlings and 8.9% produced flowers. Of experimental pots that germinated seedlings, 4.7% produced more than one seedling. Germination of two seedlings could be due to accidental planting of two seeds, while germination of more than two seedlings is most likely due to recruitment from the natural population. Thus, we are confident that the vast majority of our data are from experimental plants.

Pots were planted in 70 blocks in the field, each containing 117 pots. Each block consisted of two seeds from each MA line, seven or eight Ancestor seeds, and nine or ten blank pots. Sublines were randomly assigned to blocks. Pot identity (MA seed, Ancestor seed, or blank) was assigned randomly. Blocks were spaced 0.75 m apart on the north-south axis and 1.0 m apart on the east-west axis to allow observation of each pot from above. Within blocks, pots were spaced 10 cm apart. Pots were filled with topsoil to about 1 cm from the top of the pot, seeds were placed into filled pots, and all pots were thoroughly bottom-watered and then misted from above just before planting in the field. Blocks were planted over three consecutive days starting October 25, 2004: Blocks 1-20 on day one, Blocks 21-40 on day two and Blocks 41-70 on day three. Pots were not watered after planting in the field. The first rain fell two days after planting was completed. Germination was recorded weekly until November 22nd. Twice-weekly checks of flowering and mortality began in the spring on April 5th. Above-ground biomass including fruits was collected for plants that flowered (N = 621) when they ceased flowering or died.

Blandy Experimental Farm, Virginia.
Seeds were planted in the same size peat pots as those at MI (4.45 cm by 5.08 cm deep) using Sunshine #3 as the soil mix. One seed was planted per pot. A. thaliana does not occur naturally in the experimental plot; thus no blank control pots were necessary at VA. Pots were planted in 14 blocks, each containing 536 pots. Each block consisted of 10 seeds per MA line (two plants per subline by five sublines) and 36 Ancestor seeds (six seeds per subline by six sublines). Blocks were spaced to allow for observation from above. Pots within blocks were spaced 10 cm apart. Pots were filled with soil mix to about 1 cm from the top of the pot. Blocks were planted over two days (October 30-31, 2004). Germination was recorded weekly until December 16th. Weekly flowering and mortality checks began in the spring on March 1st. Above-ground biomass including fruits was collected for plants that flowered (N = 1602) when they ceased flowering or died.

Data preparation. Four response variables were calculated for analysis. Germination and survival to flowering were coded as presence/absence. Germination included all individuals while survival to flowering included only individuals that germinated. Fruit number only included individuals that survived to flowering. Total fitness was calculated as the product of germination, survival to flowering and fruit number. Groups of five adjacent blocks at MI were combined for analysis to form blocks of a similar size to that of VA, resulting in 14 blocks at each site.

The distribution of raw line means for fruit number and total fitness was normal for the entire dataset, but when separated by site the distribution was normal for VA and not normal for MI. In addition, the variance of line means was much smaller due to the means being much smaller at MI than at VA for these three traits, thus violating the assumption of equal variances.
Log transformations were performed to address these issues, but results were similar so only analyses of untransformed variables are presented here. Germination and survival to flowering were normally distributed and not log-transformed. For presentation purposes, variables were relativized by site to reduce issues of scale. Relative values were calculated by dividing raw values by the site mean value. Except for the estimates of site means, the results of analyses of relative values were identical to those for raw values.

Evidence for accumulated mutations comes both from a change in mean fitness between the Ancestor (generation 0) and the MA lines (generation 17) and from the presence of significant among-line (genetic) variance of MA lines due to new mutations ($V_L$). The Ancestor is expected to have no genetic variance because it was derived from a single highly homozygous individual. Screening of molecular variation also failed to find genetic variation (Shaw et al. 2000). To test whether there is significant $V_L$ in the MA lines, among-line variance of MA lines was analyzed using the MIXED procedure in SAS (SAS Institute 2004). All traits were modeled individually as the sum of random effects of the ith line nested in the gth linetype ($l_i(t_g)$) and the jth maternal subline nested within each line and linetype ($m_j(l_i t_g)$) in the kth block ($b_k$), as well as the fixed effects of linetype ($t_g$; MA or Ancestor) and of the site at which the plant grew ($s_f$), the random effect of the interaction between site and line ($s_f * l_i(t_g)$), and error ($e$). This represents the equal variance model including the test for GEI ($s_f * l_i(t_g)$). An unequal variances model, estimating separate variances for each site, was also run for each variable. The separate variances estimated for each site replace the site by line interaction in the equal variance model. Parallel models fitting separate variances for each linetype (MA vs. Ancestor) were also fit.
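The derived variables described under Data preparation (total fitness as a product of components, pooling of MI blocks in groups of five, and relativizing by site mean) can be illustrated with a short Python sketch. This is not the code used for the original analysis; the function names and the example values are hypothetical, and the sketch assumes that adjacent blocks share consecutive block numbers.

```python
def total_fitness(germinated, survived_to_flowering, fruit_number):
    """Product of germination (0/1), survival to flowering (0/1) and fruit number."""
    return int(germinated) * int(survived_to_flowering) * fruit_number


def pooled_block(mi_block, group_size=5):
    """Map MI field blocks 1..70 onto pooled analysis blocks 1..14.

    Assumes adjacent blocks have consecutive numbers (an assumption of this sketch).
    """
    return (mi_block - 1) // group_size + 1


def relativize(values_by_site):
    """Divide each raw value by the mean of all values from the same site."""
    relative = {}
    for site, values in values_by_site.items():
        site_mean = sum(values) / len(values)
        relative[site] = [value / site_mean for value in values]
    return relative


if __name__ == "__main__":
    # Hypothetical plants: one that set 12 fruits, one that germinated but died.
    print(total_fitness(True, True, 12), total_fitness(True, False, 0))
    print(pooled_block(1), pooled_block(6), pooled_block(70))
    # Hypothetical line means on very different scales at the two sites.
    print(relativize({"MI": [0.5, 1.0, 1.5], "VA": [20.0, 40.0, 60.0]}))
```

In this hypothetical example the two sites differ forty-fold in raw scale but have identical relative values, which is the purpose of relativizing by site for presentation.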
The components of the model were estimated by restricted maximum likelihood (REML) and significance of each factor was tested with a likelihood-ratio test. Twice the difference in log-likelihood between the full model and the model with each effect removed was compared to a one-tailed Chi-squared distribution (one-tailed because variance components cannot theoretically be negative; Littell et al. 1996). The fit of the unequal variance model was also tested with a likelihood-ratio test. The unequal variance model was compared to an identical model which was constrained to equal line variances for the two sites and included the site*line interaction (as above). The unequal variance model was a better fit for all variables (P < 0.02; one-tailed Chi-square) except survival to flowering, thus this model was used in subsequent analyses. Survival to flowering was analyzed with the equal variances model. Estimating equal variances for the line effect was a better fit for all variables, overall and at each site.

For each trait overall and at each site, the per-generation increase in genetic variance due to mutation ($V_M$) was calculated as $V_M = V_L / 2t$, where $t$ is the number of generations of divergence (Lynch & Walsh 1998). Mutational heritability ($h^2_M$), which is the rate of increase in heritability due to new mutations, was calculated as $h^2_M = V_M / V_E$. The mutational coefficient of variation ($CV_M$), which standardizes $V_M$ by the trait mean, was estimated as $CV_M = 100 \times \sqrt{V_M} / \bar{X}$ (Houle et al. 1996).

Results

No significant decreases in mean fitness of the MA lines relative to the Ancestor were found for any trait (Table 3.1). In fact, the MA lines had significantly greater survival to flowering and a trend toward increased fruit number compared to the Ancestor. For nearly all traits, the REML estimate of among-line variance for the Ancestor was zero as expected, and was not significantly greater than zero for any trait (Table 3.2).
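The one-tailed likelihood-ratio test described in the Methods can be sketched in a few lines of Python. This is an illustration and not the SAS code used for the analysis. It assumes that each test removes a single variance component (one degree of freedom) and that the one-tailed P value is obtained by halving the upper-tail Chi-squared probability; the -2 log-likelihood values in the example are hypothetical, chosen to give a statistic of 9.5, the value legible for germination in Figure 3.2.

```python
import math


def chi2_upper_tail_1df(x):
    """Upper-tail probability of a Chi-squared variable with one degree of freedom.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def one_tailed_lrt(neg2ll_reduced, neg2ll_full):
    """Likelihood-ratio test for a single variance component.

    Arguments are -2 x (restricted) log-likelihood of the reduced and full models.
    Returns the test statistic and the halved (one-tailed) P value.
    """
    statistic = max(neg2ll_reduced - neg2ll_full, 0.0)
    p_value = 0.5 * chi2_upper_tail_1df(statistic)
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical -2 log-likelihoods differing by 9.5.
    statistic, p_value = one_tailed_lrt(1009.5, 1000.0)
    print(f"chi-square = {statistic:.1f}, one-tailed P = {p_value:.4f}")
```

Under these assumptions a statistic of 9.5 corresponds to a one-tailed P of roughly 0.001. Using only the standard library keeps the sketch self-contained, although the closed form applies to the one-degree-of-freedom case only.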
There was significant $V_L$ for total fitness overall (Figure 3.1, Table 3.2), which is evidence that mutations affecting fitness did indeed accumulate. This increased variance was due mostly to the germination and survival to flowering fitness components. However, because estimates of $V_L$ have large standard errors, we did not detect significantly more variance in the MA lines relative to the Ancestor for any trait. Values for mutational variance, mutational heritability and the mutational coefficient of variation (Table 3.2) fall within the range observed in other species (Lynch 1988; Houle et al. 1996).

Mean trait values were significantly different between MI and VA for all traits (Table 3.1). Significant GEI was found in the MA lines for all response variables. Plots of all traits in the two sites show substantial crossing of reaction norms (Figure 3.2). In contrast, there was no evidence of GEI among the Ancestor lines for any trait, as expected. Correlations of MA line means between the two sites are not different from zero (Figure 3.2), indicating that the trait value in one environment does not predict the trait value in the alternative environment. Thus, a mutation which decreases fitness in VA may be neutral in MI. There was no significant site by linetype interaction, indicating that though fitness is lower overall in MI the proportional change in fitness due to new mutation is similar in MI and VA.

[Table 3.1: contents not legible in the source scan.]
[Table 3.2: contents not legible in the source scan.]

Figure 3.1. Among-line variation in mean total fitness by site. Gray distribution = MA lines. Black distribution = Ancestor. Values were averaged by line. $N_{MA}$ = 50 families. $N_{Anc}$ = 6 families. (A) VA, (B) MI.

Figure 3.2. Genotype by environment interaction for all traits. Reaction norms of relative trait values, calculated by dividing by the mean within each environment. Each point represents a line mean. (A) germination rate, MA, (B) germination rate, Ancestor, (C) survival to flowering, MA, (D) survival to flowering, Ancestor, (E) fruit number, MA, (F) fruit number, Ancestor, (G) total fitness, MA, (H) total fitness, Ancestor. Chi-square and P values for significance of GEI are shown.
Correlation coefficients and P values for the significance test for a non-zero correlation are given for all MA plots.

[Figure 3.2, panels A-H: reaction norms of relative trait values between MI and VA. Legible panel statistics: (A) r = 0.044, P = 0.76; χ² = 9.5, P = 0.001. (B) χ² = 0.4, P = 0.26. (C) r = 0.014, P = 0.92; χ² = 2.6, P = 0.05. (F) χ² = 1, P = 0.16. (H) χ² = 0.3, P = 0.29.]

Discussion

Effects of mutation on mean and variance

In agreement with earlier laboratory studies of these lines, we found no evidence of decreased mean fitness of the MA lines relative to the Ancestor (Table 3.1; Figure 3.1), but we did detect significant genetic variance due to new mutations ($V_L$) for total fitness overall (Table 3.2), confirming that spontaneous mutations accumulated in these lines. The presence of significant GEI for all of the fitness components in the MA lines is further evidence that mutations have accumulated. Prior studies of these MA lines have also failed to find a mean decrease in fitness in various laboratory environments (Shaw et al. 2000; Chang & Shaw 2003; Kavanaugh & Shaw 2005). Additionally, an independent experiment using the same A. thaliana ancestor and single-seed descent to accumulate mutations found no change in trait mean after 11 generations of MA, even with one set of MA lines subject to UV-B treatment (MacKenzie et al. 2005).
These findings of no decline in fitness of MA lines relative to the Ancestor stand in contrast to many published MA studies that detect decreased fitness, including an independent study of 1000 MA lines of A. thaliana Landsberg erecta (Schultz et al. 1999). After ten generations of single-seed descent mutation accumulation, Schultz et al. (1999) detected a small decrease in fitness relative to the Ancestor for two fitness components (seed set and total fitness) and significant among-line variance for one (seed set). However, they failed to find any change in mean or variance for two other traits (germination success and fruit set). It is important to note, however, that Schultz et al. (1999) used the Landsberg erecta type of A. thaliana, and it is known that Redei mutagenized Landsberg seeds to create Landsberg erecta. Thus, Schultz et al. (1999) have a different starting genotype. Keightley & Lynch (2003) suggest that the length of MA and relatively small number of lines employed by Shaw et al. (2000) may explain the difference between these two studies.

Indirect molecular evidence supports the idea that mutation should decrease fitness in A. thaliana: Wright et al. (2002) estimated that 88% of mutations are expected to be deleterious in this species, based on tests for selective constraint on protein-coding genes. However, failure to detect a decline in mean fitness due to MA, at least under some assay conditions, is not an uncommon result (Keightley & Caballero 1997; Shabalina et al. 1997; Vassilieva & Lynch 1999; Shaw et al. 2000; Zeyl & DeVisser 2001; Azevedo et al. 2002). As suggested above, the length of the period of mutation accumulation may have been too short to accumulate a substantial number of mutations affecting fitness. A longer period of mutation accumulation may eventually result in a mean decline in fitness for these A. thaliana MA lines.
Challenges to MA studies

Mutation accumulation studies are, in theory, an excellent method of observing the distribution of mutational effects, estimating the average effect of mutations on fitness and estimating the rate of genomic spontaneous deleterious mutations for fitness. However, these studies are difficult for a number of reasons. They are time-consuming and frequently fail to detect significant changes in means and variances. Ideally, MA studies would meet the following criteria: (1) No genetic variance at generation zero. This would ensure that all observed genetic variance is due to new mutations. (2) Perfect storage and maintenance of the ancestral genotype (i.e., no opportunity for selection on the ancestor) so that observed decreases in fitness could not be due to increased fitness of the ancestor. (3) Multiple independent lines initiated from multiple homozygous genotypes; this would give us a better idea of an "average" mutational effect than the use of a single genotype, given that mutational effects are not completely independent of genetic background. (4) Rapid fixation of mutations and no loss due to drift. (5) Perfect reduction of selection to zero so that all mutations are maintained (impossible due to lethal mutations); the latter two qualities would give us the full, unbiased spectrum of mutations that occurred rather than just a sample. (6) A long period of accumulation, at least fifty generations for diploids; a long period of accumulation allows a sufficient number of mutations to accumulate such that we can unambiguously detect a change in mean fitness. (7) Accumulation performed under multiple environmental conditions (as in Xu 2004); use of multiple accumulation conditions allows us to estimate the bias in our results due to highly deleterious mutations being removed regardless of our attempts to reduce selection.
(8) Fitness assayed under multiple environmental conditions, including environments representing the natural habitat of the organism as well as the environment under which mutations accumulated; use of multiple assay environments helps us to understand the effect of mutations, averaged across many environments (similar to averaging across genetic backgrounds).

Many of these characteristics are not achievable; no experiment can perfectly reduce selection to zero or rapidly fix mutations while minimizing genetic drift. MA experiments performed with organisms with short generation times and selfing or asexual reproduction come the closest to meeting these criteria. However, selection cannot be reduced completely to zero and as a result the estimates of mutation rate and effect are biased (due to the removal of mutations that are highly deleterious). In addition, few MA experiments have utilized multiple genotypes (but see Baer et al. 2006) or accumulated mutations under multiple conditions (but see Xu 2004). The use of multiple environments and genotypes is one approach to reduce or estimate the bias in mutation rates and effects. Given these conditions, it is not surprising that a number of MA studies fail to detect a decline in mean.

The A. thaliana MA lines employed here meet many of the criteria outlined above: they are selfing, started from a highly homozygous base and maintained by single-seed descent. Though it is possible that more generations of MA would lead to a significant decline in the mean, other alternatives may also explain these results. Shaw et al. (2002) modeled the distribution of mutational effects for the MA lines employed by Shaw et al. (2000) and concluded that the presence of new beneficial mutations might explain the observed phenotypic distribution, with perhaps up to half of mutations being considered beneficial. This has been a controversial interpretation.
Keightley & Lynch (2003) suggested several alternative explanations: (1) the traits measured are under stabilizing selection, (2) the length of the mutation accumulation and replication of lines was insufficient, and (3) the analysis does not compare alternative models. Shaw et al. (2003) support the interpretation of seeds per fruit and fruits per plant as appropriate measures of fitness in their study. They concede that with a longer period of mutation accumulation, a decline in fitness may be apparent. Finally, regarding the analysis, Shaw et al. (2003) defend their use of a distribution of mutational effects continuous through zero but allow that their model continues to undergo revision.

The issue of statistical power is also relevant to our study. Perhaps with greater replication a significant decline in mean would be apparent. However, it is interesting to compare the lack of decline in mean found by Shaw et al. (2000) to the significant decline in mean observed in Raphanus raphanistrum for total fitness in the greenhouse (Roles & Conner, Chapter 2). Roles & Conner (Chapter 2) assayed fitness (total seeds) in the greenhouse after ten generations of MA, using a middle-class neighborhood (MCN) crossing design. Roles & Conner (Chapter 2) maintained two independent populations, each with $N_e$ ~ 300. This crossing design reduces selection by equalizing reproductive output while drift is minimized by the maintenance of a large population. Theoretically, many mutations are becoming fixed in A. thaliana while none are fixed in R. raphanistrum (where they are maintained in heterozygotes). Thus, with relatively low statistical power (few generations, outbred design) Roles & Conner (Chapter 2) detected decreased fitness while Shaw et al. (2000) were unable to detect a change in mean. This comparison suggests that alternative explanations for the results of MA in A. thaliana should be sought.
For example, fungal endophytes are ubiquitous and may protect plants from pathogens, leading to increased fitness (Herre et al. 2005). If some of the MA lines acquired fungal endophytes during the MA process, they might display higher fitness than those without the endophytes, and the signal of mutational decay would be masked.

Mutational variability

Though we did not detect a decline in mean fitness, we did observe significant among-line variance in the MA lines for several traits (Table 3.2). Mutational heritability (h²_M) describes the per-generation increase in heritability due to new mutations for a population with a homozygous base (Houle et al. 1996). Our values (~1 × 10⁻⁴) are an order of magnitude lower than the average (1 × 10⁻³; Houle et al. 1996) and on the low end of the observed range for various fitness components in other studies and species (e.g., Fernandez & Lopez-Fanjul 1997; Lynch et al. 1999; Schultz et al. 1999; Vassilieva et al. 2000; Downie 2003; Xu 2004; Kavanaugh & Shaw 2005; Baer et al. 2006). In comparison to laboratory studies using the same MA lines, our estimates of mutational heritability are much lower than those found by Shaw et al. (2000) and by Chang & Shaw (2003) (~1 × 10⁻³) but similar to those reported by Kavanaugh & Shaw (2005).

One possible explanation for the low mutational heritability estimated here is high environmental variance. Prior studies have been performed under controlled conditions in the greenhouse and are expected to have lower environmental variance (V_E) than the uncontrolled field environment. For direct comparison, the V_E = 1131.3 for fruit number reported by Shaw et al. (2000) is substantially lower than our field estimates of V_E = 4516.94 for total fitness overall and V_E = 8959.53 for total fitness in VA. On average, V_E is about 10³ V_M (Houle et al. 1996) while in our study V_E is on the order of 10⁴ V_M; this order-of-magnitude difference accounts for our lower than average values for mutational heritability.
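The scaling argument above can be checked with a few lines of arithmetic. The sketch below assumes the usual definition of mutational heritability as the ratio of mutational variance to environmental variance (V_M / V_E), which is what the comparison in the text implies; the function name and the arbitrary unit value for V_M are illustrative only and are not taken from the analysis.

```python
def mutational_heritability(v_m: float, v_e: float) -> float:
    """Mutational heritability, assumed here to be V_M / V_E."""
    if v_e <= 0:
        raise ValueError("environmental variance must be positive")
    return v_m / v_e


# V_M in arbitrary units (illustrative); only the V_E : V_M ratio matters.
v_m = 1.0

# V_E about 10^3 times V_M: the cross-study average cited in the text.
h2_typical = mutational_heritability(v_m, 1e3 * v_m)

# V_E about 10^4 times V_M: the order of magnitude seen in the field assay.
h2_field = mutational_heritability(v_m, 1e4 * v_m)

print(h2_typical, h2_field)
```

With V_M held fixed, a tenfold larger V_E gives a tenfold smaller mutational heritability, which matches the gap between roughly 10⁻³ and roughly 10⁻⁴ described above.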
An alternative measure of mutational variability that does not depend on V_E but instead scales the variance by the mean is the mutational coefficient of variation (CV_M). In a review of mutational variability, Houle et al. (1996) found that CV_M varies substantially among traits and organisms, with a range of ~0.13-29.22. Our estimates of CV_M (0.55-4.2) are within the range observed for life history characters in Houle et al. (1996) and similar to the estimates for similar traits in Arabidopsis (Shaw et al. 2000). These parameters suggest that while mutational variance is expressed similarly in the field and laboratory, the greater environmental variance in the field may decrease the efficiency of selection on new mutations relative to the laboratory environment.

GEI for new mutations

Contrary to prior studies of these A. thaliana MA lines (Chang & Shaw 2003; Kavanaugh & Shaw 2005), we detected significant GEI for all four fitness components (Figure 3.2). Crossing of reaction norms is evident for these traits and all show a cross-environment correlation that is nearly zero. With no GEI for new mutations we would expect a correlation not different from one, whereas a cross-environment correlation significantly less than one is evidence for GEI. These results indicate that the mutations are not consistent across environments in their effects; lines with the highest fitness in MI often did not have high fitness in VA.

Prior experiments testing for GEI in these A. thaliana MA lines have failed to detect any interaction when altering nutrient conditions (Chang & Shaw 2003) and light conditions (Kavanaugh & Shaw 2005). In contrast, we found substantial GEI in our two field experiments. One possible explanation for this lack of agreement is the assay environments. Our two field environments are likely to differ from each other in many unquantified ways while the two prior laboratory studies each differed in a single variable.
Sample size, and thus power, may also have been a factor. Chang & Shaw (2003) and Kavanaugh & Shaw (2005) each assayed just 20 MA lines, with 20 replicates each, while we assayed 50 MA lines, with 140 replicates each. Of course, the greater environmental variance of the field may offset the larger number of replicates. Manipulative experiments of single variables in the field may help to elucidate the relationship between field and laboratory assays.

Mixed results are common among MA studies of GEI for new mutations. In D. melanogaster, four studies have found evidence of GEI (Kondrashov & Houle 1994; Fry et al. 1996; Fernandez & Lopez-Fanjul 1997; Shabalina et al. 1997) while one study found no evidence of GEI (Fry & Heinsohn 2002). It is notable that Fry & Heinsohn used MA lines that accumulated mutations for only about 30 generations versus over 60 generations for the other four studies. Perhaps the effects of GEI after 30 generations were too subtle to be detected but would become apparent after more accumulation. Studies in C. elegans have failed to find evidence of GEI for new mutations (Vassilieva et al. 2000; Baer et al. 2006). Both studies accumulated mutations for approximately 200 generations, far longer than the studies in Drosophila or Arabidopsis. However, the mutation rate may be lower in C. elegans, so a longer period of accumulation may be necessary to observe evidence of GEI for new mutations (Baer et al. 2006). Finally, a single study in Saccharomyces cerevisiae found evidence of GEI for new mutations (Szafraniec et al. 2001). These studies encompass a broad range of taxa and life history strategies. Taken together, the available evidence suggests that GEI for new mutations is common but may not be ubiquitous. Where GEI is present, it is likely that U for fitness is underestimated in the laboratory with respect to its value in field environments, as stressful environments often display lower fitness than benign environments.
In summary, we have found no significant changes in mean fitness, in agreement with previous studies of these MA lines. However, we did find evidence of substantial GEI for new mutations in two field environments, contrary to prior work in these lines. These results suggest that the distribution of mutational effects may be similar between laboratory and field environments but that single-variable manipulation in the laboratory may not reflect GEI in a field environment. Further study of genotype-environment interaction in the field is needed to understand this discrepancy.

CHAPTER 4

LINKING GENE EXPRESSION AND SPONTANEOUS MUTATIONS: MICROARRAY STUDIES OF ARABIDOPSIS THALIANA MUTATION-ACCUMULATION LINES

Introduction

The generation of variation through spontaneous mutation is a crucial part of the evolutionary process. The rate and distribution of fitness effects of spontaneous mutations are important determinants of predictions of the maintenance of genetic variation (Houle, Morikawa, and Lynch 1996), the evolution of sex (Keightley and Eyre-Walker 2000), the evolution of aging (Rose 1991) and the persistence of small populations (Lande 1994; Lynch, Conery, and Burger 1995).

Most studies of spontaneous mutation have used a phenotypic approach known as mutation accumulation. In mutation accumulation (MA), selection is reduced for many generations, allowing deleterious mutations to accumulate. The cumulative effects of those mutations are visible in their effects on fitness when compared to a control that has not experienced MA. Mutation accumulation is usually performed with selfing or highly inbred species and proceeds by initiating all replicate lines from a single highly homozygous parent individual (e.g. Schultz et al. 1999; Vassilieva & Lynch 1999; Shaw et al. 2000). Selection is reduced and the effects of genetic drift maximized.
This is achieved by maintaining a very small population size (often one randomly chosen individual each generation), with the result that selection must be quite strong to remove a new mutation (s > 0.5; Lynch et al. 1999). After MA, the fitness of the MA lines is assayed alongside a control line. The whole-genome rate of deleterious mutations impacting fitness (U) can then be estimated from the change in mean fitness and the increase in additive genetic variance of the MA lines relative to the control lines (Lynch et al. 1999). While MA does directly examine the effects of new mutations on fitness, it cannot reveal information about the nature of individual mutations (i.e. which genes are affected; what kinds of mutations are common). Mutation accumulation studies focus on the highest-level phenotype, fitness, and typically ignore mutational effects on lower-level phenotypic traits. This method also underestimates U due to the likelihood of failing to detect mutations of small effect in the laboratory (Lynch et al. 1999).

Alternatively, direct sequencing of MA lines has been employed recently to estimate the mutation rate per nucleotide site per generation (u; Haag-Liautard et al. 2007; Denver et al. 2004). One can then use u to estimate U by multiplying by estimates of the size of the organism's diploid genome (number of bases, 2G) and the fraction of mutations removed by selection (C; Kimura 1983). This method directly examines new mutations but is reliant on good estimates of selective constraint, C, which is usually estimated from between-species comparisons of substitution rates (Halligan & Keightley 2006). While this approach provides information on the rate, U, of genomic deleterious spontaneous mutation, it provides no information on the effects of those mutations on fitness (aside from the assumption of decreasing fitness).
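The sequencing-based calculation just described amounts to a single product. As a minimal sketch, writing G for the haploid genome size in bases (so the diploid genome has 2G bases), the genomic deleterious rate is U = 2G × u × C. The function name is hypothetical, and the input values below are round placeholders chosen only to make the arithmetic easy to follow; they are not estimates for any species.

```python
def genomic_deleterious_rate(u: float, haploid_genome_bp: float, constraint: float) -> float:
    """U = 2G * u * C: per-site rate scaled to the diploid genome and to
    the fraction of mutations removed by selection."""
    if not 0.0 <= constraint <= 1.0:
        raise ValueError("constraint C is a fraction and must lie in [0, 1]")
    return 2.0 * haploid_genome_bp * u * constraint


# Placeholder inputs (illustrative only, not measured values):
u_per_site = 1e-8   # mutations per nucleotide site per generation
genome_bp = 1e8     # haploid genome size G, in bases
constraint = 0.05   # fraction of mutations removed by selection, C

U = genomic_deleterious_rate(u_per_site, genome_bp, constraint)
print(U)
```

Because U is linear in C, any uncertainty in the constraint estimate carries over proportionally to U, which is why the quality of C matters so much for this method.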
In addition, estimates of C are devoid of the genetic or environmental context on which mutational parameters may depend; thus, their general applicability is unclear (Shaw et al. 2003).

A relatively new approach employs MA in conjunction with estimates of gene expression to infer the properties of spontaneous mutation and its contribution to the evolution of gene expression. The degree to which a gene is expressed is measured by the amount of mRNA produced by that gene. Gene expression is the first phenotype produced by a DNA sequence and the effects of changes in gene expression may cascade through to affect the fitness of the whole organism. Currently, we know very little about how changes in gene expression result in differences in visible phenotypes; thus, the study of gene expression is a very active area of research (Gibson 2002). However, the study of gene expression holds promise to address the question of how so few sequence differences can produce such dramatic phenotypic differences between organisms (e.g. the 1.5% sequence difference between chimps and humans). Assays of gene expression generally do not detect differences in the sequence of the loci whose expression is assayed but rather differences in regulation of those sequences. The idea that regulatory differences may be more important in evolution than structural change in proteins is an old one, first proposed by King & Wilson (1975) and more recently championed by Carroll (2005). Microarrays now allow us to address the importance of regulatory changes to phenotypic evolution.

There have been several recent studies that address the selective importance of variation in gene expression. Derome et al. (2006) found evidence of adaptive changes in gene expression in sympatric lake whitefish and Matzkin et al. (2006) identified changes in gene expression associated with host shifts in Drosophila mojavensis. However, the overall selective importance of variation in gene expression is contested.
Khaitovich (2004) has suggested that most variation in gene expression may be neutral or nearly neutral, thus contributing little of adaptive significance.

While several studies have found evidence for adaptive changes in gene expression, those studies do not tell us whether most variation in gene expression evolves neutrally or selectively. To date, two studies have addressed this question, using MA lines. Denver et al. (2005) studied divergence in gene expression between MA lines in Caenorhabditis elegans (353 generations of single-progeny descent) and contrasted this to expression profiles for natural isolates. They compared genetic variance (V_g) in the natural isolates to mutational variance (V_m) in the MA lines. Under neutral models, the ratio V_g/V_m is expected to be equal to 4Ne for primarily selfing diploid organisms (Lynch & Hill 1986). If purifying selection is important, this ratio will be less than the neutral expectation. Denver et al. (2005) found that the ratio was much less than the neutral expectation for all genes, suggesting strong stabilizing selection on gene expression. Rifkin & Houle (2005) also estimated V_m for gene expression, in MA lines of Drosophila melanogaster after 200 generations of relaxed selection. Similar to the comparison of Denver et al. (2005), they used the estimates of V_m to compare the expected difference in expression between Drosophila species under mutation-drift equilibrium to the observed difference. For almost all genes for which they could estimate V_m, the differences between species were less than expected, again indicative of the impact of stabilizing selection. The importance of stabilizing selection also suggests that most mutations affecting gene expression are likely to be deleterious (Gilad 2006).

Neither of the above studies addressed the distribution of mutational effects (i.e., the mean and variance of effect size of new mutations) on gene expression.
Thus far, the distribution of mutational effects has been studied from the phenotypic perspective of fitness components (e.g. Shaw et al. 2000; Vassilieva et al. 2000; Baer et al. 2006) but not using molecular data.

In this chapter I report assays of gene expression in MA lines of Arabidopsis thaliana, which were generated in the Shaw lab at the University of Minnesota and maintained by selfing with single-seed descent for 17 generations (Shaw et al. 2000). I chose three MA lines that exhibited the greatest reduction in fitness relative to the ancestor. Assuming that differences in gene expression translate into differences in fitness, the greater the difference in fitness, the more likely that changes in gene expression will be visible with microarrays. I ask the following questions about mutation using these data:

1) What is the distribution of mutational effects on gene expression?
2) On average, do new mutations increase or decrease levels of gene expression or is there no bias?
3) Do independent MA lines show any parallel changes in gene expression? Parallel changes would be indicative of different mutations affecting the same pathway or expression of the same gene rather than an identical mutation.
4) What is the degree of pleiotropy in new mutations affecting gene expression? The presence of clusters of genes which display similar changes in expression gives an indication of the pleiotropy of new mutations.

Methods

Selection of Arabidopsis MA lines. Lines were chosen based on the results of Shaw et al. (2000). Given that we expect most mutations to be deleterious, we chose lines with low fitness relative to the Parent. However, the mutations causing low fitness in the assay environment of Shaw et al. (2000) may not be detrimental under all environments. As we found in Chapter 3, genotype-by-environment interaction does exist in these lines; thus, we do not know whether the mutations in these three lines are detrimental under these growth conditions.
Two of the chosen MA lines, L40 and L5, were also included in the field assay (Chapter 3). Neither of them had lower than average MA fitness in the Virginia field site but both of them had lower than average MA fitness in the Michigan field site. Thus, we cannot be sure that these mutations would cause the same phenotype under these growth conditions as was seen in the greenhouse by Shaw et al. (2000). Ideally, we would have included three lines of high relative fitness (as defined in the Shaw et al. [2000] assay) but resource limitations prevented the expansion of this project to six MA lines.

Three lines that exhibited reduced fitness relative to the ancestor were chosen for analysis (L-39, L-40, L-5). Within each MA line, seeds are produced by selfing and are identical except for new mutations in that individual. Three ancestral lines were also selected to represent the baseline expression profile (P-61, P-26, and P-98). The Parental lines are one generation removed from generation zero and were created to generate seed. These three Parental lines were treated as a single line (P) in the analysis of gene expression described below.

Four biological replicates of the mRNA of each line were generated and hybridized to arrays in pairs, one Parental line with one MA line per slide (Table 4.1). In each pair of samples, the MA line was labeled with one fluorescent dye and the Parental line with another. A dye-swap was performed for each pair of samples to account for different labeling efficiencies of the two dyes. The dye-swap design was repeated once, resulting in four pairings of each MA line and a Parental line. The experiment was designed to compare each MA line profile against the control profile of the Parent (Steel et al. 1997).

Table 4.1. Dye-swap array design. All pairings include one MA line (L-5, L-39, L-40) and one Parent line (P-26, P-61, P-98). Each line was labeled twice with each dye for four total biological replicates.
Array   Cy3-labeled sample   Cy5-labeled sample
1       P-26                 L-5
2       L-5                  P-26
3       L-5                  P-61
4       P-98                 L-5
5       P-26                 L-40
6       P-61                 L-40
7       L-40                 P-98
8       L-40                 P-61
9       P-98                 L-39
10      P-61                 L-39
11      L-39                 P-26
12      L-39                 P-98

Growth of seeds. Seeds from each line were sterilized with a solution of 50% bleach, 0.1% Tween-20 and washed with sterile water before planting on sterile MS agar plates (4.33 g/L Murashige-Skoog 1X salts, 10 g/L sucrose, 7 g/L bacto agar, pH 5.7). Several hundred seeds were suspended in 2 mL of sterile 0.1% agarose and spread on each MS plate. Plates were placed at 4°C for two days then moved to a growth chamber maintained at 21°C with a 12-hr light, 12-hr dark cycle. Shoot biomass was harvested ten days after moving plates to the growth chamber. Each plate was flash-frozen with liquid nitrogen and the frozen shoot biomass was aliquoted into 1.5-mL labeled microcentrifuge tubes and stored immediately at -80°C.

RNA isolation. RNA was isolated from frozen tissue ground in liquid nitrogen using a Qiagen RNeasy Plant Mini Kit (Qiagen, USA), yielding between 50 and 100 µg total RNA. The quantity of RNA was assessed using the A260/A280 ratio as measured on a spectrophotometer. The quality of RNA was assessed with a Bioanalyzer 2100 (Agilent Technologies, Palo Alto, CA) using the Total RNA Pico assay. RNA was extracted from each line four times for 24 total isolations (four isolations × six lines), giving four biological replicates for each line. All four tissue samples for a line were derived from a single MS plate, thus minimizing environmental differences between replicates. Isolated RNA was stored in water at -80°C.

RNA labeling, purification and dye coupling. Samples were reverse transcribed into cDNA with aminoallyl-labeled dNTPs. The samples were purified using a Qiagen PCR Purification kit (Qiagen, USA) and vacuum-dried (10 µL 95% ethanol was added to ensure rapid drying). The purified aminoallyl-labeled cDNA was coupled to an NHS-ester Cy dye (either Cy3 or Cy5).
A second purification was performed with the Qiagen PCR Purification kit to remove uncoupled dye and the sample was again vacuum-dried. A small sample (2.5 µL) was reserved to check for proper coupling of the dye to the cDNA on an agarose gel. After drying, visual inspection of the pellet confirmed coupling (a colored pellet indicates proper incorporation of the dye) before proceeding with slide hybridization.

Slide preparation. Arabidopsis oligonucleotide microarrays were obtained from the University of Arizona (http://ag.arizona.edu/microarray). Glass-slide arrays were printed with the 29,000 element Qiagen-Operon Arabidopsis Genome Array Ready Oligo Set (AROS) Version 3.0. The AROS contains 29,110 oligonucleotide probes selected from the TIGR A. thaliana genome annotation database (release 5.0; www.tigr.org). Each oligo is 70 nucleotides long and chosen to show less than 70% identity to all other transcripts (A. thaliana Genome 3.0 Datasheet; www.operon.com). Arrays were rehydrated before cross-linking with exposure to 180 mJ UV in a Stratalinker. Just before hybridization, the cross-linked arrays were washed sequentially with 1% SDS, sterile water and 95% ethanol, then spun dry.

Hybridization & Array Scanning. Samples were paired as in Table 4.1, one Cy3 with one Cy5, resuspended in EDTA and combined. Yeast tRNA was added (1 µL) to reduce background hybridization. The cDNA was denatured by heating to 95°C for 10 minutes. The denatured samples were mixed with 60 µL hybridization buffer and pipetted onto the array underneath a clean lifter-slip. The array was sealed in a hybridization chamber and incubated at 48°C for 14 to 24 hours. After hybridization was complete, the array was washed and dried before scanning with an Affymetrix 428 scanner.

Analysis & Results

The analysis was performed in collaboration with L. McIntyre and J. Pienaar at the University of Florida, Gainesville.
Array images were quantified using ImaGene software version 6.0 (BioDiscovery; www.biodiscovery.com). Images were aligned and spots quantified using a moat. Each spot in an array image consists of multiple pixels and the boundary between pixels representing spots and those representing background must be determined. By default the boundary is defined sharply but in fact there is a transitional zone between background and spot (where signal intensity is intermediate between background and spot). A moat defines this transitional zone, thus clearly delineating spot signal from background signal. Once the moat was defined, mean intensity of pixels in each spot (mean spot intensity) and mean intensity of local background pixels (mean background signal) were calculated for each spot. Intensity was quantified separately for each dye on an array, resulting in 24 intensities for each spot (12 slides × 2 dyes). The median local background intensity was subtracted from each mean spot intensity to correct for variation in background intensity across the slide. The data were then centered so that each array and dye combination has a median spot intensity of zero, increasing comparability across arrays. To center the data for each array and dye, we subtracted the median background-subtracted spot intensity for an array/dye combination from the background-subtracted intensity for each spot.

Each slide contains 31,200 total spots. Of these, 1,946 represent blank or control oligonucleotide sequences. An additional eleven oligonucleotide sequences were not unique and are considered positive control sequences excluded from this analysis (N = 155 spots). Control spots were used only for data preparation and not in the ANOVA analysis, leaving 29,099 spots representing unique oligonucleotide sequences. Negative control spots allow us to estimate the random, non-specific hybridization of cDNA in the absence of a match between the spotted oligonucleotide and the sample cDNA.
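The two data-preparation steps described above (local background subtraction followed by median centering within each array and dye) can be sketched as follows. This is an illustration only, not the ImaGene or SAS code used for the analysis; the function name and the five example intensities are hypothetical, and the text does not say whether intensities were transformed (for example, to a log scale) before centering, so none is applied here.

```python
from statistics import median


def center_array(mean_spot, local_background):
    """Background-subtract, then median-center, the spots of one array/dye.

    mean_spot and local_background are parallel lists with one entry per spot.
    """
    corrected = [s - b for s, b in zip(mean_spot, local_background)]
    mid = median(corrected)  # median background-subtracted intensity
    return [c - mid for c in corrected]


# Hypothetical intensities for five spots on one array/dye combination.
spots = [120.0, 340.0, 95.0, 210.0, 180.0]
background = [20.0, 40.0, 15.0, 10.0, 30.0]

centered = center_array(spots, background)
print(centered)  # [-50.0, 150.0, -70.0, 50.0, 0.0]
```

After this step the median of every array/dye combination is zero, so a value such as 0.45 or -2.67 in the results tables is an intensity relative to the typical spot on that array rather than a raw signal.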
Negative control spots consisted of spots containing a buffer (3xSSC; N = 462) or random oligonucleotides (Alien spots; N = 92). The null distribution of intensity values in the absence of hybridization for each slide and dye combination was estimated from these controls. Hybridization was considered successful if a spot had an intensity greater than that of 95% of the negative controls for that slide and dye combination. Spots were included in the analysis if this intensity threshold was met in any two replicates of a treatment. This excluded 5,846 spots, 20% of the total unique oligonucleotides.

In addition, some genes were represented by multiple probes on the array (N = 1,681 genes represented by 4,529 spots) and these were set aside for consideration in a later analysis. Some of the probes for these genes represent splice variants that are present in only a subset of gene transcripts while others represent sequences that are present in all transcripts. Thus, the multiple probes are not replicates and need to be accounted for separately in the model. Only genes that were detected and represented by a single spot were analyzed here. There are 24,570 genes represented by a single spot and 59% of these genes were considered detected (N = 14,450) and analyzed for differential expression.

Intensity of each spot (gene) was analyzed in a univariate ANOVA using SAS v.9.1 (SAS Institute 2004) in the model

Y_{ijk} = \mu + d_i + l_j + \varepsilon_{ijk}

where Y_{ijk} is the transcript abundance for dye i, line j, and replicate k. Each transcript abundance is modeled with the overall mean for that transcript (\mu), the fixed effect of dye (d), the fixed effect of line (l) and error (\varepsilon). Contrasts were constructed to compare the mean intensity for each MA line against the mean intensity of the parental lines (i.e. L39 versus Parent). Nominal P-values are reported here (P < 0.005).
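The detection filter described earlier in this section (a spot must exceed 95% of the negative controls on its slide and dye, in at least two replicates of a treatment) could be implemented along the following lines. This is a hedged sketch: the function names are hypothetical, the nearest-rank percentile is an assumed convention that the text does not specify, and the control and spot intensities are made up for illustration.

```python
def detection_threshold(negative_controls, percent=95):
    """Nearest-rank percentile of the negative-control intensities.

    A spot strictly above the returned value is brighter than at least
    `percent` percent of the controls (nearest-rank is an assumption).
    """
    ranked = sorted(negative_controls)
    rank = (percent * len(ranked) + 99) // 100  # integer ceiling, 1-based
    return ranked[rank - 1]


def is_detected(intensities, thresholds, min_replicates=2):
    """True if the spot exceeds its slide/dye threshold in enough replicates."""
    hits = sum(1 for x, t in zip(intensities, thresholds) if x > t)
    return hits >= min_replicates


# Hypothetical negative controls for one slide/dye combination.
controls = [float(i) for i in range(1, 21)]
threshold = detection_threshold(controls)

# Four replicates of one treatment, each compared with its own threshold
# (the same illustrative threshold is reused here for brevity).
kept = is_detected([25.0, 12.0, 30.0, 5.0], [threshold] * 4)
dropped = is_detected([25.0, 12.0, 10.0, 5.0], [threshold] * 4)
print(threshold, kept, dropped)
```

Computing the threshold separately for each slide and dye, as the text describes, keeps a dim array from losing spots simply because another array in the experiment had brighter controls.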
A False Discovery Rate (FDR) correction was performed to account for multiple comparisons, resulting in all genes having P = 0.9. FDR controls the proportion of false positives expected in the list of tests rejecting the null hypothesis. Thus, with an FDR set to 0.05 and 100 tests that reject the null hypothesis, we expect five of those to be false positives. This correction is common in microarray studies and is less conservative than a Bonferroni correction (Verhoeven et al. 2005). Here, nominal significance is used to identify candidate genes for further analysis. Genes which have differential expression for each contrast are reported in Tables 4.2-4.4.

Genes were detected as having differential expression for each comparison of an MA line to the Parent. Four loci in L05, seven loci in L39 and 28 loci in L40 were found to have significantly different expression from the Parent (P < 0.005). The distributions of difference from the Parent for each MA line are shown in Figures 4.1-4.3 for all genes analyzed. Further analysis will cluster nominally significant genes by function (e.g., reproduction, physiology, photosynthesis) to study over-representation of functional clusters. For example, we might expect genes involved in reproduction to be over-represented in the set of significantly differentially expressed genes.

Discussion

Significant differences in gene expression between MA lines and the Parent have been detected for each of three low fitness phenotypes. While these genes are only nominally significant and differential expression has not been confirmed, they make a good starting point for future exploration of the data. The genes in Tables 4.2-4.4 can be considered a list of candidate genes whose expression is changed as a result of mutation. The mutations affecting gene expression are most likely present in regulatory elements rather than in the gene whose expression is altered.
In addition, these genes are candidates for the path from genotype to fitness. Mutations which alter gene expression may not alter fitness; thus, further study is needed to establish a causal link between gene expression and fitness.

Table 4.2. Least-square mean estimates of centered background-subtracted mean intensity (s.e.) for 4 nominally significant genes in the contrast L05 minus Parent. Direction of change in expression relative to the Parent is indicated in the last column: (+) increased expression in L05, (-) decreased expression in L05.

Locus        L05             Parent          P        Direction
At3g02555     0.45 (0.07)     0.24 (0.04)    0.0017   +
At4g00810    -2.67 (0.83)     0.95 (0.29)    0.0020   -
At4g20690     1.70 (0.09)     1.36 (0.05)    0.0027   +
At3g02230     2.41 (0.51)     1.14 (0.30)    0.0029   +

Table 4.3. Least-square mean estimates of centered background-subtracted mean intensity (s.e.) for 7 nominally significant genes in the contrast L39 minus Parent. Direction of change in expression relative to the Parent is indicated in the last column: (+) increased expression in L39, (-) decreased expression in L39.

Locus        L39             Parent          P        Direction
At3g61250    -0.62 (0.13)     0.02 (0.08)    0.0004   -
At1g62300    -0.64 (0.30)     0.21 (0.19)    0.0035   -
At2g23810    -1.10 (0.31)     0.12 (0.19)    0.0036   -
At2g40420    -1.16 (0.35)     0.48 (0.18)    0.0039   -
At4g23210    -0.56 (0.22)    -0.08 (0.12)    0.0043   -
At5g44980    -0.68 (0.21)    -0.25 (0.12)    0.0047   -
At1g11300    -0.32 (0.10)    -0.11 (0.06)    0.0049   -

Table 4.4. Least-square mean estimates of centered background-subtracted mean intensity (s.e.) for 28 nominally significant genes in the contrast L40 minus Parent. Direction of change in expression relative to the Parent is indicated in the last column: (+) increased expression in L40, (-) decreased expression in L40.

Locus        L40             Parent          P        Direction
At1g69770     1.19 (0.14)     0.38 (0.08)    0.0001   +
At1g17710     1.20 (0.08)     0.63 (0.05)    0.0011   +
At5g60550     0.85 (0.14)     0.53 (0.08)    0.0014   +
At3g08940     2.51 (0.23)     2.01 (0.13)    0.0014   +
At5g19580     0.54 (0.11)     0.26 (0.06)    0.0015   +
At1g42970     2.38 (0.16)     2.09 (0.10)    0.0019   +
At1g77650    -1.51 (0.20)    -0.49 (0.13)    0.0021   -
At2g46505     1.43 (0.13)     1.11 (0.08)    0.0022   +
At3g23580     0.47 (0.15)    -0.05 (0.09)    0.0027   +
At1g74370     0.74 (0.11)     0.42 (0.06)    0.0027   +
At2g38220    -1.47 (0.21)    -0.36 (0.14)    0.0028   -
At3g59780     1.38 (0.08)     1.10 (0.04)    0.0035   +
At5g21105     1.42 (0.11)     1.07 (0.06)    0.0042   +
At1g26440     0.84 (0.11)     0.47 (0.06)    0.0042   +
At5g51890     0.45 (0.14)    -0.00 (0.08)    0.0042   +
At31000022   -1.57 (0.29)    -0.49 (0.18)    0.0043   -
At3g23590     0.72 (0.12)     0.26 (0.07)    0.0043   +
At4g35165     0.71 (0.15)     0.31 (0.09)    0.0044   +
At1g53345     1.22 (0.18)     0.72 (0.10)    0.0045   +
At1g76050     1.02 (0.17)     0.55 (0.10)    0.0046   +
At5g09570     0.23 (0.07)    -0.07 (0.04)    0.0046   +
At4g39300     0.66 (0.08)     0.39 (0.05)    0.0046   +
At4g20440     0.80 (0.13)     0.60 (0.08)    0.0047   +
At5g13400     0.54 (0.17)     0.09 (0.10)    0.0047   +
At1g21270     0.97 (0.14)     0.61 (0.08)    0.0047   +
At4g21870     1.71 (0.14)     1.32 (0.08)    0.0049   +
At3g62460     0.19 (0.10)    -0.16 (0.06)    0.0049   +
At3g21330     0.60 (0.11)     0.21 (0.06)    0.0049   +

Initial exploration would utilize The Arabidopsis Information Resource (TAIR; www.tair.org) to determine what is already known of the function of the candidate gene. Differential expression should also be confirmed before further studies proceed. Quantitative real-time PCR (QRT-PCR) is a follow-up to microarrays that determines the quantity of nucleic acid present in a sample by measuring the amount of nucleic acid present after each cycle of PCR using fluorescent markers. Once differential expression is confirmed, expression of the candidate genes could be assayed in the offspring of crosses between a mutation line and the wild type.
Another possibility for assessing the contribution of a locus to fitness would be RNA interference (RNAi) of the candidate gene's expression. This technique works by inserting a sequence complementary to the gene's normal mRNA, which binds to and prevents translation of the mRNA, thus suppressing the expression of the gene. If RNAi results in reduced fitness, similar to the phenotype seen in the MA line, that would be evidence of the gene's importance to fitness. While spontaneous mutations are likely to alter the degree of expression, RNAi will silence expression of the targeted gene, resulting in more pronounced effects.

Based on the estimate of U ≈ 0.1 in A. thaliana and 17 generations of MA, we expect few mutations in each line that affect fitness (on the order of 0.1 × 17 ≈ 1.7 per line) relative to the number of nominally significant genes that we detected. Thus, either some of these changes in expression are neutral with respect to fitness or there is a large amount of pleiotropy for new mutations affecting fitness. At the molecular level, pleiotropy results from the networked nature of regulation of gene expression. A mutation may affect a regulatory element that changes expression of only a single gene, immediately downstream (cis-acting mutation). Alternatively, a mutation that affects a transcription factor may affect the expression of many genes, distant from the mutated gene (trans-acting mutation). Pleiotropy may be visible in clustering of changes in gene expression in gene families or physiological pathways and this will be explored in future analyses.

The distribution of mutational effects in these MA lines is roughly symmetric around zero, with a slight bias in L05 and L40 toward up-regulation relative to the Parent (Figures 4.1, 4.3) and a slight downward bias in L39 relative to the Parent (Figure 4.2). This visible bias is reflected in the mean difference, which is slightly positive for L05 and L40 and slightly negative for L39.
Interestingly, of the genes which are nominally significant in these lines, all but four are up-regulated with respect to the Parent in L05 and L40 but all are down-regulated with respect to the Parent in L39.

[Histogram: L05 - Parent; vertical axis: Frequency]
Figure 4.1. Distribution of mean difference L05 minus Parent for 14,424 oligonucleotides. The distribution is truncated at both ends, excluding 66 genes with a difference below -1 and 8 genes with a difference above +1.05. The mean difference is 0.005.

[Histogram: L39 - Parent; vertical axis: Frequency]
Figure 4.2. Distribution of mean difference L39 minus Parent for 14,410 oligonucleotides. The distribution is truncated at both ends, excluding 28 oligonucleotides with a difference below -1 and 29 with a difference above +1.05. The mean difference is -0.032.

[Histogram: L40 - Parent; vertical axis: Frequency]
Figure 4.3. Distribution of mean difference L40 minus Parent for 14,426 oligonucleotides. The distribution is truncated at both ends, excluding 25 oligonucleotides with a difference below -1 and 4 with a difference above +1.05. The mean difference is 0.079.

There is no overlap in the nominally significant genes across the three comparisons to the Parent, indicating that the mutations in different MA lines are not affecting expression of the same genes. As mutations are accumulated independently, it is unlikely that mutations will be shared between lines.

Several of the nominally significant loci are transcription factors or involved in DNA repair and thus are good candidates for further study. In Line 39, loci At3g61250 and At1g62300 are both transcription factors.
The former locus is important in the determination of meristem identity during development and the latter is important in senescence and defense responses. Both are up-regulated in floral tissues during development (AVT; AtGenExpress Visualization Tool; http://jsp.weigelworld.org/expviz/expviz.jsp; Schmid et al. 2005). In Line 40, At1g69770 is very interesting; it is a methylase that preferentially methylates transposon-related sequences, thus reducing their mobility. This gene is up-regulated in the apex and flowers of the developing plant relative to other plant tissues (AVT; Schmid et al. 2005) and is up-regulated in L40 relative to the Parent. Due to the mutagenic nature of transposons, this gene is a good candidate for further study. Locus At3g23580 is also involved in controlling damage to the DNA strand. This locus is a ribonucleotide reductase critical for progression through the cell cycle and for DNA repair. Like the previous locus, At3g23580 is up-regulated in L40 relative to the Parent. A third locus, At4g21870, is a heat shock protein, functioning in repair and maintenance in the cell, similar to the previous two loci. Finally, there is a transcription factor with DNA-binding activity, locus At3g21330, that is also differentially expressed in L40 relative to the Parent.

I have chosen to highlight genes which may be important in reproductive function or control of mutation rate in the cell as loci that have a greater chance of being linked to the observed phenotypic differences (decreased fruit number). Thus, these genes are good candidates for knockout studies to compare the resulting phenotype with the observed phenotype of the MA line. However, this does not exclude other genes from contributing to the observed phenotype. Many of the additional nominally significant loci are involved in metabolism and transport in the cell, and thus disruption of function could lead to decreased overall fitness of the organism.
The functions of several of the genes in Tables 2-4 are completely unknown, and some genes which do contribute to the phenotypic change may not have displayed differential gene expression.

This study examined the contribution of new mutations that alter fitness to changes in gene expression. We found that changes in expression between MA lines and the Parent are detectable. On average, the effect of mutations is to slightly up-regulate gene expression, though many of the genes which displayed significant changes in expression were down-regulated relative to the Parent.

Future work with this data set will include more extensive analysis and will guide future studies of mutation and gene expression. Genes which were represented by multiple probes were not included here and will be analyzed with a model allowing for splice variants, as done by McIntyre et al. (2006). Other approaches may include clustering and representation analysis. Representation analysis will allow us to determine whether functional categories of genes (e.g., reproduction, growth, root development) tend to have similar changes in expression. In representation analysis, genes are clustered by function (e.g., involved in reproduction, growth, root development), co-regulation (i.e., the expression of multiple genes controlled by the same regulatory elements), or physical distance (regulatory elements may control multiple downstream coding regions). Categories which contain more differentially expressed genes than expected are identified as over-represented. The presence of over-represented co-regulated groups implies trans-acting mutations with cascading downstream effects, as was found in C. elegans by Denver et al. (2005). Conversely, over-representation of differentially expressed genes at close physical distances (chromosomal locations) implies cis-acting mutations.
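One common way to decide whether a category contains more differentially expressed genes than expected is a one-sided hypergeometric test. The Python sketch below is offered only as an illustration under that assumption; it is not necessarily the test used by Denver et al. (2005) or the one that would be chosen for these data, and the function name and the gene counts in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(n_genes, n_in_category, n_de, n_de_in_category):
    """One-sided hypergeometric P-value for over-representation.

    n_genes          -- total genes assayed
    n_in_category    -- genes belonging to the category of interest
    n_de             -- differentially expressed (DE) genes overall
    n_de_in_category -- DE genes that fall in the category

    Returns P(X >= n_de_in_category) when n_de genes are drawn at random
    without replacement from the n_genes assayed.
    """
    total = comb(n_genes, n_de)
    upper = min(n_in_category, n_de)
    tail = sum(
        comb(n_in_category, i) * comb(n_genes - n_in_category, n_de - i)
        for i in range(n_de_in_category, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 14000 genes assayed, 200 in the category,
    # 30 DE genes overall, 4 of which fall in the category.
    p = overrepresentation_p(14000, 200, 30, 4)
    print(f"P-value for over-representation: {p:.4g}")
```

The same function applies whether categories are defined by function, co-regulation, or physical distance; when many categories are tested, a correction for multiple comparisons would also be needed.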
Once over-represented pathways are identified, knockout mutants from those pathways can be phenotyped and compared to the existing MA lines.

Future studies may also incorporate additional MA lines, including MA lines exhibiting high relative fitness. Given the pervasive nature of genotype by environment interaction in these MA lines (Chapter 3), it would be useful to characterize gene expression under alternative environments. In addition, the point at which specific changes in expression occurred could be determined by examining expression of plants across generations. This could be done over the entire genome using microarrays or be explored in detail for specific loci which exhibit altered expression, such as those candidates identified here.

CHAPTER 5

CONCLUSIONS

The research presented here addressed two main areas of study of spontaneous mutation: fitness in the field and gene expression in the lab. These are the first studies to address the effects of spontaneous mutations on fitness in the field, thus providing valuable insight into the applicability of laboratory studies to field populations. There are also few studies of the impact of accumulated spontaneous mutations on gene expression, and neither of the two published studies addressed the distribution of effects (Denver et al. 2005; Rifkin et al. 2005).

In my studies of wild radish, I found that fitness was decreased in mutation-accumulation lines relative to the parent both in the greenhouse and in the field (Chapter 2). However, the proportional decrease in fitness was larger in the field than in the greenhouse. This suggests that application of laboratory results to field conditions underestimates the effects of mutations in nature.

My research on fitness in the field in A. thaliana revealed extensive genotype by environment interaction (GEI) for new mutations in two field assays (Chapter 3).
The two environments (Michigan and Virginia) differed significantly in their average fitness, with Michigan characterized as the more stressful of the two environments. This is in contrast to prior studies of GEI in these mutation-accumulation lines, which failed to find any GEI for two levels of nutrients (Chang & Shaw 2003) or two levels of light (Kavanaugh & Shaw 2005). This discrepancy provides an interesting avenue for further study (see Future directions, below).

The study of mutational effects on gene expression uncovered a number of genes that have significantly different expression in the mutation-accumulation lines relative to the parent (Chapter 4). I found that the distribution of mutational effects appears to be slightly shifted toward up-regulation, though most genes with significantly different expression were down-regulated. There was little evidence of parallel changes in gene expression. Only a single gene was significantly differentially expressed in two lines, and its expression changed in opposite directions in the two lines. This study provided many candidate genes for future research. In addition, there are other questions about mutation and gene expression that can be explored with these data (see Future directions, below).

Future directions

The discrepancy between my finding of substantial GEI in the field for A. thaliana and the lack of GEI found in the greenhouse studies (Chang & Shaw 2003; Kavanaugh & Shaw 2005) demands explanation. Studies of GEI for new mutations suggest that GEI is common but not ubiquitous, and there is no apparent pattern distinguishing the studies that detect GEI from those that do not. This particular case is intriguing because the same mutation-accumulation lines are assayed in the field and greenhouse. The field environments are likely to differ from each other in many unquantified ways, while the greenhouse studies each varied just a single environmental factor with two levels.
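Genotype by environment interaction can be made concrete by decomposing a table of line means into line, environment, and interaction terms. The Python sketch below assumes a balanced table with one mean fitness value per line per environment; the function name is hypothetical, the fitness values are invented for illustration and are not data from this study, and a real analysis would use a mixed model with replication rather than this simple decomposition.

```python
def interaction_effects(means):
    """Return the interaction term for every line in every environment.

    means maps a line name to a list of mean fitness values, one per
    environment, in the same environment order for every line.
    interaction = cell mean - line mean - environment mean + grand mean
    """
    lines = list(means)
    n_env = len(means[lines[0]])
    grand = sum(sum(v) for v in means.values()) / (len(lines) * n_env)
    line_mean = {g: sum(v) / n_env for g, v in means.items()}
    env_mean = [sum(means[g][e] for g in lines) / len(lines) for e in range(n_env)]
    return {
        g: [means[g][e] - line_mean[g] - env_mean[e] + grand for e in range(n_env)]
        for g in lines
    }


if __name__ == "__main__":
    # Hypothetical mean fitness in two environments (e.g., two field sites).
    additive = {"line_A": [1.0, 2.0], "line_B": [2.0, 3.0]}  # parallel responses
    crossing = {"line_A": [1.0, 2.0], "line_B": [2.0, 1.0]}  # rank order reverses
    print(interaction_effects(additive))  # all interaction terms are zero
    print(interaction_effects(crossing))  # non-zero terms indicate GEI
```

When lines respond in parallel across environments, every interaction term is zero; when the ranking of lines changes between environments, the terms are non-zero.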
A first approach to understanding the differing results is to change single variables in the field. In particular, both light and nutrient levels are amenable to manipulation in the field. This study would take a step toward understanding the source of GEI in the field for these mutation-accumulation lines.

The gene expression study also opens up avenues of research. The degree of pleiotropic effects that new mutations have on gene expression remains largely unknown. Denver et al. (2005) examined this question in mutation-accumulation lines of C. elegans. They found over-representation of differentially expressed genes in functional clusters, with half of the differentially expressed genes placed into a single cluster enriched for sperm genes. Similar functional clusters can be defined for A. thaliana using existing resources (i.e., Gene Ontology (GO) codes; Ashburner et al. 2000). This representation analysis gives an indication of the prevalence of trans-acting mutations. Trans-acting mutations are those that act distant from the mutation site, such as mutations affecting the structure of a transcription factor. Such mutations do not show up as differential expression of the transcription factor but rather are seen indirectly through their effects on the genes it regulates. The alternative, cis-acting mutations, affect expression of loci immediately downstream from the mutation site, which may be in a regulatory region. Denver et al. (2005) detected the action of this type of mutation by analyzing the pattern of clustering by physical distance. The over-representation of differentially expressed genes spaced close together (less than 10 kb in Denver et al. 2005) indicates locally active mutations.

A second avenue of pursuit for the gene expression dataset is the analysis of genes with multiple transcripts. A subset of genes on the Arizona array is represented by more than one oligonucleotide.
These multiple oligonucleotides correspond to alternative splice variants of the locus. Initially these genes were removed from my analysis, but a model can be constructed to analyze them by including a probe effect (McIntyre et al. 2006). Differential expression of splice variants in the mutation-accumulation lines relative to the parent would indicate a mutation affecting only a subset of the transcripts of that gene.

A third prospect for further study is the link between spontaneous mutations and fitness. As many more mutations are expected to occur than are expected to affect fitness, the finding of differential expression alone does not imply differential fitness. The first step in the process is to confirm differential expression of candidate genes. This can be done using quantitative PCR, and those genes that are confirmed can be further analyzed. One possible way to determine whether a gene is linked to fitness would be to silence that gene. This can be done using RNA interference, a technique in which an RNA strand complementary to the mRNA of interest is synthesized and injected into the organism. This complementary strand will anneal to the mRNA, creating double-stranded RNA, which cannot then be translated, thus preventing expression of the gene product. Observation of the resulting fitness phenotype will illuminate whether this candidate gene is likely to have a direct role in fitness. Of course, the networked nature of the genome means that many genes are pleiotropic, so it is possible that this method would disrupt many other functions of the candidate gene, leading to a less-informative phenotype.

LITERATURE CITED

Agrawal, A. A. 1998. Induced responses to herbivory and increased plant performance. Science (Wash. D. C.) 279(5354):1201-1202.
Ashburner, M. et al. 2000. Gene Ontology: tool for the unification of biology. Nature Genetics 25:25-29.
Azevedo, R. B. R., P. D. Keightley, C. Lauren-Maatta, L. L. Vassilieva, M. Lynch, and A. M.
Leroi. 2002. Spontaneous mutational variation for body size in Caenorhabditis elegans. Genetics 162:755-765.
Baer, C. F., N. Phillips, D. Ostrow, A. Avalos, D. Blanton, A. Boggs, T. Keller, L. Levy, and E. Mezerhane. 2006. Cumulative effects of spontaneous mutations for fitness in Caenorhabditis: role of genotype, environment and stress. Genetics 174:1387-1395.
Bataillon, T. 2003. Shaking the ‘deleterious mutations’ dogma? Trends Ecol. Evol. 18:315-317.
Case, A. L., P. S. Curtis, and A. A. Snow. 1998. Heritable variation in stomatal responses to elevated CO2 in wild radish, Raphanus raphanistrum (Brassicaceae). Am. J. Bot. 85:253-258.
Chang, S. M. and R. G. Shaw. 2003. The contribution of spontaneous mutation to variation in environmental response in Arabidopsis thaliana: Responses to nutrients. Evolution 57(5):984-994.
Charlesworth, B., D. Charlesworth, and M. T. Morgan. 1990. Genetic loads and estimates of mutation rates in highly inbred plant populations. Nature 347:380-382.
Conner, J. K. 2002. Genetic mechanisms of floral trait correlations in a natural population. Nature (Lond.) 420:407-410.
Conner, J. and S. Via. 1993. Patterns of phenotypic and genetic correlations among morphological and life-history traits in wild radish, Raphanus raphanistrum. Evolution 47:704-711.
Crow, J. F. 2000. The origins, patterns and implications of human spontaneous mutation. Nature Reviews Genetics 1:40-47.
Crow, J. F. and M. Kimura. 1970. An introduction to population genetics theory. Harper and Row, New York.
Denver, D. R., K. Morris, M. Lynch, and W. K. Thomas. 2004. High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome. Nature 430:679-682.
Denver, D. R., K. Morris, J. T. Streelman, S. K. Kim, M. Lynch, and W. K. Thomas. 2005. The transcriptional consequences of mutation and natural selection in Caenorhabditis elegans. Nature Genetics 37:544-548.
Derome, N., P. Duchesne, and L. Bernatchez. 2006.
Parallelism in gene transcription among sympatric lake whitefish (Coregonus clupeaformis Mitchell) ecotypes. Molecular Ecology 15:1239-1249.
Eyre-Walker, A., P. D. Keightley, N. G. C. Smith, and D. Gaffney. 2002. Quantifying the slightly deleterious mutation model of molecular evolution. Mol. Biol. Evol. 19(12):2142-2149.
Eyre-Walker, A. and P. Keightley. 1999. High genomic deleterious mutation rates in hominids. Nature (Lond.) 397:344-347.
Fernandez, J. and C. Lopez-Fanjul. 1997. Spontaneous mutational genotype-environment interaction for fitness-related traits in Drosophila melanogaster. Evolution 51(3):856-864.
Fry, J. D. 2004. On the rate and linearity of viability declines in Drosophila mutation-accumulation experiments: Genomic mutation rates and synergistic epistasis revisited. Genetics 166:797-806.
Fry, J. D. and S. L. Heinsohn. 2002. Environment dependence of mutational parameters for viability in Drosophila melanogaster. Genetics 161:1155-1167.
Fry, J. D., S. L. Heinsohn, and T. F. C. Mackay. 1996. The contribution of new mutations to genotype-environment interaction for fitness in Drosophila melanogaster. Evolution 50(6):2316-2327.
Fry, J. D., P. D. Keightley, S. L. Heinsohn, and S. Nuzhdin. 1999. New estimates of the rates and effects of mildly deleterious mutation in Drosophila melanogaster. Proc. Natl. Acad. Sci. U. S. A. 96:574-579.
Gibson, G. 2002. Microarrays in ecology and evolution: a preview. Molecular Ecology 11:17-24.
Gibson, G. and R. Wolfinger. 2004. Gene expression profiling using mixed models. Pp. 251-278 in A. M. Saxton, ed. Genetic analysis of complex traits using SAS. SAS Institute, Cary, NC.
Gilad, Y., A. Oshlack, and S. A. Rifkin. 2006. Natural selection on gene expression. Trends in Genetics 22:456-461.
Haag-Liautard, C., M. Dorris, X. Maside, S. Macaskill, D. L. Halligan, B. Charlesworth, and P. D. Keightley. 2007. Direct estimation of per nucleotide and genomic deleterious mutation rates in Drosophila. Nature 445:82-85.
Halligan, D. L. and P. D. Keightley. 2006. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison. Genome Research 16:875-884.
Herre, E. A., S. A. Van Bael, Z. Maynard, N. Robbins, J. Bischoff, A. E. Arnold, E. Rojas, L. C. Mejia, R. A. Cordero, C. Woodward, and D. A. Kyllo. 2005. Tropical plants as chimera: some implications of foliar endophytic fungi for the study of host-plant defence, physiology and genetics. In: Burslem, David F. R. P., Pinard, Michelle A. and Hartley, Sue E. (Ed.), Biotic interactions in the tropics: their role in the maintenance of species diversity: 226-237. Cambridge, UK: Cambridge University Press.
Houle, D., B. Morikawa, and M. Lynch. 1996. Comparing mutational variabilities. Genetics 143:1467-1483.
Johnston, M. O. and D. J. Schoen. 1995. Mutation rates and dominance levels of genes affecting total fitness in two Angiosperm species. Science 267:226-229.
Kavanaugh, C. M. and R. G. Shaw. 2005. The contribution of spontaneous mutation to variation in environmental responses of Arabidopsis thaliana: Responses to light. Evolution 59(2):266-275.
Kearns, C. A. and D. W. Inouye. 1993. Techniques for Pollination Biologists.
Keightley, P. D. 1994. The distribution of mutation effects on viability in Drosophila melanogaster. Genetics 138:1315-1322.
Keightley, P. D. and A. Caballero. 1997. Genomic mutation rates for lifetime reproductive output and lifespan in Caenorhabditis elegans. Proc. Natl. Acad. Sci. U. S. A. 94:3823-3827.
Keightley, P. D., A. Caballero, and A. Garcia-Dorado. 1998. Population genetics: Surviving under mutation pressure. Current Biology 8:R235-R237.
Keightley, P. D. and A. Eyre-Walker. 2000. Deleterious mutations and the evolution of sex. Science (Wash. D. C.) 290:331-333.
Keightley, P. D. and M. Lynch. 2003. Toward a realistic model of mutations affecting fitness. Evolution 57(3):683-685.
Khaitovich, P., G. Weiss, M. Lachmann, I. Hellman, W. Enard, B. Muetzel, U.
Wirkner, W. Ansorge, and S. Paabo. 2004. A neutral model of transcriptome evolution. PLoS Biology 2:682-689.
Kibota, T. and M. Lynch. 1996. Estimate of the genomic mutation rate deleterious to overall fitness in E. coli. Nature (Lond.) 381:694-696.
Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge.
Kondrashov, A. S. 1998. Measuring spontaneous deleterious mutation process. Genetica 102/103:183-197.
Kondrashov, A. S. and D. Houle. 1994. Genotype-environment interactions and the estimation of the genomic mutation rate in Drosophila melanogaster. Proc. R. Soc. Lond. B Biol. Sci. 258:221-227.
Kostkarick, R. and W. J. Manning. 1993. Radish (Raphanus sativus L) - a model for studying plant responses to air pollutants and other environmental stresses. Environ. Pollut. 82:107-138.
Lande, R. 1994. Risk of population extinction from fixation of new deleterious mutations. Evolution 48:1460-1469.
Littell, R. C., G. A. Milliken, W. W. Stroup, and R. D. Wolfinger. 1996. SAS system for mixed models. SAS Institute Inc., Cary, NC.
Lynch, M. 1988. The rate of polygenic mutation. Genetical Research 51:137-148.
Lynch, M., J. Blanchard, D. Houle, T. Kibota, S. T. Schultz, L. L. Vassilieva, and J. H. Willis. 1999. Perspective: Spontaneous deleterious mutation. Evolution 53(3):645-663.
Lynch, M., J. Conery, and R. Burger. 1995. Mutation accumulation and the extinction of small populations. Am. Nat. 146(4):489-518.
Lynch, M. and W. G. Hill. 1986. Phenotypic evolution by neutral mutation. Evolution 40:915-935.
Lynch, M., L. Latta, J. Hicks, and M. Giorgianni. 1998. Mutation, selection, and the maintenance of life-history variation in natural populations. Evolution 52:727-733.
Lynch, M. and B. Walsh. 1998. Genetics and analysis of quantitative traits. Sinauer Associates, Sunderland, MA.
MacKenzie, J. L., F. E. Saade, Q. H. Le, T. E. Bureau, and D. J. Schoen. 2005.
Genomic mutation in lines of Arabidopsis thaliana exposed to ultraviolet-B radiation. Genetics 171:715-723.
Matzkin, L. M., T. D. Watts, B. G. Bitler, C. A. Machado, and T. A. Markow. 2006. Functional genomics of cactus host shifts in Drosophila mojavensis. Molecular Ecology 15:4635-4643.
Mazer, S. J. 1987. The quantitative genetics of life-history and fitness components in Raphanus raphanistrum L (Brassicaceae) - Ecological and evolutionary consequences of seed-weight variation. Am. Nat. 130(6):891-914.
McIntyre, L. M., L. M. Bono, A. Genissel, R. Westerman, D. Junk, M. Telonis-Scott, L. Harshman, M. L. Wayne, A. Kopp, and S. V. Nuzhdin. 2006. Sex-specific expression of alternative transcripts in Drosophila. Genome Biology 7:R79.
Nason, J. D. and N. C. Ellstrand. 1995. Lifetime estimates of biparental inbreeding depression in the self-incompatible annual plant Raphanus sativus. Evolution 49:307-316.
Ohta, T. 1995. Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J. Mol. Evol. 40(1):56-63.
Partridge, L. and N. H. Barton. 1993. Optimality, mutation and the evolution of aging. Nature 362(6418):305-311.
Ratcliffe, D. 1965. The geographical and ecological distribution of Arabidopsis and comments on physiological variation. Arabidopsis Information Service.
Rifkin, S. A., D. Houle, J. Kim, and K. P. White. 2005. A mutation accumulation assay reveals a broad capacity for rapid evolution of gene expression. Nature 438:220-223.
Robbelen, G. 1965. The Laibach standard collection of natural races. Arabidopsis Information Service.
Rose, M. R. 1991. Evolutionary biology of aging. Oxford University Press, New York.
SAS Institute. 2004. SAS for Windows, ver. 9.1. SAS Institute, Cary, NC.
SAS Institute. 2001. JMP, ver. 4. SAS Institute, Cary, NC.
Schultz, S. T., M. Lynch, and J. H. Willis. 1999. Spontaneous deleterious mutation in Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A. 96:11393-11398.
Schoen, D. J. 2005.
Deleterious mutation in related species of the plant genus Amsinckia with contrasting mating systems. Evolution 59:2370-2377.
Schmid, M., T. S. Davison, S. R. Henz, U. J. Pape, M. Demar, M. Vingron, B. Scholkopf, D. Weigel, and J. U. Lohmann. 2005. A gene expression map of Arabidopsis thaliana development. Nature Genetics 37:501-506.
Shabalina, S. A., L. Y. Yampolsky, and A. S. Kondrashov. 1997. Rapid decline of fitness in panmictic populations of Drosophila melanogaster maintained under relaxed natural selection. Proc. Natl. Acad. Sci. U. S. A. 94:13034-13039.
Shaw, F. H., C. J. Geyer, and R. G. Shaw. 2002. A comprehensive model of mutations affecting fitness and inferences for Arabidopsis thaliana. Evolution 56:453-463.
Shaw, R. G., D. L. Byers, and E. Darmo. 2000. Spontaneous mutational effects on reproductive traits of Arabidopsis thaliana. Genetics 155:369-378.
Shaw, R. G., F. H. Shaw, and C. Geyer. 2003. What fraction of mutations reduces fitness? A reply to Keightley and Lynch. Evolution 57:686-689.
Stanton, M. L., A. A. Snow, and S. N. Handel. 1986. Floral evolution - Attractiveness to pollinators increases male fitness. Science (Wash. D. C.) 232(4758):1625-1627.
Stanton, M. L. and D. A. Thiede. 2005. Statistical convenience vs biological insight: consequences of data transformation for the analysis of fitness variation in heterogeneous environments. New Phytologist 166:319-338.
Steel, R. G. D., J. H. Torrie, and D. A. Dickey. 1997. Principles and procedures of statistics: A biometrical approach. 3rd ed. McGraw-Hill Companies, New York.
Strauss, S. Y., J. K. Conner, and K. P. Lehtila. 2001. Effects of foliar herbivory by insects on the fitness of Raphanus raphanistrum: Damage can increase male fitness. Am. Nat. 158(5):496-504.
Szafraniec, K., R. H. Borts, and R. Korona. 2001. Environmental stress and mutational load in diploid strains of the yeast Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U. S. A. 98:1107-1112.
Tevini, M., W. Iwanzik, and A. H.
Teramura. 1983. Effects of UV-B radiation on plants during mild water stress. II. Effects on growth, protein and flavonoid content. Z. Pflanzenphysiol. Bd. 110:459-467.
Thomson, J. D., L. P. Rigney, K. M. Karoly, and B. A. Thomson. 1994. Pollen viability, vigor, and competitive ability in Erythronium grandiflorum (Liliaceae). Am. J. Bot. 81:1257-1266.
Vassilieva, L. L., A. M. Hook, and M. Lynch. 2000. The fitness effects of spontaneous mutations in Caenorhabditis elegans. Evolution 54:1234-1246.
Vassilieva, L. L. and M. Lynch. 1999. The rate of spontaneous mutation for life-history traits in Caenorhabditis elegans. Genetics 151:119-129.
Verhoeven, K. J. F., K. L. Simonsen, and L. M. McIntyre. 2005. Implementing false discovery rate control: increasing your power. Oikos 108:643-647.
Via, S. 1994. Population structure and local adaptation in a clonal herbivore. In: L. Real (Ed.) Ecological Genetics. Princeton University Press.
Wloch, D., K. Szafraniec, R. Borts, and R. Korona. 2001. Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. Genetics 159:441-452.
Wright, S. I., B. Lauga, and D. Charlesworth. 2002. Rates and patterns of molecular evolution in inbred and outbred Arabidopsis. Mol. Biol. Evol. 19:1407-1420.
Xu, J. 2004. Genotype-environment interactions of spontaneous mutations for vegetative fitness in the human pathogenic fungus Cryptococcus neoformans. Genetics 168:1177-1188.
Zeyl, C. and J. A. G. M. DeVisser. 2001. Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157:53-61.