COMPUTATIONAL APPROACHES TO COMPLEX BIOLOGICAL PHENOMENA
                                      By
                           Joshua Luke Franklin
                            A DISSERTATION
                                Submitted to
                        Michigan State University
                 in partial fulfillment of the requirements
                              for the degree of
       Microbiology and Molecular Genetics – Doctor of Philosophy
                                     2021


                                           ABSTRACT
   COMPUTATIONAL APPROACHES TO COMPLEX BIOLOGICAL PHENOMENA
                                                By
                                      Joshua Luke Franklin
Biological systems can be difficult to understand due to a vast array of interacting phenom-
ena. The result is that some seemingly "easy" questions go unanswered. For example, we
have long known that bacteria utilize many distinct flagellar configurations, but in most
cases it remains unclear why they do so. We know that cell differentiation is critical to many
biological processes, yet we still do not fully understand how such spatiotemporal patterning
occurs. Despite mutation being one of the driving forces of evolution, we still have a hazy
understanding of how organisms respond and adapt to high mutation rates. However, ad-
vances in technology, modeling, and experimental techniques have enabled us to investigate
the small and nuanced effects that can answer these questions.
    In this dissertation, I have used modern computational tools and statistical techniques
to investigate evolutionary and behavioral processes. Agent-based models of evolution and
flagellar inheritance have allowed me to investigate the evolution of mutational robustness
and trade-offs associated with flagellar motility, respectively. By writing Bayesian mixed-
effect models, I have been able to precisely quantify the metabolic cost of producing flagella
and describe spatiotemporal patterns of cell differentiation within fruiting bodies of Myxo-
coccus xanthus. Careful quantitative modeling of biological phenomena can help cut through
the complexity of these systems. As computational power continues to grow and software
continues to become more sophisticated, these computational approaches will both become
more powerful and easier to use. Computational approaches will not replace experiments
in most cases; instead, computational models can direct experiments, which can themselves
direct new modeling efforts, in an iterative loop of progressing knowledge.


Copyright by
JOSHUA LUKE FRANKLIN
2021


   For Heather, with whom
laughs come freely as raindrops
      in a thunderstorm.
              iv


                                  ACKNOWLEDGEMENTS
    It may be cliché to say that it takes a village to raise a Ph.D., but it is grounded in
truth. Were it not for the support, friendship, and mentorship of dozens of people, I can
safely say I would not have completed graduate school. Unfortunately, I do not have room
to acknowledge everyone who has supported me, since I would have to list almost everyone
I have run into during my stay at Michigan State University (MSU). Nevertheless, there are
several people who have been particularly important to me, and I would like to thank them
here.
    I would like to thank my committee members, Dr. Christoph Adami, Dr. Daniel Ducat,
and Dr. Jonathan Hardy, for their guidance. They have provided me with valuable scientific
feedback and helpfully voiced their support of my work and my career trajectory.
    I would like to thank my mentor, Dr. Yann Dufour, for having countless meetings with
me. The hours we spent pair programming (read: debugging [read: struggling with]) models
in brms and STAN were well worth the effort. Dr. Dufour has incredible intensity and clarity
of thought, and I am a much better scientist for having joined his lab.
    I am nowhere near skilled enough with words to convey the care my graduate student
friends have shown me. Dr. Kyle Card was the first student I met at MSU and has been a
close friend for my entire graduate career. I always look forward to our next conversation
about evolution, consciousness, and the newest Vsauce video. I also appreciate Dr. Card’s
genuinely uncanny ability to make me keel over with laughter. Dr. Nkrumah Grant is
possibly the most positive and supportive person I have ever met. Despite the distance now
separating us, I know we will always be friends. Dr. Thomas LaBar is a great mentor,
friend, and inspiration, as his path from mathematics to biology has in some ways mirrored
my own trajectory. Dr. Paula Roszczenko and Dr. Maciej Jasiński were fantastic friends
during their time at MSU. I fondly remember all our activities together, from eating pastries
at Chapelure to evenings spent working on Pudding’s suspension. Dr. Nhu Nguyen was
                                             v


the best lab mate anyone could ask for, and her undying support and friendship has been
invaluable to surviving my graduate career. Her perseverance and work ethic in spite of the
daily ups and downs of graduate student life were, and continue to be, inspiring to me.
    Of course, my family has made me who I am today. I feel my mother’s unwavering
support and love every time I speak with her. Her path has not been easy, but nevertheless
she has climbed the nursing ladder from CNA to RN to DNP, all while raising me and
supporting my endeavors. My stepfather has been a deeply important figure in my life, and
I can only aspire to have as many interesting stories when I reach his age. I appreciate the
unyielding confidence of my father, who never, ever, fails to tell me how proud he is of me.
I only hope he knows how proud I am of him.
    While William Halle is not related to me by blood, he may as well be my brother. From
coding, to gaming, to sending absurd messages, we have formed a connection that cannot
be destroyed by time, space, or lack thereof.
    Finally, my fiancée Heather has supplied an endless stream of patience, encouragement,
and laughs as I needed them. She is the only person I could have been with literally every
hour of every day for months of quarantine, and still find myself growing closer to every day.
I look forward to spending the rest of our lives laughing together on our walks beneath the
stars.
                                              vi


                                 TABLE OF CONTENTS
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     viii
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       ix
INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .         1
CHAPTER 1 - MAPPING THE PEAKS: FITNESS LANDSCAPES OF THE FITTEST
AND THE FLATTEST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .           6
       Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    7
       Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      7
       Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .       9
       Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  14
       Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   17
       Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    21
       Supporting Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     23
CHAPTER 2 - CELL DENSITY, ALIGNMENT, AND ORIENTATION CORRE-
LATE WITH C-SIGNAL-DEPENDENT GENE EXPRESSION DURING MYXO-
COCCUS XANTHUS DEVELOPMENT . . . . . . . . . . . . . . . . . . . . . . . . .                  25
       Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   26
       Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   27
       Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  29
       Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   39
       Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      43
       Supporting Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     48
CHAPTER 3 - COST-BENEFIT ANALYSIS OF PERITRICHOUS FLAGELLATION
IN LOW VISCOSITY ENVIRONMENTS . . . . . . . . . . . . . . . . . . . . . . . .                 71
       Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   72
       Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   72
       Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  74
       Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   83
       Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .      85
       Supporting Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     89
CHAPTER 4 - CONCLUSIONS AND FUTURE DIRECTIONS                        . . . . . . . . . . . . 92
       Introduction . . . . . . . . . . . . . . . . . . . . . . . .  . . . . . . . . . . . . 93
       Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . .   . . . . . . . . . . . . 93
       Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . .   . . . . . . . . . . . . 96
       Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . .   . . . . . . . . . . . . 97
       Closing recommendations . . . . . . . . . . . . . . . . .     . . . . . . . . . . . . 101
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
                                              vii


                                LIST OF TABLES
Table 2.1 Strains used in this study. . . . . . . . . . . . . . . . . . . . . . . . . 52
Table 2.2 Plasmids used in this study. . . . . . . . . . . . . . . . . . . . . . . .  52
Table 2.3 Primers used in this study. . . . . . . . . . . . . . . . . . . . . . . . . 53
                                      viii


                                       LIST OF FIGURES
Figure 1.1 Populations adapt to different fitness peaks in high mutation rate
         environments. a) Fitness values for the most abundant genotype from each
         population after the Initial Adaptation experiment (denoted by “Initial") and
         after adaptation to low (“Fit”) and high (“Flat”) mutation rate environments.
         For boxplots, lines are the median values, boxes show the interquartile range,
         and whiskers are 1.5 times the interquartile range or the minimum/maximum
         value; points are outliers. Overlapping notches suggest that the difference in
         medians is not significant. b) Relative average fitness values over the course of
         the Mutation Rate Adaptation experiment. Lines are the mean value across all
         populations and the shaded regions are standard deviations. Blue represents
         fit genotypes and red represents flat genotypes. . . . . . . . . . . . . . . . .     15
Figure 1.2 Competition outcome between fit and flat genotypes is deter-
         mined by the mutation rate. Outcome from competition experiments
         between one fit/flat genotype pair. Black line is the mean fraction of the
         population consisting of the fit genotype as a function of time (measured in
         generations). Green shading represents one standard deviation. Each subplot
         shows the results for a given genomic mutation rate for five replicates. . . . .     16
Figure 1.3 Description of local fitness landscapes for fit and flat genotypes.
         a-e) Distribution of fitness effects with mutations grouped by their broad
         effect. Boxplots as previously described. Lineage refers to whether the data
         is for fit genotypes (blue) or flat genotypes (red). f) Average relative fitness
         of all one-step mutants for each fit and flat genotype. . . . . . . . . . . . . .    17
Figure 1.4 Summarized distribution of fitness effects for mutants with mul-
         tiple mutations. Proportion of mutations with a mutational effect classifica-
         tion for mutants up to four mutations away from the wild-type sequence. See
         text for definitions of different classes of mutations. Distance is the number of
         mutations away from the wild-type sequence. Data for mutants with distance
         of one is the same as in Figure 3. Blue (red) boxplots are the proportions
         from the fit (flat) genotypes. Boxplots as previously described. . . . . . . . .     18
Figure 1.5 Distribution of α and β values for fit and flat genotypes. Distribu-
         tions of α and β for each treatment. α is a measure of single-mutant fitness,
         while β is a measure of epistasis (see Methods for further details). Boxplots
         as previously described. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Figure 1.6 Histogram of the residual standard error (RSE) of the fits of
         Equation 1 to all fit and flat peaks. . . . . . . . . . . . . . . . . . . . .        23
                                                 ix


Figure 1.7 Boxplot of the residuals of the fit to Equation 1 for the fit peaks
         (blue) and flat peaks (red) separately. As in the boxplots of the main
         text, lines are the median values, boxes show the interquartile range, and
         whiskers are 1.5 times the interquartile range of the minimum or maximum
         value; points are outliers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Figure 2.1 Visualization of transitioning cells near the bottom of a nascent
         fruiting body. M. xanthus wild-type strain DK1622 was starved to initiate
         development under submerged culture conditions. FM 4-64 (2 µg/mL) was
         added at the start of starvation. Confocal images were acquired at the indi-
         cated times poststarvation (PS) to show FM 4-64 staining of cell membranes
         (red fluorescence shown as grayscale). Images are representative of three bio-
         logical replicates. (A) Image of an optical section near the bottom of the same
         nascent fruiting body (NFB) during development. Red arrow, rod-shaped cell.
         Green arrows, transitioning cells (TCs). Blue arrow, spore. Bar, 20 µm. (B)
         Cartoon depiction of tangential orientation. Upper left quadrant depicts one
         tangentially oriented rod. Arrow indicates radius of NFB (gray). Other quad-
         rants depict many smaller tangentially oriented rods. (C) Enlarged images of
         representative cells. Images from panel A were enlarged to show cells with a
         particular shape (a rod at 18 h, TCs at 24 and 30 h, and a spore at 42 h).
         Colored arrows point to the same cell classes as in panel A. Bar, 2 µm. (D)
         Tracking the fate of TCs. Images from Movie S2. Orange arrows, TC that
         becomes a spore. Magenta arrows, TC that changes shape. White arrows,
         TC whose fate is uncertain. Bar, 2 µm. . . . . . . . . . . . . . . . . . . . . .      54
                                                 x


Figure 2.2 Classification of cellular morphologies and the temporal pat-
         terns of rods, transitioning cells, and spores. Labeled cells that produce
         a fluorescent protein under control of a vanillate-inducible promoter (strain
         YH7, YH8, YH14, or Y15) were mixed with unlabeled wild-type cells (strain
         DK1622) at a ratio of 1 to 5 and starved under submerged culture condi-
         tions. Vanillate (0.5 mM) was added at the start of starvation. Z-stacks of
         optical sections of the same nascent fruiting body (NFB) for each of three
         biological replicates were acquired at the indicated times poststarvation (PS)
         to show fluorescence from tdTomato or mNeonGreen. Segmented cells from
         all z-stacks were classified as rods, transitioning cells (TCs), or spores (5256,
         1459, and 3213 cells, respectively). (A) Confocal images of the same NFB. Im-
         ages show an optical section near the bottom of a representative NFB formed
         by the mixture of strains YH8 and DK1622. Red fluorescence from YH8 cells
         is shown as grayscale. Red arrow, rod. Green arrows,TCs. Blue arrow, spore.
         Bar, 20 µm. (B) Enlarged images of representative cells. Images from panel
         A were enlarged to show cells with a particular shape (a rod at 27 h, TCs
         at 30 and 36 h, a spore at 48 h). Colored arrows point to the same cell
         classes as in panel A. Bar, 2 µm. (C) Representative cellular morphologies.
         Three-dimensional renderings reconstructed from the image stacks are shown
         for three cells of each class. Apparent features <0.5 µm are artifacts created
         by shading during the rendering process. Bar, 1 µm. (D) Proportion of each
         cell class at the indicated times PS. Dot, median. Shaded region, 90% credible
         interval. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Figure 2.3 Radial patterns of cell density, alignment, and orientation early
         correlate with spore proportion later as nascent fruiting bodies ma-
         ture.In the experiments described in the legend of Figure 2.2 and in the text,
         segmented cells from all z-stacks were classified as rods, transitioning cells, or
         spores. (A) Radial distribution of cell classes and density. The proportion
         of each cell class at different times poststarvation is shown radially from the
         center (0 µm) to the edge (60 µm) of nascent fruiting bodies (NFBs). Total
         cell density is normalized to the maximum between the center and the edge
         of NFBs. Line, median. Shaded region, 90% credible interval. (B) Radial
         patterns of rod neighbor alignment and tangential orientation. Alignment is
         the cosine of the angle of a rod to its neighbors averaged using a Gaussian
         kernel (sigma = 2.5 µm) (1, perfect alignment; 0, orthogonal). Orientation is
         the cosine of the angle of a rod with the circumference of the NFB (1, tangent
         to the circumference; 0, orthogonal). (C) Cross-correlations between the ra-
         dial patterns of total cell density, rod neighbor alignment, and rod tangential
         orientation early, and that of the spore proportion later. Values are Pearson
         correlation coefficients and color shade indicates strength of positive (purple)
         or negative (yellow) correlation (* p < 0.05, ** p < 0.01). . . . . . . . . . .       56
                                                xi


Figure 2.4 C-signal-dependent gene expression increases in transitioning
         cells and spores as nascent fruiting bodies mature.As described in
         the Figure 2.2 legend, strains YH14 bearing Pdev -tdTomato (designated dev )
         or YH15 bearing PfmgE -tdTomato (designated fmgE ) were mixed with unla-
         beled wild-type strain DK1622 at a ratio of 1 to 5 and co-developed under
         submerged culture conditions. Both the dev and fmgE strains also contain
         Pvan-mNeonGreen. Vanillate (0.5 mM) was added at the start of starva-
         tion. Z-stacks of confocal images of the same nascent fruiting body (NFB)
         were acquired at the indicated times post starvation (PS), measuring both
         red and green fluorescence. Images are representative of three biological repli-
         cates. (A) Confocal images of the same NFB. Images show merged (M) red
         and green fluorescence. Red arrows, rods. Green arrows, transitioning cells
         (TCs). Blue arrows, spores. Bar, 20 µm. (B) Enlarged images of represen-
         tative cells. Images from panel A were enlarged to show green (G), red (R),
         or M fluorescence of cells with a particular shape (rods at 27 and 30 h, TCs
         at 30 and 36 h, spores at 36 and 48 h). Colored arrows point to the same
         cell classes as in panel A. Bar, 2 µm. (C) Quantification of gene expression.
         Segmented cells from z-stacks were classified as rods, TCs, or spores based on
         green fluorescence. For each cell class, the log10 (relative fluorescence inten-
         sity) (i.e., red/green) at each time PS is plotted for the dev and fmgE strains
         as indicated. Dot, median. Vertical line, 90% credible interval. (D) Gene
         expression in TCs and spores. The relative fluorescence intensity of TCs and
         spores was normalized to that of rods at 27 h (dashed line at y = 0). Dot,
         median. Vertical line, 90% credible interval. . . . . . . . . . . . . . . . . . . 57
Figure 2.5 Radial patterns of cell density, alignment, and orientation early
         correlate with C-signal-dependent gene expression later as nascent
         fruiting bodies mature. In the experiment described in the legend of Fig-
         ure 2.4 and in the text, both red and green fluorescence was quantified for
         all segmented cells from the z-stacks. The results from three biological repli-
         cates were combined. (A) Radial distribution of C-signal-dependent gene
         expression. For transitioning cells and spores, the log10 (relative fluorescence
         intensity) (i.e., red/green) for the dev and fmgE strains at each time post-
         starvation is shown radially from the center (0 µm) to the edge (60 µm) of
         nascent fruiting bodies. Line, median. Shaded region, 90% credible interval.
         (B) Cross-correlations between the radial patterns of total cell density, rod
         alignment, and rod tangential orientation early, and that of the dev strain
         relative fluorescence intensity (RFI) later. Values are Pearson correlation co-
         efficients and color shade indicates strength of positive (purple) or negative
         (yellow) correlation. (C) Cross-correlations as in panel B except with the
         fmgE strain RFI (* p < 0.05, ** p < 0.01). . . . . . . . . . . . . . . . . . . .  59
                                               xii


Figure 2.6 Model showing the inferred radial pattern of C-signaling based
         on observed patterns of cellular arrangement, morphology, and gene
         expression as nascent fruiting bodies mature.The radial distributions of
         cell classes, cell density, and the neighbor alignment and tangential orientation
         of rods are depicted at three time points based on Figure 2.3A and 2.3B.
         The inferred radial pattern of C-signaling at 30 h poststarvation is based
         on correlations with later C-signal-dependent dev and fmgE expression in
         transitioning cells and spores (Fig. 2.5). . . . . . . . . . . . . . . . . . . . . 60
Figure 2.7 Cartoon and scanning electron micrographs of M. xanthus de-
         velopment.The cartoon at the top depicts starving rod-shaped cells, which
         coordinate their movements to build a mound, with some cells lysing (dashed
         outline) during the process. The mound matures into a fruiting body as some
         rods differentiate into round spores, while other rods remain outside the fruit-
         ing body as peripheral rods. The scanning electron micrographs at the bottom
         show the mound and fruiting body stages. The cartoon is adapted from [89]
         and the micrographs are from [94] . . . . . . . . . . . . . . . . . . . . . . . .  61
Figure 2.8 FM 4-64 does not interfere with M. xanthus development. Wild-
         type strain DK1622 was subjected to starvation under submerged culture con-
         ditions. FM 4-64 (2 µg/mL) was added at the start of starvation or withheld.
         (A) Bright-field images of the bottom of nascent fruiting bodies (NFBs). The
         same NFB was examined at the indicated times poststarvation (PS). Dark-
         ening of mounds typically reflects the formation of spores. Images are rep-
         resentative of three biological replicates. Bar, 20 µm. (B) Quantification of
         sonication-resistant spores. Samples were collected at 42 h PS, sonicated, and
         the number of spores/mL was determined. In the main graph, each symbol
         represents one biological replicate. The insert shows the posterior probability
         distribution of the difference in the number of sonication-resistant spores/mL
         in response to 2 vs. 0 µg/mL of FM 4-64. Vertical line, mean. Thick and thin
         horizontal lines, 90% and 98% credible intervals, respectively. . . . . . . . . .  62
Figure 2.9 Temporal pattern of cellular shape near the bottom of nascent
         fruiting bodies. Based on our observations of cells undergoing submerged
         culture development and visualized by membrane staining with FM 4-64, the
         proportion of cells exhibiting the indicated shapes at different times poststar-
         vation was approximated. Rods undergo lysis or become transitioning cells
         or peripheral rods. Transitioning cells may undergo lysis or persist, but some
         become spores. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
                                                 xiii


Figure 2.10 Cell classification using Gaussian mixture modeling. Each point
         on the scatter plots represents a cell from the aggregated data across time and
         fruiting bodies, collected in experiments described in the legend of Figure 2
         and in the text. Clusters were determined to be rods (red), transitioning cells
         (green), or spores (blue) according to their morphology (equivalent diameter,
         surface area, length, and mean width). The Pearson correlation coefficients
         are indicated for each pair of properties for all cells (black font) and for each
         cell class as indicated (*** indicates p-value < 0.001). Histograms on the
         left are the marginal distributions for each cell class for the corresponding
         morphological property. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Figure 2.11 Morphological property distributions of cells classified as rods,
         transitioning cells, or spores. In the experiments described in the legend
         of Figure 2 and in the text, segmented cells from all z-stacks were classified
         as rods, transitioning cells, or spores. The graph on the left in each panel
         shows the distribution of the indicated property for each cell class. Gray dot,
         individual cell. Black dot, median. The graph on the right in each panel shows
         the posterior probability intervals of the property difference for the indicated
         pair of cell classes. Dot, median. Thick and thin horizontal lines, 90% and
         98% credible intervals, respectively. (A) Equivalent diameter. (B) Cell length.
         (C) Mean cell width. (D) Sphericity. . . . . . . . . . . . . . . . . . . . . . .  65
Figure 2.12 Spatiotemporal distribution of cell classes and density in indi-
         vidual nascent fruiting bodies. In the experiments described in the legend
         of Figure 2 and in the text, segmented cells from z-stacks were classified as
         rods, transitioning cells, or spores. The proportion of each cell class at the
         indicated time poststarvation is shown radially from the center (0 µm) to the
         edge (60 µm) of one representative nascent fruiting body (NFB) formed by
         mixing the strain indicated on the right with unlabeled strain DK1622. Total
         cell density is normalized to the maximum between the center and the edge
         of the NFB. Line, median. Shaded region, 90% credible interval. . . . . . . .     66
Figure 2.13 A csgA mutant fails to form nascent fruiting bodies and fails to
         express C-signal-dependent genes. M. xanthus csgA mutant strains YH46
         bearing Pdev -tdTomato (designated dev ) or YH45 bearing PemphfmgE -tdTomato
         (designated fmgE ) were starved under submerged culture conditions. Both
         the dev and fmgE strains also contain Pvan -mNeonGreen. Vanillate (0.5 mM)
         was added at the start of starvation. Wells were examined at 6-h intervals
         from 18 to 48 h poststarvation (PS) using confocal microscopy to visualize
         both red and green fluorescence. Nascent fruiting bodies failed to form, cells
         remained rod shaped, and the rods exhibited green fluorescence but showed
         very little red fluorescence. Confocal images near the bottom of the biofilm
         were acquired at the indicated times PS. Images show merged (M) red and
         green fluorescence. Bar, 2 µm. . . . . . . . . . . . . . . . . . . . . . . . . .  67
                                               xiv


Figure 2.14 C-signal-dependent gene expression in individual cells of nascent
         fruiting bodies. In the experiment described in the legend of Figure 4 and
         in the text, segmented cells from z-stacks were classified as rods, transitioning
         cells, or spores based on green fluorescence. The results from three biological
         replicates were combined. (A) Distributions of red and green fluorescence
         intensities. For each cell (dot), the log10 (red fluorescence intensity) and the
         log10 (green fluorescence intensity) at each time poststarvation (PS) is plotted
         for the dev and fmgE strains as indicated. Cells in the lower left quadrant
         of each plot (i.e., exhibiting low red and green fluorescence) were eliminated
         from analyses in panels B-D. (B) Distributions of red fluorescence intensity.
         For each cell class, the log10 (red fluorescence intensity) at each time PS is
         plotted for the dev and fmgE strains as indicated. Dot, individual cell. Bar,
         median. (C) Distributions of green fluorescence intensity. Same as panel B
         except the log10 (green fluorescence intensity) is plotted. (D) Distributions
         of relative fluorescence intensity. Same as panel B except the log10 (relative
         fluorescence intensity) (i.e., red/green) is plotted. . . . . . . . . . . . . . . . 68
Figure 2.15 Vanillate-inducible gene expression in individual cells of nascent
         fruiting bodies. As described in the Figure 2 legend, strain YH7 bearing
         Pvan -mNeonGreen or YH8 bearing Pvan -tdTomato were mixed with unlabeled
         wild-type strain DK1622 cells at a ratio of 1 to 5 and co-developed under
         submerged culture conditions. Vanillate (0.5 mM) was added at the start
         of starvation. Z-stacks of confocal images of the same nascent fruiting body
         were acquired at the indicated times poststarvation (PS) without changing
         the microscope settings. Segmented cells from z-stacks were classified as rods,
         transitioning cells, or spores. The results from three biological replicates were
         combined. For each cell class, the distribution of log10 (fluorescence intensity)
         (i.e., green or red) at each time PS is plotted. Dot, individual cell. Bar,
         median. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Figure 3.1 Ectopic control of flagellar expression and flagellar hook detec-
         tion in S. enterica. A) Composite image of fluorescently labeled EM8242
         after induction of flagellar expression ([AnTc] = 2,000 pg/mL). DNA was
         stained with DAPI (blue), flagellar hook with a maleimide conjugated fluores-
         cent dye (red), and flagellar filament with anti-flagellin antibodies conjugated
         to a fluorescent dye (green). B) Number of flagellar hook foci as a function
         of inducer concentration in single cells (750 cells per condition). Each color
         dot represents a cell (red: EM8242, blue: TH9677). The estimated mean for
         each population is indicated in black (dot: mean, line: 95% CI). C) Mean
         lengths of flagella as a function of the number of foci in each cell (symbols
         represent different AnTc concentrations). The relationship was fitted using a
         linear model with a gamma response (black line: mean, light and dark blue:
         50% and 95% CI). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    76
                                               xv


Figure 3.2 The relative metabolic cost of flagellar synthesis. A) Calculat-
         ing growth rate by monitoring change in culture optical density at 600 nm
         (OD600 ) over time. Representative measurements from non-induced EM8242
         (red circle) and TH9677 (blue circle) are overlaid. Growth rates (r), back-
         ground OD, and OD of the inoculum were estimated by least-square fitting of
         an exponential growth model (solid lines) while cells were at low density. B)
         Calculated growth rates as a function of inducer concentration for EM8242
         (red) and TH9677 (blue). Each dot represents a single batch culture (between
         30 and 200 per condition). The estimated growth rate for each condition is
         indicated in black (dot: mean, line: 95% CrI). C) Regression of growth rate
         as a function of number of foci. Each dot represents a single inducer concen-
         tration (EM8242 in red) or wild type FlhDC activity (TH9677 in blue). The
         crossed lines represent the 95% credible interval from Figure 3.1B and 3.2B.
         The black line represents the mean of the regression model (dark and light
         blue areas: 50% and 95% CrI). Inset: Posterior probability distribution of the
         relative cost (slope/intercept) of one flagellum of the growth rate (dot: mean,
         line: 95% CrI). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  78
Figure 3.3 Swimming performance of TH9677 with varying number of flag-
         ella. A) Illustrative examples of single cell tracking followed by detection
         of fluorescently labeled flagellar hooks after immobilizing cells with a UV-
         activated hydrogel. B-F) Linear regressions of behavioral parameters calcu-
         lated from cell trajectories against number of foci from labeled flagellar hooks.
         Varying number of foci does not explain the variability in swimming perfor-
         mance. Each dot represents a single cell. The black line represents the mean
         of the regression model (dark and light blue areas: 50% and 95% CrI). . . .          79
Figure 3.4 In silico evolution of the optimal number of flagella in different
         conditions. A) Schematic representation of the agent-based model. Flag-
         ellated agents swim up with a speed inversely proportional to the chemical
         gradient (blue) and are pulled down by gravity. Position along the gradient
         and determines the resources available for growth and flagellar synthesis. The
         rate of flagellar synthesis is allowed to evolve by natural selection as the pop-
         ulation grows until steady state is reached. B) New flagella (red dots) are
         inserted randomly on the agent body. Cell elongation follows one of three
         alternative models for cell wall synthesis (blue) to determine the allocation
         of flagella to daughter cells. C) Results from repeated in silico evolutionary
         experiments with different combinations of flagellar cost and growth reward
         from chemotaxis. Each dot represents the mean number of flagella per cell
         in an independent population after its evolutionary trajectory reached steady
         state. The solid lines represent the best fit from our mathematical model
         (with P(¬flg) as a free parameter). Inset: Mean and 95% CrI for the param-
         eter P(¬flg) fitted to simulation results. . . . . . . . . . . . . . . . . . . . . . 81
                                                xvi


Figure 3.5 Mathematical predictions of the optimal number of flagella.A)
         Optimal number of flagella (n) as a function of the reward to cost ratio for
         chemotaxis (δ/c) for different probability of not inheriting a flagellum after
         cell division (p) from equation 3.2. B) Expected growth reward (d) as a
         function of number of flagella (n) for different probability of not inheriting a
         flagellum (p) and a fixed cost of 1.17% of the growth rate per flagellum (c)
         from equation 3.3. C) Expected number of flagella (n) as a function of the
         probability of not inheriting a flagellum (p) for different reward to cost ratios
         (δ/c) from equation 3.2. The parameter values estimated in this study for S.
         enterica are indicated with dotted lines. . . . . . . . . . . . . . . . . . . . .     83
Figure 3.6 Effect of AnTc on the growth rate of TH9677. Each dot represents
         a single batch culture, and each color represents an independent experiment.
         The estimated growth rate for each condition is indicated in black (dot: mean,
         line: 95% CrI). The addition of AnTc had a very small positive effect on growth
         (5.6e-4 hr-1 (ng/ml)-1 [2.3e-4,8.9e-4] 95% CrI). . . . . . . . . . . . . . . . . .    89
Figure 3.7 Relative distribution of foci along the cell midline. Most foci are
         located between the cell poles, with a nearly even distribution from 0.2 to 0.8,
         and a slight dip at the cell midpoint. The pole and midpoint may be crowded
         with other cellular machinery (eg. FtsZ complex). . . . . . . . . . . . . . . .       90
Figure 3.8 Optimal number of flagella for different combinations of flagellum
         costs and rewards. Results from repeated in silico evolutionary experiments
         with different combinations of flagellar cost (color scale) and growth reward
         from chemotaxis. Each dot represents the mean number of flagella per cell
         in an independent population after its evolutionary trajectory reached steady
         state. Simulations were repeated using three growth models for new cell wall
         synthesis and flagella allocation between daughter cells (end: growth near cell
         pole, mid: growth from middle of cell length, random: diffusive growth along
         cell length). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Figure 4.1 A bacterial “racetrack”. The width and depth of the channel is about
         150µm. The total length of the track is approximately two meters long. . . . 101
                                               xvii


INTRODUCTION
Author: Joshua L. Franklin
                           1


    As our understanding of biology grows, so do the questions we are trying to answer.
Behavioral studies of bacteria are becoming more nuanced and are considering the myriad
of interactions between microbes in the environment. We are trying to predict the outcome
of evolutionary processes with greater precision than ever before. We are also determined
to answer the big questions in biology, the holy grail of which is the origin of life. As our
ambition grows, so do the obstacles we face. Nuanced understanding of small, but important,
effects is difficult in the face of biological variability and noise. The amount of data flowing
from our experiments is becoming difficult to store, let alone process and comprehend. Some
of our questions are so difficult that they cannot be easily tested in biological systems, as
the would require an impossible degree of control or exploration of an enormous parameter
space.
    Nonetheless, I am optimistic about the future. Raw computational power has exploded
over the past couple of decades, and the software required to utilize such power has been in
rapid development. Statistical tools that were curiosities for want of computational power
in the 1900’s are now daily tools of many scientists. For example, Bayesian statistics was
mostly infeasible prior to the computational revolution, but that is no longer the case. Tools
like STAN and brms allow researchers to write and run complex multi-level Bayesian models
with relative ease [22, 162]. Biologists can now easily write complex agent-based models to
investigate phenomena with an unprecedented degree of control and precision [184]. Even the
tools of our trade are improving, with microscopy and sequencing becoming more advanced
by the day.
    My research touches on a number of distinct topics, such as the evolution of mutational
robustness, collective behavior of cells, and evolutionary trade-offs associated with motility.
What unifies these topics are the computational approaches I have used to ask and answer
questions that would otherwise be difficult to approach. In particular, agent-based models
and mixed-effect Bayesian models are central to my work, because they allow me to write
fairly simple code describing agent interactions or the structure of an experiment. Yet they
                                                   2


also provide rich outputs that can be nearly as deep as biological datasets. I hope that the
following chapters of my dissertation provide interesting examples to inspire other biologists,
and encourages them to try a new computational analysis or write a new statistical model.
    The first chapter expands on the “Survival of the Flattest” (SOF) effect, which relates
to how organisms adapt to high mutation rates [189]. Briefly, the authors of the origi-
nal paper used the artificial life model Avida to evolve pairs of digital organisms in high
and low mutation rate environments, then performed competitions between the resulting
organisms. They found that slowly reproducing but mutationally robust organisms (“flat”)
could outcompete more quickly reproducing but mutationally fragile organisms (“fit”) in a
high mutation rate environment [189]. The SOF effect was later supported in biological
experiments [33, 101, 150], but a key question remained: how were organisms adapting their
genomes to high mutation rates? To investigate, we first replicated the original study, then
characterized the local fitness landscapes by evaluating tens of millions of mutants for each
pair of fit and flat organisms. We then binned the fitness effect of each mutation as lethal,
deleterious, neutral, or beneficial relative to the reference genotype. We also investigated
changes in epistasis by fitting a model to average fitness as a function of mutational distance.
Both lines of evidence indicated that mutational robustness limits the number of deleterious
mutations that can be passed to offspring. A surprising consequence is that robustness often
results in both increased rates of neutral and lethal mutations, since lethal mutations cannot
be inherited.
    The second chapter relates to fruiting body development in Myxococcus xanthus. Starving
M. xanthus cells cooperate to form a fruiting body that spreads environmentally resistant
spores in search of fresh resources [196]. Fruiting body formation requires both coordina-
tion of cells to form the fruiting body structure and differentiation of a subset of cells into
spores [93]. While fruiting body formation has long served as a model of cooperation and
differentiation, the exact temporal and spatial dynamics of differentiation were not well un-
derstood. We wanted to better characterize these transitions by performing image analysis
                                                3


of micrographs of fruiting bodies over time. However, the high cell density within a fruit-
ing body makes segmentation and characterization of individual cells difficult. To overcome
this hurdle, we seeded a small number of fluorescently labeled cells into developing fruit-
ing bodies of wildtype (i.e., non-fluorescent) M. xanthus, allowing us to easily segment and
characterize fluorescent micrographs despite the high cell density. We then used a Gaussian
mixture model to cluster and classify individual cells as rods, transitioning cells, or spores.
To interpolate the proportion of these categories across space, we used Bayesian generalized
additive models. We also investigated cell orientation, relative to both the fruiting body and
neighboring cells. Our results indicate the spatial patterning of cell-cell interactions may
enhance contact-dependent C-signaling, a critical signal for sporulation and fruiting body
development [93].
    The third chapter relates to peritrichously flagellated bacteria. We know that flagella
are metabolically expensive due to their protein content and propensity to be lost during
evolution experiments in shaken flasks [138]. As such, we expect that natural selection will
optimize the cost-benefit trade-off to produce an optimal number of peritrichous flagella for
a given environment. To better understand the trade-offs involved, we quantified the cost of
flagella using growth curve assays. Because growth curves are noisy and the cost of flagella
is relatively small, we used a Bayesian mixed-effect model to increase both our accuracy and
precision. We determined that each flagellum decreases the growth rate by about 1%, in line
with previous estimates [138]. However, when we performed single-cell tracking under the
microscope, we found that flagella number and swimming performance are not well correlated
in low-viscosity (i.e. water-like) liquids. Given the high metabolic costs and lack of swimming
benefits, why should cells make more than one flagellum? To investigate, we wrote an
agent-based model of flagellar inheritance, which showed a logarithmic relationship between
benefit/cost ratio and optimal flagella number. This model is consistent with bacteria that
make a small number of peritrichous flagella (e.g. Salmonella enterica). However, cells
producing many peritrichous flagella (e.g. Bacillus subtilis) appear to be subject to a different
                                                 4


evolutionary trade-off.
    In the concluding chapter, I emphasize the importance of the computational and statis-
tical methods used in each study. I also elaborate on more exploratory future directions for
each project than would be present in a typical scientific article. I then recommend biologists
embrace agent-based and mixed-effect models as key tools for expanding their research capa-
bilities. I end with some suggestions for model-writers to make their models easier to re-use
for educational purposes. Taken together, I hope this dissertation explores computational
tools useful for biologists in an approachable manner and encourages them to write their
own models for both research and educational purposes.
                                              5


CHAPTER 1 - MAPPING THE PEAKS: FITNESS LANDSCAPES OF THE
FITTEST AND THE FLATTEST
Authors: Joshua L. Franklin*, Thomas LaBar*, and Christoph Adami
**Co-first authors
This chapter was originally published as an article in Artificial Life 25: 250-262
Reproduced with permission from the publisher:
https://direct.mit.edu/journals/pages/rights-permissions
Figures have been renumbered from the original publication.
                                             6


Abstract
    Populations exposed to a high mutation rate harbor abundant deleterious genetic varia-
tion, leading to depressed mean fitness. This reduction in mean fitness presents an oppor-
tunity for selection to restore adaptation through the evolution of mutational robustness.
In extreme cases, selection for mutational robustness can lead to “flat" genotypes (with
low fitness but high robustness) out-competing “fit" genotypes with high fitness but low
robustness—a phenomenon known as “survival of the flattest". While this effect was pre-
viously explored using the digital evolution system Avida, a complete analysis of the local
fitness landscapes of “fit” and “flat” genotypes has been lacking, leading to uncertainty about
the genetic basis of the survival of the flattest effect. Here, we repeated the survival of the
flattest study and analyzed the mutational neighborhoods of fit and flat genotypes. We found
that flat genotypes, compared to the fit genotypes, had a reduced likelihood of deleterious
mutations as well as an increased likelihood of neutral and, surprisingly, of lethal mutations.
This trend holds for mutants one to four substitutions away from the wild-type sequence.
We also found that flat genotypes have, on average, no epistasis between mutations, while
fit genotypes have, on average, positive epistasis. Our results demonstrate that the genetic
causes of mutational robustness on complex fitness landscapes are multifaceted. While the
traditional idea of the survival of the flattest effect emphasized the evolution of increased
neutrality, others have argued for increased mutational sensitivity in response to strong mu-
tational loads. Our results show that both increased neutrality and increased lethality can
lead to the evolution of mutational robustness. Furthermore, strong negative epistasis is not
required for mutational sensitivity to lead to mutational robustness. Overall, these results
suggest that mutational robustness is achieved by minimizing heritable deleterious variation.
Background
    All well-adapted populations receive an influx of deleterious variation every generation
due to de-novo mutations. This influx reduces the population’s average fitness and causes
                                                7


a mutational load [2, 39]. For populations with high mutation rates, deleterious variation
can alter the dynamics of natural selection and lead to selection acting not just on individ-
ual genotypes, but on genotypes and their likely mutant genotypes [13, 19, 124, 186]. This
combination of a wild-type sequence and its mutants is known as a “quasispecies" and is
thought to be a unit of selection in organisms such as the first RNA replicators [47, 48] and
modern-day RNA viruses [100].
    Since the quasispecies concept was first proposed [47,48], many studies have analyzed the
conditions and consequences of evolutionary dynamics in this strong mutational regime (e.g.,
[7,87,186]). One consequence of evolution under high mutation rates is that populations are
expected to evolve mutational robustness by fixing mutations that decrease the likelihood
(and/or decrease the average effect) of deleterious mutations [42, 188]. In the extreme case,
this trend can lead to genotypes with lower fitness out-competing genotypes with higher
fitness if the lower-fitness genotypes are more mutationally-robust [152]. This effect has
been termed the “survival of the flattest” effect, where “flat” genotypes, that is, those with
lower fitness but greater mutational robustness, out-compete “fit” genotypes, meaning those
with higher fitness but decreased mutational robustness [189]. While this effect was first
observed in the digital evolution system Avida, it was later demonstrated with experimental
evolution of RNA viruses [33, 101, 150].
    Even though the evolutionary dynamics of the survival of the flattest effect are well-
understood, the exact genetic causes of robustness are unknown. Indeed, the differences in
the distribution of fitness effects [52] (that is, the local fitness landscapes) between fit geno-
types and flat genotypes have not been explored. While it is generally assumed that muta-
tional robustness evolves due to the evolution of an increased number of neutral mutational
neighbors [53,173,185], other authors have argued that increased mutational sensitivity may
evolve [4, 87, 99, 126, 190]. Additionally, recent work has indicated that populations evolving
under strong genetic drift are expected to evolve “drift-robust” genetic architectures, giving
rise to genotypes with a decreased likelihood of small-effect deleterious mutations and an
                                                  8


increased likelihood of neutral and/or strongly-deleterious mutations [98]. A mapping of the
local fitness landscapes of fit genotypes and flat genotypes is needed to compare the survival
of the flattest effect to these other proposed evolutionary mechanisms.
    To explore the local fitness landscapes of fit genotypes and flat genotypes, we first re-
peated the original survival of the flattest experiment. We then took genotypes adapted to
either low or high mutation rates and obtained the distribution of fitness effects for these
genotypes by examining mutants up to four substitutions away from the wild-type sequence.
We found that flat genotypes had a greater likelihood of beneficial and neutral mutations
(as expected for mutationally-robust genotypes) while fit genotypes had a greater likelihood
of deleterious mutations, as expected from mutationally-sensitive genotypes. Contrary to
expectations, we also found that flat genotypes had a greater likelihood of lethal mutations,
suggesting that the survival of the flattest effect relies on a reduction of heritable delete-
rious mutation, which can be achieved by both an increase in neutral variations, as well
as lethal mutations. These results illustrate the complexity of mutationally-robust genome
architectures on non-trivial fitness landscapes.
Materials and Methods
Avida
    Here, we describe the relevant details of Avida (version 2.14; our version, with added code
for data analysis, is available at https://github.com/joshf93/MappingThePeaks) for the cur-
rent experiments (see [128] for a full overview of the Avida software). In Avida, a finite
population of computer programs (“avidians") compete for the resources (memory, space,
and processing time) required for reproduction. Each program consists of a circular genome
of computer instructions that encode the ability to self-replicate. During this replication
process, instructions may be copied inaccurately, resulting in mutations that are passed onto
an avidian’s offspring. As avidians with faster replication speeds enter the population, they
reproduce faster than slower-replicating avidians, leading to the spread of faster replicators
and the extinction of slower replicators. In other words, there is selection for faster repli-
                                                9


cators. Therefore, because there is heritable variation and selection in Avida, populations
of avidians undergo evolution by natural selection [133]. Avida has been used to test many
concepts difficult to test in biological systems [1, 37, 59, 60, 97, 103, 163].
    The Avida world consists of a toroidal grid of N cells; each cell can be occupied by
at most one avidian and thus N is the maximum population size. During reproduction,
offspring avidians are placed into one of the nine adjacent cells of its parent, including the
cell that contains the parent. If some of these cells are unoccupied, a random empty cell is
chosen. If all neighboring cells are occupied, a random cell is chosen, its occupant is removed,
and the new avidian then occupies the cell. This random selection of replacement adds an
element of genetic drift to Avida.
    The limited number of cells in the Avida environment is one of the limiting resources
over which avidians compete. The other limiting resource is the opportunity to execute
the instructions in an avidian’s genome. The resource required to execute one instruction
is called a “Single Instruction Processing" unit (SIP). During each unit of time in Avida
(called an update) 30N SIPs are available to the population for execution. These SIPs are
probabilistically distributed to avidians according to a figure of merit that is related to their
ability to perform certain computations. These computations (binary logic operation) are
the “traits" that avidians display. In a monoclonal population, all avidians express the same
traits and thus have the same merit and thus receive on average 30 SIPs (i.e., they execute 30
genome instructions during that update). However, avidian lineages can evolve the ability
to increase merit, and thus increase the number of instructions that can be executed per
update.
    Fitness in Avida is implicit and can only be estimated by executing the computer instruc-
tions within an avidian’s genome. In this regard, Avida differs from classical evolutionary
simulations where genotypes are assigned a fitness and their frequency increases or decreases
proportionally to this assigned value. Here, faster replicators out-compete slower replicators
and fitness emerges from this dynamic. There are two means by which an avidian lineage
                                                10


can evolved increased replication speed. First, genotypes that require fewer instruction ex-
ecutions to undergo reproduction will reproduce faster than avidians that require a greater
number of instruction executions. In environments where genome size can vary, this is often
achieved by decreasing genome size. In fixed genome size environments, this trajectory of fit-
ness gain still occurs, often through the fixation of additional instructions that copy genome
information from parent to offspring. The second methods of fitness increase in Avida is
through the evolution of complex adaptations: the ability to perform one- and two-input
Boolean logic calculations (the traits discussed earlier). In the setup used here, nine of these
functions are potentially rewarded with SIPs; the amount of SIPs awarded is proportional
to the complexity of the evolved trait [104]. The performance of these calculations increases
an avidian’s merit. In other words, the more calculations an avidian can perform, the more
SIPs it receives per update and it can then execute a larger portion of its genome per unit
time. This increases the avidian’s replication speed and leads to selection for the ability to
perform these calculations. When Avida estimates a genotype’s fitness, it is estimated as
the genotype’s merit divided by the number of instruction executions required for replication
(otherwise known as its gestation time).
    Avidian populations evolve these complex adaptations by fixing a sequence of mutations.
This genetic variation is introduced into a population when an avidian copies one instruction
inaccurately into its daughter’s genome. The rate at which these mutations occur is set by
the experimenter and is usually set as a rate “per instruction copied" (a copy-mutation
rate). When mutations occur during this copying process, multiple mutations can occur
per replication event. Additionally, in experiments where genome size can evolve, insertion
and deletion mutations occur upon division (not during the replication process) at a pre-
specified rate. These indel mutations insert or delete one instruction. Finally, large-scale
genome changes can occur, but these changes are caused by the specific replication algorithm
encoded by an avidian’s genome; these mutations are inherent to the Avida genetic code
(implicit mutations) and are not under the control of the experimenter.
                                               11


Experimental Design
    Here, we repeated the experimental protocol of the original survival of the flattest
study [189]. We differed from the original study in one aspect: the placement of offspring.
Here, we used Avida’s current default setting, where a new offspring randomly replaces
one of the nine neighbors of its parent (including possibly the parent). The original paper
used mass-action reproduction, where new offspring could randomly replace any offspring in
the population. We first generated a collection of genotypes during the Initial Adaptation
step. We evolved 201 populations of 3,600 individuals for 5 × 104 updates. All populations
initially started with the default Avida ancestor with an one-hundred instruction genome.
Point mutations occurred at a rate of 0.0075 mutations per instruction copied and inser-
tion/deletion mutations occurred at a rate of 0.05 mutations per division each. At the end
of these populations’ evolution, we extracted the most abundant genotype.
    For the Mutation Rate Adaptation phase of the experiment, we duplicated each of the
201 genomes and used one as progenitor of an experiment at a fixed low genomic mutation
rate of 0.5 mutations per genome per generation, and one at a fixed high genomic rate of 2.0
mutations genome per generation. To achieve this, we adjusted the per-site mutation rate of
each genome so that the targeted genomic rate was achieved, and disallowed sequence length
changes. In this manner, the Initial Adaptation created genetic variation used as input to
the adaptation step at a fixed genomic rate. From this stage onward each genotype was
limited to the tasks it could complete at the end of the Initial Adaptation phase, i.e., they
were not allowed to evolve novel traits, though they could lose traits they already possessed.
Each of the populations adapting to high or low mutation rate consisted of 3600 individuals
and evolved for 104 generations. After evolution we again selected the most abundant geno-
type from each population and denoted genotypes from the low mutation environment “fit"
genotypes, and genotypes from the high mutation environment “flat" genotypes.
    Next, we performed competition experiments between each pair of fit and flat genotypes
(each pair has the same ancestor). We only performed these competitions with genotype pairs
                                              12


where the fit genotype was 1.5 times more fit than the flat genotype. For the competitions,
we seeded populations of 3600 individuals with 1800 individuals of each genotype and ran the
simulation for 200 generations. We repeated these competitions across a range of genomic
mutation rates (0.5 to 3.0 mutations/genome/generation in increments of 0.5) and performed
five replicates per mutation rate. We tracked the abundance of the fit genotypes over time
to determine the outcome of the competition. We classified the flat genotype as the winner
if it reached at least 95% of the population in three of five replicates at a given mutation
rate.
Data Analysis
      We performed mutational analyses to determine the fitness landscapes of the fit and flat
genotypes used in the competitions using Avida’s “analyze" mode. In analyze mode, avidians
run through their life-cycle in isolation (as opposed to in a population) and characteristics
such as their fitness can be estimated. We first estimated the base fitness of all avidian
genotypes used in the experiment. Then, we took the genotypes used in the competition and
generated all possible single mutants. For each genotype, there are 25L single mutants, where
L is the genome size and 25 is the number of possible alternative alleles at each position
in the genome, as there are 26 instructions in the Avida instruction set. We then repeated
this procedure for all double mutants. For triple- and quadruple-mutants, we randomly
sampled 108 mutants as the number of all possible mutants made generating every mutant
computationally prohibitive1 .
      After generating these mutants, we calculated their fitness and compared them to their
ancestor genotype. Mutants with greater fitness were designated as beneficial and mutants
with the same fitness were designated as neutral. Mutants that were viable but with de-
creased fitness were designated as deleterious. Mutations resulting in genomes that could
not reproduce were designated as lethal. To estimate differences in epistasis between fit and
flat genotypes, we fit the mutational analysis data to the following equation:
                                                                                                L
                                                                                                  
    1
      The number of possible n-mutants of a length L genome encoded in an alphabet of size D is n   (D − 1)n ,
which translates to over 1.5 × 1012 possible mutants for L = 100, n = 4, and D = 26.
                                                     13


                                            w(n)
                                       log(       ) = −αnβ                                  (1.1)
                                             w0
    Here, n is the genetic distance from the initial genotype, w(n) is the mean fitness of all
mutants at a given genetic distance, and w0 is the fitness of the initial genotype. This is
a similar equation to that fit in the original survival of the flattest paper [189], with the
mean fitness estimated at a given mutation rate substituted by the mean fitness of all n-
mutants (as is often used [103]). α and β can be thought of as parameters that measure a
genotype’s mutational robustness and epistasis, respectively. A greater α indicates that a
genotype is less mutationally-robust, while a greater β indicates epistasis is more negative
(i.e., the fitness effect of a double-mutant is less than expected from the fitnesses of the two
single-mutants).
    All statistical analyses were performed using R 3.4.4 [137] and figures were generated
with either the ggplot2 R package [183] or the Matplotlib Python package [69].
Results
    We first evolved 201 populations and isolated the most abundant genotype from each
population. We then evolved these genotypes in both a high mutation rate environment
and a low mutation rate environment and isolated the most abundant genotype from each
environment. The genotypes from the low mutation rate environment did not significantly
change in fitness compared to their ancestor (Fig. 1.1a; Wilcoxon Rank Sum test, p =
0.411). The genotypes from the high mutation rate environment significantly decreased in
fitness compared to their ancestors (Fig. 1.1a; p = 2.216 × 10−7 ). By tracking the fitness
of the high-mutation-rate populations through time, we observe that these genotypes first
severely decrease in fitness upon transfer, but then recover fitness as they search out new,
and presumably more mutationally robust, fitness peaks (Fig. 1.1b).
    Of our 201 pairs of genotypes adapted to low mutation rates and high mutation rates,
166 pairs evolved such that the low mutation rate genotype (the fit genotype) had a fitness
                                                 14


  a                                         b
               6                                            1.00
          10
Fitness                                   W avg W initial
                                                                                                                Fit
                                                            0.10                                                Flat
               4
          10
                                                            0.01
                   Initial   Fit   Flat                            0        2500      5000       7500   10000
                                                                                   Generations
Figure 1.1: Populations adapt to different fitness peaks in high mutation rate
environments. a) Fitness values for the most abundant genotype from each population
after the Initial Adaptation experiment (denoted by “Initial") and after adaptation to low
(“Fit”) and high (“Flat”) mutation rate environments. For boxplots, lines are the median
values, boxes show the interquartile range, and whiskers are 1.5 times the interquartile
range or the minimum/maximum value; points are outliers. Overlapping notches suggest
that the difference in medians is not significant. b) Relative average fitness values over the
course of the Mutation Rate Adaptation experiment. Lines are the mean value across all
populations and the shaded regions are standard deviations. Blue represents fit genotypes
and red represents flat genotypes.
greater than 1.5 times the high mutation rate genotype (the flat genotype), which is the
threshold used in the original study [189]. We took these pairs and performed competition
experiments between the two genotypes across a range of mutation rates. In 76 (45.8%)
of these competition experiments we noted that the eventual winner of the competition
depended on the mutation rate, and that a switch between winners occurred at a critical
mutation rate, just as was observed in the initial study [189]. Fit genotypes win the compe-
tition at low mutation rates, but flat genotypes win the competition at high mutation rates
even with the large fitness difference (Fig. 1.2). It is worth noting that our definition of
winning excluded many genotype pairs that showed the effect to a lesser degree.
      Next, we analyzed the mutational neighborhood (the local fitness landscape), of all 166
pairs of fit and flat genotypes. We first generated all genotypes containing one point mutation
and measured their fitness. There are significant differences in particular classes of mutations.
Flat genotypes have a greater likelihood of beneficial (Fig. 1.3a; Wilcoxon Rank Sum test,
                                                                       15


a              1.00                                      b              1.00
               0.75                     μ = 0.5                         0.75                     μ = 1.0
Fraction Fit                                             Fraction Fit
               0.50                                                     0.50
               0.25                                                     0.25
               0.00                                                     0.00
                      0   50      100       150   200                          0   50      100       150   200
                               Generation                                               Generation
c              1.00                                      d              1.00
               0.75                     μ = 1.5                         0.75                     μ = 2.0
Fraction Fit                                             Fraction Fit
               0.50                                                     0.50
               0.25                                                     0.25
               0.00                                                     0.00
                      0   50      100       150   200                          0   50      100       150   200
                               Generation                                               Generation
Figure 1.2: Competition outcome between fit and flat genotypes is determined by
the mutation rate. Outcome from competition experiments between one fit/flat genotype
pair. Black line is the mean fraction of the population consisting of the fit genotype as a
function of time (measured in generations). Green shading represents one standard deviation.
Each subplot shows the results for a given genomic mutation rate for five replicates.
p = 1.35 × 10−20 ), neutral (Fig. 1.3b; p = 2.66 × 10−35 ), and lethal (Fig. 1.3d; p = 0.0168)
mutations. Fit organisms have a greater likelihood of deleterious mutations (Fig. 1.3c;
p = 1.31 × 10−44 ). Additionally, fit genotypes have lower mean relative fitness than flat
genotypes (Fig 1.3d; p = 2.24 × 10−27 ). We also analyzed the mutational neighborhood for
sequences that are two, three, or four mutations away from the wild-type sequence and found
similar results to the one-mutant mutational neighborhood (Fig. 1.4).
          Finally, we used the mutational neighborhood data for sequences up to four mutations
away from the wild-type sequence to estimate differences in epistasis between fit and flat
genotypes. We fit the relative average fitness values as a function of mutational distance
to a curve containing two parameters (see Methods for further details): α (as a measure of
mutational robustness) and β (as a measure of epistasis). As expected from the data on the
                                                        16


   a                           Beneﬁcial                                                          b                   0.5   Neutral                                                           c                          Deleterious
                                                                                                                                                                                                                   0.6
Fraction of Mutations                                                                         Fraction of Mutations                                                                        Fraction of Mutations
                                                                                                                      0.4
                        0.10
                                                                                                                      0.3                                                                                          0.4
                        0.05
                                                                                                                      0.2
                                                                                                                                                                                                                   0.2
                                                                                                                      0.1
                        0.00
                                           Fit                                  Flat                                                  Fit                                     Flat                                                 Fit   Flat
                                                     d                         Lethal                                                           e
                                                                         0.6
                                                                                                                                             Relative Average Fitness
                                                                                                                                                                        0.6
                                                 Fraction of Mutations
                                                                         0.5
                                                                         0.4                                                                                            0.4
                                                                         0.3
                                                                         0.2                                                                                            0.2
                                                                         0.1
                                                                                        Fit                                  Flat                                                    Fit                                    Flat
Figure 1.3: Description of local fitness landscapes for fit and flat genotypes. a-
e) Distribution of fitness effects with mutations grouped by their broad effect. Boxplots
as previously described. Lineage refers to whether the data is for fit genotypes (blue) or
flat genotypes (red). f) Average relative fitness of all one-step mutants for each fit and flat
genotype.
relative fitness of one-step mutants (Fig. 1.3e), fit genotypes have greater alpha values (Fig.
1.5a; p = 1.06 × 10−27 ). Flat genotypes have greater β values (Fig. 1.5b; p = 5.44 × 10−34 ),
indicating that flat genotypes have greater negative epistasis than fit genotypes. However, it
should be noted that flat genotypes have, on average, almost no epistasis, while fit genotypes
have positive epistasis. These data demonstrate how the selection pressure for mutational
robustness drives populations to alternative areas of the fitness landscape.
Discussion
                           We used digital experimental evolution to map the local fitness landscape of high fitness,
but mutationally-fragile,“fit" genotypes adapted under low mutation rates, and low fitness,
but mutationally-robust, “flat" genotypes adapted under high mutation rates. After repeat-
ing the original survival of the flattest effect [189], we calculated the distribution of fitness
effects for de-novo point mutations up to four substitutions from the wild-type genotypes.
We found that, as expected, flat genotypes were more mutationally robust than fit genotypes.
Flat genotypes had a lower likelihood of deleterious mutations and a greater likelihood of
                                                                                                                                            17


                                            Summarized Distribution of Fitness Eﬀects
                                   Lethal             Deleterious                Neutral           Beneﬁcial
                        1.00
                        0.75
Fraction of Mutations
                        0.50
                                                                                                                   Fit
                                                                                                                   Flat
                        0.25
                        0.00
                               1   2   3    4     1     2    3      4        1   2    3    4   1    2    3     4
                                                                 Distance
Figure 1.4: Summarized distribution of fitness effects for mutants with multiple
mutations. Proportion of mutations with a mutational effect classification for mutants
up to four mutations away from the wild-type sequence. See text for definitions of different
classes of mutations. Distance is the number of mutations away from the wild-type sequence.
Data for mutants with distance of one is the same as in Figure 3. Blue (red) boxplots are
the proportions from the fit (flat) genotypes. Boxplots as previously described.
neutral and beneficial mutations than fit genotypes. Surprisingly, flat genotypes also had a
greater likelihood of lethal mutations and more negative epistasis than fit genotypes, sug-
gesting that the factors responsible for mutational robustness involve only limiting heritable
deleterious variation.
                 Around the time when the original survival of the flattest paper first appeared, muta-
tional robustness was largely conceptualized as a decrease in the likelihood of a deleterious
mutation [173, 189]. And while these changes in genetic architecture can lead to mutational
robustness, work since that original publication has suggested that increases in the effect of
deleterious mutations, not decreases, can also lead to robustness [4, 126]. Increased sever-
ity of deleterious mutations can lead to population-level robustness as purifying selection
                                                                        18


a                                          b
    2.0                                         1.04
    1.5
                                                1.00
α                                          β
    1.0
                                                0.96
    0.5
                Fit           Flat                           Fit            Flat
Figure 1.5: Distribution of α and β values for fit and flat genotypes. Distributions
of α and β for each treatment. α is a measure of single-mutant fitness, while β is a measure
of epistasis (see Methods for further details). Boxplots as previously described.
is more effective against strongly-deleterious mutations. It has also been proposed that in-
creased negative (or synergistic) epistasis can evolve as a mechanism to increase the strength
of purifying selection and hence robustness [99, 190].
    Our results suggest that all of these genetic mechanisms (increased neutrality, increased
severity of deleterious mutations, and increased negative epistasis) can contribute to muta-
tional robustness. These data support a previously-proposed relationship between increased
neutrality (low α) and increased negative epistasis (high β), based on data from Avida,
RNA-folding, and neutral network fitness landscapes [187]. In other words, a population can
only maximize immediate mutational robustness at the cost of increased deleterious severity
of mutants farther away in the fitness landscape. We should also note that mutations on
a “flat" genetic background are not, on average, negatively epistatic. Flat genotypes evolve
to have, on average, no epistasis between mutations, as previously shown to occur in high
mutation rate Avida populations [46]. However, they are more negatively-epistatic than fit
genotypes, which have positive epistasis between mutations. The lack of negative epistatic
                                               19


interactions in flat genotypes is likely related to their abundance of lethal mutations. Ad-
ditional mutations on a genetic background with a lethal mutations cannot, by definition,
lead to a greater decline in fitness (negative epistasis).
    Theoretical arguments suggest that increasing the severity of mutations not only protects
against high mutation rates, but also can protect against fitness loss due to genetic drift [85],
although this depends on the specific distribution of fitness effects [21]. Recently, LaBar and
Adami proposed the concept of “drift robustness,” or the idea that small populations tend
to populate drift-robust fitness peaks with a low likelihood of slightly-deleterious mutations
(and a high likelihood of both neutral and strongly-deleterious mutations), while large popu-
lations would adapt to drift-fragile fitness peaks with a high likelihood of slightly-deleterious
mutations [98]. Both flat genotypes and drift-robust genotypes have some noticeable sim-
ilarities in their distribution of fitness effects (increased neutrality, in particular), so it is
worth asking whether the evolution of drift robustness and survival of the flattest is really
the same phenomenon.
    The main difference between drift robustness and survival of the flattest is the role of
natural selection in the two phenomena. The evolution of drift robustness does not require
competition between drift-robust and drift-fragile genotypes [98]. Instead, small populations
tend to evolve towards drift robustness because they can only maintain fitness on drift-
robust genetic backgrounds. Survival of the flattest, however, occurs when a flat genotype
out-competes a fit genotype, and is driven by selection for mutational robustness. It has also
been established that the critical high mutation rate above which this selective advantage
occurs is independent of population size [35], while drift robustness, almost by definition,
is strongly dependent on population size. Furthermore, our previous work demonstrated
that drift-robust genotypes are not more mutationally-robust than drift-fragile genotypes, if
one measures mutational robustness as the mean relative fitness of all single mutants [98].
Single mutants of flat genotypes are more fit, on average, than those of fit genotypes, again
illustrating a difference between drift robustness and survival of the flattest. Thus, while
                                                 20


similarities exist between drift-robust genetic architecture and “flat" genetic architecture, it
appears that they represent distinct evolutionary responses to different population-genetic
environments.
    It would be worthwhile to determine if the results observed held here can be replicated
with RNA viruses, the biological model for the survival of the flattest effect [33, 101, 150].
It would be illuminating to test if the likelihood of lethal mutations does increase in these
evolved viruses, especially in light of studies arguing that RNA viruses have high muta-
tional fragility, not high mutational robustness [49]. Such a study would lead to a better
understanding of the mutational mechanisms that lead to robust organisms in nature.
Conclusions
    In high mutation rate environments, populations will evolve alternative genetic architec-
tures that increase their robustness to deleterious mutations. Classically, this is viewed as a
survival of the flattest effect [189], where populations adapt to fitness peaks with increased
neutrality [173]. Here, we repeated the original experiments to finely map the fitness land-
scapes of fit and flat genotypes. Flat, mutationally-robust genotypes adapt to fitness peaks
with an increased likelihood of both neutral genotypes and lethal genotypes, a decreased
likelihood of deleterious mutations, and greater negative epistasis compared to fit genotypes.
These results demonstrate the genetic complexity of mutationally robustness genomes that
minimize the impact of heritable deleterious genetic variation.
Availability of data and material
    All Avida configuration files and data analysis scripts to recreate these experiments,
analyze the data, and recreate the figures are available at:
https://github.com/joshf93/MappingThePeaks
Competing interests
    The authors declare that they have no competing interests.
                                               21


Funding
    This material is based in part upon work supported by the National Science Foundation
under Cooperative Agreement No. DBI-0939454. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation.
Author’s contributions
    JF performed the experiments and analyzed the data. JF, TL, and CA designed the
study and wrote the manuscript.
Acknowledgements
    T.L. acknowledges a Michigan State University Distinguished Fellowship, a BEACON
fellowship, and the Russell B. DuVall award for support. This work was supported in part
by Michigan State University through computational resources provided by the Institute for
Cyber-Enabled Research.
                                             22


Supporting Information
Figure 1.6: Histogram of the residual standard error (RSE) of the fits of Equation
1 to all fit and flat peaks.
                                         23


Figure 1.7: Boxplot of the residuals of the fit to Equation 1 for the fit peaks (blue)
and flat peaks (red) separately. As in the boxplots of the main text, lines are the median
values, boxes show the interquartile range, and whiskers are 1.5 times the interquartile range
of the minimum or maximum value; points are outliers.
                                              24


CHAPTER 2 - CELL DENSITY, ALIGNMENT, AND ORIENTATION COR-
RELATE WITH C-SIGNAL-DEPENDENT GENE EXPRESSION DURING
MYXOCOCCUS XANTHUS DEVELOPMENT
Authors: Y Hoang*, Joshua L. Franklin*, Yann S. Dufour, and Lee Kroos
**Co-first authors
Notice: My contribution to this article focused on the data analysis, modeling, and figure
construction, as well as some edits to the manuscript.
                                             25


Abstract
     Starving Myxococcus xanthus bacteria use short-range C-signaling to coordinate their
movements and construct multicellular mounds, which mature into fruiting bodies as rods
differentiate into spherical spores. Differentiation requires efficient C-signaling to drive the
expression of developmental genes, but how the arrangement of cells within nascent fruiting
bodies (NFBs) affects C-signaling is not fully understood. Here, we used confocal microscopy
and cell segmentation to visualize and quantify the arrangement, morphology, and gene ex-
pression of cells near the bottom of NFBs at much higher resolution than previously achieved.
We discovered that “transitioning cells” (TCs), intermediate in morphology between rods and
spores, comprised 10 to 15% of the total population. Spores appeared midway between the
center and the edge of NFBs early in their development and near the center as maturation
progressed. The developmental pattern as well as C-signal-dependent gene expression in
TCs and spores were correlated with cell density, the alignment of neighboring rods, and the
tangential orientation of rods early in the development of NFBs. These dynamic radial pat-
terns support a model in which the arrangement of cells within the NFBs affects C-signaling
efficiency to regulate precisely the expression of developmental genes and cellular differentia-
tion in space and time. Developmental patterns in other bacterial biofilms may likewise rely
on short-range signaling to communicate multiple aspects of cellular arrangement, analogous
to juxtacrine and paracrine signaling during animal development
Significance Statement
     Significance Statement Pattern formation is foundational to development of multicellu-
lar organisms. To investigate patterning in bacteria during formation of Myxococcus xanthus
fruiting bodies, cellular arrangement, morphology, and gene expression driven by C-signaling
were quantified using high-resolution microscopy. How short-range C-signaling controls gene
expression governing rod-to-spore differentiation within multicellular mounds has been un-
clear. The results support a new model in which cell density, rod neighbor alignment, and
                                               26


rod tangential orientation affect C-signaling efficiency to pattern gene expression and spore
maturation. Other bacterial biofilms may likewise be patterned by short-range signaling that
captures local cellular arrangement and dynamical interactions, which could be exploited to
manipulate biofilm development. Our results provide fundamental insights into spatiotem-
poral patterning by short-range signaling, which is relevant to animal development.
Introduction
    How cells build multicellular structures and adopt alternative fates are fundamental
questions in developmental biology. The mechanisms of pattern formation can be lineage-
dependent or signal-dependent, and formation of complex patterns of differentiated cell types
typically involves both mechanisms [41,68]. Many bacteria form multicellular structures such
as filaments, aggregates, and biofilms, in some cases with differentiated cells, although the
patterns tend to be relatively simple (i.e., fewer cell types and less precise spatial organization
than in many multicellular eukaryotes) [30, 108]. Single-species bacterial multicellular struc-
tures have proven to be useful models to discover fundamental principles of pattern formation
(e.g., Anabaena heterocyst differentiation [65], Streptomyces development [20], biofilm for-
mation by Bacillus subtilis [179], Vibrio cholerae [169], and Pseudomonas aeruginosa [182]).
Compared to other bacteria, Myxococcus xanthus biofilms form relatively complex patterns
when starved [196]. Cells coordinate their movements to build regularly-spaced mounds of
nearly uniform size and hemispherical shape, each containing approximately 100,000 cells.
Some of the rod-shaped cells within mounds differentiate into round spores (Fig. 2.7). The
mature spore-filled mounds are called “fruiting bodies”, which promote survival [175]. How-
ever, the majority of cells lyse during the developmental process and 10-20% of the cells
remain outside of fruiting bodies as “peripheral rods” [102, 125, 191] (Fig. 2.7).
    The dynamic nature of mound building by M. xanthus was first visualized by low-
resolution time-lapse microscopy [76, 142]. Increased resolution and tracking of individual
rods have revealed changes in motility behavior that lead to mound formation. Early flocking
of rods into groups is followed by alignment in streams facilitated by directional reversals
                                                27


and slime-trail-following [6, 72, 170]. Intersecting streams appear to create “traffic jams” of
rods with reduced motility [76, 156], which lead to raised mounds as monolayers of rods
add vertically in tiers [40]. Some mounds mature into fruiting bodies, while others disperse,
merge, or split in two [194]. Reduced motility in traffic jams is insufficient to explain all
the observed dynamics of mound building [199]. Rods in addition perform a biased random
walk toward mound centroids, suggestive of chemotaxis, and neighboring rods align their
trajectories in a direction radial to the nearest mound [36].
     Nascent fruiting bodies (NFBs) are stable mounds in the process of maturing into spore-
filled fruiting bodies. Two NFB domains have been described, an outer hemispherical domain
of densely packed rods surrounding an inner hemisphere of less ordered rods at three-fold
lower cell density [145]. Rods in the outer domain were reported to rarely reverse their
direction of movement and to orbit the inner domain in either direction. Differential gene
expression in the two domains and temporal changes in refractile patches possibly due to
spore precursors led to a model in which spore differentiation begins in the outer domain and
fills inward over time, with the orbiting movement of rods in the outer domain transporting
nonmotile spore precursors to the inner domain [146].
     C-signaling is key among several factors known to coordinate mound building with spore
differentiation, ensuring that spores form only within mounds [88, 167]. C-signaling involves
a proteolytic fragment of CgsA [82, 106] and/or lipids produced by CsgA cardiolipin phos-
pholipase activity [14]. A csgA null mutant fails to build stable mounds and fails to form
spores [155]. Overexpression of csgA results in small mounds forming prematurely and spores
forming prematurely outside of mounds [93]. Although the mechanism of C-signal transmis-
sion is unknown, cell movement enhances C-signaling, possibly due to end-to-end and/or side-
to-side contact between cells [81,84,90,147,200]. Expression of csgA increases in response to
starvation [38,61,105]. It has been proposed that alignment of rods in streams during mound
building and in the outer domain of NFBs increase short-range C-signaling [72,73,145]. Both
an increasing level of intracellular CsgA and increasing intercellular C-signaling may allow
                                                28


thresholds to be reached that regulate differential gene expression required to coordinate
mound building with spore differentiation [83, 105].
    C-signaling regulates the expression of many genes during development [91]. In recip-
ient cells, C-signaling has been proposed to activate the transcription factor FruA post-
translationally, but the mechanism is unknown [50, 148]. FruA binds cooperatively with
a second transcription factor, MrpC, to promoter regions of several C-signal-dependent
genes [25, 116, 117, 143, 154, 159]. The cooperative binding sites exhibit different affinities
and locations that may explain how increasing C-signaling could differentially regulate gene
expression during development [89]. However, the relationships between C-signal-dependent
gene expression, and the arrangement and morphology of individual cells, have not been
visualized and quantified within NFBs.
    We discovered that M. xanthus undergoes development in submerged culture on thin
plastic slides amenable to high-resolution confocal microscopy. Using this approach in com-
bination with fluorescent protein reporters, we found that individual cells could be visualized
through several layers near the slide surface. We have quantified the spatiotemporal patterns
of cellular arrangement, morphology, and C-signal-dependent gene expression in NFBs. We
did not observe an inner domain of threefold lower cell density or evidence of spore precursor
transport by the movement of rods. Rather, we observed rods aligned at high density near
the radial center of early NFBs and our results support a new model in which cell density,
and the neighbor alignment and tangential orientation of rods, affects C-signaling efficiency
to pattern gene expression and spore maturation radially.
Results
    Nascent fruiting bodies contain cells intermediate in shape between rods and spores. To
examine the temporal pattern of cellular shape near the bottom of NFBs, wild-type strain
DK1622 was starved to initiate development under submerged culture conditions in wells
of thin-bottom plastic slides. The lipophilic dye FM 4-64 was added at the beginning of
starvation in order to stain membranes. In a control experiment, mounds formed at the
                                               29


normal time, 18 h poststarvation (PS), and darkened by 42 h (as visualized by bright-field
microscopy from the bottom), in the absence or presence of FM 4-64 (Fig. 2.8). The number
of sonication-resistant spores at 42 h was greater on average in the presence of FM 4-64 than
in its absence (p = 0.042). Importantly, FM 4-64 did not interfere with development.
    The FM 4-64 membrane stain revealed the shape of individual cells in images of optical
sections obtained using confocal laser scanning microscopy. Figure 2.1A shows a series of
images obtained near the bottom of the same NFB at the indicated times PS. At 18 h, cells
were rod-shaped. Strikingly, by 24 h, some rods had thickened and shortened, especially near
the center of the NFB. These cells were intermediate in shape between rods and round spores.
Herein, we refer to these cells as “transitioning cells” (TCs), since they are transitioning in
shape. Some TCs become spores, but we have not been able to determine the percentage
that do so (see below). At 20 to 40 µm from the center of the NFB, many rods appeared
to be aligned and oriented tangentially (i.e., nearly perpendicular to a radius of the NFB
drawn through the center of the rod) (Fig. 2.1B), perhaps corresponding to the outer domain
described previously [146] (see below). At 30 h, TCs were still abundant near the center
of the NFB and there appeared to be less rods and less alignment farther from the center.
By 42 h, many round spores had formed. TCs were also observed within the NFB and rods
were present outside the NFB (i.e., becoming peripheral rods if they remain outside). Figure
2.1C shows enlarged images of rods at 18 h, lanceolate-shaped TCs at 24 h, elliptical TCs
at 30 h, and round spores at 42 h.
    We tried to determine the fate of TCs by capturing images of an NFB stained with FM4-
64 (as described above) every 15 min from 30 to 42 h PS. The images were assembled into a
time-lapse movie. In most cases, the fate of individual TCs was unclear. Many TCs left the
plane of focus, possibly due to lysis or movement (driven by shape change and/or movement
of other TCs or rods). Importantly though, some TCs become round spores. Others changed
shape but did not become round. To illustrate the different outcomes, a small section of
Movie 1 was enlarged. Figure 2.1D shows frames from movie 2. Arrows point to three TCs
                                               30


at 30 h. The TC indicated by the orange arrow became a spore. The TC indicated by the
magenta arrow changed shape, transiently elongating and possibly fusing with another TC
or vesicle, and remaining irregularly shaped. The TC indicated by the white arrow changed
shape abruptly, as seen comparing 32 and 34 h PS in Figure 2.1D, but after 38 h we could
not be certain what happened to this TC. Although we could not determine the fate of all
TCs, we conclude that NFBs contain TCs from 24 h onward, and that some TCs become
spores by 42 h. Figure 2.9 summarizes our observations of the temporal pattern of cellular
shape near the bottom of NFBs.
    Several morphological characteristics distinguish transitioning cells from rods and spores.
The membrane stain FM 4-64 allowed visualization of cells in images of optical sections, but
quantification of three-dimensional cellular morphology was not possible because cells in
NFBs were not sufficiently well-separated. Thus, we mixed wild-type strain DK1622 with
one of four strains (YH7, YH8, YH14, YH15) derived from DK1622 that produce a fluorescent
protein (tdTomato or mNeonGreen) under control of a vanillate-inducible promoter (Pvan ).
The two strains were mixed at a ratio of five unlabeled DK1622 to one labeled derivative
(based on cell density measurements) and the mixture was starved as described above except
vanillate (not FM 4-64) was added at the beginning of starvation. In each case, mound
formation by the mixture was delayed by 6 to 9 h compared to that of unmixed strain
DK1622 and each unmixed derivative, in the presence or absence of vanillate. We do not
understand why mound formation by the mixtures was delayed, but it was reproducible. For
each of three biological replicates, a z-stack of optical sections (with z-resolution of 0.5 µm
and a step size of 0.2 µm) was collected at 27, 30, 36, and 48 h PS from near the bottom
of one NFB (i.e., 0.25 to 0.5 µm above the bottom of the well) to 5 µm up ( 10% of the
height of NFBs). For three additional biological replicates of the mixture of strains DK1622
and YH8, z-stacks were collected to 10 µm up. Light scattering prevented high-resolution
visualization of cellular morphology farther up. Excessive light exposure has been reported
to inhibit sporulation [105]. We did not observe inhibition at the laser intensities we used,
                                               31


but we limited z-stack collection to 5 µm up for the other 12 NFBs examined (3 NFBs ×
4 mixtures) in order to avoid excessive light exposure, especially since laser excitation at
two wavelengths was used to visualize both cellular morphology and gene expression in some
NFBs.
    Figure 2.2A shows representative images of optical sections near the bottom of one NFB
formed by the mixture of strains DK1622 and YH8. At 27 and 30 h, labeled YH8 cells
were mostly rod-shaped, although there appeared to be a few TCs and/or spores (keeping
in mind that an optical section may show only a portion of a cell). By 36 h, there appeared
to be less rods and more TCs and/or spores. At 48 h, cells appeared to be mostly spores,
although a few rods and apparent TCs were observed. Figure 2.2B shows enlarged images
of rods and apparent TCs and spores. To quantify cellular morphology, we developed a
computational pipeline involving three-dimensional segmentation of cells in image stacks
and classification of individual cells. Four morphological properties (equivalent diameter,
surface area, length, width) were used to classify cells as rods, TCs, or spores by fitting a
Gaussian mixture model. Although the morphological transition between rods and spores is
continuous, a mixture of three multivariate Gaussian distributions successfully captured the
distribution of the four morphological properties of the whole population and quantitatively
defined TCs. Scatter plots show clustering of cell classes based on each combination of
two properties (Figure 2.10). Three-dimensional renderings of randomly sampled cells of
each class were reconstructed from the image stacks and visually inspected to verify the
classification method. Figure 2.2C shows renderings of representative cellular morphologies
and Figure 2.11 shows the distributions of morphological properties of cells classified as rods,
TCs, and spores. Rods were either straight or bent, and had the greatest median equivalent
diameter (i.e., the diameter of a sphere with the same volume as the object) and length, and
the least median width and sphericity (i.e., a measure of how closely the shape of an object
resembles that of a sphere). Spores were nearly spherical, with the greatest median width and
the least median length and equivalent diameter, whereas TCs had median morphological
                                              32


properties intermediate between spores and rods.
    Proportions of rods and spores change as nascent fruiting bodies mature. To determine
the temporal patterns of rods, TCs, and spores in NFBs, we calculated the proportion of
each cell class for each z-stack collected as described above. Figure 2.2D summarizes the
temporal distribution for the 15 NFBs. At 27 and 30 h PS, 75% of the cells were rods and
small percentages of TCs and spores were observed. By 36 h, the proportion of rods had
decreased more than twofold, that of TCs had changed very little, and that of spores had
increased considerably to >50%. At 48 h, the proportions of rods and TCs were low, and
that of spores was 75%. The temporal changes in proportions were reproducible among
NFBs and the most dramatic change occurred between 30 and 36 h, with the proportions
of rods and spores decreasing and increasing, respectively, while the proportion of TCs was
10-15% at all times examined, indicating a steady flux of cells transitioning from rods to
spores (see Supplementary Discussion for more about TCs).
    Radial patterns of cell density, alignment, and orientation early correlate with spore
proportion later as nascent fruiting bodies mature. In terms of the spatial pattern of each
cell class for the 15 NFBs, we focused on the radial distributions, since all but three of the
z-stacks went to only 5 µm up from near the bottom, representing only 10% of the height
of the NFBs, and no vertical patterns were discernible. Figure 2.3A summarizes the radial
distribution of normalized cell density and the proportions of rods, TCs, and spores for the
15 NFBs. At 27 h PS, cell density was greatest at the radial center of NFBs and decreased
steadily to the edge of NFBs at 60 µm. Later, NFBs became more compact radially, with
most cells within 40 µm from the center. By 48 h, the maximum cell density had shifted
to 15 µm from the center. Previously, an inner domain of three-fold lower cell density
was described for NFBs formed on starvation agar at 24 h PS [146]. Under our submerged
culture conditions, we observed only a slight decrease in cell density near the center of NFBs
at 48 h (Fig. 2.3A). The radial distribution of normalized cell density was very similar for
each of the 15 NFBs (Fig. 2.12 shows a representative NFB formed by strain DK1622 mixed
                                              33


with each of the four labeled DK1622-derived strains). The proportions of rods, TCs, and
spores varied over time along the radii of NFBs, with very little variation between individual
NFBs (Fig. 2.12). At 27 and 30 h PS, the proportion of rods is slightly lower at the radial
center of NFBs than at the edge (p = 0.049 and p = 0.053, respectively) (Fig. 2.3A). The
difference increases dramatically by 36 h (p < 0.001), and persists at 48 h, when rods are
nearly absent within 20 µm from the center. The proportion of TCs was maximal 17 µm
from the center at 30 h (p = 0.003 comparing the maximum with the edge), and increased
closer to the center at later times. The proportion of spores was maximal at 24 µm from
the center at 36 h (p = 0.003 and p < 0.001 in comparison with the center and the edge,
respectively), and increased both closer to the center and closer to the edge by 48 h (p <
0.001 comparing the center with the edge).
    It is interesting that the proportions of TCs and spores were maximal at 20 µm from the
center and increased closer to the center as NFBs mature. Previously, it was proposed that
spore differentiation begins in the outer domain of NFBs and fills inward over time, with
the orbiting movement of rods in the outer domain transporting nonmotile spore precursors
to an inner domain of less ordered rods at three-fold lower cell density [145]. However, as
noted above, NFBs formed under submerged culture conditions do not have an inner domain
of three-fold lower cell density (Fig. 2.3A). Moreover, our time-lapse movie of membrane-
stained cells near the bottom of an NFB shows no strong evidence of orbiting movements
of rods transporting TCs or spores toward the center. Therefore, it is unlikely that the
transport model [145] accounts for the radial distributions of the proportions of TCs and
spores (Fig. 2.3A). C-signaling requires cells to move into alignment [81, 84, 90, 147] and
C-signal-dependent gene expression is necessary for sporulation [83, 89, 91, 93, 105, 167]. We
noted above that many rods appeared to be aligned with each other and oriented tangentially
to the circumference of the NFBs in a region possibly corresponding to the outer domain
described previously [146], in NFBs at 24 h PS (Fig. 2.1A and 2.1B). Since mound formation
by the mixtures of unlabeled strain DK1622 with labeled derivatives was delayed by 6 to 9
                                               34


h (Fig. 2.2A), we expected the neighbor alignment and tangential orientation of rods to be
greatest at 30 to 33 h, and at 20 to 40 µm from the radial center of NFBs. Surprisingly, we
discovered that rods are most aligned with each other near the center at 27 and 30 h (Fig.
2.3B). In agreement with our expectation, the tangential orientation of rods was greatest at
20 to 40 µm from the center at 27 and 30 h.
    To determine if the spatial arrangement of rods early in the development of NFBs is
predictive of spore maturation, we performed a cross-correlation analysis of the posterior
probabilities summarizing the radial patterns across NFBs and time points (Fig. 2.3A and
2.3B). Since the proportions of rods and spores change dramatically between 30 and 36 h (Fig.
2.2D), we calculated the cross-correlation coefficients of cell density, neighbor alignment, and
rod tangential orientation at 27, 30, and 36 h with spore proportions at 36 and 48 h (Fig.
2.3C to 2.3E). Strikingly, the best predictor of spore proportion at 36 h is the rod tangential
orientation at 30 h, followed by rod neighbor alignment. Cell density is a weaker predictor
for early spore formation. However, rod tangential orientation becomes a weak predictor
for spore formation at 48 h, whereas rod neighbor alignment and cell density become strong
predictors across all time points. These results support a new model in which the initial
increase in the proportion of spores 20 to 40 µm from the center of the NFBs at 36 h
can be explained by more efficient C-signaling between tangentially oriented rods relative to
the center (<20 µm) where cells are at higher density and aligned to their neighbors. The
spatial arrangement of cells at the center may not be as conducive to C-signaling, resulting
in a lag during spore development. Therefore, our model predicts that genes differentially
sensitive to C-signaling will exhibit distinct radial patterns of expression. To test the model,
we devised methods to quantify gene expression of individual cells, as described below.
    C-signal-dependent gene expression increases in transitioning cells and spores as nascent
fruiting bodies mature. To measure the spatiotemporal pattern of C-signal-dependent gene
expression, two of the strains (YH14, YH15) mixed with strain DK1622 in experiments
described above contained a C-signal-dependent promoter fused to tdTomato, in addition
                                                35


to Pvan fused to PfmgE . For NFBs formed by those mixtures, two z-stacks measuring red
and green fluorescence intensity were collected in the experiments described above. The
green fluorescence from Pvan allowed visualization of all labeled cells, whether or not red
fluorescence was observed from the C-signal-dependent promoter. The green fluorescence
also allowed cellular morphology measurements as described above and served as an internal
control for normalization of red fluorescence as described below. Red fluorescence intensity
from fusions of tdTomato to two different C-signal-dependent promoters, Pdev (strain YH14)
and PfmgE (strain YH15) was measured. Pdev was expected to be more sensitive than PfmgE
to a low level of C-signaling, based on in nitro and in vitro analyses of cooperative binding
sites for FruA (proposed to be activated by C-signaling) and MrpC. The tdTomato fusions
were integrated ectopically in the chromosome so as not to interfere with the native dev
and fmgE loci, which contribute to the timing and level of sporulation, respectively [140,
159]. Figure 2.4A shows representative images of optical sections near the bottom of one
NFB each formed by strain YH14 (designated dev ) or YH15 (designated fmgE ) mixed with
strain DK1622. The images show merged red and green fluorescence. As expected, rods
(red arrows), TCs (green arrows), and spores (blue arrows) in both NFBs exhibited green
fluorescence from Pvan -mNeonGreen. Strikingly though, rods of the dev strain (bearing
Pdev -tdTomato) showed much greater red fluorescence than rods of the fmgE strain (bearing
PfmgE -tdTomato), which were barely visible. Figure 2.4B shows enlarged images of green and
red fluorescence separately, as well as merged. Some TCs of the fmgE strain also showed
very little red fluorescence at 30 h (e.g., Fig. 2.4B), but in general TCs and spores of both
strains exhibited readily visible red fluorescence. The results suggest that Pdev is active at
27 h, earlier than PfmgE , and that PfmgE activity differs more than Pdev activity comparing
rods with TCs and spores. These observations are consistent with the expectation that
Pdev would be more sensitive to low-level C-signaling than PfmgE [25, 159, 176, 177], since C-
signaling appears to increase during development [83,93,105]. In the absence of C-signaling,
Pdev and PfmgE were inactive, since a csgA null mutant bearing Pvan -mNeonGreen and either
                                                36


Pdev -tdTomato (YH46) or PfmgE -tdTomato (YH45) failed to form NFBs and showed very
little red fluorescence at 18 to 48 h, although both strains exhibited green fluorescence (Fig.
2.13).
     To quantify C-signal-dependent gene expression of individual cells of the dev and fmgE
strains mixed with strain DK1622, cells were classified as rods, TCs, or spores based on
green fluorescence from Pvan . The distributions of red and green fluorescence intensities
were plotted (Fig. 2.14). At 36 and 48 h PS, some cells exhibited very low levels of both red
and green fluorescence (lower left quadrants), suggesting these cells were compromised, so we
eliminated them from further analyses. For the remaining cells, the distributions of red and
green fluorescence intensities were plotted separately for rods, TCs, and spores (Fig. 2.14B
and 2.14C). The laser and microscope settings were predetermined based on preliminary ex-
periments (Supplementary Results). For each cell, the red fluorescence intensity was divided
by the green fluorescence intensity to yield the relative flurorescence intensity (RFI) (Fig.
2.14D). Obviously, the distributions of RFI are broad for each cell class and both strains.
Nevertheless, the median RFI of the dev strain exceeded that of the fmgE strain by 4-fold
for all cell classes at 27 h and for rods and TCs at 30 h, consistent with the expectation
that Pdev would be more sensitive than PfmgE to a low level of C-signaling [25, 159, 176, 177].
Figure 2.4C shows the median (dot) and 90% credible interval (vertical line) of each RFI dis-
tribution (Fig. 2.14D). For both the dev and fmgE strains, the median RFI of rods changed
little as NFBs matured, while the RFI of TCs and spores increased for dev (p = 0.046 and
0.025) and fmgE (p = 0.0063 and 0.016). The RFI of TCs and spores for each strain was
then normalized to the RFI of rods at 27 h to quantify the increase in C-signal-dependent
gene expression during development of NFBs (Fig. 2.4D). By 48 h, C-signal-dependent gene
expression was 10 to 30-fold greater in both TCs (p = 0.046 and 0.0065) and spores (p =
0.013 and 0.0088) relative to early rods. This analysis supports that C-signal-dependent
gene expression is cell class-specific. We conclude that C-signaling increases gene expression
in TCs and spores as NFBs mature.
                                               37


    Distinct radial patterns of C-signal-dependent gene expression correlate differently with
cell density, rod neighbor alignment, and rod tangential orientation early in development
of nascent fruiting bodies. Our new model to explain the radial distribution of the spore
proportion in maturing NFBs (Fig. 2.3A) predicts that genes differentially sensitive to C-
signaling will exhibit distinct radial patterns of expression. As expected, Pdev appeared to be
more sensitive than PfmgE to a low level of C-signaling (Fig. 2.4) [25,159,176,177]. Therefore,
our model predicts that Pdev activity may be elevated throughout the NFBs regardless of
cell arrangement and C-signaling efficiency, whereas PfmgE activity may be more sensitive
to efficient C-signaling between tangentially oriented rods at 20 to 40 µm from the center
during maturation of the NFBs.
    To determine if cell arrangement affects C-signaling within the NFBs, we examined the
radial pattern of gene expression in TCs and spores at each time point. At 27 h, the RFIs
were uniform radially for both the dev and fmgE strains. By 30 h, the mean RFIs for TCs
and spores of the dev strain had increased slightly near the center, while the mean RFI for
spores of the fmgE strain was maximal at 20 to 40 µm from the center (p = 0.032). At
36 h, the mean RFIs for TCs and spores of the dev strain were uniform from the center
to 30 µm, then decreased 10-fold toward the edge (p = 0.028 and 0.004), and the pattern
persisted until 48 h (p = 0.076 and 0.016). This pattern is similar to the radial pattern of
the spore proportion at 48 h (Fig. 2.3A). Remarkably, the mean RFIs for TCs and spores of
the fmgE strain at 36 h were maximal at 20 to 40 µm from the center (p = 0.052 and 0.012)
(Fig. 2.5A), similar to the radial pattern of the spore proportion at 36 h (Fig. 2.3A). By
48 h, the mean RFIs for TCs and spores of the fmgE strain had increased near the center
(Fig. 2.5A), resulting in a pattern similar to that of the dev strain at 36 and 48 h (Fig.
2.5A) and the spore proportion at 48 h (Fig. 2.3A). We conclude that genes differentially
sensitive to C-signaling exhibit distinct radial patterns of expression, as predicted by our
model. Pdev activity increases at <20 µm from the radial center in TCs and spores as NFBs
mature, while PfmgE activity increases maximally at 20 to 40 µm from the center at 36 h
                                                38


before increasing closer to the center.
    To determine if the spatial arrangement of rods early in the development of NFBs is
predictive of C-signal-dependent gene expression in TCs and spores, we performed a cross-
correlation analysis of the radial patterns of rod arrangement at 27, 30, and 36 h (Fig. 2.3B)
with the radial patterns of gene expression at 30, 36, and 48 h (Fig. 2.5A). The early patterns
of cell density and rod neighbor alignment correlated strongly with the later patterns of RFI
for TCs and spores of the dev strain (Fig. 2.5B, left and center), similar to the correlations
with spore proportion (Fig. 2.3C and 2.3D). In contrast, rod tangential orientation early
correlated poorly with RFI for TCs and spores of the dev strain later (Fig. 2.5B, right).
Rod alignment at high cell density without tangential orientation appears to be sufficient
to explain the radial pattern of Pdev activity. On the other hand, rod tangential orientation
at 30 h correlated moderately well with RFI for TCs and spores of the fmgE strain at
36 h (Fig. 2.5C, right), similar to the correlations between rod tangential orientation and
spore proportion (Fig. 2.3E). Interestingly, cell density and rod neighbor alignment early
correlated strongly with RFI for TCs of the fmgE strain later (Fig. 2.5C, left and center).
Rod alignment at high cell density without tangential orientation appears to largely explain
the radial pattern of PfmgE activity in TCs as NFBs mature (although tangential orientation
may contribute to the pattern of PfmgE activity in TCs at 36 h), but tangential orientation
appears to be necessary to explain the pattern of PfmgE activity in spores at 36 h. We conclude
that the arrangement of cells early correlates differently with Pdev and PfmgE activity in TCs
and spores later, with greater sensitivity of Pdev than PfmgE to a low level of C-signaling
likely accounting for the distinct radial patterns of expression as NFBs mature (Fig. 2.5A).
A requirement for abundant C-signaling appears to make PfmgE activity a particularly good
marker of spore maturation.
Discussion
    Our results provide new information about the spatiotemporal patterns of cellular differ-
entiation and C-signal-dependent gene expression near the bottom of NFBs. In contrast to
                                                39


a previous report that examined M. xanthus development on agar [146], we found that rods
are aligned at high density near the radial center of early NFBs formed under submerged
culture conditions (Fig. 2.3B). We also observed no strong evidence of TC or spore transport
toward the center, as had been proposed previously [145]. Our results support a new model
to explain the radial distribution of spores in maturing NFBs – efficient C-signaling between
tangentially oriented rods leads to an initial increase of spores 20 to 40 µm from the center
at 36 h PS, and less efficient C-signaling between aligned rods at high density <20 µm from
the center results in some spores by 36 h and more by 48 h (Fig. 2.3 and 2.5). The model
is summarized in Figure 2.6. By communicating several features of cellular arrangement,
short-range C-signaling produces precise patterning of gene expression and cellular differen-
tiation in time and space. Our work has implications for understanding the formation of
other bacterial biofilms and multicellular animals.
    We did not observe an inner domain of three-fold lower cell density, as inferred previously
based on fluorescence intensity measurements of confocal sections of NFBs formed by cells
labeled with lipophilic dye [146]. A subsequent study also revealed an inner domain of lower
fluorescence in confocal sections of NFBs formed by cells expressing gfp [107]. Individual
cells were not enumerated in the previous studies. Other differences from our study include
observation of NFBs from above (rather than below), which may affect light scattering and
thus fluorescence intensity, and development on agar (rather than plastic), which may alter
motility and thus cellular arrangement. However, fruiting bodies formed on agar and viewed
in cross-section after high pressure freezing were uniformly spore-filled [67]. Cells did appear
to be arranged differently within 20 µm of the radial center of NFBs viewed from the bottom,
than in the surrounding annulus (Fig. 2.1A, 2.1B, 2.2A, and 2.4A). We found that rods are
aligned at high density near the center of early NFBs (Fig. 2.3B and 2.6). In this region, Pdev
activity increased in TCs and spores as NFBs matured (Fig. 2.5A). A high-affinity site for
cooperative binding of FruA (proposed to be activated by C-signaling) and MrpC stimulates
transcription from Pdev [25, 177]. Hence, we infer that rods aligned at high density within
                                               40


20 µm of the center engage in an intermediate level of C-signaling sufficient for Pdev activity
(Fig. 2.6). In comparison, we found that rods in the annulus 20 to 40 µm from the center
of early NFBs are not only aligned with neighbors but also tangentially oriented (Fig. 2.3B
and 2.6). Cell density decreases sharply farther from the center across this region of early
NFBs (Fig. 2.3A). Importantly, in this region activity of PfmgE (as well as of Pdev ) increased
by 36 h in TCs and spores (Fig. 2.5A). Two sites for cooperative binding of FruA and MrpC
are located upstream of PfmgE , a higher-affinity site that acts negatively by competing with
a lower-affinity site that acts positively [25, 159, 176, 177]. Therefore, we infer that rods 20
to 40 µm from the center of early NFBs engage in a high level of C-signaling sufficient for
PfmgE activity (Fig. 2.6). Rods >40 µm from the center of early NFBs appear to engage in
a low level of C-signaling (Fig. 2.6) based on low Pdev and PfmgE activity in TCs and spores
as NFBs mature (Fig. 2.5A), presumably due to low cell density and nearly random rod
alignment and orientation in that region of early NFBs (Fig. 2.3A and 2.3B).
     Early work demonstrated the importance of cell motility for efficient C-signaling [81, 90].
Mechanical alignment of nonmotile cells facilitated C-signaling, but aligned nonmotile cells
formed only 16% as many spores as motile cells [84]. A study of the role of C-signaling in
rippling behavior [147] led to the notion that end-to-end contacts between rods in the outer
domain of NFBs favor C-signaling [73, 145]. Modeling of rippling behavior instead predicted
that side-to-side contacts between rods favor C-signaling [200]. Our results correlate several
features of cellular arrangement in early NFBs with distinct radial patterns of C-signal-
dependent gene expression as NFBs mature (Fig. 2.5). We propose that restricted movement
and/or metabolism of rods aligned at high density within 20 µm of the radial center of early
NFBs leads to an intermediate level of C-signaling (Fig. 2.6). Lower cell density together
with rod alignment and tangential orientation in the annulus 20 to 40 µm from the center
of early NFBs is proposed to allow rod movement, proximity (possibly contact), and/or
metabolism that favors a high level of C-signaling (Fig. 2.6). Spore differentiation requires
efficient C-signaling [83, 93, 105]. Expression of numerous developmental genes depends on
                                                 41


C-signaling and some of these genes affect sporulation [89, 91, 143]. Expression of the dev
operon appears to allow a small protein, DevI [141], to delay sporulation for 6 h [140].
FmgE, which has a metalloprotease domain, enhances sporulation 8-fold [159]. We propose
that differential expression of dev, fmgE, and likely many other C-signal-dependent genes,
results from different levels of C-signaling due to the cellular arrangement in early NFBs,
and produces the observed spatiotemporal pattern of spores as NFBs mature (Fig. 2.3).
     Implications of patterning by C-signaling for understanding the formation of other bac-
terial biofilms and multicellular animals. By communicating several features of cellular
arrangement, short-range C-signaling produces precise spatiotemporal patterning of gene
expression and differentiation within NFBs. Earlier during development, C-signaling alters
cellular movements so that stable mounds are built [40, 72, 201]. Exactly how C-signaling
interfaces with motility control is unclear, but it involves the frz chemosensory system that
regulates reversal of the direction of cell movement [158, 166]. The versatility of C-signaling
for pattern formation by M. xanthus has implications for development of other bacterial
biofilms, which are patterned by myriad short-range interactions [12, 32, 120, 136, 161]. Such
interactions may communicate multiple aspects of cellular arrangement, judging from our
results, which would be important to consider in designing therapeutic strategies that target
the biogeography of polymicrobial infections [161].
     Eukaryotes use a variety of mechanisms, broadly categorized as lineage-dependent or
signal-dependent, to build multicellular structures with differentiated cells [41, 68]. Among
the signal-dependent mechanisms, some depend on cell-to-cell contact (juxtacrine) and oth-
ers work over a short distance (paracrine), so analogies to C-signaling abound, particularly
in animal development. For example, similarities between juxtacrine Notch signaling and
C-signaling have long been evident [75], but the postulated C-signal receptor has defied iden-
tification. If C-signaling instead acts over a short distance, analogies to paracrine signaling
via growth factor [110,172], Hedgehog [17], and Wnt [31] family proteins would be more apt.
Notably, all the paracrine signaling pathways, as well as contact-dependent Notch signaling
                                               42


pathways [15], play multiple roles in pattern formation during development of complex multi-
cellular animals. Much remains to be learned about the molecular mechanisms of patterning
by C-signaling and its evolutionary implications.
Materials and Methods
    Bacterial strains, plasmids, and primers. The construction of plasmids and bacterial
strains is described in the SI Appendix. The strains, plasmids, and primers used in this
study are listed in Table T2.1.
    Growth and development. M. xanthus was grown at 32°C in CTTYE liquid medium (1%
Casitone, 10 mM Tris-HCl [pH 8.0], 1 mM KH2 PO4 -K2 HPO4 , 8 mM MgSO4 , [final pH 7.6],
0.2% yeast extract) with shaking at 350 rpm. Development was performed under submerged
culture conditions [95] in 8-well µ-slides (Ibidi) with starvation buffer MC7 (10 mM mor-
pholinepropanesulfonic acid [MOPS, pH 7.0], 1 mM CaCl2) as described previously [139].
Cells from log-phase CTTYE cultures were collected by centrifugation and resuspended in
MC7 to 1,000 Klett units. The cell suspension (26 µL) was added to 174 µL of MC7 in each
well. To collect samples for quantification of sonication-resistant spores, cells were scraped
from the bottom of a well and the entire contents of a well was aspirated into a 1.5 mL
microcentrifuge tube. MC7 (300 µL) was added, the sample was sonicated, and spores were
counted microscopically as described previously [139].
    Microscopy. Images of NFBs were acquired with a Nikon A1 Laser Scanning Confocal
Microscope, which was configured on a Nikon Ti inverted platform with an XY automated
stage and a 100X objective. Fluorescence from FM 4-64 and tdTomato were examined
using a 560-nm laser for excitation and a 595/50 band pass emission filter. Fluorescence
from mNeonGreen was examined using a 488 nm laser for excitation and a 525/50 band
pass emission filter. Time-lapse confocal images were taken every 15 min. Images near the
bottom of NFBs were the first optical section above the bottom of the well, in which cells
could be clearly visualized, so 0.25 to 0.5 µm above the bottom of the well.
    Cell segmentation. Segmentation was done using a custom MATLAB (Mathworks Inc.)
                                               43


script. The pixel intensity of each image in each image stack was normalized by subtracting
the background intensity (mode of the intensity distribution) and dividing by the intensity
at the 99.99th percentile. To smooth high-frequency noise, a three-dimensional Gaussian
filter was applied (sigma = 2 pixels) to each image stack. To estimate the local curvature
of fluorescent objects in three dimensions, we calculated the second-order partial derivatives
of the voxel intensities with respect to each dimension (Hessian matrix). The direction
and magnitude of the local curvature was determined by decomposing the Hessian matrix
into eigenvectors and eigenvalues. The core of bright tubular and round structures (such
as fluorescent rods and spores) are characterized by large negative curvatures in at least 2
directions. Therefore, new images were created by summing and inverting the two largest
eigen values of each voxel to amplify the fluorescence signal likely to come from labeled cells.
The resulting image was eroded using a 3x3x3 structure to further separate close objects
prior to segmentation, binarized using a threshold at the 99th percentile to identify cells,
and opened with a 2x2x2 structure to filter stray voxels. Small bright objects of less than
100 voxels (0.308 µm3) were filtered. To separate cells that were not properly segmented,
objects that were larger than the median volume of the segmented objects were processed
further. First, the skeleton of the object was extracted to calculate the number of endpoints
(more than two indicates merged objects). Then branches with the highest angle from
the trunk were separated into independent objects until the remaining objects had only
two endpoints. After the image segmentation was completed, the morphological properties
of each object were calculated (volume, centroid coordinates, equivalent diameter, surface
area). The medial axis length and mean cell diameter along the cell axis were calculated
from the cell skeleton transformation. The mean fluorescence intensity of each object was
calculated from the original images after subtracting the background intensity for each stack.
To filter cells that were truncated at the image boundaries, cells with centroids outside the
z-heights range of 1 to 4.5 µm (for 5 µm-high z-stacks) or 1 to 9.5 µm (for 10 µm-high
z-stacks) were discarded.
                                               44


     Cellular morphology classification. Cells were classified based on their morphologies by
fitting a Gaussian mixture model (GMM) with four components using the mclust pack-
age [153]. The multivariate mixture was fitted once on the ensemble of segmented objects
across NFBs and time points using as variates: equivalent diameter, square root of cell surface
area, length of the cell medial axis, and mean cell width. All variates were log-transformed
before fitting to produce distributions that were approximately Gaussian for each variable.
The cells were assigned to one of four groups (rods, TCs, spores, debris) based on maxi-
mum likelihood. Cell classification was assessed by visual inspection of three-dimensional
renderings of cells randomly sampled from each cluster across NFBs. The renderings were
reconstructed from the image stacks using the Volume Viewer plugin of ImageJ [144]. Anal-
ysis of the Bayesian information criterion supports a mixture of 4 components for clustering
the ensemble of fluorescent objects.
     Comparisons of morphological parameters. The distributions of equivalent diameter,
skeleton size (referred as length in Fig. 2.10 and 2.11), and mean width were modeled using
an intercept-only log-normal model and the posterior probability distributions of the means
were estimated with Bayesian sampling using brms [22], with a separate intercept for each
cell class and the interaction between fruiting bodies and time points treated as a random
effect:
    M orph_param ∼ 0 + cell_class + (cell_class|f ruiting_body_id : time_point)           (2.1)
     Sphericity was modeled using an intercept-only beta model with a separate intercept
for each morphological classification. The significance of the differences between means
was calculated empirically from the posterior probability distributions of each parameter.
Bayesian sampling was done with 4 chains with 4,000 iterations each with default parameters.
The calculated credible interval is the Bayesian analog of the confidence interval. It is
                                               45


defined as the highest density interval within which a parameter value falls with a particular
probability. There is 90% probability that the population parameter is within the 90%
credible interval given the model and the sample data. For a broad overview of this kind
of technique, see [92]. Plots were generated in the R statistical environment using the
ggplot [183] and tidybayes [78] packages.
    Estimating the proportions of cell classes and fluorescence intensity across time points.
The number of cells in each class within the fruiting body was fitted with a multinomial
logistic regression. Time was treated as an ordinal predictor and variability between fruiting
bodies was treated as a random effect. The model was fitted using Bayesian sampling using
the brms package [22] to determine the median and 90% credible intervals of Nb_cellclass,
the number of cells for each cell class at each time point:
                   N b_cellclass|trials(total_cell_nb) ∼
                                                                                          (2.2)
                   mo(time_point) + (mo(time_point)|f ruiting_body_id)
    To model fluorescence intensity across time points and cell classes, the same approach was
used using a log-normal model and adding promoter and cell classes as interacting factors
in the regression
   mvbind(ch1_f luo, ch2_f luo) ∼
   promoter + cell_class + mo(time_point) : promoter : cell_class+
   (promoter + cell_class + mo(time_point) : promoter : cell_class|f ruiting_body_id)
                                                                                          (2.3)
    Bayesian sampling was done with 8 chains with 5,000 iterations each with default pa-
rameters. Plots were generated in the R statistical environment using the ggplot [183] and
tidybayes [78] packages.
    Estimating the proportions of cell classes and fluorescence intensity along the radii of
NFBs. The proportion of each cell class as a function of distance within the fruiting body
                                               46


was fitted with a multinomial logistic regression with a generalized additive model (GAM)
of tensor products to smooth the spatial dependency using the mgcv package [192]. Vari-
ability between fruiting bodies was treated as a random effect and each time point was fitted
independently. The model was fitted using Bayesian sampling using the brms package [22]
to determine the mean and 90% credible intervals of P_cellclass, the probability of a given
cell class at a given location:
 cell_class|trials(1) ∼ 1+P _cellclass t2(distance)+t2(distance|f ruiting_body_id) (2.4)
    To model the cosine similarity of the orientations between cells or between cells and the
tangent to the NFB circumference, and to model the mean relative fluorescence intensity of
each cell as a function of space, we used the same approach but replaced the multinomial
model with a beta or log-normal model. Bayesian sampling was done with default parameters
except for ‘adapt_delta’=0.99 and ‘inits’=0.
    Cross-correlation analyses of radial patterns within fruiting bodies. The cross correla-
tions between cell alignment, orientation, or density with the proportion of spores or fluo-
rescence intensities, were calculated by sampling the posterior probability distributions from
the GAM models 1,000 times every 1 µm from 0 to 60 µm and calculating the Pearson corre-
lation coefficient between sample pairs for the different combinations of parameters and time
points. The mean correlations and the empirical p-values were calculated from the posterior
probability distributions of the correlation coefficients.
Acknowledgements
    We dedicate this work to the memory of Dale Kaiser, whose insights and enthusiasm
inspired countless scientists, including L.K. as one of his PhD students. Daniel Parrell
initiated our confocal microscopy efforts and provided important suggestions and assistance
during the early part of the work. We are grateful to Melinda Frame and the Center for
                                              47


Advanced Microscopy at Michigan State University for assistance with microscopy. We thank
Oleg Igoshin and Zhaoyang Zhang for suggestions regarding data analysis, Montse Elias-
Arnanz for providing pMR3691, and Oleg Igoshin for helpful comments on the manuscript.
This research was supported by National Science Foundation grants MCB-1411272 and IOS-
1951025, and by salary support for L.K. from Michigan State University AgBioResearch.
Supporting Information
Results
    Quantification of C-signal-dependent gene expression of individual cells in nascent fruiting
bodies. Laser and microscope settings were predetermined based on preliminary experiments
with the dev and fmgE strains mixed with strain DK1622. We avoided excessive light
exposure so as not to interfere with development and not to saturate fluorescence from any
cells at any times poststarvation (PS). We collected z-stacks using the same settings at
different times PS so that fluorescence intensities can be compared directly for a given color
and strain mixture. The magnitude of differences over time can also be compared for the dev
and fmgE strains mixed with strain DK1622. For example, the difference in median (black
bar) red fluorescence intensity of spores at 48 h compared with rods at 27 h is 10-fold greater
for the fmgE strain than for the dev strain (Fig. 2.14). We note that green fluorescence
intensity of rods, TCs, and spores decreased similarly at later times PS (Fig. 2.14). The
decrease appears to be a characteristic of Pvan , since red fluorescence intensity from Pvan -
tdTomato also decreased (Fig. 2.15; note that cells with low fluorescence intensity were not
excluded in this experiment, which may explain the lower median green fluorescence intensity
as compared with Fig. S8B). Pvan is presumably recognized by the major sigma factor [70]
and the level of this sigma factor has been shown to decrease during development [64].
Although transcription from Pvan appears to decrease during development, green fluorescence
intensity decreased consistently across cell classes and strains, so we reasoned that green
fluorescence could serve as an internal control for normalization of red fluorescence from
individual cells, allowing comparisons between cell classes and between the dev and fmgE
                                               48


strains. Therefore, for each cell, the red fluorescence intensity was divided by the green
fluorescence intensity to yield the relative fluorescence intensity (RFI) (Fig. 2.14).
Discussion
    Transitioning cells are intermediate in shape between rods and spores, but similar to
spores in terms of C-signal-dependent gene expression. Although TCs in situ have not
been described previously, shortened cells have been observed after resuspending cells from
NFBs [11] . Multiple lines of evidence indicate that cell shortening involves conversion of fatty
acids from membrane lipids into cytoplasmic lipid bodies [11, 67] (3, 4). Interestingly, csgA
mutant cells fail to shorten and exhibit only 20% of the wild-type lipid body area/cell [11] ,
so CsgA cardiolipin phospholipase activity has been proposed to be important for lipid body
formation [14] . Lipid bodies gradually disappear as spores form, suggesting that lipids
provide metabolic fuel for cellular differentiation [67] . Since lipid bodies can be visualized
by fluorescence microscopy after staining with Nile red [67] , the relationships between TCs,
cell shortening, and lipid body formation and utilization can be investigated within NFBs
using the methods described herein. In addition to loss of 87% of the cell membrane
as lipid bodies form [11] , cellular shape change requires peptidoglycan remodeling. The
peptidoglycan sacculus of M. xanthus is mostly if not completely degraded when rod-shaped
vegetative cells are induced with glycerol to form spores [18] . This unicellular process
differs from the multicellular process of starvation-induced fruiting body formation studied
herein, yet in both cases a polysaccharide coat appears to rigidify spores [86, 119] (7, 8)
and the cytoskeletal protein MreB appears to recruit peptidoglycan remodeling enzymes
during both formation and germination of spores [119, 201] (8, 9). Nonuniform recruitment
or activity of peptidoglycan degrading enzymes could account for the irregular shapes and
different fates of TCs (Fig. 2.1, 2.2, 2.9). It is also possible that the peptidoglycan becomes
flexible, so the shapes and fates of TCs may be influenced by other cells and the extracellular
matrix in NFBs. Our efforts to track the fate of TCs demonstrated that some become spores,
but the fate of most was unclear. Importantly, C-signal-dependent gene expression in TCs
                                                 49


is similar to that in spores (Fig. 2.4 and 2.5), suggesting that many TCs have entered the
spore differentiation pathway.
Methods
    Construction of plasmids and bacterial strains. To construct pYH8, which was used to
induce production of tdTomato by addition of vanillate, primers Van-tdTom F and Van-
tdTom R were used to generate a PCR product of the tdTomato sequence, using plasmid
tdTomato-N1 (Addgene, item ID 54642). The product was digested with NdeI and EcoRI and
mixed with pMR3691 (also digested with the same restriction enzymes) in a Gibson assembly
reaction to enzymatically join the overlapping DNA fragments [58] . The reaction mixture
was transformed into Escherichia coli strain DH5α with outgrowth in Luria-Bertani (LB)
medium [149] prior to plating on LB agar (1.5%) supplemented with 15 µg/mL tetracycline
for selection at 37°C. The DNA sequence of the joined fragments was verified, and the plasmid
was transformed into M. xanthus strain DK1622 using electroporation [77] , with outgrowth
in Casitone-Tris (CTT) (1% Casitone, 10 mM Tris-HCl (pH 8.0), 1 mM KH2 PO4 -K2 HPO4 ,
8 mM MgSO4 , (final pH 7.6)) medium prior to plating on CTT agar (1.5%) supplemented
with 15 µg/mL tetracycline to select transformants. Integration of pYH8 was verified by
colony PCR using primers pMR3691 MCS G-F and pMR3691 MCS G-R. The resulting
strain was named YH8. A similar method was used to generate pYH7, which bears Pvan -
mNeonGreen (primers Van-mNeonGreen F and Van- mNeonGreen R were used to generate
a PCR product with pNSI-ptre-mNeonGreen (Dr. Daniel Ducat, personal communication)
as the template). pYH7 was transformed into strains DK1622 and LS2442 to create strains
YH7 and YH48, respectively. To construct pYH14, a DNA fragment containing the dev
promoter region was amplified from M. xanthus strain DK1622 genomic DNA with Pdev F
and Pdev R primers, which contain EcoRI and BamHI sites, respectively. The tdTomato
sequence was amplified from plasmid tdTomato- N1 (see above) using primers TomatoF and
TomatoR, which contain BamHI and HindIII sites, respectively. The two fragments were
joined to BamHI-EcoRI-digested pSWU19 in a ligation reaction. The reaction mixture was
                                              50


transformed into E. coli strain DH5α with outgrowth in LB medium prior to plating on LB
agar supplemented with 50 µg/mL kanamycin for selection at 37°C. The DNA sequence of the
joined fragments was verified, and the plasmid was transformed into M. xanthus strains YH7
and YH48 using electroporation [77] , with outgrowth in CTT medium prior to plating on
CTT agar supplemented with 40 µg/mL kanamycin to select transformants. Integration of
pYH14 was verified by colony PCR and the resulting strains were named YH14 and YH46.
The dev promoter region fragment of pYH14 was replaced by an fmgE promoter region
fragment to construct pYH15. The fmgE promoter region was amplified from genomic DNA
with primers PfmgE F and PfmgE R. pYH15 was transformed into YH7 and YH48, resulting
in strains YH15 and YH45.
                                             51


Strain     Description                                     Source
E. coli
DH5        λ- Φ80dlacZ ΔM15 Δ(lacZYA-argF )U169 recA1       [63]
           endA1 hsdR17 (rΚ- mΚ- supE44 thi-1 gyrA relA1
M. xanthus
DK1622     Laboratory wild-type strain                      [74]
LS2442     DK1622 with and in-frame deletion of csgA        [14]
YH7        DK1622 with MXAN_0018-MXAN_0019::pYH7           This  study
YH8        DK1622 with MXAN_0018-MXAN_0019::pYH8           This  study
YH14       YH7 with attB::pYH14                            This  study
YH15       YH7 with attB::pYH15                            This  study
YH45       YH48 with attB::pYH15                           This  study
YH46       YH48 with attB::pYH14                           This  study
YH48       LS2442 with MXAN_0018-MXAN_0019::pYH7           This  study
               Table 2.1: Strains used in this study.
Plasmid    Description                                     Source
pMR3691    MXAN_0018-MXAN_0019-PR3-4::vanR-                 [70]
           Pvan ::lacZ, Tetr
pSWU19     Cloning vector with Mx8 attP locus for integra-  [193]
           tion at attB in the M. xanthus genome, Kmr
pYH7       MXAN_0018-MXAN_0019-PR3-4::vanR-                This study
           Pvan::mNeonGreen, Tetr
pYH8       MXAN_0018-MXAN_0019-PR3-4::vanR-                This study
           Pvan::tdTomato, Tetr
pYH14      pSWU19 with Pdev -tdTomato, Kmr                 This study
pYH15      pSWU19 with PfmgE -tdTomato, Kmr                This study
              Table 2.2: Plasmids used in this study.
                                  52


Primer       Description                              Source
Van-tdTom-F  GATGCGAGGAAACGCATATGGTGAGCAA             This study
             GGGCGAG
Van-tdTom-R  GTACGCGTAACGTTCGAATTCTTACTTGT            This study
             ACAGCTCGTCCATG
Van-         GATGCGAGGAAACGCATATGGTCAGCAA             This study
mNeonGreen-F AGGTGAAGAAGAC
Van-         GTACGCGTAACGTTCGAATTCCTTGTACA            This study
mNeonGreen-R GTTCGTCCATACCCATC
Pdev-F       GCTAGAATTCCGCCGTTGGCTCGGATGC             This study
Pdev-R       GCTAGGATCCGACCTGAGCTGATACCAAC            This study
PfmgE-F      GCTAGAATTCCATTAACGGGCGATCTT              This study
             CCTC
PfmgE-R      GCTAGGATCCCTAAAGGCGATTGCGTTC             This study
             CTGC
Tomato-F     GCTAGGATCCGAAAGGAGGCGCATATGG             This study
             TGAGCAAGGGCGAG
Tomato-R     GCTAAAGCTTTTACTTGTACAGCTCGTC             This study
             CATG
pMR3691 MCS- CACGATGCGAGGAAACGCA                      This study
G-F
pMR3691 MCS- CACCGGTACGCGTAACGTTC                     This study
G-R
               Table 2.3: Primers used in this study.
                                 53


Figure 2.1: Visualization of transitioning cells near the bottom of a nascent fruiting
body. M. xanthus wild-type strain DK1622 was starved to initiate development under
submerged culture conditions. FM 4-64 (2 µg/mL) was added at the start of starvation.
Confocal images were acquired at the indicated times poststarvation (PS) to show FM 4-64
staining of cell membranes (red fluorescence shown as grayscale). Images are representative
of three biological replicates. (A) Image of an optical section near the bottom of the same
nascent fruiting body (NFB) during development. Red arrow, rod-shaped cell. Green arrows,
transitioning cells (TCs). Blue arrow, spore. Bar, 20 µm. (B) Cartoon depiction of tangential
orientation. Upper left quadrant depicts one tangentially oriented rod. Arrow indicates
radius of NFB (gray). Other quadrants depict many smaller tangentially oriented rods. (C)
Enlarged images of representative cells. Images from panel A were enlarged to show cells
with a particular shape (a rod at 18 h, TCs at 24 and 30 h, and a spore at 42 h). Colored
arrows point to the same cell classes as in panel A. Bar, 2 µm. (D) Tracking the fate of TCs.
Images from Movie S2. Orange arrows, TC that becomes a spore. Magenta arrows, TC that
changes shape. White arrows, TC whose fate is uncertain. Bar, 2 µm.
                                              54


Figure 2.2: Classification of cellular morphologies and the temporal patterns of
rods, transitioning cells, and spores. Labeled cells that produce a fluorescent protein
under control of a vanillate-inducible promoter (strain YH7, YH8, YH14, or Y15) were
mixed with unlabeled wild-type cells (strain DK1622) at a ratio of 1 to 5 and starved under
submerged culture conditions. Vanillate (0.5 mM) was added at the start of starvation. Z-
stacks of optical sections of the same nascent fruiting body (NFB) for each of three biological
replicates were acquired at the indicated times poststarvation (PS) to show fluorescence
from tdTomato or mNeonGreen. Segmented cells from all z-stacks were classified as rods,
transitioning cells (TCs), or spores (5256, 1459, and 3213 cells, respectively). (A) Confocal
images of the same NFB. Images show an optical section near the bottom of a representative
NFB formed by the mixture of strains YH8 and DK1622. Red fluorescence from YH8 cells
is shown as grayscale. Red arrow, rod. Green arrows,TCs. Blue arrow, spore. Bar, 20 µm.
(B) Enlarged images of representative cells. Images from panel A were enlarged to show
cells with a particular shape (a rod at 27 h, TCs at 30 and 36 h, a spore at 48 h). Colored
arrows point to the same cell classes as in panel A. Bar, 2 µm. (C) Representative cellular
morphologies. Three-dimensional renderings reconstructed from the image stacks are shown
for three cells of each class. Apparent features <0.5 µm are artifacts created by shading
during the rendering process. Bar, 1 µm. (D) Proportion of each cell class at the indicated
times PS. Dot, median. Shaded region, 90% credible interval.
                                               55


Figure 2.3: Radial patterns of cell density, alignment, and orientation early cor-
relate with spore proportion later as nascent fruiting bodies mature.In the ex-
periments described in the legend of Figure 2.2 and in the text, segmented cells from all
z-stacks were classified as rods, transitioning cells, or spores. (A) Radial distribution of
cell classes and density. The proportion of each cell class at different times poststarvation
is shown radially from the center (0 µm) to the edge (60 µm) of nascent fruiting bodies
(NFBs). Total cell density is normalized to the maximum between the center and the edge
of NFBs. Line, median. Shaded region, 90% credible interval. (B) Radial patterns of rod
neighbor alignment and tangential orientation. Alignment is the cosine of the angle of a rod
to its neighbors averaged using a Gaussian kernel (sigma = 2.5 µm) (1, perfect alignment;
0, orthogonal). Orientation is the cosine of the angle of a rod with the circumference of the
NFB (1, tangent to the circumference; 0, orthogonal). (C) Cross-correlations between the
radial patterns of total cell density, rod neighbor alignment, and rod tangential orientation
early, and that of the spore proportion later. Values are Pearson correlation coefficients and
color shade indicates strength of positive (purple) or negative (yellow) correlation (* p <
0.05, ** p < 0.01).
                                               56


Figure 2.4: C-signal-dependent gene expression increases in transitioning cells
and spores as nascent fruiting bodies mature.As described in the Figure 2.2 legend,
strains YH14 bearing Pdev -tdTomato (designated dev ) or YH15 bearing PfmgE -tdTomato
(designated fmgE ) were mixed with unlabeled wild-type strain DK1622 at a ratio of 1 to
5 and co-developed under submerged culture conditions. Both the dev and fmgE strains
also contain Pvan-mNeonGreen. Vanillate (0.5 mM) was added at the start of starvation.
Z-stacks of confocal images of the same nascent fruiting body (NFB) were acquired at the
indicated times post starvation (PS), measuring both red and green fluorescence. Images are
representative of three biological replicates. (A) Confocal images of the same NFB. Images
show merged (M) red and green fluorescence. Red arrows, rods. Green arrows, transitioning
cells (TCs). Blue arrows, spores. Bar, 20 µm. (B) Enlarged images of representative cells.
Images from panel A were enlarged to show green (G), red (R), or M fluorescence of cells
with a particular shape (rods at 27 and 30 h, TCs at 30 and 36 h, spores at 36 and 48 h).
Colored arrows point to the same cell classes as in panel A. Bar, 2 µm. (C) Quantification
of gene expression. Segmented cells from z-stacks were classified as rods, TCs, or spores
based on green fluorescence. For each cell class, the log10 (relative fluorescence intensity)
(i.e., red/green) at each time PS is plotted for the dev and fmgE strains as indicated. Dot,
median. Vertical line, 90% credible interval. (D) Gene expression in TCs and spores. The
relative fluorescence intensity of TCs and spores was normalized to that of rods at 27 h
(dashed line at y = 0). Dot, median. Vertical line, 90% credible interval.
                                               57


Figure 2.4 (cont’d)
        58


Figure 2.5: Radial patterns of cell density, alignment, and orientation early cor-
relate with C-signal-dependent gene expression later as nascent fruiting bodies
mature. In the experiment described in the legend of Figure 2.4 and in the text, both
red and green fluorescence was quantified for all segmented cells from the z-stacks. The
results from three biological replicates were combined. (A) Radial distribution of C-signal-
dependent gene expression. For transitioning cells and spores, the log10 (relative fluorescence
intensity) (i.e., red/green) for the dev and fmgE strains at each time poststarvation is shown
radially from the center (0 µm) to the edge (60 µm) of nascent fruiting bodies. Line, median.
Shaded region, 90% credible interval. (B) Cross-correlations between the radial patterns of
total cell density, rod alignment, and rod tangential orientation early, and that of the dev
strain relative fluorescence intensity (RFI) later. Values are Pearson correlation coefficients
and color shade indicates strength of positive (purple) or negative (yellow) correlation. (C)
Cross-correlations as in panel B except with the fmgE strain RFI (* p < 0.05, ** p < 0.01).
                                               59


Figure 2.6: Model showing the inferred radial pattern of C-signaling based on
observed patterns of cellular arrangement, morphology, and gene expression as
nascent fruiting bodies mature.The radial distributions of cell classes, cell density, and
the neighbor alignment and tangential orientation of rods are depicted at three time points
based on Figure 2.3A and 2.3B. The inferred radial pattern of C-signaling at 30 h poststar-
vation is based on correlations with later C-signal-dependent dev and fmgE expression in
transitioning cells and spores (Fig. 2.5).
                                            60


Figure 2.7: Cartoon and scanning electron micrographs of M. xanthus develop-
ment.The cartoon at the top depicts starving rod-shaped cells, which coordinate their move-
ments to build a mound, with some cells lysing (dashed outline) during the process. The
mound matures into a fruiting body as some rods differentiate into round spores, while other
rods remain outside the fruiting body as peripheral rods. The scanning electron micrographs
at the bottom show the mound and fruiting body stages. The cartoon is adapted from [89]
and the micrographs are from [94] .
                                             61


Figure 2.8: FM 4-64 does not interfere with M. xanthus development. Wild-type
strain DK1622 was subjected to starvation under submerged culture conditions. FM 4-64
(2 µg/mL) was added at the start of starvation or withheld. (A) Bright-field images of the
bottom of nascent fruiting bodies (NFBs). The same NFB was examined at the indicated
times poststarvation (PS). Darkening of mounds typically reflects the formation of spores.
Images are representative of three biological replicates. Bar, 20 µm. (B) Quantification of
sonication-resistant spores. Samples were collected at 42 h PS, sonicated, and the number
of spores/mL was determined. In the main graph, each symbol represents one biological
replicate. The insert shows the posterior probability distribution of the difference in the
number of sonication-resistant spores/mL in response to 2 vs. 0 µg/mL of FM 4-64. Vertical
line, mean. Thick and thin horizontal lines, 90% and 98% credible intervals, respectively.
                                             62


Figure 2.9: Temporal pattern of cellular shape near the bottom of nascent fruiting
bodies. Based on our observations of cells undergoing submerged culture development
and visualized by membrane staining with FM 4-64, the proportion of cells exhibiting the
indicated shapes at different times poststarvation was approximated. Rods undergo lysis
or become transitioning cells or peripheral rods. Transitioning cells may undergo lysis or
persist, but some become spores.
                                            63


Figure 2.10: Cell classification using Gaussian mixture modeling. Each point on
the scatter plots represents a cell from the aggregated data across time and fruiting bodies,
collected in experiments described in the legend of Figure 2 and in the text. Clusters were
determined to be rods (red), transitioning cells (green), or spores (blue) according to their
morphology (equivalent diameter, surface area, length, and mean width). The Pearson
correlation coefficients are indicated for each pair of properties for all cells (black font) and
for each cell class as indicated (*** indicates p-value < 0.001). Histograms on the left are
the marginal distributions for each cell class for the corresponding morphological property.
                                               64


Figure 2.11: Morphological property distributions of cells classified as rods, transi-
tioning cells, or spores. In the experiments described in the legend of Figure 2 and in the
text, segmented cells from all z-stacks were classified as rods, transitioning cells, or spores.
The graph on the left in each panel shows the distribution of the indicated property for each
cell class. Gray dot, individual cell. Black dot, median. The graph on the right in each panel
shows the posterior probability intervals of the property difference for the indicated pair of
cell classes. Dot, median. Thick and thin horizontal lines, 90% and 98% credible intervals,
respectively. (A) Equivalent diameter. (B) Cell length. (C) Mean cell width. (D) Sphericity.
                                               65


Figure 2.12: Spatiotemporal distribution of cell classes and density in individual
nascent fruiting bodies. In the experiments described in the legend of Figure 2 and in
the text, segmented cells from z-stacks were classified as rods, transitioning cells, or spores.
The proportion of each cell class at the indicated time poststarvation is shown radially from
the center (0 µm) to the edge (60 µm) of one representative nascent fruiting body (NFB)
formed by mixing the strain indicated on the right with unlabeled strain DK1622. Total cell
density is normalized to the maximum between the center and the edge of the NFB. Line,
median. Shaded region, 90% credible interval.
                                              66


Figure 2.13: A csgA mutant fails to form nascent fruiting bodies and fails to
express C-signal-dependent genes. M. xanthus csgA mutant strains YH46 bearing Pdev -
tdTomato (designated dev ) or YH45 bearing PemphfmgE -tdTomato (designated fmgE ) were
starved under submerged culture conditions. Both the dev and fmgE strains also contain
Pvan -mNeonGreen. Vanillate (0.5 mM) was added at the start of starvation. Wells were
examined at 6-h intervals from 18 to 48 h poststarvation (PS) using confocal microscopy
to visualize both red and green fluorescence. Nascent fruiting bodies failed to form, cells
remained rod shaped, and the rods exhibited green fluorescence but showed very little red
fluorescence. Confocal images near the bottom of the biofilm were acquired at the indicated
times PS. Images show merged (M) red and green fluorescence. Bar, 2 µm.
                                            67


Figure 2.14: C-signal-dependent gene expression in individual cells of nascent
fruiting bodies. In the experiment described in the legend of Figure 4 and in the text,
segmented cells from z-stacks were classified as rods, transitioning cells, or spores based
on green fluorescence. The results from three biological replicates were combined. (A)
Distributions of red and green fluorescence intensities. For each cell (dot), the log10 (red
fluorescence intensity) and the log10 (green fluorescence intensity) at each time poststarvation
(PS) is plotted for the dev and fmgE strains as indicated. Cells in the lower left quadrant of
each plot (i.e., exhibiting low red and green fluorescence) were eliminated from analyses in
panels B-D. (B) Distributions of red fluorescence intensity. For each cell class, the log10 (red
fluorescence intensity) at each time PS is plotted for the dev and fmgE strains as indicated.
Dot, individual cell. Bar, median. (C) Distributions of green fluorescence intensity. Same as
panel B except the log10 (green fluorescence intensity) is plotted. (D) Distributions of relative
fluorescence intensity. Same as panel B except the log10 (relative fluorescence intensity) (i.e.,
red/green) is plotted.
                                                68


Figure 2.14 (cont’d)
         69


Figure 2.15: Vanillate-inducible gene expression in individual cells of nascent fruit-
ing bodies. As described in the Figure 2 legend, strain YH7 bearing Pvan -mNeonGreen or
YH8 bearing Pvan -tdTomato were mixed with unlabeled wild-type strain DK1622 cells at a
ratio of 1 to 5 and co-developed under submerged culture conditions. Vanillate (0.5 mM)
was added at the start of starvation. Z-stacks of confocal images of the same nascent fruiting
body were acquired at the indicated times poststarvation (PS) without changing the micro-
scope settings. Segmented cells from z-stacks were classified as rods, transitioning cells, or
spores. The results from three biological replicates were combined. For each cell class, the
distribution of log10 (fluorescence intensity) (i.e., green or red) at each time PS is plotted.
Dot, individual cell. Bar, median.
                                               70


CHAPTER 3 - COST-BENEFIT ANALYSIS OF PERITRICHOUS FLAGEL-
LATION IN LOW VISCOSITY ENVIRONMENTS
Authors: Joshua L. Franklin, Sarya Derado, Marc Erhardt, and Yann S. Dufour
                                           71


Abstract
    Flagellar motility is common in bacteria, but we know little about the cost-benefit trade-
offs of synthesizing more than one flagellum. Rough estimates from biochemical data and
rapid loss-of-function in evolutionary experiments [138] suggest a relative cost of one to
two percent per flagellum in Escherichia coli. Given such a high cost, what is the cost-
benefit relationship that determines the optimal number flagella? First, we determined the
metabolic cost of flagella from optical density measurements of Salmonella enterica batch
cultures using Bayesian inference with a mixed-effect model. We found that the rate of
growth decreases linearly by 1.2% [0.9 – 1.5% 99% CrI] per flagellum. Then, we quantified
the swimming performance of cells with different number of flagella by tracking swimming
cells with fluorescent flagellar hooks. We found that at low viscosity, the number of flagella
does not measurably impact swimming performance. Therefore, we derived a theoretical
model to predict if the expression of multiple flagella could be driven by the absence of
mechanisms to segregate flagella between daughter cells during division. We predicted the
optimal number of flagella given the cost-benefit ratio of producing flagella and provided
additional support using agent-based simulations of cells swimming in exponential gradients
of nutrients. Our predictions suggest that the synthesis of multiple flagella can be driven
by the random inheritance of flagella for up to four or five flagella, as seen in E. coli or S.
enterica, but additional selective pressures are required to explain the expression of higher
flagella numbers, as is found in Bacillus subtilis.
Introduction
    Owing to their small size, bacteria inhabit a world in which access to resources is severely
limited by diffusion [135, 198]. As a result, active motility confers a significant benefit and is
widespread in bacteria [43, 71, 79]. While undirected motility alone is advantageous, chemo-
taxis further increases the benefits by allowing cells to bias their movement toward favorable
conditions even in the presence of moderate turbulence or physical obstacles [168, 178, 181].
                                                72


Bacteria use several mechanisms for motility, but flagellar motility is the most common and
well-studied [71]. Flagella location, numbers, and morphology vary considerably between
bacterial lineages. Some lineages may have a single flagellum at the cell pole (e.g., Vibrio
cholerae) or distribute dozens of flagella over the cell body (e.g., Bacillus subtilis). While it
is expected that different environments apply different selective pressures, the constraints,
benefits, and trade-offs underlying the evolution of diverse flagellar morphologies across bac-
terial lineages is still unclear [54, 123, 168, 197].
    Generally, bacteria expressing a single flagellum tend to be found in aqueous environments
while bacteria expressing many flagella live in more physically heterogeneous environments
such as the gut of animal hosts or soil [151]. Some lineages, such as Vibrio alginolyticus
or Azospirillum brasilense, possess multiple flagellar systems or can regulate the production
of lateral flagella to allow swarming motility or swimming in high viscosity environments
[5, 56, 114]. For B. subtilis, increasing the number of flagella affects swimming behavior
and improves spreading on semi-solid surfaces [80, 121, 132, 132]. For Shewanella oneidensis,
different flagellation patterns were favored when experimental conditions were changed [195].
However, predicting the benefits of flagellation patterns across environments is complicated
by the fact that flagella can be used for more that motility, such as attaching to surfaces,
forming biofilms, or pathogenesis [8, 28,34, 129]. Nevertheless, making many flagella appears
to be beneficial in various contexts.
    Making and powering flagella has a significant metabolic cost for bacteria [9, 51, 109,
115]. Non-motile mutants often outcompete their motile siblings in batch cultures with
a significant growth rate advantage [96, 134, 157]. Synthesizing hundreds of thousands of
flagellin monomers to extend the flagellar filament (which can account for more than 1% of
total protein synthesis) has imposed a strong selective pressure to optimize the composition
of flagellins toward energy efficient amino acids [9, 157]. In addition, the flagellar motor
expends trans-membrane electrochemical potentials (either proton- or sodium-motive-force),
which otherwise drive metabolic processes [160]. The substantial cost of flagellar motility
                                                  73


explains why bacterial lineages that have been examined have evolved elaborate and tight
regulation of flagellar synthesis [3, 29, 62, 111, 118, 202]. Therefore, metabolic investment in
flagellar motility creates a strong fitness trade-off that should determine an optimal number
and arrangement of flagella for bacterial lineages in different environments.
    A functional and quantitative characterization of fitness trade-offs is necessary to predict
evolutionary trajectories and explain the diversification of complex phenotypic traits across
lineages. Despite being well characterized, some aspects of the flagella system in Enter-
obacterales, such as Escherichia coli and Salmonella enterica, are not understood from an
evolutionary point of view. These bacteria assemble three to five peritrichous flagella but
are neither fast swimmers at low viscosity nor proficient at swarming on wet surfaces when
compared to other bacterial lineages [45, 130, 131]. In addition, their swimming behavior
does not appear to be sensitive to variation in the number of flagella [113,171]. Therefore, it
is not immediately evident why E. coli or S. enterica make three to five flagella on average.
    Here, we characterize the cost-benefit trade-off for peritrichous flagellation with a low
number of flagella using a genetically engineered strain of S. enterica. We gradually induced
flagellar expression and fluorescently labeled flagellar hooks to count the number of flagella in
single cells. We then determined the direct cost of flagellar expression by precisely measuring
the growth rates of populations making different numbers of flagella. We also used single
cell tracking and labels to determine that variation in flagella number has little effect on
swimming performance. However, we demonstrated that making multiple flagella can provide
a significant fitness advantage when the partition of flagella is not controlled during cell
division using an agent-based model of flagellar inheritance.
Results
The mean length of flagellar filaments is robust to changes in FlhDC activity.
    To characterize how change in the number of flagella affects growth rate and swimming
performance in S. enterica serovar typhimurium LT2, we generated a mutant strain in which
we can ectopically control flagellar expression and count flagella in single cells (EM8242).
                                                 74


We placed the master regulator of flagellar gene expression (FlhDC) under the control of
an inducible promoter (pTet) and introduced a cysteine residue in FlgE (S171C) to enable
specific fluorescent labelling of flagellar hooks. Another strain producing FlgE-S171C but
with a wild type flhDC locus was generated for comparison (TH9677).
    Flagellar hooks produce small and consistent fluorescent foci that can be accurately
counted using automated image analysis as opposed to flagellar filaments that have complex
overlapping shapes (Figure 3.1A). We visually verified in 500 cells that fluorescently labeled
flagellar hooks were always detected at the base of flagellar filaments using our automated
image analysis protocol.
    Next, we characterized the production of flagellar hooks as a response to ectopic induction
of pTet-flhDC with anhydrotetracycline (AnTc) (Figure 3.1B). In EM8242, no fluorescent
foci were detected in the absence of inducer and production saturated to 2 foci per cell
on average with AnTc concentrations above 1.5 pg/mL. In TH9677, the wild type flhDC
promoter produced 4 foci per cell on average. The distribution of the number of foci in
each population was most consistent with a negative binomial distribution with a constant
overdispersion parameter.
    Finally, we examined if the number of hooks expressed had an effect on the lengths of
the flagellar filaments produced in each cell (Figure 3.1C). The length of each flagellum
was measured by manually tracing flagella with an image processing software for 100 cells
randomly selected from each of the 5 induction conditions. The distribution of filament
lengths was fitted with a gamma distribution with a constant shape parameter. The effect of
the number of hooks on the filament length was estimated to be 0.05 µm/hook ([-0.07,0.16]
95% credible interval (CrI)). There is no strong evidence that changing FlhDC expression has
a large effect of the lengths of the produced flagellar filaments in our experimental conditions.
Therefore, we can assume that the overall cost for producing flagella is proportional to the
number of detected flagellar hooks.
                                                75


Figure 3.1: Ectopic control of flagellar expression and flagellar hook detection in S.
enterica. A) Composite image of fluorescently labeled EM8242 after induction of flagellar
expression ([AnTc] = 2,000 pg/mL). DNA was stained with DAPI (blue), flagellar hook
with a maleimide conjugated fluorescent dye (red), and flagellar filament with anti-flagellin
antibodies conjugated to a fluorescent dye (green). B) Number of flagellar hook foci as a
function of inducer concentration in single cells (750 cells per condition). Each color dot
represents a cell (red: EM8242, blue: TH9677). The estimated mean for each population
is indicated in black (dot: mean, line: 95% CI). C) Mean lengths of flagella as a function
of the number of foci in each cell (symbols represent different AnTc concentrations). The
relationship was fitted using a linear model with a gamma response (black line: mean, light
and dark blue: 50% and 95% CI).
Cell growth decreases proportionally to the number of flagella produced.
    To quantify the effect of flagellar synthesis on the average population growth rate, we
measured the change in optical density of batch cultures producing different numbers of
flagella. Expression of FlhDC was controlled in EM8242 by adding varying concentrations
of AnTc. Cultures were grown in Vogel-Bonner E minimal salts with 2% w/v yeast extract
added to prevent the bimodal regulation of FlhDC by RflP [180].
    The difference in growth rates between a strain producing no flagella (EM8242 without
AnTc) or 4 flagella (TH9677) was very small, as expected (Figure 3.2A). Cultures were
inoculated with cells already growing exponentially in the same conditions to avoid any lag
in growth and the growth rates were calculated while culture were still at low cell density
(OD600 < 0.4). To rule out any direct effect of AnTc, we also grew TH9677 in the presence
of 20 ng/mL AnTc and found no negative effect on growth rate (Figure 3.6).
                                              76


    Because experimental variability was relatively large compared to the expected effect
of flagellar synthesis, precise estimates of growth rates were obtained through extensive
replication (between 30 and 200 per conditions). As a result, we were able to resolve the
monotonic decrease in growth rate as flagellar synthesis increases (pval < 10-4 ) (Figure 3.2B).
As expected, TH9677 grew slower than fully induced EM8242.
    Finally, the cost of flagellar synthesis was estimated by directly comparing the populations
growth rates against mean foci numbers by matching populations according to the AnTc
concentrations used (Figures 3.1B and 3.2B). The relationship between the two phenotypes
was consistent with a linear model within the range of conditions tested (Figure 3.2C). We
estimated the slope and intercept of the linear relationship by integrating the models we
used to describe each phenotype independently. The mean growth rate of cells that do not
produce flagella was estimated to be 1.36 hr-1 ([1.35,1.38] 95% CrI) while the reduction in
growth rate for each flagellum produced was estimated to be 0.0159 hr-1 /foci ([0.0131,0.0192]
95% CrI). Therefore, the relative cost for each flagellum produced is 1.17% ([0.96%,1.41%]
95% CrI) in our growth conditions.
    Our direct measurement of the relative metabolic cost of producing flagella in S. enterica
is consistent with prior estimates obtained from competition experiments and biochemical
calculations [9, 134].
Swimming performance at low viscosity is not sensitive to the number of flagella.
    While there is a clear cost for producing several flagella, it is not yet clear how their
number would affect the swimming behavior of S. enterica in liquid environments. We
tested if varying number of flagella had an effect on the swimming performance of single cells
using an experimental assay previously described [45]. We tracked the trajectories of single
TH9677 cells swimming freely at low viscosity for several minutes and then captured high-
magnification phase contrast and epifluorescence micrographs (Figure 3.3A). Morphological
parameters and the number of fluorescently labeled flagellar hooks were determined for each
cell. Cells that were not motile or had a size (length or width) not within 3 standard
                                                  77


Figure 3.2: The relative metabolic cost of flagellar synthesis. A) Calculating growth
rate by monitoring change in culture optical density at 600 nm (OD600 ) over time. Rep-
resentative measurements from non-induced EM8242 (red circle) and TH9677 (blue circle)
are overlaid. Growth rates (r), background OD, and OD of the inoculum were estimated
by least-square fitting of an exponential growth model (solid lines) while cells were at low
density. B) Calculated growth rates as a function of inducer concentration for EM8242 (red)
and TH9677 (blue). Each dot represents a single batch culture (between 30 and 200 per
condition). The estimated growth rate for each condition is indicated in black (dot: mean,
line: 95% CrI). C) Regression of growth rate as a function of number of foci. Each dot repre-
sents a single inducer concentration (EM8242 in red) or wild type FlhDC activity (TH9677
in blue). The crossed lines represent the 95% credible interval from Figure 3.1B and 3.2B.
The black line represents the mean of the regression model (dark and light blue areas: 50%
and 95% CrI). Inset: Posterior probability distribution of the relative cost (slope/intercept)
of one flagellum of the growth rate (dot: mean, line: 95% CrI).
deviations of the population mean were excluded from the following analyses.
     We tested if the number of flagellar hooks had a measurable effect on several behavioral
parameters calculated from the cell trajectories (Figure 3.3B-E). Linear regression analy-
ses indicate that swimming performance at low viscosity is not sensitive to change in the
number of flagella (swimming speed/foci = 0.004 mm/s [-0.016,0.024] 95% CrI; diffusion co-
efficient/foci = 0.001 mm2/s [-0.066,0.068] 95% CrI; tumble bias/foci = -0.025 [-0.082,0.030]
95% CrI; reversion frequency/foci = -0.032 s-1 [-0.103,0.045] 95% CrI; directional persis-
tence/foci = -0.009 [-0.064,0.043] 95% CrI). In addition, regression analyses against cell
length and width did not reveal any significant effect, as expected, since cells of unusual size
were excluded.
                                              78


Figure 3.3: Swimming performance of TH9677 with varying number of flagella.
A) Illustrative examples of single cell tracking followed by detection of fluorescently labeled
flagellar hooks after immobilizing cells with a UV-activated hydrogel. B-F) Linear regres-
sions of behavioral parameters calculated from cell trajectories against number of foci from
labeled flagellar hooks. Varying number of foci does not explain the variability in swimming
performance. Each dot represents a single cell. The black line represents the mean of the
regression model (dark and light blue areas: 50% and 95% CrI).
Making multiple flagella is beneficial even when swimming performance is not
affected.
    The high cost of flagellar synthesis and the lack of effect from varying flagella number on
swimming performance raise the question of why S. enterica makes more than one flagellum.
One model may be that flagellar motility in S. enterica has not evolved to be optimal at low
viscosity. Alternatively, because the location of new flagella on the cell body is not controlled
(although flagella are less likely to be near the poles) (Figure 3.6) [151], the inheritance of
flagella as cells divide may create a selective trade-off. The random allocation of flagella to
                                               79


daughter cells can generate non-flagellated cells that may be at a severe disadvantage since
it can take several minutes to synthesis a new flagellum [51].
    To examine the trade-off between the cost of making flagella and reducing the produc-
tion of non-flagellated cells, we created an agent-based model to simulate population growth
and evolutionary dynamics. The growth rate of each agent in a population is determined
by its position along a linear chemical gradient minus the proportion of resources allocated
for flagellar synthesis. The agent position along the gradient is determined by the equi-
librium between a chemotactic force (inversely proportional to the chemical concentration)
and gravity (Figure 3.4A). Non-flagellated agents eventually sediment to the bottom of the
gradient where they have access to fewer resources. Agents make flagella by allocating a
fraction of the acquired resources to flagellar synthesis until a predetermined cost is over-
come. New flagella are placed randomly along the agent body length. Three models for cell
growth, which affect the inheritance of flagella at cell division, were examined (Figure 3.4B).
S. enterica elongates by inserting new peptidoglycan diffusively along the sidewall (random
model), but other flagellated bacterial lineages insert new peptidoglycan near a cell pole (e.g.
Agrobacterium) or the mid cell (e.g. Caulobacter ) [24, 27].
    To determine the optimal number of flagella for different conditions, we evolved popula-
tions of agents by allowing the rate of flagellar synthesis to mutate as agent replicate until
the rate reached steady state. We varied the cost of making a flagellum and the reward
for being flagellated by varying the chemotactic force pulling agents up the gradient (the
resulting growth reward was calculated from the population average). As a result, we found
that the optimal number of flagella is a sublinear function of the growth reward that is
scaled down when flagellar cost is increased (Figure 3.8). Remarkably, the optimal number
of flagella collapse to a single function when the variable is changed to the reward over cost
ratio (Figure 3.4C). However, the mode of growth (thus, flagellum allocation) still affects
the scale of the relationship.
                                               80


Figure 3.4: In silico evolution of the optimal number of flagella in different condi-
tions. A) Schematic representation of the agent-based model. Flagellated agents swim up
with a speed inversely proportional to the chemical gradient (blue) and are pulled down by
gravity. Position along the gradient and determines the resources available for growth and
flagellar synthesis. The rate of flagellar synthesis is allowed to evolve by natural selection as
the population grows until steady state is reached. B) New flagella (red dots) are inserted
randomly on the agent body. Cell elongation follows one of three alternative models for cell
wall synthesis (blue) to determine the allocation of flagella to daughter cells. C) Results from
repeated in silico evolutionary experiments with different combinations of flagellar cost and
growth reward from chemotaxis. Each dot represents the mean number of flagella per cell in
an independent population after its evolutionary trajectory reached steady state. The solid
lines represent the best fit from our mathematical model (with P(¬flg) as a free parameter).
Inset: Mean and 95% CrI for the parameter P(¬flg) fitted to simulation results.
Making multiple flagella is beneficial even when swimming performance is not
affected.
    To gain further insights into the functional trade-off determining the optimal number of
flagella, we derived a mathematical model describing the average population growth rate (r)
when flagella are randomly allocated between daughter cells:
                             r = pn (r0 − cn) + (1 − pn )(r0 − cn + δ)                      (3.1)
    where, r0 is the basal population growth rate with no flagellar synthesis or reward from
chemotaxis, c is the growth cost per flagellum produced, n is the average number of flagella
produced, δis the growth reward for having at least one flagellum, and p is the probability of
not inheriting each flagellum after cell division. By taking the derivative of equation 3.1 as a
                                                81


function of n, we can calculate the number of flagella that should be produced to maximize
                                        dr
the average population growth rate (        = 0):
                                        dn
                                                    −1 c
                                               log log pδ
                                           n=                                               (3.2)
                                                 log p
    Equation 3.2 revealed that the optimal number of flagella is determined by δ/c (the
reward to cost ratio) and p (the probability of not inheriting a flagellum) (Figure 3.5A). The
functional form of equation 3.2 fitted the results well even though the population dynamics in
the agent-based evolutionary experiments are expected to be more complex. The nonlinear
growth dynamics generated by the different growth modes and chemotactic behavior average
into an effective probability of flagellar inheritance (p). Therefore, we estimated p from the
results of the agent-based simulations to characterize the influence of different cell growth
modes on the inheritance of flagella in peritrichous bacteria (Figure 3.4C inset). As expected,
growing from the cell pole is more likely to generate non-flagellated cells (Pend (¬flg) = 0.47),
while a more balanced allocation of flagella resulted from diffusive elongation (Prandom (¬flg)
= 0.34) or mid cell elongation (Pmiddle (¬flg) = 0.36). Our model also predicted that an
extremely large reward to cost ratio would be necessary to justify making more than half a
dozen flagella if the only goal was to minimize the probability in producing non-flagellated
cells during growth (Figure 3.5A).
    Equation 3.2 can be rearranged to calculate the expected growth reward (δ) from pro-
ducing a given number of flagella (n) at a given cost (c):
                                         δ = −cp−n / log p                                  (3.3)
    We calculated the expected reward using the parameters we estimated for S. enterica and
assuming a simple evolutionary model. To maintain the production of 4 flagella on average,
the reward from chemotaxis must increase the average population growth rate by at least
80% (Figure 3.5B) but this relationship is sensitive to the mode of flagellar inheritance. As
                                                82


expected, fewer flagella are necessary when both daughter cells are more likely to inherit at
least one. Finally, the optimal number of flagella, n, is more sensitive to the probability, p,
when the reward to cost ratio, d/c, is high (Figure 3.5C). When the reward to cost ratio
is very small, the ability to maximize the inheritance of flagella (p = 0) becomes critical
to make flagellar motility worthwhile. Overall, peritrichously flagellated bacterial lineages
should disfavor growing cell wall from the poles, and oligotrophic environments may strongly
select for mechanisms to ensure flagellar inheritance.
Figure 3.5: Mathematical predictions of the optimal number of flagella.A) Optimal
number of flagella (n) as a function of the reward to cost ratio for chemotaxis (δ/c) for
different probability of not inheriting a flagellum after cell division (p) from equation 3.2. B)
Expected growth reward (d) as a function of number of flagella (n) for different probability
of not inheriting a flagellum (p) and a fixed cost of 1.17% of the growth rate per flagellum
(c) from equation 3.3. C) Expected number of flagella (n) as a function of the probability
of not inheriting a flagellum (p) for different reward to cost ratios (δ/c) from equation 3.2.
The parameter values estimated in this study for S. enterica are indicated with dotted lines.
Discussion
    We find a growth rate cost of approximately 1% per flagellum, which is approximately in
line with previous evolutionary loss-of-function experiments, and biochemical estimates [138].
A growth rate decline of 1% per flagellum is substantial in evolutionary terms [138], with
our reference S. enterica strain growing approximately 4.5% slower than our inducible strain
when expressing no flagella (Figure 3.2B).
    The substantial cost of producing flagella indicates that cells derive some benefit from
                                                 83


making multiple flagella. But we know that cells with a single polar flagellum can swim
well [122], and our FAST assay data indicates that peritrichously flagellated cells with a
single flagellum can also swim effectively. Additionally, given that our inducible strain could
stably express fewer flagella than the reference strain, it seems clear that peritrichously
flagellated cells could evolve a lower flagella number. Taken together, these facts imply that
the expression of multiple peritrichous flagella is maintained by active selection.
    However, we did not find any strong effect of flagella number on a variety of swimming
parameters, though we could not rule out weak effects. The FAST assay is currently limited
to examining swimming behavior in a low viscosity regime, and it is possible that flagella
number plays a more important role at higher viscosities. Indeed, this is supported by
evolution experiments repeatedly passaging cells in swimplates [123]. Nevertheless, it seems
that in our experimental setup, flagella number does not play a strong role in swimming
behavior, excepting the obvious difference between zero and one or more flagella.
    The fact that we did not find a strong relationship between flagella number and swim-
ming behavior supports other explanations of peritrichous flagellation, such as offspring
inheritance, attachment, immunomodulation, etc. Though, we note that these alternative
explanations would still have been worth investigating even if we had found a strong link to
swimming performance, as natural selection acts on the fitness of the cell as a whole. We
decided to focus on offspring inheritance, as it is one of the simplest viable explanations, and
is amenable to computer simulation.
    Assuming cells do not receive a direct benefit from possessing more than one flagellum,
our simulations show that the optimal number of flagella grows logarithmically as a function
of the reward/cost ratio. We also found the logarithmic fit robust to variations in how
peptidoglycan was inserted into the cell wall. The logarithmic nature of our model implies
that the optimal number of flagella first grows quickly at low reward/cost ratios, then slows
down as the reward/cost ratio becomes large (Figure 3.4C). This leads us to the conclusion
that cells with many flagella (e.g. B. subtilis) violate our model assumptions. Otherwise,
                                                84


our model implies that B. subtilis would be experiencing a flagellar reward/cost ratio on
the order of 220 (one million), which seems implausibly high. A more likely explanation is
that highly flagellated cells derive direct benefits from possessing more than one flagellum,
perhaps for swarming or attachment purposes [121].
    Our work demonstrates the benefits of the flagellar localization processes found in some
bacteria. Flagellar localization ensures that both offspring cells receive one flagellum, and
so requires only one new flagellum be produced per division. Our logarithmic model of opti-
mal flagella number has no upper bound, indicating that even cells with four to five flagella
per cell will produce some unflagellated offspring. Essentially, peritrichously flagellated cells
producing four to five flagella pay 2-3% more than cells with localized flagella, while still not
being certain that both offspring will receive at least one flagellum. This trade-off seems of
dubious value, unless cells either directly benefit from peritrichous flagellation for another
reason or have not evolved flagellar localization machinery due to evolutionary constraints.
Perhaps regulatory machinery is also metabolically expensive, flagellation has recently been
(re)introduced into a lineage and insufficient time has passed to acquire localization ma-
chinery, or evolutionary obstacles (mutation supply, antagonistic pleiotropy, etc.) hinder the
evolution of regulatory machinery.
    Future work involves better characterizing the role of flagella number in swimming be-
havior, and evolutionary modeling of flagella number in different environmental contexts.
Additionally, studies into other non-swimming uses of peritrichous flagella will provide valu-
able insight about potentially diverse mechanisms of selection. In the end, we suspect that
peritrichous flagellation is maintained via several concurrent mechanisms, each of which
contributes to the overall selective pressure to maintain expression of multiple flagella.
Materials and Methods
    We performed all biological experiments using Salmonella enterica serovar typhimurium
LT2. We used strain TH9677 (Dhin-5717::FRT flgE6506(S171C)) as our “wild-type” refer-
ence strain, as it allows for maleimide-fluorophore staining of the flagellar hook and dis-
                                               85


ables flagellar phase variation. We then created EM8242 (Dhin-5717::FRT flgE6506 (S171C)
PflhDC5451::Tn10dTc[del-25] DTetA4(Daa6-396)), which allowed us to control flagellar ex-
pression using anhydrotetracycline (AnTc) induction of a pTet-flhDC promoter.
    We incubated all cultures in Vogel-Bonner Minimal Medium E salts buffer containing
2% w/v yeast extract (VB2YE). We grew cells overnight in VB2YE broth at 32◦ C with
shaking, then performed a 1:1000 dilution of overnight culture into fresh inducer-containing
media, then incubated under the same conditions for another five hours to condition the
cultures. We measured the optical density (OD) of the conditioned cultures at a wavelength
of 590nm using a spectrophotometer, then diluted to a target OD of 0.002 in fresh inducer-
containing media. We then transferred 200uL of inoculated culture to wells on a 96-well plate
(CELLSTAR). Once filled, we secured the lid with parafilm then poked four ventilation holes
in the corners of the lid using a hot 18-gauge needle. We then transferred the plate to a Tecan
(Austria) Sunrise 96-well plate reader, which incubated the plate at 32◦ C with shaking. The
plate reader collected OD readings every 5-15 minutes for 8-24 hours during the experiment.
    OD readings are transformed to account for non-linearity in the spectrophotometer read-
ings before trimming. To limit our fit to regions of exponential growth, we trimmed se-
                                                 dOD       d2 OD
quences by keeping only points in which both           and        were positive. We then used
                                                   dt       dt2
the trimmed data to fit an exponential growth model via maximum likelihood to estimate
the growth rate.
    When counting foci, we incubate the cells using the same protocol as the growth curves,
with an additional five-hour incubation at 32◦ C with shaking and induction. After the
final incubation we centrifuge 500µL of culture at 2500g for five minutes, then resuspend
in Vogel-Bonner salts buffer (VB2YE without yeast extract) as a washing step. We repeat
the wash, then add AlexaFluor594-maleimide conjugate (ThermoFisher) at a concentration
of 50-100µM and incubate with shaking for 10 minutes at room temperature. To remove
excess stain, we wash the cells three times as before. To immobilize the cells in a single plane
of focus we place 2µL of stained cells onto 1% agarose pads before imaging. We collect 61
                                                86


images per slide, manually discarding images that have obvious defects such as dust, bubbles,
excessively high or low cell density, etc. We use a customized version of SuperSegger [164] to
detect and segment cells, along with a custom written foci-detection function to count foci
within each cell.
    To calculate the cost of flagellar expression, we wrote a model in the STAN probabilistic
programming language to fit a mixed-effect model to both the foci count and growth rate
data with a linear model to measure the cost of flagellar expression. We used a negative
binomial model for the foci count data, and a lognormal model for the growth rate data. We
then fit a linear model to determine the slope and intercept of growth rate as a function of
foci number.
    The FAST assay was performed as previously described [45].
    We used an agent-based model (ABM) written in NetLogo [184] to model flagellar in-
heritance. The model consists of a world inhabited by simulated bacterial cells. Gravity
pulls cells towards the bottom of the simulation at a constant rate. Cells with at least
one flagellum swim upwards as if they were chemotaxing along a linear attractant gradient,
resulting in a drift velocity inversely proportional to nutrient concentration [44]. The com-
bination of gravity and chemotaxis result in cells forming a band within the world, and the
user can change the location of the band by tuning the proportionality constants of gravity
and chemotaxis. All cells receive a basal amount of energy for existing, and an additional
energy reward as a linear function of their height in the world. Together, this means that
cells which produce and maintain at least one flagellum are rewarded with additional energy
compared to unflagellated cells. There is no additional energy reward for possessing more
than one flagellum.
    Energy collected by the cells is partitioned between growth and flagellar synthesis via
a cell-owned parameter, π. Each cell possesses its own value for π, which it inherited from
its parent with a small change in value (i.e. a mutation). The value of π influences the
survival and reproduction of the cell, so natural selection will result in the average population
                                                87


π shifting towards the optimal value for a given combination of simulation parameters.
    When the amount of energy allocated to flagellar synthesis passes a user-selectable cost
threshold, the cell synthesizes a new flagellum which is placed at a random position along
the medial axis of the cell. Meanwhile, energy allocated to growth causes the cell to increase
in length according to a user-selected growth model (see below), and when the length passes
a user-set threshold the cell divides at the midpoint into two offspring cells. To model
the passage of flagella to offspring cells, the user can select from three different growth
models that control how peptidoglycan is inserted into the cell wall: mid-growth, modeling
a situation in which peptidoglycan is inserted only at the cell midpoint; end-growth, which
simulates peptidoglycan insertion only at one pole, and random growth, in which the inserted
peptidoglycan is inserted randomly along the cell body on each simulation update. The
coordinates of each flagellum are adjusted as the cell grows according to the growth method.
During division daughter cells are formed from each half of the parent cell, preserving the
relative positions the inherited flagella.
                                              88


Supporting Information
Figure 3.6: Effect of AnTc on the growth rate of TH9677. Each dot represents a
single batch culture, and each color represents an independent experiment. The estimated
growth rate for each condition is indicated in black (dot: mean, line: 95% CrI). The addition
of AnTc had a very small positive effect on growth (5.6e-4 hr-1 (ng/ml)-1 [2.3e-4,8.9e-4] 95%
CrI).
                                               89


Figure 3.7: Relative distribution of foci along the cell midline. Most foci are located
between the cell poles, with a nearly even distribution from 0.2 to 0.8, and a slight dip at
the cell midpoint. The pole and midpoint may be crowded with other cellular machinery
(eg. FtsZ complex).
                                            90


Figure 3.8: Optimal number of flagella for different combinations of flagellum
costs and rewards. Results from repeated in silico evolutionary experiments with differ-
ent combinations of flagellar cost (color scale) and growth reward from chemotaxis. Each
dot represents the mean number of flagella per cell in an independent population after its
evolutionary trajectory reached steady state. Simulations were repeated using three growth
models for new cell wall synthesis and flagella allocation between daughter cells (end: growth
near cell pole, mid: growth from middle of cell length, random: diffusive growth along cell
length).
                                               91


CHAPTER 4 - CONCLUSIONS AND FUTURE DIRECTIONS
Author: Joshua L. Franklin
                           92


Introduction
    Each of the previous chapters describe their contributions to biological knowledge. Here, I
will emphasize how the computational and statistical models I used in the previous chapters
were essential for the success of each project. In all cases, the chapter would have been
impossible, or would have been radically different, had we not utilized computational and
statistical modeling. Additionally, I will take this chapter as an opportunity to discuss a
couple of future directions more freely than in most scientific articles, which I hope the
reader finds engaging and interesting without becoming bogged down in details.
Chapter 1
Benefits of using Avida as an evolutionary model
    In the first chapter we built upon the Survival of the Flattest (SOF) paper, which demon-
strated slowly reproducing but mutationally robust organisms out-competing quickly repro-
ducing but mutationally fragile organisms [189]. Such processes are relevant to RNA viruses,
which tend to have high mutation rates. Indeed, the effect itself has been confirmed in RNA
viruses, providing support for Avida as a useful evolutionary model of important biological
systems [33, 101, 150].
    In order to replicate the original experiment, we needed to generate mutationally robust
and mutationally fragile organisms, which required precise control over mutation rates [189].
Additionally, we needed to perform competition assays across a range of mutation rates to
ensure the SOF effect was occurring. While researchers can manipulate biological mutation
rates, tuning a culture to a specific mutation rate would be difficult [165]. Additionally,
it takes considerable time and effort to adapt biological organisms to new mutation rates,
and there is no guarantee that organisms will not evolve reduced mutation rates [165]. In
contrast, Avida allowed us to fix the mutation rate to a known value while also disabling
insertion and deletion mutations, which we did not of interest in this study. Easy control
over the mutation rate in Avida allowed us to examine 201 independent pairs of organisms,
                                               93


a degree of replication infeasible in many complex biological experiments.
    However, the most important advantage of Avida was the ability to directly quantify
the distribution of fitness effects (DFE) of our organisms. Basically, the DFE required us
to categorize each mutation as lethal, deleterious, neutral, and beneficial relative to the
reference genotype. Avida can easily generate and characterize the fitness of all single- and
double- mutants for a genome. However, even Avida cannot test all triple- or higher mutants
due to the vast number of possible combinations of mutations. Fortunately, Avida can still
precisely estimate the DFE at higher mutational distances by randomly sampling tens of
millions of combinations for a given mutational distance. In a biological system, where the
smallest viral genomes are about 1 kilobase [26], generating and characterizing every single
nucleotide change would be possible, if laborious. With effort, a researcher could generate
a library of all 4.5 million possible double mutants, but the competitions assays required to
categorize each mutant would be infeasible.
    Instead, a researcher might try to characterize fitness by measuring the change in the
relative frequency of genotypes before and after a round of growth, similar to a TnSeq ex-
periment [174]. However, this kind of assay has more noise than Avida, making it difficult to
differentiate between e.g. neutral and slightly deleterious mutations, which may be relevant if
mutational robustness and drift robustness are related phenomena [55,98]. Avida can classify
the effect of each mutation as lethal, deleterious, neutral, or beneficial by directly measuring
the fitness of a mutant. While it is possible for organisms in Avida to have probabilistic
phenotypes, it is not common with the settings we used in our experiment. In biological
systems fitness measurements must contend with biological variability and other limitations
of competition assays (e.g. “neutral” markers). A growth rate approach to measuring fitness,
as discussed in chapter 3, could potentially be easier and more accurate. Even so, performing
millions of growth curves with substantial replication is far more difficult than using a few
hundred hours of CPU time.
                                               94


Future directions
     We encountered some unusual discrepancies as we tried to replicate the original SOF
paper. We had to use a spatially structured environment to generate the SOF effect, rather
than the well-mixed environment used in the original SOF study. When investigating, we
found that the version of Avida used in the original study possessed a slightly different
instruction set [127, 189]. The original instruction set was less flexible in that some of
the instructions had order dependencies. For example, “get” and “put” which load a value
and send a value, respectively, perform different tasks and so require a particular ordering.
Whereas the modern instruction set replaces “get” and “put” with a unified “IO” instruction
that both sends the current value then loads a new value automatically [127]. An analogous
change has occurred with some of the reproduction instructions as well [127]. The changes
from the old to new instruction sets were likely made to increase evolvability by making the
instruction set more “robust.”
     Instruction sets of varying “robustness” provide a unique opportunity to explore biochem-
ical transitions that may have occurred in early lifeforms. Many scientists hypothesize that
life originally used RNA to both store information and catalyze reactions, before transition-
ing to DNA-based information storage and protein catalysts [10]. It seems possible that
such a transition would reduce the mutation rate, due to the higher chemical stability of
DNA [10]. However, the transition could also cause radical changes in mutational robust-
ness. If RNA with substantial secondary structure or catalytic activity transitioned into the
double-stranded DNA we see today, the DFE would probably change substantially.
     We might be able to emulate some aspects of the RNA to DNA transition by evolving
organisms in Avida with a “fragile” instruction set, then transitioning the organisms to a semi-
compatible “robust” instruction set. We would expect to see radical changes in the DFE and
rapid changes in fitness during adaptation. If it is difficult to perform this transition without
killing the organisms, it may be possible to more gently transition from one instruction set to
another using an intermediate instruction set containing both robust and fragile instructions.
                                                95


    A study of the RNA to DNA transition using Avida would have the limitations frequently
associated with computational models. Instruction sets in Avida are imperfect analogies to
RNA and DNA. However, researchers have used Avida for several interesting studies that
have been validated experimentally, such as the original SOF paper [189]. Additionally, Avida
has already been used to investigate aspects of another difficult to study transition: the origin
of life [23]. Despite the limitations, I believe that experiments with Avida instruction sets
can provide insights into the otherwise opaque RNA to DNA transition.
Chapter 2
Benefits of using generalized additive models
    In the second chapter we investigated fruiting body formation in Myxococcus xanthus.
During starvation, M. xanthus cells cooperate to form a fruiting body, which requires the
differentiation of rod cells into spores [93]. The signaling processes controlling sporulation
are driven by contact-dependent C-signaling [93], so we wanted to characterize the spatial
distribution of gene expression, cell alignment, and cell classification. Ideally, we would
describe the spatial structure of these parameters throughout the fruiting body using a
smooth, continuous function. Additionally, due to the substantial variability between fruiting
bodies, our model would be capable of incorporating information from multiple replicates
to accurately reflect this variability. Generalized additive models (GAM) satisfied all these
requirements, and it is easy to write Bayesian GAM models using the brms R software
package [22, 192].
    GAMs are a principled way of fitting a flexible curve to a dataset [192]. A GAM consists
of a series of basis functions that are combined to generate a single curve that is flexible
enough to fit a dataset without becoming “too wiggly” and overfitting noise within the
data [192]. These features make GAMs useful for interpolation, where we do not have, or
are not concerned with, a specific equation describing the data across some variable such
as space or time. For example, we wanted to know the fluorescent intensity across space
within a fruiting body, and we suspected that factors such as cell classification would affect
                                               96


intensity. However, we did not have a specific mathematical equation describing the change
in fluorescent intensity as a function of distance from the center of the fruiting body, such as
exponential decay. Instead, by fitting a GAM we can generate a smooth curve that describes
intensity across space while accounting for relevant variables, but without forcing the curve
to have a specific shape.
    The flexibility of brms allowed us to use very similar GAMs for continuous variables,
such as alignment and fluorescent intensity, and discrete variables, such as cell classification.
This is because brms allows the user to specify the response function of a linear model [22].
Essentially, this means that instead a Gaussian distribution around the mean, as in a linear
model, we can choose to use an arbitrary probability distribution, such as gamma, Poisson,
categorical, etc [22]. So, we can keep similar model structures in terms of explanatory vari-
ables, and simply change the response function from Gaussian to categorical when switching
from a continuous variable, e.g. alignment, to one described by a categorical distribution,
e.g. cell classification. While brms is typically used to fit linear and generalized linear mod-
els, it gives users the capability to use GAMs as a “drop-in” replacement, enabling us to
easily extend linear models across space, time, or any other variable of interest.
Chapter 3
Benefits of using Mixed-effect Models
    In chapter three we were interested in precisely quantifying the cost of flagella in order
to characterize the cost-benefit trade-offs associated with peritrichous flagellation. We chose
to measure the cost using 96-well plate optical density growth curves, which can be noisy
and difficult to fit. Biochemical and evolutionary loss-of-function estimates put the cost per
flagellum on the order of 1% [138], so we required substantial precision to even detect the
growth defect.
    To increase the precision of our estimates, we performed many replicate assays. However,
we did not simply average our growth rates, because this is not a good model of how the data
was generated [112]. This is because many biological assays produce structured data, e.g.
                                                97


wells are nested within plates which themselves are nested in days. A simplistic view is that
wells within plate are isolated from one another and so they are completely independent.
However, this is not an accurate view of the situation; wells within a plate often share some
sources of noise. For example, we diluted the chemical inducer on a per-plate basis, so
all wells on a plate shared the same dilution error. Each plate was sourced from different
colonies on an agar plate, so biological variation is shared by all the wells within a plate. We
performed incubations on a per-plate basis, so errors from timing or temperature are also
shared by all wells on the plate. The result is that wells within a plate need to be treated
as related entities.
    Additionally, higher-level sources of error such as pipette calibration, ambient temper-
ature, relative humidity, etc. vary from day to day, so multiple plates performed on the
same day are not fully independent either. We could take this farther and say that pipetting
precision, preparedness, and laboratory skills vary widely from person-to-person, so plates
performed by different people need to be grouped together. Similar arguments could be made
that laboratory-to-laboratory variation in climate, equipment manufacturers, water sources,
etc. mean that we should also take laboratory into account. While we could continue this
chain of noise, beyond a certain point it is impractical to e.g. replicate across people and
across labs.
    Mixed-effect models allowed us to accurately model hierarchical data produced by our
growth curves. We grouped our growth rate measurements by plate, which also accounts
for the date because we performed one plate per day. The model then estimated a plate
effect/deviation for every plate we performed as being drawn from a normal (or t-) distribu-
tion with some variance. At a high level, all plates contribute to the posterior distribution
(read: estimate) of growth rate, but plates that deviate strongly from the mean are “shrunk”
towards the mean in a mathematically principled way [112]. This reduces our tendency to
overfit compared to doing no averaging and reduces our tendency to underfit compared to
computing a global average [112]. There are also side benefits, such as the “free” posterior dis-
                                               98


tribution of the plate effects in our study. While we did not utilize that distribution, other
researchers may be interested in the distribution of effects coming from dates, personnel,
countries, reagents, etc.
Future directions
     Our growth curve results indicated substantial cost of producing flagella but did not
find a strong relationship between flagella number and swimming behavior. Instead, we
postulated that flagellar inheritance might explain the production of multiple peritrichous
flagella. Our agent-based model of flagella assumed that a single flagellum was sufficient
for maximum swimming performance, and that cells paid a constant cost per flagellum
produced. We found that populations of agents evolved different flagellar expression levels
depending on the reward for being flagellated, the cost of producing flagella, and the method
of peptidoglycan insertion, which affects the distribution of flagella along the cell.
     The optimal number of flagella in our model is a logarithmic function of the benefit/cost
ratio, implying that an optimal number of flagella greater than six or seven requires high
benefit/cost ratios. That is why we suggested that organisms like Bacillus subtilis (over 20
flagella) violate our model assumptions, as it seems implausible that the reward/cost ratio
is of the order ten million. However, our model of flagellar inheritance is consistent with
common enteric bacteria such as Escherichia coli and Salmonella enterica, which have about
four flagella per cell. This begs the question of whether these enteric bacteria experience
selective pressures consistent with our model and have consequently optimized their flagellar
expression for inheritance purposes. If we could design an environment consistent with our
model, we could perform an evolution experiment to test our predictions.
     Key to such an experiment is designing an environment with the intended selective pres-
sures. The environment needs to rewards swimming and chemotaxing cells, without bene-
fiting nonmotile behaviors such as buoyancy and biofilm production. This clearly rules out
shaken media, which causes cells to rapidly lose flagellation, as they pay a large cost and reap
no benefit [138]. Instead, shaken media could serve as a control environment to determine
                                              99


the rate of flagella loss when there is zero benefit of flagella [138]. Serial passage in agar me-
dia would select for motility, but that is a high viscosity and porous medium that is outside
the scope of our flagellar inheritance model and has been explored in other studies [123].
These studies indicate upregulated flagellar expression in E. coli, which implies that agar
violates our model assumptions, perhaps because of the higher viscosity [123]. Unshaken
liquid media could encourage swimming along a vertical oxygen gradient, but can also fa-
vor lineage differentiation due to the spatially structured nature of the environment [138].
Instead, we need to design a more specialized environment that continuously produces the
required selective pressure for motility.
    To produce such an environment, I have helped design a bacterial “racetrack” microfluidic
device. The racetracks consist of a single chamber about 150µm wide, 150µm deep, and
about two meters long (Figure 4.1). The racetrack spirals outwards from a central loading
chamber until the diameter of the device reaches about 100mm, at which point it merges
with an unloading chamber. The result is a racetrack that provides sustained selection
for motility, as cells need to continuously swim to access fresh nutrients. The device can
optionally utilize chemotaxis, depending on the presence of metabolizable chemoattractants
in the media. At a drift velocity of 5µm/s, it would take four to five days for cells to complete
the “race”. Once the cells reached the unloading chamber, we could use a syringe to transfer
cells to a new racetrack for a new round of selection.
    If our racetracks correlate with our agent-based model, we should expect the population
to evolve towards the optimal flagella number due to flagellar inheritance. If flagella number
reaches a new steady expression level, we can then derive the benefit/cost ratio of producing
flagella for that organism. Perhaps most interestingly, if we see no meaningful change in
flagella number of wildtype cells over prolonged periods of selection, that could indicate the
organism has a life history that is well-aligned with our model.
                                               100


Figure 4.1: A bacterial “racetrack”. The width and depth of the channel is about 150µm.
The total length of the track is approximately two meters long.
Closing recommendations
Embrace agent-based modeling
   As my work has emphasized, agent-based modeling (ABM) should be in the toolset of
most biologists. ABMs consist of entities (agents) following a set of rules. An ABM of
bacterial chemotaxis might allow entities to swim through their environment or tumble, and
transition between those states based on the concentration of nutrients. Or in an ABM
of M. xanthus fruiting body development, entities might possess an accumulator variable
that integrates their exposure to a particular signal and change their behavior once the
accumulator crosses a threshold. In both cases, the rules are fairly simple and can be inferred
from biological knowledge. We can then set our computers to run the model many times
                                             101


and record the output. We can then observe how the model output changes when tweaking
parameters of interest (e.g. swimming speed).
    Certainly, agent-based models can be much more complex than these examples. However,
they can be useful to flip the script on working “smarter not harder”, by having the computer
work harder so that you look smarter. The explosion of computational resources over the past
few decades have made running hundreds of permutations of complex agent-based models
easy. Additionally, the ABM approach approach is arguably easier for the average biologist
than attempting to derive the (partial) differential equations governing these phenomena.
Indeed, without our ABM of flagellar inheritance in chapter three, we may not have found
our equation-based model of the optimal number of flagella.
Prefer a Bayesian Approach
    Perhaps the most appealing argument for Bayesian statistics is that many scientists
already try to interpret frequentist statistics in a Bayesian manner. A classic example of
this behavior relates to confidence intervals. A confidence interval is a frequentist interval
calculated such that, if an experiment were repeated many times, the calculated interval
would overlap the true value of a parameter some percentage of the time (usually 95%) [66].
However, scientists often try to interpret a confidence interval as containing the true value of
a parameter with some probability (again, usually 95%). This is nonsense from a frequentist
point of view, in which parameters are treated as fixed; the confidence interval either contains
the true value or does not [66]. However, Bayesian credible intervals, conditional on the
likelihood, data, and prior distribution, do contain the true value of a parameter with a
specified degree of probability [57]. By shifting to a Bayesian approach, scientists can more
confidently follow their intuition when interpreting statistics.
    More fundamentally, we are often actually interested in the value of parameters, not
rejecting (often arbitrary) null hypotheses. For example, imagine a farmer asking a scientist
to test the effect of fertilizer treatment on potato yields. The farmer probably would not
ask if the scientist had rejected the hypothesis that the fertilizer had no effect. Instead,
                                              102


the farmer would probably ask how much the fertilizer increased yields. This indicates that
the farmer is interested in the value of the yield parameter, not rejecting a null hypothesis.
Tellingly, it also indicates that the farmer has a prior distribution on the value of the yield
parameter due to their prior experience with fertilizer. Bayesian statistics puts the focus on
the posterior distribution of the parameters, making interpretation easier for both scientists
and the general public.
Make computational models accessible for educational purposes
    Computational models allow biologists to address questions that are otherwise difficult
to answer. Additionally, they make excellent pedagogical tools because they allow students
to play with a subject without the expense, labor, and safety issues of laboratories [16].
Part of my instructional experience during graduate school involved helping to write web-
based computational models of diffusion, bacterial motility, and chemotaxis for 8th grade
students as part of MSU’s Gifted and Talented Education summer program. The students
enjoyed the interactivity of the simulations, with some students going to far as to edit the
JavaScript code backing the simulation. Additionally, the easy data-logging capabilities of
computational models allowed us to perform simple quantitative analyses in class. Overall,
it was an excellent opportunity to integrate microbiology, physics, math, and programming
into a compelling educational experience.
    While the models we used in our classroom were built specifically for education, I encour-
age researchers to facilitate the use of their research models as educational tools by making
them as accessible as possible. To do so, articles using the model should be open access
and model code should be open source. However, to make a model highly accessible for
educational use requires more effort. Specifically, I suggest the following:
    • Documentation is key
         – Fully document every function, parameter, setting, flag, and output
         – Ensure that all provided documentation is accurate and always up to date
                                               103


    – Target your documentation towards an audience of modest technical and scientific
       knowledge
• Make installation painless
    – Preferably, use a web-based platform or provide pre-built binaries
    – Alternatively, provide build scripts
    – At a minimum, provide comprehensive installation instructions
    – Minimize external dependencies and avoid complex build tools
• Provide interesting examples
    – Write tutorials guiding users through important aspects of the model
    – Design example experiments for users to build upon
    – Consider providing videos or blog posts to interest users
• Be visual and interactive
    – Graphics are particularly engaging
    – Make it easy to save videos and images of the model
    – If appropriate, include touch- and mouse-based controls
• Make real experiments easy
    – Dump data into easily viewed table formats such as CSV, even if only a subset of
       your output can be saved in this way
    – Avoid using binary, JSON, XML, etc. files as the only output format
• Be genuine
    – Avoid the temptation to write a simplified model for educational use
                                        104


         – Instead, handle complexity through good documentation and software design
           practices
    These recommendations pose a substantial cost, but I suggest that the benefits are larger.
Educational tools that organically engage students are intrinsically beneficial and can be
used to fulfill grant outreach and engagement requirements. Less obviously, following these
recommendations may result in models that are more scientifically useful because they are
also more accessible to other scientists. Improved accessibility to scientific audiences may
also improve reproducibility and encourage model extension/collaboration. Taken together,
there is a clear case for making accessibility a higher priority for computational models in
the sciences.
                                              105


BIBLIOGRAPHY
      106


                                     BIBLIOGRAPHY
 [1] Christoph Adami, Charles Ofria, and Travis C Collier. Evolution of biological com-
     plexity. Proceedings of the National Academy of Sciences, 97(9):4463–4468, 2000.
 [2] Aneil F Agrawal and Michael C Whitlock. Mutation load: The fitness of individuals
     in populations where deleterious alleles are abundant. Annual Review of Ecology,
     Evolution, and Systematics, 43:115–135, 2012.
 [3] Phillip Aldridge and Kelly T Hughes. Regulation of flagellar assembly. Current Opinion
     in Microbiology, 5(2):160–165, 2002.
 [4] M Archetti. Survival of the steepest: Hypersensitivity to mutations as an adaptation
     to soft selection. Journal of Evolutionary Biology, 22(4):740–750, 2009.
 [5] T Atsumi, Y Maekawa, T Yamada, I Kawagishi, Y Imae, and M Homma. Effect of
     viscosity on swimming by the lateral and polar flagella of vibrio alginolyticus. Journal
     of Bacteriology, 178(16):5024–5026, 1996.
 [6] Rajesh Balagam and Oleg A. Igoshin. Mechanism for collective cell alignment in
     Myxococcus xanthus bacteria. PLoS Computational Biology, 11, 8 2015.
 [7] Robert E Beardmore, Ivana Gudelj, David A Lipson, and Laurence D Hurst. Metabolic
     trade-offs and the maintenance of the fittest and the flattest. Nature, 472(7343):342–
     346, 2011.
 [8] Robert Belas. Biofilms, flagella, and mechanosensing of surfaces by bacteria. Trends
     in Microbiology, 22(9):517–527, 2014.
 [9] Howard C. Berg. The rotary motor of bacterial flagella. Annual Review of Biochemistry,
     72(1):19–54, 2003. PMID: 12500982.
[10] Harold S. Bernhardt. The rna world hypothesis: the worst theory of the early evolution
     of life (except for all the others)a. Biology Direct, 7(1):23, Jul 2012.
[11] Swapna Bhat, Tye O. Boynton, Dan Pham, and Lawrence J. Shimkets. Fatty acids from
     membrane lipids become incorporated into lipid bodies during Myxococcus xanthus
     differentiation. PLOS ONE, 9(6):1–10, 06 2014.
[12] Saurabh Bhattacharya, Amit K. Baidya, Ritesh Ranjan Pal, Gideon Mamou, Yair E.
     Gatt, Hanah Margalit, Ilan Rosenshine, and Sigal Ben-Yehuda. A ubiquitous platform
     for bacterial nanotube biogenesis. Cell articles, 27:334–342.e10, 4 2019.
[13] Christof Biebricher and Manfred Eigen. What is a quasispecies? Quasispecies: Concept
     and Implications for Virology, pages 1–31, 2006.
[14] Tye O’Hara Boynton and Lawrence Joseph Shimkets. Myxococcus csga, drosophila
     sniffer, and human hsd10 are cardiolipin phospholipases. Genes and Development,
     29:1903–1914, 9 2015.
                                              107


[15] Sarah J. Bray. Notch signalling in context. Nature Reviews Molecular Cell Biology,
     17:722–735, 11 2016.
[16] Elena Bray Speth, Tammy M. Long, Robert T. Pennock, and Diane Ebert-May. Us-
     ing avida-ed for teaching and learning about evolution in undergraduate introductory
     biology courses. Evolution: Education and Outreach, 2(3):415–428, Sep 2009.
[17] James Briscoe and Pascal P. Thérond. The mechanisms of hedgehog signalling and its
     roles in development and disease. Nature Reviews Molecular Cell Biology, 14:418–431,
     2013.
[18] Nhat Khai Bui, Joe Gray, Heinz Schwarz, Peter Schumann, Didier Blanot, and Walde-
     mar Vollmer. The peptidoglycan sacculus of Myxococcus xanthus has unusual structural
     features and is degraded during glycerol-induced myxospore development. Journal of
     Bacteriology, 191(2):494–505, 2009.
[19] James J Bull, Lauren Ancel Meyers, and Michael Lachmann. Quasispecies made sim-
     ple. PLoS Computational Biology, 1(6):e61, 2005.
[20] Matthew J. Bush, Natalia Tschowri, Susan Schlimpert, Klas Flärdh, and Mark J.
     Buttner. C-di-gmp signalling and the regulation of developmental transitions in strep-
     tomycetes. Nature Reviews Microbiology, 13:749–760, 12 2015.
[21] David Butcher. Muller’s ratchet, epistasis and mutation effects. Genetics, 141(1):431–
     437, 1995.
[22] Paul-Christian Bürkner. Advanced bayesian multilevel modeling with the r package
     brms. The R Journal, 10(1):395–411, 2018.
[23] Nitash C G, Thomas LaBar, Arend Hintze, and Christoph Adami. Origin of life in a
     digital microcosm. Philosophical Transactions of the Royal Society A: Mathematical,
     Physical and Engineering Sciences, 375(2109):20160350, 2017.
[24] Todd A. Cameron, John R. Zupan, and Patricia C. Zambryski. The essential features
     and modes of bacterial polar growth. Trends in Microbiology, 23(6):347–353, Jun 2015.
[25] Ashleigh Campbell, Poorna Viswanathan, Terry Barrett, Bongjun Son, Shreya Saha,
     and Lee Kroos. Combinatorial regulation of the dev operon by mrpc2 and frua during
     Myxococcus xanthus development. Journal of Bacteriology, 197:240–251, 2015.
[26] José A. Campillo-Balderas, Antonio Lazcano, and Arturo Becerra. Viral genome size
     distribution does not correlate with the antiquity of the host lineages. Frontiers in
     Ecology and Evolution, 3:143, 2015.
[27] Felipe Cava, Erkin Kuru, Yves V Brun, and Miguel A de Pedro. Modes of cell
     wall growth differentiation in rod-shaped bacteria. Current Opinion in Microbiology,
     16(6):731–737, 2013. Growth and development: eukaryotes/prokaryotes.
                                           108


[28] Bonnie Chaban, H. Velocity Hughes, and Morgan Beeby. The flagellum in bacterial
     pathogens: For motility and a whole lot more. Seminars in Cell & Developmental
     Biology, 46:91–103, 2015. Biomineralisation & Motorisation of pathogens.
[29] Fabienne F. V. Chevance and Kelly T. Hughes. Coordinating assembly of a bacterial
     macromolecular machine. Nature Reviews Microbiology, 6(6):455–465, Jun 2008.
[30] Dennis Claessen, Daniel E. Rozen, Oscar P. Kuipers, Lotte Søgaard-Andersen, and
     Gilles P. Van Wezel. Bacterial solutions to multicellularity: A tale of biofilms, filaments
     and fruiting bodies. Nature Reviews Microbiology, 12:115–124, 2 2014.
[31] Hans Clevers and Roel Nusse. Wnt/b-catenin signaling and disease. Cell, 149:1192–
     1205, 6 2012.
[32] Alma Dal Co, Simon van Vliet, Daniel Johannes Kiviet, Susan Schlegel, and Martin
     Ackermann. Short-range interactions govern the dynamics and functions of microbial
     communities. Nature Ecology and Evolution, 4:366–375, 3 2020.
[33] Francisco M Codoñer, José-Antonio Darós, Ricard V Solé, and Santiago F Elena. The
     fittest versus the flattest: Experimental confirmation of the quasispecies effect with
     subviral pathogens. PLoS Pathogens, 2(12):e136, 2006.
[34] Remy Colin, Bin Ni, Leanid Laganenka, and Victor Sourjik. Multiple functions of
     flagellar motility and chemotaxis in bacterial physiology. FEMS Microbiology Reviews,
     07 2021. fuab038.
[35] Iñaki Comas, Andrés Moya, and Fernando González-Candelas. Validating viral qua-
     sispecies with digital organisms: A re-examination of the critical mutation rate. BMC
     Evolutionary Biology, 5(1):5, 2005.
[36] Christopher R. Cotter, Heinz Bernd Schüttler, Oleg A. Igoshin, and Lawrence J.
     Shimkets. Data-driven modeling reveals cell behaviors controlling self-organization
     during Myxococcus xanthus development. Proceedings of the National Academy of Sci-
     ences of the United States of America, 114:E4592–E4601, 6 2017.
[37] Arthur W Covert, Richard E Lenski, Claus O Wilke, and Charles Ofria. Experiments on
     the role of deleterious mutations as stepping stones in adaptive evolution. Proceedings
     of the National Academy of Sciences, 110(34):E3171–E3178, 2013.
[38] Eugene W Crawford and Lawrence J Shimkets. The Myxococcus xanthus socE and
     csgA genes are regulated by the stringent response. Molecular Microbiology, 37:788–
     799, 2000.
[39] James F Crow. Some possibilities for measuring selection intensities in man. Human
     Biology, 30(1):1–13, 1958.
[40] Patrick D. Curtis, Rion G. Taylor, Roy D. Welch, and Lawrence J. Shimkets. Spa-
     tial organization of Myxococcus xanthus during fruiting body formation. Journal of
     Bacteriology, 189:9126–9130, 12 2007.
                                             109


[41] Eric H. Davidson. Emerging properties of animal gene regulatory networks. Nature,
     468:911–920, 12 2010.
[42] J Arjan G M de Visser, Joachim Hermisson, Günter P Wagner, Lauren Ancel Meyers,
     Homayoun Bagheri-Chaichian, Jeffrey L Blanchard, Lin Chao, James M Cheverud,
     Santiago F Elena, Walter Fontana, et al. Perspective: Evolution and detection of
     genetic robustness. Evolution, 57(9):1959–1972, 2003.
[43] Edward F. DeLong, Christina M. Preston, Tracy Mincer, Virginia Rich, Steven J.
     Hallam, Niels-Ulrik Frigaard, Asuncion Martinez, Matthew B. Sullivan, Robert Ed-
     wards, Beltran Rodriguez Brito, Sallie W. Chisholm, and David M. Karl. Commu-
     nity genomics among stratified microbial assemblages in the ocean’s interior. Science,
     311(5760):496–503, 2006.
[44] Yann S. Dufour, Xiongfei Fu, Luis Hernandez-Nunez, and Thierry Emonet. Limits of
     feedback control in bacterial chemotaxis. PLOS Computational Biology, 10(6):1–11,
     06 2014.
[45] Yann S. Dufour, Sébastien Gillet, Nicholas W. Frankel, Douglas B. Weibel, and Thierry
     Emonet. Direct correlation between motile behavior and protein abundance in single
     cells. PLOS Computational Biology, 12(9):1–25, 09 2016.
[46] Jeffrey A Edlund and Christoph Adami. Evolution of robustness in digital organisms.
     Artificial Life, 10(2):167–179, 2004.
[47] Manfred Eigen. Self-organization of matter and the evolution of biological macro-
     molecules. Naturwissenschaften, 58(10):465–523, 1971.
[48] Manfred Eigen and Peter Schuster. A principle of natural self-organization. part a:
     Emergence of the hypercycle. Naturwissenschaften, 64(11):541–565, 1977.
[49] Santiago F Elena, Purificación Carrasco, José-Antonio Daròs, and Rafael Sanjuán.
     Mechanisms of genetic robustness in RNA viruses. EMBO articles, 7(2):168–173, 2006.
[50] Eva Ellehauge, Mads Nørregaard-Madsen, and Lotte Søgaard-Andersen. The frua
     signal transduction protein provides a checkpoint for the temporal co-ordination of
     inter-cellular signals in Myxococcus xanthus development. Mol Microbiol, 30(4):807–
     817, Nov 1998.
[51] Marc Erhardt. Energy requirements for protein secretion via the flagellar type III
     secretion system. In Methods in Molecular Biology, pages 449–457. Springer New York,
     2017.
[52] Adam Eyre-Walker and Peter D Keightley. The distribution of fitness effects of new
     mutations. Nature Reviews Genetics, 8(8):610–618, 2007.
[53] Robert Forster, Christoph Adami, and Claus O Wilke. Selection for mutational ro-
     bustness in finite populations. Journal of theoretical biology, 243(2):181–190, 2006.
                                            110


[54] David T Fraebel, Harry Mickalide, Diane Schnitkey, Jason Merritt, Thomas E
     Kuhlman, and Seppe Kuehn. Environment determines evolutionary trajectory in a
     constrained phenotypic space. eLife, 6, March 2017.
[55] Joshua Franklin, Thomas LaBar, and Christoph Adami. Mapping the peaks: Fitness
     landscapes of the fittest and the flattest. Artificial Life, 25:250–262, 2019.
[56] Mikako Fujii, Satoshi Shibata, and Shin-Ichi Aizawa. Polar, peritrichous, and lateral
     flagella belong to three distinguishable flagellar families. Journal of Molecular Biology,
     379(2):273–283, 2008.
[57] Samanta Tapas Ghosh Jayanata K., Delampady Mohan. An Introduction to Bayesian
     Analysis, chapter 2, pages 29–63. Springer, New York, NY, 1 edition, 2006. Page 49
     specifically.
[58] Daniel G Gibson, Lei Young, Ray-Yuan Chuang, J Craig Venter, Clyde A Hutchison,
     and Hamilton O Smith. Enzymatic assembly of DNA molecules up to several hundred
     kilobases. Nature Methods, 6(5):343–345, April 2009.
[59] Heather J Goldsby, Anna Dornhaus, Benjamin Kerr, and Charles Ofria. Task-switching
     costs promote the evolution of division of labor and shifts in individuality. Proceedings
     of the National Academy of Sciences, 109(34):13686–13691, 2012.
[60] Heather J Goldsby, David B Knoester, Charles Ofria, and Benjamin Kerr. The evo-
     lutionary origin of somatic cells under the dirty work hypothesis. PLoS Biology,
     12(5):e1001858, 2014.
[61] Thomas M A Gronewold and Dale Kaiser. The act operon controls the level and time
     of c-signal production for Myxococcus xanthus development. Molecular Microbiology,
     40:744–756, 2001.
[62] Sarah B. Guttenplan and Daniel B. Kearns. Regulation of flagellar motility during
     biofilm formation. FEMS Microbiology Reviews, 37(6):849–871, 11 2013.
[63] Douglas Hanahan. Studies on transformation of escherichia coli with plasmids. Journal
     of Molecular Biology, 166(4):557–580, 1983.
[64] Tong Hao, Dvora Biran, Gregory J. Velicer, and Lee Kroos. Identification of the
     ωregulatory region, a developmental promoter of Myxococcus xanthus that is tran-
     scribed in vitro by the major vegetative rna polymerase. Journal of Bacteriology,
     184(12):3348–3359, 2002.
[65] Antonia Herrero, Joel Stavans, and Enrique Flores. The multicellular nature of fila-
     mentous heterocyst-forming cyanobacteria. FEMS Microbiology Reviews, 40:831–854,
     11 2016.
[66] Rink Hoekstra, Richard D. Morey, Jeffrey N. Rouder, and Eric Jan Wagenmakers.
     Robust misinterpretation of confidence intervals. Psychonomic Bulletin and Review,
     21:1157–1164, 10 2014.
                                             111


[67] Egbert Hoiczyk, Michael W. Ring, Colleen A. McHugh, Gertrud Schwär, Edna Bode,
     Daniel Krug, Matthias O. Altmeyer, Jeff Zhiqiang Lu, and Helge B. Bode. Lipid
     body formation plays a central role in cell fate determination during developmental
     differentiation of Myxococcus xanthus. Molecular Microbiology, 74:497–517, 10 2009.
[68] Colette A. Ten Hove, Kuan Ju Lu, and Dolf Weijers. Building a plant: Cell fate
     specification in the early arabidopsis embryo. Development (Cambridge), 142:420–430,
     2 2015.
[69] J. D. Hunter. Matplotlib: A 2D graphics environment. Computing In Science &
     Engineering, 9(3):90–95, 2007.
[70] Antonio A. Iniesta, Francisco García-Heras, Javier Abellón-Ruiz, Aránzazu Gallego-
     García, and Montserrat Elías-Arnanz. Two systems for conditional gene expression
     in myxococcus xanthus inducible by isopropyl-β-thiogalactopyranoside or vanillate.
     Journal of Bacteriology, 194(21):5875–5885, 2012.
[71] Ken F. Jarrell and Mark J. McBride. The surprisingly diverse ways that prokaryotes
     move. Nature Reviews Microbiology, 6(6):466–476, Jun 2008.
[72] Lars Jelsbak and Lotte Søgaard-Andersen. Pattern formation by a cell surface-
     associated morphogen in Myxococcus xanthus. Proceedings of the National Academy of
     Sciences, 99(4):2032–2037, 2002.
[73] Bryan Julien, A. Dale Kaiser, and Anthony Garza. Spatial control of cell differentiation
     in Myxococcus xanthus. Proceedings of the National Academy of Sciences, 97(16):9098–
     9103, 2000.
[74] D. Kaiser. Social gliding is correlated with the presence of pili in Myxococcus xanthus.
     Proceedings of the National Academy of Sciences of the United States of America,
     76(11):5952–5956, Nov 1979. PMC411771[pmcid].
[75] Dale Kaiser. Cell fate and organogenesis in bacteria. Trends in Genetics, 15(7):273–
     277, Jul 1999.
[76] Dale Kaiser and Roy Welcht. Dynamics of fruiting body morphogenesis. Journal of
     Bacteriology, 186:919–927, 2 2004.
[77] Kazem Kashefi and Patricia L. Hartzell. Genetic suppression and phenotypic masking
     of a Myxococcus xanthus frzf- defect. Molecular Microbiology, 15(3):483–494, 1995.
[78] Matthew Kay. tidybayes: Tidy Data and Geoms for Bayesian Models, 2020. R package
     version 2.3.1.
[79] Daniel B. Kearns. A field guide to bacterial swarming motility. Nature Reviews Mi-
     crobiology, 8(9):634–644, Sep 2010.
[80] Daniel B. Kearns. You get what you select for: better swarming through more flagella.
     Trends in Microbiology, 21(10):508–509, 2013.
                                             112


[81] S K Kim and D Kaiser. Cell motility is required for the transmission of c-factor,
     an intercellular signal that coordinates fruiting body morphogenesis of Myxococcus
     xanthus. Genes & Development, 4(6):896–905, June 1990.
[82] Seung K Kim and Dale Kaiser. C-factor: A cell-cell signaling protein required for
     fruiting body morphogenesis of M. xanthus. Cell, 61:19–26, 1990.
[83] Seung K Kim and Dale Kaiser. C-factor has distinct aggregation and sporulation
     thresholds during myxococcus development. Journal of Bacteriology, 173:1722–1728,
     1991.
[84] SK Kim and D Kaiser. Cell alignment required in differentiation of Myxococcus xanthus.
     Science, 249(4971):926–928, 1990.
[85] Alexey S Kondrashov.         Muller’s ratchet under epistatic selection.     Genetics,
     136(4):1469–1473, 1994.
[86] R H Kottel, K Bacon, D Clutter, and D White. Coats from Myxococcus xanthus: char-
     acterization and synthesis during myxospore differentiation. Journal of Bacteriology,
     124(1):550–557, 1975.
[87] David C Krakauer and Joshua B Plotkin. Redundancy, antiredundancy, and the robust-
     ness of genomes. Proceedings of the National Academy of Sciences, 99(3):1405–1409,
     2002.
[88] Lee Kroos. Bacterial development in the fast lane. Journal of Bacteriology, 190:4373–
     4376, 7 2008.
[89] Lee Kroos. Highly signal-responsive gene regulatory network governing myxococcus
     development. Trends in Genetics, 33:3–15, 1 2017.
[90] Lee Kroos, Patricia Hartzell, Karen Stephens, and Dale Kaiser. A link between cell
     movement and gene expression argues that motility is required for cell-cell signaling
     during fruiting body development. Genes & Development, 2:1677–1685, 1988.
[91] Lee Kroos and Dale Kaiser. Expression of many developmentally regulated genes in
     myxococcus depends on a sequence of cell interactions. Genes & Development, 1:840–
     854, 1987.
[92] John K. Kruschke. Bayesian estimation supersedes the t test. Journal of Experimental
     Psychology: General, 142:573–603, 2013.
[93] Thomas Kruse, Sune Lobedanz, Nils M S Berthelsen, and Lotte Sùgaard-Andersen.
     C-signal: a cell surface-associated morphogen that induces and co-ordinates multicel-
     lular fruiting body morphogenesis and sporulation in Myxococcus xanthus. Molecular
     Microbiology, 40:156–168, 2001.
[94] J M Kuner and D Kaiser. Fruiting body morphogenesis in submerged cultures of
     Myxococcus xanthus. Journal of Bacteriology, 151(1):458–461, 1982.
                                            113


 [95] Jerry M Kunert and Dale Kaiser. Fruiting body morphogenesis in submerged cultures
      of Myxococcus xanthus. Journal of Bacteriology, pages 458–461, 1982.
 [96] K Kutsukake and T Iino. Role of the flia-flgm regulatory system on the transcriptional
      control of the flagellar regulon and flagellar formation in salmonella typhimurium.
      Journal of Bacteriology, 176(12):3598–3605, 1994.
 [97] Thomas LaBar and Christoph Adami. Different evolutionary paths to complexity
      for small and large populations of digital organisms. PLoS Computational Biology,
      12(12):e1005066, 2016.
 [98] Thomas LaBar and Christoph Adami. Evolution of drift robustness in small popula-
      tions. Nature Communications, 8(1):1012, 2017.
 [99] Yinghong Lan, Aaron Trout, Daniel M Weinreich, and C Scott Wylie. Natural selection
      can favor the evolution of ratchet robustness over evolution of mutational robustness.
      bioRxiv, page 121087, 2017.
[100] Adam S Lauring and Raul Andino. Quasispecies theory and the behavior of RNA
      viruses. PLoS Pathogens, 6(7):e1001005, 2010.
[101] Adam S Lauring, Judith Frydman, and Raul Andino. The role of mutational robustness
      in RNA virus evolution. Nat Rev Microbiol, 11(5):327–336, 2013.
[102] Bongsoo Lee, Carina Holkenbrink, Anke Treuner-Lange, and Penelope I. Higgs. Myx-
      ococcus xanthus developmental cell fate production: Heterogeneous accumulation of
      developmental regulatory proteins and reexamination of the role of mazf in develop-
      mental lysis. Journal of Bacteriology, 194:3058–3068, 6 2012.
[103] Richard E Lenski, Charles Ofria, Travis C Collier, and Christoph Adami. Genome
      complexity, robustness and genetic interactions in digital organisms.           Nature,
      400(6745):661–664, 1999.
[104] Richard E Lenski, Charles Ofria, Robert T Pennock, and Christoph Adami. The
      evolutionary origin of complex features. Nature, 423(6936):139–144, 2003.
[105] Shengfeng Li, Bheong-Uk Lee, and Lawrence J Shimkets. csgA expression entrains
      Myxococcus xanthus development. Genes & Development, 6:401–410, 1992.
[106] Sune Lobedanz and Lotte Søgaard-Andersen. Identification of the c-signal, a contact-
      dependent morphogen coordinating multiple developmental responses in Myxococcus
      xanthus. Genes and Development, 17:2151–2161, 9 2003.
[107] R. Lux, Y. Li, A. Lu, and W. Shi. Detailed three-dimensional analysis of structural fea-
      tures of Myxococcus xanthus fruiting bodies using confocal laser scanning microscopy.
      Biofilms, 1(4):293–303, 2004.
[108] Nicholas A. Lyons and Roberto Kolter. On the evolution of bacterial multicellularity.
      Current Opinion in Microbiology, 24:21–28, 4 2015.
                                             114


[109] Esteban Martínez-García, Pablo I. Nikel, Max Chavarría, and Víctor de Lorenzo. The
      metabolic cost of flagellar motion in pseudomonas putida kt2440. Environmental Mi-
      crobiology, 16(1):291–303, 2014.
[110] Joan Massagué. Tgfβsignalling in context. Nature Reviews Molecular Cell Biology,
      13:616–630, 10 2012.
[111] Linda L McCarter. Regulation of flagella. Current Opinion in Microbiology, 9(2):180–
      186, 2006. Cell Regulation / Edited by Werner Goebel and Stephen Lory.
[112] Richard McElreath. Statistical Rethinking; A Bayesian Course with Examples in R
      and STAN, chapter 13. CRC Press, Boca Raton, 2020.
[113] Patrick J Mears, Santosh Koirala, Chris V Rao, Ido Golding, and Yann R Chemla.
      Escherichia coli swimming is robust against variations in flagellar number. eLife, 3,
      February 2014.
[114] Susana Merino, Jonathan G. Shaw, and Juan M. Tomás. Bacterial lateral flagella: an
      inducible flagella system. FEMS Microbiology Letters, 263(2):127–135, 10 2006.
[115] James G. Mitchell. The energetics and scaling of search strategies in bacteria. The
      American Naturalist, 160(6):727–740, 2002. PMID: 18707461.
[116] Sheenu Mittal and Lee Kroos. A combination of unusual transcription factors binds co-
      operatively to control Myxococcus xanthus developmental gene expression. Proceedings
      of the National Academy of Sciences, 106(6):1965–1970, 2009.
[117] Sheenu Mittal and Lee Kroos. Combinatorial regulation by a novel arrangement of frua
      and mrpc2 transcription factors during Myxococcus xanthus development. Journal of
      Bacteriology, 191:2753–2763, 4 2009.
[118] Sampriti Mukherjee and Daniel B. Kearns. The structure and regulation of flagella in
      bacillus subtilis. Annual Review of Genetics, 48(1):319–340, 2014. PMID: 25251856.
[119] Frank D. Müller, Christian W. Schink, Egbert Hoiczyk, Emöke Cserti, and Penelope I.
      Higgs. Spore formation in Myxococcus xanthus is tied to cytoskeleton functions and
      polysaccharide spore coat deposition. Molecular Microbiology, 83(3):486–505, 2012.
[120] Carey D Nadell, Knut Drescher, and Kevin R Foster. Spatial structure, cooperation
      and competition in biofilms. Nature Publishing Group, 2016.
[121] Javad Najafi, Mohammad Reza Shaebani, Thomas John, Florian Altegoer, Gert Bange,
      and Christian Wagner. Flagellar number governs bacterial spreading and transport
      efficiency. Science Advances, 4(9), 2018.
[122] Nguyen T. Q. Nhu, John S. Lee, Helen J. Wang, Yann S. Dufour, and Laurie E.
      Comstock. Alkaline ph increases swimming speed and facilitates mucus penetration
      for vibrio cholerae. Journal of Bacteriology, 203(7):e00607–20, 2021.
                                             115


[123] Bin Ni, Bhaswar Ghosh, Ferencz S. Paldy, Remy Colin, Thomas Heimerl, and Victor
      Sourjik. Evolutionary remodeling of bacterial motility checkpoint control. Cell Reports,
      18:866–877, 1 2017.
[124] Martin A Nowak. What is a quasispecies? Trends in Ecology & Evolution, 7(4):118–
      121, 1992.
[125] Kathleen A O’connor and David R Zusman. Development in Myxococcus xanthus
      involves differentiation into two cell types, peripheral rods and spores. JOURNAL OF
      BACTERIOLOGY, 173:3318–3333, 1991.
[126] Brendan D O’Fallon, Frederick R Adler, and Stephen R Proulx. Quasi-species evolution
      in subdivided populations favours maximally deleterious mutations. Proceedings of the
      Royal Society of London B: Biological Sciences, 274(1629):3159–3164, 2007.
[127] C. Ofria and C. O. Wilke. Avida: a software platform for research in computational
      evolutionary biology. Artificial Life, 10(2):191–229, 2004.
[128] Charles Ofria, David M Bryson, and Claus O Wilke. Avida: A software platform
      for research in computational evolutionary biology. In Andrew Adamatzky Maciej Ko-
      mosinski, editor, Artificial Life Models in Software, pages 3–35. Springer London, 2009.
[129] George A. O’Toole and Roberto Kolter. Flagellar and twitching motility are necessary
      for pseudomonas aeruginosa biofilm development. Molecular Microbiology, 30(2):295–
      304, 1998.
[130] Jonathan D. Partridge and Rasika M. Harshey. More than motility: Salmonella flagella
      contribute to overriding friction and facilitating colony hydration during swarming.
      Journal of Bacteriology, 195(5):919–929, 2013.
[131] Jonathan D. Partridge and Rasika M. Harshey. Swarming: Flexible roaming plans.
      Journal of Bacteriology, 195(5):909–918, 2013.
[132] Joyce E. Patrick and Daniel B. Kearns. Swarming motility and the control of master
      regulators of flagellar biosynthesis. Molecular Microbiology, 83(1):14–23, 2012.
[133] Robert T Pennock. Models, simulations, instantiations, and evidence: The case of dig-
      ital evolution. Journal of Experimental & Theoretical Artificial Intelligence, 19(1):29–
      42, 2007.
[134] Gordon R. Plague, Krystal S. Boodram, Kevin M. Dougherty, Sandar Bregg, Daniel P.
      Gilbert, Hira Bakshi, and Daniel Costa. Transposable elements mediate adaptive de-
      bilitation of flagella in experimental escherichia coli populations. Journal of Molecular
      Evolution, 84(5-6):279–284, June 2017.
[135] E. M. Purcell. Life at Low Reynolds Number, pages 47–67. 1977.
                                               116


[136] Boyang Qin, Chenyi Fei, Andrew A. Bridges, Ameya A. Mashruwala, Howard A. Stone,
      Ned S. Wingreen, and Bonnie L. Bassler. Cell position fates and collective fountain
      flow in bacterial biofilms revealed by light-sheet microscopy. Science, 369(6499):71–77,
      2020.
[137] R Core Team. R: A language and environment for statistical computing. R Foundation
      for Statistical Computing, Vienna, Austria, 2017.
[138] Paul B. Rainey and Michael Travisano. Adaptive radiation in a heterogeneous envi-
      ronment. Nature, 394(6688):69–72, Jul 1998.
[139] Ramya Rajagopalan and Lee Kroos. Nutrient-regulated proteolysis of mrpc halts ex-
      pression of genes important for commitment to sporulation during Myxococcus xanthus
      development. Journal of Bacteriology, 196:2736–2747, 2014.
[140] Ramya Rajagopalan and Lee Kroos. The dev operon regulates the timing of sporulation
      during Myxococcus xanthus development. Journal of Bacteriology, 199, 5 2017.
[141] Ramya Rajagopalan, Sébastien Wielgoss, Gerardo Lippert, Gregory J. Velicer, and
      Lee Kroos. devi is an evolutionarily young negative regulator of Myxococcus xanthus
      development. Journal of Bacteriology, 197:1249–1262, 2015.
[142] Hans Reichenbach. Myxococcus spp. myxobacteriales - schwarmentwicklung und bil-
      dung von protocysten. IWF Göttingen, 1965. https://doi.org/10.3203/IWF/E-778
      Last accessed: 15 Jul 2021.
[143] Mark Robinson, Bongjun Son, David Kroos, and Lee Kroos. Transcription factor
      mrpc binds to promoter regions of hundreds of developmentally-regulated genes in
      Myxococcus xanthus. BMC Genomics, 15, 2014.
[144] Curtis T. Rueden, Johannes Schindelin, Mark C. Hiner, Barry E. DeZonia, Alison E.
      Walter, Ellen T. Arena, and Kevin W. Eliceiri. Imagej2: Imagej for the next generation
      of scientific image data, Nov 2017.
[145] B. Sager and D. Kaiser. Spatial restriction of cellular differentiation. Genes Dev,
      7(9):1645–1653, Sep 1993.
[146] Brian Sager and Dale Kaiser.             Two cell-density domains within the Myx-
      ococcus xanthus fruiting body (multicellular development/pattern forma-
      tion/morphogenesis/myxobacterales). Proc. Natl. Acad. Sci. USA, 90:3690–3694,
      1993.
[147] Brian Sager and Dale Kaiser. Intercellular c-signaling and the traveling waves of
      myxococcus. Genes & Development, 8, 1994.
[148] Shreya Saha, Pintu Patra, Oleg Igoshin, and Lee Kroos. Systematic analysis of the
      Myxococcus xanthus developmental gene regulatory network supports posttranslational
      regulation of frua by c-signaling. Molecular Microbiology, 111:1732–1752, 6 2019.
                                               117


[149] J. Sambrook, E.F Fritsch, and T. Maniatis. Molecular cloning: a laboratory manual ;
      2nd ed. Cold Spring Harbor Laboratory Press, 1989.
[150] Rafael Sanjuán, José M Cuevas, Victoria Furió, Edward C Holmes, and Andrés Moya.
      Selection for robustness in mutagenized RNA viruses. PLoS Genetics, 3(6):e93, 2007.
[151] Jan S. Schuhmacher, Kai M. Thormann, and Gert Bange. How bacteria maintain
      location and number of flagella? FEMS Microbiology Reviews, 39(6):812–822, 07 2015.
[152] Peter Schuster and Jörg Swetina. Stationary mutant distributions and evolutionary
      optimization. Bulletin of Mathematical Biology, 50(6):635–660, 1988.
[153] Luca Scrucca, Michael Fop, T. Brendan Murphy, and Adrian E. Raftery. mclust 5:
      clustering, classification and density estimation using gaussian finite mixture models.
      The R Journal, 8(1):289–317, 2016.
[154] Jun seok Lee, Bongjun Son, Poorna Viswanathan, Paul M. Luethy, and Lee Kroos.
      Combinatorial regulation of fmgD by mrpc2 and frua during Myxococcus xanthus de-
      velopment. Journal of Bacteriology, 193(7):1681–1689, 2011.
[155] Lawrence J Shimkets and Sheilah J Asher. Use of recombination techniques to examine
      the structure of the csg locus of Myxococcus xanthus. Molecular and General Genetics,
      211:63–71, 1988.
[156] Oleksii Sliusarenko, David R. Zusman, and George Oster. Aggregation during fruiting
      body formation in Myxococcus xanthus is driven by reducing cell movement. Journal
      of Bacteriology, 189:611–619, 1 2007.
[157] Daniel R. Smith, Matthew R. Chapman, Thomas J. Silhavy, and James Tiedje. Eco-
      nomical evolution: Microbes reduce the synthetic cost of extracellular proteins. mBio,
      1(3):e00131–10, 2010.
[158] L Søgaard-Andersen and D Kaiser. C factor, a cell-surface-associated intercellular sig-
      naling protein, stimulates the cytoplasmic frz signal transduction system in Myxococcus
      xanthus. Proceedings of the National Academy of Sciences, 93(7):2675–2679, 1996.
[159] Bongjun Son, Yu Liu, and Lee Kroos. combinatorial regulation by mrpc2 and frua in-
      volves three sites in the fmge promoter region during Myxococcus xanthus development.
      Journal of Bacteriology, 193:2756–2766, 6 2011.
[160] Yoshiyuki Sowa and Richard M. Berry. Bacterial flagellar motor. Quarterly Reviews
      of Biophysics, 41(2):103–132, 2008.
[161] Apollo Stacy, Luke McNally, Sophie E. Darch, Sam P. Brown, and Marvin Whiteley.
      The biogeography of polymicrobial infection. Nature Reviews Microbiology, 14:93–105,
      2 2016.
[162] Stan Development Team. RStan: the R interface to Stan, 2020. R package version
      2.21.2.
                                              118


[163] Giovanni Strona and Kevin D Lafferty. Environmental change makes robust ecological
      networks fragile. Nature Communications, 7:12462, 2016.
[164] Stella Stylianidou, Connor Brennan, Silas B. Nissen, Nathan J. Kuwada, and Paul A.
      Wiggins. Supersegger: robust image segmentation, analysis and lineage tracking of
      bacterial cells. Molecular Microbiology, 102(4):690–700, 2016.
[165] Toon Swings, Bram Van den Bergh, Sander Wuyts, Eline Oeyen, Karin Voordeckers,
      Kevin J Verstrepen, Maarten Fauvart, Natalie Verstraeten, and Jan Michiels. Adaptive
      tuning of mutation rates allows fast response to lethal stress in Escherichia coli. eLife,
      6:e22939, may 2017.
[166] L Søgaard-Andersen, F J Slack, H Kimsey, and D Kaiser. Intercellular c-signaling
      in Myxococcus xanthus involves a branched signal transduction pathway. Genes I&
      Development, 10(6):740–754, 1996.
[167] Lotte Søgaard-Andersen, Martin Overgaard, Sune Lobedanz, Eva Ellehauge, Lars Jels-
      bak, and Anders Aa Rasmussen. Coupling gene expression and multicellular morpho-
      genesis during fruiting body formation in Myxococcus xanthus. Molecular Microbiology,
      48:1–8, 2003.
[168] John R. Taylor and Roman Stocker. Trade-offs of chemotactic foraging in turbulent
      water. Science, 338(6107):675–679, 2012.
[169] Jennifer K. Teschler, David Zamorano-Sánchez, Andrew S. Utada, Christopher J.A.
      Warner, Gerard C.L. Wong, Roger G. Linington, and Fitnat H. Yildiz. Living in the
      matrix: Assembly and control of vibrio cholerae biofilms. Nature Reviews Microbiology,
      13:255–268, 5 2015.
[170] Shashi Thutupalli, Mingzhai Sun, Filiz Bunyak, Kannappan Palaniappan, and
      Joshua W. Shaevitz. Directional reversals enable Myxococcus xanthus cells to pro-
      duce collective one-dimensional streams during fruiting-body formation. Journal of
      the Royal Society Interface, 12, 8 2015.
[171] Linda Turner, Liam Ping, Marianna Neubauer, and Howard C. Berg. Visualizing
      flagella while tracking bacteria. Biophysical Journal, 111(3):630–639, Aug 2016.
[172] Nicholas Turner and Richard Grose. Fibroblast growth factor signalling: From devel-
      opment to cancer. Nature Reviews Cancer, 10:116–129, 2 2010.
[173] Erik Van Nimwegen, James P Crutchfield, and Martijn Huynen. Neutral evolution of
      mutational robustness. Proceedings of the National Academy of Sciences, 96(17):9716–
      9720, 1999.
[174] Tim van Opijnen, Kip L. Bodi, and Andrew Camilli. Tn-seq: high-throughput par-
      allel sequencing for fitness and genetic interaction studies in microorganisms. Nature
      Methods, 6(10):767–772, Oct 2009.
                                              119


[175] Gregory J. Velicer and Michiel Vos. Sociobiology of the myxobacteria. Annual Review
      of Microbiology, 63:599–623, 10 2009.
[176] Kartik Viswanathan, Poorna Viswanathan, and Lee Kroos. Mutational analysis of the
      Myxococcus xanthus ω4406 promoter region reveals an upstream negative regulatory
      element that mediates c-signal dependence. Journal of Bacteriology, 188:515–524, 1
      2006.
[177] Poorna Viswanathan, Toshiyuki Ueki, Sumiko Inouye, and Lee Kroos. Combinatorial
      regulation of genes essential for Myxococcus xanthus development involves a response
      regulator and a lysr-type regulator. Proceedings of the National Academy of Sciences,
      104(19):7969–7974, 2007.
[178] Nikita Vladimirov and Victor Sourjik.       Chemotaxis: how bacteria use memory.
      390(11):1097–1104, 2009.
[179] Hera Vlamakis, Yunrong Chai, Pascale Beauregard, Richard Losick, and Roberto
      Kolter. Sticking together: Building a biofilm the bacillus subtilis way. Nature Re-
      views Microbiology, 11:157–168, 3 2013.
[180] Xiaoyi Wang, Santosh Koirala, Phillip D. Aldridge, Christopher V. Rao, and Yves V.
      Brun. Two tandem mechanisms control bimodal expression of the flagellar genes in
      salmonella enterica. Journal of Bacteriology, 202(13):e00787–19, 2020.
[181] Yan Wei, Xiaolin Wang, Jingfang Liu, Ilya Nememan, Amoolya H. Singh, Howie Weiss,
      and Bruce R. Levin. The population dynamics of bacteria in physically structured
      habitats and the adaptive virtue of random motility. Proceedings of the National
      Academy of Sciences, 108(10):4047–4052, 2011.
[182] Marvin Whiteley, Stephen P. Diggle, and E. Peter Greenberg. Progress in and promise
      of bacterial quorum sensing research. Nature, 551:313–320, 11 2017.
[183] Hadley Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New
      York, 2016.
[184] Uri Wilensky.
[185] Claus O Wilke. Adaptive evolution on neutral networks. Bulletin of Mathematical
      Biology, 63(4):715–730, 2001.
[186] Claus O Wilke. Quasispecies theory in the context of population genetics. BMC
      Evolutionary Biology, 5(1):44, 2005.
[187] Claus O Wilke and Christoph Adami. Interaction between directional epistasis and
      average mutational effects. Proceedings of the Royal Society of London B: Biological
      Sciences, 268(1475):1469–1474, 2001.
[188] Claus O Wilke and Christoph Adami. Evolution of mutational robustness. Mutation
      Research/Fundamental and Molecular Mechanisms of Mutagenesis, 522(1):3–11, 2003.
                                             120


[189] Claus O Wilke, Jia Lan Wang, Charles Ofria, Richard E Lenski, and Christoph Adami.
      Evolution of digital organisms at high mutation rates leads to survival of the flattest.
      Nature, 412(6844):331–333, 2001.
[190] Jon F Wilkins, Peter T McHale, Joshua Gervin, and Arthur D Lander. Survival
      of the curviest: Noise-driven selection for synergistic epistasis. PLoS Genetics,
      12(4):e1006003, 2016.
[191] John W Wireman and Martin Dworkin. Developmentally induced autolysis during
      fruiting body formation by Myxococcus xanthus. Journal of Bacteriology, 129:796–802,
      1977.
[192] Simon N Wood. Fast stable restricted maximum likelihood and marginal likelihood
      estimation of semiparametric generalized linear models. J. R. Statist. Soc. B, 73:3–36,
      2010.
[193] Samuel S. Wu and Dale Kaiser. Genetic and functional evidence that type iv pili are
      required for social gliding motility in Myxococcus xanthus. Molecular Microbiology,
      18(3):547–558, 1995.
[194] Chunyan Xie, Haiyang Zhang, Lawrence J. Shimkets, and Oleg A. Igoshin. Statistical
      image analysis reveals features affecting fates of Myxococcus xanthus developmental
      aggregates. Proceedings of the National Academy of Sciences of the United States of
      America, 108:5915–5920, 4 2011.
[195] Ri-Sheng Yang and Yi-Tao Chen. Flagellation of shewanella oneidensis impacts bac-
      terial fitness in different environments. Current Microbiology, 77(8):1790–1799, Aug
      2020.
[196] Zhaomin Yang and Penelope Higgs, editors. Myxobacteria: Genomics, Cellular and
      Molecular Biology. Caister Academic Press, 2014.
[197] Xiao Yi and Antony M Dean. Phenotypic plasticity as an adaptation to a functional
      trade-off. eLife, 5, October 2016.
[198] Kevin D. Young. The selective value of bacterial shape. Microbiology and Molecular
      Biology Reviews, 70(3):660–703, 2006.
[199] Haiyang Zhang, Stuart Angus, Michael Tran, Chunyan Xie, Oleg A. Igoshin, and
      Roy D. Welch. Quantifying aggregation dynamics during Myxococcus xanthus devel-
      opment. Journal of Bacteriology, 193:5164–5170, 10 2011.
[200] Haiyang Zhang, Zalman Vaksman, Douglas B. Litwin, Peng Shi, Heidi B. Kaplan, and
      Oleg A. Igoshin. The mechanistic basis of Myxococcus xanthus rippling behavior and
      its physiological role during predation. PLoS Computational Biology, 8, 9 2012.
[201] Zhaoyang Zhang, Christopher R Cotter, Zhe Lyu, Lawrence J Shimkets, and Oleg A
      Igoshin. Data-driven models reveal mutant cell behaviors important for myxobacterial
      aggregation. mSystems, 5, 2020.
                                             121


[202] Shiwei Zhu, Seiji Kojima, and Michio Homma. Structure, gene regulation and envi-
      ronmental response of flagella in vibrio. Frontiers in Microbiology, 4:410, 2013.
                                             122