EVALUATING NITROGEN AND PHOSPHORUS IMPACTS WITHIN WATERSHEDS OF THE GREAT LAKES BASIN By Bailey A. Hannah A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Environmental Geosciences—Master of Science 2022 ABSTRACT The Great Lakes and the streams draining to them provide an abundance of ecosystem services, including habitat, water resources, and recreational opportunities. The success and wellbeing of these water bodies are impacted by a variety of factors, including invasive species and septic systems. Along the shoreline of the Great Lakes, invasive species, such as Phragmites and Typha, are a major concern to the coastal wetlands. Within the inland river systems, near- shore septic systems can create elevated levels of nutrients that can have a collection of negative impacts. Both of these threats ultimately relate back to the presence and application of nutrients such as nitrogen and phosphorus. We first address the landscape conditions that allow for coastal wetland invasion. Using machine learning algorithms, we were able to quantify relationships between the presence of invasive species in coastland wetlands, and a variety of landscape scale variables – primarily the nutrient loads of nitrogen and phosphorus. We determined that high invasion is most strongly associated with nitrogen loading above 118 kg/ha/yr within the watersheds derived from the invaded wetlands. We then address how septic systems could be contributing to nutrient loads within the Manistee and Au Sable Rivers of Michigan. We modeled groundwater flow and the transport of nutrients to assess how competently septic systems are retaining nutrients. On average, septic systems allow 88% of introduced nitrogen, and 49% of phosphorus, to enter groundwater. These findings will inform watershed management and provide a better understanding of the effectiveness of septic systems. ACKNOWLEDGMENTS I could not have completed graduate school without the support of those closest to me. My family has supported me along every step of my academic journey. I cannot imagine progressing as far as I have without their unyielding love and backing. To my mom, you are the best person I know. You would do anything in world for me. Your kindness, love, and willingness to act throughout this rollercoaster of a graduate school experience has been paramount to my success. Thank you for everything and all you continue to do. I love you to the moon and back. To my dad, you have been an excellent role model in drive, persistence, and work ethic. I truly appreciate all you have done for me growing up, and the support you continue to provide. Thank you and I love you so much. To Mitch, thank you for taking in an interest in my research and being a sounding board for my grad school gripes. To Trevor, thank you for always checking in and being good for a smile and a laugh. To both of my brothers, I cherish the bond that we have and love you both so much. To Rocky, I could not have found a better person to go through life with. Thank you for being there for the highs and the lows of the past few years, and never failing to provide a sense of levity. I love you so much. All of you have aided in the completion of this degree and for that (and many other things), I am forever grateful. I would also like to acknowledge the Michigan State Hydrogeology Lab. Your combined knowledge, insight, and comradery has been irreplaceable. Thank you especially to Brent Heerspink, Quercus Hamlin, Luwen Wan, and Chanse Ford for holding down the boat room, always lending an ear, making the long hours worth it, and endeavoring to cross the street to our favorite watering hole. Finally, thank you to my advisors and committee members. Your experience and expertise have been inimitable. Additionally, thank you to my funding sources, including the iii Department of Earth and Environmental Sciences and the College of Natural Sciences at Michigan State as well as grant funding through the National Aeronautics and Space Administration. iv TABLE OF CONTENTS LIST OF ABBREVIATIONS ........................................................................................................ vi CHAPTER 1: QUANTIFYING LINKAGES BETWEEN WATERSHED FACTORS AND COASTAL WETLAND PLANT INVASION IN THE US GREAT LAKES ............................... 1 1.1. Abstract ................................................................................................................................ 1 1.2. Introduction .......................................................................................................................... 2 1.3. Methods ............................................................................................................................... 6 1.4. Results ................................................................................................................................ 18 1.5. Discussion .......................................................................................................................... 23 1.6. Conclusions ........................................................................................................................ 32 1.7. Acknowledgments ............................................................................................................. 35 BIBLIOGRAPHY ......................................................................................................................... 36 APPENDIX A: CHAPTER 1 DATA ........................................................................................... 42 CHAPTER 2: EVALUATING SEPTIC SYSTEM INFLUENCE ON NUTRIENTS WITHIN THE MANISTEE AND AU SABLE RIVER WATERSHEDS .................................................. 48 2.1. Abstract .............................................................................................................................. 48 2.2. Introduction ........................................................................................................................ 49 2.3. Methods ............................................................................................................................. 52 2.4. Results and Discussion ...................................................................................................... 68 2.5. Conclusions ........................................................................................................................ 89 2.6. Acknowledgments ............................................................................................................. 92 BIBLIOGRAPHY ......................................................................................................................... 93 APPENDIX B: CHAPTER 2 DATA ............................................................................................ 96 v LIST OF ABBREVIATIONS BRT Boosted Regression Trees Ca Calcium CART Classification and Regression Trees Cl Chloride DOP Dissolved Organic Phosphorus EPA Environmental Protection Agency GLB Great Lakes Basin HK Hydraulic Conductivity kg/ha/yr Kilograms per Hectares per Year km Kilometer LHM Landscape Hydrology Model LULC Land Use and Land Cover Mg Magnesium mg/L Milligrams per Liter mg/sec Milligrams per Second MODFLOW Modular Finite-Difference Flow MT3D Modular Transport, 3-Dimensional N Nitrogen Na Sodium NED National Elevation Database NLCD National Landcover Database NPOC Non-Purgeable Organic Carbon vi P Phosphorus PDP Partial Dependence Plots RIR Relative Influence Rating RMSE Root Mean Square Error SENSEMap Spatially Explicit Nutrient Source Estimate Map SO4 Sulfate SPARROW Spatially Referenced Regressions on Watershed Attribute SRP Soluble Reactive Phosphorus SSURGO Soil Survey Geographic Database TDN Total Dissolved Nitrogen TN Total Nitrogen TP Total Phosphorus µg/L Micrograms per Liter USDA United States Department of Agriculture VANI Vertical Anisotropy VE Variation Explained vii CHAPTER 1: QUANTIFYING LINKAGES BETWEEN WATERSHED FACTORS AND COASTAL WETLAND PLANT INVASION IN THE US GREAT LAKES 1.1. Abstract Freshwater coastal wetlands provide numerous ecosystem services, including habitat, nutrient uptake, coastal stabilization, and aesthetic value, but the integrity of these ecosystems is threatened by invasion of non-native competitors. Invasive species, such as Phragmites and Typha, are a concern in these wetlands, as they can dominate and outcompete native species. This work sets out to understand the conditions that allow invasive species to dominate. This will allow for better management of landscapes and wetlands. We bring together two datasets to relate landscape conditions to coastal wetland invasion: 1) a spatially explicit map of nutrient inputs (SENSEmap) across the US Great Lakes Basin, and 2) a satellite land use map that includes explicit classifications of wetlands. Using machine learning algorithms, we quantified correlations between wetland plant invasion along the coastline to nutrient loads (both N and P) and other landscape scale variables (hydraulic conductivity, slope, imperviousness, land use, and land cover) across multiple influence zones. We find that high invasion is typically associated with nitrogen loading above 118 kg/ha/yr within the watersheds of the invaded wetlands. Forest cover of <27% is associated with high invasion. Conversely, nearshore slope of >2.6% and phosphorus loads <2.8 kg/ha/yr are associated with low invasion. Through N:P ratios, phosphorus was further identified as important. Overall, areas more anthropogenically impacted were more associated with invasion. We conclude that high nitrogen and low forest cover are correlated with invasion. These conclusions will inform management, as well as future efforts to identify linkages between landscapes and coastal invasion. 1 1.2. Introduction Although wetlands only make up a small proportion (<6%) of global land cover, 24% of the world’s invasive plants are wetland species (Zedler & Kercher 2004). Specifically, invasive species pose a significant threat to coastal wetlands, as they can grow in dense, monolithic stands, outcompeting native species and altering wetland vegetation structure (Marks et al. 1994; Zedler & Kercher 2004). Rapid expansion of invasive species such as Phragmites australis and Typha x glauca is occurring across North America (Tulbure et al. 2007). The Laurentian Great Lakes and their associated wetlands and waterways are at increasing risk of invasion by these species due to high exposure to conditions that promote successful invasion such as anthropogenic nutrient applications to the landscape, typically as fertilizer and other non-point sources, and land use change from natural to agricultural or urban landscapes (Danz et al. 2007; King et al. 2007). As invasion increases, the ecosystem services these wetlands provide are at risk, threatening habitat, water filtration, biodiversity, and recreation. The characteristics that make these species successful invaders also threaten natural ecosystem function. Phragmites is a hollow grass that can grow upwards of two meters. It grows in a wide range of conditions, from alkaline or brackish waters to freshwater ecosystems, making it a strong competitor in most environments (Marks et al. 1994). It has become a dominant species in wetlands, due to its ability to rapidly reproduce and thrive in high nitrogen environments; as a result, it has the propensity to form dense monolithic stands, aided by its opportunist nature to take advantage of canopy gaps (Marks et al. 1994; Zedler & Kercher 2004; Mozdzer & Zieman 2010). Typha x glauca is an invasive cattail, formed as a hybrid of native T. latifolia and exotic T. angustifolia. This species spreads rapidly and reduces native biodiversity by also forming monocultures (Tilman 1990; Angeloni et al. 2006; Geddes et al. 2014). Typha is 2 also opportunistic and thrives in increased water depth as well as rapidly flowing water (Zedler & Kercher 2004). Both species leave behind copious litter after senescence, making a thick thatch that challenges native plants’ reestablishment (Minchinton et al. 2006; Farrer & Goldberg 2009; Vaccaro et al. 2009; Larkin et al. 2012). Invasion by these species threatens wildlife. By altering the structure and function of integral ecosystems and reducing plant biodiversity, nutrient cycles are altered and can cause native species extirpation or even extinction (Marks et al. 1994; Pimentel et al. 2000; Zedler & Kercher 2004; Farrer & Goldberg 2009; Larkin et al. 2012; Geddes et al. 2014). Wetlands have been studied to examine how invasion is facilitated, focusing on the role of nutrients, land cover, and disturbance. It is relatively well accepted that nitrogen levels play a role in facilitating plant invasions (Knops et al. 1999). Invasive plants reach optimal size, have better success, and are harder to manage in areas that have high nitrogen supply (Ehrenfeld 2003; Minchinton & Bertness 2003; Elgersma et al. 2017; Goldberg et al. 2017; Uddin et al. 2018). At the site scale, a nitrogen influx near 10 g/m2 considerably increased net primary productivity of both Phragmites and Typha, compared to those treated with lower nitrogen concentrations (Martina et al. 2016; Goldberg et al. 2017). Prior studies by Rickey & Anderson (2004) and Kettenring et al. (2011) further corroborate this point by indicating that higher nitrogen loads result in greater biomass of Phragmites. Other studies have shown a link between different types of land use and land cover (LULC) and plant invasion. Forested land tends to correlate with lower amounts of invasion, while higher proportions of urban or agricultural land were strongly associated with higher invasion (Danz et al. 2007; King et al. 2007; Trebitz & Taylor 2007). Specifically, shoreline armoring creates a disturbed coastline that can allow for increased invasion (Silliman & Bertness 2004; Sciance et al. 2016). Combining altered LULC with 3 increased nitrogen loads can compound their individual effects, resulting in more highly invaded wetlands (Silliman & Bertness 2004). Additionally, the disturbance of a wetland ecosystem, through water level changes or other direct intervention, often reduces biodiversity, and provides an opening for invasive species (Keddy & Reznicek 1984; Rapport & Whitford 1999; Chambers et al. 1999; Herrick & Wolf 2005; Lishawa et al. 2010). When water levels in the Great Lakes fall, it exposes lakebeds that are not currently inhabited by wetland plants. This provides an opportunity for invasive species, such as Phragmites and Typha, to expand into new areas (Tulbure et al. 2010; Wilcox 2012). Overall, these changes in a wetland, such as increased nitrogen loads, changed land cover, or disturbance, can allow invasive species to propagate. The bulk of the research dedicated to wetland invasion has been carried out on a site- specific scale. Typically, plots or transects were taken and evaluated over time to measure the change in invasive area (Silliman & Bertness 2004; King et al. 2007; Tulbure & Johnston 2010). A multiple-regression model was used to connect expanding Phragmites patches to increased nitrogen availability and soil salinity in Narragansett Bay, Rhode Island (Silliman & Bertness 2004). Classification and regression trees (CART) have also been used to evaluate specific wetland transects. The transects nearest a highly developed area, resulted in the highest occurrence of Phragmites within the study region (King et al. 2007). Mixed models have also been used to explore relationships between the change in Phragmites cover and various wetland characteristics at specific sites (Tulbure et al. 2007). While many studies have investigated small-scale cases, far fewer have analyzed large coastal wetland regions. Building upon work by Richards et al. (1996), Johnson et al. (2010) used statistical analyses and models to assess how landscape scale processes are connected to wetlands and lakes. Sciance et al. (2016) detailed a similar study in the Chesapeake Bay that 4 utilized regression trees to show the relationship between agriculture and other variables and the presence of Phragmites. Mazur et al. (2014) used coastal wetland land use data from Bourgeau- Chavez et al. (2013) to link landscape characteristics within 10 km of the coastline to Phragmites invasion within the US Great Lakes. Using the statistical method boosted regression trees (BRT) and in stream nutrient data from SPARROW (Spatially Referenced Regressions on Watershed Attributes), they found that minimal topographic relief, proximity to urban centers, and poorly drained soils were most closely associated with Phragmites expansion. Coastal wetlands are unique in that they receive waters from three distinct sources: 1) local groundwater, streams, and runoff draining directly to the wetlands, 2) larger stream systems emptying into adjacent littoral waters draining significant watershed areas, and 3) lake/ocean pelagic waters mixed with those and other littoral water sources. Essentially, Mazur et al. (2014) included primarily the first, local source of water via the 10 km buffer selected for their analysis. Here, we develop a framework to address multiple sources of water and nutrients to coastal wetlands from the landscape, explicitly and separately including both local sources and those from nearby 2nd-or-higher order streams. We further expand upon the work of Mazur et al. (2014) by incorporating high-resolution, spatially-explicit maps of N and P inputs to the landscape (SENSEmap, Hamlin et al. 2020), and incorporating several different landscape characteristics, including imperviousness, land use, and land cover. Additionally, our invasion variable is treated as a percent, rather than just presence/absence and includes Typha, rather than just Phragmites. Finally, we utilize an additional statistical analysis, CART, in our study. These differences propagate new and interesting results. This study links landscape variables over multiple spatial scales to coastal wetland invasion for the 7807 km length of the United States Great Lakes Basin shoreline. For this 5 analysis, we define a framework of coastline segments built around the mouths of 2nd-or-higher order streams, and quantify wetland invasion within those coastline segments detailed at 12.5 m resolution via remote sensing (Bourgeau-Chavez et al. 2015). We then link coastline invasion to landscape characteristics using two tree-based machine learning algorithms: classification and regression trees (CART) and boosted regression trees (BRT). This analysis reveals the landscape conditions that promote invasion, including most prominently, nutrient loads from a new, 30 m resolution source- and spatially-explicit map of landscape nutrient inputs (SENSEmap, Hamlin et al. 2020). 1.3. Methods 1.3.1 Study Area This study focuses on the coastal wetlands of The North American Laurentian Great Lakes Basin (hereafter GLB) (Figure 1.1). It consists of five lake sub-basins covering over 580,000 square kilometers throughout Canada and the United States draining to Lakes Superior, Michigan, Huron, Erie, and Ontario. Due to data availability, this study focuses on the United States portion of the basin, primarily between the 40th and 48th parallels and the 75th and 93rd meridians. It encompasses portions of Illinois, Indiana, Michigan, Minnesota, New York, Pennsylvania, Ohio, and Wisconsin. Climate within the basin is primarily humid (Koeppen- Geiger classes Dfa and Dfb, hot- and warm-summer humid continental climates, respectively), with four distinct seasons. While the findings of this study are applicable to other freshwater coastal wetlands, this region was selected based on the ecological significance of the Great Lakes. The GLB is one of the most heavily invaded regions in the world (Ricciardi & MacIsaac 2000). Additionally, the Great Lakes are one of the world's largest freshwater resources and possess the longest freshwater coastline. The lakes provide habitat for organisms, recreational 6 opportunities, hydrologic retention, shoreline protection, sediment trapping, and nutrient cycling (US EPA 2019). Figure 1.1. USGLB Site Map. This map shows the United States Great Lakes Basin (USGLB), with each of the five Great Lakes labeled. The inset map in the lower right shows the continental United States, with the USGLB colored for reference. Additionally, the primary map shows the simplified 10 km shoreline land use data highlighting invasion classes, adapted from (Bourgeau- Chavez et al. 2015), and the outline of the USGLB as a black line. The simplification of classes groups all agricultural sub-groups together, as well as grouping forest sub-groups and wetland sub-groups (see Table A.1.1). The inset maps highlight, clockwise from left, a) a remote area along Lake Superior in the western portion of Michigan’s Upper Peninsula, b) an agriculturally dominated area in Michigan’s Lower Peninsula along Lake Huron, and c) a mainly urban region north of the city of Detroit. 1.3.2 Data Sources This study synthesizes information from two large-scale, high-resolution spatial datasets. First, Bourgeau-Chavez et al. (2015) created a land cover product that identifies specific invasive species along with other wetland land cover classes at a fine 12.5 m resolution. Second, Hamlin 7 et al. (2020), created SENSEmap, which estimates nitrogen and phosphorus inputs to the landscape at 30 m resolution. These datasets have not yet been utilized in combination, and doing so creates a novel opportunity to link landscape characteristics to coastal wetland invasion. The Bourgeau-Chavez et al. (2015) wetland land cover map was created from Landsat and PALSAR data, and represents ca. 2010 conditions along the entire GLB coastline (2015). The 12.5-m resolution data classifies the land cover into several specific categories, consolidated into Anderson level 1 classes (Anderson et al. 1976) (see Table A.1.1), and breaks wetlands into specific invasive species classifications, including Typha and Phragmites. The mapped area stretches inland 10 km from the coastline, and towards the lake, to capture potential nearshore LULC trends. The overall accuracy of the data is 94%, while each lake individually ranges from 86 – 96% accuracy (Bourgeau-Chavez et al. 2015). From this data, invasion patterns can be mapped (Figure 1.1). Visually, land cover and invasion appear to be correlated, with agricultural (Figure 1.1.B) or urban (Figure 1.1.C) dominated landscapes more commonly associated with invasion than forested and wetland classes (Figure 1.1.A). Previous studies have shown that higher nutrient concentrations may enhance the success of plant invasions (Knops et al. 1999; Lougheed et al. 2001; Elgersma et al. 2017; Goldberg et al. 2017; Uddin et al. 2018). To assess the relationship between nutrients and invasion within the Great Lakes, we use recent work that estimates the input of nutrients across the landscape: Spatially Explicit Nutrient Source Estimate Map (SENSEmap, Hamlin et al. 2020). SENSEmap uses GIS, remote sensing, and county-level datasets to describe the distribution of seven nitrogen sources and six phosphorus sources across the USGLB landscape at 30 m resolution. The sources include atmospheric deposition, septic systems, chemical non-agricultural fertilizer, chemical agricultural fertilizer, manure, nitrogen fixation from legumes, and point sources (Figure 1.2). 8 Figure 1.2. SENSEMap nutrient non-point sources. SENSEmap total non-point source maps for a) nitrogen, and b) phosphorus (modified from Hamlin et al. 2020). The color scale is divided into quantiles of input rates. Note, these maps exclude point sources, which are applied as a separate mass flux to streams rather than to the landscape. In addition to these key datasets, we used LULC, slope, imperviousness, and hydraulic conductivity data. The regional LULC data comes from the National Landcover Database (NLCD 2011), a 30 meter resolution dataset, and was utilized with simplified Anderson Level 1 classes: forest, agriculture, urban, range, water, wetland, barren, and unclassified (Anderson et al. 1976). The imperviousness dataset is also an NLCD product, with percent impervious values for each cell. We calculated percent slope from the 1 arc-second National Elevation Database (NED 2012), roughly a 10 m resolution product. 9 Finally, we calculated vertical hydraulic conductivity for the first 25 centimeters of the soil profile at a 30 m resolution in a three step process. First, we computed %sand and %clay textures from the Gridded Soil Survey Geographic Database (gSSURGO 2016), along with tabular data from the SSURGO database (SSURGO 2016). To estimate textural values for those horizons missing data, we then classified all text descriptors into one of the 13 USDA soil textural classes, and used the midpoints %sand, %clay values of each used these classes. Second, we used these %sand and %clay values to lookup hydraulic properties for all horizons from the ROSETTA database (Rosetta Lite version 1.1; Schaap et al. 2001). Third, we combined horizons for each component of soil mapping units into four standard horizons with bottoms at 25, 50, 100, and 200 cm. For vertical saturated conductivity, we computed the harmonic average of all original horizons within the standard horizons. These standard horizons for each component were then averaged across each mapping unit, weighting by the individual component fractions. 1.3.3 Spatial Methods Our spatial methodology consisted of quantifying the invaded proportion of coastal wetlands within distinct shoreline segments, and computing landscape characteristics within watersheds that relate to those segments. To understand the connections between the landscape and coastal wetlands, we applied three spatial analyses: 1) we divided the US GLB coastline into distinct, continuous segments, 2) we assessed several buffer widths to best quantify coastal wetlands for this study, and 3) we computed four zones of influence to better understand the origin of the source waters and the scale of landscape characteristics that best relate to coastal wetland invasion. First, we divided the 7807 km coastline into 737 segments based on a modified method from Danz et al. (2007). This approach develops unique coastline segments along the USGLB 10 via a two step process: 1) identify the mouth of each 2nd-or-higher order stream (NHDPlusV2 2012) along the coastline of interest, and 2) create segments inclusive of each stream outlet point, separated at the midpoint between adjacent outlets. Hollenhorst et al. (2007) used a similar technique to delineate watersheds. Here, we apply this method, but modify the technique used to separate adjacent segments. After identifying all outlet locations, we compute watersheds for each outlet, using the fill-flow direction-flow accumulation-snap points-watershed workflow in ArcGIS. We then calculated Voronoi polygons from the watersheds (using the nibble function in Arc 10.4.1), and intersected them with the coastline. Voronoi polygons are created by partitioning an area into convex hull parcels, based on proximity to a given point. In this case, we partitioned the land not currently included in a watershed via this method. This associates each coastline section with the nearest watershed drainage basin, rather than the nearest outlet point as in Danz et al. (2007). Following segmentation of the coastline, the next steps were to determine which wetlands would be considered coastal and to quantify invasion within those wetlands. While there are a variety of different classifications for coastal wetlands (USDA 2018), we opted for the straightforward approach of including all wetlands within a buffer distance from the coastline. Ecological arguments could be cast for a variety of different buffer widths, so we opted not to select a single width a priori and generated a series of buffer widths to quantify coastline invasion. We selected a sequence of successively larger multiples of our native LULC data resolution to serve as buffered coastline widths: two-way buffers of 60, 120, 240, and 480 m. Within each coastline segment, and for each buffer width, we then computed the invaded proportion of coastal wetlands using the Bourgeau-Chavez et al. (2015) data: (𝐴𝑟𝑒𝑎 𝑜𝑓 𝑃ℎ𝑟𝑎𝑔𝑚𝑖𝑡𝑒𝑠)+(𝐴𝑟𝑒𝑎 𝑜𝑓 𝑇𝑦𝑝ℎ𝑎) 𝐼𝑛𝑣𝑎𝑑𝑒𝑑 𝑃𝑜𝑟𝑡𝑖𝑜𝑛 = 𝑇𝑜𝑡𝑎𝑙 𝑊𝑒𝑡𝑙𝑎𝑛𝑑 𝐴𝑟𝑒𝑎 (Eq. 1.1) 11 When calculating the invaded wetland proportion, it is important to note that the native species of Typha and Phragmites are not distinguished from the much more prevalent invasive forms. In this study, we feel confident that this has a negligible effect on our results, given that the natives are much less common in this region and they do not form the dense monocultures measured through the imagery analyzed in this study. This is supported by data collected from 1751 field sites by Bourgeau-Chavez et al. (2013), which used multi-level ground truthing to ensure accurate characterization of wetland plants. A final best-fit coastal buffer width was then selected based on the CART analysis results, as described below. Based on the aforementioned coastal wetland water sources, we identified three distinct zones of influence that may be linked to coastal wetland invasion: 1) the land draining directly via groundwater, overland flow, or first order streams to the coastline (hereafter “direct zone”), 2) the riparian lands lining the stream network of the 2nd-or-higher order stream associated with each coastline segment (hereafter “riparian zone”), and 3) the complete watershed draining to the outlet point of each 2nd-or-higher order stream network (hereafter “stream watershed”). Also, a fourth zone incorporating all three areas was computed, effectively as a union of the direct zone and stream watershed (hereafter “full watershed”). Watersheds for the full and stream scales had been generated previously as part of the coastline segmentation process: the stream watershed being the land area draining to each stream mouth, and the full watershed being the land area nearest to each stream watershed. We then computed the direct zone as the exclusive union of the full and stream watersheds. Finally, we created a 60 m two-sided buffer (i.e. 120 meters total width) of the stream network and intersected that with the stream watersheds to create our riparian zones. Figure 1.3 shows how these four zones of influence might appear on an idealized landscape. 12 Figure 1.3. Coastal wetland zones of influence. This diagram shows a generic coastline segment and its associated zones of influence. The yellow band represents the wetland segment of interest. The entirety of the dark green shading is the associated full watershed, while the cross hatching represents the stream watershed, and the dots the direct zone. The riparian zone is shown as the blue buffer surrounding the corresponding stream. 13 From these zones, we extracted landscape characteristic variables to serve as the driver variables in our statistical analyses. The driver variables for this study include: nitrogen and phosphorus landscape inputs, proportion of LULC types, soil hydraulic conductivity, slope, and imperviousness. Each of these variables was summarized for all four zones of influence (full, stream, direct, and riparian). We considered different summary statistics depending on the data type. Numerical data were summarized by the mean value within each watershed (Table 1). Categorical data, such as LULC, were summarized by the percent of total area within each zone. Additional variables, such as nitrogen and phosphorus ratios and total wetland area within a segment, were also included. Through this process, 65 driver variables were input into the statistical analysis (summarized in Table A.1.2). Table 1.1. Summary of driver variables. The main driver variables, how they are summarized, and the data source are displayed. For more detail and summary statistics, see Table A.1.2. Variable Spatial Analysis Source N and P Inputs Means, Ratios SENSEmap1 LULC % Total Area NLCD2 Hydraulic Conductivity Mean SSURGO3 Slope Mean NED4 Imperviousness Mean NLCD2 1 (Hamlin et al. 2020); 2 (NLCD 2011); 3 (SSURGO, USDA); 4 (NED 2014) 1.3.4 Statistical Methods We used two tree-based classification and regression methods to statistically link landscape scale variables to invasion percentages within the coastline segments: Classification and Regression Trees (CART), and Boosted Regression Trees (BRT). These methods are particularly well suited to analyze complex ecological data because they are robust to many predictor types, and are able to consider hundreds of driver variables in relation to a single 14 dependent variable (Breiman et al. 1984). In addition, they have low susceptibility to overfitting, and where there is the potential for such statistical error, the methods have corrections, such as pruning or stopping criterion (De’ath & Fabricius 2000). Additionally, the methods are non- linear and can consider the variables in relation to each other as well as to the dependent variable (Breiman et al. 1984). Finally, they do not assume normality of predictor variables, nor are they susceptible to outliers, allowing data to be used without the need to transform the values (Sutton 2005). Using the two methods in conjunction can improve predictive power and understanding (Erdal & Karakurt 2013). CART explains variations within a response variable based on recursive binary splits in multiple drivers (De’ath & Fabricius 2000). The response variable in this case is invasion proportion, defined in Equation 1. CART operates by splitting data into two mutually exclusive groups, based on a driver variable value that accounts for the most variation within each group (King et al. 2007). The method seeks to account for the most variance within the response variables, without creating too large of a tree. In this study, CART was used to compare each segment’s invaded proportion with associated driver variables, and identified those with the highest explanatory power. These relationships act as splits that create subgroups of segments and form a decision tree. CART assumes that the response variable is normally distributed. Here, our response variable is invasion proportion, and is highly skewed to the left, as most segments have a relatively low invasion proportion. We thus applied a logit transformation to the invasion proportion values of the form 𝑥′ = 𝑙𝑜𝑔((𝑥 + 𝜖)/(1 − 𝑥 + 𝜖)) (Eq. 1.2) where x is invasion proportion in each segment, and ϵ is a small value used to assure that 15 transformed values would remain non-infinite. To select ϵ, we used the smallest non-zero value of invasion proportion (0.001). The logit transformation was selected because it supersedes the outdated arcsine method that was commonly used in ecological studies (Douglas & Matthews 1992; Passy 2007). Specifically, logit accounts for additional unexplained variation that a typical logistic method may miss (Warton & Hui 2011). As CART does not assume independent variable normality, these variables were left untransformed. A regression tree was created for each of the four buffered segment widths. Each tree was created with 10-fold cross validation, allowing the data to be used for both training and validation to create a thorough and consistent tree. Each tree grew until the nodes explained less than 1% of the variation. We then selected the buffer scale with the highest overall CART model performance, as an indicator of the extent of coastal wetlands most linked to landscape inputs from the adjacent watersheds. To reduce the tendency of CART to overfit the data, we pruned the original tree by removing insignificant portions that provide little explanatory power (De’ath 2002; Verhougstraete et al. 2015). We utilized a pruning step based on subsequent variation explained. All splits that did not account for at least 3% of the variation were pruned, resulting in a smaller, more parsimonious tree. Our pruned tree still accounted for the majority of the variation (hereafter, “variation explained”; VE). VE quantifies the amount of variance within the relationships that the tree explains, similar to a traditional coefficient of determination (R2) for a linear model. CART computes a split-specific VE, which are then summed down the tree to compute the overall VE. This VE metric was then used to select our best-fit coastline segment buffer width i.e. that most correlated to our landscape drivers. While CART selects only a single driver variable at each node of the tree, it computes 16 splits for all driver variables, some of which may have VE values nearly as high as the best variable. These alternate splits can be grouped into two categories: surrogates or competitors. A surrogate is a variable that splits the response variable into two very similar groups to that of the primary splitter (Martin et al. 2011). Thus a surrogate would split the group in a very similar way, but account for slightly less variation. On the other hand, a competitor variable accounts for nearly the same amount of variation as the primary splitter, but splits the response variable group in an entirely different manner. Both alternatives can be used to examine what the underlying processes of a system may be. Boosted regression trees (BRT) were also used to further verify the findings of CART. BRT combines regression trees and a boosting algorithm that incorporates several models to improve the predictive power of the analysis (Quinlan 1996; Friedman et al. 2000). The final non-parametric model involves a cumulative regression that is created as the analysis recursively generates trees and calculates residuals (Elith et al. 2008). The major benefits of this method are that they can represent complicated nonlinear relationships, and can help interpret interactions between driver variables (Elith et al. 2008). Driver variables are ranked based on their relative influence rating (RIR), which describes how strongly they affect the invasion percent. The RIR values assigned add up to 100%, so they provide a proportional measure of driver variable importance. For this study, we iterated the regression over 1000 trees and required a minimum of fifteen observations per node. We use partial dependence plots (PDP) to analyze relationships that BRT identified. The plots show the marginal effect that a variable has on the output of a machine learning model (Friedman 2000). The dependent variable is plotted against various driver variables to assess a relationship. If the PDP line increases abruptly, it indicates a strong relationship between the two 17 variables. The point at which it increases considerably can be identified as an important threshold within the system. If the PDP line has many steps and plateaus, there is likely a more complex relationship between the two variables. This could indicate that other variables have stronger relationships, or that there are multiple thresholds within that pairing. 1.4. Results 1.4.1 Coastline Segmentation, Invasion, and Zones of Influence The USGLB coastline was divided into 737 distinct coastline segments using the methods described above. The segments themselves varied in length, but had a median value of 5166 meters (Figure A.1.1). If segments were greater than this median length, it was due to a lack of other stream outlets in the area. After calculating invasion percentage within each segment for our four potential buffer widths (60, 120, 240, and 480 meters, two-sided), we selected the 240 m buffer width for the remaining analyses. The CART model using the 240 m buffer width invasion percentages had the greatest explanatory power before pruning, with 76.6% VE, while the other widths, 60, 120, and 480 meters had 68.0%, 71.8%, and 70.8% VE respectively. Overall, each buffer width produced relatively similar trees, however we determined that the 240 m width (one-sided, for a total two-sided width of 480 m) was the most appropriate, given that it accounted for 4.8% more of the variability than the model with the next best VE, and 8.6% more variation than the lowest performing buffer model. The amount of invasion along the shore varies across the basin, as shown in a map of invasion percentage within coastline segments (Figure 1.4.A). There are a considerable number of low invasion segments, approximately 330 have less than 10% invasion (Figure 1.4.B). In contrast, there are only 75 segments with > 50% invasion. While positive ecologically, this 18 significantly skewed dataset required transformation for further analysis. We implemented the logit transform, which created a much more normal distribution (Figure 1.4.B&C). Note that 0% invasion values (represented in the logit transform by the addition of the least significant non- zero value) fall well outside a normal distribution, but following the logit transformation, are significantly separated from other non-zero data values. Figure 1.4. Coastal wetland invasion distribution. a. Map of the non-transformed invasion percentages within buffer segments. Warmer colors indicate higher invasion, cooler colors indicate lesser invasion. Typically, invasion is significantly higher in the southern portion of the basin. Panel b shows the original invasion distribution, while panel c shows the logit transformed invasion distribution. Note that while the distribution in panel b is shown in percentages, the logit transformation was conducted on proportion values. Spatially, low invasion percentages are congregated primarily in the northern portion of the basin, whereas high invasion percentages are concentrated near agricultural or urban centers. This emulates what we observed in the coastal LULC map. Coastal land cover across the 19 northern low-invasion portion of the basin is forested (Figure 1.1A). In the agricultural inset (Figure 1.1B), there is a thick monoculture of Phragmites along the shoreline. In the urban inset (Figure 1.1C), there are monocultures of both Phragmites and Typha. Though invasion seems to correspond well with land use, a linear correlation does not indicate a strong relationship. The highest R2 between land cover and invasion within the buffered segments was 0.36 and occurred between percent forest and percent invasion. This indicates that a purely linear model cannot capture the relationship that is evident visually. Watersheds corresponding to each segment varied in scale and area. As shown in Figure 1.3, for each segment, four zones were created: full, stream, direct, and riparian. Based on the method of derivation, the riparian zones were the smallest, 1,210 ha on average, while the full watersheds were the largest, 39,000 ha on average (Table A.1.2). The stream watersheds are usually more similar in size to the full watershed, while the direct zones tend to be smaller. Area histograms of the four zones are shown in Figure A.1.1. For each zone of influence, we extracted spatial summaries of each landscape driver variable. The maximum hydraulic conductivity was 0.0001 m/s, found within a direct zone. Some watersheds were entirely encompassed by one land cover. Typically, this occurred with either forest or urban cover. Within the four zones of influence, the average nitrogen input ranged from 262.3 (riparian) to 423.3 kg/ha/yr (stream). The average phosphorus input ranged from 30.1 (riparian) to 50.5 kg/ha/yr (stream). The slope of the watersheds was rather flat, with only 3.65% average slope. These summary statistics and more are further described in Table A.1.2. 1.4.2 Model Results The fully-grown CART tree (Figure A.1.2) explained 76.6% of the variation of coastal 20 wetland invasion within the 240 m buffer; after pruning, the final tree had 67.2% VE (Figure 1.5). While the full tree had 11 splits, following pruning, just four splits remained (those with individual VE > 3%): mean nitrogen input in the full watershed (55% VE), mean slope in the direct zone (4.4% VE), forest portion within the full watershed (4.3% VE), and mean phosphorus input in the stream watershed (3.5% VE). While CART was ran with logit-transformed invasion proportions, for clarity, we present the values as non-transformed invasion proportions. Figure 1.5. Pruned CART tree. CART showing the primary explanatory variables that are linked to invasion. The least invasion is on the left (blue group), while most invasion is on the right (red group). Five invasion groups were identified: 1) red - high invasion, 2) orange - mid- high invasion, 3) yellow - mid invasion, 4) green - mid-low invasion, and 5) blue - low invasion. The abbreviation “inv.” of invasion is used due to space constraints. µ indicates average invasion. The pruned tree contained five terminal nodes of segment groups with similar invasion properties: (group 1) high invasion, (group 2) mid-high invasion, (group 3) mid invasion, (group 4) mid-low invasion, and (group 5) low invasion (Figure 1.5). The high invasion group (shown in 21 red, 35.4% invasion in 292 segments) has high nitrogen inputs and low forest area in the full watershed. The mid-high invasion group (shown in orange, 16.3% invasion in 212 segments) has high nitrogen inputs, but high forest area in the full watershed. The mid invasion node (shown in yellow, 5.1% invasion in 80 segments) has low nitrogen inputs and low slope in the full watershed and direct zone, respectively. The mid-low invasion group (shown in green, 3.7% invasion in 38 segments) has low nitrogen in the full watershed, high slope in the direct zone, and high phosphorus in the stream watershed. Finally, the low invasion group (shown in blue, 0.42% invasion in 115 segments) has low nitrogen inputs in the full watershed, high slope in the direct zone, and low phosphorus in the stream watershed. The invasion groups are shown spatially in Figure 1.6. Boosted Regression Trees (BRT) were also used to assess relationships between invasion and the driver variables. The results from this analysis reinforced those from the CART analysis. We found that the mean nitrogen input in the full watershed had the highest relative influence rating (RIR) (21.2), followed by nitrogen inputs within the direct zone (RIR 12.7) (see Table A.1.4 for all RIRs). The relationship between phosphorus and invasion identified in the CART analysis was also noted by BRT. Based on a partial dependence plot (Figure 1.7.B, shown in log scale) of phosphorus inputs in the direct zone (RIR 11.7), even very small phosphorus inputs correlated to a considerable amount of invasion. Other characteristics were also verified through BRT. When the forested area falls below 25%, invasion was significantly higher (Figure 1.7.C). The portion of forested area within the full watershed was relevant in CART and was also considered influential (RIR 2.3) by BRT. Similarly, steeper slopes in the direct zone (RIR 3.76) typically results in lower levels of invasion (Figure 1.7.D), which was also true per CART. 22 1.5. Discussion 1.5.1 CART Analysis Landscape factors were able to describe more than two-thirds of the variation in coastline wetland invasion (67.2% for the pruned tree, and 76.6% for the full-grown tree). The pruned classification and regression tree (CART) identified five groups: group 1 – high invasion, group 2 – mid-high invasion, group 3 – mid invasion, group 4 – mid-low invasion, and group 5 – low invasion (Figure 1.5). The composition of these groups reveals four primary findings: 1) nitrogen inputs greater than 117.8 kg/ha/yr within the full watershed are strongly correlated with higher wetland invasion; 2) low forest cover within the full watershed is linked with higher invasion; 3) high slopes within the direct zone (the contributing area closest to the coast) is related to lower invasion, and; 4) phosphorus inputs greater than 2.8 kg/ha/yr within the stream watershed (the watershed of the river mouth of each coastal segment) are also correlated with higher wetland invasion. First, wetland segments with greater than 117.8 kg/ha/yr of nitrogen inputs to the landscape had higher wetland invasion by Typha and Phragmites. This split alone explained more than half of the variation in coastal wetland invasion, with 55% VE. This positive correlation between nitrogen inputs and invasion is supported by observational studies (Tuchman et al. 2009), experimental studies (Woo & Zedler 2002) and ecosystem modeling work in the Great Lakes (Goldberg et al. 2017; Martina et al. 2016), further supporting a linkage between nitrogen and invasion. These authors showed that greater nitrogen influxes to wetlands correlated with both higher net primary productivity and more complete invasion of non-native Typha and Phragmites. They identified a tipping point at ~10 mg-N/m2/yr of nitrogen influxes, above which Typha and Phragmites were able to completely invade established native ecosystems. 23 While not directly comparable to our value here (they were simulating nitrogen inflows to wetlands vs. our nitrogen inputs to the landscape), it is worth noting that this value is similar to the threshold we identified here (10 mg-N/m2/yr = 100 kg/ha/yr). Identifying thresholds of key nutrients provides a novel predictor for land management; establishing what fluxes of nutrients allow for invasion most commonly can assist in preventing future invasion, and establishing a potential cause of current invasion. Further work will be required to better understand how landscape inputs eventually translate to wetland influxes. Analysis of the competitor splits for this first node revealed that nitrogen inputs within the direct zone and stream watershed have nearly as high of VE, 52.8% and 52.1% respectively, and split on similar thresholds of nitrogen inputs. This agreement between competitors supports that nitrogen inputs are the key variable affecting percent invasion; had other variables appeared as similar-strength competitors, confidence in the primary variable would fall. Furthermore, the similar VE across the three zones of influence reinforces that nitrogen is important no matter where it is being applied. Interestingly, phosphorus input (at 7.5 kg/ha/yr) within the full watershed was a surrogate driver at the first node, and the resultant groups from this surrogate splitter are in 98% agreement with those produced by the primary split. The surrogate relationship indicates that both nitrogen and phosphorus may play an important role in identifying where coastal wetlands are at higher risk of invasion, a novel finding in that phosphorus has not historically been considered as important to plant invasion. Second, lower forest cover (<26.6%) within the full watershed is linked to higher invasion within coastal wetlands. This may be due to several factors, such as: forested areas are typically less anthropogenically influenced, and thus less disturbed; forested areas have lower nutrient inputs, and; forested areas may serve as a nutrient uptake buffer for wetlands. Invasive 24 plants have been found to thrive in disturbed ecosystems, per Rapport and Whitford’s 1999 study of Lakes Erie and Ontario, whereas higher percentages of forest cover throughout the entire GLB is more commonly associated with lower coastal invasion (Danz et al. 2007). This finding supports the idea that watersheds that are less anthropogenically influenced may create less optimal environments for invasion. The surrogate splits also support the linkage between disturbed landscapes and invasion. One of the most similar surrogate splits (81% agreement) was the percent of agricultural cover within the full watershed, which is expected as agriculture and forest are the two most common land covers in this region, thus an increase in one is largely offset by a decrease in the other. Using the surrogate split causes the factors to behave oppositely; rather than being defined by low forest, the highest invasion group (shown in red, Figure 1.4) would have been defined by high (greater than 44.4%) agricultural land. This further supports that land cover is important to invasion, in that low forest cover, or high agriculture cover, is associated with higher invasion within associated coastal wetlands. Third, we found that high slope (>2.6%) within the direct zone, the area nearest the coast, was related to lower invaded wetland proportion. A steeper shoreline may act to reduce the amount of habitat available for invasion when lake levels rise or fall, thus reducing the disturbance of coastal wetlands due to lake level variability. Lake level fluctuations have been shown to be an important factor affecting when and how invasive plants are able to grow and push out native species (Tulbure et al. 2010; Wilcox 2012). While we did not include fluctuations in water levels as a variable in this study, considering the fluctuations’ temporal nature and our study’s spatial emphasis, our variable of slope is indirectly related to water level changes. Steep slopes along the shoreline would expose less invadable lake bed as water levels fell, but gently sloping lake beds would become exposed to a greater degree when water levels 25 fell by the same amount, providing considerable invadable area. It is also possible that slope is acting as a proxy for an underlying variable. The percentage of wetland and the percentage of forest within the direct zone were surrogate splits, each resulting in approximately 80% agreement. Higher forest cover is correlated with lower invasion, while higher wetland cover is linked to greater invasion. This analysis cannot distinguish whether slope is a causal factor (i.e. reducing disturbance) or merely sharing a spatial pattern with wetland and forest cover--or whether both causation and correlation are at work here (likely the case). Finally, our CART analysis showed that phosphorus inputs above 2.8 kg/ha/yr within the stream watershed also appear to be related to higher levels of plant invasions. Others have found that phosphorus is associated with invasion (Hester & Hobbs 1992; Uddin et al. 2018), and is often considered a limiting nutrient in freshwaters, especially when paired with nitrogen (Schindler 1977; Correll 1999; Elser et al. 2007). Our findings reinforce this principle of co- limitation of phosphorus and nitrogen; only at low nitrogen levels did phosphorus appear to limit wetland invasion. The driver variable selected for this split was defined at the stream scale, perhaps implying that phosphorus in wetlands is primarily sourced from surface water, while the first node split of nitrogen at the full watershed may indicate a more diffuse contribution via groundwater. This replicates the expected transport mechanisms of phosphorus and nitrogen, via surface water and groundwater, respectively. Additionally, the nutrients are being sourced from the two largest spatial scales, and the fact that these broader spatial scales are impacting the coastal wetlands, indicates considerable landscape connectivity. The primary surrogate for this node is N:P ratios within the stream watershed, which has over 97% agreement - a result that also supports nutrient co-limitation as playing a role here. This final split shows that segments associated with low nitrogen and low phosphorus incur lower average invasion (0.42%, shown in 26 blue, Figure 1.5). While low nitrogen, but high phosphorus inputs (shown in green, Figure 1.5), result in higher average invasion (3.7%). Both the surrogate split and the N:P proxy we observed in the tree support the expected relationship that lower ratios (higher phosphorus) will likely result in more invasion within the Great Lakes Basin (Lougheed et al. 2001). Additional surrogate and competitor information is summarized in Table A.1.3. While we did identify four primary variables that relate to wetland invasion, there is some level of overlap between them. For example, more forested land contributes fewer nutrients to the landscape, which would then encompass three of our four primary splitting variables. There is some lack of independence between the variables, but CART mitigates the impact of this interdependence with its tree structure. For example, high forest cover relating to lower wetland invasion was only considered important within the group of segments that already had a lower amount of nitrogen within their full watersheds. Thus, rather than interdependence weakening our findings, the fact that these variables build upon each other creates a stronger case to identify nutrients and intensively managed land covers as driving forces behind coastal wetland invasion. 27 Figure 1.6. Spatial distribution of invasion groups. The USGLB with each full watershed colored based on the invasion group of its corresponding segment. Inset in the top right: stacked bar chart showing the percent of each Great Lake’s shoreline that is characterized by each invasion group. Percentages were calculated from Erie’s 119 segments, Huron’s 149 segments, Michigan’s 156 segments, Ontario’s 86 segments, and Superior’s 227 segments. Below, histograms that show the (non logit-transformed) distribution of invasion percentage amongst the segments that fall into each group. The titles of each subplot indicate the mean invasion percent of that group. Note the vertical axes scales differ. Each invasion group has a different statistical distribution of invasion amongst its segments (Figure 1.6). CART is designed to minimize the variance within the response variable (invasion proportion), thus the higher the total VE of the model, the less variation we would expect to see within each group. Invasion group 1 has a nearly normal distribution across the entire range of invasion percentages, indicating that while as a group these segments have a high 28 average invasion, individually they may be relatively uninvaded. Even the unpruned tree (Figure A.1.2) does not split this group further. Thus, the variance within this group may be explained by driver variables not included in the analysis. Invasion groups 2-5 have increasingly left skewed distributions, showing that a higher proportion of low invasion segments are included in each progressive group. Down the invasion group gradient, the range becomes more condensed, concluding with a range of only 0% to 15% invasion within group 5. As is expected, the mid-low and low invasion groups have more low and zero percent invasion segments than the other groups. In most cases, the segments with invasion proportions that do not entirely align with the distribution were very close to being selected into a higher group, but fell slightly below or above the CART-selected splitting value. Analyzing the statistical distribution of each CART node reveals that each group represents a specific category of invasion, but includes individual segments that may vary from the group average. Spatially, the invasion groups mapped onto the corresponding full watershed for each segment provide insight about regions where invasive species are a pressing issue (Figure 1.6). Within the Great Lakes Basin, the wetlands south of the 45th parallel are predominantly classified as high and mid-high invasion. In the northern portion of the basin, heavy invasion is much less common. Not only are there differences across latitudes, but the shoreline of each lake is composed of different invasion groups. The entire US shoreline of Lakes Erie and Ontario have very similar compositions with all areas in the high and mid-high invasion groups, but vary slightly on the frequency of each group. Lakes Huron and Michigan also have a similar distribution, but include a small percentage of mid to mid-low invasion. The shorelines of all four of these lakes are dominated by invasion. Lake Superior is unlike the other four Great Lakes. Its shoreline has much less invaded area, with 50.7% low invasion, 15.9% mid-low 29 invasion, 21.1% mid invasion, and only 12.3% mid-high to high invasion area. Considering that Superior’s segments were defined so differently, we ran an iteration of CART with only Lakes Michigan, Huron, Erie and Ontario. The tree resulted in very similar findings, but identified phosphorus in the full watershed as the first split, which was the primary surrogate for nitrogen in the full watershed in the CART from the entire basin. This may indicate that phosphorus plays a different role in different regions. 1.5.2 BRT Analysis Boosted Regression Trees (BRT) reinforced the findings from the CART analysis. We found that the mean nitrogen input in the full watershed had the highest relative influence rating (RIR), followed by nitrogen inputs within the direct zone. The fact that both nitrogen input metrics were considered most influential shows how strongly correlated nitrogen landscape inputs are to wetland invasion. Additionally, the sources from full watershed and direct zone are also significant. The full watershed is intended to capture groundwater influences from a large spatial area, while the direct zone represents the near shore drivers. The fact that both zones of influence are considered significant indicates that nitrogen is likely contributed by both the larger region and local groundwater. Furthermore, the full watershed encompasses the direct zone, perhaps further indicating the importance of near shore inputs in addition to those sourced from adjacent streams. When shown in a partial dependence plot (hereafter PDP, Figure 1.7.A, shown in log scale), it is evident that as nitrogen inputs increase there is a concomitant increase in percent invasion. There is a noticeable threshold near 100 kg/ha/yr nitrogen input, which supports the threshold of 118 kg/ha/yr defined by CART, and agrees with Martina et al. (2016). The sharp rise in the PDP indicates the strong relationship between nitrogen and invasion. This strong break is indicative of a tipping point for invasion within wetland ecosystems. A more 30 smoothly increasing line would instead support a more linear relationship, or the presence of other complex relationships in determining wetland invasion proportion. Figure 1.7. Partial dependence plots. Partial dependence plots for a. log10 of the mean nitrogen inputs within the full watershed, b. log10 of the mean phosphorus inputs within the direct zone, c. forest proportion within the full watershed, and d. mean slope within the direct zone. Note that the x-axis of plots A and B are in log scale. Both CART and BRT identified phosphorus as a substantial driver of Typha and Phragmites invasion, though at different scales: stream watershed and direct zone, respectively. The stream watershed indicates surface water and the direct zone likely captures direct runoff. The fact that phosphorus applications were specifically found to be important at the stream and direct scales, indicates that this nutrient is likely sourced primarily from the surface. 31 Through both analyses, nitrogen in the full watershed, phosphorus in surface water scales, forest cover in the full watershed, and slope in the direct zone were identified as key variables related to invasion. This consistency, and the fact that the values that were identified were very similar in both analyses, we feel confident that the analyses are identifying the important factors that affect invasion within coastal wetlands. 1.6. Conclusions We identified linkages between landscape scale variables and coastal wetland invasion in the Laurentian Great Lakes, one of the most important freshwater systems in the world. Through this study, we found that high nitrogen inputs within a segment’s watershed had the most significant relationship to percent invasion in the coastal wetlands. Wetlands along coastline segments with landscape nitrogen input levels above 118 kg/ha/yr were an average of 25% more invaded by Phragmites australis and Typha x glauca. We also found that low forest cover, low slope in coastal areas, and phosphorus inputs greater than 2.8 kg/ha/yr tended to be related to wetlands with higher invasion. Insights were also gained about N:P ratios in these ecosystems, and the potential linkage of nutrient co-limitation. Within the decision tree, phosphorus inputs were only considered a significant influence on segments that already had low nitrogen inputs associated with them, but phosphorus inputs could also act as a surrogate for nitrogen in the initial model tree split. This shows that phosphorus may promote invasion when nitrogen levels are either high or low, and should be considered important in managing for invasion. Invasion was lowest within the northern part of the US Great Lakes Basin, specifically along the Lake Superior shoreline. This is likely due to much less anthropogenically disturbed area in this basin, and therefore lower nutrient inputs, relative to the other Great Lakes. Importantly, we identify a threshold of nitrogen and phosphorus inputs where wetland invasion by Phragmites and Typha is 32 more prevalent. Establishing that both of these invaders are more likely to dominate where inputs of nitrogen are greater than 117.8 kg/ha/yr and inputs of phosphorus are greater than 2.8 kg/ha/yr provides important insight for wetland management. Both CART and BRT were used to quantify how landscape drivers relate to coastal wetland invasion. CART was used as the primary analysis, which helped develop a preliminary understanding of relationships between variables. The decision tree clearly identifies nitrogen inputs as being strongly related to invasion, alone explaining over half of the variation in the dataset. BRT offered a more detailed, conformational analysis with continuous outputs closer to that of a predictive model, and provided specific insights into each driver variable through variable relative influence scores. The results of both methods were consistent, reinforcing our findings. Uncertainties in our driver variables may affect some of the finer details of the model outcomes. However, we assert that the findings are robust due to the large spatial extent of this study. The region includes many different watersheds corresponding to segments with similar ranges of invasion across the study domain (Figure 1.5). This provides the statistical power to average out random sources of error in the landscape variables. Given the nature of the SENSEmap dataset, we posit that random errors likely dominate the SENSEmap product, as the quantities of each nutrient source are in general well constrained at the county-level (Hamlin et al., 2020). The NLCD LULC and imperviousness datasets are well validated, with reported accuracies typically exceeding 85%, which is particularly true for the Great Lakes region (Homer et al., 2020; Wickham et al., 2017; Wickham et al., 2013). While it is difficult to assess accuracies of slope calculated from a DEM, accuracies of the NED elevations themselves are high, and in particularly are greater than for other comparable elevation products (Gesch et al., 33 2014). The availability of higher resolution data products will further improve analyses such as these in the future. The approach taken in this study can be readily applied to analyses of other wetland invaders or to invasion of other ecosystem types (i.e. forests). This work provides a basis for improved basin management to reduce the risk of shoreline invasion. By identifying areas that might be approaching the tipping point of nitrogen inputs, targeted mitigation practices could cause an important shift in invasion trajectory (Elgersma et al. 2017). Additionally, our methods could be applied to other landscapes, such as forests. Potential future work using this framework could focus on early or late leafing shrubs that would be able to be identified via remote sensing, or invasive plants that would be more shielded by trees could be assessed in the field and evaluated in a similar statistical manner. Being more aware of how the landscape affects the coastal wetlands should allow policy makers and managers to better regulate what occurs on a local basis. These findings can also form the foundation for future research that examines causal linkages between landscape factors and coastal invasion through process-based modeling. . 34 1.7. Acknowledgments The text of this chapter is a reprint of the material as it appears in Landscape Ecology in a manuscript first published September 21 2020 (doi: 10.1007/s10980-020-01124-3) coauthored by David Hyndman, Anthony Kendall, and Sherry Martin. This research was funded by National Aeronautics and Space Administration grants 80NSSC17K0262 and NNX11AC72G, National Oceanic and Atmospheric Administration grant NA12OAR4320071, and USDA NIFA Water CAP grant 2015-68007-23133. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of NASA, NOAA, or USDA. 35 BIBLIOGRAPHY Anderson, J., Hardy, E., Roach, J., & Witmer, R. (1976). A Land Use and Land Cover Classification System for Use with Remote Sensor Data. Angeloni, N., Jankowski, K., Tuchman, N., Kelly, J. (2006, 8 5). Effects of an invasive cattail species (Typha x glauca) on sediment nitrogen and microbial community composition in a freshwater wetland. FEMS Microbiology Letters, 263(1), 86-92. Bourgeau-Chavez, L., Endres, S., Battaglia, M., Miller, M., Banda, E., Laubach, Z., . . . Marcaccio, J. (2015, 7 9). Development of a Bi-National Great Lakes Coastal Wetland and Land Use Map Using Three-Season PALSAR and Landsat Imagery. Remote Sensing, 7(7), 8655-8682. Bourgeau-Chavez, L., Kowalski, K., Carlson Mazur, M., Scarbrough, K., Powell, R., Brooks, C., . . . Riordan, K. (2013, 1 1). Mapping invasive Phragmites australis in the coastal Great Lakes with ALOS PALSAR satellite imagery for decision support. Journal of Great Lakes Research, 39, 65-77. Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification And Regression Trees. Routledge. Carlson Mazur, M., Kowalski, K., & Galbraith, D. (2014). Assessment of suitable habitat for Phragmites australis (common reed) in the Great Lakes coastal zone. Aquatic Invasions, 9, 1-19. Chambers, R., Meyerson, L., & Saltonstall, K. (1999, 9 1). Expansion of Phragmites australis into tidal wetlands of North America. Aquatic Botany, 64(3-4), 261-273. Correll, D. (1999, 5 1). Phosphorus: a rate limiting nutrient in surface waters. Poultry Science, 78(5), 674-682. Danz, N., Niemi, G., Regal, R., Hollenhorst, T., Johnson, L., Hanowski, J., . . . Host, G. (2007, 5 23). Integrated Measures of Anthropogenic Stress in the U.S. Great Lakes Basin. Environmental Management, 39(5), 631-647. Danz, N., Regal, R., Niemi, G., Brady, V., Hollenhorst, T., Johnson, L., . . . Kelly, J. (2005). Environmentally Stratified Sampling Design for the Development of Great Lakes Environmental Indicators. Environmental Monitoring and Assessment, 102, 41-65. De'ath, G. (2002, 4 1). Multivariate Regression Trees: A New Technique for Modeling Species- Environment Relationships. Ecology, 83(4), 1105-1117. De 'ath, G., & Fabricius, K. (2000). Classification and Regression Trees: A Powerful yet Simple Technique for Ecological Data Analysis. Ecology, 81(11), 3178-3192. 36 Douglas, M., & Matthews, W. (1992, 11). Does Morphology Predict Ecology? Hypothesis Testing within a Freshwater Stream Fish Assemblage. Oikos, 65(2), 213. Ehrenfeld, J. (2003, 10 1). Effects of Exotic Plant Invasions on Soil Nutrient Cycling Processes. Ecosystems, 6(6), 503-523. Elgersma, K., Martina, J. (2017). Effectiveness of cattail (Typha spp.) management techniques depends on exogenous nitrogen inputs. Elem Sci Anth, 5, 19. Elith, J., Leathwick, J., & Hastie, T. (2008, 7 1). A working guide to boosted regression trees. Journal of Animal Ecology, 77(4), 802-813. Elser, J., Bracken, M., Cleland, E., Gruner, D., Harpole, W., Hillebrand, H., . . . Smith, J. (2007, 12 1). Global analysis of nitrogen and phosphorus limitation of primary producers in freshwater, marine and terrestrial ecosystems. Ecology Letters, 10(12), 1135-1142. Erdal, H.I., & Karakurt, O. (2013, 1 16). Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. Journal of Hydrology, 477, 119-128. Farrer, E., & Goldberg, D. (2009). Litter drives ecosystem and plant community changes in cattail invasion. Ecological Applications, 19(2), 398-412. Friedman, J., Hastie, T., & Tibshirani, R. (2000, 4). Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors). The Annals of Statistics, 28(2), 337-407. Geddes, P., Grancharova, T., Kelly, J., Treering, D., & Tuchman, N. (2014, 9 22). Effects of invasive Typha × glauca on wetland nutrient pools, denitrification, and bacterial communities are influenced by time since invasion. Aquatic Ecology, 48(3), 247-258. Gesch, D.B., Oimoen, M.J. and Evans, G.A., 2014. Accuracy assessment of the US Geological Survey National Elevation Dataset, and comparison with other large-area elevation datasets: SRTM and ASTER (Vol. 1008). US Department of the Interior, US Geological Survey. Goldberg, D., Martina, J., Elgersma, K., & Currie, W. (2017, 8 2). Plant Size and Competitive Dynamics along Nutrient Gradients. The American Naturalist, 190(2), 229-243. gSSURGO. Description of Gridded Soil Survey Geographic (gSSURGO) Database | NRCS Soils. (2016). Retrieved from https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/home/?cid=nrcs 142p2_053628 Hamlin, Q. F., A. D. Kendall, S. Martin, H. Whitenack, J. Roush, B. Hannah, D. W. Hyndman (2020). SENSEmap-USGLB: Nitrogen and Phosphorus Inputs, HydroShare, https://doi.org/10.4211/hs.1a116e5460e24177999c7bd6f8292421 37 Herrick, B., & Wolf, A. (2005). Invasive Plant Species in Diked vs. Undiked Great Lakes Wetlands. Journal of Great Lakes Research, 31(3), 277-287. Hester, A., & Hobbs, R. (1992, 2 1). Influence of fire and soil nutrients on native and non-native annuals at remnant vegetation edges in the Western Australian wheatbelt. Journal of Vegetation Science, 3(1), 101-108. Hollenhorst, T., Brown, T., Johnson, L., Ciborowski, J., & Host, G. (2009, 1 20). Methods for Generating Multi-scale Watershed Delineations for Indicator Development in Great Lake Coastal Ecosystems. Journal of Great Lakes Research, 33(3), 13-26. Homer, C., Dewitz, J., Jin, S., Xian, G., Costello, C., Danielson, P., Gass, L., Funk, M., Wickham, J., Stehman, S. and Auch, R., 2020. Conterminous United States land cover change patterns 2001–2016 from the 2016 National Land Cover Database. ISPRS Journal of Photogrammetry and Remote Sensing, 162, pp.184-199. Johnson, L., & Host, G. (2010, 3 19). Recent developments in landscape approaches for the study of aquatic ecosystems. Journal of the North American Benthological Society, 29(1), 41-66. Keddy, P., & Reznicek, A. (1986, 1). Great Lakes Vegetation Dynamics: The Role of Fluctuating Water Levels and Buried Seeds. Journal of Great Lakes Research, 12(1), 25- 36. Kettenring, K., McCormick, M., Baron, H., & Whigham, D. (2011, 10 1). Mechanisms of Phragmites australis invasion: feedbacks among genetic diversity, nutrients, and sexual reproduction. Journal of Applied Ecology, 48(5), 1305-1313. King, R., Deluca, W., Whigham, D., & Marra, P. (2007, 6). Threshold effects of coastal urbanization on Phragmites australis (common reed) abundance and foliar nitrogen in Chesapeake Bay. Estuaries and Coasts, 30(3), 469-481. Knops, J., Tilman, D., Haddad, N., Naeem, S., Mitchell, C., Haarstad, J., . . . Groth, J. (1999, 9 1). Effects of plant species richness on invasion dynamics, disease outbreaks, insect abundances and diversity. Ecology Letters, 2(5), 286-293. Larkin, D., Freyman, M., Lishawa, S., Geddes, P., & Tuchman, N. (2012, 1 12). Mechanisms of dominance by the invasive hybrid cattail Typha × glauca. Biological Invasions, 14(1), 65- 77. Lishawa, S., Albert, D., & Tuchman, N. (2010, 12 11). Water Level Decline Promotes Typha X glauca Establishment and Vegetation Change in Great Lakes Coastal Wetlands. Wetlands, 30(6), 1085-1096. 38 Lougheed, V., Crosbie, B., & Chow-Fraser, P. (2001). Primary determinants of macrophyte community structure in 62 marshes across the Great Lakes basin: latitude, land use, and water quality effects. Canadian Journal of Fisheries and Aquatic Scinces, 58, 1603-1612. Marks, M., Lapin, B., & Randall, J. (1994). Phragmites australis (P. communis): Threats, Management, and Monitoring. Natural Areas Journal, 14(4), 285-294. Martin, S., Soranno, P., Bremigan, M., & Cheruvelil, K. (2011a, 8 21). Comparing Hydrogeomorphic Approaches to Lake Classification. Environmental Management, 48(5), 957-974. Martina, J., Currie, W., Goldberg, D., & Elgersma, K. (2016, 9 1). Nitrogen loading leads to increased carbon accretion in both invaded and uninvaded coastal wetlands. Ecosphere, 7(9), e01459. Minchinton, T., & Bertness, M. (2003, 10 1). Disturbance-Mediated Competition and the Spread of Phragmites australis in a Coastal Marsh. Ecological Applications, 13(5), 1400-1416. Minchinton, T., Simpson, J., & Bertness, M. (2006). Mechanisms of exclusion of native coastal marsh plants by an invasive grass. Journal of Ecology, 94, 342-354. Mozdzer, T., & Zieman, J. (2010, 1 25). Ecophysiological differences between genetic lineages facilitate the invasion of non-native Phragmites australis in North American Atlantic coast wetlands. Journal of Ecology, 98(2), 451-458. NHDPlusV2. (2012). NHDPlus Great Lakes Data. https://www.epa.gov/waterdata/nhdplus- great-lakes-data-vector-processing-unit-04 Passy, S. (2007, 2 1). Diatom ecological guilds display distinct and predictable behavior along nutrient and disturbance gradients in running waters. Aquatic Botany, 86(2), 171-178. Pimentel, D., Lach, L., Zuniga, R., & Morrison, D. (2000, 1 1). Environmental and Economic Costs of Nonindigenous Species in the United States. Bioscience, 50(1), 53-66. Quinlan, J. (1996). Bagging, Boosting, and C4.5. Proceedings, Fourteenth National Conference on Artificial Intelligence. Rapport, D., & Whitford, W. (1999, 3). How Ecosystems Respond to Stress. BioScience, 49(3), 193-203. Ricciardi, A., & MacIsaac, H. (2000, 2 1). Recent mass invasion of the North American Great Lakes by Ponto–Caspian species. Trends in Ecology & Evolution, 15(2), 62-65. Richards, C., Johnson, L., & Host, G. (1996). Landscape-scale influences on stream habitats and biota. Canadian Journal of Fisheries and Aquatic Sciences, 53(1), 295-311. 39 Rickey, M., & Anderson, R. (2004, 9 30). Effects of nitrogen addition on the invasive grass Phragmites australis and a native competitor Spartina pectinata. Journal of Applied Ecology, 41(5), 888-896. Schaap, M., Leij, F., & van Genuchten, M. (2001, 10 1). ROSETTA: a computer program for estimating soil hydraulic parameters with hierarchical pedotransfer functions. Journal of Hydrology, 251(3-4), 163-176. Sciance, M.B., Patrick, C., Weller, D., Williams, M., McCormick, M., & Hazelton, E. (2016). Local and regional disturbances associated with the invasion of Chesapeake Bay marshes by the common reed Phragmites australis. Biological Invasions. 18 Schindler, D. (1977). Evolution of Phosphorus Limitation in Lakes. Science, 195(4275), 260- 262. Silliman, B., & Bertness, M. (2004, 10 1). Shoreline Development Drives Invasion of Phragmites australis and the Loss of Plant Diversity on New England Salt Marshes. Conservation Biology, 18(5), 1424-1434. SSURGO. Description of SSURGO Database | NRCS Soils. (2016). Retrieved from https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_0536 27 Sutton, C. (2005). Classification and Regression Trees, Bagging, and Boosting. Handbook of Statistics, 24(11), 303-329. Tilman, D. (1990, 5). Constraints and Tradeoffs: Toward a Predictive Theory of Competition and Succession. Oikos, 58(1), 3-15. Trebitz, A., & Taylor, D. (2007, 12 1). Exotic and Invasive Aquatic Plants in Great Lakes Coastal Wetlands: Distribution and Relation to Watershed Land Use and Plant Richness and Cover. Journal of Great Lakes Research, 33(4), 705-722. Tuchman, N.C., Larkin, D.J., Geddes, P., Wildova, R., Jankowski, K. and Goldberg, D.E., 2009. Patterns of environmental change associated withTypha xglauca invasion in a Great Lakes coastal wetland. Wetlands, 29(3), pp.964-975. Tulbure, M., & Johnston, C. (2010). Environmental Conditions Promoting Non-native Phragmites australis Expansion in Great Lakes Coastal Wetlands. Wetlands, 30(3), 577- 587. Tulbure, M., Johnston, C., & Auger, D. (2007). Rapid Invasion of a Great Lakes Coastal Wetland by Non-native Phragmites australis and Typha. Journal of Great Lakes Research, 33(3), 269-279. 40 Uddin, M., & Robinson, R. (2018, 2 1). Can nutrient enrichment influence the invasion of Phragmites australis? Science of The Total Environment, 613-614, 1449-1459. US EPA. (2019). Facts and Figures about the Great Lakes. Retrieved from www.epa.gov/greatlakes/facts-and-figures-about-great-lakes USDA. Coastal Wetland Definitions/NRCS Plant Materials Program. (2018). Retrieved from www.nrcs.usda.gov/wps/portal/nrcs/detail/plantmaterials/technical/publications/?cid=stel prdb1044268 Vaccaro, L., Bedford, B., & Johnston, C. (2009, 9 28). Litter accumulation promotes dominance of invasive species of cattails (Typha spp.) in Lake Ontario wetlands. Wetlands, 29(3), 1036-1048. Verhougstraete, M., Martin, S., Kendall, A., Hyndman, D., & Rose, J. (2015, 8 18). Linking fecal bacteria in rivers to landscape, geochemical, and hydrologic factors and sources at the basin scale. Proceedings of the National Academy of Sciences of the United States of America, 112(33), 10419-24. Warton, D., & Hui, F. (2011, 1 1). The arcsine is asinine: the analysis of proportions in ecology. Ecology, 92(1), 3-10. Wickham, J.D., Stehman, S.V., Gass, L., Dewitz, J., Fry, J.A. and Wade, T.G., 2013. Accuracy assessment of NLCD 2006 land cover and impervious surface. Remote Sensing of Environment, 130, pp.294-304. Wickham, J., Stehman, S.V., Gass, L., Dewitz, J.A., Sorenson, D.G., Granneman, B.J., Poss, R.V. and Baer, L.A., 2017. Thematic accuracy assessment of the 2011 national land cover database (NLCD). Remote Sensing of Environment, 191, pp.328-341. Wilcox, D. (2012, 6). Response to wetland vegetation to the post-1986 decrease in Lake St. Clair water levels: Seed-bank emergence and beginnings of the Phragmites australis invasion. Journal of Great Lakes Research, 38(2), 270-277. Woo, I. & Zedler, J.B., 2002. Can nutrients alone shift a sedge meadow towards dominance by the invasive Typha× glauca. Wetlands, 22(3), pp.509-521. Zedler, J., & Kercher, S. (2004, 9). Causes and Consequences of Invasive Plants in Wetlands: Opportunities, Opportunists, and Outcomes. Critical Reviews in Plant Sciences, 23(5), 431-452. 41 APPENDIX A: CHAPTER 1 DATA Table A.1.1. Consolidated land cover classes. Classes (Borgeau-Chavez et al. (2015)) were consolidated to be able to more readily visualize landscape trends throughout the Great Lakes Basin. Identifying spatial trends was more succinct with the consolidated classes. Original Classes Consolidated Classes Agriculture Fallow Agriculture Orchard Forest Pine Plantation Forest Shrub Wetland Schoenoplectus Open Peatland Shrub Peatland Wetland Treed Peatland Forested Wetland Wetland Shrub Water Water Aquatic Bed Suburban Suburban Urban Urban Grass Urban Urban Road Barren Light Barren Barren Dark Typha Invasive Phragmites 42 Table A.1.2. Summary statistics for each driver variable at each watershed scale. All variables considered within this study are displayed. It is important to note that in many cases the average of the watersheds’ means is reported. 43 Table A.1.3. Each split’s competitors and surrogates. Two of the top competitors and surrogates for each splitter. For competitors, a comparable variation explained (VE) is presented, while surrogates are compared using a percent similar metric. Dependent Variable Competitor Splits Surrogate Splits Used as a Splitter % (VE) Variable VE Variable Similar Mean N of Full Mean N in Direct WS 52.7% Mean P in Full WS 97.7% Watershed (55%) Mean N in Stream WS 52.1% Mean P in Stream WS 97.0% Forest Portion of Forest Portion in Direct WS 3.4% Agricultural Portion in Full WS 81.2% Full Watershed (4.3%) Mean Slope in Stream WS 3.4% Agricultural Portion in Stream WS 81.0% Mean Slope of Mean Slope in Full WS 4.1% Wetland Portion in Direct WS 80.7% Direct Watershed (4.4%) Mean P in Direct WS 3.6% Forest Portion in Direct WS 79.4% Mean P of Stream Mean Slope in Full WS 1.5% N:P in Stream WS 97.4% Watershed (3.5%) Mean Slope in Stream WS 1.4% Mean P in Full WS 94.8% Table A.1.4. Relative influence ratings for each driver variable. The relative influence rating for each of the driver variables. This ranking was created through our BRT analysis and indicates how closely related each independent variable is to the dependent variable. Dependent Variable Watershed Scale Relative Influence Rating (RIR) Nitrogen Inputs Full 21.1607178 Nitrogen Inputs Direct 12.7030435 Phosphorus Inputs Direct 11.7333573 Slope Full 5.8083681 Nitrogen Inputs Riparian 4.1995425 Slope Direct 3.7640044 Forest Area Full 2.2916149 N:P Direct 2.2246772 N:P Stream 2.0803444 Forest Area Direct 1.771554 Wetland Area Stream 1.5298538 N:P Riparian 1.4755821 Barren Area Full 1.3270046 Hydraulic Conductivity Stream 1.3051357 Total Area of Riparian 1.2934993 Hydraulic Conductivity Direct 1.2190856 Wetland Area Direct 1.2039463 Wetland Area Full 1.1475149 Water Area Full 1.1393761 Wetland Area Riparian 1.1382061 Water Area Direct 0.981873 Phosphorus Inputs Riparian 0.9757155 Slope Riparian 0.9509606 Hydraulic Conductivity Riparian 0.9300827 44 Table A.1.4. (cont’d) Barren Area Stream 0.9051524 Nitrogen Inputs Stream 0.8962766 Barren Area Direct 0.851227 Imperviousness Stream 0.7833915 Range Area Stream 0.7477695 Urban Area Direct 0.7274759 Urban Area Full 0.6932277 Hydraulic Conductivity Full 0.6688577 Forest Area Riparian 0.6324516 Urban Area Stream 0.6161445 Phosphorus Inputs Full 0.6161392 Total Area of Full 0.5294296 Range Area Full 0.4973456 Slope Stream 0.4777047 Range Area Direct 0.4729667 Water Area Riparian 0.4680059 Imperviousness Full 0.4557592 Forest Area Stream 0.4212617 Range Area Riparian 0.3995427 Total Area of Direct 0.398036 Water Area Stream 0.3662787 Imperviousness Direct 0.3656044 Agricultural Area Stream 0.3631216 Urban Area Riparian 0.3432146 Imperviousness Riparian 0.3168373 Barren Area Riparian 0.2875277 Agricultural Area Riparian 0.2458242 Agricultural Area Full 0.2417069 Total Area of Stream 0.2324283 Agricultural Area Direct 0.2121613 Imperviousness Direct 0.1737893 N:P Full 0.1221614 Phosphorus Inputs Stream 0.1161182 45 Figure A.1.1. Zones of influence histograms. Histograms with outliers removed of A) segment length, B) stream watershed area, C) direct watershed area, D) full watershed area, and E) riparian watershed area. As shown in Figure S1, the distributions of segment length and watershed area are all skewed to the left. The majority of the values are relatively low compared to the maximum value, creating this distribution. The majority of the segments along the coastline are less than 10 km long (Figure S1A). The stream and full watersheds have similar distributions. 46 Figure A.1.2. Complete CART tree. Initial 240 meter buffer width CART, before it was pruned. The initial tree for the 240 meter buffer width (Figure S2) from CART had 12 nodes and explained 76.7% of the variation. The tree identified 11 driver variables that accounted for some of the variation in this data set. The tree included 11 splits based on the relationships between the driver variables and invasion, which resulted in 12 unique groups of segments. Several of these groups, included less than 15 segments and had low VE. This implied that the relationships were likely less significant and therefore could have been the result of over-explaining the trends. 47 CHAPTER 2: EVALUATING SEPTIC SYSTEM INFLUENCE ON NUTRIENTS WITHIN THE MANISTEE AND AU SABLE RIVER WATERSHEDS 2.1. Abstract Septic systems are a ubiquitous means of household wastewater treatment across extensive areas of the United States - areas that also support prime habitat, water resources, and valuable recreation opportunities. To protect the ecosystem services that the riparian systems provide, it is important to understand the source of the water, along with factors that can threaten its quality. The Manistee and Au Sable Rivers, adjacent stream systems in the Lower Peninsula of Michigan, have significantly different human population densities along their banks, and thus varying septic system densities, with higher human density in the Au Sable watershed. In this study, we explicitly simulated groundwater flow transport of nutrients to surface waters, using both septic-specific and general non-point source nutrient input estimates to groundwater. We then used data from water samples that we collected from the two rivers for reference, to evaluate the nutrient inputs that could be attributed to septic systems relative to other sources across the region. On average, we found septic systems appear to degrade ~12% of the nitrogen introduced, allowing 88% to enter the groundwater. Additionally, septic systems were found to capture ~51% of the introduced phosphorus. 48 2.2. Introduction In less developed areas, nutrient inputs to groundwater and streams are different from the typical sources associated with nutrient loading. In areas with little agriculture or residential fertilizer use, atmospheric deposition and septic systems tend to be greater contributors (Hamlin et al. 2020). In regions dominated by forests, the nitrogen from atmospheric deposition is usually utilized by plants. This leaves septic systems as a major contributor of nutrients in forested, remote areas. Regions such as these, not only provide a relatively objective backdrop for the study of septic contributions, they also create an opportunity to evaluate the effects that this source has on river systems. Septic systems are domestic wastewater treatment and dispersal systems used in regions not connected to municipal wastewater treatment plants. Approximately 20% of households across the United States use a septic system (U.S. EPA, 2018). These systems are composed of two parts: the septic tank, which captures and settles waste, and the drainage field that allows the wastewater to slowly percolate through the surrounding soil. The system is intended to attenuate organic content, pathogens, and nutrients (Whelan et al. 1982; Beal et al. 2005). They rely solely on the natural biogeochemical processes that occur in soil to assimilate effluent pollutants, therefore making such systems challenging to control and manage, and heavily reliant on knowledge of the install site conditions (Beal et al. 2005). Most of the nutrient attenuation occurs via formation of a biomat below the drainage field that adsorbs and treats the solutes (Beal et al. 2005). Approximately 10 to 20% of United States’ septic systems are estimated to be in a failing or poorly operating state (James et al. 2016). This could be higher, as failure is usually only reported if there is an acute water quality issue, and homeowners would rarely be cognizant of subsurface conditions that lead to a poorly operating system (Withers et al. 2013). 49 Though intended to attenuate nutrients, septic systems commonly only remove 21-25% of nitrogen; the tank itself only attenuates 1-3% (Costa et al. 2002). If a septic system is installed in a coarse-grained soil, total nitrogen attenuation rates can fall between 10-30% (Withers et al. 2013). Ideally, the ammonium that enters the septic tank nitrifies into nitrate with the presence of oxygen, which then denitrifies into harmless atmospheric nitrogen (N2) if enough carbon or denitrifying bacteria are present. However, nitrogen removal is challenging due to the lack of consistent carbon beneath a drainage field, which is necessary for denitrification of nitrate, leading to high groundwater nitrate concentrations (Wilhelm et al. 1994). Even when septic systems do reduce nitrogen species concentrations, it is commonly due to dilution rather than transformation or attenuation (Walker et al. 1973; Postma et al. 1992). As a result, septic systems usually enrich nitrate levels in groundwater (Gill et al. 2009; Katz et al. 2011). In contrast, phosphorus is more effectively captured by septic systems; the drainage field is able to fix the majority of phosphates, based on the subsoil and its mineralogy (Jones & Lee, 1979; Robertson & Harman, 1999; Gill et al. 2009). The range can vary widely though, between 23-99% phosphorous attenuation depending on soil conditions (Robertson et al. 1998). Other studies have found that soluble reactive phosphorus (SRP) is the dominant form of phosphorus that travels to the subsurface via septic effluent. SRP is a bioavailable species of phosphorus, so it can lead to eutrophication and excessive plant growth. However, SRP can be sorbed to soils, which can limit its ability to be transported to surface waters (Oldfield et al. 2020). Several studies have been conducted on the watershed scale to evaluate the effects of septic systems on nutrient loads in streams. Oldfield et al. (2020) found that in stream nitrogen and phosphorus loads increased considerably in high flow conditions, such as spring months like April and May. They also found that septic systems contribute less than 2% of the instream 50 nitrogen load, and between <1-36% of the instream SRP load. Robertson et al. (2019) found that depending on calcareous soil content, SRP removal was between 66 and 90%. As expected, watersheds with greater septic system density result in higher instream nitrogen and phosphorus concentrations (Iverson et al., 2018). Sowah et al. (2017) found that indicators of septic impact on streams was more pronounced in the spring - likely tying into the greater fluxes of recharge and melt events within this season. The age of septic systems and their associated households can also affect the amount of septic inputs that make it to nearby streams. Tamang et al. 2021 found that aging households were associated with increased septic contributions to instream concentrations. Excessive nutrient levels within a stream can reduce the ecosystem services the stream can perform. Nutrients enhance the growth of invasive species, such as Phragmites and Typha (Goldberg et al. 2017; Hannah et al. 2020). These plants form dense monocultures that alter more heterogeneous habitats that local animals rely on, and out-compete native plant species (Tilman, 1990; Angeloni et al. 2006; Geddes et al. 2014). Additionally, high nutrient concentrations can result in algal blooms that can be harmful to plants and animals, and limit recreation in the area. This poses an issue for both rivers in this study; it is especially a concern for the Au Sable River, which is renowned for its fly fishing. Algal blooms could make the river unpleasant to wade into and harm trout populations. It is thus important to ensure that these fairly pristine rivers stay that way, and are not irreparably damaged by nutrient loads from local septic systems. This study integrates fieldwork and modeling to better understand how septic tanks are contributing to nutrient loads within streams and groundwater. We collected samples from 32 locations, primarily within the headwaters of the Manistee and Au Sable Rivers. The two systems are similar in area and land cover, but differ in density of residential properties. The Au 51 Sable watershed contains more residential properties, many of which are only seasonally inhabited. We developed a fully spatially explicit model for this region to simulate groundwater flow and evaluate how septic systems within this region are operating and sourcing nutrients. Our study includes watersheds with varied septic density, and accounts for seasonal differences as well. We separated recharge and septic nutrient inputs to better understand contributions to this region. By comparing two stream systems with differing levels of residency, over two seasons, we can better evaluate the effect that septic systems can have on nutrient levels during both baseflow and seasonal high flows. 2.3. Methods 2.3.1. Site Description This study examines the Au Sable and Manistee watersheds (Figure 1). The 5,400 km2 Au Sable watershed stretches from North-Central Michigan, towards the East Coast of Michigan’s Lower Peninsula. The river’s headwaters are north of the City of Grayling, and it drains east into Lake Huron, near Oscoda. With adjoining headwaters, the 5,150 km2 Manistee watershed drains west into Lake Michigan. Climate within both watersheds is primarily humid (Koeppen-Geiger classes Dfa and Dfb, hot- and warm-summer humid continental climates, respectively), with four distinct seasons. The watersheds primarily consist of forested land (73% in the Au Sable watershed, and 71% in the Manistee), with only small areas of urban land. These watersheds were selected for this study for several reasons. They are of similar size, location, climate, and land cover, but differ in extent of anthropogenic influence. This provides a relatively consistent backdrop to study the effects of differing human influence. The Au Sable watershed has more year-round and seasonal properties (Table 2.1) along its banks, compared to the Manistee; this results in a denser system of septic tanks and drainage fields. 52 Given the similar non-developed land cover and use (Figure 2.1) of these two watersheds, the nutrient inputs to the streams will likely be differentiated primarily from septic tanks near the banks. These watersheds are especially valued as renowned trout streams. It is important to preserve the integrity of these waterways, not only for the trout’s benefit, but also for the recreational draw that trout fishing creates for the region. Many of the small towns in these watersheds rely on the influx of fishers to subsist economically. The rivers also provide other recreational activities, such as kayaking, canoeing, or swimming. Figure 2.1. Site map. This map shows the Manistee and Au Sable watersheds, overlain on a background depicting land use, with the city of Grayling noted with an A. The inset map shows the United States with the Lower Peninsula of Michigan, with the two watersheds colored red for reference. 53 2.3.2. In-Stream Nutrient Concentrations and Septic Systems 2.3.2.1. In Stream Nutrient Concentration Measurements Water samples were collected from 18 sites along the Manistee River and 14 sites along the Au Sable River. Discharge rates for each sampling location are available in Appendix 1. Most sampling locations were along the headwaters of these river systems (Figure 2). Grab samples of water were collected from the center of the stream, standing downstream of the collection bottle. The sample was then split into three aliquots: a non-filtered sample to measure total N and P, a filtered sample for dissolved N and P, and a filtered sample to analyze ions. Samples were filtered using 0.45-micron filter paper. The total N and P samples were kept frozen over dry ice, while the ion samples were kept chilled over regular ice. We used either an ADCP (acoustic doppler current profiler) or an OTT flow meter to measure streamflow at each sampling location. The ADCP was used for wide streams, while the OTT was used in narrower streams. Basic water quality data, such as water temperature, conductivity, and pH were also collected at each location using an In Situ probe. Two sampling campaigns were conducted: early May and late August of 2018. The nitrogen and the phosphorus analyses were conducted in labs at Michigan State University, while ion analyses were done at the Kellogg Biological Station. The samples were analyzed on a mass spectrometer for four variants of phosphorus: total phosphorus (unfiltered samples, digested in the autoclave), soluble reactive phosphorus (SRP) (filtered samples, not digested), SRP and dissolved organic phosphorus (DOP) (filtered samples, digested), and SRP adjacent, a constituent measured by analyzing unfiltered, undigested samples, that produces results similar to that of SRP. Proper laboratory protocols were followed to reduce error and cross-contamination. 54 2.3.2.2. Septic System Input Maps To estimate septic tank inputs, we used a spatially-explicit septic product from SENSEmap (Hamlin et al., 2020). The product includes methodically estimated septic tank locations for the United States portion of the Great Lakes Basin. Each estimated septic tank location from this model includes other pertinent attributes, such as estimated household size, occupancy status, and sewage treatment plant boundary maps. We used this dataset of septic locations and associated attributes for two purposes: to correlate watershed septic tank density to field-sampled parameters, and to estimate loads of N and P for each septic system location as inputs to the solute transport model developed in MT3D (modular transport, 3-dimensional model) (Bedekar et al.). Concentrations are based on household size and occupancy. Each septic point is designated as either fully occupied, seasonally occupied, or vacant. This allows for the analysis of septics as a whole, and from each of these categories. Nitrogen and phosphorus inputs from septic systems are illustrated in Figures 2.4A and 2.4B respectively. 2.3.2.3. Correlating In-Stream Concentrations to Watershed Septic Systems Watersheds were generated for each river sample location. These watersheds provide a metric to assess septic field nutrient inputs from the upstream watershed that would directly affect the concentrations at each sample point (Figure 2.2, Table 2.1). A buffered area within 200 m of the stream segment within each watershed was also generated for similar assessments as the septic tanks set within this riparian buffer area nearest the river likely contribute more nutrients to the streams with less retardation, degradation, or loss than those in the watershed that are farther from the streams. These riparian buffer and full watershed scales allowed us to quantify and compare the influence of septic tanks within these relevant areas on the concentrations measured in our stream water samples. 55 Figure 2.2. Sampling points and associated watersheds. The 32 sample points of interest and their associated watersheds are shown. The color of the watershed indicates the density of septic tanks that fall within it. Several watersheds encompass other watersheds; M2 includes M1; M6 includes M2, M3, M4, and M5B; M9B includes M9A; M9C includes M9B; M9D includes M9C; M9E includes M9D; M10 includes M6, M7, and M8. A3 includes A1 and A2; A4 includes A3; A6 includes A9; A11 includes A3, A4, and A6; A13 includes A23; A14 includes All and A13; and A21 includes A18. The inset map indicates the subset of the study area displayed in this map. As an initial assessment of trends between the field sample data and the septic tank density, the two spatial categories were paired with a simple statistical analysis. The density of septic tanks was calculated by comparing the total number of septic tanks in the given watershed or riparian buffer, and normalizing the value by dividing by total area. We conducted linear regressions comparing a variety of constituents. Each nutrient or ion from sample analyses was compared to the density of septic tanks within the watershed for that sample point (Figure 2.2), and the number of septic tanks within the riparian buffer of the upstream segment. Three 56 different types of residences were also utilized. The density of septic tanks within each of these spatial scales were considered as three categories: seasonal properties, year-round properties, and all of the septic systems within the given area, regardless of occupation (this category includes both year-round and seasonal, as well as potentially vacant residencies). We used linear regression to compare each sample constituent to six spatial and residency combinations, for three sample categories: the May 2018 and August 2018 samples, or all of the samples grouped together. Table 2.1. Septic system counts. The total number of septic tanks within the area of interest. Each sample location has a watershed and riparian buffer generated for it, and the number of septic tanks are summed within each. Additionally, the tanks are subset into full occupation and seasonal occupation, and the proportions of the total counts were calculated. 57 2.3.3. Modeling In-Stream Nutrient Concentrations To further investigate the connection between septic systems and in-stream nutrients, we utilized a groundwater flow model. The model required a litany of inputs, ranging from spatial represented datasets such as recharge, groundwater levels, stream flow, etc., as well as both subsurface and in-stream nutrient concentrations. With those datasets, a MODFLOW (modular finite-difference flow) model was generated, and sensitive variables were optimized. The MT3D package was also utilized, which allowed for the modeling of constituent transport. Through the modeling process, groundwater heads and stream discharge datasets were generated, which ultimately allowed for the simulation of in-stream nutrient concentrations. This allowed for better comparison to the presence of septic systems in the vicinity, and provided insights as to nutrient sources. This process is further explained in the following subsections, and is illustrated in Figure 2.3. 58 Figure 2.3. Modeling process. This flowchart represents the modeling approach utilized in this study. 2.3.3.1. Subsurface Nutrient Inputs Non-septic sources of nutrients in the model, such as runoff from tile-drained agricultural fields, overland runoff, and groundwater flow were provided by SENSEflux (Wan et al., submitted). SENSEflux is a 120m gridded product that built upon SENSEMAP, which incorporates both point and nonpoint sources applied to the surface from (Hamlin et al., 2020), and estimates how they will be calibrated as appropriate sources move into the groundwater. SENSEflux provides the background concentrations for this study. These values will be further modified in this study and evaluated to estimate their impact to the nutrient levels in streams. Nitrogen and phosphorus inputs from recharge are illustrated in Figures 2.4.C and 2.4.D respectively. 59 Concentrations for the nutrients of interest, nitrogen and phosphorus, were calculated per septic tank, using the following equation: 𝐻𝑜𝑢𝑠𝑒ℎ𝑜𝑙𝑑 𝑃𝑜𝑝𝑢𝑙𝑎𝑡𝑖𝑜𝑛 × 𝑝𝑒𝑟 𝑃𝑒𝑟𝑠𝑜𝑛 𝑁𝑢𝑡𝑟𝑖𝑒𝑛𝑡 𝑅𝑎𝑡𝑒 ×𝑆𝑒𝑎𝑠𝑜𝑛𝑎𝑙 𝐿𝑜𝑎𝑑 𝐹𝑎𝑐𝑡𝑜𝑟 𝐻𝑜𝑢𝑠𝑒ℎ𝑜𝑙𝑑 𝑃𝑜𝑝𝑢𝑙𝑎𝑡𝑖𝑜𝑛 ×𝑝𝑒𝑟 𝑃𝑒𝑟𝑠𝑜𝑛 𝑊𝑎𝑡𝑒𝑟 𝑉𝑜𝑙𝑢𝑚𝑒 𝐶𝑜𝑛𝑐. 𝑎𝑛𝑑 𝑆𝑒𝑝𝑡𝑖𝑐 𝑇𝑎𝑛𝑘 = (Eq. 2.1) 𝑅𝑒𝑐ℎ𝑎𝑟𝑔𝑒 The nutrient rate for nitrogen inputs (N) was set as 0.0112 kg/capita/day and the nutrient rate for phosphorus (P) was set to 0.0027 kg/capita/day (U.S. EPA 2002). Figure 2.4. Nitrogen and phosphorus input maps. The modeled boundary region is shown with colored features, while the smaller black polygons represent the Manistee and Au Sable surface watersheds. A) Input nitrogen concentrations from septic systems in kg/m3. B) Input phosphorus concentrations from septic systems in kg/m3. C) Input nitrogen concentrations simulated from recharge in kg/m3 (Wan et al.). D) Input phosphorus concentrations from recharge in kg/m3 (Wan et al.). 60 2.3.3.2. Streamflow and Groundwater Head Data A robust water level dataset was acquired from the Michigan Department of Environment, Great Lakes, and Energy. It included static water levels at the time of installation of wells throughout the region. The data spanned over a decade and included several different types of wells, including residential, Type I, and Type II. The dataset contained tens of thousands of wells that varied in age, depth, and other characteristics. A dataset with this level of diversity introduces a certain amount of uncertainty. The varied ages of the wells resulted in static water level measurements occurring over a range of ten years, which likely captures variability within seasons and overall water level trends. Varied depths of wells could have captured data from other aquifers, such as the bedrock aquifer that is deeper than our study area. Overall, the volume of data dampens out these sources of variability, though they are still present. All inland streams within the model boundary were included in the model via the drain package within the model. They are input as features with an upper and bottom elevation, and a conductance, which vary depending on the stream. After running the model, the drains have associated flows. 2.3.3.3. Groundwater Flow and Nutrient Transport Modeling To understand how the water flows within these systems and to form a basis for the solute transport model, we constructed a modular finite-difference flow (MODFLOW) model. This was constructed using FloPy, a python based method of creating, running, and post- processing MODFLOW models, using MODFLOW-2005 (Harbaugh et al. 2017). The model consists of three unconfined aquifer layers, bounded by a bedrock layer below. This is representative of the geology in our region of interest. Many iterations of this code were run, until we felt confident that we had created a model that accurately represented our study area. 61 To account for groundwater flow extending past the surficial watershed boundaries, we buffered the area, creating a model boundary that was defined by major river systems outside of the area of interest along the north and south, and Lakes Michigan and Huron along the west and east respectively. The Great Lakes were treated as specified head boundaries, and major inland lakes were represented as rivers within MODFLOW. The Landscape Hydrology Model (LHM) (Hyndman et al., 2007, Kendall, 2009) was used to generate groundwater recharge inputs for MODFLOW. Using available data in the study area, rasters were generated for the following inputs, all of which are required to generate the MODFLOW model: elevations of the top and bottom of the simulated surficial aquifer, specific yield and specific storage, hydraulic conductivity, vertical anisotropy, along with layers characterizing drains and rivers. The input layers for recharge were provided from Wan et al. (submitted) as derived from LHM simulations. We then generated the model packages for recharge, drains, and rivers using these input rasters and FloPy code (Figure 2.5). 62 Figure 2.5. MODFLOW input maps. Some of the input layers used in the groundwater flow model, displayed within the model boundary: A) hydraulic conductivity in m/day, B) surface water features (e.g., drains - streams, wetlands), C) groundwater elevation in meters, and D) recharge in mm/yr. To optimize our initial hydraulic conductivity (HK) input inputs, we created a scaling factor for HK, and ran the MODFLOW model with 15 different scaling values. Using this dataset, paired with a simple root mean square error (RMSE) analysis, we identified a HK multiplier (0.37) that best simulated the region’s water table. To further improve our model, we repeated the process with the vertical anisotropy (VANI) dataset, establishing a VANI multiplier of 1.4 which indicates that horizontal K is ~ 1.4 times higher than vertical K. The same process was repeated for recharge, but it was ultimately concluded that a recharge scaling factor of one (no change from input values) best represented the regions recharge. These were the parameters for our groundwater flow model that provided input flows and velocities for the solute transport simulations; the plots used in the parameter sensitivity process are illustrated in Figure 2.6. 63 25 20 15 RMSE 10 5 0 0.25 0.5 0.75 1 1.25 1.5 1.75 2 2.25 HK Multiplier Figure 2.6. Hydraulic conductivity sensitivity analysis. Hydraulic conductivity scaling value assessment. The value associated with the lowest RMSE was 0.37, so it was selected as the scaling factor in the model. To understand how nutrients are transported from sources to streams within our model area, we used MT3D (Modular Transport, 3 Dimensional). This solute transport code simulated solute transport through the saturated groundwater system, using flow estimates from MODFLOW. For this study, we used MT3D solely to transport nutrients, not to simulate reactions or uptake along the transport pathways. Because we hypothesize that nutrients from general groundwater recharge react differently than those in septic system effluent, we conducted two separate transport simulations for each nutrient, each with recharge or septic nutrients only. This is an appropriate method if we assume that loss mechanisms within the subsurface are not concentration dependent. An MT3D simulation was conducted for each nutrient, and included specific porosity, advection, dispersion, and time components. Each simulation was run for 10,000 days (approximately 27 years) to approximate steady-state conditions. Two model simulations were conducted for each nutrient. The MODFLOW portion of the model remained constant for both 64 runs, but the nutrient sources changed. One simulation only included the septic inputs generated from Hamlin et al. (2020). The second iteration included only the nutrient concentrations in recharge based on the simulations by Wan et al. (submitted). The sum of these two simulations resulted in a raster showing the subsurface nutrient concentrations and discharge to streams within the region. These model outputs were combined with drain flow estimates from MODFLOW to calculate the estimated nutrient loads at each point along the stream. The loads were then summed within each watershed of interest to show the total stream load in each watershed of our study. Separately, flows in the streams were also summed within each watershed, and a simulated concentration for each watershed was calculated by dividing the summed loads by the summed fluxes. We then compared these concentrations and fluxes with our field data. 2.3.3.4. Estimating Retention and Loss of Nutrients from Septic Systems and Groundwater Recharge We estimated losses of nutrients along transport pathways using a linear scaling factor for each of the recharge and septic nutrient components, applied to the simulated watershed concentrations. These source factors sought to identify what percent of each model input made it through to the streams at the sites of the field sample. We can interpret these scaling factors as the average efficiency of the septic tanks for retaining nutrients, and the average loss of nutrients from recharge along groundwater pathways. We fit unique scaling factors to each of the N and P constituents by minimizing residuals between simulated and observed in-stream concentrations. The model initially assumed that all the nutrient inputs contributed to the nutrient concentrations within the streams, and thus there was no degradation. If all of the nutrients produced by a household (the concentration calculated in Eq. 2.1) were able to enter the groundwater, this 65 means that the septic tanks and drainage field were failing to capture or degrade any nutrients, thus the septic system was in failure. We estimated the extent of degradation or attachment of the nutrients by septic tanks and leach fields using scaling factors and a solver to maximize correlation between simulated and observed values. We thus generated scaled values that best explain the composition of septic and recharge nutrients that represent the data from the field sample. The nitrogen and phosphorus concentration estimates from the model for each subwatershed, with one run accounting for septic inputs alone, and the other accounting for all other input sources, were imported into a database. Nitrogen and Phosphorus results were processed independently, and each step described below took place within both simulation outputs. In each watershed, there is some combination of septic loads and recharge loads contributing to the concentration measured in the field. To begin to assess this, we summed the modeled septic concentrations with the modeled recharge concentration and multiplied by source factors for each of these inputs. Concentrations measured in the field would include inputs from all sources, broken down here into recharge and septic loads, which then undergo immobilization, uptake, or loss along transport pathways. Thus the modeled concentrations from the septic and recharge model runs were summed with scaling factors in a linear superposition for each nutrient. To account for the varying sizes of the subwatersheds (from 18 to 1,873 km2) and the varying amounts of septic systems within them, we created area normalized results by dividing by the area of each subwatershed. Larger watersheds provide more time for stream processing, which allows for additional variation in the measured in stream concentration, area normalization limits this. Eq. 2.2 illustrates this method, solving for M: area normalized scaled modeled concentration. 66 (𝑀𝑜𝑑𝑒𝑙𝑒𝑑 𝑆𝑒𝑝𝑡𝑖𝑐 𝐶𝑜𝑛𝑐×𝑆𝑒𝑝𝑡𝑖𝑐 𝑆𝑐𝑎𝑙𝑖𝑛𝑔 𝐹𝑎𝑐𝑡𝑜𝑟)+(𝑀𝑜𝑑𝑒𝑙𝑒𝑑 𝑅𝑒𝑐ℎ𝑎𝑟𝑔𝑒 𝐶𝑜𝑛𝑐× 𝑅𝑒𝑐ℎ𝑎𝑟𝑔𝑒 𝑆𝑐𝑎𝑙𝑖𝑛𝑔 𝐹𝑎𝑐𝑡𝑜𝑟) 𝑀= 𝐴𝑟𝑒𝑎 (Eq. 2.2) The field data was also area normalized for proper comparison, by dividing by each subwatershed’s area. To estimate scaling factors for septic and recharge inputs to best match the observed concentration data, we maximized correlation across all of the subwatersheds. We compared our scaled modeled concentrations relative to the measured concentrations for each point. Specifically, we utilized total dissolved nitrogen (TDN) as our nitrogen constituent and investigated both total phosphorus (TP) and soluble reactive phosphorus (SRP) measurements for comparison. Our calculations consisted of an R2 analysis that allowed us to quantify the amount of variability explained by the model. The analysis was a parameter estimation used to maximize the R2 between the simulated and observed concentrations. The analysis was conducted with the data for May and August independently, illustrated by the following equations: 𝑆𝑢𝑚 𝑜𝑓 𝑆𝑞𝑢𝑎𝑟𝑒𝑑 𝑅𝑒𝑠𝑖𝑑𝑢𝑎𝑙𝑠 = (𝑀 − 𝐴𝑟𝑒𝑎 𝑁𝑜𝑟𝑚𝑎𝑙𝑖𝑧𝑒𝑑 𝐹𝑖𝑒𝑙𝑑 𝐶𝑜𝑛𝑐𝑒𝑛𝑡𝑟𝑎𝑡𝑖𝑜𝑛)2 (Eq. 2.3) 𝑇𝑜𝑡𝑎𝑙 𝑆𝑢𝑚 𝑜𝑓 𝑆𝑞𝑢𝑎𝑟𝑒𝑠 = (𝐴𝑟𝑒𝑎 𝑁𝑜𝑟𝑚𝑎𝑙𝑖𝑧𝑒𝑑 𝐹𝑖𝑒𝑙𝑑 𝐶𝑜𝑛𝑐 − 𝐴𝑟𝑒𝑎 ̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅ 𝑁𝑜𝑟𝑚𝑎𝑙𝑖𝑧𝑒𝑑 𝐹𝑖𝑒𝑙𝑑 𝐶𝑜𝑛𝑐 )2 (Eq. 2.4) Σ 𝑆𝑢𝑚 𝑜𝑓 𝑆𝑞𝑢𝑎𝑟𝑒𝑑 𝑅𝑒𝑠𝑖𝑑𝑢𝑎𝑙𝑠 𝑅2 = 1 − (Eq. 2.5) Σ 𝑇𝑜𝑡𝑎𝑙 𝑆𝑢𝑚 𝑜𝑓 𝑆𝑞𝑢𝑎𝑟𝑒𝑠 The R2 calculated for May was added to that of August to generate a total R2 for the nitrogen data and the phosphorus data. To maximize the correlation between simulated and observed concentrations, septic and recharge scaling factors needed to systematically be selected. We utilized an evolutionary solver designed to maximize correlation by changing the septic and recharge scaling factors. This process was implemented in both the nitrogen and the phosphorus data. The scaling factors 67 generated by the solver represent the average ratio of nutrients from each source that made it into the streams. 2.4. Results and Discussion 2.4.1. Nutrient Concentrations and Septic System Correlations The samples were analyzed per the methods described above. Each sample was analyzed for 14 analytes (nitrogen and phosphorus compounds, as well as ions), for May and August samples. The analytes were measured as concentrations, but could be converted to loads using flow measurements that were recorded during fieldwork. The results of the laboratory work are shown in Figure 2.7 and Table A.2.1. We first calculated simple statistics (Table 2.2) for each analyte using concentrations with units of mass per liter. Minimum, maximum, and mean values were calculated for each constituent, for both seasons and the combined seasonal data. Average concentrations increased slightly on average between May and August. The August event also had considerably lower recorded flows, as late summer is the baseflow period in this region when most flow is derived from groundwater. Full sample results are available in Table A.2.1. Table 2.2. Summary of in-situ data. The minimum, maximum, and mean load are shown for each analyte that was sampled for. The analytes are shown in concentrations (mass/liter). Note that the four P compounds are in a different unit than the other analytes. 68 Sites from the Manistee River tended to have lower loads (across both months, 9 of the 14 analytes’ minimum concentration was measured at a Manistee sample location) (Table 2.2). Location M11 produced several of the lowest loads for the various analytes, across both months (Na: 0.88 mg/sec, Mg: 4.22 mg/sec, Ca: 21.23 mg/sec, and SO4: 1.85 mg/sec). M11 is in the headwaters of the Manistee River, along Maple Creek in the northwestern Lower Peninsula. Site A23, which tended to have the highest loads (NPOC: 18.34 mg/sec, Na 10.41 mg/sec, and Cl: 20.13 mg/sec), is located along a relatively narrow and shallow stretch of the Au Sable River. Several residences are near the shore at this location. To begin to assess where certain constituents may be sourced from, we compared analyte concentrations in our samples to several spatial and temporal variable combinations. Each analyte shown in Table 2 was compared to the density of septic tanks within the specified spatial scale through linear regressions. The six spatial/residency categories were constructed by pairing both a sample point’s watershed and upstream buffer, with fully occupied septic tank density, seasonally occupied septic tank density, and all septic tank density. A regression was performed comparing each of these variables to each constituent of interest (Table 2.3). Each regression was performed for the field data collected in May, August, and the combined data of both months. The constituents of highest concern for aquatic system eutrophication, nitrogen and phosphorus compounds, had very poor relationships with any of the septic tank categories. The strongest relationship existed between SRP adjacent and the density of fully occupied septic tanks within watersheds, with an R2 of 0.22. The other variable combinations for N and P variables did not score above a 0.14. These regressions indicate weak linear relationships between septic density and N and P concentrations, though these results do not rule out the possibility that septic systems are impacting the rivers. 69 Figure 2.7. In-situ TDN and TP concentrations. A) TDN distribution within the Au Sable. B) TP distribution within the Au Sable. C) TDN distribution within the Manistee. D) TP distribution within the Manistee. Products of salt dissolution, including chloride, sodium, and potassium, had more linear correlations with septic tanks. Both the May sodium and chloride concentrations had R2 values near 0.40 when compared to all or fully occupied septic tanks in the riparian buffer. Potassium concentrations from August produced similar values when compared to fully occupied septic tanks in both the sample points associated with the watersheds as well as riparian buffers. Salt 70 compounds can be related to septic tanks as salt is used in water softeners, which impact the water that travels through the septic systems. Additionally, potassium is a micronutrient that is often found in food; similar to the presence of salt in much of our sustenance. Beyond these direct relationships, septic system presence may correlate with other factors that lead to increased loading of salt compounds, such as road salt. Another interesting relationship that came out of this analysis was the presence of several inverse relationships (indicated by light red in Table 2.3). The strongest and most consistent of these downward trending correlations involved sulfate. For the May, August, and combined samples, across all six septic density categories, sulfate had an inverse relationship, with the largest R2 of 0.24 with May data compared to seasonally occupied septic density in the watersheds. This could be an indicator of sulfate reduction, which occurs in anaerobic and carbon-rich environments consistent with septic fields. However, reduction of sulfate in septic fields should only occur after nitrate is depleted, since nitrate is a more favorable alternative electron acceptor than sulfate, yet we did not see negative relationships with nitrate. Though some interesting correlations appeared in our analysis, there is little evidence to link septic tanks to the constituent concentrations found in these streams. The simple linear regression could not capture the nuanced relationship between nutrient inputs and the loads that are measured in the related streams. To further investigate this complex relationship, a groundwater model was implemented. 71 Table 2.3. Summary of linear regressions. Analytes for each sampling event were compared to the number of septic tanks (both fully used and seasonally used) (Table 1) within the sampling locations’ watersheds and riparian buffers using linear regression. The R2 value for each combination of variables is indicated within the table. Some cells are shaded, based on the key, to indicate higher R2 values. 72 2.4.2. Modeling Results without Estimated Losses Using MODFLOW, we created a three-dimensional groundwater flow model that accurately represents our study area. We performed a parameter sensitivity analysis to identify scaling factors for both HK and VANI to calibrate the model to the measured water levels in the area. Ultimately, a scaling factor of 0.37 and 1.4 were selected for HK and VANI respectively. This indicates that the original HK value was too high, allowing for excessive drainage not representative of actual conditions. Specifically in the area of interest surrounding the headwaters of the Manistee and Au Sable Rivers, the modeled water levels were deemed sufficiently accurate to properly model these watersheds. Figure 2.8 shows the comparison of observed vs. simulated water levels for the entire model area. The RMSE is 10.5. Figure 2.8. Observed vs. simulated heads. Plot of observed vs. simulated water levels across the entire modeled area. Modeled flow was also determined to be a sufficient representation of in-situ stream flow conditions. In Figure 2.9, observed flow is compared to simulated stream flow. The RMSE is 1.00. 73 Figure 2.9. Observed vs. simulated flow. Plot of observed vs. simulated flow at each stream location that in-situ flow data was collected. The steady state MODFLOW model provided the water flows as inputs to the MT3D solute transport model. Four MT3D models were run using these MODFLOW inputs: septic nitrogen, recharge nitrogen, septic phosphorus, and recharge phosphorus. Initially, before any scaling was applied to the simulated concentrations, the nitrogen results had reasonable correlation to the field data, while there was little correlation with phosphorus observations. The field TDN concentrations from May, compared to the modeled nitrogen concentrations, had an R2 value of 0.16. Despite this not being an excellent correlation, the paired data is similar in each watersheds, with the maximum modeled concentration being 1.44 mg/L and the maximum field concentration being 0.81 mg/L. August’s TDN load vs. the modeled loads has an R2 of 0.41, but the model tends to over predict slightly. The data in both May and August could be scaled to better match the field data (Figures 2.10.A and 2.10.B). The comparison of the raw model concentrations to the total phosphorus field data did not have a distinct trend. The May TP data had a 0.04 R2 when compared to the modeled 74 concentrations, and the maximum values were similar, 93.7 µg/L in the modeled data and 122.3 µg/L in the field data, but occurred in different watersheds. The August TP concentrations showed a poor correlation to the modeled data as well, with an R2 of 0.06. Again, the modeled and field maximum values were similar (93.7 and 84.8 µg/L respectively), but they occurred in different watersheds. Considering that soluble reactive phosphorus (SRP) is often the most dominant species in septic effluent, we also compared the field SRP data to our modeled concentrations. May SRP data had an R2 of 0.02 and the August SRP data had an R2 of 0.15. While these R2 values are still low, the data follows the expected trend better than the TP data. The SRP data in both months trended in a positive linear correlation, rather than the negative trend shown in the TP graphs. These distributions can be examined in Figure 2.10.C-F. 75 Figure 2.10. Modeled and observed concentrations. Modeled and observed concentrations, without estimated losses. A) May 2018 in situ total dissolved nitrogen (TDN) (mg/L) compared to modeled nitrogen (mg/L). B) August 2018 in situ TDN (mg/L) compared to modeled nitrogen (mg/L). C) May 2018 in situ total phosphorus (TP) (µg/L) compared to modeled phosphorus (µg/L). D) August 2018 in situ TP (µg/L) compared to modeled phosphorus (µg/L). E) May 2018 in situ soluble reactive phosphorus (SRP) (µg/L) compared to modeled phosphorus (µg/L). F) August 2018 in situ SRP (µg/L) compared to modeled phosphorus (µg/L). In all plots, the dashed line represents the trend within the data being compared, and the solid line shows a 1:1 correlation. 76 2.4.3. Estimated Losses of N and P during Transport As discussed in the methods, there are two scaling factors that we used to estimate losses of N and P during transport. Concentrations generated by the septic and the recharge model runs were likely not correctly representing the percent of each input that made it to the stream. To account for this, we utilized a septic scaling factor and a recharge scaling factor. Scaling values were calculated for each nutrient using a solver that maximized correlation between the model results and observations. Within the nitrogen analysis, the septic scaling factor was estimated to be 0.88 and the recharge scaling factor was estimated to be 0.63. Within the total phosphorus analysis, the septic scaling factor was determined to be 0.49 and the recharge factor was calculated to be 1.12. Within the SRP analysis, the values were 0.22 for septic and 0.23 for recharge. Utilizing these scaling factors resulted in a similar match between the modeled data and the field data in all four comparisons. The scaled nitrogen comparison in May resulted in an R2 value of 0.15, and a tendency to under predict (Figure 2.11.A- scaled linear plots). The scaled nitrogen comparison in August had an R2 of 0.37, and a slight tendency to under predict (Figure 2.11.B). The scaled phosphorus comparison for May resulted in an R2 of 0.03, still lacking a distinct trend (Figure 2.11.C). The scaled phosphorus comparison in August resulted in an R2 of 0.09, also lacking a distinct trend (Figure 2.11.D). The May scaled SRP comparison resulted in an R2 of 0.03, with a slight tendency to under predict larger field concentrations. The scaled SRP comparison in August resulted in an R2 of 0.23, with an accurate trend of prediction. 77 Figure 2.11. Scaled modeled and observed concentrations. In all plots, the dashed line represents the trend within the data being compared, and the solid line shows a 1:1 correlation. A) May 2018 in situ total dissolved nitrogen (TDN) (mg/L) compared to scaled modeled nitrogen (mg/L). B) August 2018 in situ TDN (mg/L) compared to scaled modeled nitrogen (mg/L). C) May 2018 in situ total phosphorus (TP) (µg/L) compared to scaled modeled phosphorus (µg/L). D) August 2018 in situ TP (µg/L) compared to scaled modeled phosphorus (µg/L). E) May 2018 in situ soluble reactive phosphorus (SRP) (µg/L) compared to scaled modeled phosphorus (µg/L). F) August 2018 in situ SRP (µg/L) compared to scaled modeled phosphorus (µg/L). 78 The final step of these analyses involved area normalizing the data for each watershed. The area of the watersheds varied widely – from 18 to 1,873 km2 – so there is an inherent bias based on size, the quantity of septic systems within each watershed, and the amount of stream processing that is able to occur. To reduce this bias, we divided each concentration, both modeled and in situ, by the watershed’s area. This resulted in considerably different relationships between the field and modeled loads. The May nitrogen comparison now has a R2 of 0.87 and the model tends to reasonably predict concentrations in all watersheds. The August nitrogen comparison resulted in an R2 of 0.86 and also tends to predict very reasonably in all watersheds. The trends in the nitrogen data are very similar between May and August. In the May phosphorus comparison, the R2 was 0.12 and the model consistently under predicted the concentrations. In August, the phosphorus R2 was 0.04, and the model considerably under predicted concentrations in most watersheds. The SRP comparisons resulted in an R2 of 0.34 in May and an R2 of 0.43 in August, both tending to only slightly under predict. R2 from the area normalized data was used as the metric to maximize correlation within the model results. An average of the R2 among the May data and the R2 from the August data was maximized to determine the final septic and recharge scaling factors. In initial investigations of the model, we evaluated how septics contributed to the total modeled load. Looking at the nitrogen data, the septic model concentrations accounted for an average of only 12% of the total modeled concentration. When compared to the field data, the modeled septic concentration was representative of an average of 18% of the field concentration from May, and 22% of the August concentration. On the other hand, septic concentrations accounted for 60% of the total modeled concentration of phosphorus. When compared to the field data, the modeled septic phosphorus concentration was considerably higher on average than 79 the field concentrations, representing an average of 145% of the field concentration in May, and 99% in August. When compared to field SRP data, the modeled results were even more likely to over predict. Box plots showing how much of the modeled nutrients are related to septic systems are shown in Figure 2.12. Figure 2.12. Modeled nutrients attributed to septic systems. The distribution of the percentage of the total modeled concentrations accounted for by the modeled septic concentrations within each watershed of interest. To further investigate the data and evaluate if the trends observed in our analysis could be linked to other attributes of the watersheds, a residual analysis was performed. The residuals (modeled – observed concentrations) for each of the four datasets were calculated and compared to total area of the watershed, the associated stream system, the total number of septic systems within the watershed, the average nutrient input within the watershed, and three percentages of land cover: urban, forested, and agricultural. In the four TDN and TP comparisons, there was not a clear trend in relation to watershed size. Both small and large watersheds had residuals near zero, and larger residuals tended to be 80 from random watersheds with an area near the middle of the distribution (Figures 2.13A-D). When comparing the results to the stream system they were derived from, there tended to be slightly more variation in residuals in the Au Sable than the Manistee. Though the Au Sable presented a more normal distribution of the residuals, the Manistee tended to have under predicted watersheds. In comparison with the total count of septic systems with a watershed, the model tended to slightly over predict concentrations in watersheds with higher amounts of septic systems in all comparisons. Specifically, watershed A2, which has the second highest quantity of septic systems within its bounds, was consistently over predicted in all four comparisons. When compared to input nutrient concentrations, there was not a distinct trend; both large and small concentrations were both over and under predicted by the model in all four comparisons (Figures 2.13E-H). In comparisons to percentages of different land covers, there were not any distinct trends. The likelihood of under or over predicting was not dependent on land cover. In analyzing these same variables with regard to the SRP data, it revealed similar trends as described above, but highlighted the fact that the model tends to either accurately predict or over predict when only compared to SRP – there are no major under predictions in this data. After identifying the need to adjust some of our inputs, the final MODFLOW results are accurate and representative of the study area, specifically the subwatershed area identified in Figure 2.2. Utilizing the input layers from LHM and other sources, via MODFLOW, we were able to replicate water levels within our model area with a high level of accuracy (RMSE of 10.488). Establishing an accurate flow model provides a starting point for our nutrient transport model. 81 Figure 2.13. Residual analysis results. A) May nitrogen residuals compared to watershed area. B) August nitrogen residuals compared to watershed area. C) May phosphorus residuals compared to watershed area. D) August phosphorus residuals compared to watershed area. E) May nitrogen residuals compared to average initial septic input within the watershed. F) August nitrogen residuals compared to average initial septic input within the watershed. G) May phosphorus residuals compared to average initial septic input within the watershed. H) August phosphorus residuals compared to average initial septic input within the watershed. 82 2.4.3. Model Discussion By comparing modeled nutrient data to the nutrients sampled in the field, we were able to discern a variety of findings related to nutrient loading in the Manistee and Au Sable watersheds. We initially compared the raw modeled data, and from there used scaling factors and solvers to draw conclusions on septic systems’ ability to capture both nitrogen and phosphorus. From an initial analysis of the raw modeled data, a few telling trends were discovered. The nitrogen comparison in both May and August, illustrated that many of the watersheds had a similar modeled value in relation to the in situ measurement, but the trend was not particularly strong across all watersheds. The model was not consistently predicting concentrations to be high or low, it varied between each watershed. This pattern of low correlation with a typically low error between modeled and field values is indicative that the modeled loads are not properly accounting for a specific scaling variable. Considering that this relationship is consistent between both months, this issue of low correlation is likely derived from septic and recharge inputs not being properly accounted for in the model. This gap was remedied via source scaling. Both the May and August phosphorus comparisons showed a non-distinct trend. Specifically, the relationship in both months lacked a true pattern, with the model both vastly over predicting and under predicting concentrations. This lack of a pattern is more indicative of the model not properly predicting how much of the septic and recharge inputs are making it to the stream. Specifically, phosphorus is a very limiting nutrient – especially in drier months, such as August. Typically, phosphorus is strongly limiting to plants and also tends to sorb to soils, and therefore will not make it to the surface water via recharge. Instead, it primarily travels via overland flow paths. Overland runoff is less common in drier months, such as August. To attempt to remedy this, we used the solver previously described to not only create a more 83 accurate correlation between the modeled and in situ data, but to also ultimately answer our question of determining average ratios of septic and recharge inputs reaching the surface water. To provide additional perspective, we repeated the phosphorus analysis using the field SRP data. This resulted in a generally better match, but a tendency to over predict. Although SRP is considered the dominant phosphorus species in septic effluent (Oldfield et al. 2020), it is just a portion of total phosphorus, so the typical over prediction of the model is logical. Once the source scaling factors were applied, the modeled results were better calibrated to the field data. In the nitrogen analyses, the scaled model loads were still slightly under predicting (Figures 11A and 11C), with the August data being slightly closer to the 1:1 line. In investigating which watersheds were accountable for the trend of under predicting, it was identified that the watersheds most under predicted in both May and August were typically smaller in area. This indicates that while our scaling values mitigated calibration issues for the most part, they struggled to fully represent concentrations in smaller watersheds. This is likely related to the fact that larger watersheds process nutrients differently than smaller watersheds. This could explain why some smaller watersheds had high field nitrogen, but comparably low modeled nitrogen. To account for this, we area normalized the data in order to properly assess source scaling factors. The phosphorus data on the other hand, still lacks a trend despite the application of source scaling factors. Small watersheds were both over and under predicted. Using the SRP comparison, the results are more meaningful – showing a fairly strong correlation to SRP in the field and modeled phosphorus. Though more accurate than the unscaled results, there is still a fair amount of watersheds of all areas that are not well represented in the model. This likely indicates that the model is not properly modeling phosphorus transport. 84 The final load scaling factors provide the main take away from this study. Within the nitrogen investigation, we found that septic systems only captured 12% of the nitrogen introduced, allowing 88% to enter the groundwater. Conversely, septic systems were found on average to capture 51% of phosphorus, specifically capturing 78% of SRP. This indicates that septic systems are better at capturing phosphorus. As for recharge concentrations, it was determined that 63% of the nitrogen and 112% of the total phosphorus made it to the surface water. The nitrogen recharge value indicates that some of the recharge input concentration is lost or otherwise retained while it travels to the surface waters of interest. The phosphorus recharge value indicates that it is likely that our recharge input data source is under predicting total phosphorus. We were also able to determine from our initial evaluation of the modeled data, that septic sources contributed more to the total phosphorus modeled load, and contributed less to the total nitrogen modeled load. When paired with the septic attenuation values, this provides insight into the makeup of the nutrient landscape in this region. Despite the fact that septic systems are better at attenuating phosphorus, the phosphorus that septic systems do contribute is more significant than the load contributed by other sources. Conversely, although septic systems only capture 12% of nitrogen loads, septic loads do not account for a large portion of the total nitrogen load in our model. Other sources of nitrogen contribute an average of 88% of the modeled loads. The ability of septic systems to capture both nitrogen and phosphorus determined by this study, confirmed previous literature. Costa et al. (2002) established that septic systems can attenuate 21-25% of nitrogen. Our study attributed septic systems with capturing 12% of input nitrogen. Though lower, our results are similar to Costa’s study. This similarity not only provides 85 credence to our groundwater model, but also shows that similar results can be established through vastly different studies. Robertson et al. (1998) posited that septic systems can capture anywhere from 23-99% of input phosphorus, but their success varies widely on the mineralogy of the soil surrounding the drainage field, specifically regarding the grain size and calcareous sediment content (Robertson et al. 2019). Several other studies (Jones & Lee, 1979; Robertson & Harman, 1999; Gill et al. 2009) supported this notion and verified that phosphorus is more effectively captured than nitrogen. Our findings also correspond with these points. Our analysis predicted that septic systems in this region were able to attenuate 51% of input phosphorus on average. This falls within the range identified by Robertson (1998), and is greater than our calculated nitrogen attenuation rate. Our SRP findings also correspond to the literature. Robertson et al. (2019) found that 66-99% of SRP was attenuated, and Oldsfield et al. (2020) echoed this sentiment by finding that septic systems only contributed <1-36% of the instream SRP loads. We found that septic systems attenuate 78% of SRP, allowing 22% of it to reach the stream; these values fall within both Robertson’s and Oldfield’s expected ranges. Our findings correlate well with that of the literature and indicate that septic systems are more capable of capturing phosphorus, compared to nitrogen, attenuating 49 and 88%, respectively. The fact that our nitrogen values are on the lower end of previously established values, could relate to soil type. The literature indicates that installing a septic system in coarse- grained soil can reduce attenuation immensely. Much of our region of interest includes sand as its dominant surficial soil. This could explain why our septic nitrogen attenuation rates are on the lower end of what the literature predicts. Per our findings, septic systems are typically not attenuating a large portion of the nutrients input into them. This supports our hypothesis that septic systems are considerable 86 contributors of nutrients into the waterways in remote areas such as the region of the Manistee and Au Sable headwaters in Northern Michigan. Nutrients within waterways can contribute to eutrophication, algal blooms, and increased presence of invasive species (Hannah et al. 2020) within streams, which can negatively impact wildlife and humans alike. Considering that phosphorus is often a limiting nutrient in many ecosystem processes, septic systems’ greater ability to capture it may limit some of the negative effects associated with nutrient loads within streams. Per Hannah et al. (2020), phosphorus was identified as a key factor contributing to the presence of invasive species whether nitrogen was present in large quantities or not. With that, if phosphorus loads via septic systems can be limited, invasive species should pose less of a threat in these natural regions. Additionally, a greater focus should be placed on installing drainage fields in appropriate, fine-grained soil. Withers et al. (2013) found that septic systems installed in coarse-grained soil only had a phosphorus attenuation rate of 10-30%. Fine materials, such as clay, are also capable of retarding some nitrogen species (Wilhelm 1994). Finally, while not explicitly addressed in this model, installing septic systems farther from streams and waterways, will further limit the nutrient loads that make it to the streams. Beyond determining the likely average contributions of septic and recharge in these regions, our residual analysis elucidated underlying trends within the models ability to accurately predict loads. Comparing the residuals to the area of each subwatershed indicated that area was not highly correlated to the model’s accuracy. In all comparisons, the model predicted the concentrations in the largest watersheds fairly accurately. Mid-sized and small watershed concentrations followed a less distinct trend, when depending on the month and the nutrient, the model over predicted, under predicted, and correctly predicted the concentrations within them. 87 This indicates that despite the model frequently doing a good job of predicting concentrations within large watersheds, there is not a distinct trend of model ability in relation to area. We also investigated trends within each stream system. The Au Sable tended to have more variability, but had a more even distribution, centered on a residual of zero. The Manistee tended to have less variation, but tended to under predict. The Au Sable’s increased range of residuals is primarily attributed to watershed A2, which was consistently over predicted in all four comparisons. A2 falls in the middle of the distribution of area, but has the highest percentage of urban land cover, and the second highest amount of septic systems amongst all 32 watersheds. Additionally, the City of Gaylord’s Waste Water Treatment Plant falls within the watershed. Perhaps this elevated level of human impact, and the elevated level of septic systems associated with it, caused the model to consistently over predict the concentrations within this watershed. There was not much of a trend when the residuals were compared to the average septic input concentrations. The fact that the model under predicted, over predicted, and accurately predicted the loads in watersheds with both high and low nutrient inputs helps to rule out an inherent bias related to input concentrations. Considering that this region is primarily dominated by forested land cover, the resulting lack of trends related to land cover is not surprising. Most of the watersheds had very similar land cover compositions, mostly dominated by forest, with a small amount of urban and agricultural. Directly relating variability within the model, to a dataset with very little variation resulted in a lack of a distinct trend. When conducting the same residual analyses for SRP, there was one main trend. No matter the variable, the model was either near an accurate prediction, or over predicting the 88 concentration. The tendency to over predict is likely due to the fact that SRP is only a portion of total phosphorus, and there are likely other contributing species that make up the whole of what the model is simulating. Through our analyses, septic systems were determined to lack the ability to fully capture nitrogen and phosphorus. On average, only 12% of nitrogen and 51% of phosphorus were attenuated, with the remaining portions of the nutrients making it to the surface water. The model effectively predicted nitrogen concentrations in the remote regions that it was intended for and generated attenuation values very similar to that of the literature. We are less confident in the model’s ability to properly model phosphorus. With that said, we do feel confident that the analyses are properly modeling septic systems within the remote region of the northern Lower Peninsula of Michigan, with more accuracy when modeling nitrogen. 2.5. Conclusions We evaluated the relationship between septic systems and stream concentrations of two common nutrients: nitrogen and phosphorus. This study was conducted in a remote region of the northern Lower Peninsula of Michigan along the Au Sable and Manistee Rivers, two important waterways both naturally and recreationally. We found that septic systems were capable of capturing approximately 12% of the input nitrogen, and 51% of input phosphorus. When studying SRP, the septic retention rate increases considerably to 78%. The remaining portion of each of these nutrients were identified as having reached surface water within the local subwatershed. This shows that septic systems are not particularly efficient in capturing the nutrients that are allocated to them, specifically in regards to nitrogen. These values were similar to those identified in previous studies, such as the findings of Robertson et al. (2019), which 89 indicated that SRP removal by septic systems was between 66 and 90%. This indicates that we have created a functional model to predict attenuation abilities of the septic system of a region. Uncertainties in this study could arise from a variety of places, though we did our best to mitigate their impacts. Initially, there is the possibility of sample errors either in the field or in the laboratory. The utmost care was taken in both settings, but human error does run the risk of confounding data. Secondly, the model is lacking in robustness. There were additional capabilities within MT3D and MODPATH that were not utilized in this analysis. The model was deemed accurate enough as it is, and therefore we decided not to incorporate these additional parameters at this time. In the future, we plan to build upon this model and implement additional functionalities in order to study new hypotheses. With that said, we do feel confident in our current model and scaling procedure. In investigating the residuals, we found no indication of an inherent input concentration bias. Additionally, we confirmed our final scaling values via sensitivity analysis and returned the same result as our initial solver analysis. Finally, the results of the model indicated that phosphorus is not being modeled as well as nitrogen is. In both the raw comparisons and the scaled comparisons, the modeled phosphorus concentrations did not compare well to the field data. Some watersheds had modeled values that compared accurately to our in situ values, but there are too many instances in which the concentrations were vastly under and over predicted to have full confidence in the phosphorus model. Given that the recharge scaling factor for phosphorus was determined to be greater than 100%, this indicates that our recharge product may be systematically underrepresenting phosphorus concentrations. There is an updated data product from Wan et al. that could resolve this issue in the future. This work provides a basis for improved septic system understanding. By identifying attenuation abilities of septic systems for both nitrogen and phosphorus, this work can inform 90 future improvements to septic systems. To evaluate certain landscape factors such as soil type and its impacts on attenuation rates, septic systems within watersheds with high stream loads could be compared to those in watersheds with low loads and assessed for a trend relative to the landscape factor of interest. By better understanding the nutrients loaded to the landscape via septic systems, action can be taken to prevent harmful impacts on the environment. Additionally, by also evaluating the recharge concentration product created by Wan et al. (2020), we can provide useful data that could be used to improve that dataset. Additionally, this model was designed for a remote region with few nutrient inputs outside of septic systems and recharge; in the future, it could be expanded to include other nutrient sources and applied to more anthropogenically influenced regions. Finally, other sources of interest could be evaluated for their contributions to surface water loads. 91 2.6. Acknowledgments This chapter was coauthored by David Hyndman, Anthony Kendall, and Stephen Hamilton. This research was funded by National Aeronautics and Space Administration grants 80NSSC17K0262 and NNX11AC72G, and National Oceanic and Atmospheric Administration grant NA12OAR4320071. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of NASA or NOAA. 92 BIBLIOGRAPHY Angeloni, N., Jankowski, K., Tuchman, N., Kelly, J. (2006, 8 5). Effects of an invasive cattail species (Typha x glauca) on sediment nitrogen and microbial community composition in a freshwater wetland. FEMS Microbiology Letters, 263(1), 86-92. Beal, C., Gardner, E., & Menzies, N. (2005, 11). Process, performance, and pollution potential: A review of septic tank - soil absorption systems. Soil Research, 43(7), 781. Retrieved from http://www.publish.csiro.au/?paper=SR05018 Bedekar, Vivek, Morway, E.D., Langevin, C.D., and Tonkin, Matt, 2016, MT3D-USGS version 1: A U.S. Geological Survey release of MT3DMS updated with new and expanded transport capabilities for use with MODFLOW: U.S. Geological Survey Techniques and Methods 6-A53, 69 p., http://dx.doi.org/10.3133/tm6A53. Costa, J., Heufelder, G., Foss, S., Milham, N., & Howes, B. (2002). Nitrogen Removal Efficiencies of Three Alternative Septic System Technolo-gies and a Conventional Septic System a. Retrieved from http://m.buzzardsbay.org/etistuff/results/costaenvccarticle2.pdf Domagalski, J., & Johnson, H. (2011, 10). Subsurface transport of orthophosphate in five agricultural watersheds, USA. Journal of Hydrology, 409(1-2), 157-171. Geddes, P., Grancharova, T., Kelly, J., Treering, D., & Tuchman, N. (2014, 9 22). Effects of invasive Typha × glauca on wetland nutrient pools, denitrification, and bacterial communities are influenced by time since invasion. Aquatic Ecology, 48(3), 247-258 Gill, L., O'Luanaigh, N., Johnston, P., Misstear, B., & O'Suilleabhain, C. (2009, 6). Nutrient loading on subsoils from on-site wastewater effluent, comparing septic tank and secondary treatment systems. Water Research, 43(10), 2739-2749. Retrieved from https://www.sciencedirect.com/science/article/pii/S0043135409001833 Goldberg, D., Martina, J., Elgersma, K., & Currie, W. (2017, 8 2). Plant Size and Competitive Dynamics along Nutrient Gradients. The American Naturalist, 190(2), 229-243. Hamlin, Q. F., A. D. Kendall, S. Martin, H. Whitenack, J. Roush, B. Hannah, D. W. Hyndman (2020). SENSEmap-USGLB: Nitrogen and Phosphorus Inputs, HydroShare, https://doi.org/10.4211/hs.1a116e5460e24177999c7bd6f8292421 Hannah, B., Kendall, A., Martin, S., & Hyndman, D. (2020). Quantifying linkages between watershed factors and coastal wetland plant invasion in the US Great Lakes. Landscape Ecology, 35, 2843-2861. Harbaugh, A.W., Langevin, C.D., Hughes, J.D., Niswonger, R.N., and Konikow, L. F., 2017, MODFLOW-2005 version 1.12.00, the U.S. Geological Survey modular groundwater model: U.S. Geological Survey Software Release, 03 February 2017, http://dx.doi.org/10.5066/F7RF5S7G 93 Holman, I., Whelan, M., Howden, N., Bellamy, P., Willby, N., Rivas-Casado, M., & McConvey, P. (2008, 12). Phosphorus in groundwater - An overlooked contributor to eutrophication? Hydrological Processes, 22(26), 5121-5127. Hyndman, D.W., A. D. Kendall, and N. R.H. Welty, 2007, Evaluating Temporal and Spatial Variations in Recharge and Streamflow Using the Integrated Landscape Hydrology Model (ILHM), AGU Monograph, Subsurface Hydrology: Data Integration for Properties and Processes, p. 121-142, DOI:10.1029/171gm11. Iverson, G., Humphrey, C., O'Driscoll, M., Sanderford, C., Jernigan, J., & Serozi, B. (2018, 4). Nutrient exports from watersheds with varying septic system densities in the North Carolina Piedmont. Journal of Environmental Management, 211, 206-217. James, C., Miller-Schulze, J., Ultican, S., Gipe, A., & Baker, J. (2016, 9). Evaluating Contaminants of Emerging Concern as tracers of wastewater from septic systems. Water Research, 101, 241-251. Retrieved from https://www.sciencedirect.com/science/article/pii/S0043135416303724#bib21 Jones, R., & Lee, G. (1979). Septic Tank Wastewater Disposal Systems as Phosphorus Sources for Surface. Retrieved from https://www.jstor.org/stable/pdf/25040491.pdf?refreqid=excelsior%3A6fffbcf6f1283 51f21d4c24bdf126da5 Katz, B., Eberts, S., & Kauffman, L. (2011, 2). Using Cl/Br ratios and other indicators to assess potential impacts on groundwater quality from septic systems: A review and examples from principal aquifers in the United States. Journal of Hydrology, 397(3-4), 151-166. Retrieved from https://www.sciencedirect.com/science/article/pii/S0022169410007134#! Oldfield, L., Roy, J., & Robinson, C. (2020, 8). Investigating the use of the artificial sweetener acesulfame to evaluate septic system inputs and their nutrient loads to streams at the watershed scale. Journal of Hydrology, 587, 124918. Postma, F. B., Gold, A. J., & Loomis, G. W. (1992). Nutrient and Microbial Movement from Seasonally-Used Septic Systems. Journal of Environmental Health, 55(2), 5–10. http://www.jstor.org/stable/44534507 Robertson, W., & Harman, J. (1999, 3). Phosphate Plume Persistence at Two Decommissioned Septic System Sites. Ground Water, 37(2), 228-236. Retrieved from http://doi.wiley.com/10.1111/j.1745-6584.1999.tb00978.x Robertson, W., Schiff, S., & Ptacek, C. (1998, 11). Review of Phosphate Mobility and Persistence in 10 Septic System Plumes. Ground Water, 36(6), 1000-1010. Retrieved from http://doi.wiley.com/10.1111/j.1745-6584.1998.tb02107.x Robertson, W., Van Stempvoort, D., & Schiff, S. (2019, 11). Review of phosphorus attenuation in groundwater plumes from 24 septic systems. Science of The Total Environment, 692, 640-652. 94 Sowah, R., Habteselassie, M., Radcliffe, D., Bauske, E., & Risse, M. (2017, 1). Isolating the impact of septic systems on fecal pollution in streams of suburban watersheds in Georgia, United States. Water Research, 108, 330-338. Tamang, A., Roy, J., Boreux, M., & Robinson, C. (2021, 10). Variation in septic system effluent inputs to tributaries in multiple subwatersheds and approaches to distinguish contributing pathways and areas. Science of The Total Environment, 151054. Retrieved from https://linkinghub.elsevier.com/retrieve/pii/S0048969721061325 Tilman, D. (1990, 5). Constraints and Tradeoffs: Toward a Predictive Theory of Competition and Succession. Oikos, 58(1), 3-15. US Environmental Protection Agency (2002). Onsite Wastewater Treatment Systems Manual. U.S. EPA Septic systems overview. Available at: https://www.epa.gov/septic/septic-systems- overview (2018) Washington D.C Walker, W., Bouma, J., Keeney, D., & Olcott, P. (1973). Nitrogen Transformations During Subsurface Disposal of Septic Tank Effluent in Sands: II. Ground Water Quality1. Journal of Environment Quality, 2(4), 521. Retrieved from https://www.agronomy.org/publications/jeq/abstracts/2/4/JEQ0020040521 Wan, L., Kendall, A., Martin, S., Hamlin, Q., & Hyndman, D. (2022 submitted). Groundwater and septic plumes are important nutrient transport pathways to the Great Lakes. Environmental Research Letters. Whelan, B., & Titamnis, Z. (1982, 2). Daily chemical variability of domestic septic tank effluent. Water, Air, and Soil Pollution, 17(2), 131-139. Retrieved from http://link.springer.com/10.1007/BF00283296 Wilhelm, S., Schiff, S., & Cherry, J. (1994, 11). Biogeochemical Evolution of Domestic Waste Water in Septic Systems: 1. Conceptual Model. Ground Water, 32(6), 905-916. Retrieved from http://doi.wiley.com/10.1111/j.1745-6584.1994.tb00930.x Wilhelm, S., Schiff, S., & Robertson, W. (1994, 2). Chemical fate and transport in a domestic septic system: Unsaturated and saturated zone geochemistry. Environmental Toxicology and Chemistry, 13(2), 193-203. Retrieved from http://doi.wiley.com/10.1002/etc.5620130203 Withers, P., Jarvie, H., & Stoate, C. (2011, 4). Quantifying the impact of septic tank systems on eutrophication risk in rural headwaters. Environment International, 37(3), 644-653. Retrieved from https://www.sciencedirect.com/science/article/pii/S0160412011000043 Withers, P., May, L., Jarvie, H., Jordan, P., Doody, D., Foy, R., . . . Deal, N. (2012, 12). Nutrient emissions to water from septic tank systems in rural catchments: Uncertainties and implications for policy. Environmental Science & Policy, 24, 71-82. Retrieved from https://www.sciencedirect.com/science/article/pii/S1462901112001293#! 95 APPENDIX B: CHAPTER 2 DATA Table A.2.1. Complete in-situ data. Concentrations measured for each analyte for each sample location. Shown in mass/L, note the change in mass unit for the phosphorus compounds. Septic Data May Data Site Total Seasonal Full Occupation Full Occupation NPOC TDN Na K Mg Ca Cl NO as N SO4 NH4 as N SRP ~SRP TP DOP & SRP Systems Systems Systems Proportion (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (ug/L) (ug/L) (ug/L) (ug/L) A1 55 17 35 0.636 6.82 0.29 2.27 0.66 8.19 42.15 4.46 0.04 4.82 0.023 0.86 7.98 4.78 4.29 A2 4019 900 2886 0.718 4.03 0.3 5.48 0.89 8.93 41.31 10.14 0.01 4.01 0.034 0.47 10.42 5.62 4.68 A3 418 126 261 0.624 4.29 0.38 4.31 0.71 9.65 49.61 7.32 0.09 5.49 0.036 1.63 6.94 6.88 4.29 A4 433 106 307 0.709 4.82 0.39 5.03 0.74 8.95 47.04 8.37 0.09 5.41 0.029 1.24 11.11 27.09 5.46 A6 184 65 107 0.582 9.77 0.42 2.86 0.75 8.46 43.78 4.55 0.11 5.24 0 3.94 11.46 10.25 5.07 A9 173 51 118 0.682 5.02 0.3 1.13 0.55 8.71 43.11 0.73 0.01 6.1 0.06 2.40 5.20 10.25 4.29 A11 2853 937 1797 0.630 5.93 0.51 6.12 0.73 8.11 41.86 9.7 0.11 5.81 0.013 3.17 12.50 41.83 6.63 A13 6460 2754 3292 0.510 13.25 0.61 5.52 0.9 5.72 26.18 8.14 0.07 3.31 0 11.66 11.46 24.57 13.63 A14 1161 572 563 0.485 12.6 0.5 5.3 0.77 6.9 33.33 8.12 0.06 4.71 0.011 4.72 9.72 20.36 8.57 A16 2866 939 1805 0.630 6.78 0.41 2.88 0.77 8.74 44.83 4.5 0.11 4.69 0 5.49 11.11 24.57 7.02 A18 427 247 166 0.389 4.98 0.37 1.52 0.43 8.12 35.87 1.1 0.05 10 0.029 2.40 5.54 122.26 8.18 A19 3843 2021 1611 0.419 8.27 0.41 2.41 0.64 6.14 32.93 3.51 0.01 3.3 0 2.79 6.24 21.62 7.02 A21 1417 821 548 0.387 5.85 0.31 3.06 0.52 10.42 44.42 3.15 0.12 12.97 0 3.56 6.94 16.15 7.41 A23 1199 354 790 0.659 18.34 0.75 8.92 0.75 6.1 29.81 16.16 0.1 4.02 0.026 8.96 9.72 16.15 12.08 M1 563 174 372 0.661 4.15 0.24 1.36 0.63 8.16 45.22 2.76 0.05 6.06 0.011 1.63 8.33 15.73 3.51 M2 329 137 176 0.535 3.57 0.32 3.17 0.62 9.81 50.64 4.66 0.12 5.99 0.033 3.56 4.85 12.36 4.29 M3 393 237 143 0.364 8.23 0.42 1.99 0.59 8.26 43.7 3.66 0.12 4.93 0.076 1.24 5.89 12.36 3.90 M4 638 331 291 0.456 5.66 0.47 3.17 0.53 8.65 39.02 4.57 0.1 6.74 0.074 2.79 4.50 11.94 3.90 M5B 43 13 30 0.698 8.58 0.3 2.01 0.64 8.31 41.83 5.17 0.03 4.2 0.005 3.17 5.89 8.57 10.52 M6 615 287 310 0.504 4.58 0.19 3.55 0.58 9.04 45.3 5.02 0 6.22 0 1.63 9.37 21.62 3.51 M7 9 2 7 0.778 5.06 0.36 1.46 0.53 9.45 42.35 0.91 0.19 9.75 0.016 3.17 3.80 16.15 4.29 M8 143 68 69 0.483 3.96 0.23 2.14 0.55 8.31 39.96 1.55 0.04 11.8 0 3.17 4.85 14.88 6.63 M9A 626 329 269 0.430 2.76 0.9 6.95 38.39 3.86 0.02 5.11 0.094 3.56 6.24 12.36 5.85 M9B 904 414 443 0.490 9.7 0.38 1.89 0.63 6.53 36.08 3.14 0.02 3.25 0.004 3.17 7.98 15.73 5.07 M9C 158 29 115 0.728 7.3 0.41 2.96 0.77 6.93 38.66 4.97 0.1 3.51 0.008 3.94 7.28 22.88 6.24 M9D 7 4 3 0.429 8.91 0.54 2.89 0.85 6.44 36.6 4.74 0.11 3.01 0.012 6.26 8.33 23.73 8.18 M9E 154 61 84 0.545 9.71 0.54 2.77 0.95 6.24 35.76 4.39 0.05 3.41 0.012 3.17 8.33 19.51 8.96 M10 104 62 40 0.385 8.28 0.3 3.63 0.61 8.94 44.25 5.09 0.06 7.66 0.017 0.86 6.94 21.20 7.80 M11 7 2 5 0.714 14.89 0.7 0.88 0.8 4.22 21.23 0.8 0.13 1.85 0.025 7.03 12.50 27.94 12.85 M13 200 98 89 0.445 6.62 0.26 2.08 0.57 9.26 42.79 2 0.03 13.93 0.005 4.72 7.28 35.94 8.18 M15 317 71 214 0.675 4.83 0.42 3.34 0.62 10.57 42.41 4.99 0.21 5.5 0.019 4.33 7.98 26.25 8.18 M16 197 12 166 0.843 9.65 0.81 2.79 0.7 13.29 53.47 6.55 0.75 8.34 0 3.17 5.54 11.51 3.90 August Data May Discharge Data August Discharge Data Site NPOC TDN Na K Mg Ca Cl NO as N SO4 NH4 as N SRP ~SRP TP DOP & SRP Discharge Discharge Discharge Discharge (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (mg/L) (ug/L) (ug/L) (ug/L) (ug/L) (m3/sec) (L/sec) (m3/sec) (L/sec) A1 9.53 0.28 2.72 0.56 9.26 38.42 5.14 0.01 4.51 0.005 2.40 3.43 28.08 5.09 0.2514 251.4 ? 200 A2 5.35 0.35 6.74 0.98 9.96 37.54 12.43 0.02 4.12 0 4.45 5.48 11.60 7.00 0.261 261 0.124 124 A3 7.69 0.31 4.65 0.65 10.97 53.46 7.85 0.08 5.52 0.017 3.77 5.48 17.73 5.85 1.751 1751 1.23 1230 A4 4.49 0.3 5.54 0.72 10.28 50.95 9.21 0.09 5.77 0.009 4.79 8.21 19.65 5.47 2.948 2948 1.838 1838 A6 3.68 0.37 2.82 0.59 11.38 52.86 4.48 0.14 7.03 0.004 4.79 10.60 12.37 5.47 1.651 1651 0.673 673 A9 7.46 0.23 1.31 0.47 10.34 45.32 0.75 0.01 6.4 0.016 3.77 8.89 11.98 4.32 0.699 699 0.48 480 A11 2.99 0.31 6.69 0.56 9.64 45.53 10.58 0.1 6.88 0.002 8.21 9.57 10.45 7.77 11.563 11563 7.042 7042 A13 3.62 0.3 8.15 0.55 10.92 46.73 13.05 0.12 11.35 0.003 12.31 11.96 9.68 7.00 14.188 14188 2.037 2037 A14 5.87 0.25 6.29 0.52 10.05 44.8 9.55 0.07 8.22 0.005 6.16 6.50 11.60 2.79 28.233 28233 10.722 10722 A16 3.64 0.19 3.62 0.49 11.19 49.15 5.34 0.01 5.94 0 6.50 9.92 11.60 8.15 6.022 6022 2.53 2530 A18 4.21 0.27 1.9 0.41 10.51 44.9 1.21 0.04 10.17 0.002 4.45 6.16 8.53 5.85 1.099 1099 0.672 672 A19 11.17 0.42 3.13 0.38 10.35 46.72 3.43 0.05 5.42 0.013 7.53 12.65 18.50 11.60 2.327 2327 0.752 752 A21 2.68 0.23 3.05 0.44 11.31 47.14 3.04 0.06 12.76 0.007 5.13 12.65 40.34 19.27 2.688 2688 2.373 2373 A23 3.88 0.27 10.41 0.65 10.21 47.25 20.16 0.06 8 0 8.21 9.92 31.15 27.70 1.021 1021 2.81 2810 M1 8.23 0.25 1.58 0.67 9.2 50.5 3.46 0.01 5.64 0.015 4.45 6.16 84.80 84.03 0.685 685 0.142 142 M2 3.28 0.26 3.22 0.67 10.21 52.13 5.04 0.1 5.91 0.003 4.11 5.13 37.66 26.55 4.198 4198 2.853 2853 M3 3.4 0.35 2.08 0.55 9.43 49.33 4.1 0.18 5.75 0.001 4.11 5.82 38.04 84.42 0.4916 491.6 0.251 251 M4 3.9 0.35 3.65 0.57 9.03 38.54 5.46 0.13 7.17 0 2.40 4.45 45.71 38.04 0.938 938 0.538 538 M5B 3.96 0.3 1.98 0.49 10.02 43.75 5.18 0.05 4.95 0.01 4.11 6.16 48.01 38.04 0.121 121 0.091 91 M6 6 0.21 3.77 0.57 9.6 47.37 5.28 0.07 6.38 0 2.40 2.74 40.73 38.81 9.527 9527 7.326 7326 M7 2.46 0.35 1.54 0.47 10.64 47.52 0.9 0.24 10.12 0.004 2.74 5.82 38.43 43.79 0.309 309 0.288 288 M8 3.57 0.25 2.25 0.59 9.75 46.34 1.39 0.03 14.3 0.019 4.79 6.84 43.03 39.58 0.809 809 0.538 538 M9A 11.2 0.31 1.85 0.65 10.17 53.12 3.02 0.01 5.88 0.013 5.48 6.84 43.41 42.26 0.1659 165.9 0.039 39 M9B 8.69 0.23 1.82 0.57 9.82 49.99 2.64 0.05 5.58 0.015 3.43 6.84 48.39 40.34 1.051 1051 0.314 314 M9C 4.55 0.31 3.97 0.7 9.6 51.48 6.95 0.12 5.22 0.01 4.79 26.08 58.36 46.86 1.803 1803 0.728 728 M9D 9.41 0.35 4.26 0.66 9.48 48.53 7.15 0.12 4.84 0.013 5.48 5.54 66.40 48.39 2.243 2243 0.755 755 M9E 9.57 0.34 4.19 0.75 9.6 48.48 6.69 0.1 7.05 0.018 3.09 6.84 50.69 51.46 2.826 2826 1.208 1208 M10 3.36 0.19 3.57 0.54 9.64 47.24 5.01 0.09 8.19 0.023 3.77 3.77 72.92 83.65 12.657 12657 10.778 10778 M11 3.37 0.74 1.51 0.6 10.13 42.1 1.21 0.66 6.07 0.003 3.43 6.50 18.88 3.17 0.213 213 0.044 44 M13 7.36 0.15 2.46 0.55 11.28 48.94 2.33 0.01 19.35 0 4.11 8.21 10.45 7.00 1.51 1510 0.497 497 M15 7.41 0.34 3.38 0.59 11.24 44.64 4.99 0.22 5.78 0 4.11 8.21 13.13 6.62 0.448 448 0.826 826 M16 2.81 0.81 2.8 0.71 13.26 53.24 6.07 0.73 8.64 0 3.09 6.16 7.00 3.17 0.125 125 0.094 94 96