COMBINING REMOTE SENSING, MACHINE-LEARNING AND MECHANISTIC MODELING TO IMPROVE COASTAL HYDRODYNAMICS AND WATER QUALITY MODELING IN THE LAURENTIAN GREAT LAKES By Saeed Memari A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Civil Engineering – Doctor of Philosophy 2023 ABSTRACT Large lakes frequently act as early indicators of shifts in the environment and observations within the Great Lakes ecosystems continue to highlight a deterioration in water quality, a surge in the occurrences of algal blooms, and growing threats to indigenous species. Given the inherent complex dynamics of these inland seas and the growing environmental pressures, it is important to understand shifts in the intricate process dynamics governing these systems. Hydrodynamics and temperature, in particular, are fundamental variables that play significant roles as they influence multiple physical, chemical, and biological processes that take place within the lakes and their coastal areas. The goal of this study is to improve models that focus on coastal hydrodynamics and water quality in the Great Lakes. Extensive field datasets were collected in Lake Huron and Lake Erie over multiple years focusing on diverse factors affecting coastal processes including the roles of oscillating, bidirectional exchange flows between Lake Michigan and Lake Huron at the Straits of Mackinac, groundwater upwelling and submerged sinkholes in bays of Lake Huron, and contaminant plumes originating from rivers draining into the lakes. The performance of the models heavily depends on the quality of boundary forcing data and how the domain is discretized. Thus, a systematic assessment was done to improve the models by improving domain discretization through depth-adaptive triangular meshes and nested-grid methods. Additionally, detailed meteorological forcing fields were created with reanalysis and in-situ datasets. High-resolution time series data for water quality variables were generated using machine learning models since traditional monitoring data are notorious for their low temporal resolution, especially for microbiological water quality. The accuracy and performance of the models were tested against in-situ observations. This encompassed data on currents, lake levels, water temperature, and water quality variables (turbidity and Escherichia coli concentrations). High-resolution remote sensing imagery was also incorporated for a comprehensive evaluation of spatial plume dynamics. Novel insights from this research include an understanding of the crucial role played by the exchange flows in the Straits of Mackinac on transport timescales and biophysical processes in the bays of Lake Huron. The exchange flows significantly influence regions as far down as 50-70 km from the straits changing, among other things, bottom currents, which have important implications for biogeochemical processes, including the resuspension of bottom sediment, nutrient availability (e.g., nitrogen and phosphorus), and the growth and sloughing events of benthic algae such as Cladophora. Observed vertical velocities close to the lake bed in Thunder Bay, Lake Huron were found to be an order of magnitude higher compared to simulated vertical velocities of the same system using models that did not explicitly account for groundwater inflow from the karst lake bed. Models and data were used to estimate the upwelling groundwater flux and to quantify the impacts of ignoring groundwater in this system. The study highlights the significant benefits of merging best-available techniques and a fusion of mechanistic modeling, machine learning, and remote sensing to push the envelope of model performance in the Great Lakes. This integration further optimized the utilization of remote sensing imagery, particularly in coastal areas. This research is anticipated to aid management initiatives aimed at enhancing and preserving the resilience of coastal regions. Suggested citation: Memari, S. (2023), “Combining Remote Sensing, Machine-Learning and Mechanistic Modeling to Improve Coastal Hydrodynamics and Water Quality Modeling in the Laurentian Great Lakes”, Ph.D. Dissertation, Michigan State University, East Lansing, Michigan, United States. Copyright by SAEED MEMARI 2023 ACKNOWLEDGEMENTS This research was partially funded by the U.S. Geological Survey (Award No. G19AC00109). I wish to extend my sincere thanks to my academic advisor, Professor Mantha S. Phanikumar, for his invaluable guidance, enthusiastic support, and constant encouragement throughout my doctoral studies. My gratitude also extends to my dissertation committee members, Professors Yadu Pokhrel, Narendra Das, and Pang-Ning Tan, for their insightful suggestions and constructive feedback on my dissertation work. The fieldwork and data collection efforts were greatly aided by the support of the USGS team, Dr. Mary Anne Evans and Mr. Glen Black, whose assistance I deeply appreciate. I am grateful to Dr. Vishnu Boddeti for introducing me to deep learning and coordinate-based networks, and for our ongoing collaboration. I would like to acknowledge my peers Dr. Chelsea Weiskerger and Muhammad Bilal Zafar for their engaging discussion and support. I would also like to extend my heartfelt thanks to my friends Drs. Jason Smith, Mahdi Zareei, Mina Akbari, Alireza Safaripour, and Ghazaleh Ahmadi for their invaluable support, compassion, and enduring friendship. Most importantly, my heartfelt gratitude goes to my family for their unconditional love, support, and encouragement. v TABLE OF CONTENTS CHAPTER 1 INTRODUCTION .................................................................................................... 1 CHAPTER 2 ASSESSING TRANSPORT TIMESCALES IN LAKE HURON'S HAMMOND BAY: THE CRUCIAL ROLE OF THE STRAITS OF MACKINAC'S EXCHANGE FLOWS .... 7 CHAPTER 3 EVALUATING THE ROLE OF GROUNDWATER AND SUBMERGED SINKHOLES IN CIRCULATION AND THERMAL STRUCTURE OF THUNDER BAY, LAKE HURON ............................................................................................................................. 59 CHAPTER 4 INTEGRATING MACHINE LEARNING, NUMERICAL MODELING, AND REMOTE SENSING FOR THE ASSESSMENT OF WATER QUALITY INDICATORS ......... 92 CHAPTER 5 CONCLUSIONS .................................................................................................. 126 BIBLIOGRAPHY ....................................................................................................................... 128 vi CHAPTER 1 INTRODUCTION 1.1. Problem Description Lakes, oceans, and other water systems are vital indicators of the health and direction of our environment. The state of the Great Lakes highlights the fine line between natural habitats and the effects of human interventions. With increasing contaminant loads from both point and nonpoint sources, coupled with long residence times and favorable physical environments, the Great Lakes have become vulnerable to a range of environmental stressors (Tomlinson et al., 2010; Smith et al., 2015). Such challenges manifest in the form of nuisance algal blooms, dreissenid mussel invasions, deteriorating water quality, and potential threats to native flora and fauna, ultimately influencing the overall health of the Great Lakes ecosystem. Nonetheless, the issues go beyond this. With the increasing speed of climate change globally, lake ecosystems, with their interconnected surface and subsurface activities, act as revealing indicators of climate alterations. (Adrian et al., 2009; Schindler, 2009). Recent observations, such as the rising trend in global lake surface temperatures, underscore the magnitude and immediacy of the issue (Bilotta & Brazier, 2008). Moreover, the intertwined dynamics of surface waters and groundwater present a complex picture. Groundwater, with its temperature stability attributes, can often mitigate the impacts of surface warming, especially in ecosystems predominantly fed by it (Kaushal et al., 2010; Menberg et al., 2014; Malik et al., 2021; Barbieri et al., 2021). When we narrow our focus to areas such as bays or coastlines, the situation gets even more complex. While being components of broader lake systems, bays have shown specific reactions to climate shifts (Winslow et al., 2015). Parallelly, coastal regions present a unique set of challenges, 1 especially when monitoring water quality variables such as turbidity. Turbidity, an important indicator of water clarity and quality, is influenced by various factors, including particulate matter and suspended sediment solids (SSD). Any perturbation in water clarity can have cascading effects on aquatic ecosystems, making its monitoring essential (Bilotta & Brazier, 2008). Turbidity's significance extends beyond just an indicator of water quality. Its interplay with the clarity of water can indirectly measure the influx of pollutants that might have made their way into water bodies. Although methods exist to quantify turbidity, such as through Nephelometric Turbidity Units (NTU), Formazin Turbidity Units (FTU), and milligrams per liter (mg/l), the correlation between turbidity and SSD concentrations remains complex due to the sensitivity of these metrics to variations in particle size and shape, and the method by which it is measured (Bilotta & Brazier, 2008). This underscores the necessity of a more comprehensive understanding and assessment of turbidity concentration to evaluate water quality dynamics. Given these complex challenges, Researchers have utilized a range of techniques to track and forecast environmental factors, particularly in ever-changing coastal regions. Numerical modeling, grounded in three-dimensional conservation principles, has seen extensive use in simulating the fate and transport of various parameters like turbidity in water bodies (Sequeiros et al., 2009). Yet, the accuracy of these models often hinges on the realistic representation of multiple factors, emphasizing the need for further refinements (Zhu et al., 2022; Jalón-Rojas et al., 2021). Concurrent with these advancements, the consistent inflow of high-resolution remote sensing data is revolutionizing environmental oversight. When combined with the rapidly advancing area of machine learning, remote sensing unveils new possibilities for enhancing our knowledge and predictive skills (Arias-Rodriguez et al., 2023; Syariz et al., 2021). However, the interplay between 2 these advanced methods and traditional numerical models presents its own set of challenges, necessitating an integrated approach that harmoniously blends the strengths of each method. As we strive to comprehend and mitigate the impacts, an interdisciplinary and holistic approach becomes not just preferable but essential. Diving deeper into the complexities, it is evident that while our understanding of aquatic systems has progressed considerably, significant gaps remain. Semi-enclosed bays in marine and freshwater systems, for instance, possess unique physical conditions due to factors like complex shorelines and reduced offshore transport and mixing (Yuan et al., 2007; Kenov et al., 2012). These conditions often lead to unique challenges and phenomena, emphasizing the need for specialized studies and solutions. Moreover, the availability and accuracy of meteorological and hydrological data play a pivotal role in enhancing our understanding of aquatic systems. In regions like the Great Lakes, while there is an abundance of real-time and historical meteorological data, the need for improved and realistic meteorological forcing fields remains a pressing concern (Safaie et al., 2016; Xue et al., 2015). In Conclusion, as we stand at the crossroads of traditional research and advanced technology, the path ahead aims to integrate knowledge from various fields, striving for a more holistic understanding of our aquatic systems and their complex interactions. 1.2. Summary of Specific Aims The overarching objective of this research is to unravel the multifaceted interactions between physical processes, water quality, and ecosystem dynamics in the bays and coastal areas of the Great Lakes. Given the unique characteristics and challenges associated with these systems, the emphasis is on understanding the role of various environmental stressors, including water circulation, groundwater interactions, and turbidity levels. 3 1. What is the role of transport timescales and meteorological data in understanding the fate and transport of nutrients and contaminants in semi-enclosed bays like the Hammond Bay area, Lake Huron? • Aim 1: Investigate the significance of forcing fields in circulation and temperature. • Aim 2: Quantify the effects of bidirectional exchange flow in the Straits of Mackinac on circulation and temperature in the Hammond Bay area. • Aim 3: Provide estimates of the residence time in the Hammond Bay area under varying summer conditions, accounting for the influence of bidirectional exchange flow. 2. How does groundwater influence the hydrodynamics and thermal structure of bays, specifically Thunder Bay, during the summer stratification period? • Aim 1: Determine alterations in summer stratification and circulation patterns when groundwater contributions are overlooked. • Aim 2: Evaluate whether groundwater in Thunder Bay originates solely from identified sinkholes or if there is evidence suggesting groundwater as a significant water source throughout the bay. 3. How can integrated approaches, combining numerical modeling, machine learning, and remote sensing, enhance the prediction and monitoring of turbidity in dynamic coastal environments? • Aim 1: Develop machine learning models to generate hourly boundary conditions for the mechanistic model. • Aim 2: Develop machine learning models trained on synthetic data from numerical modeling to predict turbidity using remote sensing imagery in the visible bands. 4 • Aim 3: Investigate the applicability of the trained models in different domains through transfer learning techniques. By addressing these research questions, the study aims to provide a holistic and comprehensive understanding of the physical, chemical, and biological dynamics of the targeted aquatic systems. This research not only seeks to contribute novel insights into the intricate workings of these systems but also to propose innovative methodologies for effective monitoring and prediction. 1.3. Expected Benefits and Significance The primary goal of this research is to weave together numerical modeling, machine learning techniques, remote sensing, and field observations to deepen our understanding of the intricate dynamics of aquatic ecosystems, especially in the context of the Great Lakes region, their adjoining bays, and coastal areas. This integrated approach is poised to offer several pivotal insights and benefits: 1. Holistic Understanding of Aquatic Systems: By delving into the roles of transport timescales, groundwater interactions, and turbidity dynamics, the research seeks to present a comprehensive picture of how various environmental factors and stressors influence aquatic systems. This understanding is crucial given the multifaceted challenges these systems face, from invasive species to climatic shifts. 2. Methodological Advancements: This study is set to provide a systematic and innovative approach to studying aquatic ecosystems. By leveraging the strengths of traditional numerical models with machine learning techniques and remote sensing, the research aims to improve environmental modeling and predictions. 3. Management and Conservation: Insights derived from this research could play an important role in informing and enhancing management strategies for aquatic ecosystems. Whether 5 it is addressing the challenges posed by harmful algal blooms, understanding groundwater contributions, or predicting turbidity levels, the findings could guide interventions to safeguard the health and sustainability of these systems. 4. Broad Applicability: While the focus is on specific regions like the Great Lakes, the methodologies and insights from this research hold the potential to be applied to other aquatic systems worldwide. This broad applicability can play a significant role in global efforts to understand, protect, and restore water bodies. Fundamentally, the impact of this research goes beyond its direct goals. By providing a more comprehensive and interconnected insight into aquatic ecosystems, the investigation stands to shape both scholarly discussions and practical conservation and management approaches. 6 CHAPTER 2 ASSESSING TRANSPORT TIMESCALES IN LAKE HURON'S HAMMOND BAY: THE CRUCIAL ROLE OF THE STRAITS OF MACKINAC'S EXCHANGE FLOWS 2.1. Introduction Increasing contaminant loads from point and nonpoint sources, long residence times, and a favorable physical environment can generate ideal conditions for the production of nuisance algal blooms, dreissenid mussel invasion, poor water quality, and possible threats to flora and fauna collectively affecting the Great Lakes ecosystem. Ecosystems, such as the Great Lakes, often experience multiple environmental stressors simultaneously, with some stressors, particularly invasive and nuisance species like dreissenid mussels, having significant impacts on the ecosystem (Tomlinson et al., 2010; Smith et al., 2015). Semi-enclosed bays in marine and freshwater systems have the potential to provide environments with specific physical conditions, due to their complex shorelines and/or weak nearshore mixing. This is because physical barriers in these areas can reduce the effects of tides or lake-wide circulation, leading to reduced offshore transport and mixing (Yuan et al., 2007; Kenov et al., 2012; Ge et al., 2012; Nguyen et al., 2014; Rynne et al., 2016; Safaie et al., 2016; Wheat et al., 2019). Transport timescales, including flushing and residence times, are important indicators in the study of the fate and transport of nutrients and contaminants in semi-enclosed bays. These timescales can also be used to identify potential locations for harmful algal blooms (HABs) and nuisance green algae such as Cladophora. The residence time is extensively discussed in previous literature (Geyer, 2000; Monsen et al., 2002; Nguyen et al., 2014; Bravo et al. 2020). Bays, wetlands, 7 lagoons, and tidal estuaries are of particular interest due to exchange flows. Previous studies have used numerical models, dye release, and Lagrangian particle tracking to investigate the variability of residence time in these environments (Yuan et al., 2007; Kenov et al., 2012; Nguyen et al., 2014; Defne and Ganju, 2015; Ralston et al., 2015; Hamidi et al., 2015; Rynne et al., 2016; Lee et al., 2020). In the Great Lakes region, Nguyen et al. (2014 & 2017), Mao and Xia (2020 a,b), and Bravo et al. (2020) used hydrodynamic models and data from Acoustic Doppler Current Profiler (ADCP) and drifters to investigate circulation and residence time in Saginaw Bay of Lake Huron, eastern coast of Lake Michigan, and Green Bay of Lake Michigan, respectively. They showed that the trajectories of drifters are controlled by winds and initial releasing sites, and strong winds significantly impact nearshore dynamics. In addition, they found that the circulation patterns strongly control the release and timing of nutrients and contaminants from the bays to the lake. Accurate circulation patterns in the lakes are also important in water quality modeling/management, mainly due to the transport of nutrients and other solutes by water and the effect of water temperature on dissolved nutrients, reaction rates, water quality parameters, and biological activity. Previous research has investigated the impact of bidirectional exchange flows in the Straits of Mackinac on the hydrodynamics and exchange between Lakes Michigan and Huron. Studies by Saylor and Sloss (1976), Saylor and Miller (1976, 1979), Quinn (1992), and Anderson and Schwab (2013, 2017) have shown that oscillating currents with a period of 2-3 days occur between the lakes. The average flow from Lake Michigan to Lake Huron was found to be 1900 m3/s, with a superimposed oscillating flow on the mean flow. The main factors driving this oscillation include differences in water surface elevation and local meteorological conditions such as wind and air pressure. Anderson and Schwab (2013) developed the first 3D hydrodynamic model of the connection between Lake Michigan and Lake Huron at the Straits of Mackinac, which 8 supported the Helmholtz mode as the main mechanism behind the oscillation. The study also found that bidirectional exchange flow can extend hydrodynamic modifications as far as 70 km in Lake Michigan and 50 km in Lake Huron, potentially contributing to contamination. The Enbridge Line 5 oil pipeline, located at the bottom of the Straits of Mackinac, is an aging infrastructure that poses a significant risk of environmental disaster. The pipeline stretches between Michigan's lower and upper peninsula, and as it ages, concerns about potential oil leaks have risen (Melstrom et al., 2019). This is of great importance considering the location and direction of currents in the Straits, which may result in the pollution of both lakes (Schwab, 2016) and nearby bays. Residence time considerations, or the amount of time a substance remains in a specific area, are at the heart of discussions related to the fate and transport of oil and other contaminants in the region, and technologies for their remediation (Byrne et al., 2021). A review of the literature shows that although transport time scales have been estimated for other bays in the Great Lakes region, the influence of bidirectional exchange flows on transport time scales in bays of Lake Huron closer to the Straits have not been studied in the past. Therefore, it is necessary to investigate the potential impact of bidirectional exchange flows on transport time scales in the bays of Lake Huron, to understand the risk of contamination and to develop effective remediation strategies. This study employs numerical modeling and reanalysis forcing dataset to model circulation, temperature, and residence times. Field observations are used to validate the model and time series analysis is applied to quantify the effect of bidirectional exchange flow on residence time. Therefore, the objectives of this paper are to: (a) Quantify the effect of bidirectional exchange flow in the Straits of Mackinac on the circulation and temperature in the Hammond Bay area, Lake Huron, for two recent years (2018 and 2019), (b) Quantify the changes in the volumetric flux and transport caused by the bidirectional exchange flow, and (c) Provide the first estimates of the 9 residence time in the Hammond Bay area for summer conditions with and without the effect of bidirectional exchange flow. We use field observations combined with three-dimensional hydrodynamic and solute transport models based on the Finite-Volume Coastal Ocean Model, FVCOM (Chen et al., 2003) to study the three objectives. 2.2. Materials and Methods 2.2.1. Study area Lake Huron connects with Lake Michigan through the Straits of Mackinac at the north (Figure 1a). This research focuses on the Hammond Bay area in Lake Huron, located 50 km from the Straits (Figure 1 b, c, d). The Straits of Mackinac are a narrow waterway with an average depth of 20 m and a minimum width of 6 km and play an important role in transporting goods between the cities and ports of the two lakes. The Hammond Bay area has a surface area of 430 km2 and a mean depth of 38.4 m. It has been a subject of interest for research in ecology and fisheries (Duan et al., 2016; Binder et al., 2017; Nowicki et al., 2017; Nevers et al., 2018). In addition, the United States Geological Survey (USGS) is conducting field measurements in the Hammond Bay area to address questions related to Cladophora in the Great Lakes. The Ocqueoc River, which drains a watershed of approximately 382 km2, flows into the bay. Although recent estimates of discharge in the river are not available, historical data reported at USGS gage # 04132160 provide a mean annual discharge value of about 4.67 m3/s (USGS, 2022). 10 Figure 1. (a) The FVCOM numerical mesh and (inset in panel a) bathymetry of Lakes Michigan and Huron. The locations of the important physical features, the Straits of Mackinac and the Hammond Bay area, are shown on the mesh. ERA5 data points are shown on the bathymetric map as dots. (b) The FVCOM nested-grid of the Hammond Bay area (c) the bathymetry map of the Hammond Bay area. The ADCP location is shown with a white square. (d) The FVCOM unstructured mesh for the CHM model, showing the location of the ADCP, MACM4 station, transects, and the Hammond Bay area in Lake Huron. 11 The shoreline geometry, bathymetry, and the geographic location of the bay affect the hydrodynamics, temperature, and solute transport in the bay by both the lake-wide circulation and the bidirectional exchange flow from the Straits of Mackinac. Therefore, such physical features and a short distance from the Straits have made this site an ideal location to study the effects of both lake-wide and bidirectional exchange flow from the Huron-Michigan system on the hydrodynamics, temperature, contaminant transport, and water quality. 2.2.2. Field Measurements and Data A bottom-mounted, up-looking Acoustic Doppler Current Profiler (ADCP, 600 kHz Workhorse Monitor, Teledyne RD Instruments, Poway, California) was deployed in the Hammond Bay area of Lake Huron (Figure 1d) during the summer of 2019. The ADCP recorded 3D velocities from June 25th to August 13th, 2019, at the coordinates of longitude -84.1542 and latitude 45.5948. The water temperature was also monitored by the ADCP at its deployment depth of 6.25 m. The measurement bins were set to 0.4 m during this deployment period. Shoreline data of Lake Huron and Lake Michigan were downloaded from the National Oceanic and Atmospheric Administration, Great Lakes Environmental Research Laboratory (NOAA- GLERL) website (https://www.glerl.noaa.gov/). Bathymetric data (National Geophysical Data Center, 1996 & 1999) with a resolution of 3-arc-second (~90 m) were interpolated to the numerical mesh using the natural neighbor method. Similarly, surface water temperature data from the National Data Buoy Center (NDBC) (https://www.ndbc.noaa.gov/) were used to evaluate simulated temperatures at the southern part of the Straits of Mackinac and in Lake Huron. We also used vertical water temperature data from NOAA’s thermistor chain deployed in central Lake Huron (NOAA Great Lakes Environmental Research Laboratory, 2019). Additionally, the performance of the models was evaluated using surface water level fluctuation data from NOAA 12 for six stations in Lake Huron and Lake Michigan (https://www.ndbc.noaa.gov/). The location of surface water level stations and the thermistor chain deployment are shown in Figure 2. Figure 2. Bathymetry of Lake Huron and Lake Michigan. The locations of the land-based stations are shown on the map as green dots. The location of the thermistor chain deployment and the surface water level stations are shown with white diamond and black stars, respectively. The locations of #45007 and #45008 stations are shown with rectangular boxes. FVCOM utilizes an unstructured triangular mesh, which allows for flexibility in mesh generation. As the source code does not include an automatic mesh generation tool, any mesh generation tool can be used (Chen et al., 2012). In this research, the Surface Water Modeling System (SMS) software (www.aquaveo.com) was utilized to generate the unstructured triangular mesh. SMS allows for the assignment of different functions for mesh resolution and interactive modification of the mesh. Additionally, mesh quality criteria for stability can be easily checked. The mesh was generated as a function of distance from the shoreline and bathymetric contours. This depth- adaptive approach effectively captures the complex shoreline and bathymetry gradient of lakes and 13 bays. Therefore, it allows for a more accurate representation of the study area, as it accounts for variations in the shoreline and bathymetry. Meteorological forcings (i.e., wind speed and direction, solar radiation) are among the main driving forces generating circulation in enclosed water bodies such as lakes. Therefore, their spatiotemporal resolution and the interpolation method are among the most important factors in hydrodynamic modeling. While most researchers modeling hydrodynamics and circulation in the Great Lakes have used land-based stations (Nguyen et al., 2014 & 2017) to describe forcing fields, some modelers have used reanalysis datasets (e.g., CFSv2, NCEP, NARR, CSFR, ERA5) to create forcing fields for their models, especially in areas where the density of observational networks is low or in the absence of high-quality measurements (Memari & Siadatmousavi, 2018; Beyramzade et al. 2019; Noranian Esfahani et al., 2019). Similarly, satellite data (Chen et al., 2004) such as QuikSCAT and GLSEA2 (Xue et al., 2015) have been previously used to create meteorological forcing fields. In this research, we opted for the ERA5 dataset from the European Center for Medium-Range Weather Forecast (ECMWF) due to its higher spatiotemporal resolution, ensuring a more accurate representation of the forcing fields. Additionally, ERA5 encompasses all the necessary input variables required by our hydrodynamic model, making it a comprehensive choice. The ERA5 dataset provides hourly estimates of meteorological data including wind speed, wind direction, humidity, solar radiation, cloud cover, and air temperature on a ~30 km grid and in 137 vertical levels (downloaded from https://cds.climate.copernicus.eu). 2.2.3. Numerical Model FVCOM, which is a prognostic, free-surface, 3D primitive equation model based on triangular unstructured grids (Chen et al., 2003) was used to simulate circulation and temperature in lake- 14 wide and nearshore regions. The model solves the hydrodynamic equations with hydrostatic and Boussinesq approximations. The Mellor-Yamada level 2.5 turbulence closure scheme (Mellor & Yamada, 1982; Galperin et al., 1988) was used to describe vertical eddy viscosity and diffusivity. For horizontal diffusion, the Smagorinsky turbulence closure model (Smagorinsky, 1963) was used to calculate the coefficients for horizontal momentum and thermal diffusion. The FVCOM governing equations of momentum (Eq. 1-3), continuity (Eq. 4), temperature (Eq. 5), and salinity (Eq. 6) are as follows: 𝜕𝑢̅ 𝜕𝑡 𝜕𝑣̅ 𝜕𝑡 + 𝑢̅ 𝜕𝑢̅ 𝜕𝑥 + 𝑣̅ 𝜕𝑢̅ 𝜕𝑦 + 𝑤̅ 𝜕𝑢̅ 𝜕𝑧 − 𝑓𝑣̅ = − 1 𝜌0 𝜕(𝑃̅) 𝜕𝑥 + 𝜕 𝜕𝑧 (𝐾𝑚 𝜕𝑢̅ 𝜕𝑧 ) + 𝐹𝑢 + 𝑢̅ 𝜕𝑣̅ 𝜕𝑥 + 𝑣̅ 𝜕𝑣̅ 𝜕𝑦 + 𝑤̅ 𝜕𝑣̅ 𝜕𝑧 − 𝑓𝑢̅ = − 1 𝜌0 𝜕(𝑃̅) 𝜕𝑦 + 𝜕 𝜕𝑧 (𝐾𝑚 𝜕𝑣̅ 𝜕𝑧 ) + 𝐹𝑣 𝜕𝑤̅ 𝜕𝑡 + 𝑢̅ 𝜕𝑤̅ 𝜕𝑥 + 𝑣̅ 𝜕𝑤̅ 𝜕𝑦 + 𝑤̅ 𝜕𝑤 𝜕𝑧 = − 1 𝜌̅ 𝜕𝑃̅ 𝜕𝑧 − 𝑔 + 𝜕 𝜕𝑧 (𝐾𝑚 𝜕𝑤̅ 𝜕𝑧 ) + 𝐹𝑤 𝜕𝑢̅ 𝜕𝑥 𝜕𝑇̅ 𝜕𝑡 𝜕𝑆̅ 𝜕𝑡 + 𝜕𝑣̅ 𝜕𝑦 + 𝜕𝑤̅ 𝜕𝑧 = 0 + 𝑢̅ + 𝑢̅ 𝜕𝑇̅ 𝜕𝑥 𝜕𝑆̅ 𝜕𝑥 + 𝑣̅ + 𝑣̅ 𝜕𝑇̅ 𝜕𝑦 𝜕𝑆̅ 𝜕𝑦 + 𝑤̅ 𝜕𝑇̅ 𝜕𝑧 = 𝜕 𝜕𝑧 (𝑘ℎ 𝜕𝑇̅ 𝜕𝑧 ) + 𝐹𝑇 + 𝑤̅ 𝜕𝑆̅ 𝜕𝑧 = 𝜕 𝜕𝑧 (𝑘ℎ 𝜕𝑆̅ 𝜕𝑧 ) + 𝐹𝑆 (1) (2) (3) (4) (5) (6) where 𝑢̅, 𝑣̅, and 𝑤̅ are mean velocity components in 𝑥, 𝑦, and 𝑧 directions, respectively. 𝑇̅ and 𝑆̅ are time-averaged temperature and salinity. 𝜌̅ denotes the mean density and 𝜌0 is the reference density. 𝑃̅ is the mean pressure. 𝑓 is the Coriolis parameter and 𝑔 is the gravitational acceleration. 𝐾𝑚 and 𝐾ℎ are vertical and eddy viscosity and thermal vertical eddy diffusion coefficients, respectively. 𝐹𝑢 and 𝐹𝑣 are the horizontal momentum; 𝐹𝑤, 𝐹𝑇, and 𝐹𝑠 are vertical momentum, thermal, and salt diffusion terms. By a scaling argument for vertical velocity, the vertical momentum equation (Eq. 3) is reduced to the following hydrostatic equation: 15 0 = −𝑔 − 1 𝜌̅ 𝜕𝑃̅ 𝜕𝑧 Similarly, 𝐹𝑢, 𝐹𝑣, 𝐹𝑇, and 𝐹𝑆 are modeled using the following forms: 𝐹𝑢 = 𝐹𝑣 = 𝜕 𝜕𝑥 𝜕 𝜕𝑦 (2𝐴𝑚 𝜕𝑢̅ 𝜕𝑥 ) + 𝜕 𝜕𝑦 (𝐴𝑚 ( 𝜕𝑢̅ 𝜕𝑦 + 𝜕𝑣̅ 𝜕𝑥 )) (2𝐴𝑚 𝜕𝑣̅ 𝜕𝑦 ) + 𝜕 𝜕𝑥 (𝐴𝑚 ( 𝜕𝑣̅ 𝜕𝑥 + 𝜕𝑢̅ 𝜕𝑦 )) 𝐹𝑇 = 𝜕 𝜕𝑥 (𝐴ℎ 𝜕𝑇̅ 𝜕𝑥 ) + 𝜕 𝜕𝑦 (𝐴ℎ 𝜕𝑇̅ 𝜕𝑦 ) 𝐹𝑆 = 𝜕 𝜕𝑥 (𝐴ℎ 𝜕𝑆̅ 𝜕𝑥 ) + 𝜕 𝜕𝑦 (𝐴ℎ 𝜕𝑆̅ 𝜕𝑦 ) (7) (8) (9) (10) (11) Where 𝐴𝑚 and 𝐴ℎ are horizontal diffusion coefficients for momentum and scalars. A conservative solute (tracer) transport model was coupled with the FVCOM hydrodynamic model to simulate the fate and transport of a tracer in space and time. The FVCOM tracer-tracking module was developed and tested against observed dye movement across a tidal mixing front on the southern flank of George Bank (Chen et al. 2008). The tracer module is based on a three-dimensional advection-diffusion equation: 𝜕𝐶 𝜕𝑡 + 𝑢 𝜕𝐶 𝜕𝑥 + 𝑣 𝜕𝐶 𝜕𝑦 + 𝑤 𝜕𝐶 𝜕𝑧 = 𝜕 𝜕𝑥 (𝐾𝐻 𝜕𝐶 𝜕𝑥 ) + 𝜕 𝜕𝑦 (𝐾𝐻 𝜕𝐶 𝜕𝑦 ) + 𝜕 𝜕𝑧 (𝐾𝑉 𝜕𝐶 𝜕𝑧 ) + 𝑆 (12) where t denotes time, x, y, and z denote the horizontal and vertical coordinates, respectively. C is the model dye concentration, 𝐾𝐻 and 𝐾𝑉 are the horizontal and vertical diffusion coefficients, respectively, and S is a volumetric source term. The tracer model used the same turbulence closure models as the FVCOM hydrodynamic model. 16 2.2.4. Modeling Process and Purpose In this study, for the models' open boundary conditions at the lake surface, hourly ERA5 gridded data were extracted and interpolated to the numerical mesh using the natural neighbor method. We utilized a sigma-level coordinate system for vertical discretization, which offers a more accurate representation of surface and bottom topography, enhancing model accuracy (Mellor et al., 2002). The vertical domain was divided using 20 terrain-following sigma layers (21 levels). 2.2.4.1. Large-Scale Models Two models were created to analyze lake-wide circulation and temperature: the Combined Lake Huron-Michigan model (CHM) and the Lake Huron model (LH). The CHM model includes both Lake Huron and Lake Michigan connected via the Straits of Mackinac, while the LH model treats the Straits of Mackinac as a closed land boundary with zero flux condition. This approach allows us to exclude the effects of bidirectional exchange flows from the lake-wide circulation of Lake Huron (LH model) while including their effects in the CHM model. This also enables us to quantify the impact of bidirectional flows between the lakes on the circulation and temperature within the lake system. Since several previous studies ignored the bidirectional exchange flows in models of Lake Huron hydrodynamics and transport, especially far from the Straits, the two models (LH and CHM) will allow us to evaluate the impact of these flows on hydrodynamics, transport and residence times close to the Straits. The computational domain in both models was discretized using an unstructured mesh to accurately capture the complex shoreline features and sharp bathymetric gradients in the lakes. The mesh size varied in space, with a minimum size of 400 m at the shoreline and increasing to 2,500 m in the deepest parts of the lakes. The mesh size was determined by the distance from the shoreline and bathymetry contours. The LH model had 21707 nodes and 40674 elements, while 17 the CHM model had 39328 nodes and 74139 elements. Both the CHM and LH models used the same numerical mesh representation in Lake Huron, with an equal number of nodes and elements. The simulation period for both models began on March 1st of 2018 and 2019, respectively and ended on September 15th of 2018 and 2019, covering a duration of six and a half months in each year. The initial condition was set as a cold start with zero salinity and velocity fields. The initial condition for temperature was set to 0.3 degrees Celsius at the lake surface (based on MACM4 NDBC station data) and linearly increased to 3.0 degrees at the bottom of the lake (based on the NOAA-GLERL thermistor chain data). The models simulated the circulation and temperature with a time step of 1.0 s and saved the output at hourly intervals. The time step value was chosen to ensure the stability of the numerical model using the Courant–Friedrichs–Lewy (CFL) condition (Hundsdorfer et al., 2003). The first two months of the simulation period, March and April, were considered as the spin-up time. Therefore, considering the model spin-up time, the final simulation results for both the LH and CHM models cover four and a half months. The ice modeling module was not used as the focus of the research was on the summer months of 2018 and 2019, when ice coverage over the lakes was negligible. Specifically, the monthly average ice coverage over the lakes in March and April was approximately 13.68% and 3.70%, respectively, and dropped to zero by the beginning of May (https://www.glerl.noaa.gov/data/ice). This decision was made to simplify the modeling process and focus on the key parameters of interest. The simulation results from the LH and CHM models were used to calculate the volumetric flow rates at transects AA, BB, and CC (shown in Figure 1d), apply instantaneous dye release in the Hammond Bay area, compare simulated water temperature and surface water level fluctuation to the observed data (Figure 1d and 2), and provide boundary and initial conditions for the nested model in the Hammond Bay area (Figure 1b). Additionally, the transects were selected to 18 investigate bidirectional exchange flows between Lake Huron and Michigan at the Straits of Mackinac (section AA, Figure 1d) and its downstream effects on circulation in Lake Huron, specifically in the Hammond Bay area (section BB and CC, Figure 1d). 2.2.4.2. Hammond Bay Nested-Grid Model In this study, a nested-grid model was created to describe hydrodynamics and temperature in the nearshore region of the Hammond Bay area in Lake Huron (Figure 1 b & c). The nested model, referred to as the Hammond Bay (HB) model, was coupled with either the Lake Huron (LH) or the Combined Lake Huron-Michigan (CHM) models through one-way nesting. The mesh resolution of the nested-grid model varied from 5 m to 1000 m, allowing for detailed simulations of nearshore dynamics. In contrast, the mesh resolution of the large-scale models (LH and CHM) ranged from 400 m to 2500 m, providing a broader view of the Lake dynamics. This approach allows the large- scale models to regulate the hydrodynamics and temperature in the HB model through the open boundary. This may also allow for a more accurate representation of the domain without compromising on resolution while keeping computational costs reasonable. Additionally, this approach minimizes numerical diffusion errors in solute transport simulations while eliminating the need for spin-up time by providing the boundary and initial conditions from the large-scale models. Nested-grid approaches may suffer from numerical issues due to energy accumulation at the boundary of the two domains due to different mesh sizes that cause surface gravity waves propagating from the two domains to have different speeds at the boundary. While energy accumulation and mass conservation issues can be resolved using cells that perfectly overlap between the two domains, additional issues related to non-hydrostatic pressure gradients in internal waves are known to cause wave steepening producing a temporal horizon that can not be resolved using model grid refinement alone (Hodges et al., 2006; Zamani et al., 2020). Therefore, before 19 using a nested-grid model, we have used numerical experiments to ensure that meaningful improvements in model performance can be obtained using the nested-grid model. The simulation period of the HB model was set to match the ADCP field measurements. The simulation began on June 20, 2019, and ended on August 15, 2019. The horizontal mesh for the HB model has 9233 nodes and 17932 elements, covering an area of around 400 km2. The minimum mesh size was 5 m close to the ADCP location and increased to 1000 m at the open boundary. The HB and large-scale models (LH and CHM) exchanged information at the open boundary nodes and elements of the HB model. The initial condition of the HB model was assigned by interpolating the variables from the large-scale models. The HB model simulated the hydrodynamics and temperature with a 0.1 s time step and saved the output at 20-min intervals (the same frequency as the ADCP instrument). 2.2.5. First-Order Transport Timescales Previous literature has extensively defined residence time. However, the definitions can be different based on the method by which the residence time is calculated. In this research, we define residence time as the time it takes for a water parcel, first located inside the domain of interest to move and exit from the boundaries of the model (Monsen et al., 2002). If we treat the domain as a Continuously Stirred Tank Reactor (CSTR) and release a known mass at a certain grid point, the time it takes for the mass to reach 36.78% (1/𝑒) of its initial mass is the residence time (Monsen et al., 2002; Nguyen et al. 2014). The residence time is a local measure and varies both spatially, in each grid cell, and temporally based on the time of the simulation. The residence time was calculated by releasing a dye (a conservative solute) with an initial concentration of 100 ppm at all nodes located inside the bay area. The dye concentration was subsequently tracked using the Eulerian approach in space and time. The residence time was then 20 calculated by computing the time it takes for the vertically averaged dye concentration at every surface node inside the domain to drop to 36.78 ppm which is 1/𝑒 of the initial concentration. Although this approach provides the residence time more accurately by considering the effects of meteorological forcings and current patterns, it is a compute-intensive task. Since the meteorological forcings and current patterns are subject to change in both space and time, this may result in differences in residence time estimates based on the time of the dye release. To address this and to investigate the effect of meteorological forcings on the residence time, we ran the solute transport simulations for 2018 and 2019, with the instantaneous release starting on June 20th of both years. In addition, the residence time estimates may be altered by the nature of the unstructured mesh which may be skewed in some places and the number of nodes on which the tracer was initially released. One way to prevent mesh-related artifacts and report a more consistent first-order time scale may be by converting the concentration to mass by considering the volume of the numerical elements on which the tracer was released. In doing so, we focus on the amount of mass (for example, in kg) transported by the currents instead of the concentration (e.g., in ppm). Residence times computed using both approaches (mass and concentration) are reported in the present work. Additional details of the dye release are available in Nguyen et al. (2014, 2017). The flushing time, on the other hand, was calculated by dividing the volume of the bay by the volumetric flow going in or out of the open boundaries of the nearshore domain (the boundary marked with a red line in Figure 1d). By assuming the steady state condition, the flux coming into the domain and going out was assumed to be equal. The flushing time would then be calculated as (Nguyen et al., 2014): 𝑇𝑓 = 𝑉/𝑄𝑖𝑛 (𝑜𝑢𝑡) (13) 21 where 𝑄𝑖𝑛 (𝑜𝑢𝑡) was calculated as: 𝑄𝑖𝑛 (𝑜𝑢𝑡) = 1 𝑡2 − 𝑡1 𝑡2 𝐵 𝐻 ∫ ∫ ∫ 𝑣𝑖𝑛 (𝑜𝑢𝑡). 𝑑ℎ. 𝑑𝑏. 𝑑𝑡 𝑡1 0 0 (14) In Eq. 14, 𝑣𝑖𝑛 (𝑜𝑢𝑡) is the velocity normal to the boundary of the nearshore model which should be calculated at each grid point on the open boundary by: 𝑣𝑖𝑛 (𝑜𝑢𝑡) = {𝑉⃗ . 𝑛̂ 0 𝑉. 𝑛̂ > 0 𝑉. 𝑛̂ ≤ 0 (15) In Eq. 15 and 4, 𝑉⃗ is the velocity calculated by the hydrodynamic model, 𝐵 and 𝐻 are width and depth of the boundary. 𝑡1 and 𝑡2 are the start and end times of the simulations, and 𝑛̂ is the unit vector normal to the boundary at all locations along the boundary. 2.2.6. Model Performance Metrics Three metrics were used to evaluate the performance of the model against the observed data. The metrics were the Root Mean Square Error (RMSE), Correlation Coefficient (𝑅2), and Root Mean Square Error to Standard Deviation Ratio (RSR) which is a standardized version of RMSE which incorporates the benefits of RMSE and normalization (Moriasi et al., 2007) enabling comparisons across different datasets with different orders of magnitudes. 𝑅𝑀𝑆𝐸 = √ 𝑛 𝑖=1 ∑ (𝑃𝑖 − 𝑂𝑖)2 𝑛 𝑅2 = ∑ (𝑂𝑖 − 𝑂̅)(𝑃𝑖 − 𝑃̅) 𝑛 𝑖=1 √∑ (𝑂𝑖 − 𝑂̅)2 𝑛 𝑖=1 √∑ (𝑃𝑖 − 𝑃̅)2 𝑛 𝑖=1 𝑅𝑆𝑅 = √∑ (𝑂𝑖 − 𝑃𝑖)2 𝑛 𝑖=1 √∑ (𝑂𝑖 − 𝑂̅)2 𝑛 𝑖=1 22 (16) (17) (18) In the above equations, O, P, and variables with a bar indicate the observed data, the model predicted values, and their mean values, respectively. In addition, we applied spectral analysis to simulated and observed currents to determine the relative strengths of different frequency components in the signals. Spectral analysis is a widely used technique in marine and freshwater systems to analyze currents and flows. Previous studies by Webster (1968), Senjyu et al. (2005), Anderson et al. (2013) and Klyuvitkin et al. (2019) have shown the effectiveness of this method. We employed the Welch's method to calculate the Power Spectral Density (PSD) of the signals. Compared to the periodogram method, the Welch's method is considered more appropriate for non-stationary signals as it uses averaging over multiple overlapping segments of the signals, reducing spectral leakage, and providing more accurate estimates of the PSD (Stoica & Moses, 2005). All PSD were calculated using Welch’s method with 95% confidence intervals. 2.3. Results 2.3.1. Hydrodynamic Model performance The simulated water surface temperatures of the LH and CHM models were tested against data from the MACM4 NDBC station at the Straits of Mackinac and the comparison is shown in Figure 3. Both the LH and CHM models predicted the water surface temperature reasonably well from May to the end of Jun 2019 (DOY 120 to 178). From early July to mid-September (DOY 180 to 260), the LH model failed to predict the water surface temperature as accurately as the CHM model (Figure 3). The discrepancy between the simulated LH temperature and observed data is attributed to the impacts of bidirectional exchange flows between Lake Michigan and Lake Huron ignored in the LH model, and the fact that the station is located in close proximity to the Straits. The metric scores between the simulated water surface temperatures from the LH and CHM models are 23 presented in Table 1. All three metrics indicate that the CHM model outperformed the LH model in predicting the water surface temperature at the MACM4 station close to the Straits. Figure 3. Comparison of observed surface water temperature time series at the MACM4 station and simulated (LH and CHM model) temperatures. We investigated the effect of bidirectional exchange flows on Lake Huron's vertical water temperature by comparing simulated results from the LH and CHM with the observed thermistor chain data. As shown in Figure 4, both models accurately simulated the vertical temperature and captured the formation of the thermocline during summer. However, there is a discrepancy between the observed and modeled temperature around DOY 230, which can be explained by differences in thermocline depth and the measurement and model's vertical layers. Details can be found in Table 1 and model performance metrics for the temperature comparisons in Figure 4 are included in Table 2. 24 Figure 4. Comparison of observed and simulated water temperatures as a function of depth and time in the central part of Lake Huron. (a) observed vertical water temperature time series (b) simulated water temperatures from the LH model, and (c) simulated water temperatures from the CHM model. 25 Table 1. Comparison between the center depth of the model layers and the thermistor chain measurements in Lake Huron. Center Depth of Model Layers (m) Thermistor Chain Depth (m) -5.3 -15.8 -26.3 -36.8 -47.3 -57.9 -68.4 -78.9 -89.4 -99.9 -110.5 -121.0 -131.5 -142.0 -152.5 -163.0 -173.6 -184.1 -194.6 -205.1 -9.0 -14.0 -19.0 -24.0 -29.0 -34.0 -39.0 -44.0 -56.0 -49.0 -66.0 -76.0 -86.0 -96.0 -106.0 -119.0 -131.0 -151.0 -171.0 -191.0 Table 2. Model performance metrics for the observed and simulated vertical temperatures at the location of the thermistor chain in Lake Huron. Model RMSE (°C) R² LH model CHM model 1.103 1.106 0.913 0.913 RSR 0.509 0.510 26 Figure 5. Comparison of observed and simulated (CHM and LH) water level fluctuation at (a) De Tour Village, (b) Essexville, (c) Harbor Beach, and (d) Mackinaw City in Lake Huron as well as (e) Holland and (f) Port Inland in Lake Michigan. In addition, we compared the simulated surface water level fluctuation from the models (LH and CHM) with the observed data at six stations in Lake Huron and Lake Michigan (Figure 5). Both models accurately simulated the water level fluctuation throughout the simulation period. The results are summarized in Table 3, which shows that the CHM model performed better than the LH model in all stations, particularly at the Mackinaw City station near the Straits where bidirectional exchange flows have the greatest impact. The lowest difference in error was at Essexville station, far from the Straits. The results indicate that the impact of bidirectional flows from the Straits extends beyond 50 km, influencing the surface water level fluctuation in the bays of both lakes. 27 Table 3. Model performance metrics for the simulated CHM and LH surface water level fluctuation against the observed data. Station De Tour Village (Lake Huron) Essexville (Lake Huron) Harbor Beach (Lake Huron) Mackinaw City (Lake Huron) Holland (Lake Michigan) Port Inland (Lake Michigan) Model CHM LH CHM LH CHM LH CHM LH CHM CHM RMSE (m) 0.020 0.030 0.044 0.047 0.020 0.029 0.026 0.038 0.028 0.029 R² 0.698 0.257 0.773 0.737 0.771 0.415 0.585 0.150 0.677 0.607 RSR 0.716 1.049 0.639 0.677 0.637 0.919 0.834 1.214 0.742 0.817 Comparisons of observed current and temperature data with model results from the large-scale and the nested-grid models showed that the nested-grid model produced meaningful improvements relative to the results from the large-scale models (Figures 6-8). Figures 9 and 10 show color contour plots of the observed eastward and northward velocities from the Hammond Bay ADCP deployment data against the simulated velocities (the nearshore nested model based on LH and CHM). The comparison covers a period of 50 days from late June to mid-August 2019. The vertical spatial extent of the comparisons in the figures ranges from -1 to -4 m from the lake surface. Figures 9 and 10 show that both models predicted the velocity components reasonably well over the water depth during the simulation period in the Hammond Bay area. However, the nearshore nested model based on the CHM model did better in capturing the currents compared to the LH model. 28 Figure 6. Comparisons between the observed and simulated (using large scale LH and CHM models) vertically-averaged (a) eastward velocity, (b) northward velocity, and (c) bottom temperature in the Hammond Bay area between June and August 2019. 29 Figure 7. Comparisons between the observed and simulated (using large scale LH and nested-grid LH models) vertically-averaged (a) eastward velocity, (b) northward velocity, and (c) bottom temperature in the Hammond Bay area between June and August 2019. 30 Figure 8. Comparisons between the observed and simulated (using large scale CHM and nested- grid CHM models) vertically-averaged (a) eastward velocity, (b) northward velocity, and (c) bottom temperature in the Hammond Bay area between June and August 2019. 31 Figure 9. Contour plots of (a) observed and (b and c) simulated (LH and CHM) eastward velocity components at the location the ADCP was deployed in the Hammond Bay area between June and August 2019. 32 Figure 10. Contour plot of (a) observed and (b and c) simulated (LH and CHM) northward velocity components at the location the ADCP was deployed in the Hammond Bay area between June and August 2019. Figures 11a and 11b compare the depth-averaged water velocity components u and v in the eastward and northward directions between the observed, LH, and CHM models. While the CHM model exhibited more accurate predictions of velocity magnitudes, the LH model showed larger discrepancies, particularly in the northward direction, 𝑣 (Figure 11b). Overall, the models performed better in simulating the eastward velocity magnitude and direction compared to the northward component. A comparison of the observed and simulated alongshore and cross-shore velocity components shows that the cross-shore velocity component is an order of magnitude lower compared to the alongshore velocity component (Figure 12). 33 Figure 11. Comparisons between observed and simulated (using LH and CHM models) vertically- averaged (a) eastward velocity, (b) northward velocity, and (c) temperatures in the Hammond Bay area between June and August 2019. 34 Figure 12. Comparisons between the observed and simulated (LH and CHM) vertically-averaged (a) alongshore, (b) cross-shore velocity components in the Hammond Bay area between June and August, 2019. The measured water temperatures at the depth of the ADCP deployment (5 m below the lake surface) were compared with the simulated LH and CHM temperatures in Figure 11c. Although both models predicted the bottom temperature well, the CHM model followed the observed temperature trends better. The metrics in Table 4, evaluating the performance of the models against the observed water velocity and temperature, indicate that the CHM model outperformed the LH model. For instance, the RMSE value of the eastward velocity component decreased from 0.058 m/s to 0.052 m/s in the CHM model, a more than 10% improvement in the RMSE value. In addition, improvements were also observed in the 𝑅2 and RSR metrics, showing enhancements of 11% and 12%, respectively. Similar improvements are seen in the northward velocity component 35 and water temperature. Velocity errors (Table 4) in our study were found to be similar to those obtained by Anderson and Schwab (2013) at the Straits, showing improvement in the RMSE values for the eastward component of velocity (𝑢) and a similar error threshold for the northward component of velocity (𝑣). However, it is important to note that while we draw parallels, a direct comparison between our model and Anderson and Schwab (2013) might not capture the nuances as factors such as data frequency, data processing, mesh resolution, the period of comparison, etc., affect the error values. Table 4. Model performance metrics for observed and simulated (LH and CHM) eastward velocity (u), northward velocity (v), and temperature (temp) between June and August 2019. Model Variable RMSE LH model CHM model u (m/s) v (m/s) temp (°C) u (m/s) v (m/s) temp (°C) Perfect Value 0.058 0.067 2.275 0.052 0.062 1.849 0 R2 0.642 0.678 0.711 0.731 0.746 0.795 1 RSR 0.792 0.839 0.750 0.703 0.771 0.609 0 The power spectra of the observed and simulated velocities in the Hammond Bay area are shown in Figure 13. The power spectra indicate the amount of power in each frequency of the velocity components. The highest peaks in the eastward and northward observed velocities (blue lines in Figures 13a and 13b) have periods of 2.7, 3.5, and 4.3 days. Similarly, the power in each frequency tends to diminish for periods greater than five days for the observed data. The power spectra of the velocity components for the CHM model (red lines in Figures 13a and 13b) show that the CHM outperformed the LH model in predicting peaks, especially the ones that fall between periods of 2.5 and 5 days. This is also true for periods greater than five days, where the CHM model generates 36 a similar power signature against the observed power. When the bidirectional exchange flow from the Straits of Mackinac was isolated, the magnitudes of peaks in power spectra plots (yellow lines in Figures 13a and 13b) are reduced for periods between 2.5 and 5 days. Conversely, the magnitude of peaks increases for periods greater than five days. Overall, the bidirectional exchange flow tends to increase the power in frequencies with periods of less than five days, while isolating the effect will cause a decrease in power during periods of less than five days and an unrealistic increase in power for periods greater than five days. Figure 13. Power spectra of observed and simulated (a) eastward and (b) northward velocity components in the Hammond Bay area. The simulated volumetric flux at the Straits of Mackinac (AA transect shown in Figure 1d) using the CHM model and its power spectra are shown in Figure 5a&b from May to mid-September 2019. The volumetric flow (Figure 14a) exhibits an oscillating pattern with negative and positive values ranging from -64,490 m3/s to 59,263 m3/s. To maintain consistency, negative values 37 represent flow in the direction of Lake Huron, while positive values represent flow in the direction of Lake Michigan. The mean volumetric flow from May to mid-September is -43.84 m3/s toward Lake Huron. The power spectra of the volumetric flow at the Straits (Figure 14b) indicates that higher power magnitudes have periods between 2.0 to 3.9 days, with the highest peak at 2.1 days. This is in line with the previous observations by Anderson and Schwab (2013), who indicated dominant flow at the Straits with the period between 2 to 3 days caused by the Helmholtz mode. 38 Figure 14. (a, c, & e) Time series of the volumetric flow and (b, d, & f) power spectra at the transects AA, BB, and CC, respectively, (marked in Figure 1) in the Straits of Mackinac. 39 Figure 14c shows the volumetric flux comparison between the LH and CHM models at the BB transect, which is 20 km downstream of the Straits. In the LH model, isolation of bidirectional exchange flow from the Straits resulted in unrealistically low volumetric flow values at the BB transect, ranging from -10,664 m3/s to 3,865 m3/s, with a mean of -27 m3/s in the direction of Lake Huron. In contrast, the CHM model produced higher volumetric flow values at the BB transect, ranging from -26,320 m3/s to 21,684 m3/s, with a mean of -1,648 m3/s in the direction of Lake Huron. Similarly, the power spectra of the volumetric flux at the BB transect (Figure 14d) indicates that the CHM model simulated the flow, with most peaks falling between 2.0 to 4.0 days. This pattern is similar to the power spectra at the AA transect, indicating the dominant effect of the bidirectional exchange flow at the Straits on downstream regions. It is interesting to note that the power spectra of volumetric flow based on the LH model (blue line in Figure 14d) have no similarity in power magnitude and period to the power spectra based on the CHM model (red line in Figure 14d) which has to do with excluding the bidirectional exchange flow from the Straits in the LH model. The CHM and LH models were used to estimate the volumetric flux at the CC transect (Figure 5e). The CHM model predicted volumetric flux ranging from -35,005 m3/s to 19,122 m3/s, with a mean of 1,083 m3/s in the direction of Lake Michigan. Conversely, the LH model predicted volumetric flow ranging from -17,108 m3/s to 15,880 m3/s, with a net value of 1,997 m3/s in the direction of Lake Michigan. The difference in mean volumetric flux between LH and CHM models at the CC transect indicates a weakening of exchange flows compared to the lake-wide currents. This finding is in line with the energy density plots (Figure 5f) of LH and CHM at the CC transect, where currents with longer periods dominate the currents with shorter periods (the signature of the bidirectional exchange flows from the Straits of Mackinac). 40 Furthermore, the power spectra plot of the volumetric flow at the CC transect shows two distinct regions. The first region is for periods less than five days, and the second region is for periods greater than seven days. The dominant peaks in the power spectra of the LH model have periods of greater than six days, which fall into the second region of the period and can be attributed to the lake-wide circulation pattern. Conversely, the CHM model has peak values in both regions, where the peaks in the first region are attributed to the bidirectional flow from the Straits and the second region describing the lake-wide circulation. 2.3.2. First-Order timescales (Residence time and Flushing time) The residence time for the Hammond Bay area and its change in response to the bidirectional exchange flow from the Straits of Mackinac were investigated by instantaneous dye release with a constant concentration of 100 ppm at every node and layer of the numerical nodes inside the bay area. The time it takes for the vertically averaged dye concentration inside the bay area to fall below 1/𝑒 of its initial concentration is, by definition, the residence time of the bay area. This approach provides the residence time by dynamically calculating the fate and transport of a non- reactive tracer in space and time. 41 Figure 15. Time series of normalized (a) concentration and (b) mass inside the Hammond Bay area based on the LH and CHM models for the years 2018 and 2019. The residence time of the Bay is the time (in days) when the concentration (or mass) falls below the red line (y=1/e). Figure 15 shows the residence time estimation for the Hammond Bay area based on CHM and LH models for 2018 and 2019. The x-axis denotes the time from the instantaneous dye release in days, while the y-axis shows the vertically averaged normalized (𝐶/𝐶0) concentration or normalized mass. For each model, the residence time was determined when the vertically averaged normalized concentration (or mass) falls below the red line, representing 1/𝑒 (=36.78%) of the initial concentration. In all scenarios, the vertically averaged normalized concentration inside the bay reduces with time as it is transported and diluted by currents due to advection and dispersion. The residence time for the Hammond Bay area is 16 days for 2018 based on the LH model, while it is 9.87 days based on the CHM model. Similarly, the residence time for 2019 is 23.63 and 13.75 days for the LH and CHM models, respectively (Table 5). The residence time differences between the LH and CHM models each year are because of the effect of bidirectional exchange flow from the Straits of Mackinac on transport in the bay area. The results from Figure 15 and Table 5 indicate that the bidirectional exchange flow increases the transport and therefore reduces the residence time. Although there are significant differences in the computed residence times for the two years 42 based on concentration, results presented in Table 5 show that the estimates based on mass are more consistent across the years. Table 5. The transport timescales (residence time and flushing time) for the Hammond Bay area in 2018 and 2019. Transport Timescales (day) Residence Time (C/C0) Residence Time (M/M0) Flushing Time Year 2018 2019 2018 2019 2018 2019 LH model 16.00 23.62 21.00 23.62 14.38 10.80 CHM model 9.87 13.57 16.12 16.37 12.14 8.96 Reduction (%) 62.11 74.06 30.27 44.29 18.45 20.54 Similarly, as shown in Table 5, the residence time for each LH or CHM model was different in 2018 and 2019. Although the instantaneous dye release started on June 20th each year, the calculated residence times were different each year. This indicates the effect of current patterns and meteorological forcings on the residence time. Figure 16. Spatial maps of residence time in the Hammond Bay area in 2019 computed using (a) the LH and (b) the CHM models. 43 The spatial variation of the residence time in the Hammond Bay area is shown in Figure 16a&b for the LH and CHM models in 2019, respectively. Although a single value is reported for the mean residence time of the bay area, in reality, the residence time varies spatially inside the bay area. According to Figure 16a&b, the concentration in the nearshore areas (shallow waters), north, and northwest of the bay fall below the 1/𝑒 of the initial connection faster than do the other regions inside the bay. Similarly, the CHM model for 2019 (Figure 16b) produced lower residence times across most regions inside the bay relative to values from the LH model for the same year (Figure 16a). This has to do with the bidirectional exchange flow from the Straits of Mackinac and its effect on currents and hence solute transport in the Hammond Bay area. Moreover, the bidirectional exchange flow increased the transport (decreased the residence time) on both nearshore and offshore regions of the bay area. The flushing time was calculated with and without the bidirectional exchange flow from the Straits of Mackinac during summers of 2018 and 2019. The calculated flushing times based on the CHM models are smaller than the ones calculated based on the LH models for years 2018 and 2019 (Table 5). The bidirectional exchange flows from the Straits have caused an 18% and 20% reduction in the flushing time (Table 5) for years 2018 and 2019, respectively. Unlike the residence time, the flushing time was calculated for the whole bay area based on the water exchange from the open boundaries (Figure 17) and the volume of the bay area. Therefore, the flushing time is an integrative measure and provides no information about the spatial extent of transport inside the bay. 44 Figure 17. The water exchange from the open boundary of the Hammond Bay area during the summer based on the LH and CHM models for years 2018 and 2019. Similarly, Figure 17 shows the water exchange from the open boundary of the Hammond Bay area (shown using a red semi-circular arc in Figure 1d) during the summers of 2018 and 2019 for the LH and CHM models. The average values from these time series are used to calculate flushing times. Although both models follow the same trend, the CHM models show greater peaks in water exchange due to the effect of bidirectional exchange flows from the Straits of Mackinac, further affecting the reduction in the flushing time. This temporal variability in the water exchange at the open boundary for years 2018 and 2019, which was regulated by both the bidirectional exchange flows and local meteorological conditions, causes different flushing and residence times for the bay area. 45 Figure 18. A snapshot of current vectors showing the effect of bidirectional exchange flows in the Straits of Mackinac on surface, middle, and bottom layer velocities in the Hammond Bay area on Jun 28, 2018, at 9:00 AM UTC. (a, c, and e) currents based on the LH model and (b, d, and f) currents based on the CHM model. The red line shows the open boundary of the nested model. 46 Finally, the effects of exchange flows in the Straits on surface, middle, and bottom current velocities are shown in Figure 18. The results show that velocities from the two models (with and without the influence of exchange flows in the Straits) can be different with the combined CHM model generating significantly larger currents compared to those from the LH model. 2.4. Discussion Our results indicate that the LH model without the exchange flows is inconsistent with temperature and velocity observations in the Hammond Bay area and near the Straits of Mackinac. However, many previous studies in the Great Lakes region have used models that do not account for the connection between Lake Huron and Lake Michigan through the Straits. These models assume that the exchange flows between the lakes have no effect on the hydrodynamics of the lakes, which is only true for study sites far from the Straits. For example, if comparisons with observed currents are used to test the model at sites far from the Straits of Mackinac such as those in the Saginaw Bay in the south, previous researchers found good agreement with observed data using a single lake model. This does not change the fact that the LH-only model is an inferior model, especially from the point of lake-wide circulation tested using extensive field observations throughout the lake. However, the choice of working with the LH-only model may be justified if the focus is on a small area far from the Straits. Previous research that explored the contributions of barotropic and baroclinic modes to the seasonal variability in exchange flows at the Straits shows that the barotropic mode persists year- long with a ~3-day period of oscillation (the Helmholtz mode) and is constantly driven / modified by wind curl and atmospheric pressure gradients. During the summer months, however, stratification in the water column sets up a two-layer, bi-directional flow field driven by local meteorology representing a shift from the regional meteorology that drives the barotropic mode 47 (Anderson and Schwab, 2013 & 2017). Meteorological forcings (especially solar radiation) have a direct effect on the vertical temperature profile and thermocline depth. This becomes particularly important in the seasons where stratification is present in the lake, and density differences along with wind and barometric pressure differences are the main factors influencing water circulation in the lake. Figure 19. Comparisons between the ERA5 data and observations at the NDBC station (#45008) in Lake Huron. The figure shows the time series of (a) u- component of wind, (b) v-component of wind, and the scatterplots of (c) u- component of wind, (d) v-component of wind, and (e) air temperature. In addition, spatiotemporal differences in meteorological forcings produce differences in the exchange and bidirectional flows, which add to the complexity of hydrodynamics inside the two 48 lakes (Anderson and Schwab, 2013). We compared the wind speed and air temperature of the ERA5 dataset against the observed NDBC data in Lake Michigan and Lake Huron. The comparisons showed that the ERA5 data is in good agreement with the observed data (Figure 19 and 20). Additionally, when we used them separately for describing forcing fields for modeling the hydrodynamics and temperature, the comparisons with the ERA5 dataset produced more accurate results in Hammond Bay area (Figure 21 and Table 6) and in Lake Huron (Figures 22 and 23). This is likely attributed to the consistent data from the ERA5, and its finer spatial resolution compared with the land-based stations, particularly in Lake Huron where the density of the land- based stations is low at the offshore area. 49 Figure 20. Comparison between the ERA5 dataset and NDBC station (#45007) in Lake Michigan. The figure shows the time series of (a) u-component of wind, (b) v-component of wind, and the scatterplots of (c) u-component of wind, (d) v-component of wind, and (e) air temperature. 50 Figure 21. Comparisons between the observed and CHM simulated (using ERA5 and Land-based data) vertically-averaged (a) eastward velocity, (b) northward velocity, and (c) bottom temperature in the Hammond Bay area between June and August 2019. 51 Table 6. Performance metrics for observed and CHM-simulated (using ERA5 and Land-based data) eastward velocity (u), northward velocity (v), and bottom temperature (temp) between June and August, 2019. Model Variable RMSE R² CHM - ERA5 CHM - Land-based Perfect Value u (m/s) v (m/s) temp (°C) u (m/s) v (m/s) temp (°C) 0.052 0.062 1.848 0.068 0.082 2.634 0 0.733 0.747 0.794 0.642 0.712 0.736 1 RSR 0.701 0.771 0.611 0.927 1.020 0.871 0 Our results based on three-dimensional hydrodynamic and transport models show that it is important to include bidirectional exchange flows to accurately estimate residence and flushing times in bays as far down as 50 km from the Straits of Mackinac. The estimated flushing and residence times varied from 8.96 days to 23.62 days, depending on the year and the model/method used and the estimates are comparable to values for other bays in the Great Lakes region (e.g., Nguyen et al., 2014; Bravo et al., 2020). Additionally, the presence of tidal flushing can greatly enhance exchange in some environments, such as estuaries and coastal lagoons, leading to shorter flushing times compared to environments without tidal flushing. Previous estimates of residence times for estuaries and coastal lagoons using two-dimensional models are often of the order of a few days. For example, estimated residence times were less than 4 days for the Mersey Estuary, the largest estuary in the UK (Yuan et al., 2007); between 1 and 12 days for the Mondego Estuary in Portugal (Kenov et al., 2012); up to 2 days for intertidal flats in the Willapa Bay in Washington, USA (Wheat et al., 2019); and between 2 and 4 days for the Tapeng Bay in Taiwan (Lee et al., 2020). However, it is important to note that direct comparisons of flushing and residence times between different marine and freshwater environments should be made with caution, as these 52 estimates are highly dependent on specific characteristics of the water body, such as its size, shape, depth, and the nature of the inflows and outflows (Mahanty et al. 2016). Consequently, flushing times in some bays with tidal flushing, such as the Chincoteague Bay in Maryland, USA (Bratton et al., 2009), can be comparable to or higher than the estimates from the present work. To the best of our knowledge, the present work provides the first estimates of residence times in the Hammond Bay area in addition to estimates of errors introduced as a result of ignoring exchange flows in the Straits which can be as high as 74%. Computed e-folding residence times show significant spatial variability in both years with shorter residence times generally found near the shore (typically between 1.5 and 4.3 km depending on the model used; See Figure 16 in which distances to the outer edge of the low residence time bands are marked) and longer values in deeper waters. In general, the combined CHM model shows a wider band of low residence times values around the shore compared to the LH model. This is in contrast to results reported for some marine environments (e.g., Wheat et al., 2019) where increased residence times were noted towards the shore, and this is attributed to the fact that it becomes increasingly difficult for water reaching the shallower depths near the shore to mix with newer ocean water. However, the situation is different in the Great Lakes since the presence of a significant alongshore current component (Figure 12a shows peak u velocities around 0.4 m/s) allows water entering the nearshore area to be quickly removed. The presence of strong residence time gradients noted in our work has important implications for cross-shore physical and biological gradients as noted by Wheat et al. (2019). Residence time gradients modulate biogeochemical processes since differences in residence time between adjacent water masses can potentially lead to different rates of nutrient retention and processing. Figure 16 shows that a zone of low residence time values exists in the study area extending from nearly 1 km to a few km from the shore. 53 Cladophora and associated benthic algal data for this shoreline (Przybyla-Kelly et al. 2020, Przybyla-Kelly et al. 2021) show depth restricted distributions within this band of low residence time, highlighting the need for accurate estimates of water movements within this band for determining Cladophora growth conditions. The results of our study suggest that bidirectional exchange flows in the Straits of Mackinac significantly affect bottom currents, which in turn have important implications for biogeochemical processes, including the resuspension of bottom sediment, nutrient availability (e.g., nitrogen and phosphorus), and the growth and sloughing events of Cladophora. Analysis of the spatial plots of current velocities (Figures 18e and 18f) and associated time series data from both models (LH and CHM) indicated that the CHM model generated bottom currents that were, on average, 1.43 times greater than those from the LH model across the entire simulation period. At times, the bottom currents from the CHM model could be as high as 50 times the values from the LH model. Therefore, the use of the CHM model is necessary if the study site is close to the Straits (at least 50 km into both lakes), and it has significant implications for temperature and currents, as well as water quality variables. Furthermore, the water level fluctuations are highly modulated by the connection between the lakes at the Straits of Mackinac, and this modulation decreases with distance from the Straits. Taken together, our results indicate that bidirectional exchange flows in the Straits should be included in models to accurately simulate biogeochemical processes, including the fate and transport of nutrients, suspended sediment, and Cladophora. 54 Figure 22. Comparisons between the observed (NDBC Buoy data) and CHM-simulated water surface temperatures using (a and c) the ERA5 and (b and d) Land-based stations. The bidirectional exchange flows from the Straits of Mackinac have a dominant effect on water temperature from late June (DOY=180) to mid-September (DOY=260), altering temperature at the Straits and their vicinity. As a result, this decreases the rate of chemical and biological reactions, which may explain the better water quality in the northern regions of Lake Huron (Saylor and Miller, 1991; Anderson and Schwab, 2013). The comparison between the models and the observed data indicates that the bidirectional exchange flows do not significantly impact the vertical temperature distribution in central Lake Huron and in regions far from the Straits. 55 Figure 23. Comparisons between the observed (NDBC Buoy data) and (a and c) LH-simulated water surface temperatures and (b and d) CHM-simulated water surface temperature. The results of this study showed that the bidirectional exchange flows and local forcings alter the intensity, direction, and energy of the currents in the vicinity of Hammond Bay, which modulates the transport and residence times. Many studies have indicated the role of local forcings on transport and residence time. For example, river discharge tends to decrease the residence time in bays and lagoons by increasing transport and mixing (Nguyen et al., 2014 and Xiong et al., 2021). While other forcings, including tides and bidirectional flows, generate complex spatiotemporal variability of the residence time in bays and lagoons that may improve or hinder the mixing and residence time (Rynne et al. 2016 and Lee et al., 2020). Accurate hydrodynamic modeling is the key to understanding the spatiotemporal variability of the residence time as an indicator of 56 eutrophication, harmful algal blooms, and water quality (Dettmann et al. 2001; Ralston et al. 2015; Defne et al. 2015.) It is essential to elucidate the distinction between residence time and flushing time. Residence time dynamically tracks how long it takes for the concentration inside the bay to fall below 1/𝑒 of its original value, offering a detailed spatial insight by continually monitoring concentration changes, leveraging the capabilities of 3D models. Conversely, flushing time gives a broader average duration, working on a steady-state premise from the simulation's start to its end. 2.5. Conclusion The impact of bidirectional exchange flows in the Straits of Mackinac on transport time scales were examined in this study using 3D unstructured grid hydrodynamic and transport models based on FVCOM. Two separate large-scale models, a Combined Lake Huron-Michigan model (CHM), and the Lake Huron model (LH) were used to simulate hydrodynamics, temperature, and dye transport. The models were evaluated against data from an ADCP deployed in the Hammond Bay area, Lake Huron in 2019. The CHM model accurately captured the water exchange and bidirectional flow between Lakes Huron and Michigan at the Straits of Mackinac. It was found that hydrodynamics, temperature, and transport cannot be accurately modeled without considering the bidirectional exchange flow in the regions close to the Straits. Regarding the effect of bidirectional exchange flow on the circulation and temperature in the Hammond Bay area for 2018 and 2019, our results showed that the bidirectional exchange flows led to significant alterations in both years. In terms of the changes in the volumetric flux and transport due to the bidirectional exchange flows, the CHM model indicated a dominant effect on water temperature from late June to mid-September. This had implications for water quality, particularly in the northern regions of Lake Huron. The volumetric flux at the Straits of Mackinac 57 exhibited oscillating patterns, with the CHM model showing significant differences compared to the LH model, especially in regions close to the Straits. Our results indicate that the presence of bidirectional exchange flows can have a direct impact on the residence times in the Hammond Bay area by altering the hydrodynamics and transport, causing the residence time and flushing time to be overestimated by as much as 74% and 20%, respectively, if the exchange flows are not considered. The research illuminates the importance of exchange flows and local forcings in altering transport timescales and regulating biogeochemical processes in regions affected by the complex hydrodynamics in the region. 58 CHAPTER 3 EVALUATING THE ROLE OF GROUNDWATER AND SUBMERGED SINKHOLES IN CIRCULATION AND THERMAL STRUCTURE OF THUNDER BAY, LAKE HURON 3.1. Introduction Lake ecosystems serve as vital indicators for understanding the effects of climate change. Their intertwined surface and subsurface processes within their encompassing watersheds make them particularly sensitive to climate shifts [Adrian et al., 2009; Schindler, 2009]. From 1985 to 2009, an alarming trend was observed as global lake surface temperatures rose at an average rate of 0.34°C per decade. This warming phenomenon is not limited to just lakes; rivers, streams, and even shallow groundwater have displayed similar warming trends [Kaushal et al., 2010; Menberg et al., 2014]. However, it is essential to note the inherent differences in the rates at which surface and groundwater systems respond to such changes. Because of the heat retention capabilities of soils and the moderating influence of vegetation, groundwater temperatures exhibit a marked stability throughout the year. This unique characteristic often mitigates the impacts of surface warming, especially in ecosystems where lakes are predominantly fed by groundwater. While numerous studies have documented the impacts of climate change on the Laurentian Great Lakes, encompassing trends like rising annual temperatures, truncated winter durations, and diminishing ice covers [Austin and Colman, 2007; Nguyen et al., 2014], recent research suggests a distinct response from bays. Bays, despite being part of the larger lake systems, might manifest 59 different responses to climatic change compared to the world's primary lakes [Winslow et al., 2015]. The pivotal role of temperature in shaping ecosystem dynamics cannot be overstated. As a fundamental driver, it directly influences biological productivity and indirectly affects processes such as nutrient cycling, eutrophication, and hypoxia. These changes, in turn, precipitate shifts in lake circulation and thermal stratification. Most conventional lake models, however, often ignore the impacts of groundwater including the changes in local temperatures associated with groundwater influx, even though their significance can be substantial [Kettle et al., 2012]. Smaller bays, exemplified by Thunder Bay, inherently have shorter fetch lengths. Such bays are often shielded from dominant wind and lake-wide circulation patterns, leading to diminished wind- induced turbulent mixing [Hondzo and Stefan, 1993; Markfort et al., 2010]. These localized conditions heighten the influence of water clarity, making the mixed-layer depths in these bays more sensitive than in their larger counterparts [Heiskanen et al., 2015]. Furthermore, the way solar radiation permeates the water column dictates a series of interconnected processes across various lake layers [Simpson and Dickey, 1981b]. A nuanced example of this intricate relationship is how phytoplankton compete for vital resources in a stratified water column, with the equilibrium of light attenuation and nutrient uptake being a deciding factor [Yoshiyama et al., 2009, Safaie et al., 2021]. Recognizing the importance of groundwater and its interactions is not a concern restricted to the Great Lakes. Its significance has been underscored in diverse regions worldwide. For instance, in Central Italy, researchers delved into groundwater's role in influencing sinkholes, shedding light on intricate hydrological and geochemical dynamics [Tuccimei et al., 2005]. A similar emphasis on understanding groundwater's contribution is evident in the karst regions of Quintana Roo, 60 Mexico [Perez-Flores et al., 2019]. Closer to the subject at hand, in Lake Huron, submerged sinkholes have been the subject of study, unveiling distinct biogeochemical patterns [Biddanda et al., 2006]. This research is anchored in understanding how groundwater modifies the hydrodynamics and thermal characteristics of Thunder Bay. This research focuses on the summer stratification period and introduces the groundwater impact via strategically defined boundary conditions. We aim to elucidate two primary questions: (a) In overlooking groundwater contributions, what alterations in summer stratification and circulation patterns emerge? This inquiry gains significance in light of previous research which has spotlighted groundwater's indispensable role in shaping lake temperatures and circulation dynamics [Safaie et al., 2017; Safaie et al., 2021]. (b) Does groundwater in Thunder Bay solely originate from the already identified sinkholes? Or is there evidence to support the view that groundwater maybe an important source of water throughout the bay? Given Thunder Bay's karst topography, this question is paramount. To guide readers through this chapter, we begin with a detailed description of the study site, proceed with an overview of the collected data, and delve into the intricacies of the hydrodynamic model employed. This is complemented by a description on how the bathymetric map of the lake was used to create the numerical mesh. Further, our approach to integrating meteorological data with the lake model is elucidated. Given the unavailability of direct groundwater flux data, we juxtapose our model's performance against available observational water temperature and velocity data. The chapter concludes with a comprehensive evaluation of our results and findings, supplemented by the discussion and conclusion. 61 3.2. Materials and Methods 3.2.1. Study Area Thunder Bay is situated within Lake Huron (Figure 24) and stands out as the location of the Thunder Bay National Sanctuary, the first National Marine Sanctuary in the Great Lakes and the thirteenth of its kind in the United States. The sanctuary's significance partly stems from its numerous shipwrecks dating from the mid-1800s to the mid-1900s. Consequently, historical endeavors, such as those by Coleman (2002), have undertaken mapping efforts to chart these shipwrecks. Thunder Bay has an average depth of 20 meters and encompasses a surface area of Figure 24. Lake Huron map showing bathymetric contours, Thunder Bay area, ERA5 grid points, NDBC stations, and the location of the ADCP and Thermistor chain deployments. 62 approximately 210 km². The intricate shoreline configuration and bay shape inherently reduces its interaction with Lake Huron's primary circulation dynamics. Figure 25 provides a visual representation of the identified sinkholes in proximity to Thunder Bay. The unique characteristics of Thunder Bay, especially its sinkholes and karst formations, have been explored in numerous past studies. These studies have highlighted the distinct biogeochemical attributes of Thunder Bay, juxtaposed against other regions of Lake Huron, with particular attention to factors like conductivity, oxygen levels, nutrient content, cyanobacteria presence, and more. Furthermore, the United States Geological Survey (USGS) has initiated field surveys within Thunder Bay, seeking to address Cladophora-related queries in the Great Lakes. From an ecological standpoint, the work of Johnson and VanAmberg (1995) provides insights into Lake trout's natural reproduction and rehabilitation as explored by Eshenroder et al. (1995). Concurrently, habitat remediation studies concerning lake trout have also been pursued, as evidenced by the findings of Marsden et al. (2016). 63 Figure 25. Locations of identified sinkholes inside and near Thunder Bay, Lake Huron and the deployment sites in Thunder bay, Lake Huron. 3.2.2. Field Measurements In the summers of 2019 and 2022, we deployed three Acoustic Doppler Current Profilers (ADCPs) and one thermistor chain within Thunder Bay, precisely pinpointed in Figure 24. One ADCP was positioned in shallow waters characterized by a rocky bed, whereas the other two were set within deeper sandy terrains. Details of these deployments, including geographic coordinates, deployment depths, bin sizes, and the years of deployment, are summarized in Table 7. Notably, the ADCPs also measured water temperature at the depths at which the instruments were deployed. The thermistor chain in Thunder Bay was equipped with four Onset HOBO Pro v2 temperature 64 sensors. These sensors, with an accuracy of 0.2°C, were positioned close to the deeper ADCP deployment site. Figure 26 shows the setup of the thermistor chain in Thunder Bay. Table 7. Details of the instruments deployed in Thunder Bay, Lake Huron. Station Instrument Depth (m) Bin size (m) Deployment Location Latitude Longitude Shallow water 600 kHz Monitor (2022) ADCP Deep Water 1000 kHz Sentinel (2022) V20 ADCP Deep Water 1000 kHz Sentinel (2019) V20 ADCP 6.9 0.25 44.9897 -83.4031 15.2 0.30 44.9870 -83.3397 12.2 0.60 45.0127 -83.3680 Thermistor Chain (2022) Onset HOBO Pro V2 temperature 15.5 Sensor 0.64, 1.61, 2.91, 5.21 44.9851 -83.3386 65 Figure 26. Schematic of the thermistor chain deployed in Thunder Bay in 2022. 66 3.2.3. Numerical Model Setup The Finite Volume Community Ocean Model (FVCOM) has previously demonstrated its efficacy in simulating hydrodynamics across all North American Great Lakes [Bai et al., 2013], including Lake Michigan [Luo et al., 2012; Rowe et al., 2015; Safaie et al., 2016, 2017a], Lake Huron [Nguyen et al., 2014], and other significant water bodies. The details of the FVCOM governing equations and turbulence closures can be found in Chapter 2. For the purpose of this research, meteorological forcings from the ERA5 dataset by the European Center of Medium-Range Weather Forecast (ECMWF) were complemented by readings from NDBC stations close to Thunder Bay. This integrated dataset, incorporating meteorological data such as wind speed, direction, air temperature, humidity, solar radiation, cloud cover, was then interpolated to the numerical mesh using the natural neighbor method. The ERA5 grid points and NDBC stations used for describing the forcings are shown in Figure 24. 67 Figure 27. FVCOM computational mesh for Lake Huron and Thunder Bay. Additionally, shoreline and bathymetric data of Lake Huron were downloaded from the National Oceanic and Atmospheric Administration's Great Lakes Environmental Research Laboratory (NOAA-GLERL) database. This bathymetric data, with a resolution of approximately 90 m, was also interpolated to the numerical mesh using the natural neighbor method. The FVCOM uses an unstructured triangular mesh for its simulations. We employed the Surface Water Modeling System (SMS) software to generate this mesh. The mesh adaptation based on bathymetric and shoreline proximities ensures an accurate representation of our study site. Notably, our model utilizes a sigma-level coordinate system with 20 sigma-layers (21 levels) for vertical discretization, enhancing the model's accuracy [Mellor et al., 2002]. The mesh, with its variable sizes contingent 68 upon distance from the shoreline and bathymetry gradients, comprises 25,239 nodes and 46,918 elements (Figure 27). Our simulations spanned from April 1st to September 30th for both 2019 and 2022, with the initial months, April and May, designated as the spin-up time. The modeling was executed with a timestep of 1.5 seconds, saving outputs at hourly intervals. 3.2.4. Groundwater Scenarios Obtaining precise groundwater information, particularly in regions with karst formations like Thunder Bay, remains challenging. While techniques like seepage meters offer solutions [Mendoza-Sanchez et al., 2013], alternative methods involving chemical tracers, stable isotopes, or heat-budget analyses have also been considered. However, given the absence of comprehensive groundwater data in our case, we resorted to modeling groundwater fluxes at different magnitudes (Table 8). We applied these fluxes at designated sinkhole locations and also at nodes within Thunder Bay. The temperature boundary conditions for the groundwater simulations were based on the data from the thermistor chain sensor located at approximately 0.64 m from the lakebed. The thermistor chain readings suggest that the groundwater temperature has higher temperature than the lake water. Therefore, in our simulation cases, we included the bottom boundary condition of temperature not only based on the readings from the closest sensor to the lakebed (0.64 m) but also, we ran cases where the groundwater temperature is 2 °𝐶 higher than the readings from the closest sensor to the lake bed. In addition, since no thermistor chain and bottom boundary temperature were available in 2019, we used the thermistor chain data for 2022 in the model for 2019. 69 Table 8. Simulation cases for groundwater and temperature contribution in Thunder Bay. Case # GW Flux (m/s) Details Case 0 0 No GW applied Case 1 0.00001 Applied at the Sinkholes Case 2 0.0001 Applied at the Sinkholes Case 3 0.001 Applied at the Sinkholes Case 4 0.01 Applied at the Sinkholes Case 5 0.00001 Applied to all nodes in the Bay Case 6 0.00001 Applied to half of the nodes in the Bay Case 7 0.0001 Applied to half of the nodes in the Bay Case 8 0.00001 Temp +2 °C & Applied to all nodes in the Bay Case 9 0.00001 Temp +2 °C & Applied at the Sinkholes In FVCOM, the groundwater flux and temperature are applied at the nodes of the computational mesh as time series. This will change the bottom boundary condition of the velocity as: 𝐾𝑚 ( 𝜕𝑢 𝜕𝑧 , 𝜕𝑣 𝜕𝑧 ) = 1 𝜌0 (𝜏𝑏𝑥, 𝜏𝑏𝑦), 𝑤 = −𝑢 𝜕𝐻 𝜕𝑥 − 𝑣 𝜕𝐻 𝜕𝑦 + 𝑄𝑏 Ω at 𝑍 = −𝐻(𝑥, 𝑦) (1) where (𝜏𝑏𝑥, 𝜏𝑏𝑦) = 𝐶𝑑√𝑢2 + 𝑣2 (𝑢, 𝑣) is the x and y components of bottom stresses, 𝑄𝑏 is the groundwater volume flux at the bottom boundary and Ω is the area of the groundwater source. Introducing volumetric groundwater flow through the bottom boundary will cause the volume change in the continuity equation. Since the prescribed groundwater has a different temperature 70 compared to the lake water, it will change the temperature of the water column as well. The following equation is used for the temperature bottom boundary condition. 𝜕𝑇 𝜕𝑡 + 𝑞𝑖𝑛 𝜕𝑇 𝜕𝑧 = 𝐷𝑧 𝜕2𝑇 𝜕𝑧2 at 𝑍 = −𝐻(𝑥, 𝑦) (2) Where T is the water temperature in °𝐶 as a function of depth (z) and time (t), 𝑞𝑖𝑛 is the rate of groundwater flux (𝑚𝑠−1), 𝐷𝑧 is the vertical thermal diffusion coefficient (𝑚2𝑠−1), and H(x,y) is the bottom depth. The value of 𝐷𝑧 in all simulations were set to 1 𝑚2/𝑑𝑎𝑦 based on Safaie et al. (2017). 3.3. Results and Discussion We conducted a total of 10 simulations, including groundwater flux and temperature at both the sinkhole and within the nodes inside the bay. Case0 refers to the base condition that does not incorporate any groundwater contribution. To determine the best-performing model, we compared the simulation results of these 10 cases against the observed water temperature and velocity (ADCP and thermistor chain data) from 2022. After identifying the top 4 models, we ran these best models for the year 2019 to understand similarities and differences in processes between the two years. 3.3.1. Water temperatures Table 9 presents the performance scores between the simulations (from Case0 to Case9) and the observed water temperatures at the thermistor chain deployment site in Thunder Bay. A significant reduction in error metrics (RMSE and RSR) was observed when we incorporated groundwater flux and temperature into the model. For instance, the RMSE value for the first sensor, which is located close to the lakebed, dropped from 3.312 in Case0 to 1.024 in Case7. A similar trend was noticed with other temperature sensors, where the addition of groundwater flux and temperature yielded a better match with the observed data. 71 Figure 28 illustrates the comparison between the simulated and observed temperature time series at the thermistor chain deployment location. For clarity in the figure, only Case0, Case4, Case7, and Case8 are displayed. Among these four cases, Case7 and Case8 were more accurate in predicting the temperature than the other cases, as reflected in Table 9, especially for the bottom three sensors where the RMSE values ranged from 1.024 to 2.883. However, the accuracy of the models' predictions diminished for the upper sensor, registering an RMSE of 3.107. This discrepancy might be attributed to its closeness to the thermocline. Table 9. Performance scores between the thermistor chain temperature in deep water and simulation cases during 2022. THC 1 RMSE R² RSR THC 2 RMSE R² RSR Case0 3.312 0.676 1.136 Case0 3.925 0.572 1.201 Case1 3.306 0.679 1.134 Case1 3.928 0.566 1.201 Case2 3.297 0.685 1.131 Case2 3.889 0.585 1.190 Case3 3.704 0.783 1.271 Case3 4.329 0.755 1.324 Case4 7.942 0.699 2.725 Case4 8.819 0.693 2.697 Case5 1.853 0.883 0.636 Case5 2.696 0.856 0.825 Case6 2.222 0.866 0.762 Case6 3.053 0.838 0.934 Case7 1.024 0.956 0.351 Case7 1.898 0.916 0.580 Case8 1.599 0.849 0.549 Case8 2.301 0.798 0.704 Case9 3.305 0.681 1.134 Case9 3.922 0.571 1.199 THC 3 RMSE R² RSR THC 4 RMSE R² RSR Case0 3.910 0.531 1.237 Case0 3.311 0.535 1.377 Case1 3.901 0.532 1.234 Case1 3.305 0.531 1.374 Case2 3.877 0.566 1.226 Case2 3.336 0.592 1.387 Case3 4.521 0.711 1.430 Case3 4.423 0.627 1.839 72 Table 9 (cont’d) Case4 9.389 0.642 2.969 Case4 9.301 0.582 3.867 Case5 3.305 0.815 1.045 Case5 3.612 0.763 1.502 Case6 3.518 0.794 1.112 Case6 3.570 0.707 1.484 Case7 2.870 0.846 0.908 Case7 4.030 0.749 1.676 Case8 2.883 0.723 0.912 Case8 3.107 0.704 1.292 Case9 3.892 0.539 1.231 Case9 3.297 0.537 1.371 Figure 28. Plots of observed water temperature measured by the thermistor chain in deep water in 2022. A similar trend was evident in the ADCP temperature sensor data, as shown in Table 10. Specifically, for the deep water ADCP, the RMSE value reduced from 2.731 in Case0 to 2.418 in Case8. However, for the shallow water ADCP, there was no significant change in the RMSE value between Case0 and Case8. Figure 29 displays the time series of water temperatures at the locations of both deep and shallow water ADCPs in 2022. As inferred from Figure 29, Case7 and Case8 73 accurately predicted the water temperatures from late June to late August. Nonetheless, their predictive accuracy reduced in September. This decline can be attributed to the lack of a precise bottom boundary condition for groundwater temperature, given that thermistor chain data was only available from late June to late August. Table 10. Performance scores between the temperature at the location of deep and shallow ADCP and simulation cases during 2022. Temp (deep) RMSE R² RSR Temp (Shallow) RMSE R² RSR Case0 2.731 0.599 0.840 Case0 1.983 0.759 0.746 Case1 2.731 0.598 0.840 Case1 1.993 0.760 0.750 Case2 2.720 0.614 0.836 Case2 2.113 0.762 0.795 Case3 3.120 0.662 0.959 Case3 3.139 0.685 1.181 Case4 7.430 0.440 2.285 Case4 9.411 0.466 3.540 Case5 2.447 0.659 0.753 Case5 2.555 0.662 0.961 Case6 2.409 0.697 0.741 Case6 2.227 0.740 0.838 Case7 3.367 0.483 1.035 Case7 4.083 0.487 1.536 Case8 2.418 0.704 0.744 Case8 2.072 0.695 0.779 Case9 2.725 0.602 0.838 Case9 1.994 0.759 0.750 74 Figure 29. Plots of water temperature at the location of (top) deep and (bottom) shallow ADCP deployment in 2022. In Case4, despite the introduction of groundwater flux and temperature to the model, we observed an elevated RMSE value for temperature—both from ADCP and the thermistor chain. This discrepancy can be attributed to the excessive groundwater flux introduced into the model, which led to an artificial mass increment. As a result, there was a noticeable rise in the water surface elevation, as depicted in Figure 30. This unintended elevation surge, which consequently increased the water depth, meant that the bottom segment of the water body absorbed less energy from solar radiation. Hence, the water temperature near the lakebed did not exhibit a significant rise. 75 Figure 30. Plot of water surface elevation at the location of the deep ADCP deployment in 2022. Table 11 presents the performance scores between the simulated and observed water temperatures at the ADCP deployment site for 2019. Given that we used the 2022 thermistor chain water temperature data to set the boundary condition for the 2019 model, Case8 and Case0 demonstrated superior performance compared to the other cases. Examining the water temperature time series in Figure 31, it is evident that using the 2022 thermistor chain water temperature data led to a discrepancy between the trends of the simulated cases with groundwater (namely Case4, Case7, and Case8) and the observed water temperature for 2019. This finding underscores the significance that water temperature data from one year might not be directly transferable as a boundary condition for models representing different years. 76 Table 11. Performance scores between the temperature at the location of deep ADCP and simulation cases during 2019. Temp (deep) RMSE R² RSR Case0 2.290 0.889 0.731 Case4 7.139 0.870 2.279 Case7 2.880 0.537 0.919 Case8 2.326 0.705 0.742 Figure 31. Plot of water temperature at the location of the deep ADCP deployment in 2019. 3.3.2. Horizontal water velocity components We further compared the simulated horizontal velocity components (u & v) of the models against the observed velocities from the ADCPs situated in Thunder Bay. The performance metrics for the eastward and northward velocity components—measured at the top, middle, and bottom cells at the ADCP deployment site in deep water for 2022—are detailed in Table 12. As per Table 12, the introduction of groundwater flux and temperature did not significantly alter the performance metrics. Interestingly, the performance metrics for the top cell exhibit a broader range of values than those for the middle and bottom cells. 77 Figures 32 and 33 depict the time series of horizontal velocity components for the top, middle, and bottom cells of the ADCP positioned in deep water during 2022. It is evident that the magnitude of the velocity components diminishes as one progresses from the lake surface towards the lakebed. Moreover, the range of the simulated velocity magnitudes aligns closely with the observed data in the deep water for that year. Table 12. Performance scores between the velocity at top, mid, and bottom cells of ADCP and simulation cases in deep water during 2022. U (top) RMSE R² RSR V (top) RMSE R² RSR Case0 Case1 Case2 Case3 Case4 Case5 Case6 Case7 Case8 Case9 0.070 0.317 1.731 0.070 0.319 1.729 0.070 0.318 1.732 0.069 0.355 1.710 0.079 0.316 1.951 0.070 0.366 1.729 0.071 0.320 1.759 0.083 0.404 2.059 0.073 0.354 1.807 0.070 0.313 1.735 Case0 Case1 Case2 Case3 Case4 Case5 Case6 Case7 Case8 Case9 0.082 0.294 1.614 0.082 0.291 1.620 0.083 0.278 1.640 0.086 0.252 1.690 0.094 0.255 1.849 0.078 0.360 1.532 0.081 0.319 1.593 0.075 0.281 1.481 0.081 0.330 1.583 0.082 0.291 1.620 U (mid) RMSE R² RSR V (mid) RMSE R² RSR Case0 Case1 Case2 Case3 Case4 Case5 Case6 Case7 Case8 Case9 0.060 -0.001 1.537 0.059 -0.009 1.526 0.059 -0.014 1.526 0.057 -0.030 1.459 0.054 0.045 1.396 0.059 -0.054 1.521 0.061 -0.086 1.572 0.060 0.094 1.548 0.057 -0.047 1.462 0.059 -0.001 1.524 Case0 Case1 Case2 Case3 Case4 Case5 Case6 Case7 Case8 Case9 78 0.053 0.143 1.399 0.053 0.147 1.398 0.054 0.120 1.415 0.054 0.145 1.415 0.060 0.037 1.577 0.047 0.213 1.230 0.050 0.167 1.313 0.054 0.156 1.424 0.047 0.180 1.246 0.053 0.155 1.391 Table 12 (cont’d) U (bot) RMSE R² RSR V (bot) RMSE R² RSR Case0 Case1 Case2 Case3 Case4 Case5 Case6 Case7 Case8 Case9 0.045 0.114 1.372 0.045 0.094 1.378 0.045 0.094 1.375 0.043 0.138 1.320 0.044 0.126 1.365 0.040 0.167 1.234 0.041 0.145 1.247 0.055 0.110 1.695 0.039 0.133 1.186 0.044 0.105 1.369 Case0 Case1 Case2 Case3 Case4 Case5 Case6 Case7 Case8 Case9 0.045 0.408 1.144 0.045 0.409 1.142 0.044 0.408 1.123 0.042 0.385 1.062 0.041 0.363 1.044 0.042 0.345 1.071 0.041 0.373 1.045 0.042 0.241 1.068 0.042 0.392 1.056 0.045 0.401 1.147 Figure 32. Plots of observed eastward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in Deep water in 2022. 79 Figure 33. Plots of observed northward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in Deep water in 2022. Similar to the deep water ADCP, the introduction of groundwater flux and temperature did not bring about significant changes to the horizontal velocity components of the shallow water ADCP in 2022. Table 13 provides the performance metrics comparing the simulated cases with the observed horizontal velocities in shallow water for that year. Notably, the performance metrics for Case1 through Case9 are in the same range as that of Case0, which operates without groundwater influence. Figures 34 and 35 depict the time series of both simulated eastward and northward velocity components alongside the ADCP velocities for shallow waters. Except for Case4, there is a commendable alignment between the simulated and observed velocities. 80 Table 13. Performance scores between the velocity at top, mid, and bottom cells of ADCP and simulation cases in shallow water during 2022. U (top) RMSE R² RSR V (top) RMSE R² RSR Case0 0.044 0.353 1.226 Case0 0.062 0.577 1.025 Case1 0.044 0.349 1.226 Case1 0.062 0.577 1.029 Case2 0.045 0.335 1.246 Case2 0.063 0.567 1.046 Case3 0.049 0.301 1.375 Case3 0.076 0.502 1.262 Case4 0.063 0.244 1.758 Case4 0.095 0.391 1.586 Case5 0.046 0.309 1.287 Case5 0.062 0.560 1.040 Case6 0.045 0.334 1.262 Case6 0.062 0.561 1.038 Case7 0.057 0.213 1.577 Case7 0.066 0.575 1.102 Case8 0.047 0.282 1.294 Case8 0.062 0.550 1.030 Case9 0.044 0.349 1.228 Case9 0.062 0.574 1.032 U (mid) RMSE R² RSR V (mid) RMSE R² RSR Case0 0.031 0.062 1.466 Case0 0.054 0.308 1.336 Case1 0.031 0.064 1.449 Case1 0.055 0.307 1.340 Case2 0.031 0.042 1.476 Case2 0.056 0.268 1.378 Case3 0.034 -0.023 1.595 Case3 0.064 0.198 1.575 Case4 0.037 -0.039 1.735 Case4 0.079 -0.074 1.944 Case5 0.031 0.072 1.488 Case5 0.054 0.242 1.317 Case6 0.031 0.073 1.470 Case6 0.055 0.256 1.339 Case7 0.033 0.088 1.540 Case7 0.051 0.157 1.262 Case8 0.031 0.079 1.469 Case8 0.052 0.274 1.266 Case9 0.031 0.070 1.451 Case9 0.055 0.307 1.337 81 Table 13 (cont’d) U (bot) RMSE R² RSR V (bot) RMSE R² RSR Case0 0.032 0.136 1.440 Case0 0.046 -0.135 1.566 Case1 0.032 0.133 1.437 Case1 0.045 -0.125 1.555 Case2 0.032 0.143 1.431 Case2 0.047 -0.181 1.602 Case3 0.033 0.120 1.490 Case3 0.047 -0.154 1.598 Case4 0.067 -0.050 3.027 Case4 0.064 -0.118 2.192 Case5 0.032 0.144 1.422 Case5 0.044 -0.099 1.517 Case6 0.032 0.166 1.418 Case6 0.045 -0.130 1.555 Case7 0.029 0.128 1.299 Case7 0.043 0.071 1.478 Case8 0.030 0.153 1.333 Case8 0.042 -0.053 1.451 Case9 0.032 0.146 1.433 Case9 0.046 -0.134 1.571 82 Figure 34. Plots of observed eastward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in Shallow water in 2022. Figure 35. Plots of observed northward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in shallow water in 2022. 83 We also evaluated the horizontal velocity components of the simulated cases against the ADCP data collected in deep water in 2019. Table 14 provides a summary of the performance metrics comparing select simulated cases with the observed data. The integration of groundwater flux and temperature into the model enhanced the performance metrics across the top, middle, and bottom cells. Figures 36 and 37 display the time series of horizontal velocity components for chosen cases in relation to the observed ADCP data for 2019. The simulated velocity components exhibit a solid alignment with the observed data, with Case8 standing out due to its optimal trend and performance metrics, as detailed in Table 14. Table 14. Performance scores between the velocity at top, mid, and bottom cells of ADCP and simulation cases in deep water during 2019. U (top) RMSE R² RSR V (top) RMSE R² RSR Case0 Case4 Case7 Case8 0.059 0.386 1.283 0.063 0.436 1.367 0.061 0.423 1.319 0.053 0.492 1.150 Case0 Case4 Case7 Case8 0.072 0.383 1.390 0.085 0.372 1.644 0.077 0.398 1.484 0.070 0.407 1.361 U (mid) RMSE R² RSR V (mid) RMSE R² RSR Case0 Case4 Case7 Case8 0.045 0.213 1.274 0.049 -0.017 1.393 0.045 0.170 1.265 0.043 0.151 1.217 Case0 Case4 Case7 Case8 0.052 0.160 1.336 0.049 0.119 1.258 0.050 0.329 1.274 0.044 0.270 1.121 U (bot) RMSE R² RSR V (bot) RMSE R² RSR Case0 Case4 Case7 Case8 0.042 0.279 1.070 0.046 0.055 0.042 0.103 0.192 0.188 1.183 1.417 1.077 Case0 Case4 Case7 Case8 0.043 0.417 0.924 0.044 0.048 0.042 0.359 0.332 0.442 0.940 1.014 0.904 84 Figure 36. Plots of observed eastward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in deep water in 2019. Figure 37. Plots of observed northward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in deep water in 2019. 85 3.3.3. Upward velocity Figures 38 and 39 depict the simulated and observed upward velocity components at the sites of the deep and shallow water ADCPs in 2022. The simulated upward velocity component is notably smaller, being two orders of magnitude less than the observed upward velocity. Furthermore, integrating groundwater flux through the sinkholes and the nodes within the bay has not bridged this discrepancy. Notably, the observed upward velocities in the top and middle cells, as seen in Figures 38 and 39, display similar oscillating patterns around zero. In contrast, the upward velocity in the bottom cell exhibits a more intricate pattern than those in the middle and top cells. The values at the bottom cell do not oscillate around zero and have magnitudes greater than that of the middle cell. Figure 38. Plots of observed upward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in deep water in 2022. 86 Figure 39. Plots of observed upward velocity at the top, mid, and bottom cell of the ADCP and best simulation cases in shallow water in 2022. Figure 40 presents the contour plot of the upward velocity component as captured by the fifth beam of the Sentinel-V ADCP instrument in deep water areas for 2019 and 2022. A notable similarity between the two contour plots is the pronounced upward trend observed in August for both years. Additionally, both plots reveal a robust upward velocity magnitude in the bottom cell, situated near the lakebed. 87 Figure 40. Contour plots of upward velocity component from the 5th beam of the ADCP in deep water in (top) 2019 and (bottom) 2022. Figure 41 displays the spectral analysis of the observed upward velocity component for the top, middle, and bottom cells of the deep water ADCPs in both 2019 and 2022. Despite the sampling duration differing between 2019 and 2022, both power spectra plots in Figure 41 reveal pronounced power spectral magnitudes in the bottom cell near the lakebed, which are conspicuously absent in the middle and top cells. In 2022, the peak magnitude of the upward 88 velocity at the bottom cell is observed with periods ranging between 13-17 days, while in 2019, this peak falls between 7-12 days. Figure 41. Plots of power spectral density of the observed upward velocity component at the top, middle, and bottom cell of the ADCPs in deep water in (top) 2022 and (bottom) 2019. Our results clearly show the significant role of groundwater in Thunder Bay of Lake Huron in modulating hydrodynamics and temperature. Temperature comparisons between the model scenarios and the thermistor chain data show that Case8 (with 1.0E-4 m/s groundwater flux and +2 °C temperature) performed better relative to other cases. This is in line with the work of Baskaran et al. (2016), who suggested that the groundwater intrusion into the lake comes not only from the sinkholes but also from the karst formations, leading to other micro-seepages of groundwater into the lake. 89 In addition, the horizontal velocity comparison (𝑢 and 𝑣) closest to the lakebed at all ADCP locations show improvement in performance based on their metric scores. However, the vertical upward velocities at all ADCP locations were significantly underestimated by the models for all scenarios. The vertical upward velocities in deep water stations (2019 and 2022) demonstrate a distinctive pattern with longer periods and higher intensity compared with the other layers, which are indicative of groundwater intrusion from the lakebed. Although the estimated groundwater flux through the sinkholes was estimated around 0.06 m/s based on Baskaran et al. (2016), we did not apply the groundwater flux greater than 0.1 m/s due to its effect on the water elevation. Similarly, groundwater fluxes through the karst formation are probably several order of magnitude lower than the fluxes through the sinkholes. 3.4. Conclusion Our findings provide insightful details about the complex interplay between groundwater and lake hydrodynamics. The influx of groundwater into Thunder Bay plays a pivotal role in shaping its thermal structure and internal circulation. Scenarios factoring in groundwater flux and temperature underscored the significance of groundwater contributions without which vertical temperature estimations are notably lower. This misestimation subsequently impacts the circulation patterns, particularly in the vertical direction. While the incorporation of groundwater flux and temperature led to considerable enhancements in predicting water temperatures, it fell short in accurately capturing the upward velocity component. The horizontal velocity components exhibited variable behaviors with the introduction of groundwater flux and temperature, with improvements in some instances and slight regressions in others. However, introducing groundwater flux with the magnitude of E-04 m/s (Case 8) improved the results in both temperature and velocity compared to the observed data. Furthermore, our 90 analysis posits the necessity for groundwater to both supply and drain the lake. In the scenario represented by Case4 (with groundwater flux of 0.01 m/s), where groundwater flux was at its peak, the water surface elevation escalated by up to 3.5 meters, inducing a profound alteration in circulation and temperature projections. This further demonstrated the intricate interplay between the groundwater flux and lake hydrodynamics. Spectral analysis of the upward velocity components indicates distinct patterns in power spectra magnitudes and longer periods. Specifically, at the bottom cell of the ADCP, located near the lakebed, these patterns differ markedly from those in the middle and top cells. Notably, the peaks of the highest magnitudes at the bottom cell exhibit longer periods in comparison to the other cells. To encapsulate, the hydrodynamic model for Thunder Bay has been developed and enhanced through multiple approaches: (a) Depth-adaptive refinement of the computational grid, (b) Merging data from land-based stations (NDBC) with the reanalysis dataset (ERA5) to describe the open boundary conditions, and (c) Employing diverse scenarios of groundwater flux and temperature via sinkholes and nodes within the bay. Integrating groundwater flux and temperature within the bay nodes bolstered the model’s precision in replicating water temperatures, especially when compared with thermistor chain and ADCP temperature data. However, the model struggled to replicate the upward velocity component as seen in the ADCP data. This hints at the groundwater flux exhibiting inconsistent spatial and temporal patterns within the bay and at the sinkholes. Groundwater's influence over the bay's temperature and circulation patterns holds vast implications for water quality and the fate and transport of nutrients. For a precise simulation of Thunder Bay and comparable groundwater-fed bays within the Great Lakes, a detailed spatiotemporal groundwater dataset is imperative. 91 CHAPTER 4 INTEGRATING MACHINE LEARNING, NUMERICAL MODELING, AND REMOTE SENSING FOR THE ASSESSMENT OF WATER QUALITY INDICATORS 4.1. Introduction Turbidity, an instrumental water quality parameter that gauges both the clarity and quality of water, is principally dictated by the presence of particulate matter, sediment, and suspended solids (SSD) (Bilotta & Brazier, 2008). The perturbation in the clarity of water, encapsulated by the turbidity levels, can have far-reaching implications on the health of aquatic ecosystems. Furthermore, the intensity of turbidity can serve as an indirect measure of pollutants that might have penetrated the water bodies. Although turbidity is primarily quantified through Nephelometric Turbidity Units (NTU), Formazin Turbidity Units (FTU), and milligrams per liter (mg/l), its correlation with SSD concentrations is not unequivocal due to sensitivity to particle size and shape (Bilotta & Brazier, 2008). Hence, to develop a comprehensive understanding of water quality, an all-encompassing assessment of SSD concentrations is advocated (Bilotta & Brazier, 2008). Monitoring and predicting turbidity in coastal areas present a formidable challenge, necessitating the convergence of different methods, from field measurements to numerical modeling and remote sensing. Numerical or mechanistic models, particularly those based on three-dimensional conservation principles, have been used extensively in simulating the fate and transport of turbidity in water bodies. Owing to their foundation in physical and biological principles, these models can simulate complex phenomena such as turbidity currents with reversing buoyancy (Sequeiros et al., 2009). However, their accuracy pivots on the realistic representation of forcing fields and boundary 92 conditions, such as river discharge and turbidity (Zhu et al., 2022; Jalón-Rojas et al., 2021). This suggests that while these models have furnished pivotal insights, their reliability and applicability necessitate further enhancements in dynamic environments such as coastal waters. In parallel with numerical modeling, remote sensing, particularly satellite data, has become a potent tool for assessing water quality parameters (Nelson et al., 2003; Olmanson et al., 2008; Topp et al., 2020). This technology offers a synoptic view of water bodies, allowing for consistent tracking of parameters, including turbidity. It exploits remote sensing indices such as the Normalized Difference Turbidity Index (NDTI) and Normalized Difference Water Index (NDWI) to infer turbidity levels from space (Gao, 1996). However, despite its promise, remote sensing confronts several limitations, such as the inability to penetrate deep water bodies and the influence of atmospheric conditions on the received signal (Cunningham et al., 2013), necessitating supplementary approaches for water quality assessment. In light of this, the fusion of machine learning with remote sensing and field data has gained momentum in recent years (Filisbino Freire da Silva et al., 2021). Machine learning models equipped with the ability to discern complex patterns in data have been utilized to predict outcomes related to water quality, including turbidity (Li et al., 2021; Normandin et al., 2019). These models, however, are data-sensitive and demand high-quality labeled data for training, which may not always be available. A significant limitation is the lack of spatial and temporal resolution in field measurements critical in dynamic coastal environments. Similarly, the generalizability of these models hinges on the training data encompassing a wide range of concentration values, which field sampling often lacks, especially for high turbidity samples. The advent of transfer learning, a technique that allows the application of knowledge from related source domains to improve the performance of target models, has shown promising results in 93 overcoming these challenges (Zhuang et al., 2020; Farahani et al., 2021; Kim et al., 2022). Transfer learning techniques, coupled with machine learning algorithms, have shown potential in applications such as remote sensing for water quality assessment (Arias-Rodriguez et al., 2023; Syariz et al., 2021; Gambin et al., 2021; Pu et al., 2019; Zhu et al., 2019). Despite these advancements, considerable gaps remain, particularly concerning data availability in a wide range of concentration values for training machine learning models and their generalizability. This calls for an integrated approach that leverages the strengths of numerical modeling, machine learning, and remote sensing. Our study addresses these gaps by utilizing synthetic data from numerical modeling to train machine learning models for turbidity prediction using remote sensing imagery in the visible bands. We also aim to investigate the applicability of the trained models through transfer learning in different domains. This approach will allow us to predict turbidity in a unified concentration unit (𝐹𝑁𝑈) for the study site and potentially for other coastal areas. The study contributes novel insights into the precision and limitations of numerical models, as turbidity can serve as a natural tracer. This holistic approach can revolutionize our understanding of water quality in dynamic coastal environments. 4.2. Materials and Methods 4.2.1. Study Areas 4.2.1.1. Cleaveland Harbor in Lake Erie This study focuses on Lake Erie around the Port of Cleveland, situated at the mouth of the Cuyahoga River in Cleveland, Ohio, United States. The Cuyahoga River has a significant historical background of contamination, characterized by numerous major fires resulting from the accumulation of industrial waste and oil slicks. One notable incident occurred in 1969, which acted 94 as a catalyst for the environmental movement and subsequently led to the enactment of the Clean Water Act in 1972. Figure 42. (a) Bathymetric map of Lake Erie highlighting significant features, including the placement of NDBC meteorological and water temperature stations, ERA5 grid points, and the deployment site of the ADCP. (b) Bathymetric map of Cleveland Harbor displaying the numerical mesh, the Cuyahoga River mouth's position, and the harbor's sampling area. (c) Location of the river mouth and the primary opening of the harbor. The Port of Cleveland, located along the banks of the Cuyahoga River, has played a vital role in the city's economic development. However, the port has also contributed to the pollution of the river through the discharge of wastewater and other industrial effluents. Recent endeavors addressed these concerns and diminished the adverse impacts on the river and its surrounding 95 ecosystems. The port has implemented several initiatives to enhance water quality and safeguard the river's health and the Great Lakes ecosystem. Multiple factors contribute to the pollution and turbidity observed in the Cuyahoga River. Notably, the discharge of untreated or partially treated wastewater from municipal and industrial sources is a significant source of contamination. This encompasses sewage, stormwater runoff, and various types of water contaminated with chemicals, metals, and other pollutants. Another contributing factor is the release of untreated or partially treated industrial effluent from factories and other industrial facilities, which contains chemicals, metals, and other pollutants utilized or produced during the manufacturing process. In addition to these pollution sources, stormwater runoff can transport sediment and debris into the river, exacerbating its turbidity. Moreover, the river is affected by nonpoint source pollution, including agricultural and urban runoff. 4.2.1.2. Burns Ditch in Lake Michigan The research site is situated near the Portage-Burns waterway, also known as Burns Ditch, in the southern part of Lake Michigan (designated as USGS station 04095090). We primarily focus on three beaches named OD1, OD2, and OD3, which are influenced by the outflow from Burns Ditch, highlighted as BD in Figure 43. To examine nearshore currents and validate the hydrodynamic model, Acoustic Doppler Current Profilers (ADCPs) that are bottom-mounted and look upward were placed in the research zone from early June to late August 2008, as documented by Safaie et al., 2016. Water samples were collected from shallow waters at these beaches. Distances of the beaches OD1, OD2, and OD3 from the Burns Ditch outfall are approximately 1500m, 800m, and 500m westward, respectively. 96 Burns Ditch is essentially the drainage point of the Little Calumet River. This river drains a varied landscape comprising of industrial zones, farmlands, and residential areas. On occasion, combined sewer overflows (CSOs) from various municipal sewage treatment facilities are released into Burns Ditch. Following significant rainfall that triggers a CSO, E. coli levels have been observed to peak at 10,000 CFU/100 ml. Nonetheless, these E. coli levels vary considerably, often dropping below 100 CFU/100 mL in dry periods. Figure 43. Google Earth image of Burns Ditch in Lake Michigan showing three beaches (OD1, OD2, and OD3), and the outfall (BD) location. The FVCOM unstructured grid is also shown on top of the lake. This image was adapted from Safaie et al., 2016. During the period from early June to late August 2008, water samples were typically taken in shallow waters around 7:00 AM and 2:00 PM. These samples were tested for E. coli within 4 hours of collection using the IDEXX Colilert-18 reagent along with the IDEXX Quanti-Tray 2000 procedure (in line with Standard Methods SM9223B of the American Public Health Association). Furthermore, this research monitored other water quality metrics such as turbidity, specific conductance or electrical conductivity, water temperature, and the Burns Ditch outflow. E. coli levels, turbidity, and other water quality parameters were assessed twice daily, barring weekends. 97 4.2.2. Mechanistic Modeling Details and Data 4.2.2.1. Circulation and Transport Model for Turbidity We utilized the Finite Volume Community Ocean Model (FVCOM) to conduct simulations of lake- wide and nearshore circulation. FVCOM is a three-dimensional numerical model based on triangular unstructured grids capable of prognostic and free-surface simulations of water circulation and transport in coastal and oceanic environments. Details and the governing equations and specifications for FVCOM can be found in Chapter 2. The hydrodynamic model simulation was conducted from April 1st, 2019, to July 30th, with a time step of 0.2 seconds. The simulation was initialized as a cold start with zero velocity. The initial temperature values were assigned as 0.1 at the surface and linearly increased to 2.0 degrees Celsius at the bottom of the lake. The first two months of the simulation (April and May) were designated as a spin-up period. The coupled transport model describing the turbidity transport in space and time can be characterized as: 𝜕𝐶 𝜕𝑡 + 𝑢 𝜕𝐶 𝜕𝑥 + 𝑣 𝜕𝐶 𝜕𝑦 + 𝑤 𝜕𝐶 𝜕𝑧 = 𝜕 𝜕𝑥 (𝐾𝐻 𝜕𝐶 𝜕𝑥 ) + 𝜕 𝜕𝑦 (𝐾𝐻 𝜕𝐶 𝜕𝑦 ) + 𝜕 𝜕𝑧 (𝐾𝑉 𝜕𝐶 𝜕𝑧 ) (1) In Eq. 1, 𝐶 represents the turbidity concentration in Formazin Nephelometric Units (𝐹𝑁𝑈), while 𝑢, 𝑣, and 𝑤 denote the velocity components in the 𝑥, 𝑦, and 𝑧 directions, respectively, measured in meters per second (𝑚/𝑠). 𝐾𝐻 and 𝐾𝑉 refer to the horizontal and vertical mixing coefficients (𝑚2/𝑠), respectively. It should be emphasized that in our analysis, we assumed no additional input or removal of turbidity once it enters the lake via the river. Consequently, any increase in turbidity concentration (e.g., sediment resuspension) or decrease in concentration (e.g., settling) had a negligible impact on the overall turbidity levels. This assumption holds particularly true near the 98 river mouth and within the harbor area, where the proximity to the river mouth and the presence of river-induced currents play a significant role. The coupled transport model run was initiated on May 30th, 2019, and continued until July 30th, with a time step of 0.2 seconds. The initial turbidity concentration was set uniformly to 8 𝐹𝑁𝑈 across all computational nodes. Output from the model was saved at hourly intervals throughout the simulation period. 4.2.2.2. Circulation and Transport Model for E. coli We also used a coupled three-dimensional hydrodynamic and water quality model. By expanding on Eq. 1, we can include sink and source terms to simulate the spatiotemporal distribution of E. coli in nearshore waters (Liu et al., 2006; Safaie et al., 2016; Thupaki et al., 2010). 𝜕𝐶 𝜕𝑡 + 𝑢 𝜕𝐶 𝜕𝑥 + 𝑣 𝜕𝐶 𝜕𝑦 + 𝑤 𝜕𝐶 𝜕𝑧 = 𝜕 𝜕𝑥 (𝐾𝐻 𝜕𝐶 𝜕𝑥 ) + 𝜕 𝜕𝑦 (𝐾𝐻 𝜕𝐶 𝜕𝑦 ) + 𝜕 𝜕𝑧 (𝐾𝑉 𝜕𝐶 𝜕𝑧 ) − 𝑘𝐶 (2) 𝑘 = [ 𝑓𝑝𝑣𝑠 𝐻 + 𝑘𝐼𝐼𝑡 + 𝑘𝑑 ] 𝜃 𝑇−20 (3) In Eq. 2, 𝑘 is the contaminant decay function, and 𝐶 is the concentration of E. coli (CFU/100 ml). In the contaminant decay function (Eq. 3), 𝑓𝑝 is the fraction of contaminant concentration that is attached to suspended particles (unitless), 𝑣𝑠 is the settling velocity of suspended particles (𝑚. 𝑑−1) and 𝐻 is the depth of the water column. 𝐼𝑡 represents the solar irradiance at the water surface at time t (𝑊. 𝑚−2), 𝑘𝐼 is the contaminant solar inactivation rate (𝑚2𝑊−1𝑑−1), 𝑘𝑑 is the based mortality (𝑑−1) and 𝜃 𝑇−20 is the temperature correction factor. The unstructured mesh consists of 12,684 nodes and 23,602 triangular elements in the horizontal direction. In the vertical direction, 20 sigma layers (21 levels) were used. The horizontal mesh resolution close to the BD outfall (Figure 43) is 40 m and increases to 5 km in the central and deep part of Lake Michigan. Other details about the model set up can be found in Safaie et al., 2016. 99 4.2.2.3. Domain discretization and bathymetry data The computational domain encompassing Lake Erie and the Cleveland Harbor Area was discretized into 56,556 triangular elements and 30,071 nodes in the horizontal direction. The resolution of the triangular mesh in the horizontal plane varied based on the distance from the shoreline and bathymetry. In the harbor area and at the river mouth, the mesh resolution ranged from 10-40 meters, gradually increasing to 1800 meters in the central and deeper regions of the lake. Using an unstructured triangular mesh facilitated the precise representation of abrupt changes in the coastline shape, particularly within the harbor area. The computational domain was divided into 20 uniformly spaced sigma layers in the vertical direction. The resolution varied from a few centimeters near the shoreline to several meters in the offshore and deep sections of the lake. For most parts of the computational domain, 3-arc-second bathymetry data with a resolution of approximately 90 meters, obtained from NOAA, were employed. While this resolution is considered accurate for most lake and ocean models, it lacks the required precision within the harbor, near breakwaters, and at the river mouth. To address this issue, two additional sets of bathymetry data were utilized. The first was 10-meter bathymetry data from the United States Army Corps of Engineers (USACE), covering most harbor and river mouth areas. The second consisted of Electronic Navigational Charts (ENC) from the NOAA's Office of Coast Survey (www.nauticalcharts.noaa.gov), which accounted for less than 1% of the total bathymetry data and specifically captured the bathymetry near the breakwater walls. This level of detail was necessary to ensure an accurate representation of the computational domain, given our focus on the circulation within and near the harbor area and the transport between the lake and the river. Additionally, the fine mesh resolution of 10-40 meters applied inside and near the harbor further 100 justified the inclusion of such detailed bathymetry. All bathymetry data were interpolated to the computational mesh using the natural neighbor method. 4.2.2.4. Meteorological forcing data Meteorological data from the National Data Buoy Center (NDBC) and the ERA5 reanalysis dataset were combined to generate the forcing field for this study's mechanistic model (FVCOM model). To integrate the two meteorological datasets, hourly wind and air temperature data from the ERA5 grid points situated at least 15 km away from any NDBC stations were extracted and merged with the NDBC data. The observed wind speeds from the NDBC stations were adjusted to a height of 10 meters as described by Schwab et al. (1981), who use a formulation that incorporates parameters such as the drag coefficient, stability length, and roughness length. The height-adjusted NDBC wind speeds were combined with the ERA5 10-m wind speeds. On the other hand, no height correction was applied to the NDBC air temperature, as air temperature variations over short heights are negligible. The combined wind speeds and air temperatures were subsequently interpolated to the numerical mesh using the natural neighbor method. For other forcing fields, including solar radiation (shortwave and longwave), cloud cover, and relative humidity, interpolation was performed solely from the ERA5 grid points over the lake without integration with field observations. The decision to use an integrated approach combining ERA5 and NDBC data was driven by the limited availability and inconsistency of field observations over the lake, particularly on the northern (Canadian) side. By integrating the field observations with the ERA5 reanalysis dataset, we aimed to enhance the spatiotemporal resolution of the data while utilizing valuable field observations. This integrated method is expected to provide a more accurate representation of the mechanistic model's forcing fields than relying solely on ERA5 or NDBC data alone. 101 4.2.2.5. River Forcing Data Hourly observations of river discharge and turbidity concentration were utilized to generate input (boundary condition) for the transport model. The data were obtained from the USGS stream gauge (#0420800) at the Cuyahoga River in Independence, OH, which is 20.1 km away from the mouth of the river (Figure 42). However, the turbidity concentration time series had continuous gaps due to missing data. To address this, interpolation methods were employed to predict the missing values. Classic interpolation methods proved ineffective due to the non-linear nature of turbidity values and continuous missing data. Therefore, Gaussian Process Regression (GPR), a machine learning technique (Rasmussen & Williams, 2006), was employed to predict the missing values and to generate hourly turbidity concentration as input for the mechanistic model. Gaussian Process Regression is a Bayesian nonparametric model suitable for interpolation and prediction (Rasmussen & Williams, 2006; Wang & Jing, 2022). It assumes that the underlying function describing the data follows a Gaussian process consisting of random variables indexed by inputs with a joint Gaussian distribution. GPR can handle noise in the data by explicitly incorporating it into the likelihood function. GPR has been widely applicable in diverse fields such as machine learning, economics, and engineering (Ni et al., 2012; Goudenège et al., 2021; Manfredi & Trinchero, 2022). For the ML model setup, we used the ERA5 runoff and total precipitation data over the Cuyahoga watershed and combined with observed discharge and turbidity from the USGS gauge. This decision was made to increase the dimensionality of the data and enhance the predictive power of the ML model by using the variables that affect the turbidity concentration. As the river and turbidity data had already undergone quality checks by the USGS, no further cleaning or filtering was performed. Due to the significant difference in the order of magnitude within the turbidity 102 concentration range (1-1000 𝐹𝑁𝑈), we convert the linear turbidity scale into a logarithmic scale (log10) for training our ML model. By making this conversion, we aimed to maintain consistency and improve the model's effectiveness. Figure 44. Time series of river discharge (top) and turbidity concentration (bottom) at the Cuyahoga River. The available data were divided into 80% for training the ML model and 20% for testing. To prevent overfitting and ensure robust validation of the ML model, we employed cross-validation techniques. The training data was divided into eight groups, allowing us to independently calculate validation scores for each group. This approach helped assess the performance and generalizability of the ML model across different subsets of the data. Considering the different units of the variables, data scaling was implemented before training the ML model. The trained ML model was then used to predict turbidity concentration for the missing data. The resulting turbidity time series 103 and the observed river flux served as the open boundary condition for the numerical mechanistic and transport model, capturing the behavior of the Cuyahoga River. 4.2.3. Remote Sensing Data In this study, remote sensing data from Planet's Dove satellites (www.planet.com) were utilized to train a Machine Learning model based on the synthetic data from the mechanistic model. Planet operates a fleet of small remote-sensing satellites that capture high-resolution images of the Earth's surface daily. These images, obtained through multispectral and hyperspectral sensors, offer a frequent revisit time (1-2 days) and high spatial resolution (~3 m), making them suitable for our research in a small area with dynamic circulation and transport patterns. It should be noted that the spectral resolution of Dove satellite data is comparatively lower than other remote sensing datasets like Landsat 8, Landsat 9, Sentinel 2, and Sentinel 3. The specific data in our research was the Surface Reflectance (SR) product, which provides radiometrically calibrated and atmospherically corrected images. This product is available for orthorectified scenes captured by the sun-synchronous orbit Dove satellites. To achieve atmospheric correction, lookup tables generated using the 6SV2.1 radiative transfer code were employed, enabling the mapping of top-of-atmosphere (TOA) reflectance to bottom-of- atmosphere (BOA) reflectance. The SR product is a 16-bit GeoTIFF image with reflectance values scaled by 10,000. For atmospheric inputs, water vapor and ozone information were retrieved from MODIS near-real-time (NRT) data, while aerosol optical depth (AOD) input was determined from MODIS NRT aerosol data. By considering localized atmospheric conditions, the SR product ensures consistency and minimizes uncertainty in spectral response across various time points and locations. Further details can be found in Table 15, included in this study. 104 Table 15. Details of the remote sensing images used in this research, including instrument type, spectral bands, pixel size, and atmospheric corrections. Instrument Spectral Bands (nm) Pixel Size Atmospheric Corrections Blue: 455 - 515 Green: 500 - 590 Conversion to top-of-atmosphere (TOA) reflectance values using at-sensor radiance and supplied coefficients PS2 3.0 m Red: 590 - 670 NIR: 780 - 860 Conversion to surface reflectance values using the 6SV2.1 radiative transfer code and MODIS NRT data Due to the disparity in spatial resolution between the mechanistic model and remote sensing data, preprocessing and data cleaning are necessary before their utilization. Firstly, we extract the surface reflectance data for each band within each triangular mesh. Subsequently, the z-score of each band's data is calculated individually, and data points exceeding an absolute z-score threshold of 3 are considered outliers and excluded (Hoaglin, 2013). This approach ensures consistency among the data points. The accepted values within each triangular mesh are then averaged for each band, resulting in a single surface reflectance value per mesh. Similarly, the Normalized Difference Turbidity Index (NDTI) and Normalized Difference Water Index (NDWI) are computed using the provided formulas (Lacaux et al., 2006 & Gao, 1996). 𝑁𝐷𝑇𝐼 = 𝑆𝑅𝑅𝑒𝑑 − 𝑆𝑅𝐺𝑟𝑒𝑒𝑛 𝑆𝑅𝑅𝑒𝑑 + 𝑆𝑅𝐺𝑟𝑒𝑒𝑛 𝑁𝐷𝑊𝐼 = 𝑆𝑅𝐺𝑟𝑒𝑒𝑛 − 𝑆𝑅𝑁𝐼𝑅 𝑆𝑅𝐺𝑟𝑒𝑒𝑛 + 𝑆𝑅𝑁𝐼𝑅 (4) (5) 105 4.2.3.1. Data Preparation for Machine Learning To develop the Machine Learning model to extract turbidity from satellite imagery, we first extracted the simulated turbidity concentration for the uppermost 3 m of the water column from the transport model. This process was conducted concurrently with the image acquisition time to ensure temporal alignment. These simulated turbidity values were then combined with the processed remote sensing bands—specifically, blue, green, red, and Near Infrared bands, the Normalized Difference Turbidity Index (NDTI), and the Normalized Difference Water Index (NDWI)—within a defined sampling area located inside the harbor (as depicted in Figure 42). To secure data quality for training the Machine Learning (ML) model, the scope was strictly limited to the data derived from the harbor sampling area. This strategic selection was crucial in guaranteeing that the detected turbidity predominantly originated from the Cuyahoga River, thereby minimizing the influence of lake-wide turbidity and turbidity from other rivers (upstream and downstream). This decision also capitalized on the resolution of the mechanistic and transport model, which, considering the data-intensive nature commonly associated with ML models, supplied ample training data samples. The Machine Learning model was then developed using the gathered and processed simulated turbidity data and remote sensing images, explicitly focusing on the harbor sampling area. Of the five images harnessed for model development, a division was made where 75% of the data was earmarked for training and validation, and the remaining 25% was dedicated to evaluating the ML model. This systematic and thoughtful allocation ensured a robust development and comprehensive evaluation of the ML model's performance. The trained ML model can be utilized to predict turbidity concentration (in 𝐹𝑁𝑈) based on the remote sensing surface reflectance at four bands and the NDTI and NDWI indexes. 106 Figure 45. Processed surface reflectance data at each band (band1, band2, band3, and band4), NDTI and NDWI indexes, and Turbidity concentration data used for training and testing the ML model. 107 Figure 46. The algorithms of (a) physical (Mechanistic) modeling, (b) remote sensing, and (c) machine learning used in this research. 4.2.3.2. Transfer Learning: Detecting Turbidity in Other Regions of the Lake Transfer learning, briefly defined as the application of a model trained in one domain to a new but related domain, was employed in our study to enhance the adaptability of machine learning (ML) models across diverse regions (Zhuang et al., 2020; Syariz et al., 2021). We utilized the trained ML model, which was developed for predicting turbidity based on remote sensing images in the Cleveland Harbor area, to predict turbidity in other regions of Lake Erie, namely the Cattaraugus River and River Raisin. This allowed us to assess the trained ML model's generalizability across different lake regions. 4.2.4. Validation Process The performance of the mechanistic model in accurately describing water velocity was assessed by comparing model results for water current velocity with observed data obtained from a Nortek Aquadopp HR (High-Resolution, 2.0 MHz frequency) Profiler ADCP (Acoustic Doppler Current 108 Profiler) deployed in Lake Erie at Erie, PA, during August and September 2019. Additional information regarding the instrument setup and deployment can be found in Table 16. Likewise, the accuracy of the mechanistic model in representing surface water temperature was evaluated using data from two NDBC (National Data Buoy Center) buoys, as illustrated in Figure 42. Table 16. Details of the ADCP deployment and setup in Erie, PA. Instrument Deployed Retrieved Longitude Latitude Depth (m) Cell size (m) 09-Aug-2019 10-Sep-2019 -79.9821 42.1886 5.80 0.25 Nortek Aquadopp HR (High-Resolution) Profiler (2.0 MHz frequency) 4.3. Results 4.3.1. Application of ML to River Turbidity Prediction Figure 47 (a and b) displays scatter plots comparing the measured and predicted turbidity for the validation and test datasets. The validation data yielded an RMSE of 0.154 log10(𝐹𝑁𝑈) and an 𝑅2 value of 0.92, while the test data showed an RMSE of 0.146 log10(𝐹𝑁𝑈) and an 𝑅2 value of 0.93. In Figure 47c, the predicted turbidity concentration time series (red line) exhibits an oscillating pattern similar to the observed data. Although there are no observations for the missing data period (red line in Figure 47c), we will compare the turbidity map near the harbor with a remote sensing image for this specific timeframe later. This comparison will help assess how much the turbidity plume affected the harbor area. 109 Figure 47. Scatter plots of (a) training and (b) testing data with their performance metrics. (c) Log10 turbidity concentration at the river mouth. The red line denotes the ML predicted values, and the blue line denotes the observed data. 4.3.2. Validation of the Circulation Model The mechanistic model was tested using observed vertically averaged eastward (u) and northward (v) velocity components at the ADCP deployment location. Figure 48 (a and b) compares the simulated u-component and v-component of velocity against the observed data. Overall, the mechanistic model performed well in simulating the velocity components over time. Table 17 includes the RMSE, 𝑅2, and RSR values as evaluation metrics. Although the RMSE for the v- component of velocity is smaller than that of the u-component, the 𝑅2 and RSR values for the u- component are lower than those of the v-component. This difference can be attributed to the higher 110 magnitude of the u-component compared to the v-component of velocity. The smaller values of the v-component result in a reduced RMSE compared to the u-component. Figure 48. Comparison between the vertically-averaged simulated and observed (a) eastward and (b) northward components of velocity at the location of the ADCP deployment. Comparison between the simulated and observed (c and d) water surface temperature at the NDBC buoys. Additionally, we assessed the mechanistic model's ability to accurately simulate the temperature field by comparing the simulated water surface temperature with observations from two NDBC stations (#45164 and #45169) close to the harbor area. Figures 48 (c and d) demonstrate that the mechanistic model successfully simulated water surface temperature values and captured the overall trend. Table 17 provides the RMSE, 𝑅2, and RSR values for comparing the simulated and 111 observed water surface temperatures. The RMSE was smaller than 0.5 °C, and the 𝑅2 was greater than 0.93. Table 17. Error metric scores between the simulated and observed water velocity and surface temperature. Variable RMSE R² RSR u (𝑚. 𝑠−1) v (𝑚. 𝑠−1) 0.064 0.716 0.766 0.047 0.650 0.952 Water Temp (°C), NDBC #45164 0.437 0.938 0.375 Water Temp (°C), NDBC #45169 0.482 0.955 0.410 4.3.3. Validation of the Transport Model During the simulation period, no measured turbidity data were available for the harbor area. Therefore, a quantitative comparison between the simulated turbidity plume and the actual turbidity concentration cannot be made. However, we can still assess the performance of the mechanistic model in simulating turbidity transport by comparing the spatial map of turbidity with remote sensing RGB images at different times. Figure 49 shows the remote sensing RGB images at three different times and their corresponding simulated turbidity plume. In Figure 49, we observe similarities between the extent and intensity of the turbidity plume in the simulated results and the remote sensing RGB images, particularly at a smaller scale within the harbor. While the similarities are notable within the harbor area, differences become more pronounced as we move away from the point source at the mouth of the river. Although the simulated turbidity closely resembles the observed data inside the harbor, variations become more significant outside the harbor area as the distance from the point source increases. 112 Figure 49. Remote sensing RGB images (left side) and their corresponding simulated turbidity plume (right side) for images taken on Jun 22, 23, and 25 of 2019. 113 Figure 50. (top) RGB image of the remote sensing data and its corresponding (bottom) simulated turbidity map on June 25th, 2019. Figure 50 depicts the remote sensing RGB image and the simulated turbidity inside the harbor and its vicinity. The figure highlights the similarities between the simulated turbidity plume and the RGB image in areas outside the harbor. However, the accuracy or similarity gradually degrades as we move farther from the harbor area. This discrepancy can be attributed to several factors, including the utilization of a larger mesh size outside the harbor, potential diffusion errors associated with the mechanistic model, and the absence of other sources of turbidity, such as the Rocky River outlet depicted in the left corner of the images in Figure 50. 4.3.4. Extracting Turbidity Concentration Map from Remote Sensing Images Figure 51 presents a scatter plot illustrating the scaled simulated turbidity plotted against the ML- predicted scaled turbidity. For the validation dataset, the RMSE and 𝑅2 values were found to be 0.0682 and 0.93, respectively. Similarly, the testing scores yielded RMSE and 𝑅2 values of 0.0688 and 0.93, respectively. 114 Figure 51. Scatter plots of validation and testing data with their error metric scores. Figure 52 showcases the ML-predicted turbidity concentration (in 𝐹𝑁𝑈) for two remote sensing images captured on 22-Jun-2019 and 23-Jun-2019. As depicted in Figure 52, the turbidity appears to be higher within the harbor area on June 22nd, 2019. However, as we move farther from the harbor and shift to the image captured on June 23rd, 2019, the turbidity dilutes, and its concentration decreases as we progress in space and time. 115 Figure 52. (top) Remote sensing RGB images and their (bottom) ML-predicted turbidity on Jun 22 and 23, 2019. 4.3.5. Transfer Learning: Detecting turbidity in other regions of the lake The results depicted in Figure 53, shown for the Cattaraugus and Raisin River plumes, present the RGB images compared to their corresponding turbidity plumes, as the machine learning (ML) model developed for the Cleveland Harbor area predicted. There appears to be a robust correlation between high turbidity concentrations and the corresponding RGB images for the Cattaraugus River and River Raisin. Overall, the ML model demonstrated considerable skill in predicting turbidity concentration through transfer learning, particularly when juxtaposed with the RGB images. However, it's worth noting that the model failed to predict accurate images, as shown in Figure 53 (indicated by a red box). The absence of fine-tuning for these new sites likely contributed to this shortcoming. 116 Figure 53. (top) Remote sensing RGB images for Cattaraugus River and River Raisin and their corresponding (bottom) ML-predicted turbidity plume. The discrepancy might also be linked to the limitations of the training data. The model was originally trained on a limited number of images from a different region of the lake, which could have influenced the results. Additionally, variations in the concentration of dissolved and suspended compounds within the water might have affected the surface reflectance at the measured bands in these regions, potentially contributing to the observed discrepancies. 4.3.6. Application of ML on the Prediction of E. coli Concentration Figure 54 shows the scatter plots for validation and test of the ML model that is trained for prediction of E. coli concentration at the river mouth. The RMSE values for validation and test are 117 0.119 and 0.176, respectively. Similarly, the R² values for validation and test are 0.40 and 0.32, respectively. Figure 54. Scatter plots of (left) Validation and (right) Test for E. coli concentration at BD. Using the trained model for the prediction of E. coli concentration at the river mouth (BD in Figure 43), we made hourly prediction for E. coli. Figure 55 shows the time series of E. coli concentration using the statistical model (based on Safaie et al., 2016) and the ML model. 118 Figure 55. The time series of E. coli concentration at the Burns Ditch outfall, created using statistical model and ML model. Further testing the performance of the statistical and ML models in generating hourly data for the mechanistic model was carried out by simulating each case and comparing their results with the observation at the Ogden Dunes beaches at OD1, OD2, and OD3. Table 18 shows the performance scores of each case. In all three stations, we observed improvements in the performance of the simulation carried out by the ML generated data. Figure 56 shows the times series of the simulated and observed E. coli concentration at the three beaches. Both the statistical and ML models, used to generate the river forcing for the FVCOM model, have similar trends with the ML model predictions have better fit compared the observed data. It is worth mentioning the uncertainty about the location and depth of E. coli sampling in OD1, OD2, and OD3 beaches. While we used the model based on Safaie et al. (2016) that used a statistical approach for generating the E. coli concentration at the river mouth, the metrics presented in Table 18 do not match perfectly with the values presented in Safaie et al. (2016). This, we believe, has to do with the uncertainty about the exact horizontal location of the sampling site 119 . In addition, it is known that samples were collected from just below the water surface while the model results used for computing the metrics in Table 18 are based on the vertically integrated concentrations. Hence using E. coli concentrations from the top one or two layers are expected to produce better metrics comparable to those reported in Safaie et al. (2016). Therefore, the goal of this study was to compare the performance of the statistical and ML approaches in generating E. coli concentration time series and how these will improve simulated E. coli at downstream beaches. Table 18. Performance scores of simulated and observed E. coli at three sites (OD1, OD2, and OD3) based on the statistical and ML E. coli models. Station Sample time OD1 Morning Afternoon Overall Morning OD2 Afternoon Overall Morning OD3 Afternoon Overall RMSE R² RSR Stat 0.481 0.478 0.480 0.427 0.417 0.422 0.491 0.583 0.534 ML 0.445 0.438 0.442 0.385 0.404 0.393 0.448 0.500 0.471 Stat 0.315 0.282 0.289 0.525 0.698 0.614 0.505 0.506 0.508 ML 0.386 0.287 0.336 0.582 0.700 0.643 0.546 0.547 0.547 Stat 1.205 1.760 1.373 1.062 0.952 1.009 1.227 1.408 1.304 ML 1.115 1.611 1.265 0.959 0.923 0.940 1.119 1.206 1.152 120 Figure 56. Plots of Simulated and Observed E. coli concentration at OD1, OD2, and OD3 beaches based on the hourly generated data by the Statistical and ML models. 4.4. Discussion The analysis of our results indicates that the most accurate results are obtained in the harbor area due to detailed domain discretization and precise boundary conditions (Sequeiros et al., 2009). This detailed discretization provides a clearer understanding of turbidity patterns, similar to the insights from high-resolution Landsat 8 imagery in the Po River prodelta (Braga et al., 2017). Our results align with the findings of Braga et al. (2017), underscoring the importance of high- resolution data for accurate prediction and analysis of turbidity. Significantly, our utilization of a mechanistic model, calibrated with Acoustic Doppler Current Profiler (ADCP) data, offers an unprecedented resolution of water dynamics, capturing the 121 spatiotemporal variability of turbidity currents with high fidelity. This approach distinguishes our work from previous studies, as the ADCP data-driven calibration ensures a realistic representation of the underlying physical processes, enhancing the precision and accuracy of our predictions while providing abundant synthetic data for the Machine Learning model. By excluding this 3D mechanistic modeling component, we would sacrifice the spatiotemporal resolution, compromising the robustness and reliability of our assessments and rendering our model susceptible to oversimplifications and poor performance. Notably, our study also highlights the need for a unified measure of turbidity across the lake, which aligns with the method used by Zheng and DiGiacomo (2022) in their study of the Great Lakes. Using the same unit as the USGS gauges, we can compare turbidity values at the river mouth with those in the lake, enhancing the comparability of data. In contrast, the accuracy of results decreases outside the harbor area, mainly due to larger mesh size and the absence of other sources of turbidity, such as wastewater treatment plants, as well as mechanisms like deposition and resuspension (Felix, 2002; Schulz et al., 2018). This mirrors the conclusions drawn by Schulz et al. (2018), where varying hydro- and meteorological conditions affected sediment fluxes in different locations. As we moved farther from the river mouth and harbor area, we encountered mesh sensitivity and numerical diffusion issues. This suggests that an accurate representation of forcing fields, such as the wind field (Beletsky et al., 2013), and river boundary conditions is required to address the uncertainty of boundary conditions (Hunt and Jones, 2020). Our study also recognized that turbidity is influenced by various sources along the river, from the USGS gauge to the mouth of the river at the harbor. However, deriving turbidity from one remote sensing index poses challenges, such as a single Normalized Difference Turbidity Index (NDTI) 122 value can correspond to multiple turbidity concentrations (Garg et al., 2017 & 2020). This supports the findings of Zheng and DiGiacomo (2022), who utilized a simplified water clarity-turbidity index (CTI) to better capture significant changes in water clarity/turbidity by including multiple variables from VIIRS measurements, Secchi disk depth, and particulate backscattering coefficient. The type of remote sensing instrument used significantly impacts the results. A model trained on imagery from a specific instrument may produce inaccurate predictions when applied to images captured by other sensors such as Sentinel or Landsat (le Fouest et al., 2015; Vanhellemont and Ruddick, 2021). Yet, our approach proved effective when training machine learning models on different instruments, supporting the claims of Saberioon et al. (2020) and Filisbino Freire da Silva et al. (2021) regarding the potential of machine learning and satellite data in water quality prediction. Notably, our study also highlights the need for a unified measure of turbidity across the lake and at the river mouth through in-situ instruments, which aligns with the method used by Garg et al. (2017). Using the same unit as the USGS gauges, we can compare turbidity values at the river mouth with those in the lake, enhancing the comparability of data. Finally, our study proposes using a Gaussian Process Regression (GPR) model to address uncertainty effectively, supporting the conclusions of Filisbino Freire da Silva et al. (2021) about the potential of machine learning in water quality assessment. This strategy, coupled with generating unlimited data for training machine learning (ML) models, can significantly improve our understanding of turbidity detection and monitoring, effectively addressing issues of water composition variability noted in our study and the research conducted by Normandin et al. (2019). This approach also aids in evaluating and refining the performance of mechanistic models in predicting turbidity accurately, building upon the work of Sequeiros et al. (2009) and Felix (2002). 123 Our approach offers a promising direction for developing robust, data-rich, and effective turbidity detection and monitoring strategies. The synergy between in situ measurements, remote sensing imagery, and machine learning algorithms can help develop a more comprehensive understanding of turbidity dynamics in marine and estuarine environments. 4.5. Conclusions Our study set out to bridge the gaps in existing research concerning data availability and model generalizability in predicting water turbidity using an integrated approach of numerical modeling, machine learning, and remote sensing. Our results substantiate the efficacy of our approach, offering promising advances in both the precision and comprehensiveness of turbidity prediction and our understanding of water quality in coastal areas. We leveraged numerical modeling's synthetic data for training the machine learning model, thereby improving data availability in a broad spectrum of concentration values. Our study also highlights the necessity of unified turbidity measures across diverse locations and emphasizes the impact of the type of remote sensing data on the accuracy of predictions. Our results demonstrated that machine learning models can effectively describe boundary conditions, enhance interpolation during data scarcity, and improve the performance of mechanistic models. The accurate and abundant synthetic data generated by the improved mechanistic model prove invaluable for training machine learning models. Furthermore, applying these machine learning models in predicting turbidity concentration from remote sensing data presents a novel approach to the challenge. The validation of the machine learning model revealed promising results, with strong correlations between the measured and predicted turbidity for both validation and test datasets. The successful implementation of the mechanistic model in simulating velocity components and water surface 124 temperatures and the plausible representation of the turbidity plume in the harbor area underpins the accuracy of our models. However, we recognize limitations in the transferability of our model to different regions. The discrepancy in prediction accuracy might be attributed to variations in local conditions, the quality and scale of input data, and the fact that the models were fine-tuned for a different region. Further research and adaptations are needed to generalize the model across diverse locations. In conclusion, this research is vital for improving our understanding of water quality dynamics and turbidity prediction in coastal environments. Our work showcases the potential of leveraging machine learning and remote sensing in creating high-resolution, unified turbidity predictions, thereby helping policymakers and stakeholders develop effective strategies for monitoring and maintaining water quality. 125 CHAPTER 5 CONCLUSIONS The research goal was to improve coastal hydrodynamics and water quality models in the Great Lakes by integrating Mechanistic Modeling, Machine Learning, and Remote Sensing approaches. The primary focus of the research was on developing and evaluating three-dimensional hydrodynamic and temperature models for Lake Huron and Lake Erie for the summers of 2018, 2019, and 2022. These models were developed using depth-adaptive and detailed unstructured mesh. A particular emphasis was placed on evaluating these hydrodynamic models with various forcing fields derived from in-situ observations and reanalysis datasets. Hammond Bay of Lake Huron was used as a key test case in the investigation. Here, given the significant influence of lake-wind circulation and bidirectional exchange flows from the Strait of Mackinac on local hydrodynamics and temperature, the importance of the spatiotemporal resolution of the forcing field emerged distinctly. Specifically, regions with sparse and low in-situ observations benefited the most from the use of reanalysis datasets, such as ERA5. Furthermore, our findings underscored the positive impact of detailed domain discretization and nested models on model performance. The research revealed an important insight by dynamically calculating the residence time in Hammond Bay, clarifying the intricate interactions between lake-wide and bidirectional currents. This understanding is vital for water quality implications due to its effects on hydrodynamics and temperature. Groundwater's role in Thunder Bay of Lake Huron was another important aspect of the study. Using the Mechanistic Modeling approach in conjunction with comprehensive field measurements, the research shed light on the substantial impact of groundwater discharge through the bay's sinkholes and lakebed on the hydrodynamics and temperature of the bay. By simulating 126 different scenarios, the investigation highlighted how groundwater modulates Thunder Bay's circulation and temperature, which in turn carries a significant effect on water quality and nutrient transport. Transitioning to the domain of Machine Learning, the study employed this tool to train a machine learning model for river turbidity based on a set of explanatory variables. The research ventured further into Lake Erie, specifically the Cleveland Harbor area, to develop a detailed, high- resolution mechanistic and transport model. This region includes complex geometry and coastal structures. It provided an apt challenge for simulating the spatiotemporal distribution of turbidity discharging from the Cuyahoga river. High-resolution remote sensing imagery in the visible band acted as a validation test for the model's accuracy. Using the simulated turbidity, a machine learning model was developed to predict turbidity concentration using hyperspectral remote sensing imagery. When applied to new imagery from other coastal areas, the machine learning model demonstrated acceptable accuracy, showcasing its wider applicability. Such integration of simulated turbidity and remote sensing paves the way for decreased reliance on in-situ water quality sampling, enhancing modeling and real-time coastal environment monitoring, and offering significant insights related to health and management. In conclusion, this research has improved coastal hydrodynamics and water quality modeling by integrating Mechanistic Modeling, Machine Learning, and Remote Sensing approaches. The insights emanating from Lake Huron and Lake Erie provide not only a template for future studies in other regions but also offer practical strategies for improving water quality and enhancing our understanding of intricate lake systems. The study holds important implications for coastal management, environmental conservation, and the broader understanding of the complex system between hydrodynamics, temperature, and water quality. 127 BIBLIOGRAPHY Anderson, E. J., & Schwab, D. J. (2013). Predicting the oscillating bi-directional exchange flow in the Straits of Mackinac. Journal of Great Lakes Research, 39(4), 663–671. https://doi.org/10.1016/j.jglr.2013.09.001 Anderson, E. J., & Schwab, D. J. (2017). Meteorological influence on summertime baroclinic exchange in the Straits of Mackinac. Journal of Geophysical Research: Oceans, 122(3), 2171–2182. https://doi.org/doi:10.1002/2016JC012255. Kenov, I.A., Garcia, A. C., & Neves, R. (2012). Residence time of water in the Mondego estuary (Portugal). Estuarine, Coastal and Shelf Science, 106, 13–22. https://doi.org/10.1016/j.ecss.2012.04.008 Beyramzade, M., Siadatmousavi, S. M., & Majidy Nik, M. (2019). Skill assessment of SWAN model in the red sea using different wind data. Regional Studies in Marine Science, 30, 100714. https://doi.org/10.1016/j.rsma.2019.100714 Bratton, J. F., Böhlke, J. K., Krantz, D. E., & Tobias, C. R. (2009). Flow and geochemistry of groundwater beneath a back-barrier lagoon: The subterranean estuary at Chincoteague Bay, Maryland, USA. Marine Chemistry, 113 (1), 78–92. https://doi.org/https://doi.org/10.1016/j.marchem.2009.01.004 Bravo, H. R., Hamidi, S. A., Anderson, E. J., Klump, J. V., & Khazaei, B. (2020). Timescales of transport through Lower Green Bay. Journal of Great Lakes Research, 46(5), 1292-1306. https://doi.org/10.1016/j.jglr.2020.06.010 Byrne, E.R., Roche, K.M., Schaerer, L.G., Techtmann, S.M., Temporal variation of crude and refined oil biodegradation rates and microbial community composition in freshwater systems, J. Great Lakes Research, 47(5), pp. 1376-1385 (2021) Chen, C., H. Liu, and R. C. Beardsley (2003), An unstructured, finite-volume, three-dimensional, primitive equation ocean model: Application to coastal ocean and estuaries, J. Atmos. Oceanic Technol., 20, 159–186. Chen, C., Xu, Q., Ralph, E., Budd, J. W., & Lin, H. (2004). Response of Lake Superior to mesoscale wind forcing: A comparison between currents driven by QuikSCAT and buoy winds. Journal of Geophysical Research C: Oceans, 109(10). https://doi.org/10.1029/2002JC001692 Defne, Z., & Ganju, N. K. (2015). Quantifying the Residence Time and Flushing Characteristics of a Shallow, Back-Barrier Estuary: Application of Hydrodynamic and Particle Tracking 128 Models. Estuaries and Coasts, 38(5), 1719–1734. https://doi.org/10.1007/S12237-014-9885- 3/FIGURES/10 Dettmann, E. H. (2001). Effect of water residence time on annual export and denitrification of nitrogen in estuaries: A model analysis. Estuaries 2001 24:4, 24(4), 481–490. https://doi.org/10.2307/1353250 Galperin, B., L. H. Kantha, S. Hassid, and A. Rosati (1988), A quasi-equilibrium turbulent energy model for geophysical flows, J. Atmos. Sci., 45(1), 55–62. Ge, Z., Whitman, R. L., Nevers, M. B., Phanikumar, M. S., & Byappanahalli, M. N. (2012). Nearshore hydrodynamics as loading and forcing factors for Escherichia coli contamination at an embayed beach. Limnology and Oceanography, 57(1), 362–381. https://doi.org/10.4319/lo.2012.57.1.0362 Geyer, W. R., Morris, J. T., Prahl, F. G., & Jay, D. A. (2000). Interaction between physical processes and ecosystem structure: A comparative approach in Estuarine Science: A Synthetic Approach to Research and Practice (Hobbie, J.E., ed.), pp. 177-206, Island Press, Washington, DC Hamidi, S. A., Bravo, H. R., Val Klump, J., & Waples, J. T. (2015). The role of circulation and heat fluxes in the formation of stratification leading to hypoxia in Green Bay, Lake Michigan. Journal of Great Lakes Research, 41(4), 1024–1036. https://doi.org/10.1016/j.jglr.2015.08.007 Hodges, B.R., Laval, B., Wadzuk, B.M. Numerical error assessment and a temporal horizon for internal waves in a hydrostatic model. Ocean Modell. 2006, 13, 44-64. Hundsdorfer, W. H., Verwer, J. G., & Hundsdorfer, W. H. (2003). Numerical solution of time- dependent advection-diffusion-reaction equations (Vol. 33). Berlin: Springer. Klyuvitkin, A. A., Ostrovskii, A. G., Lisitzin, A. P., & Konovalov, S. K. (2019). The energy spectrum of the current velocity in the deep part of the Black Sea. Doklady Earth Sciences, 488(2), 1222–1226. https://doi.org/10.1134/S1028334X1910012X Lee, H. J., Meng, P. J., Chen, C. C., & Tew, K. S. (2020). Monsoon effects on the residence time of a coastal lagoon in southwestern Taiwan. Estuarine, Coastal and Shelf Science, 233, 106535. https://doi.org/10.1016/J.ECSS.2019.106535 Mahanty, M. M., Mohanty, P. K., Pattnaik, A. K., Panda, U. S., Pradhan, S., & Samal, R. N. (2016). Hydrodynamics, temperature/salinity variability and residence time in the Chilika lagoon during dry and wet period: Measurement and modeling. Continental Shelf Research, 125, 28–43. https://doi.org/10.1016/J.CSR.2016.06.017 129 Mao, M., & Xia, M. (2020). Monthly and Episodic Dynamics of Summer Circulation in Lake Michigan. Journal of Geophysical Research: Oceans, 124, e2019JC015932. https://doi.org/10.1029/2019JC015932 Mao, M., Xia, M. (2020). Particle dynamics in the nearshore of Lake Michigan revealed by an observation-modeling system. Journal of Geophysical Research: Oceans, 125, e2019JC015765. https://doi.org/10.1029/2019JC015765 Mellor, G. L., and T. Yamada (1982), Development of a turbulence closure model for geophysical fluid problems, Rev. Geophys., 20, 851– 875. Mellor, G.L., Häkkinen, S.M., Ezer, T., Patchen, R.C. (2002). A Generalization of a Sigma Coordinate Ocean Model and an Intercomparison of Model Vertical Grids. In: Pinardi, N., Woods, J. (eds) Ocean Forecasting. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-22648-3_4 Melstrom, R.T., Reeling, C., Gupta, L., Miller, S.R., Zhang, Y.L., and Lupi, F. (2019). Economic damages from a worst-case oil spill in the Straits of Mackinac, J. Great Lakes Research, 45(6), pp. 1130-1141. Memari, S., Phanikumar, M.S., Evans, M.A. and Byappanahalli, M.N. (2023). Acoustic Doppler Current Profiler data collected near Hammond Bay, Lake Huron, 2019: U.S. Geological Survey, https://doi.org/10.5066/P9VVENAZ. Memari, S., & Siadatmousavi, S. M. (2018). Numerical Modeling of Heat and Brine Discharge Near Qeshm Desalination Plant. International Journal of Coastal and Offshore Engineering, 1(4), 27–35. https://doi.org/10.29252/ijcoe.1.4.27 Monsen, N. E., Cloern, J. E., Lucas, L. V., & Monismith, S. G. (2002). A comment on the use of flushing time, residence time, and age as transport time scales. Limnology and Oceanography, 47(5), 1545–1553. https://doi.org/10.4319/lo.2002.47.5.1545 Moriasi, D. N., Arnold, J. G., Van Liew, M. W., Bingner, R. L., Harmel, R. D., & Veith, T. L. (2007). Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Transactions of the ASABE, 50(3), 885-900. National Geophysical Data Center, 1996. Bathymetry of Lake Michigan. National Geophysical Data Center, NOAA. doi:10.7289/V5B85627. Accessed data: 2020-09-06. National Geophysical Data Center, 1999. Bathymetry of Lake Huron. National Geophysical Data Center, NOAA. doi:10.7289/V5G15XS5. Accessed data: 2020-09-06. 130 Nguyen, T. D., Thupaki, P., Anderson, E. J., & Phanikumar, M. S. (2014). Summer circulation and exchange in the Saginaw Bay‐Lake Huron system. Journal of Geophysical Research: Oceans, 119(4), 2713-2734. Nguyen, T. D., Hawley, N., & Phanikumar, M. S. (2017). Ice cover, winter circulation, and exchange in Saginaw Bay and Lake Huron. Limnology and Oceanography, 62(1), 376–393. https://doi.org/10.1002/lno.10431 NOAA Great Lakes Environmental Research Laboratory (2019). Water temperature collected at multiple depths from a mooring in central Lake Huron, Great Lakes region from 2012-10- 23 to 2019-06-28 (NCEI Accession 0191817). Subset used: vertical water temperature for 2018. NOAA National Centers for Environmental Information. Dataset. https://www.ncei.noaa.gov/archive/accession/0191817. Accessed data: 2021-01-31. Noranian Esfahani, M., Akbarpour Jannat, M.R., Banijamali, B. et al. The impact of ERA- Interim winds on wave generation model performance in the Southern Caspian Sea region. Meteorol Atmos Phys 131, 1281–1299 (2019). https://doi.org/10.1007/s00703-018-0638-x Parkinson, C. L., and W. M. Washington (1979), A large-scale numerical model of sea ice, J. Geophys. Res., 84, 311–337. Przybyla-Kelly, K., Nevers, M.B., Shively, D.A., Benson, S.P., Carter, G.M., Dwyer, S.C., Lewan, M.E., Picard, K.R., Richards, L.C., Sopovski, D.S., Spoljaric, A.M., Szymkowski, D.R., Welninski, A.D., Wimmer, E.E., and Evans, M.A., 2020, Cladophora biomass and supporting data collected in the Great Lakes, 2019: U.S. Geological Survey data release, https://doi.org/10.5066/P99O4QXB. Przybyla-Kelly, K., Nevers, M.B., Szymkowski, D.R., Shively, D.A., Carter, G.M., Lewan, M.E., Picard, K.R., Spoljaric, A.M., Wimmer, E.E., and Evans, M.A., 2021, Cladophora biomass and supporting data collected in the Great Lakes, 2020: U.S. Geological Survey data release, https://doi.org/10.5066/P9O9FSTT. Quinn, F.H., (1992). Hydraulic residence times for the Laurentian Great Lakes. J. Great Lakes Res. 18, 22–28. Ralston, D. K., Brosnahan, M. L., Fox, S. E., Lee, K. D., & Anderson, D. M. (2015). Temperature and residence time controls on an estuarine harmful algal bloom: Modeling hydrodynamics and Alexandrium fundyense in Nauset estuary. Estuaries and Coasts : Journal of the Estuarine Research Federation, 38(6), 2240–2258. https://doi.org/10.1007/S12237-015-9949-Z 131 Rynne, P., Reniers, A., van de Kreeke, J., & MacMahan, J. (2016). The effect of tidal exchange on residence time in a coastal embayment. Estuarine, Coastal and Shelf Science, 172, 108– 120. https://doi.org/10.1016/J.ECSS.2016.02.001 Safaie, A., A. Wendzel, Z. Ge, M.B. Nevers, R.L. Whitman, S.R. Corsi, and M.S. Phanikumar, Comparative evaluation of statistical and mechanistic models of Escherichia coli at beaches in Southern Lake Michigan, Environmental Science & Technology, 50, 2442−2449, doi: 10.1021/acs.est.5b05378 (2016) Saylor, J. H., & Sloss, P. W. (1976). Water volume transport and oscillatory current flow through the Straits of Mackinac. Journal of Physical Oceanography, 6(2), 229-237. Saylor, J. H., and G. Miller (1991), Current flow through the Straits of Mackinac, technical report 19910004, Natl. Oceanic and Atmos. Admin. Great Lakes Environ. Res. Lab, Ann Arbor, Mich. Saylor, J. H., and Miller, G. S. (1979), Lake Huron winter circulation, J. Geophys. Res., 84( C6), 3237– 3252, doi:10.1029/JC084iC06p03237. Schwab, D. J. (2016). Statistical Analysis of Straits of Mackinac Line 5: Worst Case Spill Scenarios. March, 23. www.graham.umich.edu/water. Senjyu, T., Shin, H. R., Yoon, J. H., Nagano, Z., An, H. S., Byun, S. K., & Lee, C. K. (2005). Deep flow field in the Japan/East Sea as deduced from direct current measurements. Deep- Sea Research Part II: Topical Studies in Oceanography, 52(11–13), 1726–1741. https://doi.org/10.1016/j.dsr2.2003.10.013 Sloss, P. W., & Saylor, J. H. (1976). Large-scale current measurements in Lake Huron. Journal of Geophysical Research, 81(18), 3069–3078. https://doi.org/10.1029/jc081i018p03069. Smagorinsky, J. (1963), General circulation experiments with the primitive equations: I. The basic experiment, Monthly Weather Review, 91(3), 99–164. Smith, S. D. P., et al. (2014). Rating impacts in a multi-stressor world: a quantitative assessment of 50 stressors affecting the Great Lakes. Ecological Applications, 25(3): 717-728. Stoica, P., and Moses, R. L. (2005). Spectral Analysis of Signals, Vol. 452, pp. 25-26. Pearson Prentice Hall, Upper Saddle River, NJ. Tomlinson, L.M., Auer, M.T., Bootsma, H.A., Owens, E.M., 2010. The Great Lakes Cladophora Model: development, testing, and application to Lake Michigan. J. Great Lakes Res. 36(2), 287–297. 132 US Geological Survey, National Water Information System, https://waterdata.usgs.gov/nwis/measurements/?site_no=04132160, accessed October 18, 2022 Webster, F. (1968), Observations of inertial-period motions in the deep sea, Rev. Geophys., 6( 4), 473– 490, doi:10.1029/RG006i004p00473. Wheat, E.E., Banas, N.S., Ruesink, J.L., Multi-day water residence time as a mechanism for physical and biological gradients across intertidal flats, Estuarine, Coastal and Shelf Science, 227, 106303, pp. 1-10 (2019) Xiong, J., Shen, J., Qin, Q., & Du, J. (2021). Water exchange and its relationships with external forcings and residence time in Chesapeake Bay. Journal of Marine Systems, 215, 103497. https://doi.org/10.1016/J.JMARSYS.2020.103497 Xue, P., Schwab, D. J., and Hu, S. (2015), An investigation of the thermal response to meteorological forcing in a hydrodynamic model of Lake Superior, J. Geophys. Res. Oceans, 120, 5233– 5253, doi:10.1002/2015JC010740. Yuan, D., Lin, B., & Falconer, R. A. (2007). A modelling study of residence time in a macro- tidal estuary. Estuarine, Coastal and Shelf Science, 71(3–4), 401–411. https://doi.org/10.1016/j.ecss.2006.08.023 Zamani, B., Koch, M., & Hodges, B. R. (2020). Effects of morphology in controlling propagation of density currents in a reservoir using uncalibrated three-dimensional hydrodynamic modeling. Journal of Limnology, 79(3), 238–253. https://doi.org/10.4081/jlimnol.2020.1942 Arias-Rodriguez, L. F., Tüzün, U. F., Duan, Z., Huang, J., Tuo, Y., & Disse, M. (2023). Global Water Quality of Inland Waters with Harmonized Landsat-8 and Sentinel-2 Using Cloud- Computed Machine Learning. Remote Sensing, 15(5), 1390. https://doi.org/10.3390/RS15051390/S1 Barreneche, J. M., Guigou, B., Gallego, F., Barbieri, A., Smith, B., Fernández, M., Fernández, V., & Pahlevan, N. (2023). Monitoring Uruguay’s freshwaters from space: An assessment of different satellite image processing schemes for chlorophyll-a estimation. Remote Sensing Applications: Society and Environment, 29. https://doi.org/10.1016/j.rsase.2022.100891 Beletsky, D., Hawley, N., & Rao, Y. R. (2013). Modeling summer circulation and thermal structure of Lake Erie. Journal of Geophysical Research: Oceans, 118(11), 6238–6252. https://doi.org/10.1002/2013JC008854 133 Bilotta, G. S., & Brazier, R. E. (2008). Understanding the influence of suspended solids on water quality and aquatic biota. Water research, 42(12), 2849-2861. Braga, F., Zaggia, L., Bellafiore, D., Bresciani, M., Giardino, C., Lorenzetti, G., Maicu, F., Manzo, C., Riminucci, F., Ravaioli, M., & Brando, V. E. (2017). Mapping turbidity patterns in the Po river prodelta using multi-temporal Landsat 8 imagery. Estuarine, Coastal and Shelf Science, 198, 555–567. https://doi.org/10.1016/j.ecss.2016.11.003 Cao, Z., Ma, R., Duan, H., Pahlevan, N., Melack, J., Shen, M., & Xue, K. (2020). A machine learning approach to estimate chlorophyll-a from Landsat-8 measurements in inland lakes. Remote Sensing of Environment, 248. https://doi.org/10.1016/j.rse.2020.111974 Chen, C., H. Liu, and R. C. Beardsley (2003), An unstructured, finite-volume, three-dimensional, primitive equation ocean model: Application to coastal ocean and estuaries, J. Atmos. Oceanic Technol., 20, 159–186. Cunningham, A., Ramage, L., & McKee, D. (2013). Relationships between inherent optical properties and the depth of penetration of solar radiation in optically complex coastal waters. Journal of Geophysical Research: Oceans, 118(5), 2310–2317. https://doi.org/10.1002/jgrc.20182 Farahani, A., Pourshojae, B., Rasheed, K., & Arabnia, H. R. (2021). A Concise Review of Transfer Learning. https://arxiv.org/abs/2104.02144v1. Felix, M. (2002). Flow structure of turbidity currents. Sedimentology, 49(3), 397–419. https://doi.org/10.1046/J.1365-3091.2002.00449.X Filisbino Freire da Silva, E., Márcia Leão de Moraes Novo, E., de Lucia Lobo, F., Clemente Faria Barbosa, C., Tressmann Cairo, C., Almeida Noernberg, M., & Henrique da Silva Rotta, L. (2021). A machine learning approach for monitoring Brazilian optical water types using Sentinel-2 MSI. Remote Sensing Applications: Society and Environment, 23. https://doi.org/10.1016/j.rsase.2021.100577 Galperin, B., L. H. Kantha, S. Hassid, and A. Rosati (1988), A quasi-equilibrium turbulent energy model for geophysical flows, J. Atmos. Sci., 45(1), 55–62. Gambin, A. F., Angelats, E., Gonzalez, J. S., Miozzo, M., & DIni, P. (2021). Sustainable Marine Ecosystems: Deep Learning for Water Quality Assessment and Forecasting. IEEE Access, 9, 121344–121365. https://doi.org/10.1109/ACCESS.2021.3109216 Gao, B. C. (1996). NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote sensing of environment, 58(3), 257-266. 134 Garg, V., Aggarwal, S. P., & Chauhan, P. (2020). Changes in turbidity along Ganga River using Sentinel-2 satellite data during lockdown associated with COVID-19. Geomatics, Natural Hazards and Risk, 11(1), 1175–1195. https://doi.org/10.1080/19475705.2020.1782482 Garg, V., Kumar, A. S., Aggarwal, S. P., Kumar, V., Dhote, P. R., Thakur, P. K., ... & Rastogi, G. (2017). Spectral similarity approach for mapping turbidity of an inland waterbody. Journal of hydrology, 550, 527-537. Garg, V., Senthil Kumar, A., Aggarwal, S. P., Kumar, V., Dhote, P. R., Thakur, P. K., Nikam, B. R., Sambare, R. S., Siddiqui, A., Muduli, P. R., & Rastogi, G. (2017). Spectral similarity approach for mapping turbidity of an inland waterbody. Journal of Hydrology, 550, 527– 537. https://doi.org/10.1016/J.JHYDROL.2017.05.039 Germán, A., Shimoni, M., Beltramone, G., Rodríguez, M. I., Muchiut, J., Bonansea, M., Scavuzzo, C. M., & Ferral, A. (2021). Space-time monitoring of water quality in an eutrophic reservoir using SENTINEL-2 data - A case study of San Roque, Argentina. In Remote Sensing Applications: Society and Environment (Vol. 24). Elsevier B.V. https://doi.org/10.1016/j.rsase.2021.100614 Goudenège, L., Molent, A., & Zanette, A. (2020). Gaussian process regression for pricing variable annuities with stochastic volatility and interest rate. Decisions in Economics and Finance, 44(1), 57–72. https://doi.org/10.1007/S10203-020-00287-7 Grasso, F., Verney, R., le Hir, P., Thouvenin, B., Schulz, E., Kervella, Y., Khojasteh Pour Fard, I., Lemoine, J. P., Dumas, F., & Garnier, V. (2018). Suspended Sediment Dynamics in the Macrotidal Seine Estuary (France): 1. Numerical Modeling of Turbidity Maximum Dynamics. Journal of Geophysical Research: Oceans, 123(1), 558–577. https://doi.org/10.1002/2017JC013185 Gunn, J. M., Snucins, E. D., Yan, N. D., & Arts, M. T. (2001). Use of water clarity to monitor the effects of climate change and other stressors on oligotrophic lakes. Environmental Monitoring and Assessment, 67, 69-88. Hoaglin, D.C. (2013). Volume 16: How to Detect and Handle Outliers. Huang, H., Imran, ; Jasim, Asce, A. M., & Pirmez, C. (2005). Numerical Model of Turbidity Currents with a Deforming Bottom Boundary. Journal of Hydraulic Engineering, 131(4), 283–293. https://doi.org/10.1061/(ASCE)0733-9429(2005)131:4(283) Hunt, S., & Jones, H. F. E. (2020). The fate of river-borne contaminants in the marine environment: Characterising Regions of Freshwater Influence (ROFIs) and estuary plumes using idealised models and satellite images. Marine Pollution Bulletin, 156. https://doi.org/10.1016/j.marpolbul.2020.111169 135 Jalón-Rojas, I., Dijkstra, Y. M., Schuttelaars, H. M., Brouwer, R. L., Schmidt, S., & Sottolichio, A. (2021). Multidecadal Evolution of the Turbidity Maximum Zone in a Macrotidal River Under Climate and Anthropogenic Pressures. Journal of Geophysical Research: Oceans, 126(5). https://doi.org/10.1029/2020JC016273 Kim, H. E., Cosa-Linan, A., Santhanam, N., Jannesari, M., Maros, M. E., & Ganslandt, T. (2022). Transfer learning for medical image classification: a literature review. BMC Medical Imaging 2022 22:1, 22(1), 1–13. https://doi.org/10.1186/S12880-022-00793-7 Knaeps, E., Ruddick, K. G., Doxaran, D., Dogliotti, A. I., Nechad, B., Raymaekers, D., & Sterckx, S. (2015). A SWIR based algorithm to retrieve total suspended matter in extremely turbid waters. Remote Sensing of Environment, 168, 66–79. https://doi.org/10.1016/j.rse.2015.06.022 Kuhn, C., de Matos Valerio, A., Ward, N., Loken, L., Sawakuchi, H. O., Kampel, M., Richey, J., Stadler, P., Crawford, J., Striegl, R., Vermote, E., Pahlevan, N., & Butman, D. (2019). Performance of Landsat-8 and Sentinel-2 surface reflectance products for river remote sensing retrievals of chlorophyll-a and turbidity. Remote Sensing of Environment, 224, 104–118. https://doi.org/10.1016/j.rse.2019.01.023 le Fouest, V., Chami, M., & Verney, R. (2015). Analysis of riverine suspended particulate matter fluxes (Gulf of Lion, Mediterranean Sea) using a synergy of ocean color observations with a 3-D hydrodynamic sediment transport model. Journal of Geophysical Research: Oceans, 120(2), 942–957. https://doi.org/10.1002/2014JC010098 Legleiter, C. J., Sansom, B. J., & Jacobson, R. B. (2022). Remote Sensing of Visible Dye Concentrations During a Tracer Experiment on a Large, Turbid River. Water Resources Research, 58(4). https://doi.org/10.1029/2021WR031396 Li, H., Yang, Q., Mo, S., Huang, J., Wang, S., Xie, R., Luo, X., & Liu, F. (2022). Formation of Turbidity Maximum in the Modaomen Estuary of the Pearl River, China: The Roles of Mouth Bar. Journal of Geophysical Research: Oceans, 127(12). https://doi.org/10.1029/2022JC018766 Li, J., Tian, L., Wang, Y., Jin, S., Li, T., & Hou, X. (2021). Optimal sampling strategy of water quality monitoring at high dynamic lakes: A remote sensing and spatial simulated annealing integrated approach. Science of the Total Environment, 777. https://doi.org/10.1016/j.scitotenv.2021.146113 Li, Z., Yang, W., Matsushita, B., & Kondoh, A. (2022). Remote estimation of phytoplankton primary production in clear to turbid waters by integrating a semi-analytical model with a 136 machine learning algorithm. Remote Sensing of Environment, 275. https://doi.org/10.1016/j.rse.2022.113027 Manfredi, P., & Trinchero, R. (2022). A Probabilistic Machine Learning Approach for the Uncertainty Quantification of Electronic Circuits Based on Gaussian Process Regression. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 41(8), 2638–2651. https://doi.org/10.1109/TCAD.2021.3112138 Mellor, G. L., and T. Yamada (1982), Development of a turbulence closure model for geophysical fluid problems, Rev. Geophys., 20, 851– 875. Memari, S., & Siadatmousavi, S. M. (2018). Numerical Modeling of Heat and Brine Discharge Near Qeshm Desalination Plant. International Journal of Coastal and Offshore Engineering, 1(4), 27–35. https://doi.org/10.29252/ijcoe.1.4.27 Nehorai, R., Lensky, I. M., Hochman, L., Gertman, I., Brenner, S., Muskin, A., & Lensky, N. G. (2013). Satellite observations of turbidity in the Dead Sea. Journal of Geophysical Research: Oceans, 118(6), 3146–3160. https://doi.org/10.1002/jgrc.20204 Nelson, S. A., Soranno, P. A., Cheruvelil, K. S., Batzli, S. A., & Skole, D. L. (2003). Regional assessment of lake water clarity using satellite remote sensing. Journal of Limnology, 62(1s), 27-32. Ni, W., Wang, K., Chen, T., Ng, W. J., & Tan, S. K. (2012). GPR model with signal preprocessing and bias update for dynamic processes modeling. Control Engineering Practice, 20(12), 1281–1292. https://doi.org/10.1016/J.CONENGPRAC.2012.07.003 Normandin, C., Lubac, B., Sottolichio, A., Frappart, F., Ygorra, B., & Marieu, V. (2019). Analysis of Suspended Sediment Variability in a Large Highly Turbid Estuary Using a 5- Year-Long Remotely Sensed Data Archive at High Resolution. Journal of Geophysical Research: Oceans, 124(11), 7661–7682. https://doi.org/10.1029/2019JC015417 Olmanson, L. G., Bauer, M. E., & Brezonik, P. L. (2008). A 20-year Landsat water clarity census of Minnesota's 10,000 lakes. Remote Sensing of Environment, 112(11), 4086-4097. Pahlevan, N., Smith, B., Schalles, J., Binding, C., Cao, Z., Ma, R., Alikas, K., Kangro, K., Gurlin, D., Hà, N., Matsushita, B., Moses, W., Greb, S., Lehmann, M. K., Ondrusek, M., Oppelt, N., & Stumpf, R. (2020). Seamless retrievals of chlorophyll-a from Sentinel-2 (MSI) and Sentinel-3 (OLCI) in inland and coastal waters: A machine-learning approach. Remote Sensing of Environment, 240. https://doi.org/10.1016/j.rse.2019.111604 137 Pu, F., Ding, C., Chao, Z., Yu, Y., & Xu, X. (2019). Water-Quality Classification of Inland Lakes Using Landsat8 Images by Convolutional Neural Networks. Remote Sensing 2019, Vol. 11, Page 1674, 11(14), 1674. https://doi.org/10.3390/RS11141674 Rasmussen, C. E. and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press. Cambridge, Massachusetts, 2006 Roy, D. P., Huang, H., Boschetti, L., Giglio, L., Yan, L., Zhang, H. H., & Li, Z. (2019). Landsat- 8 and Sentinel-2 burned area mapping - A combined sensor multi-temporal change detection approach. Remote Sensing of Environment, 231. https://doi.org/10.1016/j.rse.2019.111254 Saberioon, M., Brom, J., Nedbal, V., Souc̆ ek, P., & Císar̆ , P. (2020). Chlorophyll-a and total suspended solids retrieval and mapping using Sentinel-2A and machine learning for inland Schulz, E., Grasso, F., le Hir, P., Verney, R., & Thouvenin, B. (2018). Suspended Sediment Dynamics in the Macrotidal Seine Estuary (France): 2. Numerical Modeling of Sediment Fluxes and Budgets Under Typical Hydrological and Meteorological Conditions. Journal of Geophysical Research: Oceans, 123(1), 578–600. https://doi.org/10.1002/2016JC012638 Sequeiros, O. E., Cantelli, A., Viparelli, E., White, J. D. L., Garcí, M. H., & Parker, G. (2009). Modeling turbidity currents with nonuniform sediment and reverse buoyancy. Water Resources Research, 45(6). https://doi.org/10.1029/2008WR007422 Smagorinsky, J. (1963), General circulation experiments with the primitive equations: I. The basic experiment, Monthly Weather Review, 91(3), 99–164. Syariz, M. A., Lin, C. H., Heriza, D., Lasminto, U., Sukojo, B. M., & Jaelani, L. M. (2021). A Transfer Learning Technique for Inland Chlorophyll-a Concentration Estimation Using Sentinel-3 Imagery. Applied Sciences 2022, Vol. 12, Page 203, 12(1), 203. https://doi.org/10.3390/APP12010203 Topp, S. N., Pavelsky, T. M., Jensen, D., Simard, M., & Ross, M. R. V. (2020). Research Trends in the Use of Remote Sensing for Inland Water Quality Science: Moving Towards Multidisciplinary Applications. Water 2020, Vol. 12, Page 169, 12(1), 169. https://doi.org/10.3390/W12010169 Turbidity -- Units of Measurement (usgs.gov), Website: https://or.water.usgs.gov/grapher/fnu.html Vanhellemont, Q., & Ruddick, K. (2021). Atmospheric correction of Sentinel-3/OLCI data for mapping of suspended particulate matter and chlorophyll-a concentration in Belgian turbid 138 coastal waters. Remote Sensing of Environment, 256. https://doi.org/10.1016/j.rse.2021.112284 Wang, W., & Jing, B. (2022). Gaussian process regression: Optimality, robustness, and relationship with kernel ridge regression. J. Mach. Learn. Res., 23, 193:1-193:67. Wernand, M. R. (2010). On the history of the Secchi disc. Journal of the European Optical Society-Rapid Publications, 5. Werther, M., Odermatt, D., Simis, S. G. H., Gurlin, D., Lehmann, M. K., Kutser, T., Gupana, R., Varley, A., Hunter, P. D., Tyler, A. N., & Spyrakos, E. (2022). A Bayesian approach for remote sensing of chlorophyll-a and associated retrieval uncertainty in oligotrophic and mesotrophic lakes. Remote Sensing of Environment, 283. https://doi.org/10.1016/j.rse.2022.113295 Winterwerp, J. C. (2002). On the flocculation and settling velocity of estuarine mud. Continental Shelf Research, 22(9), 1339–1360. https://doi.org/10.1016/S0278-4343(02)00010-9 Yuan, Q., Shen, H., Li, T., Li, Z., Li, S., Jiang, Y., Xu, H., Tan, W., Yang, Q., Wang, J., Gao, J., & Zhang, L. (2020). Deep learning in environmental remote sensing: Achievements and challenges. Remote Sensing of Environment, 241. https://doi.org/10.1016/j.rse.2020.111716 Zheng, G., & DiGiacomo, P. M. (2022). A simple water clarity-turbidity index for the Great Lakes. Journal of Great Lakes Research, 48(3), 686–694. https://doi.org/10.1016/j.jglr.2022.03.005 Zhu, C., van Maren, D. S., Guo, L., Lin, J., He, Q., & Wang, Z. B. (2022). Feedback Effects of Sediment Suspensions on Transport Mechanisms in an Estuarine Turbidity Maximum. Journal of Geophysical Research: Oceans, 127(6). https://doi.org/10.1029/2021JC018029 Zhu, Q., Shen, F., Shang, P., Pan, Y., & Li, M. (2019). Hyperspectral Remote Sensing of Phytoplankton Species Composition Based on Transfer Learning. Remote Sensing 2019, Vol. 11, Page 2001, 11(17), 2001. https://doi.org/10.3390/RS11172001 Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., & He, Q. (2020). A Comprehensive Survey on Transfer Learning. https://doi.org/10.1109/JPROC.2020.3004555 139