ASSESSING WATERBIRD HOTSPOTS FOR CONSERVATION AND MANAGEMENT IN THE GREAT LAKES By Allison L. Sussman A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Integrative Biology—Master of Science 2017 ABSTRACT ASSESSING WATERBIRD HOTSPOTS FOR CONSERVATION AND MANAGEMENT IN THE GREAT LAKES By Allison L. Sussman Waterbird species are highly mobile and often aggregate in large groups, leading to difficulties assessing their distributions and spatial patterns. Species patterns also vary throughout the year due to migration and habitat preference for both breeding and wintering locations. Yet, waterbirds are a key study group because they are abundant, easily measured, and reactive, making them ideal indicators of environmental change. Identifying persistent areas of high use (i.e., hotspots) for waterbird species is both ecologically and economically beneficial. For Chapter 1, I selected four commonly used hotspot analysis models to evaluate the consistency of approaches for eight species and species groups in the Great Lakes region. Although there was some consistency across models, correlation analyses and mapped results demonstrated that hotspot analysis approaches can produce conflicting results. For Chapter 2, I developed an integrated hotspot modeling approach, in which I combined multiple models to produce a single hotspot value per location. I then used this model to estimate hotspot locations for the eight species groups in surveyed locations across three Great Lakes. The effectiveness of hotspot analyses is dependent upon the quantity and quality of available data, the spatial scale at which data are collected, and the scale at which results are needed (e.g., for management). My work provides researchers and resource managers with a quantitative evaluation of common hotspot analyses techniques. My results also elucidate possible waterbird hotspots in the Great Lakes, providing baseline estimates which can be used to prioritize sampling locations for future waterbird surveys. Copyright by ALLISON L. SUSSMAN 2017 This thesis is dedicated to my family. When I first mentioned the possibility of moving to Michigan, you didn’t balk at the hundreds of miles it would put between us. Instead, you simply encouraged me to follow my dreams regardless of where they would take me (and switched credit cards to build up your miles!). Thank you for your unwavering love and support. iv ACKNOWLEDGEMENTS This research was supported by the U.S. Fish and Wildlife Service Great Lakes Fish and Wildlife Restoration Act grants program. My work would not have been possible without the support of our regional project team for the project on monitoring and mapping avian resources in the Great Lakes. Specifically: Evan Adams, Beth Gardner, Kevin Kenow, Katie Koch, Michele Leduc-Lapierre, David Luukkonen, Michael Monfils, William Mueller, Becky Pearson, Victoria Pebbles, Leo Salas, Kate Williams, and the many pilots (Mike Callahan, Derek DeRuiter, Brian Lubinski, Keith Teague, and Luke Wuest) and observers (Jeff Bahls, Seth Cutright, Tim Demers, Joelle Gehring, Luke Fara, Kristen Finch, Steven Houdek, Robby Lambert, Tom Schultz, Brendan Shirkey, Howie Singer, Joel Trick, and Sarah Folsom Yates) who assisted with collecting the data. A special thanks to: Evan Adams and Beth Gardner who provided invaluable feedback on the models and methods used; Michael Fitzgibbon, Leo Salas, and Jeffrey Tash who provided early GIS and data support; and Kathy Kuletz and Patrick O’Hara who provided input on the spatial models and standardization process. I am grateful to my committee members, Catherine Lindell, Brian Maurer, and Michael Monfils, for providing constructive feedback and direction throughout my tenure at MSU. I am incredibly appreciative of the entire Zipkin Quantitative Ecology Lab who all provided productive feedback on the selected hotspot approaches and maps. I will miss the incredible and easy comradery formed between us in the ZQEL lab, and I know that no future group will come close to us; keep on ‘making nature predictable’! I am especially thankful to Matt Farr, Sam Rossman, and Sarah Saunders, although I’m sure they are probably glad they won’t be hearing the word ‘hotspots’ for much longer. I am grateful to Jacob Bowman, the late Brian Kinlan, and v Mark Wimer for encouraging and supporting me, and without whom I wouldn’t be where I am today. My journey with seabirds began with Brian and Mark and I am incredibly grateful for the opportunities their collaborations and friendships afforded me. A special thank you to Mark, who has been a boss, mentor, and friend for more than ten years. I’m lucky to know you (and thanks for hiring me way back when). And finally, I am completely indebted to Elise Zipkin for her continued support and guidance from day one (which was in fact long before either of us arrived in Michigan). She has been a remarkable mentor over the last few years. Elise has the unmatched ability to make everything seem okay even when it’s not; a five-minute conversation with her really can turn your day around. I am honored and privileged to have worked with and learned from such an intelligent and skilled woman. I am a better scientist, and person, for knowing and working with you all. Thank you. vi TABLE OF CONTENTS LIST OF TABLES ....................................................................................................................... viii LIST OF FIGURES ....................................................................................................................... ix INTRODUCTION .......................................................................................................................... 1 CHAPTER 1 A comparative analysis of methods to identify waterbird hotspots .......................... 4 ABSTRACT ................................................................................................................................ 4 INTRODUCTION ....................................................................................................................... 5 METHODS.................................................................................................................................. 8 Study area & data description .................................................................................................. 8 Species groups and composition............................................................................................ 11 Data standardization .............................................................................................................. 13 Hotspot analysis ..................................................................................................................... 14 Spatial non-parametric model: kernel density estimation .................................................. 14 Spatial non-parametric model: Getis-Ord Gi* ................................................................... 15 Parametric model: hotspot persistence............................................................................... 16 Parametric model: hotspots conditional on species presence ............................................ 17 Comparative analysis of the methods .................................................................................... 18 RESULTS.................................................................................................................................. 19 DISCUSSION ........................................................................................................................... 30 APPENDIX ................................................................................................................................... 36 CHAPTER 2 Combining models to identify waterbird hotspots in the Great Lakes ................... 38 ABSTRACT .............................................................................................................................. 38 INTRODUCTION ..................................................................................................................... 39 METHODS................................................................................................................................ 43 Study area & data description ................................................................................................ 43 Species groups & data standardization .................................................................................. 45 Hotspot analysis ..................................................................................................................... 46 Combining hotspot models .................................................................................................... 48 RESULTS.................................................................................................................................. 48 DISCUSSION ........................................................................................................................... 60 APPENDIX ................................................................................................................................... 66 LITERATURE CITED ................................................................................................................. 70 vii LIST OF TABLES Table 1.1. List of species and species groups, including the total number of encounters and total number observed. .......................................................................................................................... 12 Table 1.2. Pearson correlation matrix of pairwise comparisons between the four hotspot analysis approaches (kernel density estimation, Getis-Ord Gi*, hotspot persistence, and hotspots conditional on presence) with a Bonferroni adjustment and an alpha level of 0.05. Light gray values show copies (i.e., values above and below of the diagonals are mirror images). .............. 23 Table S1.1. Diving/sea ducks and loons reanalyzed without long-tailed duck and common loon, respectively. Pearson correlation matrix of pairwise comparisons between the four hotspot analysis approaches (kernel density estimation, Getis-Ord Gi*, hotspot persistence, and hotspots conditional on presence) with a Bonferroni adjustment and an alpha level of 0.05. Light gray values show copies (i.e., values above and below of the diagonals are mirror images). .............. 37 Table 2.1. Temporal distribution of unique aerial-surveys conducted from September 2012 to June 2014 and summed across all research entities. ..................................................................... 44 Table 2.2. Percent hotspots within each lake for each species/group and the all-species-combined group. The top row of the graph shows the percent of all grid cells surveyed (out of 1767) within each lake. When comparing across the lakes, it is important to note percentages are relative to the number of grid cells surveyed within each lake. Bold font indicates that the percent values are greater than the proportion surveyed in that lake, and indicate more hotspots than would be expected by chance for a particular species group in a particular lake. ........................................ 51 Table S2.1. Percent hotspots within each lake for diving and sea duck species groups. The top row of the graph shows the percent of all grid cells surveyed (out of 1767) within each lake. When comparing across the lakes, it is important to note percentages are relative to the number of grid cells surveyed within each lake. Bold font indicates that the percent values are greater than the proportion surveyed in that lake, and indicate more hotspots than would be expected by chance for a particular species group in a particular lake. ............................................................ 67 viii LIST OF FIGURES Figure 1.1. (a) Aerial-survey transects flown over selected areas of the Great Lakes during the fall, winter, and spring seasons from September 2012 to June 2014. The survey regions (divided by research entity who collected the data) are distinguished by colors. (b) Map of the study area showing the number of sampling events (or visits) per 5 km x 5 km grid cell during the entire survey period. ................................................................................................................................ 10 Figure 1.2. Potential hotspots (values above the 75% percentile) across all sampled locations for the all-species-combined group (includes: diving/sea ducks, gulls, loons, mergansers, and scaup) as estimated with each of the four hotspot analysis approaches: (a) kernel density estimation, (b) Getis-Ord Gi*, (c) hotspot persistence, and (d) hotspots conditional on presence. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. Note the survey regions are delineated for the hotspot persistence approach (c) because hotspots in this method are calculated relative to other grid cells within these specific regions. .......................... 25 Figure 1.3. Hotspot maps for the scaup species group (example pictured in a; upper panel) in western Lake Erie, including (a; lower panel) mean effort-corrected counts. The hotspot values are shown on the raw scales for each of the four methods: (b) non-parametric spatial approaches: (upper panel) kernel density estimation and (lower panel) Getis-Ord Gi*; (c) parametric GLMbased approaches: (upper panel) hotspot persistence and (lower panel) hotspots conditional on presence......................................................................................................................................... 27 Figure 1.4. Potential hotspots (values above the 75% quantile) in a portion of Lake Michigan for the diving/sea duck species group, highlighting (a) single cell selection in hotspots conditional on presence method (a parametric, non-spatial approach) and (b) clustered cell selection in kernel density estimation (a non-parametric, spatial approach). ............................................................. 29 Figure 2.1. Great Lakes study area, with aerial-survey transects flown over selected areas of the region during the fall, winter, and spring seasons from September 2012 to June 2014. Survey regions are colored according to the research entity who collected the data. ............................... 44 Figure 2.2. Potential hotspots (values above the 75% percentile) across all sampled locations for all-species-combined (includes: all diving/sea ducks, gulls, loons, mergansers, and scaup) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. ............................................................. 52 Figure 2.3. Potential hotspots (values above the 75% percentile) across all sampled locations for diving/sea ducks (includes: bufflehead, canvasback, common eider, long-tailed duck, redhead, ring-necked duck, ruddy duck, all eiders, all goldeneye, all mergansers, all scaup, all scoters, all unidentified diving ducks) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. .............. 53 ix Figure 2.4. Potential hotspots (values above the 75% percentile) across all sampled locations for gulls (includes: Bonaparte's gull, glaucous gull, great black-backed gull, herring gull, Iceland gull, mew gull, ring-billed gull, all unidentified gulls) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. ........................................................................................................................ 54 Figure 2.5. Potential hotspots (values above the 75% percentile) across all sampled locations for long-tailed duck [Clangula hyemalis] as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. .............................................................................................................................. 55 Figure 2.6. Potential hotspots (values above the 75% percentile) across all sampled locations for mergansers (includes: common merganser, hooded merganser, red-breasted merganser, all unidentified mergansers, all unidentified merganser/goldeneye) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. .................................................................................................... 56 Figure 2.7. Potential hotspots (values above the 75% percentile) across all sampled locations for scaup (includes: greater scaup, lesser scaup, all unidentified scaup) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. .................................................................................................... 57 Figure 2.8. Potential hotspots (values above the 75% percentile) across all sampled locations for loons (includes: common loon, red-throated loon, all unidentified loons) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. ..................................................................................... 58 Figure 2.9. Potential hotspots (values above the 75% percentile) across all sampled locations for common loon [Gavia immer] as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. .............. 59 Figure S2.1. Potential hotspots (values above the 75% percentile) across all sampled locations for diving ducks (includes: canvasback, redhead, ring-necked duck, ruddy duck, and all scaup) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. ............................................................. 68 Figure S2.2. Potential hotspots (values above the 75% percentile) across all sampled locations for sea ducks (includes: bufflehead, common eider, long-tailed duck, all goldeneye, all mergansers, and all scoters) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. .............. 69 x INTRODUCTION Waterbird species are notoriously difficult to study; they exhibit patchy, skewed distributions and often flock in high aggregations (Santora and Veit 2013, Zipkin et al. 2015). Waterbirds can move across large spatial scales, responding to environmental changes to find suitable habitat and prey (Montevecchi et al. 2012; Perry et al. 2005). Monitoring waterbirds is challenging because of unpredictable weather conditions, high costs, and difficulty identifying individuals to the species level. As such, assessing waterbird populations and identifying spatial patterns is challenging. Recent interest in alternative energy sources, increased coastal development, improvements in commercial and recreational shipping and fishing, and rapid global climate changes pose new and serious threats to many waterbird species (Garthe and Huppop 2004, Williamson et al. 2013, Adam et al. 2015). Because waterbird species respond to changes in the environment, they are often used as indicators of ecosystem and environmental health (Reese and Brodeur 2006, Piatt et al. 2007, Nur et al. 2011, Croxall et al. 2012, Santora and Sydeman 2015). By studying their distributions and observing spatial patterns, we can begin to predict how waterbirds will respond in the face of potential threats. One of the foremost methods in assessing waterbird populations is to identify areas with persistent high-use or aggregations, termed ‘hotspots’. By identifying areas with consistently high species richness or abundance, researchers and decision makers can protect areas that are beneficial to many species (not just waterbirds), thereby decreasing the cost and resources associated with conservation and management (Harcourt 1999, Possingham and Wilson 2005). The process of delineating hotspots has been used in ecology for nearly thirty years, and has evolved such that there is no longer a single definition or scientific consensus to identify such 1 areas (Myers 1988, Briscoe et al. 2015). The different methods to identify hotspots can lead to potentially different results, culminating in conflict of delineated hotspots (Prendergast 1993, Araujo 2002, Orme 2005, Possingham and Wilson 2005, Hobday and Pecl 2014, Daru et al. 2015, Harvey et al. 2017). Consequently, researchers are left to arbitrarily select an analysis technique with little confidence in the resultant identified hotspots. Comparing several of the most commonly used hotspot analysis models for waterbird species will help resolve some of these conflicts, providing insights on different techniques that can aid in selecting the most appropriate method based on data availability and conservation concerns. The following chapters aim to assess waterbird abundances and distributions throughout the Great Lakes region to inform decision-making and conservation planning by comparing four distinctly different hotspot analysis models and quantifying the consistency across the various approaches. The Great Lakes have played an important role throughout human history, and will continue to do so into the future (EPA 1995). Not only do they provide freshwater to millions of people daily, but the Great Lakes also provide recreational activities (e.g., swimming), commercial resources (e.g., shipping), sustenance (through hunting and fishing), and cultural services (e.g., birdwatching) to people throughout the year (Gronewold 2013). The lakes also provide resources that most people do not consider, such as hydrologic retention, nutrient cycling, shoreline protection, and sediment trapping (EPA 1995). It is in our best interest then, to ensure the health of the environment in and around the Great Lakes. By monitoring and understanding waterbird populations, we will be better equipped to understand and predict environmental changes that will affect ecosystem health. The goal of this thesis is to increase both consistency and confidence of delineated hotspot locations using an integrated modeling approach that combines multiple hotspot models. 2 Products from this research include a single hotspot map for each species or species group, but due to the limited data extent (both spatially and temporally) and inconsistent survey methods, caution should be taken when interpreting them. Despite data inadequacies, the hotspot maps are especially useful in determining areas in need of greater surveying and of interest for possible conservation and management. 3 CHAPTER 1 A comparative analysis of methods to identify waterbird hotspots ABSTRACT Detection of hotspots is a commonly used method in ecology and conservation to identify areas of high biodiversity or conservation concern. However, delineating and mapping hotspots is subjective and various approaches can lead to different conclusions with regard to the classification of particular areas as hotspots, complicating long-term conservation planning and implementation efforts. We present the results of a comparative analysis of recent approaches for identifying waterbird hotspots. We examined the literature and selected four common measures of identifying persistent areas of high use: two non-parametric, spatial approaches (kernel density estimation and Getis-Ord Gi*) and two parametric, non-spatial approaches (hotspot persistence and hotspots conditional on presence). We applied each of the methods to aerialsurvey waterbird count data collected in the Great Lakes from 2012-2014 using a 5 km2 grid. For each approach, we identified areas of high use for seven species/species groups and then compared the results across all methods. Our results indicate that formal hotspot analysis frameworks do not always lead to the same conclusions. The two spatial methods yielded the most similar results across all species analyzed. Yet, we found that the spatial models can differ substantially from the non-spatial models, which were not consistently similar to one another. The hotspot persistence approach differed most significantly from the other methods, but is the only method to explicitly account for temporal variation. We recommend considering the ecological question and scale of any conservation or management activities prior to designing survey methodologies. Deciding the appropriate definition and scale for analysis is also critical 4 for interpretation of hotspot analysis results. Combining methods using an integrative approach (i.e., inclusion of both spatial and non-spatial methods), either within a single analysis or posthoc, could lead to greater consistency in the identification of waterbird hotspots. INTRODUCTION Hotspots are most often defined as small geographic areas (within a predefined larger region) that exhibit persistent high concentrations of individuals or species (Harcourt 1999, Possingham and Wilson 2005). Yet in the three decades since Myers (1988) first introduced the term, the definition of hotspots has expanded and adapted to reflect changes in conservation goals (Briscoe et al. 2015). Animal hotspots are generally defined as areas with high levels of at least one of the following biological measures: species abundance, richness, or endemism; rare, threatened, or endangered species; and taxonomic distinctiveness (Possingham and Wilson 2005, Briscoe et al. 2015). Hotspots are typically defined on a case-by-case basis because patterns vary by species and location, thus, the threshold to differentiate between hotspots and other locations naturally varies as well (Nelson and Boots 2008). For example, common definitions of hotspots focus on determining areas with consistent high species abundance (Piatt et al. 2006, Davoren 2007), richness, or biological activity (Sydeman et al. 2006) or some combination of these (Nur et al. 2011). Hotspots have also been defined as locations where some metric exceeds a predefined threshold (e.g., top five percent of the data [Harvey et al. (2017); locations within one [Suryan et al. (2012) and Santora and Veit (2013)] or three [Zipkin et al. (2015)] standard deviations above the mean of a particular region or area sampled). Such definitions attempt to quantify hotspots (allowing for direct location comparison) as opposed to identifying hotspots using only qualitative criteria, which was common until recently (Mittermeier 2011). The 5 different approaches to identify hotspots have become increasingly complex and may lead to dissimilar or inconsistent results (Prendergast 1993, Araujo 2002, Orme 2005, Possingham and Wilson 2005, Hobday and Pecl 2014, Daru et al. 2015, Harvey et al. 2017). The consequences of applying different metrics to define hotspots is a lack of congruence across measures and, thus, hotspot locations, culminating in controversy and conflict over long-term conservation efforts (Orme et al. 2005, Possingham and Wilson 2005, Marchese 2015). Waterbird species display extreme variability in habitat use over both space and time (Certain 2007, Piatt et al. 2007, Votier et al. 2008); they often exhibit large, patchy aggregations offshore making it difficult to measure their spatial distributions (Santora and Veit 2013, Zipkin et al. 2015). As such, the foremost method to determine patterns of waterbird species is to identify locations of persistent aggregation or high use, such as hotspots. Hotspot identification is useful in studies of highly mobile organisms, such as waterbirds, because the likelihood that a survey event of any given location is representative of the true population size at that location is low due to the extreme variability of their distributions (Santora and Veit 2013). There are many methods to examine the diversity and abundance patterns of open water populations, but locating persistent high-use areas is a frequent first step towards understanding the processes that generate spatial patterns of species distributions (Nelson and Boots 2008). For waterbird abundance data, hotspot analyses are typically conducted using one of three approaches: 1) qualitative analyses (e.g., through mapping abundance); 2) with nonparametric spatial models; or 3) with parametric generalized linear modeling (GLM) techniques (Tremblay et al 2009). Historically, areas of high density or concentration were displayed and compared visually, and mapping relative species abundances remains a prevalent conservation tool today (Tremblay et al. 2009, Harvey et al. 2017). Yet, qualitative approaches are limited 6 because they do not reflect temporal changes (i.e., they are simply a snapshot in time), cannot adequately account for aggregations, and can be misleading based on how data are classified/colored (Marchese 2015). As a result, more rigorous quantitative approaches, typically in the form of spatial models or GLM-based approaches, were developed. Spatial models use data collected in both focal and surrounding locations to derive areas of high use. As their name suggests, these techniques account for spatial patterns in the data (Harvey et al. 2017). In contrast, non-spatial parametric approaches, which use a GLM framework, consider locations independent of one another and require the use of a metric or threshold and statistical distributions to describe observed patterns (Oppel et al. 2012, Zipkin et al. 2015, Santora and Veit 2013). Waterbirds are highly mobile and tend to aggregate in large groups, resulting in highly skewed data with a lot of non-detections (i.e., zeros). As such, selecting an appropriately skewed statistical distribution to model waterbird data is fundamental to accurately identifying hotspots using parametric approaches (Zipkin et al. 2014). In this study, we evaluate four methods for identifying hotspots for waterbird populations using data collected in the Great Lakes: two non-parametric, spatial models and two parametric GLM-based approaches. For the non-parametric models, we selected kernel density estimation and the Getis-Ord Gi* hotspot analysis method. Kernel density estimation is perhaps the most well-known and widely used spatial statistic method for identifying hotspots. Kernel density estimation is an interpolation technique that is used to estimate the probability density function of a variable of interest (e.g., abundance) to identify areas of high density (Wilson et al. 2009, O’Brien et al. 2012, Suryan et al. 2012, Wong et al. 2014). A less common spatial statistic for detecting hotspots is the Getis-Ord Gi* statistic (Gi*), which allows for cluster evaluation within a specified distance of a single point (Getis and Ord 1992, Kuletz et al. 2015, Santora et al. 7 2010). Gi* analysis is a spatial tool that identifies spatially explicit areas with values higher in magnitude than would be expected due to random chance, independent of the magnitude of abundance (Kuletz et al. 2015, Santora et al. 2010). For the parametric models, we adapted two GLM-based methods which have been used to identify waterbird hotspot locations. The first parametric approach, which we term “hotspot persistence”, defines hotspots for every unique sampling event and calculates persistence over time (Suryan et al. 2012, Santora and Veit 2013, Johnson et al. 2015). The second parametric approach, which we term “hotspots conditional on presence”, combines survey data from all sampling events and defines hotspots as locations with a long-term average abundance greater than three times the regional mean, conditional on species presence (Zipkin et al. 2015). Our objective was to compare consistency across the four different hotspot analysis techniques for several waterbird species and species groups. To do this, we applied the four hotspot methods to the species data and then performed pairwise correlations to measure the strength and association between the different approaches. This allowed us to quantify the degree to which the various hotspot estimators aligned in their assessments of hotspots. METHODS Study area & data description We conducted systematic aerial transect surveys of waterbirds in portions of Lakes Erie, Huron, and Michigan, as well as Lake St. Clair during fall, winter, and spring seasons from late September 2012 through early June 2014 (Fig. 1.1a). The transects ranged in length from 3 to 177 km covering approximately 8,000 km within the entire study area. Most of the transects (97%) were surveyed repeatedly with an average number of 10.68 (SD: 3.85) sampling events 8 per transect with approximately 83,000 km flown over the duration of the survey period. Transects were spaced 3.2 to 5 km apart and flight altitude ranged from 61 to 100 m above the lake’s surface. Two observers, one on either side of the plane, recorded every waterbird flock that was detected in the observable portion of the transect (the area not obscured by the plane). For each sighting, we recorded the species, number of individuals seen, and latitude and longitude (using onboard GPS) on the transect line. For large flocks that covered many square kilometers (i.e., up to 30 km), the location recorded is an approximate to the center of the flock. Birds were identified to the lowest taxonomic group possible when observers were unable to determine species. We integrated the data into the open access Midwest Avian Data Center (MWADC), a regional node of the Avian Knowledge Network (AKN), hosted by Point Blue Conservation Science (http://data.pointblue.org/partners/mwadc/). In some cases, observers did not include transect attribution for individual observations (23.4% of records). For these records, we used incremental buffering by 1-m to identify the closest transect to each observation record. We used data collected on the location of the transect line when available (42.26% of transect lines) and GIS to reconstruct the transect lines from observations in instances when that information was not recorded. During the survey period, 253 transects were surveyed resulting in 136 unique sampling events/surveys. 9 Research Entity (a) Green Bay Number of Sampling Events (b) Biodiversity Research Institute Michigan Department of Natural Resources Michigan Natural Features Inventory United States Geological Survey WesternLake Great Lakes Bird & Bat Observatory Green Bay Georgian Bay Lake Huron Huron Lake Ontario Lake Michigan Lake St. Clair 1-3 4-6 7 - 10 11 - 16 17 - 20 21 - 30 Lake Ontario Lake Michigan Lake Erie Lake St. Clair Lake Erie Figure 1.1. (a) Aerial-survey transects flown over selected areas of the Great Lakes during the fall, winter, and spring seasons from September 2012 to June 2014. The survey regions (divided by research entity who collected the data) are distinguished by colors. (b) Map of the study area showing the number of sampling events (or visits) per 5 km x 5 km grid cell during the entire survey period. 10 Species groups and composition We observed 76 bird species at least one time with 41,803 observations including over two million individual birds. We focused our analysis on seven species/species groups: longtailed duck, common loon [Gavia immer], gulls [Larus spp.], mergansers [Mergus spp. and Lophodytes spp.], scaup [Aythya spp.], loons [Gavia spp.], and diving/sea ducks [Aythya spp., Bucephala spp., Clangula hyemlais, Melanitta spp., Mergus spp., Oxyura jamaicensis, and Somateria spp.]. (Table 1.1). We chose these species and species groups (hereafter referred to as species groups) because they were fairly evenly distributed across the study area (i.e., occurred in most lakes with data), with observations in at least 200 grid cells, and were encountered at least 1000 times during the survey period. The seven species groups used in our analysis comprised 33 species (Table 1.1) and nearly 90% of all observed birds, including some individuals that could not be identified to species (and were used in analyses with multiple species only). Canvasback was the most abundant bird species (i.e., most individuals observed), but long-tailed duck was encountered most often (Table 1.1). We identified potential hotspots for the seven species groups, and then used the data from all species groups to analyze hotspots for an all-species-combined group. Some individual species appeared in multiple groups; for example, long-tailed duck was analyzed individually and in the diving/sea duck group. In such instances, an individual species was used only once in the all-species-combined group (i.e., not double counted). 11 Table 1.1. List of species and species groups, including the total number of encounters and total number observed. Species/ Species Group Diving/sea ducks Number of Encounters 19,183 Number Observed 1,700,311 Species Included bufflehead [Bucephala albeola], canvasback [Aythya valisineria], common eider [Somateria mollissima], long-tailed duck [Clangula hyemalis], redhead [Aythya americana], ringnecked duck [Aythya collaris], ruddy duck [Oxyura jamaicensis], all eiders [Somateria spp.], all goldeneye [Bucephala spp.], all mergansers [Mergus spp.], all scaup [Aythya spp.], all scoters [Melanitta spp.], and all unidentified diving ducks [Aythya spp.] Gulls 12,233 81,399 Bonaparte's gull [Chroicocephalus Philadelphia], glaucous gull [Larus hyperboreus], great black-backed gull [Larus marinus], herring gull [Larus smithsonianus], Iceland gull [Larus glaucoides], mew gull [Larus canus], ring-billed gull [Larus delawarensis], and all unidentified gulls [Laridae spp.] Long-tailed duck 6,011 149,542 long-tailed duck [Clangula hyemalis] 95,702 common merganser [Mergus merganser], hooded merganser [Lophodytes cucullatus], redbreasted merganser [Mergus serrator], all unidentified mergansers [Mergus spp.], and all unidentified merganser/goldeneye [Mergus/Bucephala spp.] 383,495 greater scaup [Aythya marila], lesser scaup [Aythya affinis], and all unidentified scaup [Aythya spp.] Mergansers Scaup 4,865 3,431 Loons 2,111 4,364 common loon [Gavia immer], red-throated loon [Gavia stellata], and all unidentified loons [Gavia spp.] Common loon 1,688 2,922 common loon [Gavia immer] 12 Data standardization We imposed a 5 km x 5 km grid (consisting of 17,746 cells) over the entire Great Lakes region and assigned transects to grid cells based on their spatial locations. We chose this scale because the maximum distance between survey transects was 5 km. Thus, a smaller grid would create a very patchy system of survey effort with many empty cells, whereas a larger size would lump together more data, obscuring fine scale aggregations of species. We segmented all transects using the grid system so that grid cells contained only the portion of the transect that occurred within the cell, such that a cell could contain anywhere from zero to many transect segments. Then, we calculated the total length of the transect segments(s) within a grid cell to determine the number of kilometers flown for all sampling events within each cell. We included in our analysis only those cells which contained a total transect length of at least 1-km. The sum of transect lengths within cells ranged from 1 to 16.02 km with an average of 4.69 km (SD: 2.16 km). A total of 1,699 of the 1,767 grid cells included in our analysis had bird observations on at least one sampling occasion. We standardized the count data within grid cells (Johnson et al. 2015) because the survey effort is unequal and highly variable across cells within the study area (Fig. 1.1b). We divided the number of observations of a species for the sampling event-grid cell combination by the summed transect length, resulting in a continuous effort-corrected count. We define a sampling event as a unique year-month-day (survey date) combination within each region of the Great Lakes. Inclement weather and extensive ice coverage necessitated some surveys to be halted prematurely or conducted over a short period of time. We thus assumed that instances in which an area was surveyed over multiple consecutive days were a single sampling event. The result was 136 unique sampling events that we included in our analysis. We used data from all 13 sampling events in our hotspot analyses; however, we limited the method comparisons, correlations, and hotspot maps to grid cells that contained at least four sampling events, for a total of 1,473 grid cells (83.4% of cells). Using grid cells with at least four sampling events allowed for analysis from across the study region, while excluding grid cells that did not have sufficient data which could lead to false hotspot identification (Fig. 1.1b; Kuletz et al. 2015, Zipkin et al. 2015). Hotspot analysis Spatial non-parametric model: kernel density estimation Kernel density estimation is a common method for estimating relative density in animal populations that aggregate, and has been used repeatedly to identify waterbird hotspot locations and marine areas in need of protection (Wilson et al. 2009, O’Brien et al. 2012, Suryan et al. 2012, and Wong et al. 2014). Essentially, the modeling approach converts point data (i.e., effortcorrected counts) into a continuous surface grid reflecting relative densities across all grid cells, where the resulting density of each grid cell is weighted according to the distance from the focal location/grid cell (Wong et al. 2014). To implement the kernel density method, we calculated the midpoint of each grid cell and assigned all observations of the species group (across all sampling events) to the midpoint of the grid cell to which they occurred. We accounted for uneven sampling effort (grid cells were surveyed from one to thirty times) by dividing the summed observations in a grid cell by the total number of sampling events for the grid cell. We used the kernel density tool in the Spatial Analyst extension of ArcGIS 10.3.1 (ESRI 2015) to estimate bird density, inputting values for both bandwidth and cell size. The bandwidth, or size of the neighborhood over which the density 14 is averaged is the amount of smoothing applied to each kernel (Nelson and Boots 2008). Small bandwidth values may result in undersmoothed small-scale kernels, leading to underestimation of hotspots, whereas large values result in oversmoothed general kernels, overestimating hotspots (Wong et al. 2014). We selected a 5-km bandwidth for kernel smoothing based on both the geographic extent of the data and the distance between survey transects (Suryan et al. 2012). Kernel density estimation results in a raster, which is a matrix of pixels where each pixel contains a value representing information (e.g., bird abundance; ESRI 2015). The cell size for the output raster can also affect the interpretation of the kernel estimate: large cell sizes may result in a blocky raster that is a poor approximation of a continuous surface, and small cell sizes may result in a raster of many cells that takes too long to calculate (Beyer 2014). We selected a cell size of 1 km x 1 km for the output raster. For each species group, we extracted the mean expected count from the resulting kernel density raster back to the 5 km2 grid for comparison with the other methods. Spatial non-parametric model: Getis-Ord Gi* The Getis-Ord Gi* (Gi*) statistic detects waterbird hotspots while also indicating the statistical significance of those hotspots (Santora et al. 2010, Kuletz et al. 2015). The Gi* technique identifies grid cells whose data points cluster spatially by examining each grid cell within the context of the neighboring cells (Getis and Ord 1992). Gi* differs from kernel density estimation because it incorporates the value of each feature in the context of its neighbors, whereas kernel density estimates the neighbors based on the focal feature. Gi* is less subjective than kernel density estimation because kernel estimation is based on user-specified values (bandwidth and cell size; Kuletz et al. 2015). 15 To implement the Gi* statistic, we again calculated the midpoint of each grid cell and assigned all observations of the species group across all sampling events to the midpoint of the cell into which they occurred. We again accounted for uneven sampling effort by dividing the summed effort-corrected grid cell observations by the total number of sampling events for that cell. We built a neighbor list for all grid cells using rook’s case contiguity (i.e., grid cells that share a border), and then used the neighbor list to calculate a row-standardized spatial weights matrix (spdep package in R; Bivand et al. 2013, Bivand and Piras 2015). The matrix informs every grid cell’s relationship to all other cells in the neighborhood (Kuletz et al. 2015). We used the effort-corrected counts and the spatial weights matrix to calculate the Gi* for each grid cell (spdep package in R; Bivand et al. 2013, Bivand and Piras 2015). Gi* produces a z-score for each grid cell, where high positive values are statistically significant and indicate the possibility of a local cluster of high species abundance (i.e., a hotspot) that is unlikely due to random chance. Parametric model: hotspot persistence The hotspot persistence method quantifies the persistence of species counts within individual grid cells (Suryan et al. 2012, Santora and Veit 2013, Johnson et al. 2015). To implement this method, we first fit a gamma distribution to the effort-corrected continuous species group count data, summed within grid cells, for each unique sampling event (fitdistrplus package in R; Delignette-Muller and Dutang 2015). We selected the two-parameter gamma distribution (shape and scale) because it can fit a variety of continuous right-skewed data (Dennis and Patil 1983, Bolker 2008) and because it has been used before with this hotspot analysis technique (Johnson et al. 2015). We then assigned a probability to each grid cell based on the fit 16 of the data within the cumulative distribution curve for that sampling event. This allowed us to identify grid cells (for each unique sampling event) with high abundance of the target species group relative to other cells. Within a unique sampling event, we identified grid cells as hotspots if the value of the cumulative distribution for the cell, based on the fit of the gamma distribution, was above the 75th percentile for that sampling event. After identifying which grid cells were categorized as hotspots for every unique sampling event, we calculated the proportion of sampling events in which a grid cell was identified as a hotspot to examine persistence. The final output was the proportion of sampling events, ranging from zero to one, in which a grid cell was considered a hotspot for the target species group. Proportions at zero indicate the grid cell was never a hotspot. The higher the proportion, the more frequently the grid cell was considered a hotspot, with a proportion of one indicating the grid cell was a hotspot for all sampling events. Parametric model: hotspots conditional on species presence The hotspots conditional on presence method calculates the long-term probability that a grid cell is a hotspot for a particular species given observed abundances over time (Kinlan et al. 2012, Zipkin et al. 2015). To implement this method, we fit the effort-corrected nonzero count data using a lognormal distribution, which does not contain zero in its support (fitdistrplus package in R; Delignette-Muller and Dutang 2015). The lognormal is a two parameter (mean and standard deviation), positive, continuous probability distribution characterized by a heavy tail and has been shown to fit waterbird data well because of its flexible shape and ability to fit heavily skewed data (Zipkin et al. 2014, Zipkin et al. 2015). We then estimated prevalence in the reference region as the proportion of blocks with occurrences for the target species group (at least one individual observed within the cell over all sampling events) relative to the total 17 number of blocks surveyed. We defined the reference region as the entire area sampled across the Great Lakes. Using the estimated mean and standard deviation from the lognormal distribution (as our count component) and the prevalence estimate (as our Bernoulli component), we simulated data with a Monte Carlo approach to calculate hotspot locations (Kinlan et al. 2012, Zipkin et al. 2015). We defined a hotspot as a grid cell in which the long-term average effort-corrected count conditional on presence (with a 𝛼 = 0.05 threshold), was at least three times the mean of the reference region, also conditional on presence (Kinlan et al. 2012). The resulting values, ranging from zero to one, represent the proportion of simulated sample means that are greater than three times the average count. Values close to zero indicate the grid cell is not a hotspot. Values close to one indicate a high probability that the long-term average abundance in the grid cell is greater than three times the mean of the reference region. Comparative analysis of the methods Our objective was to determine the degree of congruence among the four methods across species groups and for all-species-combined. To quantify the consistency among the four approaches in their ability to detect hotspots, we performed a Pearson’s product-moment correlation to evaluate the pairwise associations of the four approaches with a Bonferroni adjustment and an alpha level of 0.05 (psych package Revelle 2015). We analyzed the correlation coefficients (ranging from zero to one) to determine associations among the different approaches: the higher the value, the higher the correlation between two methods. We produced maps for all species groups to visually compare the results of the four hotspot analysis approaches. For the first set of maps, we plotted the values produced for each grid cell using each hotspot analysis technique. Direct visual comparison among the hotspot 18 methods can be difficult because the scale of the results for each method varies (i.e., hotspot persistence and hotspots conditional on presence range between zero to one, kernel density produces unbounded positive values, while Gi* produces both positive and negative values). To resolve this issue, we created a second set of maps for each species group in which we considered a hotspot as any grid cell with a value above the 75th percentile (of all values for that method) and plotted those according to their percentiles. Values below the 75th percentile were not considered hotspots. RESULTS The highest correlation between methods for all species groups occurred between the two spatial techniques, kernel density estimation and Gi*, with a correlation ≥0.80 for all species groups except mergansers (Table 1.2). For mergansers, the correlation between the two spatial methods was 0.67 and nearly identical to the correlation between Gi* and the hotspots conditional on presence method. For the other species groups, including all-species-combined, there was much higher congruency between the spatial methods than any other combination of pairwise comparisons (Table 1.2). For example, for all-species-combined there was 94% overlap in identification of hot and non-hot grid cells between kernel density estimation and Gi* (Fig. 1.2). In general, the parametric approaches were no more similar to one another than to either of the spatial methods and the highest correlations for each of the parametric models varied by species (Table 1.2). Kernel density estimation and hotspot persistence showed the lowest correlations (0.03-0.56) for the species groups that we examined. For the all-species-combined group, we found that the four methods identified the same 63% of grid cells as non-hot locations (below the 75th percentile), while approximately 8% of grid cells were identified as hotspot 19 locations under all four methods (above the 75th percentile). The remaining 29% of the cells were identified as hotspots by one, two, or three of the methods. The hotspot persistence approach differed most significantly from the other three methods and had the lowest correlations overall with other methods (Table 1.2; Fig. 1.2-1.3). Unlike the other methods, the hotspot persistence approach calculates hotspots relative to the survey region, rather than the entire study area (i.e., Fig. 1.1a, Fig. 1.2c upper panel), and also explicitly incorporates temporal variability. For example, in the analysis of the scaup species group, we found that many grid cells in Lake St. Clair were identified as hotspots by all methods except for hotspot persistence (Fig. 1.3). The counts for scaup were generally quite high in Lake St. Clair relative to other surveyed locations. However, the hotspot persistence method revealed that individual grid cells did not often have high counts on repeated occasions (as evidenced with zeros and other low values in grid cells within Lake St. Clair; Fig. 1.3c). The all-speciescombined analysis produced similar results, with 9% of grid cells identified as hotspots by the persistence approach but not by the other three methods (Fig. 1.2). The spatial methods are inherently different from the parametric, GLM-based approaches, which is evident in both the correlations and maps (Table 1.2). The parametric approaches tended to select single grid cells as hotspots, whereas the two spatial methods selected small clusters of grid cells as hotspots (Fig. 1.4). In many cases, grid cells that were highly ranked with the hotspots conditional on presence approach, were also highly ranked with the spatial methods. However, the surrounding grid cells tended to also be ranked with the spatial approaches (Fig. 1.4). Gi* had higher correlations (than kernel density estimation) with both parametric approaches for gulls, mergansers, loons, and common loon. For diving/sea ducks, long-tailed duck, and scaup, Gi* produced more similar results to hotspot persistence while 20 kernel density estimation was more consistent with hotspots conditional on presence. Allspecies-combined showed a similar pattern to the diving/sea ducks. The two parametric approaches, hotspot persistence and hotspots conditional on presence, had relatively low correlations, ranging from 0.33-0.67 across the species groups examined (Table 1.2). Hotspot persistence correlation ranged from 0.03 (common loon) to 0.67 (scaup) and showed the highest correlation with hotspots conditional on presence for all species groups and all-species-combined (0.33-0.67). Hotspots conditional on presence showed the most consistency with kernel density estimation for all-species-combined (0.44) and diving/sea ducks (0.47), with Gi* for gulls (0.46) and mergansers (0.67), and with hotspot persistence for long-tailed duck (0.58), scaup (0.69), loons (0.45), and common loon (0.60). Scaup showed the highest agreement between any two methods (0.88 correlation between the two spatial methods), whereas common loon had the lowest correlation among the methods (0.03 correlation between kernel density estimation and hotspot persistence). Species-specific hotspots for common loon and long-tailed duck occurred in areas identified as hotspots for the corresponding species group. Common loon observations comprised 67% of the loon group observations. Interestingly, a reanalysis of the loons group without common loons only substantially altered the correlations between the parametric approaches, which had a correlation nearly as high as the two spatial approaches (Appendix 1, Table S1.1). Additionally, both parametric models performed better with Gi* than they did with kernel density estimation. This reanalysis suggests that common loon heavily influences the loon species group. When long-tailed duck data were removed from the diving/sea duck group (9% of the data), Gi* showed a higher correlation than kernel density estimation with hotspots conditional on presence (although the two spatial methods were still highly correlated with one 21 another; Appendix 1, Table S1.1), suggesting that long-tailed ducks may be disproportionately influencing the results of the hotspot analyses for the diving/sea duck group. 22 Table 1.2. Pearson correlation matrix of pairwise comparisons between the four hotspot analysis approaches (kernel density estimation, Getis-Ord Gi*, hotspot persistence, and hotspots conditional on presence) with a Bonferroni adjustment and an alpha level of 0.05. Light gray values show copies (i.e., values above and below of the diagonals are mirror images). Kernel density estimation Hotspot Getis-Ord Gi* persistence Hotspots conditional on presence All species/groups Kernel density estimation 1.000 0.870 0.121 0.441 Getis-Ord Gi* 0.870 1.000 0.125 0.435 Hotspot persistence 0.121 0.125 1.000 0.332 Hotspots conditional on presence 0.441 0.435 0.332 1.000 Kernel density estimation 1.000 0.874 0.160 0.467 Getis-Ord Gi* 0.874 1.000 0.167 0.458 Hotspot persistence 0.160 0.167 1.000 0.364 Hotspots conditional on presence 0.467 0.458 0.364 1.000 Kernel density estimation 1.000 0.808 0.318 0.457 Getis-Ord Gi* 0.808 1.000 0.325 0.459 Hotspot persistence 0.318 0.325 1.000 0.446 Hotspots conditional on presence 0.457 0.459 0.446 1.000 Kernel density estimation 1.000 0.804 0.406 0.475 Getis-Ord Gi* 0.804 1.000 0.451 0.432 Hotspot persistence 0.406 0.451 1.000 0.582 Hotspots conditional on presence 0.475 0.432 0.582 1.000 Kernel density estimation 1.000 0.672 0.433 0.600 Getis-Ord Gi* 0.672 1.000 0.547 0.669 Hotspot persistence 0.433 0.547 1.000 0.575 Hotspots conditional on presence 0.600 0.669 0.575 1.000 Diving/sea ducks Gulls Long-tailed duck (LTDU) Mergansers 23 Table 1.2 (cont’d). Kernel density estimation Hotspot Getis-Ord Gi* persistence Hotspots conditional on presence Scaup Kernel density estimation 1.000 0.878 0.562 0.661 Getis-Ord Gi* 0.878 1.000 0.586 0.623 Hotspot persistence 0.562 0.586 1.000 0.686 Hotspots conditional on presence 0.661 0.623 0.686 1.000 Kernel density estimation 1.000 0.808 0.047 0.210 Getis-Ord Gi* 0.808 1.000 0.063 0.258 Hotspot persistence 0.047 0.063 1.000 0.454 Hotspots conditional on presence 0.210 0.258 0.454 1.000 Kernel density estimation 1.000 0.800 0.027 0.049 Getis-Ord Gi* 0.800 1.000 0.032 0.075 Hotspot persistence 0.027 0.032 1.000 0.606 Hotspots conditional on presence 0.049 0.075 0.606 1.000 Loons Common loon (COLO) 24 (a) Hotspots (b) Green Bay Green Bay Georgian Bay Lake Huron Lake Huron < 75th 75 - 90th Georgian 90 - 95th Bay 95 - 99th > 99th sampled less than 4 times Lake Ontario Lake Ontario Lake Michigan Lake Michigan Lake St. Clair Lake Erie Research Entity (c) Green Bay Lake St. Clair Lake Erie (d) Biodiversity Research Institute Michigan Department of Natural Resources Georgian Michigan Natural Features Inventory United States Geological Bay Survey Lake Western Great Lakes Bird & Bat Observatory Green Bay Georgian Bay Lake Huron Huron Lake Ontario Lake Ontario Lake Michigan Lake St. Clair Lake Michigan Lake Erie Lake St. Clair Lake Erie Figure 1.2. Potential hotspots (values above the 75% percentile) across all sampled locations for the all-species-combined group 25 Figure 1.2 (cont’d). (includes: diving/sea ducks, gulls, loons, mergansers, and scaup) as estimated with each of the four hotspot analysis approaches: (a) kernel density estimation, (b) Getis-Ord Gi*, (c) hotspot persistence, and (d) hotspots conditional on presence. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. Note the survey regions are delineated for the hotspot persistence approach (c) because hotspots in this method are calculated relative to other grid cells within these specific regions. 26 (b) (a) Lake St. Clair Kernel Density Estimation 0.00 - 0.99 1.00 - 1.46 1.47 - 2.19 2.20 - 3.24 3.25 - 4.82 4.83 - 18.93 (c) Hotspot Persistence 0.00 - 0.19 0.20 - 0.59 0.60 - 0.70 0.71 - 0.80 0.81 - 0.90 0.91 - 1.00 Lake Erie Mean EffortCorrected Count 0.00 0.01 - 9.96 9.97 - 28.49 28.50 - 51.74 51.75 - 183.25 183.26 - 484.57 Lake St. Clair Lake Erie Lake St. Clair Getis-Ord Gi* Lake St. Clair -1.65 - 1.65 1.66 - 1.96 1.97 - 2.56 > 2.56 Lake Erie Lake Erie Conditional on Presence 0.00 - 0.19 0.20 - 0.59 0.61 - 0.70 0.71 - 0.80 0.81 - 0.90 0.91 - 1.00 Lake St. Clair Lake Erie Figure 1.3. Hotspot maps for the scaup species group (example pictured in a; upper panel) in western Lake Erie, including (a; lower panel) mean effort-corrected counts. The hotspot values are shown on the raw scales for each of the four methods: (b) non-parametric 27 Figure 1.3 (cont’d). spatial approaches: (upper panel) kernel density estimation and (lower panel) Getis-Ord Gi*; (c) parametric GLM-based approaches: (upper panel) hotspot persistence and (lower panel) hotspots conditional on presence. 28 (a) (b) Green Bay Lake Michigan Hotspots Green Bay Lake Michigan < 75th 75 - 90th 90 - 95th 95 - 99th > 99th sampled less than 4 times Figure 1.4. Potential hotspots (values above the 75% quantile) in a portion of Lake Michigan for the diving/sea duck species group, highlighting (a) single cell selection in hotspots conditional on presence method (a parametric, non-spatial approach) and (b) clustered cell selection in kernel density estimation (a non-parametric, spatial approach). 29 DISCUSSION Despite the frequent use of hotspot analyses in management and conservation, we found that the various hotspot approaches resulted in the identification of different hotspots within our Great Lakes waterbird dataset. If conservation and management decisions are based on hotspot analyses, then it is important to understand the advantages, limitations, and potential biases in the various approaches to facilitate selection of the most appropriate method to answer the question(s) of interest. Our analyses reveal that methods to estimate hotspots using spatial approaches, specifically kernel density and G*, produce highly correlated results and can likely be used as surrogates for one another. However, the non-spatial, GLM-based methods that we examined produce less consistent results with each other and the spatial methods, the degree to which varies by species. The hotspot persistence approach differed the most from the other methods. Unlike the three other techniques, hotspot persistence estimates hotspots based on unique sampling events within survey regions and then identifies whether those areas persist as hotspots over time. The other three approaches focus on average abundance over the entire survey period and may not be comparable to the hotspot persistence approach. The hotspot persistence approach (as with the other approaches) may perform best with many more sampling events rather than our minimum of four surveys (Kinlan et al. 2012, Zipkin et al. 2015). Our analyses cannot determine which method most appropriately estimates species hotspots. The best method for any given analysis will depend on the question and study region. The spatial methods are useful in assessing large areas where waterbirds have been present (with high densities) on at least one occasion. However, the spatial methods do not explicitly incorporate sampling effort and many applications to waterbird data ignore this issue (although we accounted for varying sampling intensity by dividing the observations in a grid cell by the 30 number of sampling events). Furthermore, the spatial methods can have difficulty identifying individual grid cells as hotspots, especially at small spatial scales and with heavily skewed data (Songchitruska and Zeng 2010, Harris et al. 2017). The parametric hotspot models are likely to be relatively more conservative in identifying hotspots (e.g., Fig. 1.4), which may be desirable depending on conservation and management priorities. The hotspots conditional on presence method calculates species hotspots using the estimated expected count in grid cells based on all sampling events over the entire study area. Locations are hotspots when expected abundance is high (in this case, three standard deviations above the mean) compared to the mean count of the target species in the entire reference region. The hotspot persistence method classifies a location as a hotspot if that location has high density or abundance consistently over time relative to the rest of the grid cells that were surveyed at the same time. Thus, if the goal is to identify areas of long-term high concentration for species protection within a smaller region, the hotspot persistence method (or a similar metric) may be most appropriate. There is subjectivity in all methods to identify hotspots because arbitrary thresholds are selected to delineate hotspot locations and can be changed (Harvey et al. 2017). Thus, there is a tradeoff in the rate of Type I and Type II errors in selecting a threshold level (Kinlan et al. 2012). A high threshold will constrain the number of hotspots and the likelihood that an area is falsely categorized as a hotspot decreases. A low threshold will increase the number of hotspots, possibly including locations that should not be considered hotspots. For the spatial techniques, a number of user-determined decisions are incorporated: the bandwidth and cell size must be specified in kernel density estimation and the number and direction of neighboring values must be decided in the Gi* method, both of which can easily be changed to suit the user (Canadas et al. 2014, Marchese 2015). For the parametric GLM-based approaches, the user must decide the 31 level that constitutes a hotspot (i.e., top 25% of cells in the hotspot persistence approach; three times in the mean in hotspots conditional on presence method; Canadas et al 2014, Zipkin et al. 2015). The production of hotspot maps is also subjective (Carolan 2009). We choose to classify and color hotspots as the top 25% of grid cell values for the four methods (Fig. 1.2). However, a different threshold could produce maps that appear more or less similar (particularly when evaluating methods on their original scales; e.g., Fig 1.3) and possibly lead to different conclusions on the congruency of the four approaches. A combination of methods using an integrative approach that synthesizes results, either within a single analysis or post-hoc, could lead to greater consistency (Marchese 2015). Nur et al. (2011) found that the use of multiple criteria (when defining a hotspot within a single method) prevents misidentifying an area that may be overlooked when using only a single criterion. Future analyses could not only combine methods, but also incorporate multiple criteria to define hotspots. Variation in hotspot identification across the methods may be due in part to scale. Scale is important to several aspects of hotspot identification: 1) the scale at which the data are collected, 2) the spatial scale at which the data are analyzed, and 3) the scale at which management decisions will be made. It is important to account for potential discrepancies in the geographic scale of population-level processes and the resolution of different datasets when deciding the spatial scale for analyses. The 5 km2 grid we used is a fine-scale resolution to identify and map hotspots and may result in less consistency across methods than a larger, coarser grid (Daru et al. 2015). The distance between and design of transect surveys can directly affect the outcome of hotspot analyses, therefore we recommend simultaneous and thorough consideration of survey design and analysis prior to data collection. Spatial methods explicitly incorporate autocorrelation and tend to identify large areas (i.e., multiple grid cells) as hotspots whereas the 32 non-spatial, GLM-based methods typically identify small, patchy areas (i.e., individual grid cells; Harvey et al. 2017). Thus, geographic scale and resolution (i.e., how finely the planning areas are divided into planning units/grid cells) is critical for developing effective management plans that recognize both ecological and social processes. The data used in our analyses were both spatially and temporally limited. The data come from the Great Lakes region, though only covered portions of three Great Lakes: Erie, Huron, and Michigan, as well as Lake St. Clair. To better assess hotspots in the entire Great Lakes region, sampling should be conducted in the remaining two lakes, as well as in other regions of the three sampled lakes. Our data spanned three seasons over two years with some areas sampled only once and others up to thirty times. The inconsistency in sampling required us to exclude data from areas that were sampled only a small number of times (less than four), although it is possible that even areas included in our analyses were not sampled enough for accurate hotspot detection (Hazen et al. 2014, Zipkin et al. 2015). The data used in this study were collected by five different research institutions, highlighting the importance of coordinated survey efforts, proper data descriptions, and open data sharing, especially for large-scale analyses such as those presented here. We attempted to break down the data to identify species-specific and species group hotspots; however, because the data are temporally constrained, we were limited to analyzing only a handful of possible species groups. Grouping species also allowed us to incorporate individuals that could not be identified to the species-level (21% of all observations), increasing sample sizes. Despite the constraints on the data, our results will help inform areas in need of greater sampling effort for future survey work (GLFWRA 2006). Resources for conservation and wildlife management are limited, requiring prioritization of conservation objectives (Araujo 2002). Hotspot analysis is a first step in understanding species 33 distribution patterns but it is equally or more important to determine why certain areas contain persistent aggregations of waterbirds. We did not include environmental variables (e.g., bathymetry, surface temperature, ice coverage, etc.) or season in our hotspot analyses, although they most certainly play an important role in the distribution and abundance patterns of waterbirds (Nur et al. 2011, Suryan et al. 2012). Environmental variables, such as habitat suitability and food availability, are critical to discerning species behaviors and patterns (Hyrenbach et al. 2000, Shirkey 2012, Briscoe et al. 2015). However, challenges arise with environmental variables that are constantly changing (e.g., ice cover, temperature), making identification of static hotspot locations relative to environmental variables difficult (Briscoe et al. 2015, Marchese 2015) and perhaps less useful for certain management related questions. Seasonal variability in waterbird species across the Great Lakes is an important factor that we did not consider in our analyses; abundances can fluctuate during migration or at overwintering locations and shifts in distributions may occur within seasons (Suryan et al. 2016). Human disturbance can also affect species distributions; for example, diving ducks may forage at night in different areas than during the day to avoid disturbance, such that diurnal distributions may not represent the spatial distribution of all important areas used (Shirkey 2012). Survey methods and modeling techniques have improved over time, but waterbird species are highly mobile, making the identification of priority areas difficult (Arcos et al. 2012, Harvey et al. 2017, Marchese 2015). Through our study, we demonstrate that delineating hotspots is often subjective, as different thresholds can produce varying results. Regardless of their drawbacks, hotspot analyses are likely to remain an important tool for conservation because of their relative ease to implement. Yet, researchers should clearly identify conservation and management goals to select the most appropriate analysis method(s). This will allow hotspot 34 analyses to be both useful and meaningful. Although individual methods may lead to incongruent hotspot identification, incorporating multiple techniques, especially those that combine spatial and non-spatial approaches, could increase insights and improve ability to identify true areas of high use. 35 APPENDIX 36 Table S1.1. Diving/sea ducks and loons reanalyzed without long-tailed duck and common loon, respectively. Pearson correlation matrix of pairwise comparisons between the four hotspot analysis approaches (kernel density estimation, Getis-Ord Gi*, hotspot persistence, and hotspots conditional on presence) with a Bonferroni adjustment and an alpha level of 0.05. Light gray values show copies (i.e., values above and below of the diagonals are mirror images). Kernel density estimation Hotspot Getis-Ord Gi* persistence Hotspots conditional on presence Diving/sea ducks without long-tailed duck Kernel density estimation 1.000 0.876 0.179 0.492 Getis-Ord Gi* 0.876 1.000 0.193 0.507 Hotspot persistence 0.179 0.193 1.000 0.420 Hotspots conditional on presence 0.492 0.507 0.420 1.000 Kernel density estimation 1.000 0.813 0.111 0.119 Getis-Ord Gi* 0.813 1.000 0.126 0.127 Hotspot persistence 0.111 0.126 1.000 0.806 Hotspots conditional on presence 0.119 0.127 0.806 1.000 Loons without common loon 37 CHAPTER 2 Combining models to identify waterbird hotspots in the Great Lakes ABSTRACT Waterbird species play an important role in ecosystems and are known indicators of ecosystem health. Most waterbird species exhibit patchy distributions with high aggregations that vary throughout the year, making it difficult to identify spatial patterns; however, waterbirds rapidly react to changes in the environment providing beneficial information about aquatic ecosystems and species. Detecting hotspots is an effective way to identify species-specific patterns and simultaneously inform decisions regarding habitat and ecosystem protection. The Great Lakes, and surrounding areas, offer many resources for both humans and wildlife. The region provides unparalleled habitat for many wildlife species, including many waterbirds throughout the year during migration, wintering, and breeding seasons. There are several methods to identify hotspots, many of which are arbitrary in selecting metrics or thresholds and may thus provide incongruent results. One solution is to combine methods, which may result in more accurate hotspot estimates than with any single analysis framework. We selected and combined two hotspot models commonly used for waterbird hotspot analyses to identify species-specific hotspots in the Great Lakes: one spatial method, Getis-Ord Gi*, and one non-spatial parametric method, hotspots conditional on presence. Our objective was to delineate a single hotspot value per location (i.e., 5 x 5 km grid cell), using a post-hoc integrated hotspot modeling approach, for each of the selected species and species groups. Our combined model showed Lake St. Clair and western Lake Erie had more hotspots than expected for half the species analyzed, which is likely due to the shallow depths of these two lakes. Lakes Michigan and Huron exhibited a higher 38 proportion of hotspots than expected for long-tailed duck. The difference in pattern between the diving/sea duck group and long-tailed duck suggests that diving and sea ducks exhibit very different distributional patterns and should (given sufficient data) be split out and analyzed separately. Uneven sampling across the Great Lakes region affects confidence that locations are or are not true hotspots, but using a combined hotspot approach can help alleviate some concerns associated with limited data availability. Our integrated modeling approach increases the consistency of hotspot detection, increasing accuracy in assessments of waterbird spatial patterns. INTRODUCTION Waterbird species have evolved in highly variable conditions (e.g., changes in temperature or ice cover), but rapid global changes within the last several decades pose new challenges (Montevecchi et al. 2012; Perry et al. 2005). One primary threat to waterbird viability and diversity is habitat loss due to human activities, such as energy development (e.g., gas, oil, and wind), commercial and recreational fishing, open water shipping, and coastal development (Garthe and Huppop 2004, Williamson et al. 2013, Adam et al. 2015, Kuletz et al. 2015). Another serious threat is increasing climate variability as open waterbird species respond to both small- and large-scale changes in weather conditions (e.g., daily to decadal temperature fluctuations). As such, waterbirds can serve as indicators of aquatic ecosystem health and habitat quality (Reese and Brodeur 2006, Piatt et al. 2007, Nur et al. 2011, Croxall et al. 2012, Santora and Sydeman 2015). They also provide regulatory (e.g., pest control, disease surveillance, regime shifts), cultural (e.g., recreational hunting, birdwatching, art), and provisionary services (e.g., meat, feathers for clothing, grease for waterproofing; Green et al. 2014). Because 39 waterbirds play important roles in ecosystems, their distributions and abundance patterns provide valuable information that can be used to make coordinated decisions for effective and sustainable resource allocation. Waterbird populations exhibit temporal and spatial variability in their abundances and distributions (Virkkala 2016). Their habitat use is often spatially and temporally unpredictable; however, individuals do move regularly and respond to dynamic climate conditions (Certain 2007, Piatt et al. 2007, Votier et al. 2008). These patchy aggregations make it difficult to assess waterbird spatial distributions (Santora and Veit 2013, Zipkin et al. 2015). Waterbird species patterns also vary throughout the year, based on habitat preference for breeding, overwintering, and migration locations. In open water ecosystems where both the environment and species distributions are highly dynamic, the probability that a single survey event is representative of the true population size at a particular location is low (Santora and Veit 2013, Marchese 2015). For these reasons, surveying waterbird populations is extremely difficult (Zipkin et al. 2015), and often results in data gaps. Understanding the dynamics and ecology of waterbird populations is essential to making informed conservation decisions. One of the leading methods to determining patterns of waterbird species is to identify hotspots, or locations of persistent aggregation. Conservation of hotspot locations is an effective way to protect many species simultaneously, while minimizing cost and resources. Hotspots can be identified at any geographic scale and are frequently representative of broader conservation priorities (Nur et al. 2011, Marchese 2015). By applying hotspot models, we can overcome many challenges (e.g., high cost) associated with limited datasets and identify important locations for both habitat and species conservation. 40 The Great Lakes play a unique and important role in the development of North America’s natural resources (EPA 1995), containing the second largest accumulation of fresh water on earth and more than 16,000 km of shoreline in eight U.S. states and one Canadian province (Prince et al. 1992, Gronewold 2013). The Great Lakes provide agriculture and drinking water to both the United States and Canada, allows for the transport of goods within North America, and are economically important as they connect the rich agricultural and mining regions in the west with the great industrial areas and large cities in the east (EPA 1995, Gronewold 2013). Apart from these economic services, the Great Lakes are also vitally important for wildlife, and birds in particular. More than 350 bird species (including many waterbird species) migrate through the Mississippi Flyway in the spring and fall, and rely upon habitat in and around the Great Lakes to provide essential resources throughout their journey (GLRI 2010). An array of habitats (e.g., open water, mudflats, and marshes) in the Great Lakes region produce food, provide cover and roosting areas, and therefore attract a variety of species journeying both north and south during migration (Rodewald and Ewert 2008). Coastal wetlands serve as critical migratory, breeding, and foraging habitat for many species (Riffell et al. 2003). Both nearshore (i.e., water 0-80 m deep) and offshore (i.e., water deeper than 80m) areas provide essential foraging habitat, ice-free open water, and vital roosting sites (Kreitinger et al. 2013). In addition to affecting waterbird species, increasing climate variability also impacts the Great Lakes. The effects of climate change on the Great Lakes are reflected directly through increases in both air and water temperatures and indirectly through changes in precipitation and reduction in lake water levels (Mortsch and Quinn 1996, Lofgren et al. 2011). These changes have the potential to a) affect the quality of aquatic habitats; b) alter species composition and dynamics; and c) cause socioeconomic problems through navigational challenges and the loss of hydropower and 41 shipping resources (Mortsch and Quinn 1996, Angel and Kunkel 2009). The Great Lakes are both economically and culturally important and as waterbird species are useful bioindicators, understanding their patterns and movement can be ecologically and economically beneficial to the Great Lakes basin. In this study, we combine two hotspot analysis methods, Getis-Ord Gi* (Gi*; a spatial model) and “hotspots conditional on presence” (a parametric GLM-based model), to identify hotspots for waterbird species in the Great Lakes (Sussman et al. in review). Spatial models identify clusters of grid cells as hotspots, which may be beneficial for highly mobile organisms such as waterbirds that move constantly in response to climate variability and food availability (Marchese 2015, Harvey et al. 2017). In contrast, non-spatial models typically identify individual grid cells as hotspots, and often incorporate environmental variables into the analysis. In ecology, non-spatial hotspot delineation methods are more common than spatial methods, but use arbitrary thresholds or metrics to delineate hotspots (Harvey et al. 2017). Yet, non-spatial approaches most often identify comparatively smaller locations as hotspots, which may be beneficial for conservation and management, and are less likely to overestimate hotspot locations (Harvey et al. 2017). Typically, these models are used in isolation; researchers usually use a singular approach to identify hotspots. Combining the two model types can provide a balance between the often more conservative GLM-based and less conservative spatial modeling techniques (Sussman et al. in review). 42 METHODS Study area & data description The data come from five different survey entities: (Biodiversity Research Institute (BRI), Michigan Department of Natural Resources, Michigan Natural Features Inventory [Michigan State University Extension], U.S. Geological Survey, and Western Great Lakes Bird and Bat Observatory) and are publicly available in the Midwest Avian Data Center (MWADC), a regional node of the Avian Knowledge Network (AKN) hosted by Point Blue Conservation Science (http://data.pointblue.org/partners/mwadc/). The raw data consist of aerial visual observations along survey transects (Fig. 2.1); observers recorded species, number of individuals seen, and latitude and longitude (using onboard GPS). Observers identified birds to the species level, or the lowest taxonomic group possible. As waterbirds tend to aggregate, many large flocks, covering large areas (i.e., up to 30 square kilometers) were recorded. For such flocks, the recorded location is an approximation to the center of the flock. The data were collected from late September 2012 through early June 2014 during fall, winter, and spring seasons (Table 2.1). The research entities conducted anywhere from one to twenty-nine distinct surveys, culminating in uneven sampling across the study region. Additionally, BRI conducted surveys only in the final year of data collection. In cases where the entity did not include transect attribution for individual observations, we used incremental buffering by 1-m to identify the closest transect to each observation record. We used data collected on the location of the transect line when available and GIS to reconstruct the transect lines from observations in instances when that information was not recorded. 43 Research Entity Biodiversity Research Institute Michigan Department of Natural Resources Michigan Natural Features Inventory United States Geological Survey Western Great Lakes Bird & Bat Observatory Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.1. Great Lakes study area, with aerial-survey transects flown over selected areas of the region during the fall, winter, and spring seasons from September 2012 to June 2014. Survey regions are colored according to the research entity who collected the data. Table 2.1. Temporal distribution of unique aerial-surveys conducted from September 2012 to June 2014 and summed across all research entities. Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Total 2012 0 0 0 0 0 0 0 0 1 3 8 5 17 2013 2 5 11 10 3 0 0 0 1 7 12 6 57 2014 3 3 4 11 6 1 0 0 0 0 0 0 28 Total 5 8 15 21 9 1 0 0 2 10 20 11 102 44 Species groups & data standardization We selected seven species and species groups (long-tailed duck, common loon [Gavia immer], gulls [Larus spp.], mergansers [Mergus spp. and Lophodytes spp.], scaup [Aythya spp.], loons [Gavia spp.], and diving/sea ducks [Aythya spp., Bucephala spp., Clangula hyemlais, Melanitta spp., Mergus spp., Oxyura jamaicensis, and Somateria spp.]) to analyze. We chose these species and species groups (hereafter referred to as species groups) because they exhibited a relatively even distribution across the study area and were identified by regional managers and stakeholders as species of interest. We also combined the data from all seven species groups to identify potential hotspots for an all-species-combined group. Individual species that appeared in multiple groups were only used once in the combined analysis (i.e., not double counted). We standardized the data using a 5-km2 grid that was superimposed over the entire Great Lakes region. We segmented the transects by the grid cell layer, so that all grid cells contained only the portion of each transect that occurred within the cell. We calculated the total length of the transect segment(s) within a grid cell (in kilometers) for each unique sampling event (i.e., survey) and included only those cells that contained a total transect length of at least 1-km. To account for unequal sampling effort across survey blocks, we divided the number of observations of a species group for the sampling event-grid cell combination by the summed transect length, resulting in a continuous effort-corrected count. We used data from all sampling events and grid cells surveyed in our analysis, though we limited the hotspot maps to grid cells that contained at least four sampling events to limit potential identification of false hotspots (Sussman et al. in review). 45 Hotspot analysis We selected two fundamentally different hotspot analysis models for our integrative approach: one spatial non-parametric and one non-spatial GLM-based model. Spatial models use data from a target location and its surrounding areas to detect patterns and derive areas of high density (Harvey et al. 2017). We chose Getis-Ord Gi* (Gi*; Getis and Ord 1992) for the spatial component of our combined model because it can identify spatially explicit hotspot locations and is becoming a widely-used spatial analysis tool in ecology (Kuletz et al. 2015). Given a set of weighted data points, Gi* identifies clusters of points with values higher in magnitude than expected by random chance (ESRI 2015, Kuletz et al. 2015). Gi* identifies hotspots based on the spatial relationship between a focal point (i.e., a grid cell) with its neighboring data points (or grid cells; Getis and Ord 1992, Santora et al. 2010, Kuletz et al. 2015). For this method, we assigned all observations to the midpoint of the grid cell within which the observation fell. We accounted for uneven sampling effort across the different research entities by dividing the summed effort-corrected count for a grid cell by the total number of unique sampling events for that grid cell. Because the model relies on a set of weighted points, we created a neighbor list, which identifies the relationship of each grid cell relative to its surrounding grid cells. There are different types of neighbor lists that can be built; we used a standard rook’s case contiguity neighbor list which identifies neighbors as those grid cells that share a border with the focal grid cell (spdep package in R; Bivand et al. 2013, Bivand and Piras 2015). Using the neighbor list, we calculated a row-standardized spatial weights matrix as our set of weighted data points, which informs every grid cell’s relationship to all other cells within the neighborhood (spdep package in R; Bivand et al. 2013, Bivand and Piras 2015, Kuletz et al. 2015). The spatial analysis resulted 46 in a z-score for every grid cell that represents the statistical significance of clustering, where high positive values are more likely to be considered hotspots than low values. Non-spatial hotspot models typically use statistical distributions to describe a population, with a defined metric or a threshold to identify patterns at independent locations (Oppel et al. 2012, Zipkin et al. 2015, Santora and Veit 2013). We used a GLM-based parametric model termed “hotspots conditional on presence” (Kinlan et al. 2012, Zipkin et al. 2015, Sussman et al. in review). This non-spatial approach combines count data from all sampling events and identifies individual grid cells as hotspots if the long-term average abundance is at least three times greater than the regional mean (Zipkin et al. 2015). To implement this hotspots approach, we first estimated the prevalence of the species group as the proportion of grid cells with occurrences relative to the total number of grid cells surveyed. We then fit the non-zero effortcorrected counts using a lognormal distribution (fitdistrplus package in R; Delignette-Muller and Dutang 2015). We chose the lognormal over other distributions because it is a positive (does not support zeros), continuous probability distribution which is well established to fit heavily rightskewed waterbird data (Zipkin et al. 2014). We simulated data using a Monte Carlo approach to calculate the probability that a particular grid cell with an observed mean count was a hotspot (Kinlan et al. 2012, Zipkin et al. 2015). The Monte Carlo simulation has two components: the first is a count component, for which we used the estimated mean and standard deviation from the lognormal distribution, and the second is a Bernoulli component, for which we used the prevalence estimate. A grid cell was considered a hotspot when the long-term average effortcorrected count (conditional on presence) was at least three times the mean of the reference region (also conditional on presence). The parametric model resulted in a value from zero to one for every grid cell which represents the proportion of simulated sample means that are greater 47 than three times the average count, where high values close to one are more likely to be hotspots than low values near zero. Combining hotspot models Our goal was to combine the results from the two different hotspot methods to assign a single hotspot value per grid cell. To do this, we ordered each method’s set of resulting values (i.e., range: (-∞, ∞) for Gi* and (0, 1) for hotspots conditional on presence for each grid cell) for all surveyed cells from smallest to largest. We then ranked each grid cell from 1 to 1767 (the total number of grid cells surveyed) based on its value from each model (i.e., two rankings, one for each analysis method). Grid cells whose values were the same were considered tied and were assigned the lowest rank (i.e., the minimum rank of the tied values) among those cells. Once all grid cells were ranked, we calculated the corresponding percentile by dividing the rank by the total number of grid cells surveyed. After calculating percentiles for each grid cell with each method, we calculated the average percentile of the two methods for all grid cells, resulting in a value from zero to one where high values closer to one are more likely to be hotspots than low values closer to zero. We produced a single hotspot map for each species group to highlight potential hotspot locations. We binned the percentiles for mapping purposes and consider a species hotspot as any grid cell with an average value above the 75th percentile (Sussman et al. in review). RESULTS Five out of eight species groups (diving/sea ducks, gulls, mergansers, scaup, and the allspecies-combined group; Table 2.2) had a higher percentage of hotspots (≥ 75th percentile) than 48 expected relative to the number of grid cells surveyed within Lake St. Clair (4.24%) and western Lake Erie (2.38%). The diving/sea duck group exhibited the highest proportion of grid cells considered hotspots in both Lake St. Clair and western Lake Erie (16.97% and 8.37%, respectively; Table 2.2). All-species-combined had the same proportion of grid cells considered hotspots as diving/sea ducks (16.95%) in Lake St. Clair, and the second highest proportion in western Lake Erie (8.14%; Table 2.2). Of the total number of blocks that were surveyed, 5.43% were in eastern Lake Erie. Half of the species groups analyzed exhibited a greater number of hotspots in eastern Lake Erie than expected by chance, including gulls (13.57%), mergansers (13.57%), loons (15.99%), and common loon (17.87%). Conversely, long-tailed ducks had no hotspots in the entirety of Lake Erie or Lake St. Clair. Instead, long-tailed ducks had a higher proportion of hotspots than was expected in Lakes Huron (18.33%), and Lake Michigan (81.67%; Table 2.2). Long-tailed ducks were the only species analyzed to have a higher proportion of hotspots than expected by chance in Lakes Huron and Michigan, although allspecies-combined group also had a large proportion of hotspots in both of these lakes, likely due to the large number of long-tailed ducks in the data. In Lake St. Clair and western Lake Erie, the combined model results showed hotspots both nearshore and offshore for all-species-combined, whereas in Lake Michigan and eastern Lake Erie more hotspots were found along the coastline (Fig. 2.2). Diving/sea ducks exhibited a similar pattern, which was expected as they make up 70% of all observations (Fig. 2.3). Both diving/sea ducks and all-species-combined demonstrated low proportions (6.78% and 4.3%, respectively; Table 2.2) of hotspots in Lake Huron relative to the number of grid cells surveyed (8.83%), with almost all hotspots falling within the 75-90th percentile (Figs. 2.2-2.3). The combined hotspot model identified clusters of grid cells as gull hotspots along the coastline in all 49 lakes except Lake Huron as well as offshore locations in Lake Erie and northern Lake Michigan (Fig. 2.4). Gull species had the second lowest percentage of hotspots in Lake Huron (1.58%) relative to the total number of grid cells surveyed within the lake (8.83%). Gulls and mergansers are the only two species groups to have a higher proportion of hotspots than expected due to random chance, in both surveyed areas of Lake Erie (western and eastern; Table 2.2). Longtailed ducks exhibited hotspots throughout the lakes in which they were found in, rather than strictly in near or offshore locations, like the other species groups (Fig. 2.5). Mergansers had the lowest proportion of hotspots relative to grid cells surveyed within Lake Huron (1.13%, Table 2.2, Fig. 2.6). In all lakes except St. Clair, more hotspots along the coastline than in offshore areas were identified for scaup species (Fig. 2.7). Every grid cell surveyed in Lake St. Clair was a hotspot for scaup, with the highest proportion of hotspots above the 90th percentile for all species analyzed (Fig. 2.7). Loons and common loon are the only group/species to have a higher proportion of hotspots (relative to the surveyed grid cells) in eastern Lake Erie (Table 2.2). They also had the second (79.05%) and third (78.51%) highest percentage of hotspots relative to all grid cells surveyed in Lake Michigan (79.12%; Table 2.2). No hotspots were identified for all loon species and common loons in Lake St. Clair or in western Lake Erie (Figs. 2.8-2.9). However, both species groups exhibited hotspots in the near and offshore, in eastern Lake Erie and in portions of northern and southern Lake Michigan (Figs. 2.8-2.9). There were no hotspots above the 90th percentile in the area surveyed in the middle of Lake Michigan; scaup, loons, and common loons had few hotspots (75-90th percentile) on the outermost grid cells of this region (Figs. 2.2-2.9). 50 Table 2.2. Percent hotspots within each lake for each species/group and the all-species-combined group. The top row of the graph shows the percent of all grid cells surveyed (out of 1767) within each lake. When comparing across the lakes, it is important to note percentages are relative to the number of grid cells surveyed within each lake. Bold font indicates that the percent values are greater than the proportion surveyed in that lake, and indicate more hotspots than would be expected by chance for a particular species group in a particular lake. Lake Lake Lake Eastern Western Species Huron Michigan Lake Erie Lake Erie St. Clair Percent of all cells 8.83% 79.12% 5.43% 2.38% 4.24% All-species-combined 4.30% 65.38% 5.20% 8.14% 16.97% Diving/Sea Ducks 6.79% 64.25% 3.62% 8.37% 16.97% Gulls 1.58% 69.91% 13.57% 7.24% 7.69% 18.33% 81.67% 0.00% 0.00% 0.00% Mergansers 1.13% 62.67% 13.57% 7.69% 14.93% Scaup 7.22% 66.30% 4.81% 7.78% 13.89% Loons 4.73% 79.05% 15.99% 0.23% 0.00% Common Loon 3.39% 78.51% 17.87% 0.23% 0.00% Long-tailed Duck 51 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.2. Potential hotspots (values above the 75% percentile) across all sampled locations for all-species-combined (includes: all diving/sea ducks, gulls, loons, mergansers, and scaup) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 52 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.3. Potential hotspots (values above the 75% percentile) across all sampled locations for diving/sea ducks (includes: bufflehead, canvasback, common eider, long-tailed duck, redhead, ring-necked duck, ruddy duck, all eiders, all goldeneye, all mergansers, all scaup, all scoters, all unidentified diving ducks) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 53 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.4. Potential hotspots (values above the 75% percentile) across all sampled locations for gulls (includes: Bonaparte's gull, glaucous gull, great black-backed gull, herring gull, Iceland gull, mew gull, ring-billed gull, all unidentified gulls) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 54 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.5. Potential hotspots (values above the 75% percentile) across all sampled locations for long-tailed duck [Clangula hyemalis] as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 55 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.6. Potential hotspots (values above the 75% percentile) across all sampled locations for mergansers (includes: common merganser, hooded merganser, red-breasted merganser, all unidentified mergansers, all unidentified merganser/goldeneye) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 56 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.7. Potential hotspots (values above the 75% percentile) across all sampled locations for scaup (includes: greater scaup, lesser scaup, all unidentified scaup) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 57 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.8. Potential hotspots (values above the 75% percentile) across all sampled locations for loons (includes: common loon, redthroated loon, all unidentified loons) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 58 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure 2.9. Potential hotspots (values above the 75% percentile) across all sampled locations for common loon [Gavia immer] as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 59 DISCUSSION Our combined hotspot model shows that Lake St. Clair and western Lake Erie have more hotspots than expected relative to the number of grid cells surveyed in these two lakes. Within the context of this study, these areas are important for many waterbird species and should continue to be monitored, especially when considering future conservation objectives. On average, these two lakes are the shallowest, even at their maximum depth, of all lakes in the Great Lakes region (Bolsenga and Herdendorf 1993, EPA 1995). The shallow water depths of these lakes provide vital food and habitat resources for nesting colonial birds, migrating waterfowl and other waterbirds (Riffell et al. 2001, Monfils and Gehring 2012). For example, during the breeding season, female gulls have been shown to forage in nearshore and intertidal areas where prey species aggregate, resulting in less time spent traveling to and from the foraging area (Fox et al. 1990). By doing so, they are able to make more trips, and return to their young faster leaving them unprotected less often than if they foraged further offshore. Our results indicate that gulls are not the only species group to aggregate along the coastline more than offshore. In Lake Michigan and eastern Lake Erie in particular, diving/sea ducks, longtailed ducks, mergansers, and scaup showed more hotspots near the coast rather than offshore. Coastal wetlands have experienced increased degradation due to construction and development, destroying valuable habitat for wildlife, particularly waterfowl (EPA 1995). Despite the urbanization along the coasts, many waterbird species are aggregating in areas with remaining suitable habitat. Shallow areas nearshore provide more foraging opportunities and easier access to prey items than deeper offshore waters. Identifying locations of high-use along the coastline can help determine areas in need of protection from future development. 60 We chose to combine two of the four models, one spatial (Gi*) and one non-spatial parametric (hotspots conditional on presence) model, examined by Sussman et al. (in review). We selected these two models because although they demonstrated low to moderate correlation (0.075-0.670), they were more similar to one another more often than with other spatial and parametric methods (i.e., kernel density estimation and “hotspot persistence”; Sussman et al. in review) in identifying hotspots. We chose the hotspots conditional on presence approach for the parametric, non-spatial component of the combined model because simulating data using the Monte Carlo approach works well with sparse data. We selected the Getis-Ord Gi* approach as the spatial component because it indicates the statistical significance of identified hotspots and is less likely to identify false hotspots than other spatial modeling techniques (Santora et al. 2010, Kuletz et al. 2015). However, combinations of other model types could lead to different results. For example, although Sussman et al. (in review) found a high correlation between the two spatial models tested (kernel density estimation and Gi*), there is no evidence that one may be more accurate than the other. To test the robustness of our results, we reran our combined models in two ways: 1) using kernel density estimation instead of G*; and 2) by averaging kernel density and Gi* models. We found no discernable differences in the results. Kernel density estimation and Gi* are inherently similar; however, Gi* is slightly more conservative than kernel density estimation. Kernel density estimation requires more user-based input than Gi* and can therefore be influenced to produce more appealing or desirable results (Kuletz et al. 2015). Both spatial methods identify clusters of individual grid cells as hotspots, but Gi* is less influenced by surrounding grid cells that are farther away than kernel density estimation, resulting in greater confidence in central (to the cluster) locations being true hotspots. Likewise, there are other nonspatial models and techniques that have been used to identify hotspots. Typically, non-spatial 61 methods differ by threshold or metric used to define a hotspot. For example, the more commonly used species-based metric models focus on species richness, number of species within a region, or number of endemic, rare, or threatened species (Marchese 2015). Other non-spatial models focus on phylogenetic diversity and the diversity of ecological traits (Marchese 2015). These metrics are often arbitrary; choosing a different threshold can result in a different outcome. Ideally, hotspots should be identified for each species group by season, as distributions vary throughout the year. However, the quantity of data available can prevent the identification of seasonal hotspot patterns. Although our data span three seasons (fall, winter, and spring) over two years, there were not enough unique sampling events to break down the data into seasons. Zipkin et al. (2015) found that at least 40 sampling events are necessary to have adequate statistical power (>0.6) to detect species-specific seasonal hotspots for the most prevalent species, and greater than 100 sampling events for less common species. The sampling scheme in our analysis was quite uneven across the study region, ranging from one to twenty-nine sampling events. Yet, even the areas with the most surveys do not reach the minimum suggested number of sampling events. In our analysis, we chose to include all surveyed grid cells, regardless of the number of sampling events. However, in presenting hotspot maps, we only display hotspots for grid cells surveyed at least four times. This threshold allowed us to incorporate data from all survey entities into our analysis, increasing the geographic range; but may be misleading by identifying hotspot locations in cases in which grid cells were only sampled a limited number of times. The likelihood of detecting false hotspots (or failing to detect hotspots when they are actually present, type II error) decreases as the number of sampling events increases (Kinlan et al. 2012). Our results indicate where hotspots may be located but data were limited; therefore, more data would help identify hotspot locations with greater accuracy. 62 Slightly greater than 20% of all observations were not identified to the species-level. To utilize all available data, we grouped species at higher taxonomic levels and conducted species group rather than species-specific analyses. Our results for species groups are useful, but for species-specific concerns and conservation and management decisions, our group level analysis may not be ideal. The combined hotspot model demonstrates that long-tailed ducks do not exhibit the same pattern of aggregation as the diving/sea ducks within the Great Lakes region. Therefore, it is reasonable to believe that some species groups (e.g., diving and sea ducks) should probably be further split and analyzed separately as they exhibit different distributions (D. Luukkonen and M. Monfils, personal communication, 2017). Splitting the diving (tribe Aythyini) and sea (tribe Mergini) ducks into two different species groups may result in drastically different patterns across the Great Lakes. In an effort to discern hotspots for diving and sea ducks, we analyzed diving and sea ducks separately. Our combined hotspot model showed that diving and sea ducks do exhibit different distributions (Appendix 2, Figs. S2.1-S2.2). Diving ducks exhibited a higher proportion of hotspots than expected by chance in Lakes Erie and St. Clair. However, sea ducks showed a higher proportion of hotspots than expected by chance in Lakes Huron and Michigan, which is consistent with the pattern showed by long-tailed duck (Appendix 2, Table S.1). Although this supplemental analysis demonstrates that diving and sea ducks have different patterns in the Great Lakes, and should be split out for analyses, caution should be taken as there were not enough data to be confident in the detected hotspots. The inconsistency in sampling across research entities demonstrates that although data collected by many different groups can be incorporated into a single analysis, it is necessary to consider how different protocols and survey methods may affect results. There are likely temporal biases in our data as some research entities selected survey periods to coincide with 63 seasonal patterns and migration, focusing surveys around peaks of abundance for specific species. Such a sampling scheme can increase targeted species’ abundances while decreasing abundances of non-targeted species. Targeted species sampling may also introduce observer bias, as observers may disregard species they are not specifically interested in surveying or only identify non-targeted species to family or genus level rather than to the lowest possible taxonomic rank (i.e., species). These inconsistencies demonstrate the need for standardized survey methods. Standardized methods are important, particularly when multiple agencies or organizations are collecting data, and will reduce many potential biases. The data used in this study are also limited spatially. This is apparent in the distribution of sampling effort, where large areas of each lake were not surveyed. Had we surveyed in other portions of the lakes or included data collected outside the temporal scope of the study, we would perhaps have seen different patterns for some species groups. It is unrealistic to think we will ever be able to survey the entirety of the Great Lakes, but increasing the spatial coverage in future survey efforts, and thereby increasing the amount of data, will possibly allow us to predict where hotspots may occur in those areas not sampled. Future hotspot models should consider incorporating biotic and abiotic environmental predictors (e.g., ice cover, bathymetry, distance to shoreline), which have the potential to improve estimates of hotspot locations. The hotspot concept works extremely well within a static environment (e.g., coral reefs), but in a pelagic environment where the physical and chemical conditions of marine ecosystems are constantly changing, hotspot models can be difficult to apply (Marchese 2015). Hotspots may be a function of biophysical aggregation, where environmental factors lead to increased nutrient levels and prey availability for waterbird species (Hazen et al. 2013). Incorporating environmental variables into analyses can lead to more 64 accurate hotspot identification as well as predicting species responses to highly dynamic physical processes. Human activities such as alternative energy exploration (e.g., offshore wind), shipping, fishing, and coastal development may lead to habitat loss and degradation resulting in negative impacts on waterbird populations. Decisions on where to implement such activities should be carefully considered to minimize loss of both habitat and species. Combining multiple hotspot analysis methods in an integrated modeling approach can increase consistency in the identification of waterbird hotspots (Daru et al. 2015, Marchese 2015, Sussman et al. in review). Additionally, a combined approach may prevent false hotspot identification or ‘true’ hotspots being overlooked (Nur et al. 2011). Future studies can incorporate different models and metrics into a single hotspot analysis tool, thereby improving the ability to delineate hotspots and identify waterbird spatial patterns. With over 16,000 km of coastline, the Great Lakes will continue to provide drinking water, commercial transport, and recreational opportunities to the more than 30 million people in the United States and Canada within the basin (EPA 1995). Continued urbanization and development will impact the natural habitat along the lakes thereby limiting the availability of suitable habitat for waterbird species. Therefore, identifying waterbird hotspot locations will help to inform conservation and management decisions regarding the preservation of the Great Lakes ecosystem for future generations. 65 APPENDIX 66 Analysis of diving/sea ducks split into two distinct groups. Results should be interpreted with caution due to limited data availability. Diving ducks (tribe Aythyini) include canvasback, redhead, ring-necked duck, ruddy duck, and all scaup species. Sea ducks (tribe Mergini) include bufflehead, common eider, long-tailed duck, all goldeneye species, all merganser species, and all scoter species. Table S2.1. Percent hotspots within each lake for diving and sea duck species groups. The top row of the graph shows the percent of all grid cells surveyed (out of 1767) within each lake. When comparing across the lakes, it is important to note percentages are relative to the number of grid cells surveyed within each lake. Bold font indicates that the percent values are greater than the proportion surveyed in that lake, and indicate more hotspots than would be expected by chance for a particular species group in a particular lake. Lake Huron Lake Michigan Eastern Lake Erie Western Lake Erie Lake St. Clair Percent of all blocks 8.83% 79.12% 5.43% 2.38% 4.24% Diving Ducks 1.13% 63.57% 8.82% 9.50% 16.97% 10.18% 79.86% 2.71% 1.58% 5.66% species Sea Ducks 67 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure S2.1. Potential hotspots (values above the 75% percentile) across all sampled locations for diving ducks (includes: canvasback, redhead, ring-necked duck, ruddy duck, and all scaup) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 68 Hotspots <25th 25-50th 50-75th 75-90th 90-95th 95-99th >99th sampled less than 4 times Green Bay Lake Huron Lake Ontario Lake Michigan Lake St. Clair Lake Erie Figure S2.2. Potential hotspots (values above the 75% percentile) across all sampled locations for sea ducks (includes: bufflehead, common eider, long-tailed duck, all goldeneye, all mergansers, and all scoters) as defined by the integrated hotspot modeling approach. Grid cells sampled less than four times were excluded from the analysis and are shaded in gray. 69 LITERATURE CITED 70 LITERATURE CITED Adam M., Musilova S., Musil P., Zouhar J., and Romportl D. 2015. Long-term changes in habitat selection of wintering waterbirds: High importance of cold weather refuge sites. Museum and Institute of Zoology, Polish Academy of Sciences 50(2):127-138. Anderson M. 2017. The big four - diving ducks. Wetlands, Conservation, Waterfowl, Duck Hunting - World Leader in Wetlands Conservation - Ducks Unlimited. . Angel J.R. and Kunkel K.E. 2009. The response of Great Lakes water levels to future climate scenarios with an emphasis on Lake Michigan-Huron. Journal of Great Lakes Research 36:51-58. Araujo M.B. 2002. Biodiversity hotspots and zones of ecological transition. Conservation Biology 16(6):1662-1663. Beyer H. 2014. Kernel density estimation in Introducing the Geospatial Modelling Environment. Spatial Ecology LLC. Version: 0.7.2 RC2. . Bivand R. and Piras G. 2015. Comparing implementations of estimation methods for spatial econometrics. Journal of Statistical Software 63(18):1-36. Bivand R.S., Hauke J., and Kossowski T. 2013. Computing the Jacobian in Gaussian spatial autoregressive models: An illustrated comparison of available methods. Geographical Analysis 45(2):150-179. Bolker B. 2008. Chapter 4: Probability distributions in Ecological Models and Data in R. Princeton University Press, Princeton, NJ. 508pp. Bolsenga S.J. and Herdendorf C.E. 1993. Lake Erie and Lake St. Clair Handbook. Wayne State University Press, Detroit MI. Canadas E.M., Fenu G., Penas J., Lorite J., Mattana E., and Bacchetta G. 2014. Hotspots within hotspots: Endemic plant richness, environmental drivers, and implications for conservation. Biological Conservation 170:282-291. Carolan M.S. 2009. This is not a biodiversity hotspot: The power of maps and other images in the environmental sciences. Society and Natural Resources 22(3):278-286. Certain G., Bellier E., Planque B., and Bretagnolle V. 2007. Characterising the temporal variability of the spatial distribution of animals: an application to seabirds at sea. Ecography 30:695-708. 71 Conway J., Eddelbuettel D., Nishiyama T., Prayaga S.K., and Tiffin N. 2013. RPostgreSQL: R interface to the PostgreSQL database system. R package version 0.4. . Croxall J.P., Butchart S.H.M., Lascelles B., Stattersfield A.J., Sullivan B., Symes A., and Taylor P. 2012. Seabird conservation status, threats and priority actions: a global assessment. Bird Conservation International 22:1-34. Daru B.H., van der Bank M., and Davies T.J. 2015 Spatial incongruence among hotspots and complementary areas of tree diversity in southern Africa. Biodiversity Research 21:769780. Davoren G.K. 2007. Effects of gill-net fishing on marine birds in a biological hotspot in the northwest Atlantic. Conservation Biology 21(4):1032-1045. Delignette-Muller M.L. and Dutang C. 2015. fitdistrplus: An R package for fitting distributions. Journal of Statistical Software 64(4):1-34. . Dennis B. and Patil G.P. 1983. The gamma distribution and weighted multimodal gamma distributions as models of population abundance. Mathematical Biosciences 68(2):187212. EPA 1995. The Great Lakes: An Environmental Atlas and Resource Book. Government of Canada, Toronto, Canada and U.S. Environmental Protection Agency, Chicago, Illinois. 3rd edition 51 pp. ESRI 2015. ArcGIS Desktop: Release 10.3.1. Redlands, CA: Environmental Systems Research Institute. Fox G.A., Allan L.J., Weseloh D.V., and Mineau P. 1990. The diet of herring gulls during the nesting period in Canadian waters of the Great Lakes. Canadian Journal of Zoology 68:1075-1085. Garthe S. and Huppop O. 2004. Scaling possible adverse effects of marine wind farms on seabirds: developing and applying a vulnerability index. Journal of Applied Ecology 41:724-734. Gaston K.J. and David R. 1994. Hotspots across Europe. Biodiversity Letters 2(4):108-116. Getis A. and Boots B. 1978. Models of spatial processes: an approach to the study of point, line, and area patterns. Cambridge University Press, Cambridge, England. 198pp. Getis A. and Ord J.K. 1992. The analysis of spatial association by use of distance statistics. Geographical Analysis 24(3):189-206. GLRI Great Lakes Restoration Initiative. 2010. 72 Great Lakes Fish and Wildlife Restoration Act (GLFWRA) of 2006. S. 2430 - 109th Congress. . Green A.J. and Elmberg J. 2014. Ecosystem services provided by waterbirds. Biological Reviews 89:105-122. Gronewold A.D., Fortin V., Lofgren B., Clites A., Stow C.A., and Quinn F. 2013. Coasts, water levels, and climate change: A Great Lakes perspective. Climatic Change 120:697-711. Harcourt A.H. 1999. Coincidence and mismatch of biodiversity hotspots: a global survey for the order, primates. Biological Conservation 93:163-175. Harris N.L. 2017. Using spatial statistics to identify emerging hot spots of forest loss. Environmental Research Letters 12:1-13. Harvey G.K.A., Nelson T.A., Fox C.H., and Paquet P.C. 2017. Quantifying marine mammal hotspots in British Columbia, Canada. Ecosphere 8(7):1-22. Hengl T. 2006. Finding the right pixel size. Computers and Geosciences 32:1283-1298. Hobday A.J. and Pecl G.T. 2014. Identification of global marine hotspots: sentinels for change and vanguards for adaptation action. Reviews in Fish Biology and Fisheries 24:415-425. Hyrenbach K.D., Forney K.A., and Dayton P.K. 2000. Marine protected areas and ocean basin management. Aquatic Conservation: Marine and Freshwater Ecosystems 10:437-458. Johnson S.M., Connelly E.E., Williams K.A., Adams E.M., Stenhouse I.J., and Gilbert A.T. 2015. Integrating data across survey methods to identify spatial and temporal patterns in wildlife distributions in: Wildlife densities and habitat use across temporal and spatial scales on the Mid- Atlantic Outer Continental Shelf: Final report to the Department of Energy Efficiency and Renewable Energy Wind & Water Power Technologies Office. Williams K.A., Connelly E.E., Johnson S.M., and Stenhouse I.J. (eds.) Award Number: DE-EE0005362. Report BRI 2015-11, Biodiversity Research Institute, Portland, Maine. 56 pp. Kinlan, B.P., E.F. Zipkin, A.F. O’Connell, and C. Caldow. 2012. Statistical analyses to support guidelines for marine avian sampling: final report. U.S. Department of the Interior, Bureau of Ocean Energy Management, Office of Renewable Energy Programs, Herndon, VA. OCS Study BOEM 2012-101. NOAA Technical Memorandum NOS NCCOS 158. xiv+77 pp. Kreitinger K., Steele Y., and Paulios A. 2013. The Wisconsin All-bird Conservation Plan, v2.0. Wisconsin Bird Conservation Initiative. Wisconsin Department of Natural Resources. Madison, WI. 73 Kuletz K.J., Ferguson M.C., Hurley B., Gall A.E., Labunski E.A., and Morgan T.C. 2015. Seasonal spatial patterns in seabird and marine mammal distribution in the eastern Chukchi and western Beaufort seas: Identifying biologically important areas. Progress in Oceanography 136:175-200. Limpert E., Stahel W.A., and Abbt M. 2001. Log-normal distributions across the sciences: keys and clues. BioScience 51(5):341-352. Lofgren B.M., Hunter T.S., and Wilbarger J. 2011. Effects of using air temperature as a proxy for potential evapotranspiration in climate change scenarios of Great Lakes basin hydrolgogy. Journal of Great Lakes Research 37:744-752. Marchese C. 2015. Biodiversity hotspots: A shortcut for a more complicated concept. Global Ecology and Conservation 3:297-309. Mittermeier R.A., Turner W.R., Larsen F.W., Brooks T.M., and Gascon C. 2011. Global biodiversity conservation: The critical role of hotspots in Biodiversity Hotspots. pp 3-22. Monfils M.J. and Gehring J.L. 2012. Identifying migrant waterfowl and waterbird stopovers to inform wind energy development siting within Saginaw Bay. Michigan Natural Features Inventory, Lansing, MI. MNFI Report Number 2012-19. 50 pp. Montevecchi, W.A., Hedd A., McFarlane Tranquilla L., Fifield D.A., Burke C.M., Regular P.M., Davoren G.K., Garthe S., Robertson G.J., and Phillips R.A. 2012. Tracking seabirds to identify ecologically important and high risk marine areas in the western North Atlantic. Biological Conservation 156:62-71. Mortsch L.D. and Quinn F.H. 1996. Climate change scenarios for Great Lakes Basin ecosystem studies. Limnology and Oceanography 41(5): 903-911. Myers N., Mittermeier R.A., Mittermeier C.G., daFonseca G.A.B., and Kent J. 2000. Biodiversity hotspots for conservation priorities. Nature 403:853-858. Nelson T.A. and Boots B. 2008. Detecting spatial hot spots in landscape ecology. Ecography 31:556-566. North American Waterfowl Management Plan (NAWMP) 2004. Implementation Framework: Strengthening the Biological Foundation. Canadian Wildlife Service, U.S. Fish and Wildlife Service, Secretaria de Medio Ambiente y Recursos Naturales, 106 pp. Nur N., Jahncke J., Herzog M.P., Howar J., Hyrenbach K.D., Zamon J.E., Ainley D.G., Wiens J.A., Morgan K., Balance L.T., and Stralberg D. 2011. Where the wild things are: predicting hotspots of seabird aggregations in the California Current System. Ecological Applications 21: 2241-2257. 74 O’Brien S.H., Webb A., Brewer M.J., and Reid J.B. 2012. Use of kernel density estimation and maximum curvature to set Marine Protected Area boundaries: Identifying a Special Protection Area for wintering red-throated divers in the UK. Biological Conservation 156:15-21. Oppel S., Meirinho A., Ramírez I., Gardner B., O’Connell A.F., Miller P.I., and Louzao M. 2012. Comparison of five modeling techniques to predict the spatial distribution of seabirds. Biological Conservation 156:94-104. Orme C.D.L., Davies R.G., Burgess M., Eigenbrod F., Pickup N., Olson V.A., Webster A.J., Ding T., Rasmussen P.C., Ridgley R.S., Stattersfield A.J., Bennet P.M., Blackburn T.M., Gaston K.J., and Owens I.P.F. 2005. Global hotspots of species richness are not congruent with endemism or threat. Nature 436(18):1016-1019. Perry A.L., Low P.J., Ellis J.R., and Reynolds J.D. 2005. Climate change and distribution shifts in marine fishes. Science 308(5730):1912-1915. Piatt J.F., Sydeman W.J., and Wiese F. 2007. Introduction: a modern role for seabirds as indicators. Marine Ecology Progress Series 352:199-204. Piatt J.F., Wetzel J., Bell K., DeGange A.R., Balogh G.R., Drew G.S., Geernaert T., Ladd C., and Byrd G.V. 2006. Predictable hotspots and foraging habitat of the endangered shorttailed albatross (Phoebastria albatrus) in the North Pacific: implications for conservation. Deep-Sea Research II 53:387-398. Possingham H.P. and Wilson K.A. 2005. Turning up the heat on hotspots. Nature 436(18):919920. Prendergast J.R., Quinn R.M., Lawton J.H., Eversham B.C. & Gibbons D.W. 1993. Rare species, the coincidence of diversity hotspots and conservation strategies. Nature 365:335-337. Prince H.H., Padding P.I., and Knapton R.W. 1992. Waterfowl use of the Laurentian Great Lakes. Journal of Great Lakes Research 18(4):673-699. R Core Team 2015. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. . Reese D.C. and Brodeur R.D. 2006. Identifying and characterizing biological hotspots in the northern California Current. Deep-Sea Research II 53:291-314. Revelle W. 2015. psych: Procedures for personality and psychological research. Northwestern University, Evanston, Illinois, USA. . Riffell S.K., Keas B.E., and Burton T.M. Area and habitat relationships of birds in Great Lakes coastal wet meadows. Wetlands 21(4):492-507. 75 Rodewald P.G. and Ewert D. 2008. Managing habitats for migrating land birds in the western Lake Erie basin: A guide to landscaping and land management. The Nature Conservancy and Ohio Department of Natural Resources, Division of Wildlife. Santora J.A. and Sydeman W.J. 2015. Persistence of hotspots and variability of seabird species richness and abundance in the southern California Current. Ecosphere 6(11):1-19. Santora J.A. and Veit R.R. 2013. Spatio-temporal persistence of top predator hotspots near the Antarctic Peninsula. Marine Ecology Progress Series 487:287-304. Santora J.A., Reiss C.S., Loeb V.J., and Veit R.R. 2010. Spatial association between hotspots of baleen whales and demographic patterns of Antarctic krill Euphasia superba suggests size-dependent predation. Marine Ecology Progress Series 405:255-269. Schneider D.C. and Duffy D.C. 1985. Scale-dependent variability in seabird abundance. Marine Ecology Progress Series 25:211-218. Shirkey, B.T. 2012. Diving duck abundance and distribution on Lake St. Clair and western Lake Erie. M.S. Thesis, Michigan State University, East Lansing. Songchitruska P. and Zeng X. 2010. Getis-Ord spatial statistics to identify hot spots by using incident management data. Journal of the Transportation Research Board 2165:42-51. Suryan R.M., Kuletz K.J., Parker-Stetter, S.L., Ressler P.H., Renner M., Horne J.K., Farley E.V., and Labunski E.A. 2016. Temporal shifts in seabird populations and spatial coherence with prey in the southeastern Bering Sea. Marine Ecology Progress Series 549:199-215. Suryan R.M., Santora J.A., and Veit R.R. 2012. New approach for using remotely sensed chlorophyll a to identify seabird hotspots. Marine Ecology Progress Series 451:213-225. Sussman A.L., Gardner B., Adams E.M., Salas L., Kenow K.P., Luukkonen D.R., Monfils M.J., Mueller W.P., Williams K.A., Leduc-Lapierre M., and Zipkin E.F. In review. A comparative analysis of methods to identify species hotspots. Methods in Ecology and Evolution. Sydeman W.J., Brodeur R.D., Grimes C.B., Bychkov A.S., and McKinnell S. 2006. Marine habitat ‘hotspots’ and their use by migratory species and top predators in the North Pacific Ocean: introduction. Deep Sea Research II 53:247-249. Tremblay Y., Bertrand S., Henry R.W., Kappes M.A., Costa D.P., and Shaffer S.A. 2009. Analytical approaches to investigating seabird-environment interactions: a review. Marine Ecology Progress Series 391:153-164. Virkkala R. 2016. Variation in population trends and spatial dynamics of waterbirds in a boreal lake complex. Ornis Fennica 93:197-211. 76 Votier S.C., Birkhead T.R., Oro D., Trinder M., Granthan M.J., Clark J.A., McCleery R.H., and Hatchwell B.J. 2008. Recruitment and survival of immature seabirds in relation to oil spills and climate variability. Journal of Animal Ecology 77:974-983. Ward E.J., Marshall K.N., Ross T., Sedgley A., Hass T., Pearson S.F., Joyce G., Hamel N.J., Hodum P.J., and Faucett R. 2015. Using citizen-science data to identify local hotspots of seabird occurrence. PeerJ 3:e704. Williams P., Gibbons D., Margules C., Rebelo A., Humphries C., and Pressey R. 1996. A comparison of richness hotspots, rarity hotspots, and complementary areas for conserving diversity of British birds. Conservation Biology 10(1):155-174. Williamson L., Hudson M., O’Connell M., Davidson N., Young R., Amano T., and Szekely T. 2013. Areas of high diversity for the world’s inland-breeding waterbirds. Biodiversity and Conservation 22:1501-1512. Wilson L.J., McSorley, C.A., Gray C.M., Dean B.J., Dunn T.E., Webb A., and Reid J.B. 2009. Radio-telemetry as a tool to define protected areas for seabirds in the marine environment. Biological Conservation 142:1808-1817. Wong S.N.P., Gjerdrum C., Morgan K.H., and Mallory M.L. 2014. Hotspots in cold seas: The composition, distribution, and abundance of marine birds in the North American Arctic. Journal of Geophysical Research: Oceans 119:1691-1705. Zipkin E.F., Kinlan B.P., Sussman A., Rypkema D., Wimer and M., O’Connell A.F. 2015. Statistical guidelines for assessing marine avian hotspots and coldspots: A case study on wind energy development in the U.S. Atlantic Ocean. Biological Conservation 191: 216223. Zipkin E.F., Leirness J.B., Kinlan B.P., O’Connell A.F., and Silverman E.D. 2014. Fitting statistical distributions to sea duck count data: implications for survey design and abundance estimation. Statistical Methodology 17:67-81. 77