THREE ESSAYS IN ENVIRONMENTAL ECONOMICS
                          By
                     Andrew Earle
                 A DISSERTATION
                      Submitted to
              Michigan State University
      in partial fulfillment of the requirements
                   for the degree of
         Economics – Doctor of Philosophy
   Environmental Science and Policy – Dual Major
                         2023


                                              ABSTRACT
This dissertation studies the welfare generated by outdoor recreation. Chapters 1 and 2 study a
high-profile form of recreation: U.S. national park visitation. Chapter 3 uses innovative data to
study a classic topic: the value of water quality. All three chapters apply a two-stage estimation
procedure that exploits panel variation in visitation and resource quality within a random utility
maximization (RUM) travel cost model, the field’s “workhorse” model.
    In Chapter 1, I conduct the most comprehensive analysis of demand for the U.S. National Park
System to date. I create a versatile and unified framework to analyze demand for 140 national
parks throughout the contiguous United States. Combining nationally representative surveys, park-
level visitor counts, and a statistical atlas of park attributes, I estimate a RUM model of visitation
from 2005 through 2019. The model produces estimates of park awesomeness and explains
awesomeness using detailed park attribute data. Iconic parks like Glacier, Yellowstone, and Grand
Canyon all rank among the most awesome parks. Visitors prefer parks with charismatic wildlife,
like bison and bald eagles, wide-ranging elevation, and coastline.
    My second chapter applies the data infrastructure and model from Chapter 1 to analyze how
climate change will impact the welfare generated by national park visitation. I estimate visitor
preferences for long-run average temperatures and short-run temperature deviations, and I use
these preferences to simulate visitor welfare under future climate conditions. Visitors prefer
temperatures between 70°F and 85°F, and on average, they dislike cold more than they dislike
extreme heat. Assuming limited changes to park resources, I find climate change will likely
increase the welfare generated by national park visitation. The overall gains are driven by large
benefits in cooler seasons that outweigh the losses from extreme heat in the summer.
    Chapter 3, co-authored with Hyunjung Kim, blends the modeling and estimation techniques
from Chapters 1 and 2 with a high-frequency, administrative park visitation dataset. We quantify
the losses from water quality-induced beach closures at Lake St. Clair Metropark in southeast
Michigan. Our park visitation data include the residential ZIP code and exact minute of park entry
for the universe of visits to the Huron-Clinton Metropark system. Our preferred model estimates a


daily panel of park fixed effects and regresses the fixed effects on a beach closure indicator in a
second stage. We estimate that the 2022 beach closures caused welfare losses of around $70,000.


                                  ACKNOWLEDGEMENTS
Without the support of my mentors, colleagues, friends, and family, I would never have completed
this dissertation or even dreamed of going to graduate school. My advisor, Soren Anderson,
generously devoted his time, energy, and expertise to helping me improve my research. His vision
and encouragement contributed massively to this final product. I am grateful for the opportunity
to absorb some of his knowledge. Frank Lupi deserves similar praise. His advising, as well as the
opportunity to work as his co-author, has made me a much better and more confident economist.
I am thankful for many other MSU faculty I have learned from, including Kyoo il Kim, Oren Ziv,
Mike Conlin, Justin Kirkpatrick, Todd Elder, Jeff Wooldridge, Stacy Dickert-Conlin, and Joe
Herriges. I was fortunate to learn with and from my fellow graduate students, and I will always
cherish our lunch table conversations, cookouts, and intramural games.
    I am also thankful for the opportunity to spend the summer of 2021 at the Property and
Environment Research Center (PERC) in Bozeman, Montana. I thank the staff for their support
of the graduate fellowship program, the faculty for treating me as a colleague, and the other graduate
fellows for their feedback on my research and their friendship.
    My interest in environmental economics originated in Wyoming during an undergraduate
summer field course. I am grateful to the instructors, Mandi Lyons and Steve Latta, and my
classmates for sharing their passion for the environment. After returning to the University of
Pittsburgh, Randy Walsh, Jeremy Weber, Andrea LaNauze, and Katherine Wolfe provided
valuable mentorship as I explored my new interest and considered graduate school.
    Finally, I am grateful for my friends and family. My community at University Lutheran
Church has provided valuable perspective and support throughout my time in East Lansing. I am
thankful for my parents and brothers who have been a steady source of encouragement and energy
for my entire life. My partner, Emma, has supported me unwaveringly and filled my life with joy
in good times and bad. No one understands the trials and triumphs I have faced over the past five
years better than her. I am also grateful for my grandfather, Carville, who nurtured my curiosity
when I was little, and for my family, which continues to share memories of him with me.


                                        TABLE OF CONTENTS
CHAPTER 1 VISITING AMERICA’S BEST IDEA: DEMAND FOR THE U.S. NATIONAL PARK SYSTEM
CHAPTER 2 THE WELFARE IMPACT OF CLIMATE CHANGE ON U.S. NATIONAL PARK SYSTEM VISITATION
CHAPTER 3 VALUING WATER QUALITY WITH HIGH-FREQUENCY DATA: EVIDENCE FROM MICHIGAN BEACH CLOSURES (WITH HYUNJUNG KIM)
BIBLIOGRAPHY


                                             CHAPTER 1
    VISITING AMERICA’S BEST IDEA: DEMAND FOR THE U.S. NATIONAL PARK
                                               SYSTEM
1.1     Introduction
    In 1916, the National Park Service was created to conserve America’s most treasured natural
resources. More than 100 years later, the National Park Service now manages 424 parks that
attract roughly 300 million visits each year. The national parks have also become part of America’s
cultural identity. The novelist Wallace Stegner even called them, “the best idea we ever had”.
    Their unique resources, popularity, and cultural importance make the parks economically
significant. The National Park Service estimates that visitors spend $20.5 billion in parks and
their surrounding communities each year. Recent work by Szabó and Ujhelyi (2021) finds that
national parks increase economic development, incomes, and employment, with spillover effects
extending beyond the recreation and tourism sector. These economic contributions are one reason
why national parks enjoy some degree of bipartisan political support.
    Despite the importance of the national parks and the visitors they attract, there are large gaps
in our knowledge of why people visit. The National Park Service’s internal research is often park-
specific and based on infrequent surveys. Efforts to understand visitation at a system-wide level are
rare, and they typically say little about specific park resources.
    This paper analyzes demand for the U.S. National Park System with the goal of understanding
preferences for the national parks and their attributes. I create a random utility maximization (RUM)
model of visitation for 140 national parks, nearly all those protected for their natural resources, in
which individuals repeatedly choose which park to visit and whether to drive or fly. In the model,
an individual’s visitation decisions depend on the travel costs of accessing each park and the mean
utility provided by each park’s attributes. To allow park mean utilities to vary seasonally, the model
includes a full set of park-by-month fixed effects, which I call “park effects.” These parameters


represent the mean utility of visiting a park after controlling for the travel costs needed to get
there and capture all of a park’s observable and unobservable attributes. In plain terms, they measure
national park awesomeness.
    I combine three types of data to estimate the model and understand preferences for the parks
and their attributes. I obtain individual-level data on national park visitation from nationally
representative telephone surveys administered by the National Park Service in 2008-2009 and
2018. I complement these survey data with monthly park visitor counts from the National Park
Service’s Visitor Use Statistics. Finally, I consolidate a rich collection of data describing park
attributes to build a statistical atlas of the national parks, allowing me to estimate preferences for
attributes, such as elevation, infrastructure, and the presence of charismatic wildlife.
    I introduce a two-step estimation and calibration procedure to combine these data. The first
step combines the survey and visitor count data in a maximum likelihood procedure to estimate
the model during the survey periods, 2008-2009 and 2018. Using these estimates and the visitor
count data, I calibrate a monthly panel of park effects from January 2005 through December 2019.
The calibration uses annual American Community Survey microdata to account for demographic
changes and calculate time-varying travel costs over the fifteen-year period. In the second step, I
regress the park effects on a collection of park attributes. For attributes that vary over time, such
as weather, the park effects’ panel structure allows me to use fixed effects to control for potential
omitted attributes.
    I find that “bucket list” parks such as Yellowstone, Glacier, and Grand Canyon consistently rank
in the top ten of my national parks awesomeness index. Observable park attributes explain 56% of
the variation in the index. Visitors tend to prefer parks with redwood forests, bison, bald eagles,
wide-ranging elevation, and shoreline. Many of these attributes vary little across time, which poses
a challenge for causal inference. Yet, my estimated park effects reveal underexplored seasonal
variation, making a causal interpretation more plausible for attributes that vary across time. For
parks with harsh winters, willingness to pay peaks in the summer months, while parks with more
moderate climates provide more stable mean utility throughout the year.


     Most of the previous literature on recreation demand for U.S. national parks has focused on
single parks or parks in a particular region (Walls, 2022). The limited number of nation-wide studies
often focus on the local economic impacts of visitation (Szabó and Ujhelyi, 2021; Cullinane Thomas
and Koontz, 2020). There has been little nation-wide research that explores preferences for park
attributes using detailed visitation and park attribute data.1 By analyzing demand for 140 parks
across the United States over fifteen years, building a structural model of individual visitation, and
combining survey, visitor count, and extensive park attribute data, this paper constitutes the most
comprehensive analysis of demand for the U.S. National Park System to date.
     My estimation procedure makes two methodological contributions to the broader recreation
demand literature. Recreation demand studies often employ random utility maximization (RUM)
travel cost models to estimate the value of recreation sites or environmental attributes. Most
applications of RUM models analyze recreation demand for a single season, likely because RUM
models are estimated with survey data, which are costly to collect. Yet, many interesting and important
natural resource changes occur outside survey periods. My estimation and calibration procedure
demonstrates how to combine site-level visitor counts with a structural RUM model to bridge gaps
between individual surveys.
     Second, my estimation and calibration approach allows for panel data econometric techniques
to be applied within a RUM model. My two-stage procedure is similar to Murdock (2006), except
that I estimate a panel of park effects in the first stage to preserve variation across and within parks
for the second stage regression. Murdock estimates only a cross-section of park fixed effects in the
first stage, leaving second stage estimates susceptible to omitted variables bias. Lupi et al. (2020)
recommend Murdock’s approach as a best practice for RUM travel cost models and also emphasize
the need for more rigorous identification in recreation demand studies. My estimation procedure
provides a method to improve identification in RUM models through the use of panel data, and it
builds on the recreation demand literature’s current best practices.
    1 Both Henrickson and Johnson (2013) and Neher et al. (2013) model visitation to parks across
the country as a function of park attributes, and Neher et al. use individual visitation data.
However, both papers use a relatively small set of park attributes in their analysis.


    The paper proceeds as follows. Section 1.2 describes the nationally representative telephone
surveys, monthly park visitor counts, and the national park statistical atlas. Section 1.3 outlines
how I calculate flying and driving travel costs. Section 1.4 presents the model of national park
visitation. Section 1.5 details the two-step estimation procedure. Section 1.6 describes the results,
and Section 1.7 concludes.
1.2     Park Visitation and Attribute Data
    The main data sources for this project describe individual-level visitation, park-level visitation,
and physical and institutional attributes of the national parks. This section describes each of these
data sources.
    The individual-level visitation data come from the National Park Service’s Comprehensive
Survey of the American Public. The survey conducts telephone interviews with the primary
goal of gauging sentiment towards the National Park Service, their management practices, and
visitor experiences. The survey lasts approximately fifteen minutes and includes several questions
regarding respondents’ visitation history. These questions include the location of each respondent’s
most recent national park visit and the number of times they have visited the National Park
System in the two years prior to the interview. For a random subset of the sample, I also observe
whether respondents drove or flew on their most recent visit.
    Several characteristics of the Comprehensive Survey of the American Public make it a uniquely
useful data source for studying national park visitation. First, it is nationally representative. Phone
numbers are selected using a regionally-stratified random sampling design, and individual respon-
dents are randomly selected within each household. The data include weights to account for the
regional stratification and match sample demographic statistics to the Census, so weighted sample
demographics closely match the general population (table 1.1). I use these weights throughout my
analysis. The sampling design includes both visitors and non-visitors, which allows me to model
the extensive margin – the choice of whether or not to visit a national park.
    Another useful feature is that the survey was conducted twice: once in 2008 and 2009


     Table 1.1: Telephone Survey Descriptive Statistics
                                  Unweighted    Weighted    Census
  Age
    18-29                            11.8         21.3        23.6
    30-39                            13.5         16.3        16.9
    40-49                            16.7         16.7        18.4
    50-59                            24.1         20.8        17.5
    60-69                            18.5         14.3        12.0
    70+                              15.1         10.4        11.3
  Income
    Less than $10,000                 4.5          6.0        12.6
    $10,000 to $25,000                9.5         11.0        15.0
    $25,000 to $50,000               20.3         23.2        23.5
    $50,000 to $75,000               20.8         22.2        18.9
    $75,000 to $100,000              17.3         15.9        13.5
    $100,000 to $150,000             15.4         13.1        10.7
    Greater than $150,000            12.0          8.3         5.4
  Education
    Some high school                  3.5          5.5
    High school degree               36.8         46.9
    College degree                   35.8         32.0
    Graduate degree                  22.8         14.6
  Has child                          29.7         35.3        38.4
  White, non-Hispanic                75.0         67.9        63.7
  Region
    Alaska                           14.1          0.2         0.2
    DC only                          11.6          0.2         0.1
    Intermountain                    14.9         14.9        15.1
    Midwest                          14.6         22.9        22.6
    Northeast                        15.1         22.9        23.2
    Pacific                          14.8         16.8        17.3
    Southeast                        14.7         21.8        21.2
  Visited in past 2 years            67.9         61.7
  Avg number of visits                9.2          4.7
  Flew (Subsample N = 1537)          13.5         12.6
  Sample size                        6762         6762
 Note: The table shows the share of respondents in various demo-
graphic groups for the pooled 2008-2009 and 2018 Comprehensive
Survey of the American Public survey data compared to statistics from
2010 Census data. Weights are included in the survey and match sur-
vey statistics to Census averages. Thus, the weighted variable means
align closely with Census means. The unweighted sample tends to be
older, richer, and more white, non-Hispanic than the general popula-
tion.


and again in 2018. The two iterations are similar, with identical questions on visitation history.
The seasonal timing of interviews varies slightly between the two iterations. The 2008 and 2009
interviews were split evenly between seasons to account for seasonal variation in visitation. The
2018 survey, citing a lack of seasonality in the 2008 and 2009 data, conducted interviews from June
through November.
    The Comprehensive Survey of the American Public also has limitations. First, I observe
respondents’ home locations imprecisely. In the 2008 iteration, the data include each respondent’s
telephone area code and their state of residence. When the area code is within the state of residence,
I take the largest city in the area code as the home city when calculating travel costs. When I only
observe the state of residence, or the area code and state of residence do not match, I randomly
sample a home city according to the state’s population distribution. Second, the survey does
not include any information on visit dates, only that the visits occurred within two years of the
interview. This limits my ability to capture seasonal variation in certain parameters, including the
travel cost coefficient. I discuss the implications for estimation in Section ??. Finally, many less
visited national parks never appear as a most recent visit for any respondent. This poses challenges
for an estimation based on survey data alone. These shortcomings suggest more visitation data is
needed to monitor visitation more thoroughly, motivating the use of park-level visitor counts.
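
The home-location rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the dissertation: the function name `impute_home_city` and the two lookup tables (`area_code_info`, `state_city_populations`) are hypothetical, and the sketch assumes city populations are available for each state.

```python
import random


def impute_home_city(state, area_code, area_code_info, state_city_populations, rng):
    """Assign a home city to a survey respondent (illustrative sketch).

    area_code_info: hypothetical dict, area code -> (state, largest city)
    state_city_populations: hypothetical dict, state -> {city: population}
    rng: a random.Random instance, passed in so draws are reproducible
    """
    info = area_code_info.get(area_code)
    # Case 1: the area code lies within the reported state of residence,
    # so use the largest city in that area code.
    if info is not None and info[0] == state:
        return info[1]
    # Case 2: area code missing or inconsistent with the state, so draw a
    # city with probability proportional to its population.
    cities = state_city_populations[state]
    names = list(cities)
    weights = [cities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Passing the random number generator explicitly keeps the population-weighted draws reproducible across runs.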
    I use park-level visitor count data from the National Park Service’s Visitor Use Statistics
database. The counts have a broad temporal and geographic scope, dating back to 1905 for the
oldest parks and covering 383 national parks in recent years. I use counts from January 2005
through December 2019, because this period overlaps closely with the individual-level survey data
and the American Community Survey microdata.
    Counting procedures vary by park and typically involve National Park Service rangers at entry
booths and/or strategically placed vehicle counters. Parks use person-per-vehicle multipliers to
convert vehicle counts to person counts. Busy peak seasons, available technology, and often
remote locations make it difficult to obtain exact counts in some cases. Nonetheless, the Visitor
Use Statistics are used administratively and in many academic studies (Henrickson and Johnson,


                            Table 1.2: Park Attribute Data Sources
  Source                     Variables
  USGS National Map          Elevation range, mean elevation, trail miles, number of lakes > 40
                             acres, area of lakes > 40 acres
  NPS Administrative Data    Designation (Park, Lakeshore, Seashore, etc), acreage, coastal, miles
                             of shoreline, species presence
  2004 NLCD                  Share of land by landcover type, mode landcover type, landcover di-
                             versity
  Census                     Road miles, population density of overlapping counties
  NCEI                       Monthly average high temperature, days with precipitation > 0.1”,
                             monthly ten-year average temperature and precipitation days
 Note: The table shows data sources for park attributes and variables generated from them. NPS
Administrative Data include the NPSpecies database, Annual Acreage Reports, and a 2011 Resource
Report on Shoreline length. NCEI data come from weather station-based Global Summary of the
Month reports. NLCD - National Land Cover Database, NCEI - National Centers for Environmental
Information.
2013; Fisichelli et al., 2015; Keiser et al., 2018).
    I adjust the raw visitor count data to make them suitable for recreation demand modeling.
This process accounts for re-entry, group size, international visitation, and the primary purpose of
trips. Each of these variables is measured using on-site surveys conducted by the National Park
Service. I use 105 on-site surveys conducted at 69 different parks between 1995 and 2019. For
parks that have not conducted an on-site survey, I impute missing information based on observable
park attributes. After imputation, I have a park-month panel of re-entry rates, average group sizes,
proportions of international visitors, and proportions of primary purpose trips. I use the panel to
convert raw visitor counts to the number of primary purpose visits.
    To understand visitor preferences for park attributes, I consolidate several datasets describing
the national parks themselves. Table 1.2 shows the full list of data sources and the variables I
generate from them.
1.3    How much does it cost to visit the national parks?
    This section describes the procedure for computing travel costs. I calculate travel costs at a
quarterly frequency for every individual in the nationally representative telephone surveys, as well


as every individual in the American Community Survey microdata between 2005 and 2019. I use
these microdata to calibrate the model outside the survey period. Travel costs include the time
and money required for individuals to access each of the national parks. These calculations largely
follow English et al., who also compute driving and flying travel costs at a national scale.
    To compute driving travel costs, I calculate the driving mileage and time from each respondent’s
home location to each national park using PC*Miler. I multiply mileage by the marginal cost of
driving, which I calculate with per-mile maintenance costs from annual AAA reports and regional
gas prices from the Energy Information Administration. For every twelve hours of driving, I add the
average U.S. hotel rate. I also make the standard assumption that the cost of travel time is one-third
of a respondent’s hourly wage rate.
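
As a rough illustration of the driving cost calculation described above, the following Python sketch adds the three components: per-mile vehicle cost, lodging, and the value of travel time. The function and parameter names are hypothetical, and two details are assumptions rather than statements from the text: costs are computed for the round trip, and one hotel night is added for each full twelve hours of round-trip driving.

```python
import math


def driving_travel_cost(one_way_miles, one_way_hours, cost_per_mile,
                        hotel_rate, hourly_wage, time_cost_share=1 / 3):
    """Round-trip driving travel cost in dollars (illustrative sketch)."""
    round_trip_miles = 2 * one_way_miles
    round_trip_hours = 2 * one_way_hours
    # Out-of-pocket vehicle cost: fuel plus maintenance, per mile driven.
    vehicle_cost = round_trip_miles * cost_per_mile
    # Assumption: one hotel night per full twelve hours of round-trip driving.
    hotel_nights = math.floor(round_trip_hours / 12)
    lodging_cost = hotel_nights * hotel_rate
    # Travel time valued at one-third of the hourly wage by default.
    time_cost = round_trip_hours * time_cost_share * hourly_wage
    return vehicle_cost + lodging_cost + time_cost
```

Because the time cost scales with the wage, the cost-distance relationship has a different slope for each income level.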
    Flying travel costs include travel time, plus the cost of driving to the origin airport, airfare, and
the cost of driving from the destination airport to the park, which may include rental car prices.
Quarterly average airfare data come from the U.S. Department of Transportation’s Consumer
Airfare Report (2015), which includes the average airfare for flights between city markets, rather
than individual airports. I use the 2012 average rental car price from English et al., adjusting for
inflation to approximate rental car prices in other years. For each individual-park pair, I compute
travel costs for all routes originating at one of the four city-markets closest to the respondent’s home
and ending at one of the four city-markets closest to the park. I select the cheapest of these routes
as the individual’s travel cost of flying to the park.
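
The route search described above can be sketched as a minimum over origin-destination pairs. This is a minimal illustration with hypothetical names and inputs: it assumes the costs of the two driving legs (including any time costs) have already been computed for each candidate city-market, and that airfare is stored by market pair.

```python
from itertools import product


def flying_travel_cost(home_markets, park_markets, airfare, rental_car_cost=0.0):
    """Return (cost, origin, destination) for the cheapest fly-and-drive route.

    home_markets: dict, origin market -> cost of driving from home to it
    park_markets: dict, destination market -> cost of driving from it to the park
    airfare: dict, (origin market, destination market) -> airfare
    Returns None if no candidate pair has an airfare observation.
    """
    best = None
    for origin, dest in product(home_markets, park_markets):
        if (origin, dest) not in airfare:
            continue  # no fare data for this market pair
        total = (home_markets[origin] + airfare[(origin, dest)]
                 + park_markets[dest] + rental_car_cost)
        if best is None or total < best[0]:
            best = (total, origin, dest)
    return best
```

With four candidate markets on each end, the loop evaluates at most sixteen routes per individual-park pair.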
    Figure 1.1 shows the flying and driving travel costs for a subset of the telephone survey sample.
Driving travel costs increase approximately linearly with driving-distance, with different slopes for
each income bin. On average, flying is more expensive than driving for trips under 1,600 miles but
is cheaper at longer distances, matching calculations from English et al.
1.4     A Model of National Park Visitation
    In this section, I outline a model describing the choices of which national park to visit and
how to travel. By jointly modeling the choice of park and the choice of travel mode, I build


                            Figure 1.1: Travel costs increase with distance
Note: The figure plots round trip travel costs against one-way driving distance for a three percent subset
of the 2008 survey sample. Brown circles show driving travel costs, and blue x’s show flying travel
costs. Lines show average travel costs conditional on distance for both driving (brown-solid) and
flying (blue-dashed). On average, flying travel costs increase more gradually with distance.
on both the recreation demand literature, which typically focuses solely on location choice, and
the transportation literature, which has a rich history modeling travel mode choices (McFadden,
1974).2 The model also shares similarities with work by Chintagunta et al. (2005), which allows
for time-varying mean utilities in a model of demand for margarine.
    Suppose that each month individuals choose whether to visit a national park, which national
park to visit, and whether to drive or fly to the park. Denote the set of national parks $\mathcal{J} = \{1, 2, \ldots, J\}$
and the set of travel modes $\mathcal{M} = \{D, F\}$, where $D$ and $F$ indicate driving and flying, respectively.
Let $j = 0$ denote the outside option, each individual’s best way of spending the month that does
not involve visiting a national park. Because visits to the National Park System’s historic sites are
included in the data but differ from visits to nature-centered national parks, I group visits to historic
sites into a second outside option, $j = J + 1$. Given this choice set, let $U_{ijmt}$ denote the utility
individual $i$ receives from visiting national park $j$ using travel mode $m$ during month $t$, where
    2 An exception is Hausman et al. (1995), who model the travel mode choice in a recreation demand
context. They create a model to quantify the recreational use losses of the Exxon Valdez oil spill.


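$$
U_{ijmt} =
\begin{cases}
\delta_{0t} + \varepsilon_{i0t} & j = 0 \\
\delta_{jt} + \beta_{TC}\, TC_{ijDt} + \varepsilon_{ijDt} & j \in \{1, \ldots, J\},\ m = D \\
\delta_{jt} + \beta_{F} + \beta_{TC}\, TC_{ijFt} + \varepsilon_{ijFt} & j \in \{1, \ldots, J\},\ m = F \\
\delta_{J+1,t} + \varepsilon_{i,J+1,t} & j = J + 1
\end{cases}
\tag{1.1}
$$
$$
\equiv
\begin{cases}
V_{0t} + \varepsilon_{i0t} & j = 0 \\
V_{ijmt} + \varepsilon_{ijmt} & j \in \{1, \ldots, J\},\ m \in \{D, F\} \\
V_{J+1,t} + \varepsilon_{i,J+1,t} & j = J + 1
\end{cases}
\tag{1.2}
$$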
    In equation (1.1), coefficient $\beta_{TC}$ represents the marginal disutility of travel costs, and $\beta_{F}$
represents the fixed cost of flying relative to driving. For $j \in \{1, \ldots, J\}$, I call the park-by-month
fixed effect, $\delta_{jt}$, the park effect. It captures the mean utility provided by park $j$ in month $t$ after
controlling for travel costs. Ranking the park effects produces a national park awesomeness index.
I decompose the park effects further:
$$
\delta_{jt} = X_{jt}\alpha + \nu_{jt}, \tag{1.3}
$$
where $X_{jt}$ contains observable park attributes, $\alpha$ is a coefficient vector, and $\nu_{jt}$ is unobservable.
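
To make the decomposition in equation (1.3) concrete, the sketch below ranks parks by their park effects and regresses a panel of park effects on attributes with ordinary least squares. It is a simplified illustration and not the dissertation's estimator: the function names are hypothetical, ranking by the average park effect across months is an assumption about how the index aggregates over time, and the regression omits the fixed effects and weighting that an applied second stage would likely include.

```python
import numpy as np


def awesomeness_ranking(park_effects):
    """Rank parks by mean park effect, best first (aggregation is an assumption).

    park_effects: array of shape (n_parks, n_months)
    Returns an array of park indices ordered from highest to lowest mean effect.
    """
    return np.argsort(-park_effects.mean(axis=1))


def second_stage_ols(park_effects, attributes):
    """Pooled OLS of park effects on park attributes, as in equation (1.3).

    park_effects: array of shape (n_parks, n_months)
    attributes: array of shape (n_parks, n_months, n_attrs); include a column
        of ones if an intercept is wanted
    Returns (alpha, nu): the coefficient vector and the residuals, reshaped to
    (n_parks, n_months).
    """
    y = park_effects.reshape(-1)
    X = attributes.reshape(-1, attributes.shape[-1])
    alpha, *_ = np.linalg.lstsq(X, y, rcond=None)
    nu = (y - X @ alpha).reshape(park_effects.shape)
    return alpha, nu
```

The residual array plays the role of the unobservable term: whatever the observed attributes do not explain remains in it.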
    Assume the error term, $\varepsilon_{ijmt}$, follows a Generalized Extreme Value distribution with a two-level
nested structure with the no-visit alternative in its own nest. This assumption allows error terms for
visit alternatives to be correlated and relaxes the Independence of Irrelevant Alternatives assumption
imposed by conditional logit models. The nested logit model still imposes the Independence of
Irrelevant Alternatives assumption within the visit nest. Under this nesting structure, the probability
of choosing each alternative has a closed form:


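$$
P_{ijmt} =
\begin{cases}
\dfrac{\exp(V_{0t})}{\exp(V_{0t}) + \left(\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)\right)^{\lambda}}, & \text{if } j = 0 \\[3ex]
\dfrac{\exp\left(\frac{V_{ijmt}}{\lambda}\right)}{\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)} \cdot
\dfrac{\left(\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)\right)^{\lambda}}{\exp(V_{0t}) + \left(\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)\right)^{\lambda}}, & \text{if } j \in \{1, \ldots, J\} \\[3ex]
\dfrac{\exp\left(\frac{V_{i,J+1,t}}{\lambda}\right)}{\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)} \cdot
\dfrac{\left(\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)\right)^{\lambda}}{\exp(V_{0t}) + \left(\sum_{k=1}^{J+1} \sum_{n \in \mathcal{M}} \exp\left(\frac{V_{iknt}}{\lambda}\right)\right)^{\lambda}}, & \text{if } j = J + 1
\end{cases}
\tag{1.4}
$$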
    For visit alternatives, the choice probabilities include two terms. The first indicates the prob-
ability of visiting a specific park using a specific travel mode, conditional on choosing a visit
alternative. The second term indicates the probability of choosing any visit alternative. If an
individual chooses not to visit, then they do not select a specific park and travel mode. Thus, the
no-visit choice probability has only one term. The literature often refers to the parameter $\lambda$ as the
dissimilarity coefficient. For consistency with random utility maximization, $\lambda$ is bounded between
zero and one. Higher values of $\lambda$ indicate more dissimilar alternatives in the visit nest, and $\lambda$ equal
to one simplifies the probabilities to match the conditional logit model.
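The two-term structure of the visit probabilities can be illustrated with a minimal numerical sketch. The function name, the utility values, and the flattening of park-by-mode alternatives into a single array are assumptions made for illustration; they are not taken from the estimation code behind this chapter.

```python
import numpy as np

def nested_logit_probs(v_visit, v_none, lam):
    """Return (p_none, p_visit) for one individual and one choice occasion.

    v_visit : 1-D array of deterministic utilities, one per visit alternative
              (hypothetically, every park-by-mode pair flattened into one array).
    v_none  : deterministic utility of the no-visit alternative.
    lam     : dissimilarity coefficient of the visit nest, 0 < lam <= 1.
    """
    v_visit = np.asarray(v_visit, dtype=float)
    scaled = np.exp(v_visit / lam)      # exp(V / lambda) for each visit alternative
    nest_sum = scaled.sum()             # sum over all visit alternatives
    nest_value = nest_sum ** lam        # (sum of exp(V / lambda)) ** lambda
    denom = np.exp(v_none) + nest_value
    p_none = np.exp(v_none) / denom
    # First factor: a given alternative conditional on visiting.
    # Second factor: choosing any visit alternative.
    p_visit = (scaled / nest_sum) * (nest_value / denom)
    return p_none, p_visit

# Hypothetical utilities for three visit alternatives
v = np.array([-2.0, -1.5, -3.0])
p0, p = nested_logit_probs(v, v_none=0.0, lam=0.65)
total = p0 + p.sum()                    # probabilities over all alternatives

# With lam = 1 the formula collapses to the conditional logit
p0_cl, p_cl = nested_logit_probs(v, v_none=0.0, lam=1.0)
logit = np.exp(v) / (1.0 + np.exp(v).sum())
```

In this sketch, `total` equals one up to floating-point error, and `p_cl` coincides with `logit`, which mirrors the statement that a dissimilarity coefficient of one recovers the conditional logit model.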
1.5      A two-step approach to estimate demand
    This section describes the procedure to estimate demand for national park visitation. My
procedure builds on Murdock (2006), who introduces a two-step approach for estimating recreation
demand models. My estimation procedure is also similar to the maximum likelihood estimator
proposed by Berry et al. (2004), which combines micro and macro-level data to estimate a demand
system.
1.5.1     Step 1a: Maximum likelihood estimation
To begin, I use the survey data and visitor count data to estimate the parameters in equation 1.1.
These include the marginal disutility of travel costs, the relative preference for flying versus driving,
and the observable heterogeneity parameters. Because the survey data do not include the date of
respondents’ visits, I effectively observe a cross-section of visitation choices for each survey period.
Thus, I drop the $t$ subscript from the model and estimate a constant park effect for each survey
period in Step 1a. After this initial estimation, I use the monthly visitor counts to recover a panel
of park effects in Step 1b (below).
    I specify a three-part likelihood function that incorporates all the visitation choices observed in
the survey data: each individual’s most recently visited park, number of visits, and travel mode
(when observed). Using the choice probabilities from equation 1.4, the likelihood of observing
individual $i$’s visitation choices is
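\[
L_i(\beta, \delta) = \underbrace{\Big( \prod_{j=0}^{J} \prod_{m \in \mathcal{M}} P_{ijm}^{\,y_{ijm}} \Big)}_{(1)} \; \underbrace{(1 - P_{i0})^{v_i}}_{(2)} \; \underbrace{(P_{i0})^{24 - 1 - v_i}}_{(3)} \tag{1.5}
\]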
    The first term represents the likelihood of individual $i$’s most recent visit. For this visit, I
observe the park visited, and for a subset of respondents, I also observe the travel mode. The
second term represents the likelihood of all other visits in the two years prior to the interview,
where $v_i$ indicates the number of visits in the past two years excluding the most recent visit. The
third term represents the likelihood from all non-visits in the two years prior to the interview. If an
individual never visits a national park in the two years prior to their interview, then they choose the
no-visit alternative for each of the 24 months prior to their interview.
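As a rough illustration of how the three terms combine, the sketch below evaluates the log of equation 1.5 for a single respondent. The function and argument names are hypothetical, and `p_recent` stands for whatever probability the model assigns to the observed most recent choice (which, when travel mode is unobserved, would presumably aggregate over modes).

```python
import numpy as np

def individual_log_likelihood(p_recent, p_none, other_visits, months=24):
    """Log of the three-part likelihood for one respondent.

    p_recent     : probability of the observed most recent choice (term 1).
    p_none       : probability of the no-visit alternative in a given month.
    other_visits : visits in the window excluding the most recent one (v_i).
    months       : number of monthly choice occasions in the recall window.
    """
    term1 = np.log(p_recent)                                  # most recent choice
    term2 = other_visits * np.log(1.0 - p_none)               # all other visits
    term3 = (months - 1 - other_visits) * np.log(p_none)      # remaining non-visit months
    return term1 + term2 + term3

# A respondent with a most recent visit plus two other visits (hypothetical numbers)
ll_visitor = individual_log_likelihood(p_recent=0.02, p_none=0.9, other_visits=2)

# A respondent who never visits: the "most recent choice" is the no-visit alternative
ll_nonvisitor = individual_log_likelihood(p_recent=0.9, p_none=0.9, other_visits=0)
```

For the respondent who never visits, the expression reduces to 24 times the log of the no-visit probability, matching the description of 24 consecutive no-visit months.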
    When maximizing the log likelihood function, I constrain the visitation shares predicted by the
model to match the visitation shares observed in the visitor count data. I impose this constraint by
applying the contraction mapping introduced by Berry:
\[
\delta^{n+1} = \delta^{n} + \ln(s) - \ln\left( \hat{s}(\delta^{n}, \beta) \right).
\]
As the optimization routine iterates over values of $\beta$, the contraction mapping solves for the unique
vector of park effects, $\delta$, that matches the observed and predicted visitation shares.
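The share-matching step can be sketched in a few lines. To keep the example short, it assumes a plain conditional logit (a dissimilarity coefficient of one), a single travel mode, and simulated travel costs; the nested logit probabilities from equation 1.4 would take the place of the logit formula in the full model. All names and numbers are illustrative assumptions.

```python
import numpy as np

def predicted_shares(delta, travel_cost, beta):
    """Average conditional logit choice probabilities over individuals.

    delta       : (J,) park effects.
    travel_cost : (N, J) travel cost from each individual to each park.
    beta        : travel cost coefficient (negative).
    The no-visit alternative has utility normalized to zero.
    """
    v = delta[None, :] + beta * travel_cost
    expv = np.exp(v)
    probs = expv / (1.0 + expv.sum(axis=1, keepdims=True))
    return probs.mean(axis=0)

def solve_park_effects(obs_shares, travel_cost, beta, tol=1e-12, max_iter=10_000):
    """Iterate delta <- delta + ln(s) - ln(s_hat(delta, beta)) until the step is tiny."""
    delta = np.zeros_like(obs_shares)
    for _ in range(max_iter):
        step = np.log(obs_shares) - np.log(predicted_shares(delta, travel_cost, beta))
        delta = delta + step
        if np.max(np.abs(step)) < tol:
            break
    return delta

rng = np.random.default_rng(0)
travel_cost = rng.uniform(0.5, 10.0, size=(500, 4))   # simulated costs, in $100s
beta = -0.25
true_delta = np.array([-3.0, -4.0, -2.5, -5.0])
obs_shares = predicted_shares(true_delta, travel_cost, beta)  # stand-in for visitor counts

delta_hat = solve_park_effects(obs_shares, travel_cost, beta)
```

Because the shares here are generated from `true_delta`, the recovered `delta_hat` should coincide with it, which illustrates why the outer optimization only needs to search over the remaining parameters.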
    Incorporating the contraction mapping has several practical benefits. First, it allows me to
simultaneously incorporate visitation information from the surveys and the visitor counts. Second,
the contraction mapping solves for the park effects, so the optimization routine must search only
over the remaining non-linear parameters, $\beta$, reducing the estimation time. These features of the
contraction mapping also allow me to pin down the park effects for parks that are never chosen in
the survey data.
    By estimating a full cross-section of park effects, I control for bias from unobserved park
attributes when estimating the remaining non-linear parameters. In industrial organization settings,
firms set prices and likely charge a higher price for products with desirable unobservable attributes.
The correlation between price and unobserved attributes biases naive estimates, and it has led to
the widespread use of instrumental variables when estimating demand systems. In the recreation
demand setting, travel costs are not set by firms directly, but unobserved park attributes, such as
remoteness, may still be correlated with travel costs. Including a full set of park effects controls for
all observed and unobserved park attributes when estimating the travel cost coefficient (Murdock,
2006). This is possible because, unlike prices in the industrial organization context, travel costs
vary at the individual level and can be separately identified from park fixed effects.
    Geographic sorting remains an identification concern (Parsons et al., 2021). Individuals who
value national parks may choose their residential location to reduce their travel costs. If individuals
with low travel costs value national parks more highly than those far away and would visit more
often, even conditional on travel costs, then the marginal disutility of travel cost will be overstated.
This bias would overstate the value of money relative to park attributes and subsequently bias
willingness to pay estimates towards zero. Few travel cost papers address potential bias from
sorting, and because of the limited use of travel cost methods at a national scale, the magnitude of
the potential bias is unclear.
1.5.2   Step 1b: Calibrating a monthly panel of park effects
With estimates of the non-linear parameters, $\beta$, in hand, I now recover a monthly panel of park effects
by applying the contraction mapping month-by-month, from January 2005 through December 2019.
Calibration outside the survey period raises several concerns. Population demographics may change
meaningfully over the fifteen-year sample period. To account for this possibility, I calibrate the
model using annual American Community Survey (ACS) microdata samples from 2005 to 2019
(Ruggles et al., 2021). The microdata contain key demographic variables, such as income, family
structure, and age, and they are nationally representative. Both these features are critical for using
survey-based estimates from step 1a to predict visitation shares for the ACS sample.
    The calibration procedure also requires assumptions on the stability of the non-linear parameters
across time. In this paper, I assume the non-linear parameters are constant across the entire fifteen-
year calibration period. Although this assumption is not strictly necessary, it has empirical justification. In a
preliminary robustness check, I allow the non-linear parameters to vary between the 2008 and 2018 survey
periods and obtain similar estimates. In a similar model of recreational marine fishing, Dundas and
von Haefen (2020) allow travel cost coefficients to vary annually and obtain fairly stable estimates
from 2004 through 2009.
    Given these assumptions, I calculate individual choice probabilities for each individual in the
ACS microdata samples. These choice probabilities imply predicted visitation shares for each park
in each month. Recall, the visitor count data also have a monthly panel structure. Beginning with
January 2005, I apply the contraction mapping to obtain the unique vector of park effects that
matches the predicted and observed visitation shares. Iteratively applying the contraction mapping
month-by-month produces a full panel of park effects through December 2019.
    The key insight is that applying the contraction mapping to solve for park effects does not require
individual-level choice data. Instead, one only needs an estimate of the non-linear parameters, a
reasonable microdata sample, and observed visitation shares.
1.5.3   Step 2: Estimating preferences for park attributes
In step 2, I estimate equation 1.3, which explains park effects as a function of park attributes.
Because step 1 produces a panel of park effects, panel data econometric techniques, such as
difference-in-differences and event studies, can be implemented to rigorously identify preferences
for park attributes. I explore a variety of park attributes, so I do not use one of these classic causal
inference techniques. Nonetheless, my approach offers a promising method for blending modern
causal inference tools with recreation demand modeling.
    Applying panel data econometric techniques within the structural model has several benefits.
In a reduced-form regression, attribute changes at one park may cause visitors to substitute between
parks, biasing estimates. The structural model of park choice controls for the quality
of substitute parks when estimating park effects. Therefore, spillovers do not bias estimates. The
structural model also provides a framework for calculating welfare impacts.
    I recover preferences for a broad range of park attributes using a correlated random effects
model with a Mundlak device. Some attributes, such as elevation and wildlife presence, do not
vary meaningfully across my fifteen-year period of interest. Other attributes, such as temperature,
do vary dramatically across parks and across time. While including a flexible set of fixed effects
(e.g., park or park-by-season) has the attractive property of controlling for unobserved attributes that
are constant across time, the fixed effects would subsume preference for time-invariant attributes.
    The correlated random effects framework solves this issue. For attributes that vary over time, it
recovers identical estimates to a fixed effects approach, and it preserves cross-sectional variation to
recover preferences for time-invariant attributes (Mundlak, 1978; Wooldridge, 2019). Specifically,
I estimate
\[
\delta_{jt} = X_{j} \alpha_0 + X_{jt} \alpha_1 + \bar{X}_{js} \alpha_2 + \nu_{jt}, \tag{1.6}
\]
where $X_{j}$ includes time-invariant attributes, $X_{jt}$ includes time-varying attributes, and $\bar{X}_{js}$ is the
mean of time-varying attributes at park $j$ in season $s$.
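The equivalence between the correlated random effects estimate and the fixed effects estimate for time-varying attributes can be checked on simulated data. The sketch below is illustrative only: the groups stand in for park-by-season cells, and every variable name and parameter value is an assumption rather than part of the estimation in this chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_periods = 40, 12                  # hypothetical park-by-season cells and periods
group = np.repeat(np.arange(n_groups), n_periods)

z = rng.normal(size=n_groups)[group]          # time-invariant attribute
effect = rng.normal(size=n_groups)            # unobserved group-level effect
x = 0.8 * effect[group] + rng.normal(size=group.size)   # time-varying, correlated with effect
y = 0.5 * z - 0.3 * x + effect[group] + rng.normal(scale=0.1, size=group.size)

# Group means of the time-varying attribute (the Mundlak device)
counts = np.bincount(group)
x_bar = (np.bincount(group, weights=x) / counts)[group]

# Correlated random effects: pooled OLS with the group mean as an extra regressor
X_cre = np.column_stack([np.ones_like(x), z, x, x_bar])
coef_cre = np.linalg.lstsq(X_cre, y, rcond=None)[0]

# Fixed effects (within) estimator for the time-varying attribute
y_bar = (np.bincount(group, weights=y) / counts)[group]
x_within, y_within = x - x_bar, y - y_bar
coef_fe = (x_within @ y_within) / (x_within @ x_within)
```

Here `coef_cre[2]`, the coefficient on the time-varying attribute, matches `coef_fe` up to floating-point error, while `coef_cre[1]` still delivers a coefficient on the time-invariant attribute that a fixed effects regression would absorb.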
1.6     Preferences for the U.S. National Parks
    Table 1.3 reports estimates of travel cost, travel mode, and heterogeneity coefficients for several
specifications of equation 1.1. For all specifications and all income groups, the travel cost coefficient
is negative and significantly different from zero. Individuals with lower incomes are more sensitive
to travel costs, consistent with a diminishing marginal utility of income.
    Preferences for travel mode differ meaningfully by income group, but on average, all prefer
                          Table 1.3: Estimates, 2008 and 2018 Survey Periods
                                                          (1)       (2)       (3)
                     Travel cost ($100)                -0.251    -0.250    -0.251
                                                      (0.016)   (0.015)   (0.016)
                     Fly                               -0.945    -1.132    -1.131
                                                      (0.087)   (0.109)   (0.109)
                     TC x income < $25,000             -0.277    -0.328    -0.326
                                                      (0.022)   (0.027)   (0.027)
                     TC x income > $100,000             0.127     0.120     0.119
                                                      (0.009)   (0.009)   (0.009)
                     Flying x income < $25,000                    0.914     0.917
                                                                (0.178)   (0.177)
                     Flying x income > $100,000                   0.351     0.241
                                                                (0.131)   (0.138)
                     Flying x parent                                        0.568
                                                                          (0.117)
                     Dissimilarity coefficient          0.657     0.662     0.654
                                                      (0.042)   (0.041)   (0.042)
                     Note: The table shows estimates of the travel mode, travel cost,
                     and heterogeneity coefficients in equation 1.1 with standard
                     errors in parentheses. Income interaction coefficients are relative
                     to the middle income group.
driving to flying. The low-income group is willing to pay $37 extra per household, per trip to drive
rather than fly, while the middle- and high-income groups are willing to pay premiums of $453 and
$674 to drive. One explanation for the driving premium is flexibility, as driving allows groups to
adjust their schedule and add side trips. Higher income groups may be willing and able to pay for
this flexibility. However, the driving premium may also reflect factors I do not include in travel costs,
such as airport parking fees, the risk of flight cancellations, or the fuel efficiency, reliability, and
comfort of respondents’ vehicles.
    The dissimilarity coefficient is between zero and one, implying the nested logit model is
consistent with utility maximizing behavior (McFadden, 1979). I use estimates from the model in
column 1 when calibrating the panel of park effects, but I plan to use results from a model with
more detailed heterogeneity in the future.
    Figure 1.2 shows estimated monthly park effects for two parks: Glacier NP and Great Smoky
Mountains NP. The park effects should be interpreted relative to the “no visit” alternative, which
is normalized to provide a zero mean utility each month.3 The consistently negative park effects
indicate that potential visitors prefer the “no visit” alternative to visiting a specific park, even when
that park has zero travel costs. In the context of the model, individuals will only choose to visit a
park if it has a large, positive error term draw. In interpreting this result, it is helpful to recall that the
“no visit” alternative encompasses all ways to spend a month that do not involve visiting a national
park. The estimated park effects also represent mean utilities for both visitors and non-visitors.
They are also sensitive to the specified number of choice occasions and the market size. Assuming
fewer choice occasions or a smaller market size raises park visitation shares and increases park
effects relative to the “no visit” alternative.
                              Figure 1.2: Travel costs increase with distance
Note: This figure plots park effect estimates for Great Smoky Mountains NP (solid-brown) and
Glacier NP (dashed-blue) in 2018. Both exhibit seasonal variation that has been largely overlooked
in RUM recreation demand models.
    Glacier’s park effects exhibit dramatic seasonal variation, peaking in the summer and collapsing
in the winter. Converting the seasonal differences to dollar terms, potential visitors are willing to
    3 Note that one can easily change the interpretation of the park effects by taking the residual from
a regression of the park effects on month-of-sample fixed effects. After this revision, park effects
can be interpreted relative to other parks, rather than an outside option that varies across time.
pay $1,032 more to visit Glacier in July rather than January. Great Smoky Mountains displays a
flatter peak period and a less extreme winter decline. Similar patterns at other parks suggest that
climate and weather drive seasonal variation in park effects.
     Table 1.4 shows how park attributes impact park effects. Unshaded variables do not
vary meaningfully between 2005 and 2019, either due to data availability or geophysical processes.
They are identified with only cross-sectional variation, while variables in the shaded rows leverage
within-park variation.
     Conditional on other observable attributes, visitors are willing to pay more to visit parks
with redwood forests, bison, bald eagles, coastline, more roads, more trails, and large elevation
ranges. Population density in surrounding counties is also positively correlated with the park
effects. This may reflect amenities near parks, such as restaurants, hotels, and other attractions, but
the estimate is likely biased upward, because desirable, unobserved park attributes attract visitors
and generate local economic impacts. The land cover coefficient estimates suggest that visitors
appreciate barren land, a category that includes rock and sand, more than other land cover types,
such as forest, wetland, and grassland. Willingness to pay is lower for parks with grizzly bears and
those with more diverse land cover, measured using a standardized Herfindahl-Hirschman Index.
Estimates described in this paragraph should be interpreted with caution, because they are identified
with cross-sectional variation. Nonetheless, they provide the most extensive, revealed-preference
evidence to date regarding what attracts visitors to U.S. national parks.
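The dollar values attached to park attributes follow from a simple ratio, described in the note to Table 1.4. The sketch below reproduces a few of the reported values, assuming the travel cost coefficient of -0.251 from column 1 of Table 1.3 (the middle income group) and using the magnitude of that coefficient so that desirable attributes receive a positive WTP. The function name is hypothetical.

```python
def wtp_dollars(attribute_coef, travel_cost_coef):
    """Per-trip willingness to pay in dollars when travel cost is measured in $100s."""
    # Dividing by the magnitude of the (negative) travel cost coefficient converts
    # utility units into hundreds of dollars; multiplying by 100 gives dollars.
    return attribute_coef / abs(travel_cost_coef) * 100

TRAVEL_COST_COEF = -0.251   # assumed: Table 1.3, column 1

wtp_redwoods = round(wtp_dollars(0.872, TRAVEL_COST_COEF))
wtp_bison = round(wtp_dollars(0.296, TRAVEL_COST_COEF))
wtp_grizzly = round(wtp_dollars(-0.806, TRAVEL_COST_COEF))
```

Under these assumptions the three values come out to 347, 118, and -321, in line with the WTP column of Table 1.4. Small differences for other rows would be expected because the table reports rounded coefficients.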
     I use within-season variation at each park to identify coefficients on time-varying attributes.
More rainy days in a month, both historically and contemporaneously, decrease willingness to pay,
while park acreage changes have minimal impact.
     The national park designation coefficient reflects the impact of switching a park’s designation to
“national park” from one of the various other designations. Common wisdom suggests redesignating
units with the official national park designation will increase their visibility and attract more visitors.
This has even been proposed as a method to reduce crowding at other parks, by making substitutes
more appealing. My estimate of the redesignation effect suggests that an official national park
           Table 1.4: Preferences for Park Attributes
                                                    Coefficient  WTP
  Redwoods present                                     0.872*     347
                                                      (0.508)
  Bison present                                         0.296     118
                                                      (0.268)
  Bald eagles present                                   0.152      60
                                                      (0.147)
  Coastal                                               0.087      35
                                                      (0.262)
  Elevation range (1000 ft)                            0.052*      21
                                                      (0.031)
  Land cover share: barren land                      0.020**       8
                                                      (0.007)
  Trail miles (10 miles)                             0.017**       7
                                                      (0.005)
  Nearby population density (100 per sq mile)        0.008**       3
                                                      (0.002)
  Road miles (10 miles)                                 0.002      1
                                                      (0.004)
  Land cover share: shrub/scrub                        0.003       1
                                                      (0.003)
  Lake acreage (100 acres)                              0.000      0
                                                      (0.000)
  Acreage (10k)                                         0.001       0
                                                      (0.002)
  Trail miles x elevation range                        0.000       0
                                                      (0.001)
  Precipitation days                                 -0.005**      -2
                                                      (0.001)
  Land cover share: grassland                         -0.009*      -3
                                                      (0.004)
  Average precipitation days                         -0.013**      -5
                                                      (0.006)
  Land cover share: emergent wetland                   -0.014      -6
                                                      (0.010)
  Land cover share: mixed forest                     -0.016**      -7
                                                      (0.006)
  National Park designation                           -0.027*     -11
                                                      (0.014)
  Coastal x elevation range                            -0.070     -28
                                                      (0.093)
  Land cover diversity (standardized)                -0.311**    -124
                                                      (0.110)
  Grizzly bears present                              -0.806**    -321
                                                      (0.388)
  R-squared:                                            0.557
 * - Significant at 90% Level, ** - Significant at 95% Level. Estimates
for shaded variables are equivalent to estimates from a model including
park-by-season fixed effects. Unshaded variables use only between-park
variation. Flexible temperature controls are also included. Willingness
to pay (WTP) is calculated by dividing each attribute coefficient by the
travel cost coefficient for the middle income group and multiplying by 100.
designation has little impact on the willingness to pay for a visit. Importantly, my estimate
is identified from only three redesignations (Pinnacles, Gateway Arch, and Indiana Dunes) that
occurred between 2005 and 2019. When analyzing a broader set of redesignations, Szabó and
Ujhelyi (2021) find that an official national park designation does increase visitation. Although
our studies vary methodologically, the discrepancy in our estimates likely stems from my more limited
sample. This suggests that the impact of redesignations may vary substantially by park. In short,
redesignations do not seem to be an all-powerful shortcut for attracting visitors.
    Even with this broad array of park attributes and temperature controls, roughly 44% of the
variation in park effects remains unexplained. Given the unique resources the parks protect, this is
not surprising. It is difficult to estimate the value of iconic park attributes, such as Arches’ arches
or Yellowstone’s Old Faithful geyser, which are often idiosyncratic and, famously, remain largely
unchanged over time.
    By capturing mean utilities after controlling for travel costs, monthly park effects provide a
national park awesomeness index. Table 1.5 shows the implied ranking for 2018 based on national
parks’ maximum park effect throughout the year. I convert the park effects to a 100-point scale. The
maximum park effect between 2005 and 2019 scores 100 and the minimum scores 0. This ranking
method offers an attractive alternative to rankings from the popular media that are typically based on
travel bloggers’ personal experiences or raw visitation counts. Unlike experience-based rankings,
my ranking is systematic and incorporates the visitation history of the entire U.S. population.
Unlike rankings based on raw visitor counts, my ranking controls for the travel costs of reaching a
park to isolate the appeal of the park itself.
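The 100-point conversion amounts to a min-max rescaling of the park effects. The sketch below uses a small made-up array and, for simplicity, treats the same months as both the full sample that pins down the scale and the ranking year; in the chapter, the scale is set by all park effects from 2005 through 2019, while the ranking uses each park's maximum in 2018. The function name and the numbers are illustrative assumptions.

```python
import numpy as np

def awesomeness_scores(park_effects):
    """Min-max rescale an array of park effects so the minimum is 0 and the maximum is 100."""
    lo, hi = park_effects.min(), park_effects.max()
    return 100.0 * (park_effects - lo) / (hi - lo)

# Hypothetical park effects: rows are parks, columns are months
effects = np.array([[-9.0, -7.5, -6.0],
                    [-8.0, -6.5, -7.0],
                    [-10.0, -9.5, -9.0]])

scores = awesomeness_scores(effects)
ranking_scores = scores.max(axis=1)      # each park's best month
order = np.argsort(-ranking_scores)      # park indices from highest to lowest score
```

Ranking on the maximum monthly score means a park is judged by its peak season rather than by its annual average, which matters for parks with strong seasonal swings.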
    The top ten ranking includes many of the most famous national parks, such as Glacier, Yellowstone,
and Grand Canyon. One surprising result is that Golden Gate National Recreation Area
tops the list. Golden Gate provides views of the famous Golden Gate Bridge, beaches, hiking
trails, and popular attractions like Alcatraz Island, but for several reasons, its ranking is likely
inflated. Although the model controls for the travel costs of accessing each park, it does not control
for complementary destinations near a park. Visitors to Golden Gate likely visit other Bay Area
                                Table 1.5: Most Awesome National Parks
                       Rank    Park                                             Rating
                         1     Golden Gate RA                                     97.4
                         2     Glacier                                            93.9
                         3     Yellowstone                                        92.9
                         4     Grand Canyon                                       92.5
                         5     Grand Teton                                        91.9
                         6     Mount Rainier                                      91.4
                         7     Acadia                                             91.0
                         8     Rocky Mountain                                     90.9
                         9     Olympic                                            90.6
                        10     Zion                                               90.5
                      Note: The National Park awesomeness index combines visitation
                     and travel cost data to rank parks by the mean utility they provide
                     visitors. The ranking reflects each park’s maximum park effect
                     throughout 2018.
attractions on the same trip, while Glacier NP, for example, has fewer convenient complementary
attractions. Furthermore, local residents may visit Golden Gate several times per month, or even
several times per week. While my assumption that visitors take at most one trip per month-long
choice occasion may be appropriate for most people and most parks, it is likely too coarse for local
residents. Golden Gate’s proximity to the Bay Area means there are many local residents that may
visit frequently and bias its park effect upward.
1.7     Conclusion
    This paper conducts the most comprehensive analysis to date of demand for U.S. national parks.
The results describe preferences for the national parks and their attributes. On average, potential
visitors are willing to pay $376 more to drive instead of fly to a park, even with identical driving and
flying travel costs. Visitors are willing to pay more to visit parks with iconic wildlife, wide-ranging
elevation, and coastline, and preferences vary dramatically across seasons, particularly at parks
with harsh winters.
    I produce a national parks awesomeness index that provides a systematic alternative to existing
rankings and controls for the travel costs of accessing each park. It produces largely intuitive results,
ranking many of the most iconic parks in the top ten. Observable park attributes explain 56% of
the variation in the index, meaning idiosyncratic, unobservable, or difficult to quantify attributes
play an important role in driving visitation.
    My model, data infrastructure, and estimation procedure are valuable tools for studying the
national parks and recreation demand more broadly. The estimation procedure provides a method
of controlling for changing travel costs and demand system spillovers. It filters visitor count
data through a structural model. This preserves the panel structure of the visitor count data,
which is useful for identification, while the model provides the structure for welfare analysis
and counterfactual simulations. It also provides a technique for bridging gaps in individual-level
survey data. These advances make the framework relevant for policy and management decisions
throughout the National Park System, such as crowding, the impacts of climate change, and potential
infrastructure investments. This is particularly important given recent legislative actions, which
provide new resources for the continued conservation of the country’s most treasured resources.
                                           CHAPTER 2
      THE WELFARE IMPACT OF CLIMATE CHANGE ON U.S. NATIONAL PARK
                                     SYSTEM VISITATION
2.1     Introduction
    For over a century, a core mission has guided the National Park Service: “preserve unimpaired
the natural and cultural resources and intrinsic values of the National Park System for the
enjoyment, education, and inspiration of this and future generations” (Org, 1916). Steadfast
dedication to this mission is one reason the national parks have amassed over 15 billion visits,
become iconic global landmarks, and been dubbed “America’s Best Idea.” Yet, climate change
poses a fundamental challenge for the national parks. Warming temperatures, sea level rise,
drought, and an increased frequency of extreme weather and wildfire make preserving the national
parks unimpaired increasingly difficult.
    This paper evaluates how climate change will impact the welfare generated by national park
visitation. Adapting the model from Chapter 1, I estimate preferences for long-run average tem-
peratures and short-run temperature deviations within a random utility maximization (RUM) travel
cost model. My empirical strategy identifies preferences for long-run average temperatures using
within-season variation in park average temperatures, and I allow preferences for short-run temperature
deviations to vary across average temperature bins. Using these estimated preferences and
projected climate and weather conditions, I then simulate national park visitor welfare under climate
change.
    Abstracting from closures and changes to park resources, I find that climate change will likely
increase the surplus generated by national park visitation. When simulating future welfare under a
moderate climate projection, average annual total welfare from 2040 to 2049 is $600 million greater
than from 2010 to 2019. Beneath this overall increase, the change in welfare varies substantially by
season. Welfare decreases in the summer months due to the warming of already hot temperatures,
but these losses are offset by large welfare gains from warming cooler months.
    Strong preferences against cold temperatures drive the welfare results. Willingness to pay
(WTP) is maximized at long-run average temperatures between 70 F and 85 F, and cold tempera-
tures reduce WTP much more than extreme heat. Relative to the ideal temperature range, visiting
a park when the long-run average temperature is 30 F reduces average household per-trip WTP by
$503, while visiting when the temperature is 95 F reduces WTP by just $107.
    Preferences for short-run deviations also suggest gains from warming temperatures, although
WTP for favorable short-run temperature deviations is roughly five times smaller than for
equivalent changes in long-run average temperatures. Positive temperature deviations increase WTP
at temperatures below 80 F. I do not find a significant negative impact of warmer-than-average
months at hotter temperatures. However, my estimates have large standard errors in this range, so I
cannot rule out negative impacts.
    I contribute to a growing literature studying the nonmarket impacts of climate change. Previous
research in this space has explored how climate change will impact crime, mortality, and other
aspects of human health (Hsiang et al., 2017; Carleton et al., 2022; Deschenes et al., 2009).
Many of these papers exploit short-run temperature deviations as plausibly exogenous temperature
variation. While increased variability of short-run temperature deviations is one aspect of climate
change, this literature often abstracts from changes in long-run average temperatures. Motivated
by Bento et al. (2020), I exploit within-season variation to estimate the impacts of both long-run
average temperatures and short-run deviations. My results, that visitors have strong preferences for
long-run average temperatures, suggest that the existing literature’s focus on short-run deviations
may omit an important aspect of climate change.
    Several other papers study how climate change will impact recreation demand. Almost all of
these papers focus on specific recreational activities or geographic regions. Given their different
contexts, they produce mixed results on the overall impact of climate change on recreation. Chan
and Wichman’s study of cycling predicts welfare gains, while Parthum and Christensen (2022) and
Dundas and von Haefen (2020) predict welfare losses for skiing and marine fishing. In a more
broadly focused study, Chan and Wichman (2022) use short-run temperature deviations to study
eight outdoor activities using time-use diaries from across the United States. Their results suggest
climate change will produce aggregate welfare gains of at least $5 billion for these activities.
    The lack of consensus surrounding welfare impacts of climate change on recreation makes
this study valuable. My setting includes recreation sites from a broad geographic range, and my
outcome of interest, park visitation, subsumes activities, like angling, hiking, and cycling, that
have previously been studied in isolation. By studying a range of sites and a more general activity,
my findings provide important evidence regarding the overall welfare impact of climate change on
outdoor recreation.
    Dundas and von Haefen (2020) is the only other paper to quantify the welfare impacts of climate
change on outdoor recreation using a random utility maximization framework. I extend their
methodological contribution by allowing climate and weather to influence both the participation
and site choice decisions. In Dundas and von Haefen’s model, temperature only influences the
participation choice. This contribution is critical in my setting, where climate and weather vary
dramatically across parks in the choice set. Allowing temperatures to impact the site choice
decision is likely important in many other recreation demand settings as well. For example, a site
with swimming opportunities may provide more enjoyment at high temperatures than a site without
water-based recreation.
    Fisichelli et al. (2015) also study the impact of climate change on U.S. National Park System
visitation.1 They regress monthly visitation on temperature at 340 national parks and find the
highest visitation at temperatures between 63 F and 77 F. They predict an 8 to 23% increase in
system-wide visitation by mid-century, driven by increased visitation in off-peak seasons. My work
builds on their analysis by examining welfare impacts, accounting for inter-park substitution using
a discrete-choice framework, and separately identifying the impact of average temperatures and
temperature deviations.
    The remainder of this paper is organized as follows. Section 2.2 introduces the model.
Section 2.3 discusses the visitation, climate, and weather data. Section 2.4 provides the details regarding
the welfare simulation. Section 2.5 discusses estimated preferences for temperature. Section 2.6
presents simulated welfare impacts, and Section 2.7 concludes.
    1 Several papers, such as Henrickson and Johnson (2013), include more limited discussion of
temperature and national park visitation.
2.2     Model
    In Chapter 1 of this dissertation, I introduce a model of individuals’ national park visitation
decision. The model in this section differs only in how I decompose the park-month fixed effects.
The Chapter 1 model focuses on explaining the park-month fixed effects using a suite of observable
park attributes. Here, I include a flexible set of fixed effects that subsume most park attributes, and
I focus on how temperatures explain variation in the park-month fixed effects.
    Suppose that each month individuals choose whether to visit a national park, which park to
visit, and whether to drive or fly on their visit. Denote the set of national parks $\mathcal{J} = \{1, 2, \ldots, J\}$
and the set of travel modes $\mathcal{M} = \{D, F\}$, where $D$ and $F$ indicate driving and flying. Let $j = 0$
denote the outside option, which is each individual’s preferred way of spending a month that does
not involve visiting a national park. I group visits to the National Park System’s historic units as
alternative $j = J + 1$. Define the utility individual $i$ receives from visiting national park $j$ using
travel mode $m$ during month $t$ as
\[
U_{ijmt} =
\begin{cases}
\delta_{0t} + \epsilon_{i0t} & j = 0 \\
\delta_{jt} + \beta^{TC} TC_{ijDt} + \epsilon_{ijDt} & j \in \{1, \ldots, J\},\ m = D \\
\delta_{jt} + \beta^{F} + \beta^{TC} TC_{ijFt} + \epsilon_{ijFt} & j \in \{1, \ldots, J\},\ m = F \\
\delta_{J+1,t} + \epsilon_{i,J+1,t} & j = J + 1
\end{cases}
\tag{2.1}
\]
    Coefficient $\beta^{TC}$ represents the marginal disutility of travel costs, and coefficient $\beta^{F}$ represents
the fixed cost of flying relative to driving. For $j \in \{1, \ldots, J\}$, the park-month fixed effect, $\delta_{jt}$,
captures the mean utility provided by a park after controlling for travel costs. In plain terms, the
park-month fixed effects capture the awesomeness of each national park in each month.
     Chapter 1 describes the relevant details for estimation of equation 2.1. I estimate the parameters
via maximum likelihood, using a contraction mapping to incorporate individual and park-level
visitation data. I then calibrate the model to produce a monthly panel of park fixed effects from
January 2005 through December 2019.
     I decompose the park-month fixed effects as
\[
\delta_{jt} = \sum_b \left( \alpha_b^{AVG} \, 1(\overline{temp}_{jt} \in b) \right)
  + \sum_b \left( \alpha_b^{DEV} \, temp^{DEV}_{jt} \, 1(\overline{temp}_{jt} \in b) \right)
  + \alpha^{X} X_{jt} + \gamma_{j s(t)} + \phi_t + \nu_{jt}
\tag{2.2}
\]
     The primary variables of interest in equation 2.2 are $\overline{temp}$ and $temp^{DEV}$. The variable $\overline{temp}$
represents the average temperature at a park over the past ten years in a given calendar month (e.g.,
the average temperature at Yellowstone in May over the previous ten years if $j$ = “Yellowstone” and
$t$ corresponds to the month of May). The variable $temp^{DEV}$ represents the deviation from $\overline{temp}$
that occurs at a park in a given month (i.e., how much warmer or colder than average the park is).
My specification allows for a flexible relationship between temperature and the park-month fixed
effects by estimating a separate $\overline{temp}$ coefficient for 5 F bins (denoted with the $b$ subscript). It also
allows preferences for deviations to vary by average temperature. This allows individuals to prefer
warmer-than-average temperatures when temperatures are typically cold and cooler-than-average
temperatures when temperatures are typically hot.
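     As a minimal sketch of how the binned temperature terms in equation 2.2 could be constructed, the Python snippet below assigns a park-month to a 5 F bin of its long-run average temperature and lets the deviation enter only through that bin. The function names, the bin range (20 F to 100 F), and the coefficient values are illustrative assumptions, not the code or estimates used in this chapter.

```python
import math


def bin_index(temp_avg, lo=20.0, width=5.0, n_bins=16):
    """Index of the bin containing temp_avg (bin edges are hypothetical)."""
    idx = math.floor((temp_avg - lo) / width)
    # Clip averages that fall outside [lo, lo + n_bins * width) into the end bins
    return max(0, min(n_bins - 1, idx))


def temperature_terms(temp_avg, temp_dev, n_bins=16, **bin_kwargs):
    """Regressors for one park-month: bin indicators and deviation interactions."""
    b = bin_index(temp_avg, n_bins=n_bins, **bin_kwargs)
    avg_ind = [0.0] * n_bins   # 1(average temperature in bin b)
    dev_int = [0.0] * n_bins   # deviation times the same indicator
    avg_ind[b] = 1.0
    dev_int[b] = temp_dev
    return avg_ind, dev_int


def temperature_component(alpha_avg, alpha_dev, temp_avg, temp_dev):
    """Temperature part of the park-month effect for given (made-up) coefficients."""
    avg_ind, dev_int = temperature_terms(temp_avg, temp_dev, n_bins=len(alpha_avg))
    level = sum(a * x for a, x in zip(alpha_avg, avg_ind))
    shock = sum(a * x for a, x in zip(alpha_dev, dev_int))
    return level + shock


# Hypothetical park-month: long-run average of 71 F, 1.5 F cooler than average
avg_ind, dev_int = temperature_terms(71.0, -1.5)
```

     In a regression with the fixed effects in equation 2.2, one average-temperature bin would be omitted as the reference category, so the remaining bin coefficients are read relative to it.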
     The set of control variables, $X_{jt}$, includes the number of days with precipitation. Just like for
temperature, I define ten-year moving average and deviation variables. For parsimony, I do not
specify a non-linear relationship between precipitation and park-month fixed effects.
     Equation 2.2 also includes month-of-sample fixed effects ($\phi_t$). These parameters capture
system-wide shocks to national park mean utilities, and they influence the interpretation of park-
month effects across time. The estimation procedure, outlined in Chapter 1, normalizes the mean
utility from the outside option to zero in each month. This implies that all park-month fixed effects
should be interpreted relative to their month’s outside option. If the quality of the outside option
changes over time, it complicates cross-month comparisons of park-month effects. The month-
of-sample fixed effect absorbs variation in the quality of the outside option, allowing for a more
natural cross-month comparison of park-month effects.
    The park-season fixed effects in equation 2.2 ($\gamma_{j s(t)}$) play a critical role in identifying preferences
for temperature. These parameters control for all observed and unobserved park characteristics
constant throughout a season, such as park programs or tours, which tend to be more active in
peak months. Including these fixed effects leaves within-season variation at each park to identify
preferences for temperature. For example, estimation will attribute variation in Yellowstone’s
park-month fixed effects across March, April, and May to variation in average temperatures and
temperature deviations, after controlling for system-wide shocks and precipitation.
    Park attributes that vary within a season, are correlated with temperatures, and influence
visitation, still pose threats to identification. Consider events like fall foliage viewing, which
attracts visitors and occurs for a limited portion of the fall season. Seasonal road closures due
to heavy snow, which are common at high-elevation parks, are also correlated with temperature;
roads typically close in the fall and re-open in the spring. These are just two possible
examples that complicate a causal interpretation of the temperature coefficients, especially the
average temperature coefficient.
    Despite these concerns, I argue that this specification isolates relevant variation for understand-
ing climate impacts. For example, the National Park Service has already documented evidence of
springtime conditions (e.g., trees gaining their leaves) occurring earlier in the season. Thus, in the
coming decades, national parks in March may experience average temperatures similar to those they
currently experience in April. So while within-season variation may not isolate the impact of temperature
alone on visitation, to some extent, it captures both the impact of temperature on visitor comfort
and the impact of temperature on park management and ecology.
2.3     Visitation, Weather, and Climate Projection Data
    I observe park visitation using individual-level survey data and park-level visitor counts. The
survey data come from the National Park Service’s Comprehensive Survey of the American Public,
a nationally representative telephone survey administered in 2008 and 2018. The survey contains
several variables describing each respondent’s national park visitation history: the park they visited
most recently, the number of times they visited national parks in the two years prior to the interview,
and whether they drove or flew on their most recent visit. Respondents also report their
state of residence, allowing me to compute the travel costs of reaching any of the national parks.
    One strength of the Comprehensive Survey of the American Public is that it includes both visitors
and non-visitors. This feature is rare for a national survey of recreation demand. Unfortunately
for my analysis, the survey does not include the date or timing of respondents’ visits, limiting my
ability to identify preferences for climate and weather from the survey data alone.
    I obtain park-level visitor counts from the National Park Service’s Visitor Use Statistics. The
visitor counts are published for 383 of the over 400 national parks at the monthly level. I focus on
months between 2005 and 2019, which overlap the telephone survey periods. Chapter 1 describes
the survey and visitor count data in more detail.
    To understand how climate and weather impact visitation, I collect park temperature and
precipitation variables from the Global Historical Climatology Network’s Global Summary of the
Month datasets (Lawrimore et al., 2016). These data document temperature and precipitation
observations collected by weather monitoring stations. I extract two monthly variables for each
station: mean daily high temperature and the number of days with more than 0.1 inches of
precipitation.
    Parks often have several weather stations in their vicinity. For each park, I select the nearest
station with less than 25% of months missing data as the representative station. If a park has
multiple stations within its boundaries that meet the completeness criteria, I select the station with
the most complete data as the representative station. On average, representative stations are 5.2
miles from the park they represent. When representative stations are missing data, which occurs
for 10% of the station-months, I predict missing temperature and precipitation variables using
observations from nearby stations.
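    The station selection rule described above can be sketched in Python as follows. The `Station` record, its field names, and the example stations are hypothetical, and the sketch assumes that each station's distance to the park and its share of missing months have been computed beforehand.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Station:
    station_id: str
    distance_miles: float  # distance to the park; assumed precomputed
    share_missing: float   # share of months with missing data, between 0 and 1
    inside_park: bool      # True if the station lies within the park boundary


def pick_representative(stations: List[Station],
                        max_missing: float = 0.25) -> Optional[Station]:
    """Completeness screen first, then the within-boundary rule, then distance."""
    eligible = [s for s in stations if s.share_missing < max_missing]
    if not eligible:
        return None  # no station passes the completeness screen
    inside = [s for s in eligible if s.inside_park]
    if len(inside) > 1:
        # Several qualifying stations inside the park: keep the most complete one
        return min(inside, key=lambda s: s.share_missing)
    # Otherwise use the nearest qualifying station
    return min(eligible, key=lambda s: s.distance_miles)


# Hypothetical candidates for a single park
example = [
    Station("A", 0.0, 0.10, True),
    Station("B", 0.0, 0.02, True),
    Station("C", 3.0, 0.00, False),
    Station("D", 1.0, 0.40, False),
]
chosen = pick_representative(example)
```

    In this made-up example, stations A and B both sit inside the park and pass the completeness screen, so the more complete station B is chosen; station D is nearer than C but is screened out for missing data.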
    To characterize long-run temperature and precipitation, I calculate the ten-year average of these
two variables for each month of the year at each park. For example, I calculate the average daily
high temperature in Yellowstone National Park over the ten previous Aprils. With contemporaneous
monthly variables and averages in hand, I calculate the deviation from monthly averages at each
park in each month.
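    A minimal Python sketch of this calculation is shown below. It assumes one observation per park, year, and calendar month, and it reports an average only when all ten previous years of that calendar month are observed; the function name, the record layout, and the example temperatures are illustrative assumptions.

```python
def normals_and_deviations(records, window=10):
    """Ten-year calendar-month averages and deviations.

    records: iterable of (park, year, month, value) tuples.
    Returns {(park, year, month): (average, deviation)}, where the average is
    taken over the `window` previous years of the same calendar month.
    """
    by_key = {(park, year, month): value for park, year, month, value in records}
    out = {}
    for (park, year, month), value in by_key.items():
        previous = [by_key.get((park, year - k, month)) for k in range(1, window + 1)]
        if any(v is None for v in previous):
            continue  # not enough history for this park-month
        average = sum(previous) / window
        out[(park, year, month)] = (average, value - average)
    return out


# Made-up April daily-high averages for one park, 2000 through 2010
records = [("Yellowstone", 2000 + k, 4, 50.0 + k) for k in range(11)]
result = normals_and_deviations(records)
```

    With these invented values, April 2010 is compared with the mean of the Aprils from 2000 through 2009, and earlier years are skipped because they lack a full ten-year history.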
    The average temperature, average number of precipitation days, deviation from average temper-
ature, and deviation from average precipitation are the weather variables in my model. Roughly,
average temperature and precipitation reflect the weather a visitor could expect to observe at a park
in a certain month. This expected weather is relevant for people planning their trip more than a few
weeks in advance. The deviation variables capture short-run weather events, like heatwaves and
cold snaps, that are not easily foreseen weeks before a visit.
    I calculate the same climate and weather variables for future conditions using downscaled
CMIP5 Climate Projections (Bureau of Reclamation, 2013). There are dozens of CMIP5 climate
projections available. For now, I use the Community Earth System Model Contributor’s projection
for representative concentration pathway (RCP) 4.5, which assumes society makes moderate emissions
reductions. Even within RCPs, climate projections differ, so I intend to incorporate several
climate projections in future work.
    Unlike the weather station data, climate projections are gridded products that provide predictions
every 1/8th degree of latitude and longitude (around eight miles in the contiguous United States).
I select one grid point to represent each park. For grid points within 0.5 degrees of the park, I
compare existing weather observations at the grid points to observations at the park’s representative
weather station. I select the grid point with the most similar weather as the park’s representative
grid point.
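    The grid point selection can be sketched as follows. The text does not specify the similarity measure, so this sketch assumes root mean squared error between the grid point's series and the station's series over a common historical period, and it interprets "within 0.5 degrees" as a box in latitude and longitude. The function name, the data layout, and the example values are hypothetical.

```python
import math


def pick_grid_point(park_lat, park_lon, station_series, grid_points, max_deg=0.5):
    """Return the nearby grid point whose series best matches the station.

    grid_points: list of dicts with keys 'lat', 'lon', and 'series', where
    'series' is aligned month by month with station_series.
    Similarity is measured by RMSE, which is an assumption of this sketch.
    """
    best, best_rmse = None, math.inf
    for point in grid_points:
        # Skip grid points outside the search box around the park
        if abs(point["lat"] - park_lat) > max_deg or abs(point["lon"] - park_lon) > max_deg:
            continue
        squared = [(g - s) ** 2 for g, s in zip(point["series"], station_series)]
        rmse = math.sqrt(sum(squared) / len(squared))
        if rmse < best_rmse:
            best, best_rmse = point, rmse
    return best


# Made-up station observations and candidate grid points
station = [50.0, 60.0, 70.0]
candidates = [
    {"name": "g1", "lat": 44.5, "lon": -110.5, "series": [51.0, 61.0, 71.0]},
    {"name": "g2", "lat": 44.6, "lon": -110.4, "series": [50.0, 60.0, 73.0]},
    {"name": "g3", "lat": 46.0, "lon": -110.5, "series": [50.0, 60.0, 70.0]},
]
selected = pick_grid_point(44.6, -110.5, station, candidates)
```

    Here the third candidate matches the station exactly but lies outside the search box, so the choice is made between the first two.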
    By selecting one representative station and grid point, I abstract from intra-park variation in
weather, which is substantial in some cases. An alternative method would be to average station
observations or use a gridded product and average points within each park. I prefer using a
representative station for two reasons. First, weather stations are often located near visitor centers
or gateway communities. Both are heavily trafficked by park visitors, meaning the weather observed
by stations is often relevant to visitor decision-making. Second, parks with wide-ranging weather
conditions often have rugged terrain and expansive backcountry that are sparsely visited. A
technique that averages grid points or stations is more likely to be influenced by these backcountry
locations, which experience substantially different weather than more highly visited areas.
2.4       Calculating Welfare Impacts
      I simulate the welfare impacts of changes in temperature and precipitation in two steps. First,
I predict a monthly panel of park effects under climate projection forecasts. Then, I calculate the
welfare change between current park effects and climate change park effects. I denote the predicted
park effect under future climate projections as
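\[
\hat{\delta}_{jt} = \sum_b \left( \hat{\alpha}_b^{AVG} \, 1(\overline{temp}_{jt} \in b) \right)
  + \sum_b \left( \hat{\alpha}_b^{DEV} \, temp^{DEV}_{jt} \, 1(\overline{temp}_{jt} \in b) \right)
  + \hat{\alpha}^{X} X_{jt} + \bar{\gamma}_{j s(t)} + \bar{\phi}_t
\tag{2.3}
\]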
      The prediction depends on temperature and precipitation under climate change, estimated
preferences for temperature and precipitation, and the park-season and month-of-sample fixed
effects. Temperature and precipitation variables come from future climate projections, and I use
parameter estimates for temperature and precipitation coefficients directly from my estimation.
While the model estimation produces estimates of park-season fixed effects and month-of-sample
fixed effects from 2005 to 2019, these parameters capture residual variation
      After predicting a monthly panel of park effects, I calculate the compensating variation (CV)
of national park visitation under the climate and weather conditions in month C relative to 2010
conditions.
\[
CV_i(\hat{\delta}_t) = \frac{1}{\beta^{TC}_i} \left( EU(\hat{\delta}_t) - EU(\hat{\delta}_{2010}) \right), \tag{2.4}
\]
      where $EU$ represents the expected utility of a choice occasion and is given by
\[
EU(\hat{\delta}_t) = \ln\left( 1 + \left( \sum_j \sum_m \exp\left( \frac{\hat{V}_{ijmt}}{\lambda} \right) \right)^{\lambda} \right). \tag{2.5}
\]
    In equation 2.5, the term $\hat{V}_{ijmt}$ is the predicted deterministic portion of the utility function
(equation 2.1) given parameter estimates.
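    As a rough illustration of the welfare calculation, the Python sketch below computes the expected utility as the log of one plus the park inclusive value raised to the power of the nesting parameter, and then scales the change in expected utility by the travel cost coefficient. The function names, the utility values, and the coefficient values are invented for illustration; the sketch follows the formulas literally and takes the sign convention of the scaling coefficient as given.

```python
import math


def expected_utility(v_park, lam):
    """Expected utility of one choice occasion.

    v_park: predicted deterministic utilities for every park and travel mode
            combination faced by one individual in one month.
    lam:    nesting parameter on the park alternatives.
    The outside option's mean utility is normalized to zero, which gives the
    1 inside the logarithm.
    """
    inclusive = sum(math.exp(v / lam) for v in v_park)
    return math.log(1.0 + inclusive ** lam)


def compensating_variation(v_future, v_base, beta_tc, lam):
    """Change in expected utility, scaled into dollars by the travel cost coefficient."""
    return (expected_utility(v_future, lam) - expected_utility(v_base, lam)) / beta_tc


# Hypothetical utilities for two park-mode alternatives under baseline and future climate
v_base = [-1.0, -2.0]
v_future = [-0.8, -2.1]
cv = compensating_variation(v_future, v_base, beta_tc=0.01, lam=0.5)
```

    Averaging such individual-level values over the sample and scaling to the population would give an aggregate figure; those aggregation steps are not shown here.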
    The welfare simulations embed several assumptions on how variables and parameters evolve
over time. For simplicity, I hold demographics and travel costs fixed at 2019 levels through all
welfare simulations. I also assume preferences for temperature and precipitation, park-season fixed
effects, and month-of-sample fixed effects remain constant across time. Each of these parameters
plays an important role in determining the magnitude of welfare impacts, and it is likely that each of
them will change, given a long enough time horizon. For example, visitors may acclimate to
warmer temperatures, mitigating the impact of extreme heat. It is also plausible that any noticeable
change to park resources would affect park-season fixed effects. Meanwhile, month-of-sample fixed
effects would change if cultural norms affect interest in the national parks relative to recreation and
tourism alternatives. In future work, I could explore the sensitivity of my estimates to a variety of
parameter evolution paths.
2.5    Preferences for Temperatures
    I begin the estimation results by focusing on preferences for average temperatures and temperature
deviations. These are critical inputs to the welfare analysis in the section that follows.
    Figure 2.1 shows that visiting a park provides the most surplus when average high
temperatures fall between 70 F and 85 F. WTP decreases sharply as temperatures become colder.
Relative to 70 F, visiting when the average high temperature is 30 F reduces household WTP by
$503 per trip. Moving from the ideal temperature range to hotter temperatures also reduces WTP
but not as dramatically. Visiting a park when the average high temperature is 95 F reduces WTP
by just $107.
    These results describing preferences for temperature align with the existing literature. Dundas
and von Haefen find that recreational marine fishing participation is maximized when daily high
temperatures are between 60 F and 85 F and that cold temperatures reduce participation more than
extreme heat. Both of these findings are similar to my results. In a hedonic analysis, Albouy et al.
find that individuals prefer daily average temperatures around 65 F. While Albouy et al. estimate
WTP for daily average rather than daily high temperature, a rough conversion from average to
high temperatures (e.g., adding 10 F) makes their preferred temperature estimate in line with my
estimates of the preferred temperature range.
                      Figure 2.1: WTP for long-run average high temperatures
Note: The figure shows potential visitors’ willingness to pay for a park visit across long-run average
high temperatures. All estimates are relative to the 75 F bin.
    Preferences for temperature deviations vary across the range of average high temperatures
(figure 2.2). When average high temperatures are cold, increasing the contemporaneous temperature
increases WTP by up to $6. For average high temperatures above 80 F, the impact of warming
temperatures is not statistically significant. However, I cannot rule out potential adverse effects of
heatwaves, because estimates have large standard errors for average high temperatures above 90 F.
                Figure 2.2: WTP for temperature deviations by average temperature
Note: The figure shows potential visitors’ willingness to pay for a one degree Fahrenheit positive
temperature shock.
    Figure 2.3 compares the impact of equal-sized changes in average temperatures and
temperature deviations. Increasing temperatures, both average and contemporaneous, has the
largest impact between 40 F and 50 F. In this range, a 5 F increase in the average high temperature
raises WTP by $107, and a 5 F increase in the contemporaneous temperature raises WTP by $32.
At a temperature of 90 F, increasing the average high temperature to 95 F reduces WTP by $39,
while a 5 F warmer-than-average month has little impact on WTP.
    The magnitude of WTP for changes in average high temperatures is almost always greater than
the WTP for changes in contemporaneous temperatures. This may be driven by visitors that plan
trips far in advance and respond to average temperatures, because they cannot observe deviations
at the time of their choice. After committing to their trip, it may be costly to cancel or substitute to
another location, minimizing the observable response to temperature deviations.
                             Figure 2.3: WTP for temperature increases
Note: The graph shows the willingness to pay for a five degree increase in long-run average temper-
ature (blue bars) and short-run temperature shocks (brown dots) by long-run average temperature.
It provides a uniform change in temperature for comparing estimates displayed in figures 2.1 and
2.2.
    Visitors may also respond to deviations by changing their behavior on a trip. They could
spend less time in a park or shift their park time to a different time of day. I cannot observe these
responses in either my survey or visitor count data, which may lead me to underestimate preferences
for temperature deviations. Even so, it is clear that both average temperatures and temperature
deviations influence the visitation decision, and my results suggest stronger preferences for average
temperatures.
    The importance of average temperatures has implications for the broader literature on climate
impacts, which often focuses on short-run temperature deviations because of their econometric
convenience. Yet in this national park visitation setting, focusing on deviations alone would
capture only a fraction of the response to climate change.
    To summarize, these temperature preference estimates produce two main findings. First,
individuals have a strong preference against cold temperatures. The preference against extreme
heat is more modest. Second, for an equal-sized temperature change, preferences for average high
temperatures are stronger than preferences for temperature deviations. Ignoring preferences for
average high temperatures would miss a large component of future welfare changes.
2.6    Welfare Impacts of Climate Change
    Figure 2.4 shows the predicted total welfare change under an RCP4.5 climate projection. The
overall welfare change is large and positive. Relative to a 2010 to 2019 baseline, average annual
welfare increases $440 million by 2030-2039 and $1.1 billion by 2050-2060.
                         Figure 2.4: Predicted welfare change under RCP4.5
Note: The figure shows predicted welfare under an RCP4.5 climate projection. The light-grey line
indicates annual predicted welfare, and the black line indicates the average annual welfare for each
decade. These predictions fix all non-weather variables and parameters at 2019 levels.
    While overall welfare increases, it varies substantially from year to year, and it becomes more
variable over time. Annual changes in welfare are driven by contemporaneous temperature and
precipitation, as well as updates to the ten-year averages. The variation in annual welfare rises 22%
from the 2010s to the 2040s and over 100% further into the future. Ten-year temperature averages
change gradually, so the increased variation is likely caused by larger deviations from average
temperatures. While I flexibly model preferences for these deviations, I do not explicitly model
uncertainty in weather conditions. If climate change brings greater weather uncertainty, models
with a more direct focus on uncertainty may be an important area for future research, particularly
in settings where recreation decisions (e.g., whether to recreate and where to recreate) are made in
advance.
    The overall welfare gains are driven by increased welfare during cooler months, which offsets
welfare losses in the summer (figure 2.5). This result aligns with the results describing preferences
for temperatures, which showed the WTP to warm cool temperatures is roughly five times greater
than the losses from extreme heat. As the magnitude of climate change increases over time, these
seasonal welfare changes become more pronounced.
                            Figure 2.5: Welfare impacts vary seasonally
Note: The figure shows the average welfare change by month for the 2020-2039 (solid) and 2040-
2059 (dashed) time periods relative to a 2000-2019 baseline.
    Just as welfare impacts differ by season, they will also differ by geographic location. Warmer
parts of the country may experience welfare losses, while cooler climates will likely experience
gains. Mapping these welfare changes is straightforward given my model, and I plan to explore
geographic heterogeneity in future work.
    Although I estimate overall welfare gains, the full story of climate change’s impact on the
national parks is complex. The estimated overall welfare gains depend on past variation in climate,
weather, and visitation. Climate change may bring unexpected and dramatic changes to park
resources and ecology that society has not yet experienced. Large-scale resource changes, which
could be caused by wildfire, sea level rise, or invasive species, would likely result in negative
welfare impacts that are not captured by the estimates presented in this section.
2.7     Conclusion
    The U.S. National Park System contains some of the world’s most treasured resources. Climate
change presents challenges for managing these iconic places, and it will influence visitor welfare
through its impact on visitor comfort and park resources.
    In this paper, I quantify the welfare impacts of climate change on U.S. national park visitation.
I find that climate change will likely increase the total welfare generated by the national parks.
Seasonal heterogeneity underlies these overall impacts. Welfare gains in the fall, winter, and spring
months outweigh losses in summer. These welfare impacts are driven by strong preferences against
cold and more modest preferences against extreme heat. Given the scope of this analysis, these
findings contribute important evidence to the debate over the impact of climate change on outdoor
recreation more broadly.
    I also find that preferences for average temperatures are up to five times larger than preferences
for temperature deviations. This suggests that the climate impacts literature’s focus on temperature
shocks may miss a substantial portion of climate change’s overall impact. However, the importance
of averages and deviations is likely context-specific.
    These findings are facilitated by my methodological innovation. Building on Dundas and von
Haefen and Chapter 1 of this dissertation, I develop the first RUM recreation demand model that
allows climate and weather variables to influence both the participation and destination choice.
The model and estimation procedure can be broadly applied in recreation demand contexts where
panel data describing natural resources are available.
    My findings should be interpreted with several caveats in mind. First, my welfare estimates do
not account for many potential changes to park resources. Future resource changes may be highly
unpredictable and severe. Invasive species, wildfire, sea level rise, and other climate impacts could
dramatically reshape and alter park resources. My analysis is also somewhat limited by my monthly
visitor counts and survey data. For example, I cannot observe intra-month responses to weather
conditions. Finally, I hold most parameters and variables constant when simulating welfare under
climate change, although I plan to test the sensitivity of my estimates to this assumption.
    These findings and limitations suggest promising avenues for future research. My model and
estimation procedure provide a methodological blueprint for exploiting natural experiments to
value nonmarket resources. This offers an encouraging opportunity to understand how climate
change may impact welfare by changing the quality of environmental resources. Understanding
these climate impacts is important for the National Park Service and for outdoor recreation more
broadly.


                                            CHAPTER 3
  VALUING WATER QUALITY WITH HIGH-FREQUENCY DATA: EVIDENCE FROM
                                MICHIGAN BEACH CLOSURES
                                    (WITH HYUNJUNG KIM)
3.1     Introduction
    Large and detailed administrative datasets have transformed economic research in many fields.
While some recreation demand studies have incorporated administrative data or cell phone-based
mobility data, surveys remain the overwhelming norm. However, surveys are often costly to
implement and face concerns regarding sampling and various forms of measurement error. These
weaknesses could have meaningful implications for nonmarket valuation estimates produced by
recreation demand models. Keiser (2019) finds that measurement error in water quality data is a
key reason for “missing” water quality benefits. It is plausible that more precise visitation data
could also improve estimates of water quality benefits.
    This paper uses administrative park visitation data collected by entry pass scanning technology
from a regional park system in southeast Michigan. The data document visitor ZIP codes and the
exact minute of park entry for the universe of park system visits. For annual passholders, we can
observe individuals’ full visitation history across the park system. The detail and scope of park
visitation captured by this dataset stand out among alternative sources of recreation demand data.
    We use these data to estimate the welfare impacts of water-quality-induced beach closures at
Lake St. Clair Metropark in southeast Michigan. Between July 21 and September 15, 2022,
elevated bacteria levels closed the park’s beach for eleven days and forced 21 days of contamination
advisories. Such beach closures are not uncommon at Lake St. Clair or throughout the Great Lakes
region.
    To value these closures, we complement our visitation data with twice-a-week water sampling
reports and daily beach closure records published by the Michigan Department of Environment,
Great Lakes, and Energy. The frequency of these datasets allows for an innovative modeling
and estimation approach. Our random utility maximization model of park visitation includes a
full panel of park-date fixed effects. We estimate the model in two stages. The first stage uses
maximum likelihood estimation and solves for the park-date fixed effects with the contraction
mapping introduced by Berry (1994). The second stage regresses the panel of park-date fixed
effects on a beach closure indicator and controls. The panel structure allows us to apply
causal inference techniques, such as difference-in-differences, in this second stage.
     In the raw data, we observe that Lake St. Clair Metropark receives 24,000 fewer visits after the
first 2022 beach closure than it did over the same time period in 2021. Using several specifications,
we estimate total welfare losses from the 2022 beach closures of about $70,000. The model with
the panel of park-date fixed effects produces results similar to those of an alternative approach that
includes a cross-section of park fixed effects and estimates parameters in one stage.
     Our paper makes several contributions to the recreation demand literature. Ours is the first
recreation demand paper to use administrative, high-frequency park visitation data that capture
the universe of park system visitors. By introducing this dataset, we build on a small number of
recent papers applying innovative datasets to study recreation demand. Gellman et al. (2022) use
campsite reservations from the Recreation.gov website to understand the recreational impacts of
wildfire smoke. Knittel et al. (2023) and Newbold et al. (2022) each use cell phone mobility data to
estimate recreation demand models.
     Our data have several advantages over these existing datasets. Unlike the campsite reservation
data, we observe the universe of visits within the park system. We also avoid the problem of “no-
shows,” where campers may make and maintain a reservation but not show up. Our data also avoid
concerns common with cell phone mobility data, such as “black-box” sampling and aggregation
techniques employed by third-party data aggregators. Additionally, cell phone mobility data often
provide zonal visitor counts, while for the annual passholders in our data, we observe the full history
of individual visitation within the park system.
     These rich data allow us to apply panel data econometric techniques within a RUM travel cost
model. Our model and estimation procedure build on Chapters 1 and 2 of this dissertation to
estimate a full panel of park-date fixed effects. In this setting, though, richer visitation data allow
us to leverage daily, rather than monthly, variation in the second stage regression.
    We also contribute to an extensive literature valuing the recreational welfare impacts of poor
water quality. In this domain, our work is most similar to Boudreaux et al. (2023), who also study
the impact of poor water quality on recreation in the Great Lakes region. They use a combination
of stated and revealed preference survey data and quantify a stigma effect that makes sites less
appealing days after a water quality warning has ended.
    Finally, we believe that our analysis takes place in an important but understudied setting. Our
analysis includes the twelve “metroparks” in the Detroit, Michigan suburbs. These metroparks are
accessible recreation sites for millions of Michiganders. Even though their natural resources may
be less unique than state or national parks, their proximity to densely populated areas makes them an
inexpensive recreation alternative with the potential to generate sizable surplus. This is evidenced
by the 7 million visits the system attracts every year. Many cities throughout the country have
similar suburban park systems, so our results could inform the management of similar recreation
sites beyond our setting.
3.2     Data
3.2.1   Background and variable description
Our visitation data include all visits to the Huron-Clinton Metroparks between May 15 and October
15 in 2021 and 2022. The metroparks system includes twelve parks, all of which are located in
southeast Michigan, throughout the Detroit suburbs. The metroparks offer a variety of amenities,
and several have substantial recreational infrastructure, such as swimming pools, splash pads, water
slides, paved trails, and nature centers. To enter the parks, visitors must purchase an annual pass
($40) or a daily pass ($10).
    Visitation data are collected at staffed booths located at every park entrance. At the booths,
visitors can purchase daily entry or display their annual passes. When visitors purchase daily entry,
the payment system records their credit card ZIP code. If visitors pay with cash, the park employee
asks for their ZIP code of residence. Visitors affix annual passes as windshield stickers, and park
employees scan barcodes on the passes before allowing cars to enter.
    The visitation data contain the minute of entry and visitor ZIP code for all park system visits. For
annual passholders, we have a unique household ID, allowing us to observe their complete visitation
history within the system. Unfortunately, we do not observe any demographic information. We
plan to incorporate ZIP code-level demographic data from Simply Analytics, a mapping, analytics,
and data visualization application, in future work.
    Obtaining accurate visitation data depends on how consistently entrances are staffed and how
reliably staff follow visit logging procedures. To understand the quality of these data, we flag all
days with zero visits for each metropark. The typical daily visitation for each park is high enough
that days with zero visits are likely the result of inaccurate visitor counts. Six parks regularly
document zero visitation throughout the sample period. For some of these, the days with zero visits
appear to reflect strategic entrance staffing. For example, Wolcott Mill Metropark reports zero
visitation on most weekdays but positive visitation on weekends. The remaining six parks have no
days with zero visits between May 15 and September 19. For our analysis, we drop the six parks
that report zero visitation and focus on the time period from May 15 to September 19.¹
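
The screening rule described above can be sketched in a few lines. This is a minimal illustration rather than our actual cleaning code: it assumes daily visit counts are stored in a dictionary keyed by (park, date), and the function name `parks_without_zero_days` and the toy counts are hypothetical.

```python
from datetime import date, timedelta


def parks_without_zero_days(daily_counts, start, end):
    """Return the parks that log at least one visit on every date in [start, end].

    daily_counts maps (park, date) -> number of logged visits. A missing key
    is treated as a day with zero visits (an assumption of this sketch).
    """
    parks = {park for park, _ in daily_counts}
    n_days = (end - start).days + 1
    dates = [start + timedelta(days=d) for d in range(n_days)]
    keep = []
    for park in sorted(parks):
        zero_days = [d for d in dates if daily_counts.get((park, d), 0) == 0]
        if not zero_days:
            keep.append(park)
    return keep


# Toy data (hypothetical): park "A" is staffed daily, park "B" only on weekends.
start, end = date(2022, 5, 15), date(2022, 5, 21)
counts = {}
for offset in range(7):
    day = start + timedelta(days=offset)
    counts[("A", day)] = 100 + offset
    counts[("B", day)] = 40 if day.weekday() >= 5 else 0

kept = parks_without_zero_days(counts, start, end)
print(kept)
```

In the toy data, park "B" mimics the weekend-only staffing pattern and is therefore excluded from the retained choice set.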
    We complement these visitation data with regular water quality and beach closure records
from the Michigan Department of Environment, Great Lakes, and Energy (EGLE). EGLE conducts
water quality tests at many swimming beaches throughout Michigan. We use tests from the beach at
Lake St. Clair Metropark, which were conducted twice each week in the 2021 and 2022 swimming
seasons. Specifically, we observe E. coli bacteria levels per 100 milliliters. State law sets one-day
and 30-day standards for bacteria levels, and a beach is closed to swimming if bacteria levels
exceed either the one-day or 30-day threshold. Elevated bacteria levels at Lake St. Clair forced
beach closures for eleven days between July 21 and September 15, 2022, as well as 21 days of
contamination advisories. Huron-Clinton Metroparks management has expressed interest in how
these closures impacted visitor welfare and visitation.
    1 The six parks we include in our choice set are Kensington, Lake Erie, Lake St. Clair, Lower
Huron, Stony Creek, and Willow Metroparks.
     It is worth noting that we do not observe if or when potential visitors become aware of closures.
Our conversations with Lake St. Clair Metropark staff revealed that visitors sometimes arrive at
the park unaware the beach is closed. However, water quality test results, closures, and advisories
are published on the EGLE BeachGuard website, and beach closures typically receive local news
coverage. Thus, many potential visitors are aware of closures when making recreation decisions.
We also point out that imperfect awareness would make our estimates overly conservative.
3.2.2     Descriptive Statistics
We observe 115,000 annual passholders in the 2022 data, and these passholders combine to make
roughly 740,000 visits to the metroparks system. Of these passholders, 75% visit only one park
during our sample, while 19% visit two parks and 6% visit three or more. This suggests that a
non-trivial number of visitors are familiar with multiple parks. In response to amenity changes,
like the beach closures we study, these visitors may be more likely to substitute to other parks.
     Table 3.1 shows total visitation by metropark for the annual passholder population. Stony Creek
and Kensington Metroparks are the most highly visited. They are followed by Lake St. Clair, which
also received more than 100,000 visits in our 2022 sample period. Visitation at these most highly
visited parks is far greater than visitation to the least visited parks.
     Figure 3.1 plots weekly visitation at Lake St. Clair Metropark. The shaded area indicates weeks
that experienced a beach closure for at least one day. The first beach closure occurred July 21. It
lasted one day, and it was followed by two more one-day closures on July 28 and August 4.
Twelve days later, the beach experienced an extended, ten-day closure from August 16 through August 25.
It reopened August 26 but remained under a contamination advisory until September 14.
     Visitation declines significantly, though not monotonically, from the first beach closure event
through the end of our sample. Lake St. Clair includes a variety of amenities, including a
swimming pool, paved trails, and ball fields, so we would not expect the beach closure to drive
visitation completely to zero. This figure is purely descriptive, and several factors could cause
the decline. The beach closure is one obvious candidate. The weather and the start of the school
                               Table 3.1: Visitation by Metropark, 2022
                                 Metropark                    Visitation
                                 Stony Creek                     201,882
                                 Kensington                      197,122
                                 Lake St. Clair                  117,376
                                 Lower Huron                      56,678
                                 Willow                           52,218
                                 Lake Erie                        37,722
                                 Hudson Mills                     35,629
                                 Indian Springs                   27,280
                                 Oakwoods                          5,175
                                 Dexter-Huron                      4,411
                                 Delhi                             1,880
                                 Wolcott Mill Farm Center            785
                                Note: The table shows total visitation by
                                annual passholders between May 15 and
                                October 15, 2022.
year likely explain some of the decrease as well. Both may cause park visitation to decline around
the end of August. We attempt to control for these factors by including day-of-year, or date, fixed
effects in our formal estimation.
3.3    Model
    We model the individual-level visitation decision using a repeated RUM travel cost framework.
Assume that every day an individual chooses to visit the park that provides the highest utility. The
choice set consists of the six metroparks in southeast Michigan, indexed by $j \in \{1, \ldots, J\}$, as well as
the outside option ($j = 0$) – the most preferred way of spending a day that does not involve visiting
one of the six metroparks in our sample. The utility individual $i$ receives from visiting metropark
$j$ on day $t$ is given by
\[
U_{ijt} = \delta_{jt} + \beta_{TC} \, TC_{ij} + \varepsilon_{ijt}, \tag{3.1}
\]
and we decompose the alternative specific constants, $\delta_{jt}$, as
\[
\delta_{jt} = \beta_{CL} \, EverClosed_{jt} \times PostClosure_{jt} + \beta_{X} X_{jt} + \phi_{j} + \xi_{t} + \nu_{jt}. \tag{3.2}
\]
Here $\phi_{j}$ and $\xi_{t}$ denote park and date fixed effects, and $\nu_{jt}$ is the second-stage error term.
                     Figure 3.1: Weekly visitation at Lake St. Clair Metropark
Note: The figure plots 2022 weekly visitation by all visitors (annual and day pass) at Lake St. Clair
Metropark. Shaded weeks experienced a beach closure for at least one day.
    The variable $TC$ represents the travel costs of reaching a park. The variable $EverClosed$
equals one if the park’s beach is closed at any time during the 2022 season. It equals one if
$j$ is Lake St. Clair Metropark and zero otherwise. The variable $PostClosure$ equals one for
any date after July 21, the date of the first beach closure at Lake St. Clair, and zero otherwise. The
coefficient of interest, $\beta_{CL}$, captures the average effect of the beach closures on Lake St. Clair
Metropark’s alternative specific constant for all days after the initial closure. The variable $X_{jt}$
includes any time-varying controls.
    We assume the error term, $\varepsilon_{ijt}$, follows a Type I Extreme Value distribution that is independent
and identically distributed across individuals, parks, and choice occasions. This produces the
conditional logit choice probabilities:
\[
P_{ijt} = \frac{\exp(V_{ijt})}{\sum_{k=0}^{J} \exp(V_{ikt})}, \tag{3.3}
\]
where $V_{ijt} = \delta_{jt} + \beta_{TC} \, TC_{ij}$ is the deterministic portion of utility in equation 3.1.
    The model provides structure for evaluating the welfare impacts of beach closures. We define
the compensating variation (CV) of any change in park attributes, including a beach closure, as
\[
CV_{it} = \frac{1}{\beta_{TC}} \left\{ \ln\left( \sum_{k=0}^{J} \exp(V^{1}_{ikt}) \right) - \ln\left( \sum_{k=0}^{J} \exp(V^{0}_{ikt}) \right) \right\}. \tag{3.4}
\]
When valuing the beach closures at Lake St. Clair, $V^{1}$ represents the utility generated by observed
park conditions in the summer of 2022. We define $V^{0}$ as the utility that would have been generated
in the absence of any beach closures. More specifically,
\[
V^{0}_{ijt} = \delta_{jt} + \beta_{TC} \, TC_{ij} - \beta_{CL} \, EverClosed_{jt} \times PostClosure_{jt}. \tag{3.5}
\]
For all parks except Lake St. Clair Metropark, $V^{1}_{ijt} = V^{0}_{ijt}$.
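
To make the choice probabilities and the welfare measure concrete, the following sketch evaluates the conditional logit formula and the log-sum difference for one individual on one day. It is illustrative only: the function names, the two-park choice set, and all numerical values are hypothetical, and it assumes the outside option's utility is normalized to zero.

```python
import math


def utilities(delta, beta_tc, travel_cost):
    """Deterministic utilities: outside option (normalized to 0) first, then parks."""
    return [0.0] + [d + beta_tc * tc for d, tc in zip(delta, travel_cost)]


def choice_probabilities(delta, beta_tc, travel_cost):
    """Conditional logit probabilities over the outside option and all parks."""
    v = utilities(delta, beta_tc, travel_cost)
    v_max = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(x - v_max) for x in v]
    total = sum(expv)
    return [e / total for e in expv]


def log_sum(delta, beta_tc, travel_cost):
    """Natural log of the sum of exp(V) across the full choice set."""
    v = utilities(delta, beta_tc, travel_cost)
    v_max = max(v)
    return v_max + math.log(sum(math.exp(x - v_max) for x in v))


def compensating_variation(delta_obs, delta_cf, beta_tc, travel_cost):
    """(1 / beta_tc) * [log-sum under observed V^1 minus log-sum under counterfactual V^0]."""
    diff = log_sum(delta_obs, beta_tc, travel_cost) - log_sum(delta_cf, beta_tc, travel_cost)
    return diff / beta_tc


# Hypothetical values: park 0 is the treated park, travel cost is in units of $10.
beta_tc, beta_cl = -1.5, -0.25
travel_cost = [1.0, 2.0]
delta_obs = [-1.0, -2.0]                          # mean utilities with the closure effect
delta_cf = [delta_obs[0] - beta_cl, delta_obs[1]]  # closure effect removed for park 0

probs = choice_probabilities(delta_obs, beta_tc, travel_cost)
cv = compensating_variation(delta_obs, delta_cf, beta_tc, travel_cost)
print(probs, cv)
```

Because the travel cost coefficient is negative and observed utility is lower than counterfactual utility at the treated park, the resulting value is positive and can be read as a welfare loss in travel cost units.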
3.4    Estimation
    Our estimation procedure applies techniques from Murdock (2006) and Chapters 1 and 2 of this
dissertation. We estimate the model in two stages, and the panel of park-date fixed effects provides
variation both across and within parks for the second-stage regression.
    First, we estimate the parameters in equation 3.1 using maximum likelihood. Rather than
estimate the park-date fixed effects directly, we apply the Berry (1994) contraction mapping. The
contraction mapping solves for the park-date fixed effects that match the daily park visitation shares
predicted by the model to the daily park visitation shares observed in the data. Estimation leveraging
the contraction mapping produces the same estimates as a direct estimation of the park-date fixed
effects. The benefit of the contraction mapping is that the optimization routine does not need
to search over as many parameters. Because our model contains 756 park-date fixed effects (six
parks times 126 dates), we suspect the contraction mapping substantially reduces the computational
burden. At the end of the first stage, we obtain estimates of $\beta_{TC}$ and the panel of park-date fixed
effects $\delta$.
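
The share-matching step can be illustrated with a small sketch of the Berry (1994) fixed-point update, which repeatedly adds the log difference between observed and predicted shares to the current fixed effects. The code below is a simplified, single-day illustration under stated assumptions: the travel cost coefficient is held fixed rather than estimated, the outside option's utility is normalized to zero, and the function names and toy values are hypothetical.

```python
import math


def predicted_shares(delta, beta_tc, travel_costs):
    """Average conditional logit probabilities across individuals for one day.

    travel_costs[i][j] is person i's travel cost to park j. The outside
    option has utility zero. Returns predicted shares for the parks only.
    """
    n_people, n_parks = len(travel_costs), len(delta)
    shares = [0.0] * n_parks
    for tc_i in travel_costs:
        expv = [math.exp(d + beta_tc * tc) for d, tc in zip(delta, tc_i)]
        denom = 1.0 + sum(expv)  # the 1.0 is exp(0) for the outside option
        for j in range(n_parks):
            shares[j] += expv[j] / denom / n_people
    return shares


def contraction_mapping(observed_shares, beta_tc, travel_costs, tol=1e-12, max_iter=10_000):
    """Iterate delta <- delta + ln(observed share) - ln(predicted share)."""
    delta = [0.0] * len(observed_shares)
    for _ in range(max_iter):
        pred = predicted_shares(delta, beta_tc, travel_costs)
        step = [math.log(s) - math.log(p) for s, p in zip(observed_shares, pred)]
        delta = [d + s for d, s in zip(delta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return delta


# Toy check (hypothetical values): generate shares from known fixed effects,
# then confirm the mapping recovers them.
beta_tc = -1.5
travel_costs = [[0.5, 1.0], [1.5, 0.3], [0.8, 0.8]]
true_delta = [-1.0, -2.5]
observed = predicted_shares(true_delta, beta_tc, travel_costs)
recovered = contraction_mapping(observed, beta_tc, travel_costs)
print(recovered)
```

In a full estimation routine, a step like this would sit inside the likelihood evaluation, so the optimizer searches only over the remaining parameters.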
    In the second stage, we estimate the parameters in equation 3.2. Because we observe data for
only one treated park and five controls, we plan to explore the use of synthetic controls to estimate
the treatment effect of beach closure. For now, though, we include park and date fixed effects in our
second stage regression. These control for constant differences in park amenities and system-wide
amenity shocks. Thus, any threat to identification must come from unobserved factors correlated
with beach closure that vary with time and impact parks differentially.
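
As a rough illustration of the second stage, the sketch below computes the coefficient on a treatment indicator in a regression with park and date fixed effects. It relies on the fact that, in a balanced panel, removing park means and date means (and adding back the grand mean) is equivalent to including both sets of fixed effects. The sketch omits time-varying controls and standard errors, and the function name and toy panel are hypothetical.

```python
def two_way_fe_did(delta, treated):
    """OLS coefficient on a treatment dummy with park and date fixed effects.

    delta[j][t] and treated[j][t] are balanced park-by-date panels
    (lists of rows, one row per park).
    """
    n_parks, n_dates = len(delta), len(delta[0])

    def double_demean(panel):
        grand = sum(map(sum, panel)) / (n_parks * n_dates)
        park_mean = [sum(row) / n_dates for row in panel]
        date_mean = [sum(panel[j][t] for j in range(n_parks)) / n_parks
                     for t in range(n_dates)]
        return [[panel[j][t] - park_mean[j] - date_mean[t] + grand
                 for t in range(n_dates)] for j in range(n_parks)]

    y = double_demean(delta)
    x = double_demean(treated)
    num = sum(x[j][t] * y[j][t] for j in range(n_parks) for t in range(n_dates))
    den = sum(x[j][t] ** 2 for j in range(n_parks) for t in range(n_dates))
    return num / den


# Toy panel (hypothetical): park 0 is treated from date index 3 onward,
# with a true effect of -0.25 and no noise.
park_effects = [1.0, 0.5, -0.3]
date_effects = [0.0, 0.1, -0.2, 0.3, 0.05, -0.1]
true_effect = -0.25
treated = [[1.0 if (j == 0 and t >= 3) else 0.0 for t in range(6)] for j in range(3)]
delta = [[park_effects[j] + date_effects[t] + true_effect * treated[j][t]
          for t in range(6)] for j in range(3)]

beta_hat = two_way_fe_did(delta, treated)
print(beta_hat)
```

With noiseless toy data the sketch recovers the true effect up to floating-point error, which makes it a convenient check on the demeaning logic before applying it to estimated fixed effects.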
    One benefit of this two-stage procedure is that it accounts for demand spillovers by explicitly
modeling inter-park substitution. To yield consistent treatment effect estimates, difference-in-
differences and event study approaches require a stable unit treatment value assumption (SUTVA).
That is, untreated units’ outcomes must be unaffected by the treatment of other units. In a linear
regression with visitation as the outcome and beach closure as the treatment, inter-park substitution
resulting from the closure would violate SUTVA. Our structural model allows us to use park-
date fixed effects as the second-stage outcome. Because the park-date fixed effects are structural
parameters representing the mean utility provided by a park, SUTVA is likely to hold, as the beach
closure will not influence the mean utility provided by untreated parks.
3.5     Results
    Table 3.2 presents coefficient estimates from several model specifications. Results in column
(1) come from a model with a cross-section of park fixed effects rather than a daily panel. In this
model, the beach closure coefficient can be separately identified from the park fixed effects and the
estimation occurs in one stage. Results in columns (2) and (3) come from models that include a
full set of park-by-date fixed effects in the first stage, and estimation follows the discussion in the
previous section.
    Estimates of the travel cost coefficient are nearly identical across all three specifications. This
is not surprising. Consider that park fixed effects control for unobservable park attributes that are
correlated with travel cost, such as remoteness. In doing so, they reduce the possibility of omitted
variables bias when estimating the travel cost coefficient. In a model with a cross-section of park
                                     Table 3.2: Coefficient Estimates
                      Variable                         (1)         (2)         (3)
                      Travel cost ($10)             -1.5445     -1.5461     -1.5461
                                                    (0.0069)    (0.0041)    (0.0041)
                      Beach closure                 -0.2371     -0.2548     -0.2503
                                                    (0.0017)    (0.0306)    (0.0307)
                      First-stage fixed effects
                         Park                           Y
                         Park-date                                  Y           Y
                      Second-stage fixed effects
                         Date                                       Y           Y
                         Park-day of week                                       Y
fixed effects, changes in unobserved attributes correlated with travel costs still pose a threat to
identification, so theoretically, park-by-date fixed effects could improve identification of the travel
cost coefficient. In practice, though, unobservable park attributes correlated with travel cost vary
little within the five-month period of our analysis. This means the decision of whether to include
a cross-section or panel of park fixed effects has little impact when identifying the travel cost
coefficient.
     The beach closures decrease the mean utility provided by Lake St. Clair Metropark. Dividing
the beach closure coefficient by the travel cost coefficient (and scaling by the $10 travel cost unit)
indicates that the individual mean willingness to pay to avoid the park’s post-closure conditions
ranges from $1.54 for the cross-section of park fixed effects model to $1.65 and
$1.62 for the two park-date fixed effects models. These estimates are substantially smaller than
those in the existing literature and should be compared with caution for two reasons. First, the metroparks
provide many amenities aside from the beach, so our willingness to pay estimate averages over
many individuals who have little interest in swimming in Lake St. Clair. Second, we assign all dates
after the initial closure as treated, which includes many days with no beach closures. Boudreaux
et al. (2023) estimate that beachgoers are willing to pay roughly $266 to avoid a beach with a
bacterial warning, but these estimates focus exclusively on beachgoers and elicit preferences
for the exact day when a beach is closed.
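
The willingness-to-pay figures follow directly from the coefficients in Table 3.2. The short sketch below reproduces that arithmetic; the coefficient values are copied from the table, while the variable and function names are ours and purely illustrative.

```python
# Coefficients copied from Table 3.2. Travel cost is measured in units of $10.
TRAVEL_COST_UNIT = 10.0
estimates = {
    "(1)": {"travel_cost": -1.5445, "beach_closure": -0.2371},
    "(2)": {"travel_cost": -1.5461, "beach_closure": -0.2548},
    "(3)": {"travel_cost": -1.5461, "beach_closure": -0.2503},
}


def wtp_dollars(beach_closure, travel_cost, unit=TRAVEL_COST_UNIT):
    """Ratio of the closure coefficient to the travel cost coefficient, in dollars."""
    return beach_closure / travel_cost * unit


wtp = {col: round(wtp_dollars(**coefs), 2) for col, coefs in estimates.items()}
for col, value in wtp.items():
    print(f"Model {col}: ${value:.2f}")
```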
     Figure 3.2 shows how the beach closures impact the park-date fixed effect estimates. The light
gray lines track the park-date fixed effects for two control parks: Kensington Metropark and Lake
Erie Metropark. Anecdotally, these parks provide similar amenities to Lake St. Clair Metropark
(whose park-date fixed effects are shown in black).
                    Figure 3.2: Park-Date Fixed Effect Estimates for Three Parks
Note: The figure shows park-date fixed effect estimates for three parks: Kensington Metropark (top-
gray), Lake St. Clair Metropark (middle-black), and Lake Erie Metropark (bottom-gray). Shaded
areas indicate dates when Lake St. Clair Metropark experienced a beach closure or contamination
advisory. A power outage affected data collection at Lake St. Clair on August 4 and 5, and we drop
data for August 4 and 5 throughout the entire analysis.
    Through the first beach closure period, Lake St. Clair Metropark’s park-date fixed effects are
roughly the average of Kensington and Lake Erie’s. By early August, though, Lake St. Clair
Metropark’s park-date fixed effects have fallen, and they are roughly equivalent to Lake Erie
Metropark’s. They show some sign of rebounding, but they never return to their pre-closure level
relative to these two control parks. This is consistent with the possibility that visitors gain awareness
of beach closures as they occur more frequently. The figure also suggests that closures may reduce
visitation even after beaches reopen.
    We use our estimated model to calculate the welfare loss caused by the 2022 beach closures.
As described in section 3.3, we calculate the welfare loss relative to a baseline scenario with no
beach closures while all other conditions remain at 2022 levels. Table 3.3 presents the welfare loss
estimates for the three models described above.
                                 Table 3.3: Welfare Loss Estimates
                                                (1)        (2)        (3)
                         Total welfare loss   $67,776   $73,425    $71,964
    Using the same three models as Table 3.2, we estimate the total welfare loss of the closures was
between $68,000 and $73,000. This translates to an average daily welfare loss of around $1,200
($71,000 divided by 59 post-closure days). All three models generate similar welfare loss estimates,
which is not very surprising given their similar parameter estimates (Table 3.2). In this setting,
including park-date fixed effects rather than a cross-section of park fixed effects has little effect on
parameter estimates or welfare loss estimates.
    Although the magnitude of these welfare loss estimates is modest, we believe our results are
important for several reasons. First, our results capture the welfare loss for a subset of visitors
(annual passholders), and the total welfare loss would be weakly larger if we also consider visitors
who do not own an annual pass. Annual passholders account for roughly 70% of visits to the
metroparks system, and if annual passholders incur 70% of the welfare loss, including all visitors
would raise the total welfare loss estimate to about $100,000.
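
The back-of-the-envelope calculations above can be reproduced as follows. The totals come from Table 3.3, and the 59 post-closure days and 70% passholder share come from the text. The scaling step assumes, as stated above, that annual passholders incur 70% of the total welfare loss; the variable names are illustrative.

```python
# Total welfare loss by model, in dollars (Table 3.3).
total_loss = {"(1)": 67_776, "(2)": 73_425, "(3)": 71_964}

POST_CLOSURE_DAYS = 59          # post-closure days in the sample
PASSHOLDER_LOSS_SHARE = 0.70    # assumed share of the total loss borne by passholders

# Average daily welfare loss for annual passholders.
daily_loss = {col: loss / POST_CLOSURE_DAYS for col, loss in total_loss.items()}

# Implied total loss across all visitors under the 70% assumption.
scaled_loss = {col: loss / PASSHOLDER_LOSS_SHARE for col, loss in total_loss.items()}

for col in total_loss:
    print(f"Model {col}: daily ${daily_loss[col]:,.0f}, all visitors ${scaled_loss[col]:,.0f}")
```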
    Second, while we analyze beach closures at a single site, beach closures are not uncommon,
especially in the Great Lakes region. Any high-quality estimates of the welfare impacts of beach
closures are useful for benefit-transfer analyses in other contexts. Our high-frequency, administrative
data and identification strategy make our estimates a reliable data point for valuing other
beach closures. Furthermore, most existing studies of beach closures rely on stated preference data
to value hypothetical beach closures, making our results a valuable point of comparison for the
literature.
3.6     Conclusion
     This paper introduces a new dataset for the study of recreation demand, which tracks the exact
minute of park entry for the universe of park system visitors. We explore the potential benefits of
such high-frequency, administrative data for estimating recreation demand models, and we leverage
the methodological advances from Chapters 1 and 2 of this dissertation to estimate the welfare
losses of beach closures. Our estimation exploits daily visitation and water quality variation, which
is unique in the recreation demand literature. Our findings show that the 2022 beach closures
decreased the mean utility provided by Lake St. Clair Metropark. Our preliminary results value
the total welfare loss at around $70,000 for the system’s annual passholders.
     Our study has several limitations. We observe visitation to only six parks, while visitors likely
substitute to many other recreation sites throughout the region. Given the lack of comparably
detailed park visitation data, gauging the impact of our limited choice set for our estimates is
difficult. There may be some way to combine our data with other sources, like surveys or cell phone
data, to fill gaps in our choice set. The data’s lack of demographics is another limitation, and we
plan to incorporate ZIP code demographic information in future models.
     Our current analysis focuses exclusively on annual passholders for simplicity, but this ignores
visitors who purchase daily entry passes. Unlike annual passholders, individuals who purchase
daily entry cannot be tracked across visits. This complicates the modeling and estimation procedure.
We are unsure whether to drop these visitors or invest in creating a more flexible approach moving
forward.
     Despite these limitations, our paper illustrates the potential for innovative datasets to improve
recreation demand research. As similar datasets become available, they will provide more opportunities
for detailed models and rigorous empirical strategies in the recreation demand field.


                                        BIBLIOGRAPHY
(1916). U.S. Code Title 16 - Organic Act.
Albouy, D., Graf, W., Kellogg, R., and Wolff, H. (2016). Climate amenities, climate change, and
   American quality of life. Journal of the Association of Environmental and Resource
   Economists, 3.
Bento, A., Miller, N. S., Mookerjee, M., and Severini, E. R. (2020). A Unifying Approach to
   Measuring Climate Change Impacts and Adaptation. NBER Working Paper 27247.
Berry, S. (1994). Estimating Discrete-Choice Models of Product Differentiation. The RAND
   Journal of Economics, 25:242–262.
Berry, S., Levinsohn, J., and Pakes, A. (2004). Differentiated Products Demand Systems from a
   Combination of Micro and Macro Data: The New Car Market. Journal of Political Economy,
   112.
Boudreaux, G., Lupi, F., Sohngen, B., and Xu, A. (2023). Measuring beachgoer preferences for
   avoiding harmful algal blooms and bacterial warnings. Ecological Economics, 204.
Bureau of Reclamation (2013). Downscaled CMIP3 and CMIP5 climate and hydrology
   projections. U.S. Department of the Interior.
Carleton, T., Jina, A., Delgado, M., Greenstone, M., Houser, T., Hsiang, S., Hultgren, A., Kopp,
   R. E., McCusker, K. E., Nath, I., Rising, J., Rode, A., Seo, H. K., Viaene, A., Yuan, J., and
   Zhang, A. T. (2022). Valuing the global mortality consequences of climate change accounting
   for adaptation costs and benefits. The Quarterly Journal of Economics, 137:2037–2105.
Chan, N. and Wichman, C. J. (2020). Climate Change and Recreation: Evidence from North
   American Cycling. Environmental and Resource Economics, 76:119–151.
Chan, N. W. and Wichman, C. J. (2022). Valuing nonmarket impacts of climate change on
   recreation: From reduced form to welfare. Environmental and Resource Economics, 81:179–213.
Chintagunta, P., Dubé, J.-P., and Goh, K. Y. (2005). Beyond the endogeneity bias: The effect of
   unmeasured brand characteristics on household-level brand choice models. Management
   Science, 51:832–849.
Cullinane Thomas, C. and Koontz, L. (2020). 2019 National Park Visitor Spending Effects:
   Economic Contributions to Local Communities, States, and the Nation. National Park Service.
Deschenes, O., Greenstone, M., and Guryan, J. (2009). Climate change and birth weight. American
   Economic Review, 99:211–217.
Dundas, S. J. and von Haefen, R. H. (2020). The Effects of Weather on Recreational Fishing
   Demand and Adaptation: Implications for a Changing Climate. Journal of the Association of
   Environmental and Resource Economists, 7(2):209–242.
English, E., von Haefen, R. H., Herriges, J., Leggett, C., Lupi, F., McConnell, K., Welsh, M.,
   Domanski, A., and Meade, N. (2018). Estimating the value of lost recreation days from the
   Deepwater Horizon oil spill. Journal of Environmental Economics and Management, 91:26–45.
Fisichelli, N. A., Schuurman, G. W., Monahan, W. B., and Ziesler, P. S. (2015). Protected area
   tourism in a changing climate: Will visitation at US national parks warm up or overheat? PLOS
   One, 10(6).
Gellman, J., Walls, M., and Wibbenmeyer, M. (2022). Non-market damages of wildfire smoke:
   evidence from administrative recreation data. Working Paper.
Hausman, J. A., Leonard, G. K., and McFadden, D. (1995). A utility-consistent, combined discrete
   choice and count data model: assessing recreational use losses due to natural resource damage.
   The Journal of Public Economics, 56:1–30.
Henrickson, K. E. and Johnson, E. H. (2013). The Demand for Spatially Complementary National
   Parks. Land Economics, 89:330–345.
Hsiang, S., Kopp, R., Jina, A., Rising, J., Delgado, M., Mohan, S., Rasmussen, D. J., Muir-Wood,
   R., Wilson, P., Oppenheimer, M., Larsen, K., and Houser, T. (2017). Estimating economic
   damage from climate change in the United States. Science, 356:1362–1369.
Keiser, D., Lade, G., and Rudik, I. (2018). Air pollution and visitation at U.S. national parks.
   Science Advances, 4(7).
Keiser, D. A. (2019). The missing benefits of clean water and the role of mismeasured pollution.
   Journal of the Association of Environmental and Resource Economists, 6(4):669–707.
Knittel, C. R., Li, J., and Wan, X. (2023). I love that dirty water? value of water quality in
   recreation sites. Working Paper.
Lawrimore, J. H., Applequist, R., Korzeniewski, B., and Menne, M. J. (2016). Global summary of
   the month (GSOM), version 1.0.3.
Lupi, F., Phaneuf, D., and von Haefen, R. (2020). Best Practices for Implementing Recreation
   Demand Models. Review of Environmental Economics and Policy, 14:302–323.
McFadden, D. (1974). The measurement of urban travel demand. The Journal of Public
   Economics, 3:303–328.
McFadden, D. (1979). Quantitative methods for analysing travel behaviour of individuals: Some
   recent developments. In Hensher, D. A. and Stopher, P. R., editors, Behavioural Travel
   Modelling, chapter 13, pages 279–318. London.
Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46:69–85.
Murdock, J. (2006). Handling unobserved site characteristics in random utility models of
   recreation demand. Journal of Environmental Economics and Management, 51:1–25.
Neher, C., Duffield, J., and Patterson, D. (2013). Valuation of National Park System Visitation: The
   Efficient Use of Count Data Models, Meta-Analysis, and Secondary Visitor Survey Data.
   Environmental Management, 52:683–698.
Newbold, S. C., Lindley, S., Albeke, S., Viers, J., Parsons, G., and Johnston, R. (2022). Valuing
   satellite data for harmful algal bloom early warning systems. RFF Working Papers.
Office of Aviation Analysis (2015). Consumer airfare report.
Parsons, G., Leggett, C., Herriges, J., Boyle, K., Bockstael, N., and Chen, Z. (2021). A Site-
   Portfolio Model for Multiple-Destination Recreation Trips: Valuing Trips to National Parks in
   the Southwestern United States. Journal of the Association of Environmental and Resource
   Economists, 8:1–25.
Parthum, B. and Christensen, P. (2022). A market for snow: Modeling winter recreation patterns
   under current and future climate. Journal of Environmental Economics and Management, 113.
Ruggles, S., Flood, S., Foster, S., Goeken, R., Pacas, J., Schouweiler, M., and Sobek, M. (2021).
   IPUMS USA: Version 11.0 [dataset].
Szabó, A. and Ujhelyi, G. (2021). Conservation and Development: Economic Impacts of the US
   National Park System. Working Paper.
Walls, M. (2022). Economics of the US national park system: Values, funding, and resource
   management challenges. Annual Review of Resource Economics, 14:579–596.
Wooldridge, J. M. (2019). Correlated random effects models with unbalanced panels. Journal of
   Econometrics, 211:137–150.