COMPARISON OF FIVE ACCELEROMETER METRICS FOR ASSESSING THE
TEMPORAL PATTERNS OF CHILDREN’S FREE-PLAY PHYSICAL ACTIVITY
                                   By
                       Katherine Louise McKee
                                A THESIS
                               Submitted to
                       Michigan State University
               in partial fulfillment of the requirements
                            for the degree of
            Department of Kinesiology – Master of Science
                                  2022


                                             ABSTRACT
Accelerometers are frequently used to measure physical activity (PA) in children, which is
important for overall health and development. Lack of uniformity in data processing methods,
such as the metric used to summarize accelerometer data, limits comparability between studies.
The objective was to determine the convergent validity of five accelerometer metrics for
characterizing the intensity and temporal patterns of first and second graders’ (n=88) recess PA.
At a 5-s epoch level, Pearson’s correlations between various metrics ranged from 0.69 to 0.98.
When each epoch was classified into one of four activity levels based on quartiles, agreement
between metrics as indicated by weighted kappa ranged from 0.81 to 0.96. When collapsed to
time spent in each activity level, metrics were most often statistically equivalent for estimating
time spent in quartile 3 or 4. Children were ranked from least to most active, and agreement
between metrics was strong with Spearman’s correlation coefficients of over r=0.86. Temporal
patterns were characterized using five fragmentation indices calculated using each of the five
metrics. Pearson’s correlations between metrics ranged from r=0.53 to 0.99, with the strongest
associations for number of high activity bouts. Most fragmentation indices were not statistically
equivalent between metrics. While metrics captured similar trends in activity intensity and
temporal patterns, caution is warranted when making comparisons of point estimates derived
from different metrics. However, all metrics were able to similarly capture higher intensity
activity (i.e., quartile 3 or 4), the most common outcome of interest in intervention studies.


                                              TABLE OF CONTENTS
INTRODUCTION
LITERATURE REVIEW
     Nature of Children’s Physical Activity
     Measurement of Children’s Physical Activity
     Accelerometry
     Count-Based Metrics
     Vertical Axis Counts
     Triaxial Vector Magnitude Counts
     Acceleration-Based Metrics
     Mean amplitude deviation
     Euclidean norm minus one
     Activity index
     Comparability of Metrics
     Temporal Patterns of Physical Activity
     Measurement of Temporal Patterns
     Conclusions
METHODS
     Setting and Participants
     Data Collection
     Data Processing
     Physical Activity Intensity
     Temporal Patterns
     Statistical Analyses
RESULTS
     Participant characteristics
     Aim 1
     Aim 2
DISCUSSION
     Strengths & Limitations
     Future Directions
REFERENCES


                                        INTRODUCTION
        Regular physical activity (PA) participation during childhood is associated with improved
cognition, increased bone density and muscular strength, and decreased risk of developing
noncommunicable disease in later life (Loprinzi et al., 2012). Despite these and other benefits,
many children fail to meet the recommendations set forth in the National Physical Activity
Guidelines for Americans, which state that children aged 6- through 17-years should engage in at
least 60 min∙day⁻¹ of moderate-to-vigorous physical activity (MVPA) (Health and Human
Services, 2018). Unfortunately, only about 25% of school-aged children meet PA
recommendations according to both parent-report and device-based measures of PA
(Katzmarzyk et al., 2016; Troiano et al., 2008). This is a major public health concern, thus
prompting investigation into the underlying causes of insufficient PA levels among children.
        Interventions aimed at promoting and increasing children’s PA often occur in the school
setting due to the bulk of time children spend in school. There are few opportunities for children
to engage in PA during the school day; however, some opportunities for PA include physical
education (Meyer et al., 2013), classroom activity breaks (Carlson et al., 2015), or after-school
programs (Arundell et al., 2015). The outdoor recess period, however, is often the only
opportunity for children to engage in a developmentally important type of play called
unstructured free-play PA, in which children are able to determine the structure and intensity of
play without a high degree of adult influence (Ginsberg, 2007). This type of play, along with the
benefits from regular PA engagement, contributes to children’s social, emotional, and physical
development (Ginsberg, 2007). Therefore, efforts to optimize the unstructured free-play PA that
the outdoor recess period provides are salient in improving the overall well-being of children.
         Research on children’s PA during recess has primarily focused on overall PA levels
(Ridgers et al., 2005; Mota et al., 2005; Holmes et al., 2012) or differences amongst groups (e.g.,
boys vs girls; Trost et al., 2002; Mota et al., 2005; Pate et al., 2013). Interventions to promote PA
during recess have included strategies like modifications to the physical environment (e.g.,
painted markings; Ridgers et al., 2007; Blaes et al., 2013) and addition of portable equipment
(e.g., playground balls, jump ropes; Ridgers, Fairclough, & Stratton, 2010). However, little
research has considered the chronological succession of children’s activity over time, or the
temporal patterns of PA. It is known that children do not maintain the same level of PA over the
recess period (McKenzie et al., 1997); therefore, better understanding of the temporality of PA
can highlight critical periods when children are most and/or least active during recess. Further
research into how these temporal patterns of recess PA differ by participant demographic or
environmental factors is warranted as this information can be used to create activity-promoting
interventions to optimize PA participation during the recess period.
         Accelerometry is an ideal method to capture children’s PA due to its ability to capture the
timing, frequency, intensity, and duration of movement, as well as the temporality of children’s
PA. However, there are a multitude of decisions required to process the accelerometer data, and
lack of consistency amongst processing methods limits data harmonization and comparability
across studies. Accelerometers measure acceleration in gravitational units (g; 1 g = 9.8 m∙s⁻²),
which has historically been filtered, rectified, and summed over a specific period, or epoch (e.g.,
5-s) as activity counts (Farrahi et al., 2019; Strath et al., 2012). Each of the epochs can then be
assigned an activity intensity of light, moderate, or vigorous depending on the count value using
population-specific thresholds called cut points, ultimately allowing researchers to determine
time spent in each PA intensity.
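The cut-point step described above can be illustrated with a minimal Python sketch. The threshold values, the function names, and the example epoch values are hypothetical and chosen only for illustration; they are not published cut points.

```python
from bisect import bisect_right
from collections import Counter


def classify_epochs(epoch_values, cut_points):
    """Return an intensity level index (0 = lowest) for each epoch.

    cut_points must be sorted in ascending order; a value equal to a
    cut point is assigned to the higher level.
    """
    return [bisect_right(cut_points, value) for value in epoch_values]


def seconds_per_level(levels, n_levels, epoch_seconds=5):
    """Total seconds spent in each intensity level."""
    counts = Counter(levels)
    return [counts.get(level, 0) * epoch_seconds for level in range(n_levels)]


# Hypothetical thresholds (not published cut points):
# level 0 = light, level 1 = moderate, level 2 = vigorous.
cut_points = [500, 1000]
epochs = [20, 150, 700, 1200, 90, 510]  # made-up 5-s epoch values

levels = classify_epochs(epochs, cut_points)
seconds = seconds_per_level(levels, n_levels=3)
```

Because the result depends entirely on the thresholds passed in, two studies applying different cut points to the same epochs would report different time in each intensity.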
        Over the years, advances in accelerometry have resulted in the shift from vertical axis
(VA) counts to triaxial counts measured across three planes (i.e., anteroposterior, mediolateral,
vertical), aggregated as vector magnitude (VM) counts. However, the proprietary nature of the
manufacturer-specific algorithms used to calculate the count-based metrics has greatly reduced
generalizability of results across studies, particularly when studies use different device
manufacturers (Duncan et al., 2016; Rowlands et al., 2014). A proposed solution to this issue
was the development of open-source metrics based upon the raw acceleration – not activity
counts – to determine activity intensities (Farrahi et al., 2019; Clevenger et al., 2020). Since
researchers gained access to raw acceleration data around a decade ago (John & Freedson, 2012),
several acceleration-based metrics have emerged. Specifically, three commonly used
acceleration-based metrics are mean amplitude deviation (MAD) (Vähä-Ypyä et al., 2015;
Aittasalo et al., 2015), the Euclidean norm minus one (ENMO) (van Hees et al., 2013;
Hildebrand et al., 2017), and the activity index (AI) (Bai et al., 2016).
        Previous research in adults has investigated the comparability of these metrics for
assessing overall PA participation (Sasaki et al., 2013; Bakrania et al., 2016; Karas et al., 2022)
but there is a paucity of information regarding the comparability of these acceleration-based
metrics compared to count-based metrics in populations of children, who participate in more
sporadic and variable PA compared to adults (Bailey et al., 1995; Welk, Corbin, & Dale, 2000).
Furthermore, little research has investigated whether these metrics similarly capture temporal
patterns of activity. Therefore, the overall purpose of this study is to determine the convergent
validity of count- and acceleration-based metrics to measure the overall levels and temporality of
children’s PA during recess. Two specific aims and their respective hypotheses will be
examined.
        Aim 1: To determine the convergent validity of VM, VA, MAD, ENMO and AI when
characterizing children’s PA intensity during recess, at both the epoch and participant level.
        Hypothesis 1A: At the epoch-level, acceleration-based metrics will show moderate to
high correlations with each other (r ≥ 0.60) and count-based metrics will show moderate to high
correlation with each other (r ≥ 0.60) but not with acceleration-based metrics (r < 0.60).
        Hypothesis 1B: Time spent in each quartile of activity intensity will be equivalent
between all metrics and agreement at the epoch-level will be substantial (κ = 0.61 to 0.80).
        Hypothesis 1C: At the participant-level, rank-order of children’s overall PA will be
similar amongst all metrics (Spearman’s correlation ≥ 0.80).
        Aim 2: To determine comparability of VM, VA, MAD, ENMO and AI when
characterizing temporal patterns of children’s PA during recess.
        Hypothesis 2A: Temporal patterns, as assessed using fragmentation indices (transition
probabilities, mean fragment duration, and number of fragments), will be moderately to highly
correlated amongst all metrics (r ≥ 0.60).
        Hypothesis 2B: Fragmentation indices will be statistically equivalent among
accelerometer metrics.
        Investigation into these aims will help characterize the comparability of accelerometer
metrics for measuring both overall activity levels and the temporal pattern of PA, thus adding to
the growing body of literature utilizing acceleration-based metrics instead of traditional count-
based approaches to capturing PA. This information can inform future study design and enhance
PA researchers’ understanding of comparability amongst studies.
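As a rough illustration of the fragmentation indices named in Hypothesis 2A, the following Python sketch computes the number of fragments (bouts), mean fragment duration, and a transition probability from a sequence of epoch-level activity states. The definitions are assumptions made for illustration: a fragment is taken to be a run of consecutive epochs in the same state, and the transition probability is taken as the reciprocal of the mean fragment length in epochs. The function names and example data are hypothetical.

```python
from itertools import groupby


def bout_lengths(states, target):
    """Lengths (in epochs) of each run of consecutive `target` states."""
    return [sum(1 for _ in run) for value, run in groupby(states) if value == target]


def fragmentation_indices(states, target, epoch_seconds=5):
    """Number of bouts, mean bout duration (s), and transition probability.

    Assumption: transition probability = 1 / mean bout length in epochs.
    """
    lengths = bout_lengths(states, target)
    if not lengths:
        return {"n_bouts": 0, "mean_bout_seconds": 0.0, "transition_probability": 0.0}
    mean_epochs = sum(lengths) / len(lengths)
    return {
        "n_bouts": len(lengths),
        "mean_bout_seconds": mean_epochs * epoch_seconds,
        "transition_probability": 1.0 / mean_epochs,
    }


# Made-up sequence of 5-s epochs labelled as low or high activity.
states = ["low", "high", "high", "low", "high", "high", "high", "high", "low", "low"]
indices = fragmentation_indices(states, "high")
```

Because the labels themselves depend on which accelerometer metric was used to classify each epoch, the same recess period can yield different indices under different metrics, which is the comparison Aim 2 addresses.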


                                      LITERATURE REVIEW
        It has been thoroughly substantiated that regular participation in PA during childhood is
important in reducing risk of noncommunicable disease (Lee et al., 2012). The school setting is
an attractive environment to analyze children’s PA because children spend about 30-hours per
week in school (Nettlefold et al., 2011). One of the most commonly used tools for measurement
of PA is the accelerometer (Trost & O’Neil, 2014). However, there is little uniformity when
it comes to processing and analyzing the acceleration data produced from these devices, leading
to misinterpretation of study results and inability to generalize findings across studies. Following
an overview of the nature, benefits, and context of children’s physical activity with a focus on
the school recess setting, this literature review will discuss analysis of accelerometer data for
capturing PA intensity and characterizing temporal patterns of children’s PA.
Nature of Children’s Physical Activity
        In their seminal paper from 1995, Bailey and colleagues highlighted the highly transitory
nature of children’s PA which occurs in bouts lasting no more than 15-s at a time. Currently, the
precise reasoning behind this pattern of PA innate to children is unknown. One prominent theory
posits that the spontaneous PA movements demonstrated by children provide the central
nervous system with adequate stimulation and information about one’s environment, and help
maintain homeostasis via energy balance and metabolic regulation (Hills, King, & Armstrong,
2007). Because children’s PA is inherently different from adult PA (Bailey et al., 1995), special
attention must be paid when selecting an appropriate method of PA measurement in children to
ensure it is being accurately captured.
        While PA is largely responsible for children’s physical development, regular
participation in PA during childhood is also positively associated with children’s cognitive and
socio-emotional development, as problem-solving and social skills are developed through play
interaction with other children (Bjorklund & Brown, 1998; Pellegrini & Smith, 1998). Therefore,
encouragement of PA is important to facilitate typical growth and development of children
overall (Rowland, 1998). While the developmental benefits of PA participation are abundantly
clear, more than half of children in the United States struggle to meet the National Physical
Activity Guidelines for Americans (Katzmarzyk et al., 2016; Troiano et al., 2008; HHS, 2018).
This is a major public health concern and has led to the development and implementation of PA
interventions aimed at increasing children’s PA levels.
        The context in which PA occurs is an important factor to consider when promoting
children’s PA. Many PA-promoting interventions are focused on the school setting because most
children spend the bulk of their time in school. While there are other opportunities for PA during
the school day like physical education or active classroom breaks, recess is of particular interest
because it is often the only opportunity for unstructured PA during the otherwise structured
school day (Mota et al., 2005). Recess contributes about 20% of children’s daily PA, although
children spend approximately half of recess participating in MVPA (Rooney, 2018). Common
interventions to promote PA during recess include addition of painted playground markings
(Stratton & Mullan, 2005; Ridgers et al., 2007) or portable equipment (Ridgers, Fairclough, &
Stratton, 2010; Yu, Kulinna, & Mulhearn, 2021). However, a more thorough understanding of
the temporal patterns of recess PA could provide useful information to inform novel activity-
promoting interventions, since children’s PA does not occur at a continuous rate over the course
of recess (McKenzie et al., 1997; Holmes et al., 2012). Understanding the temporality of children’s
PA can highlight periods and/or patterns of high and low PA and ultimately provide PA
researchers with ways to maintain PA levels over the recess period and increase children’s
overall PA.
Measurement of Children’s Physical Activity
        There are a multitude of ways to measure children’s PA. Some of the most common
methods are direct observation, self- or proxy-reported measures like questionnaires or daily
activity logs, and device-based measures like pedometry or accelerometry. Direct observation is
considered a gold standard method of PA measurement, particularly when observers are
thoroughly trained. However, direct observation inherently has a high degree of researcher
burden (i.e., time-consuming, high-energy cost) making it an unattractive option in large-scale
studies (Gardner, 2000; Sylvia et al., 2013; Rachele et al., 2013). Self-reported measures like
questionnaires in young populations are often obscured by children’s attention spans and
tendency to inaccurately recount their daily activities, particularly if the recall period is longer
than one week, while proxy-reports are often not feasible since children are not always with their
parent or guardian (Sylvia et al., 2013; Corder et al., 2013; Biddle et al., 2011). While
pedometers are relatively inexpensive, they are unable to capture horizontal or upper limb
movements, as well as the frequency, intensity, timing, and duration of PA, all of which are key
contributors to overall PA and can ultimately result in inaccurate quantification of PA (Rowlands
& Eston, 2007; Sylvia et al., 2013; Butte et al., 2012). While all methods of PA measurement
have their respective strengths, weaknesses, and uses, accelerometry is a device-based method of
measurement with specific traits that make it ideal when capturing children’s PA patterns.
Accelerometry
        Accelerometry is frequently used to capture PA of children due to its high sampling rates
and ability to collect and store large amounts of data. One of the most commonly used
manufacturers of research-grade accelerometers in PA research is ActiGraph (ActiGraph LLC,
Pensacola, FL). ActiGraph has produced several generations of activity monitors with each
iteration receiving updates in physical appearance, storage capacity, hardware and firmware,
Bluetooth capability, and number of axes (e.g., uniaxial versus triaxial) (Grydeland et al., 2014).
ActiGraph’s hallmark monitor, the wGT3X-BT, has been validated in age groups spanning from
preschool to older adulthood, in populations with physical disabilities, and across multiple wear-
locations (e.g., hip, wrist, ankle) (Albaum et al., 2019; Hänggi et al., 2014; Johansson et al.,
2015; Johansson et al., 2016; Karaca et al., 2021; McGarty, Penpraze, & Melville, 2016;
Nyström et al., 2017; Pate et al., 2013; Peterson et al., 2015; Sylvia et al., 2013; Trost & O’Neil,
2014). Accelerometry can detect information about the frequency, intensity, timing, and duration
of PA across a variety of participants and settings. Lastly, this high-resolution data can be
collected for weeks at a time, stored, and analyzed at a later date, making accelerometers a
particularly advantageous tool in large scale epidemiological studies.
        There are several decisions to be made when collecting and analyzing accelerometer data,
and each will impact the data collected and, ultimately, the harmonization of data and
comparability of results across studies. One decision that must be made is the ‘accelerometer
metric,’ or the way in which raw acceleration data are summarized, which can subsequently be
used to identify activity intensity. For example, acceleration data can be summarized as
variability in acceleration or total magnitude of the acceleration. While the accelerometer metric
will be the focus of this literature review, other decisions include (but are not limited to) device
brand, wear location, and epoch length.
        Accelerometer metrics are often summarized over a time interval, usually ranging from 1
to 60-s, called an epoch. From there, “intensity bins” called cut points can be used to classify
each epoch as a certain activity intensity. Many PA measurement researchers have created their
own cut points to classify activity intensities, and selection of the “right” or “best” cut point
greatly depends on researcher judgement in relation to other published work in the same
population (Kim et al., 2012). For example, Freedson et al. (2005), Evenson et al. (2008), and
Puyau et al. (2002) have all developed their own cut points for moderate-to-vigorous PA
(MVPA) intensity for children aged 6- through 10-years. Not only do these cut points often yield
considerably different estimates of activity intensities, they are often validated under dissimilar
conditions, such as different age groups (e.g., 5-8 y vs. 6-18 y) (Trost et al., 2012; Evenson et al.,
2008; Pfeiffer et al., 2006; Clevenger et al., 2020), different wear locations (e.g., hip vs waist)
(Crotti et al., 2020; Hänggi et al., 2013; Montoye et al., 2018), and different epoch lengths
(Hislop et al., 2012), making it nearly impossible to equate findings across studies.
Count-Based Metrics
         Activity counts or count-based metrics have been the standard since accelerometers were
first used to capture movement. Counts are derived from the raw acceleration and summed over
the three axes. From there, the counts are aggregated to a user-specified time interval ranging
from 1 to 60-s, called an epoch. Next, upper and lower thresholds called cut points are applied to
the count data to separate the data into light, moderate, and vigorous intensity activity. Cut
points are created by PA researchers and are often validated against criterion measures such as
oxygen consumption or heart rate to determine intensity level. However, since the count data
originate from the software-specific proprietary algorithms that for many years were not open-
sourced, transparency and comparability of estimates of time spent in activity intensities from
various device brands, and various applied cut points, is not possible.
Vertical Axis Counts
        Vertical axis (VA) counts are the uniaxial count values produced exclusively from the
vertical axis (i.e., y-axis). In the early years of development, VA were widely used and have their
own validated cut points (Freedson, Melanson, & Sirard, 1998). However, typical human
movement does not occur exclusively in one plane of motion, and utilization of VA disregards
important acceleration produced from the other planes of motion (Howe, Staudenmayer, &
Freedson, 2009), urging the use of triaxial vector magnitude counts. Despite their potential
shortcomings, VA are still the most used metric in children (Migueles et al., 2017).
Triaxial Vector Magnitude Counts
        Triaxial vector magnitude (VM) counts are defined as the square root of the sum of the
squared count values across each axis. The equation to calculate VM is:
$$\mathrm{VM} = \sqrt{a_1^2 + a_2^2 + a_3^2}$$
where $a_1$ equals the axis 1 (vertical) counts, $a_2$ equals axis 2 (mediolateral) counts, and $a_3$ equals
axis 3 (anteroposterior) counts. Compared to uniaxial VA, triaxial VM is less dependent on
monitor orientation as the metric encompasses all three axes, making it advantageous to use in
children.
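The VM calculation defined above (the square root of the sum of the squared axis counts) can be sketched in a few lines of Python. The function name and the count values are made up for illustration.

```python
import math


def vector_magnitude(a1, a2, a3):
    """Vector magnitude of the three axis count values for one epoch."""
    return math.sqrt(a1 ** 2 + a2 ** 2 + a3 ** 2)


# Made-up counts for one epoch: vertical, mediolateral, anteroposterior.
vm = vector_magnitude(3, 4, 12)

# Assigning the same counts to different axes gives the same VM, which
# illustrates why VM is less dependent on monitor orientation than VA.
vm_reordered = vector_magnitude(12, 3, 4)
```

In contrast, the VA value for these two cases would be 3 and 12, respectively, even though the overall movement is the same.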
Acceleration-Based Metrics
        In 2009, a council of PA measurement scholars at the Objective Measurement of Physical
Activity Expert Consensus Meeting expressed the need for a more transparent and open-sourced
method to process accelerometer data (Troiano, 2005; Welk et al., 2012; Rowlands, 2018). From
this meeting emerged the shift to raw acceleration-based metrics, which are calculated using the
data obtained by the raw signal before any other processing has occurred (e.g., reintegration into
activity counts) (Rowlands, 2018). The use of raw acceleration data is attractive for a variety of
reasons. Until recently, the algorithms used to calculate activity counts were proprietary to each
device manufacturer, which greatly limited transparency of processing methods and
generalizability of data sets. Using the raw acceleration theoretically allows findings to be
equated across studies regardless of device manufacturer (Aittasalo et al., 2015) and, ultimately,
create more meaningful ways to interpret PA data obtained via surveillance and/or intervention
research (Rowlands, 2018). Assuming the accelerometers have been appropriately calibrated and have
similar dynamic range and sampling rate, among other factors, the acceleration output should,
theoretically, be comparable across brands (e.g., ActiGraph, GENEActiv, Hookie) due to the fact
that data are collected in the same unit (i.e., g), and that there is no opaque filtration or
processing (Rowlands, 2018; Aittasalo et al., 2015; Rowlands et al., 2018). While
acceleration-based metrics may still rely on cut points to determine activity intensities, the cut points
themselves are no longer derived from the arbitrary activity counts.
        A variety of acceleration-based metrics have gained popularity throughout the years as
the call to switch to metrics utilizing the raw acceleration has grown. While machine learning
approaches (e.g., random forests, artificial neural networks) for raw acceleration have also been
developed, they will not be included in this literature review. A non-exhaustive list of metrics
that have gained traction among PA researchers in recent years are as follows: Euclidean norm of
the high-pass filtered signals (HFEN), HFEN plus Euclidean norm of the low-pass filtered
signals minus 1 g (HFEN+), Euclidean norm minus one with negative values set to zero (ENMO)
(van Hees et al., 2013), mean amplitude deviation (MAD) (Aittasalo et al., 2015; Vähä-Ypyä et
al., 2015), and the activity index (AI) (Bai et al., 2016). In its conception paper, HFEN+
outperformed metrics HFEN and ENMO when detecting total PA energy expenditure (van Hees
et al., 2013). However, HFEN+ proved too computationally complex to understand and,
ultimately, to reproduce compared to ENMO, which halted its development (van Hees et al.,
2013). Therefore, MAD, ENMO, and the AI have been selected for this proposed master’s thesis
and will be the only acceleration-based metrics discussed in the remainder of the literature
review.
Mean amplitude deviation
         Mean amplitude deviation (MAD), measured in milli-gravitational units (mg), is a raw
acceleration metric that captures the variation around the mean of the raw acceleration signals
(Aittasalo et al., 2015; Vähä-Ypyä et al., 2015). The equation to calculate MAD is:
$$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| r_i - \bar{r} \right|$$
where $n$ is the number of samples in each epoch, $r_i$ is the resultant acceleration at the ith
timepoint, and $\bar{r}$ is the mean resultant acceleration value of the entire epoch. The resultant
acceleration is defined as the vector magnitude of acceleration in the three axes (Vähä-Ypyä et
al., 2015).
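A minimal Python sketch of the MAD calculation described above follows: the resultant acceleration is computed for each sample, and the mean absolute deviation of those resultants from their epoch mean is returned. The function names and the sample values are hypothetical, and the samples are assumed to be raw triaxial accelerations in g.

```python
import math


def resultant(ax, ay, az):
    """Vector magnitude of one triaxial acceleration sample (in g)."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def mean_amplitude_deviation(samples):
    """MAD (in g) for one epoch of (x, y, z) samples."""
    r = [resultant(*sample) for sample in samples]
    mean_r = sum(r) / len(r)
    return sum(abs(r_i - mean_r) for r_i in r) / len(r)


# Made-up epoch of four samples with movement along one axis only.
epoch = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.2), (0.0, 0.0, 0.8), (0.0, 0.0, 1.0)]
mad_mg = mean_amplitude_deviation(epoch) * 1000  # convert g to mg

# A motionless device yields identical resultants, so MAD is zero.
still_mg = mean_amplitude_deviation([(0.0, 0.0, 1.0)] * 4) * 1000
```

Because the epoch mean is subtracted, the constant contribution of gravity is removed without an explicit "minus 1 g" step.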
         MAD has been developed and validated in a variety of populations, but the most
applicable to the current thesis was in a population of 20 healthy adolescents aged 13- to 15-
years (female=10; age=14.2) (Aittasalo et al., 2015). Participants wore a heart rate monitor at the
chest and two accelerometers, an ActiGraph GT3X at the left hip and Hookie AM13 at the right
hip, on an elastic belt while performing a variety of free-living activities (e.g., sitting while
working on computer, lying supine) as well as walking and jogging around a 100-m indoor track.
The MAD values obtained via the ActiGraph and Hookie monitors exhibited a strong linear
correlation with heart rate (0.97 and 0.96, respectively). For the ActiGraph, three intensity-
specific cut points were determined to separate sedentary, light, moderate, and vigorous activity
(26.9, 332.0, 558.3 mg).
        Despite using two different manufacturers (i.e., ActiGraph and Hookie), the MAD cut
points obtained from both devices were mostly consistent and differed, at most, by 45.5 mg
(558.3 mg vs 603.8 mg for ActiGraph and Hookie, respectively) at the vigorous intensity.
Aittasalo et al. (2015) state that this discrepancy could be due in part to the Hookie’s higher sampling
frequency of 100 Hz, compared to the ActiGraph’s 30 Hz sampling rate at the time of the study. Since
this study, ActiGraph monitors have gained the ability to sample at 100 Hz, which may result in even
closer agreement in MAD values between brands.
Euclidean norm minus one
        Euclidean norm minus one (ENMO), reported in milli-gravitational units (mg), is a
second raw acceleration metric that is calculated as the square root of the sum of the squared
acceleration values in each axis, minus 1 g as an adjustment for gravity, with negative
values set to zero (van Hees et al., 2013; Hildebrand et al., 2014). Some researchers
calculate ENMO after conducting an auto-calibration procedure (van Hees et al., 2014).
$$\mathrm{ENMO} = \max\left(\sqrt{a_1^2 + a_2^2 + a_3^2} - 1,\ 0\right)$$
where negative values are set to zero after subtraction, $a_1$ equals axis 1 acceleration, $a_2$ equals axis 2 acceleration, and $a_3$ equals axis 3
acceleration. ENMO was developed and validated in children and adults aged 7- through 11-
years and 18- through 65-years, respectively (Hildebrand et al., 2014). Participants completed
eight activities of varying intensities (e.g., lying, sitting, running) while wearing an ActiGraph
and GENEActiv accelerometer at the right hip and non-dominant wrist. Oxygen consumption
(VO2) was measured via portable metabolic unit and metabolic equivalents (METs) were used as
the criterion measure for development of ENMO thresholds. MET values were classified as light
(<3 METs), moderate (≥3 but <6 METs), or vigorous (≥6 METs) intensity activity for both
children and adults (Ainsworth et al., 2011).
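The ENMO calculation described in this section (the Euclidean norm of the three axes, minus 1 g, with negative values set to zero) can be illustrated with a minimal sketch. It assumes raw accelerations are supplied in g and that sample-level values are averaged within an epoch; the function names and the averaging step are illustrative assumptions rather than the implementation used by any particular software package.

```python
import math


def enmo_mg(samples):
    """Return ENMO in mg for each (a1, a2, a3) sample, with inputs in g."""
    values = []
    for a1, a2, a3 in samples:
        norm = math.sqrt(a1 * a1 + a2 * a2 + a3 * a3)
        # Subtract 1 g for gravity, set negative values to zero, convert g to mg.
        values.append(max(norm - 1.0, 0.0) * 1000.0)
    return values


def epoch_means(values, samples_per_epoch):
    """Average consecutive blocks of samples (assumed epoch summary)."""
    n_epochs = len(values) // samples_per_epoch
    return [
        sum(values[i * samples_per_epoch:(i + 1) * samples_per_epoch])
        / samples_per_epoch
        for i in range(n_epochs)
    ]
```

For data sampled at 30 Hz, a 5-s epoch would correspond to `samples_per_epoch=150` under these assumptions.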
        Overall, the ENMO metric exhibited strong correlations with VO2 (Hildebrand et al.,
2014). The intensity classification accuracy of the developed ENMO cut points was highest
when detecting sedentary/light intensity activities, correctly identifying 93-96% of values, and
lowest at moderate intensity activities, correctly identifying 54-59% of values. Between brands,
ENMO performed well, demonstrating no main differences in adults or children.
Activity index
        The activity index (AI) is the final acceleration-based metric of interest that will be used
in the current investigation. AI is a unitless metric and is calculated as the variance of the
acceleration value along the three axes (Bai et al., 2016). The equation to calculate the AI is:
        \[ \mathrm{AI}_i(t; H) = \sqrt{\max\left(\frac{1}{3}\sum_{m=1}^{3}\frac{\sigma_{im}^2(t; H) - \bar{\sigma}_i^2}{\bar{\sigma}_i^2},\ 0\right)} \]
where \sigma_{im}^2(t; H) is the variance of participant i’s acceleration signal along axis m (m=1, 2, 3) in
the window of length H starting at t. The value \bar{\sigma}_i^2 (sigma) is the systematic noise variance,
which is calculated when the device is not moving.
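As an illustration of the variance-based definition above, the following sketch computes an activity index for a single window. It assumes the normalized form in which each axis variance is expressed relative to the systematic noise variance (consistent with a unitless metric), uses the sample variance, and takes the noise variance as a known input; these choices and the function name are assumptions for illustration, not a description of the “SummarizedActigraphy” code.

```python
import math
from statistics import variance


def activity_index(window, noise_variance):
    """Activity index for one window of (a1, a2, a3) samples.

    noise_variance is the resting (device not moving) variance and must be > 0.
    """
    # Sample variance of each axis within the window (assumed estimator).
    axis_variances = [variance([sample[m] for sample in window]) for m in range(3)]
    # Mean relative excess of the axis variances over the noise variance.
    relative_excess = sum(
        (v - noise_variance) / noise_variance for v in axis_variances
    ) / 3.0
    # Negative values (signal below the noise floor) are set to zero.
    return math.sqrt(max(relative_excess, 0.0))
```

A window recorded while the device is still yields an index of zero, because the axis variances do not exceed the noise variance.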
        The AI was developed and validated by Bai et al. (2016) in a sample of healthy adult
women between 60- and 91-years of age. Participants wore an ActiGraph accelerometer at the
right hip and performed activities of daily living (e.g., sitting, doing laundry, washing dishes)
and more intense activities like brisk walking. Oxygen consumption (VO2) was measured via
portable metabolic unit and METs were used as the criterion measure to which both the AI and
VM were compared. MET values <1.5 were classified as sedentary, ≥1.5 but less than 3 were
classified as light PA, and values ≥3 were classified as MVPA. Unlike metrics MAD and
ENMO, the AI was not validated in children, and findings from the study by Bai et al. (2016)
cannot be applied to younger populations. Therefore, a study utilizing the AI in populations of
children is warranted to understand normative values of the AI metric.
Comparability of Metrics
        Despite the shift in interest toward developing and validating acceleration-based metrics, little
work has been done on the comparability of these metrics to each other or to established
count-based metrics. Such comparisons are crucial for metric development because they provide the basis for data
harmonization. For example, surveillance PA data from the United States has primarily been
collected in the activity counts metric (Luke et al., 2011) while surveillance data from Finland
has used the acceleration-based metric MAD (Husu et al., 2016) and large-scale epidemiological
studies in the United Kingdom use ENMO (Doherty et al., 2017). This makes harmonization
problematic, and comparisons cannot be made between PA data sets on the country-wide level.
        The limited work that has been done on the comparability of metrics has produced
inconclusive results. A study done in 2016 by Bai and colleagues comparing acceleration-based
metric ENMO and count-based VM against their newly developed acceleration-based metric, the
AI, added to the literature demonstrating the ability of acceleration-based metrics to outperform
count-based metrics. Bai and colleagues (2016) found that the AI was a better classifier of
sedentary/light activity, but not MVPA, when compared to ENMO. Furthermore, the AI metric
outperformed VM when classifying all activity intensities as demonstrated by receiver operating
characteristic (ROC) curve analyses (Bai et al., 2016). Bai et al. (2016) also found that ENMO outperformed
traditional activity counts when detecting MVPA, but activity counts demonstrated better
classification of sedentary/light activities compared to ENMO.
         Other studies have compared count- and acceleration-based metrics amongst themselves
in free living and without a criterion measure. Migueles and colleagues (2019) reported
significantly higher estimations of time spent in MVPA produced by VM when compared to
estimations of MVPA derived from VA and the acceleration-based metric ENMO. A recent
paper from Karas and colleagues (2022) compared VM to MAD, ENMO, AI, and a fourth
acceleration summary metric, the Monitor-Independent Movement Summary (MIMS-unit)
(John et al., 2019). Using data from the Baltimore Longitudinal Study of Aging, Karas and
colleagues found strong minute-level Pearson’s correlations across all metric comparisons,
ranging from 0.87 (MIMS vs ENMO) to 0.99 (VM vs MIMS). Both the acceleration- and count-
based metrics yielded similar graphical curves when estimating minute-level patterns of daily PA
(Karas et al., 2022). However, drawing conclusions on the performance of acceleration- versus
count-based metrics is difficult due to the lack of studies directly comparing estimates of PA as
produced by each metric.
         In addition to the limited number of studies in this area, existing literature often does not
comprehensively compare multiple metrics. This presents a considerable problem as researchers
continue to develop and utilize new metrics without understanding the strengths and limitations
of other acceleration-based metrics. Furthermore, there are no studies that provide comparability
data on these metrics in populations of children. Understanding the performance of different metrics is critical when
attempting to look at patterns of PA over time, as different data reduction techniques via different
metrics may alter estimates of overall PA.
Temporal Patterns of Physical Activity
        Another aspect of PA that should be considered, but for which we have limited
information, is its temporal patterns, which are defined as the chronological succession of PA over
time (De Baere et al., 2015). The context and temporality of PA, particularly during the recess
period, can provide useful insight into what drives or deters unstructured free play PA in
children. Previous research regarding the temporality of recess PA indicates that PA is higher
during the transition to and start of recess, and gradually declines as the recess period continues
(McKenzie et al., 1997; Holmes, 2012). However, more specific, recent information about
temporal patterns of PA is not available due to limited work in this area.
Measurement of Temporal Patterns
        Accelerometry is an ideal method for capturing temporal patterns of movement behaviors
because data are time-stamped and collected at a high resolution. These data can be processed
using machine learning approaches like clustering or by quantifying the frequency and intensity
of bouts. Activity fragmentation is an approach that uses accelerometry to characterize how
continuous or discontinuous PA is, with a focus on the
number and duration of active or inactive bouts (Wanigatunga et al., 2019; Palmberg et al., 2020;
Tian et al., 2021). The activity fragmentation approach can be operationalized or quantified in a
number of ways with some of the more common methods being identification of transition
probabilities (e.g., active-to-sedentary, sedentary-to-active) (Schrack et al., 2019), number of
activity fragments, and mean duration of activity fragments (Chastin et al., 2012).
         Previously, activity fragmentation has been used primarily in populations of older adults
to illustrate the inverse relationship between more fragmented PA and worse functional health
(Wanigatunga et al., 2019; Palmberg et al., 2020; Tian et al., 2021). Although activity fragmentation
is a relatively new approach and no prior research has used activity fragmentation metrics in
children, it may manifest differently in children and prove to be a useful tool
for quantifying the temporal patterns of children's PA by providing a numerical representation of
continuous or discontinuous PA. Investigation into the temporal patterns of children’s recess
activity, for example by using activity fragmentation metrics, can be used to highlight beneficial
or detrimental patterns, as well as group differences, that can further inform PA interventions.
However, because children’s PA occurs in transient bouts lasting approximately 20 s (Bailey et
al., 1995), important fluctuations of children’s PA would be lost if PA were condensed over a 60-s
epoch, as is standard practice in adults. As shorter epochs capture more accurate estimates of PA
when reducing accelerometry data in children (Aadland et al., 2020), use of a 5-s window length
for calculating activity fragmentation metrics (i.e., classifying bouts and transitions) may be
more appropriate in children.
         In populations of adolescents and adults, activity fragmentation has been calculated using
traditional count-based metrics (Schrack et al., 2019; Del Pozo Cruz & Del Pozo-Cruz, 2021), as
well as the acceleration-based metrics ENMO (Osborn et al., 2018) and MAD (Palmberg et al.,
2020), but no research on the comparability of activity fragmentation calculated from different
accelerometer metrics has been done. Similar to the issue with accelerometer metrics and
quantification of PA, it is unknown whether different metrics similarly capture the temporal nature of
activity, and this should be investigated as the popularity of acceleration-based
metrics grows.
Conclusions
        In summary, the use of count-based metrics was an important contribution to the field of
PA measurement via accelerometry but is outdated as between-manufacturer and between-study
comparisons cannot be made. Using metrics generated from raw acceleration facilitates
uniformity, transparency, and comparability between devices and studies, and is a critical step
for the development of PA measurement research. Further investigation into the comparability of
the acceleration metrics of interest, MAD, ENMO, and the AI, for capturing both overall
intensity and temporality of PA is warranted. Accelerometer metrics in combination with activity
fragmentation can provide crucial information regarding children’s recess PA levels in order to
develop more effective strategies to optimize the PA obtained during outdoor recess, further
contributing to children’s overall health and development.
                                             METHODS
Setting and Participants
         Three elementary schools in East Lansing, Michigan agreed to participate in this study.
Children (N=88; age=7.8±0.7 yrs) from five classrooms per school, including seven 1st grade
(n=50) and eight 2nd grade (n=38) classrooms, participated. All children enrolled in 1st or 2nd
grade at the time of data collection were eligible to participate. One parent/guardian provided
written informed consent and each child provided verbal and written assent on the first day of
data collection.
         Data collection occurred during May and June 2019, when the average temperature was
72 degrees Fahrenheit. Each school provided two, 20-minute recess periods per day with one
recess in the morning and one in the afternoon for a total of 40-minutes per day of scheduled
outdoor recess. Recess took place on the schoolyard, which consisted of an open grassy field,
fixed equipment (e.g., slides, swings), and asphalt areas used for games like basketball or
foursquare.
Data Collection
         Each child participated in up to four days of data collection. However, to reduce
inconsistencies in the amount of data per child due to absences and to limit the hierarchical
nature of the data, one recess period per child was selected using a random number generator for
inclusion in the analyses.
         Children wore a triaxial accelerometer (ActiGraph, LLC, Pensacola, FL) on an elastic
belt at the right hip, the most commonly used accelerometer wear location in this age group
(Migueles et al. 2017). The accelerometer was an ActiGraph wGT3X-BT (firmware v1.9.2; n =
50), GT3X+ (firmware v3.2.1; n = 24), wGT3X+ (firmware v3.2.1; n = 9), or a GT9X Link
(v1.7.2; n = 5). Multiple generations of ActiGraphs were used due to limited availability of
accelerometers at the time of the study. The GT3X+ and wGT3X+ devices have a slightly
different dynamic range (±6g) and internal processing steps compared to the wGT3X-BT and
GT9X (dynamic range of ±8g). Clevenger et al. (2020) demonstrated that, despite these small
differences, data are comparable across multiple generations of ActiGraph devices. Furthermore,
the goal of the current study was to compare metric outcomes within a participant from the same
device, so differences in accelerometer models should not affect the findings. Accelerometers
were initialized to record raw acceleration data at 30 Hz with the same start time using ActiLife
software (version 2.0.0). After each day, accelerometers were returned to a study member and
data were downloaded using the ActiLife software and stored on a computer in a protected
location on the Michigan State University campus as raw acceleration and activity counts per 5-s epoch.
Data Processing
        Only data from the selected recess periods were included in the present analysis. Recess
start/end times were identified using the schedule provided by each school and video recordings
of the schoolyard. The start of each recess period was defined as the moment the first child in the camera
view set foot outside the school building and onto the schoolyard. Similarly, the end of
recess was defined as the moment the last child in the camera view set foot inside the school
building. Attendance logs completed by research staff were used to determine each child’s
presence/absence during each recess period.
Physical Activity Intensity
        Count and raw data from the accelerometers were loaded into RStudio (version 1.3.1056)
as “.csv” files using the “AGread” package (version 1.1.1) (Hibbing, 2018), which was also used
to calculate ENMO (Hildebrand et al., 2014). MAD was calculated as the variability in the
triaxial acceleration about the mean (Aittasalo et al., 2015). AI was calculated as
the variance of the acceleration value along the three axes (Bai et al., 2016) using the package
“SummarizedActigraphy” (version 0.5.0). All five accelerometer metrics (VM, VA, MAD,
ENMO, and AI) were calculated over a 5-s epoch and are continuous variables wherein higher
values indicate higher intensity activity. For each metric, all 5-s epoch data across all participants
were used to identify quartiles. Each 5-s epoch was then assigned a value of 1 through 4 based on
these quartile thresholds as a proxy for activity intensity. For each participant, the average of
each metric (e.g., mean VM) and time spent in each quartile according to each metric was
calculated over the participant’s selected recess period. Finally, each child was ranked from most
to least active using the mean of each of the five metrics and these rankings were used to further
classify children as least active, below average, above average, or most active.
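The quartile-based classification described in this paragraph can be sketched as follows. The example pools epoch values to find three cut points, assigns each epoch a level from 1 to 4, and sums the time spent in each level for one participant. The quantile interpolation method and the handling of values that fall exactly on a cut point are assumptions, as are the function names; the original analysis was performed in R.

```python
from statistics import quantiles


def quartile_thresholds(pooled_epoch_values):
    """Return the three cut points from epoch values pooled across participants."""
    # method="inclusive" treats the pooled data as the full population (assumed).
    return quantiles(pooled_epoch_values, n=4, method="inclusive")


def assign_quartile(value, thresholds):
    """Map one epoch value to an activity level from 1 (lowest) to 4 (highest)."""
    level = 1
    for cut in thresholds:
        # Values equal to a cut point stay in the lower level (assumed tie rule).
        if value > cut:
            level += 1
    return level


def seconds_per_quartile(epoch_values, thresholds, epoch_seconds=5):
    """Total time (s) one participant spent in each activity level."""
    totals = {1: 0, 2: 0, 3: 0, 4: 0}
    for value in epoch_values:
        totals[assign_quartile(value, thresholds)] += epoch_seconds
    return totals
```

Because the cut points come from the pooled data, a given participant need not spend equal time in each level, which is what allows time in each quartile to differ between children and between metrics.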
Temporal Patterns
         For each participant, five fragmentation indices were calculated for each metric using
modified code from the “GGIR” package (version 2.6) (van Hees et al., 2022). The number of
active fragments was calculated by segmenting epoch-level data into active (quartiles 2 to 4) or
inactive (quartile 1) epochs. The number of high activity fragments was calculated similarly, but high activity was
defined as epochs classified as quartiles 3 to 4. Mean duration of high activity fragments was
calculated in seconds. Inactivity-to-physical-activity transition probability, which represents the
likelihood of switching from inactivity to activity, was calculated as 1 divided by the mean
duration of inactive fragments. Physical-activity-to-inactivity transition probability was
calculated as the reciprocal of the mean duration of activity fragments.
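The five fragmentation indices defined above can be sketched from a participant's sequence of epoch-level quartiles. This is a simplified illustration and not the modified “GGIR” code: it assumes a fragment is any run of consecutive epochs in the same state, and that the mean fragment durations entering the transition probabilities are expressed in epochs rather than seconds. All function names are hypothetical.

```python
def run_lengths(flags):
    """Split a sequence of booleans into lengths of True runs and False runs."""
    true_runs, false_runs = [], []
    current, length = None, 0
    for flag in flags:
        if flag == current:
            length += 1
        else:
            if current is not None:
                (true_runs if current else false_runs).append(length)
            current, length = flag, 1
    if current is not None:
        (true_runs if current else false_runs).append(length)
    return true_runs, false_runs


def mean_or_nan(values):
    """Mean of a list, or NaN when the list is empty."""
    return sum(values) / len(values) if values else float("nan")


def fragmentation_indices(quartiles, epoch_seconds=5):
    """Compute the five indices from epoch-level quartiles (1 to 4)."""
    active, inactive = run_lengths([q >= 2 for q in quartiles])
    high, _ = run_lengths([q >= 3 for q in quartiles])
    return {
        "n_active_fragments": len(active),
        "n_high_fragments": len(high),
        "mean_high_duration_s": mean_or_nan(high) * epoch_seconds,
        # Reciprocals of mean fragment duration, in epochs (assumed unit).
        "tp_inactive_to_active": 1.0 / mean_or_nan(inactive),
        "tp_active_to_inactive": 1.0 / mean_or_nan(active),
    }
```

Under these assumptions, shorter inactive fragments give a higher inactivity-to-physical-activity transition probability, reflecting more frequent switching into activity.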
Statistical Analyses
         Pearson’s (r) correlations were used to assess the strength of the association between
metrics at the epoch-level and were interpreted as poor (r=0.20 to 0.30), fair (r=0.30 to 0.50),
moderate (r=0.60 to 0.70), strong (r=0.80 to 0.90), and perfect (r=1.0) (Chan, 2003). Weighted
kappa (κ) was calculated using the “irr” package (version 0.84.1) (Gamer, Lemon, & Singh,
2012) to assess agreement between metrics in the assigned quartile of each epoch (i.e., 1-4) and
interpreted as no agreement (κ=0 to 0.20), minimal (κ=0.21 to 0.39), weak (κ=0.40 to 0.59),
moderate (κ=0.60 to 0.79), strong (κ=0.80 to 0.90), and almost perfect (κ>0.90) (McHugh, 2012).
Weighted kappa is appropriate because it accounts for ordering of the categories (Cohen, 1968).
For example, a misclassification of a quartile 4 epoch as quartile 1 is more greatly penalized than
a quartile 4 epoch misclassified as quartile 3.
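To illustrate how weighted kappa penalizes distant misclassifications more heavily than adjacent ones, a minimal sketch is given here. It implements Cohen's weighted kappa with disagreement weights proportional to the distance between categories; linear weights (power=1) are assumed because the weighting scheme used in the analysis is not stated, and the function is a stand-in for illustration rather than the “irr” implementation.

```python
def weighted_kappa(ratings_a, ratings_b, categories=(1, 2, 3, 4), power=1):
    """Cohen's weighted kappa for two sets of ordinal ratings.

    power=1 gives linear disagreement weights, power=2 gives quadratic weights.
    """
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)
    # Observed joint proportions of (rating_a, rating_b).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    return 1.0 - weighted_observed / weighted_expected
```

With this weighting, a single quartile 4 epoch labeled as quartile 1 lowers kappa more than the same epoch labeled as quartile 3.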
        Once collapsed to the participant level, Pearson’s r correlation coefficients were used to
compare the mean of each metric, time spent in each quartile, and the five fragmentation indices
between metrics. Two one-sided tests of equivalence were performed using the “TOSTER”
package (version 0.4.0) (Lakens, 2017) and assessed equivalence in percent of time spent in each
quartile and the fragmentation indices between the five metrics. Equivalence testing is more
appropriate than traditional hypothesis testing when a meaningful difference is not expected
between comparison group averages (Dixon et al., 2018). Equivalence bounds were determined
as five percent of the mean of each metric. Spearman’s rho was used to assess the association between
each child’s rankings based on the mean of each metric. Confusion matrices were created using
the package “caret” (version 6.0-86) (Kuhn, 2008) to assess agreement between metrics when
categorizing children as least active, below average, above average, or most active.
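The decision rule behind the equivalence tests can be sketched as a comparison between a confidence interval for the mean difference and the symmetric equivalence bound. The example below assumes a normal approximation for the interval (a 90% interval corresponds to two one-sided tests at alpha = 0.05); a t distribution would normally be used in practice, and the function names are hypothetical rather than part of the “TOSTER” package.

```python
from statistics import NormalDist


def confidence_interval(mean_diff, se, alpha=0.05):
    """(1 - 2 * alpha) interval for a mean difference (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha)
    return mean_diff - z * se, mean_diff + z * se


def is_equivalent(ci_low, ci_high, bound):
    """Equivalence is concluded when the interval lies within (-bound, +bound)."""
    return -bound < ci_low and ci_high < bound


# Usage with values of the kind reported in the equivalence tables:
# an interval of (-0.45, 0.67) against a bound of 1.230 is equivalent,
# whereas an interval of (-2.67, 2.36) against the same bound is not.
print(is_equivalent(-0.45, 0.67, 1.230))
print(is_equivalent(-2.67, 2.36, 1.230))
```

Note that a wide interval can fail the equivalence test even when the mean difference is close to zero, because the conclusion depends on the precision of the estimate as well as its size.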
                                            RESULTS
Participant characteristics
        The final study sample consisted of 50 first graders and 38 second graders across all 3
schools (N=88) with a mean age of 7.8 years (SD=0.7). School 1 had 52 children participate;
School 2 had 16 participants; and School 3 had 20 children participate. Three children were
excluded from the present analysis (one child dropped out due to behavioral issues, two children
did not wear the accelerometer belt). Overall, there were more female participants (74%; n=65)
than males (26%; n=23). Children included in the final analyses had an average of 25.9 ± 3.0
minutes of data for the randomly selected recess period.
Aim 1
Physical activity intensity
        Mean values per 5-s epoch and per participant for the five accelerometer metrics are
reported in Table 1. At the epoch-level, correlation coefficients for the association between
metrics ranged from moderately high (r=0.69, VM vs ENMO) to high (r=0.98, AI vs ENMO, AI
vs MAD, MAD vs ENMO) (Table 2). Weighted kappa indicated that epoch-level agreement between
metrics ranged from strong to almost perfect, with the weakest agreement between VM and
ENMO (κ = 0.81) and the strongest agreement between AI and MAD (κ = 0.96) (see Table 2).
Table 1. Means ± SD for metrics at the epoch- and participant-level.
 Metric                             Epoch-level                        Participant-level
 VM (counts/5s)                     305.3 ± 327.6                        308.4 ± 146.9
 VA (counts/5s)                     154.4 ± 224.9                        156.5 ± 89.7
 ENMO (mg)                          135.8 ± 188.4                        137.8 ± 77.9
 MAD (mg)                           192.7 ± 241.9                        195.4 ± 104.1
 AI                                   1.1 ± 1.2                            1.1 ± 0.5
VM = vector magnitude; VA = vertical axis; ENMO = Euclidean norm minus one; MAD = mean
amplitude deviation; AI = activity index
Note: AI is a unitless metric.
Table 2. Pearson’s (r) correlations and weighted kappa (κ) for all metrics when compared at a
5-s epoch.
        Comparison                 Pearson’s r              Kappa
         AI vs VM                     0.70                   0.87
       AI vs ENMO                     0.98                   0.93
        AI vs MAD                     0.98                   0.96
         AI vs VA                     0.70                   0.87
      MAD vs ENMO                     0.98                   0.94
       MAD vs VM                      0.72                   0.84
        MAD vs VA                     0.75                   0.86
      VM vs ENMO                      0.69                   0.81
        VM vs VA                      0.94                   0.90
       ENMO vs VA                     0.72                   0.83
AI = activity index; VM = vector magnitude; ENMO = Euclidean norm minus one;
MAD = mean amplitude deviation; VA = vertical axis
        Participant-level correlations between mean metrics ranged from moderately high
(r=0.77, AI vs VM) to high (r=0.99, ENMO vs MAD, r=0.99, MAD vs AI). Participant-level
correlations between time spent in each quartile ranged from small (r=0.29, ENMO vs VA
quartile 2) to high (r=0.99, MAD vs AI quartiles 1 and 4). Percent of time spent in quartile 1 was
equivalent for five of the ten pairwise metric comparisons (AI vs VM, AI vs MAD, AI vs VA,
VM vs MAD, VM vs VA) (Table 3). Similarly, percent of time spent in quartile 2 was equivalent
for five metric comparisons (AI vs VM, AI vs MAD, AI vs VA, VA vs VM, VA vs MAD)
(Table 4). All metrics were equivalent when estimating percent of time spent in quartile 3 (Table
5) while percent of time spent in quartile 4 was equivalent for all but three comparisons (VM vs
AI, VM vs MAD, and VM vs ENMO) (Table 6).
Table 3. Equivalence between metrics in percent of time spent in quartile 1 activity level.
                             Bias                            Equivalence Test
    Comparison              Mean          SE       Confidence Interval          Equivalent
     AI vs. VM               0.11        0.34          -0.45, 0.67                 Yes
  MAD vs. ENMO              -0.15        1.51          -2.67, 2.36                 No
    VM vs. MAD               0.15        0.50          -0.68, 0.99                 Yes
   ENMO vs. VM              <0.00        1.52          -2.53, 2.54                 No
   VA vs. ENMO               0.15        1.55          -2.44, 2.73                 No
   AI vs. ENMO               0.11        1.49          -2.37, 2.58                 No
    MAD vs. AI              -0.04        0.22          -0.41, 0.33                 Yes
      VA vs. AI              0.25        0.51          -0.60, 1.10                 Yes
     VM vs. VA              -0.14        0.41          -0.82, 0.53                 Yes
    MAD vs. VA               0.30        0.62          -0.73, 1.32                 No
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -1.230, 1.230 (5% of the mean)
to determine equivalence.
Table 4. Equivalence between metrics in percent of time spent in quartile 2 activity level.
                          Bias                             Equivalence Test
    Comparison            Mean         SE       Confidence Interval           Equivalent
     AI vs. VM            -0.10       0.66           -1.14, 1.04                  Yes
  MAD vs. ENMO             0.24       1.39           -2.10, 2.54                  No
    VM vs. MAD            -0.09       0.76           -1.35, 1.17                  No
   ENMO vs. VM             0.15       1.48           -2.32, 2.62                  No
   VA vs. ENMO             0.13       1.43           -2.25, 2.51                  No
   AI vs. ENMO            -0.20       1.36           -2.46, 2.10                  No
    MAD vs. AI             0.04       0.28           -0.42, 0.50                  Yes
     VA vs. AI            -0.10       0.59           -1.05, 0.92                  Yes
     VM vs. VA             0.02       0.54           -0.88, 0.91                  Yes
    MAD vs. VA            -0.11       0.65           -1.19, 0.98                  Yes
 AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -1.248, 1.248 (5% of the mean)
to determine equivalence.
Table 5. Equivalence between metrics in percent of time spent in quartile 3 activity level.
                          Bias                             Equivalence Test
    Comparison            Mean         SE       Confidence Interval           Equivalent
     AI vs. VM            -0.03       0.52           -0.88, 0.84                  Yes
  MAD vs. ENMO            -0.08       0.42           -0.78, 0.61                  Yes
    VM vs. MAD             0.02       0.57           -0.95, 0.95                  Yes
   ENMO vs. VM            -0.09       0.70           -1.25, 1.08                  Yes
   VA vs. ENMO            -0.14       0.64           -1.20, 0.92                  Yes
   AI vs. ENMO             0.07       0.47           -0.71, 0.85                  Yes
    MAD vs. AI            -0.02       0.25           -0.44, 0.40                  Yes
     VA vs. AI            -0.08       0.45           -0.83, 0.68                  Yes
     VM vs. VA             0.06       0.52           -0.81, 0.93                  Yes
    MAD vs. VA            -0.06       0.48           -0.86, 0.74                  Yes
 AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -1.254, 1.254 (5% of the mean)
to determine equivalence.
Table 6. Equivalence between metrics in percent of time spent in quartile 4 activity level.
                            Bias                               Equivalence Test
    Comparison             Mean           SE       Confidence Interval          Equivalent
     AI vs. VM              -0.04        0.77            -1.32, 1.24               No
  MAD vs. ENMO             < -0.00       0.26            -0.44, 0.43               Yes
    VM vs. MAD              -0.06        0.80            -1.40, 1.27               No
   ENMO vs. VM              -0.07        0.80            -1.40, 1.27               No
   VA vs. ENMO              -0.13        0.62            -1.16, 0.90               Yes
   AI vs. ENMO               0.02        0.30            -0.48, 0.53               Yes
    MAD vs. AI               0.02        0.26            -0.41, 0.45               Yes
      VA vs. AI             -0.11        0.64            -1.18, 0.96               Yes
     VM vs. VA               0.07        0.46            -0.70, 0.83               Yes
    MAD vs. VA              -0.13        0.60            -1.13, 0.87               Yes
 AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -1.268, 1.268 (5% of the mean)
to determine equivalence.
        When children were ranked from most to least active using each of the five accelerometer
metrics, agreement between metrics was very strong (Spearman’s rho ≥ 0.87) (Table 7). When these rankings were
used to classify children into one of four groups (least active, below average, above average,
most active), the highest agreement was seen for classifying the least active children, wherein
82-95% of children were categorized as least active by both the referent and comparison metric
(e.g., VA classification and VM classification). There was more variability when examining the
below average (50-91% concordance) and above average (50-86% concordance) groups. Finally,
64-91% of children were categorized as most active by the referent and comparison metrics in
their respective comparisons (Table 8).
Table 7. Association between the ranking of children’s activity level during recess by five
accelerometer metrics (Spearman’s rho).
                     VM Rank        MAD Rank      ENMO Rank           VA Rank
 VM Rank                 -
 MAD Rank               0.88               -
 ENMO Rank              0.87             0.99            -
 VA Rank                0.97             0.90          0.89               -
 AI Rank                0.87             0.99          0.98             0.87
VM = vector magnitude; MAD = mean amplitude deviation; ENMO = Euclidean norm minus
one; VA = vertical axis; AI = activity index
Table 8. Confusion matrix showing agreement in the classification of each child’s overall
activity level as least active, below average activity, above average activity, and most active
according to five accelerometer metrics. The metric in the left-most column served as the
referent group and numbers are represented as raw number (percent) of children within each
activity group according to the referent group that were also classified as that activity group by
the other metric.
                                                    VA Classification
 VM Classification                Least          Below Average        Above Average     Most
    Least                        19 (86)               3 (14)               0 (0)        0 (0)
    Below Average                 3 (14)              15 (68)              4 (18)        0 (0)
    Above Average                  0 (0)              4 (18)              16 (73)        2 (9)
    Most                           0 (0)               0 (0)                2 (9)      20 (91)
                                                   MAD Classification
 VM Classification                Least          Below Average        Above Average     Most
    Least                        18 (82)               4 (18)               0 (0)        0 (0)
    Below Average                 4 (18)              12 (55)              4 (18)        2 (9)
    Above Average                  0 (0)              5 (23)              11 (50)       6 (27)
    Most                           0 (0)               1 (5)               7 (32)      14 (64)
                                                  ENMO Classification
 VM Classification                Least          Below Average        Above Average     Most
    Least                        18 (82)               4 (18)               0 (0)        0 (0)
    Below Average                 4 (18)              11 (50)              5 (23)        2 (9)
    Above Average                  0 (0)              6 (27)              11 (50)       5 (23)
    Most                           0 (0)               1 (5)               6 (27)      15 (68)
                                                    AI Classification
 VM Classification                Least          Below Average        Above Average     Most
    Least                        19 (86)               3 (14)               0 (0)        0 (0)
    Below Average                 3 (14)              12 (55)              6 (27)        1 (5)
    Above Average                  0 (0)              5 (23)              11 (50)       6 (27)
    Most                           0 (0)               2 (9)               5 (23)      15 (68)


 Table 8 (cont’d)
                                                  VA Classification
 ENMO Classification          Least             Below Average        Above Average  Most
   Least                    18 (82)                  4 (18)              0 (0)       0 (0)
   Below Average             4 (18)                 13 (59)              4 (18)      1 (5)
   Above Average              0 (0)                  5 (23)             10 (45)     7 (32)
   Most                       0 (0)                   0 (0)             8 (36)     14 (64)
                                                   AI Classification
 ENMO Classification          Least             Below Average        Above Average  Most
   Least                    21 (95)                   1 (5)              0 (0)       0 (0)
   Below Average              1 (5)                 19 (86)              2 (9)       0 (0)
   Above Average              0 (0)                   2 (9)             19 (86)      1 (5)
   Most                       0 (0)                   1 (5)              1 (5)     21 (95)
                                                 MAD Classification
 ENMO Classification          Least             Below Average        Above Average  Most
   Least                    21 (95)                   1 (5)              0 (0)       0 (0)
   Below Average              1 (5)                 19 (86)              1 (5)       0 (0)
   Above Average              0 (0)                   2 (9)             18 (82)      2 (9)
   Most                       0 (0)                   0 (0)              2 (9)     20 (91)
                                                   AI Classification
 MAD Classification           Least             Below Average        Above Average  Most
   Least                    21 (95)                   1 (5)              0 (0)       0 (0)
   Below Average              1 (5)                 20 (91)              1 (5)       0 (0)
   Above Average              0 (0)                   1 (5)             19 (86)      2 (9)
   Most                       0 (0)                   0 (0)              2 (9)     20 (91)
                                                  VA Classification
 MAD Classification           Least             Below Average        Above Average  Most
   Least                    19 (86)                  3 (14)              0 (0)       0 (0)
   Below Average             3 (14)                 15 (68)              4 (18)      0 (0)
   Above Average              0 (0)                  3 (14)             11 (50)     8 (36)
   Most                       0 (0)                   1 (5)             7 (32)     14 (64)
                                                   AI Classification
 VA Classification            Least             Below Average        Above Average  Most
   Least                    18 (82)                  4 (18)              0 (0)       0 (0)
   Below Average             4 (18)                 13 (59)              5 (23)      0 (0)
   Above Average              0 (0)                  3 (14)             11 (50)     8 (36)
   Most                       0 (0)                   2 (9)             6 (27)     14 (64)
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; AI = activity index
Least = Children whose overall activity level fell into the lowest quartile
Most = Children whose overall activity level fell into the highest quartile
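
The cross-classification in Table 8 can be illustrated with a minimal sketch. It assumes that each child has one overall activity value per metric, that children are split into four equal-sized groups by rank within each metric, and that percentages are taken within the referent group. The function names (`quartile_groups`, `cross_tab`) and the tie-breaking rule are hypothetical and are not taken from the analysis code used in the study.

```python
from collections import Counter

LABELS = ["Least", "Below Average", "Above Average", "Most"]

def quartile_groups(values):
    """Assign each child to one of four equal-sized groups by rank.

    Ties are broken by input order (an assumption for this sketch).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [None] * len(values)
    for rank, idx in enumerate(order):
        groups[idx] = LABELS[min(rank * 4 // len(values), 3)]
    return groups

def cross_tab(referent, other):
    """Return {referent label: {other label: (count, percent of referent group)}}."""
    counts = Counter(zip(referent, other))
    table = {}
    for r in LABELS:
        row_total = sum(counts[(r, o)] for o in LABELS)
        table[r] = {
            o: (counts[(r, o)], round(100 * counts[(r, o)] / row_total) if row_total else 0)
            for o in LABELS
        }
    return table

# Toy data for eight children (illustrative values only, not study data).
vm_values = [1, 2, 3, 4, 5, 6, 7, 8]
va_values = [1, 3, 2, 4, 5, 6, 7, 8]
agreement = cross_tab(quartile_groups(vm_values), quartile_groups(va_values))
```

With 88 children this grouping yields 22 children per group, which matches the row totals in Table 8.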


Aim 2
Temporal characteristics
         Accelerometer metrics are plotted over the randomly selected recess periods by school
(i.e., school 1, 2, and 3) in Figures 1 to 3. Time in number of 5-s epochs is shown on the x-axis
and metrics are plotted on the y-axis, with the right side of the y-axis displaying the unitless
metric AI. Means and standard deviations for fragmentation indices are displayed in Table 9.
Figure 1. Five accelerometer metrics plotted over randomly selected recess periods for school 1.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index


Figure 2. Five accelerometer metrics plotted over randomly selected recess periods for school 2.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index


Figure 3. Five accelerometer metrics plotted over randomly selected recess periods for school 3.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index


Table 9. Fragmentation indices derived from five accelerometer metrics.
                              Transition Probability   Number of High       Number of            Mean Duration of High
                              Physical Activity to     Activity Fragments   Activity Fragments   Activity Fragments
  Metric    Mean (SD)         Inactivity               (Q3-4)               (Q2-4)               in Seconds
  VM        308.4 (146.9)     0.36 (0.10)              62.30 (23.38)        41.65 (21.34)        22.61 (11.79)
  VA        156.5 (9.7)       0.35 (0.09)              60.82 (23.06)        43.05 (21.64)        23.08 (10.39)
  MAD       195.4 (104.5)     0.31 (0.09)              51.36 (21.24)        34.85 (19.80)        29.25 (20.18)
  ENMO      137.8 (77.9)      0.30 (0.11)              50.91 (21.42)        34.18 (22.01)        33.03 (43.95)
  AI        1.1 (0.5)         0.32 (0.10)              52.75 (22.14)        37.84 (20.58)        28.58 (19.93)
Values are presented as mean (SD).
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation; ENMO = Euclidean norm minus one; AI = activity
index


        The lowest physical-activity-to-inactivity transition probabilities (Figure 4) were
observed with the acceleration-based metrics ENMO, MAD, and AI (30, 31, and 32%, respectively). The
highest physical-activity-to-inactivity transition probabilities were seen with metrics VA and
VM (35 and 36%, respectively). Similar trends were observed with the inactivity-to-physical-
activity transition probability shown in Figure 4. The number of high activity fragments (Figure 5)
was highest for VM (62.30) and lowest for ENMO (50.91). The number of activity fragments
(Figure 6) was also lowest for ENMO (34.18) but was highest for VA (43.05), followed closely by
VM (41.65). Mean duration of high activity fragments according to 5-s epochs was highest for
ENMO (6.61 5-s epochs) and lowest for VM (4.52 5-s epochs) (Figure 7).
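
As an illustration of how the fragmentation indices above can be derived from a single child's recess record, the following is a minimal sketch. It assumes that each 5-s epoch has already been labelled with its quartile (1 to 4), that "activity" means quartiles 2 to 4 and "high activity" means quartiles 3 to 4 (as in the Table 9 column headers), and that transition probabilities are estimated as the share of epochs in one state that are followed by an epoch in the other state. These definitions and the function names are assumptions made for this sketch and may differ from the exact formulas used in the analysis.

```python
def bouts(flags):
    """Return the lengths of consecutive runs of True in a list of booleans."""
    lengths, run = [], 0
    for flag in flags:
        if flag:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths

def transition_probability(flags, from_state):
    """Share of epochs in from_state whose next epoch is in the other state."""
    pairs = [(a, b) for a, b in zip(flags, flags[1:]) if a == from_state]
    if not pairs:
        return float("nan")
    return sum(1 for _, b in pairs if b != from_state) / len(pairs)

def fragmentation_indices(quartiles, epoch_s=5):
    """Compute five fragmentation indices from per-epoch quartile labels (1-4)."""
    active = [q >= 2 for q in quartiles]   # assumed definition: Q2-4
    high = [q >= 3 for q in quartiles]     # assumed definition: Q3-4
    high_bouts = bouts(high)
    mean_high = epoch_s * sum(high_bouts) / len(high_bouts) if high_bouts else 0.0
    return {
        "tp_active_to_inactive": transition_probability(active, True),
        "tp_inactive_to_active": transition_probability(active, False),
        "n_high_fragments": len(high_bouts),
        "n_activity_fragments": len(bouts(active)),
        "mean_high_duration_s": mean_high,
    }

# Toy one-minute record of twelve 5-s epochs (illustrative values only).
example = fragmentation_indices([1, 1, 2, 3, 3, 1, 4, 4, 4, 2, 1, 1])
```

Dividing the mean duration in seconds by the epoch length gives the duration in 5-s epochs, for example 33.03 s / 5 ≈ 6.61 epochs for ENMO.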
Figure 4. Probability of transitioning from inactivity to activity according to five accelerometer
metrics.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index


Figure 5. Probability of transitioning from activity to inactivity according to five accelerometer
metrics.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index


Figure 6. Number of high activity fragments reported according to five accelerometer metrics.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index


Figure 7. Number of activity fragments according to five accelerometer metrics.
VM = vector magnitude; VA = vertical axis; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; AI = activity index
        Correlations for each of the fragmentation indices are displayed in Table 10. Correlations
for number of high activity fragments were all high (r≥0.80). For number of activity fragments,
correlations were lowest for VA vs ENMO (r=0.67, moderately high) and highest for MAD vs
AI (r=0.97, high). Correlations for mean duration of high activity fragments in number of 5-s
epochs ranged from moderate (r=0.53, ENMO vs VM) to high (r=0.99, MAD vs AI).


Table 10. Pearson’s correlations for fragmentation indices as reported by comparison of each of
the five metrics.
                   Transition          Transition          Number of High      Number of           Mean Duration of
                   Probability         Probability         Activity            Activity            High Activity
                   Physical Activity   Inactivity to       Fragments           Fragments           Fragments in
  Comparison       to Inactivity       Physical Activity   (Q3-4)              (Q2-4)              Seconds
  AI vs. VM        0.91                0.92                0.92                0.96                0.87
  MAD vs. ENMO     0.84                0.82                0.97                0.78                0.73
  VM vs. MAD       0.90                0.89                0.91                0.92                0.84
  ENMO vs. VM      0.78                0.73                0.89                0.69                0.53
  VA vs. ENMO      0.77                0.78                0.89                0.67                0.54
  AI vs. ENMO      0.83                0.84                0.96                0.73                0.71
  MAD vs. AI       0.98                0.96                0.99                0.97                0.99
  VA vs. AI        0.91                0.92                0.92                0.95                0.82
  VM vs. VA        0.92                0.92                0.95                0.96                0.88
  MAD vs. VA       0.90                0.89                0.92                0.92                0.82
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; VA = vertical axis
        Equivalence of the fragmentation indices between metrics is shown in
Tables 11-15. Only two comparisons (MAD vs AI, VM vs VA) were equivalent when comparing
transition probability of activity to inactivity. One comparison (VM vs VA) was equivalent when
comparing transition probability of inactivity to activity. For number of high activity fragments,
only two comparisons were equivalent (MAD vs ENMO, MAD vs AI). None of the comparisons
were equivalent for number of activity fragments. MAD vs AI and MAD vs VA were the only
equivalent comparisons for mean duration of high activity fragments in number of 5-s epochs.
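
The decision rule behind Tables 11-15 can be sketched as follows, assuming that the equivalence bound is 5% of the average of the five metric means in Table 9 and that two metrics are declared equivalent when the whole confidence interval of the bias lies inside the symmetric bound. The function names are hypothetical, and the confidence intervals are copied from Table 13 rather than recomputed from raw data.

```python
def equivalence_bound(metric_means, fraction=0.05):
    """Symmetric bound: a fraction of the average of the per-metric means."""
    return fraction * sum(metric_means) / len(metric_means)

def is_equivalent(ci_low, ci_high, bound):
    """True when the whole confidence interval lies inside (-bound, +bound)."""
    return -bound < ci_low and ci_high < bound

# Table 9, number of high activity fragments: VM, VA, MAD, ENMO, AI.
high_fragment_means = [62.30, 60.82, 51.36, 50.91, 52.75]
bound_high = equivalence_bound(high_fragment_means)  # close to the 2.781 in Table 13

# Selected confidence intervals from Table 13.
table_13_ci = {
    "MAD vs. ENMO": (-1.44, 0.53),
    "MAD vs. AI": (0.77, 2.00),
    "VM vs. VA": (-2.79, -0.17),
    "AI vs. ENMO": (0.75, 2.93),
}
results = {name: is_equivalent(lo, hi, bound_high) for name, (lo, hi) in table_13_ci.items()}

# Table 9 mean durations are in seconds; dividing by the 5-s epoch length
# before taking 5% gives a bound close to the 0.273 reported under Table 15.
duration_means_epochs = [s / 5 for s in [22.61, 23.08, 29.25, 33.03, 28.58]]
bound_duration = equivalence_bound(duration_means_epochs)
```

Under these assumptions the sketch reproduces the "Yes" and "No" entries for the four Table 13 rows shown. The second calculation suggests that the bias values and bound in Table 15 are expressed in 5-s epochs rather than seconds.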


Table 11. Equivalence of five metrics for determining transition probability of activity to
inactivity.
                            Bias                               Equivalence Test
    Comparison             Mean            SE      Confidence Interval          Equivalent
      AI vs. VM            -0.04         <0.01          -0.04, -0.03                 No
  MAD vs. ENMO             -0.01          0.01          -0.02, 0.003                 No
    VM vs. MAD             -0.05         <0.01          -0.06, -0.04                 No
   ENMO vs. VM             -0.06         <0.01          -0.07, -0.05                 No
   VA vs. ENMO              0.05         <0.01           0.04, 0.06                  No
    AI vs. ENMO             0.02         <0.01           0.006, 0.03                 No
     MAD vs. AI             0.01         <0.01           0.007, 0.01                Yes
      VA vs. AI            -0.03         <0.01          -0.04, -0.03                 No
     VM vs. VA             <0.01         <0.01          -0.01, 0.005                Yes
    MAD vs. VA              0.05         <0.01           0.04, 0.05                  No
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -0.0163, 0.0163 (5% of the
mean) to determine equivalence.


Table 12. Equivalence of five metrics for determining transition probability of inactivity to
activity.
                          Bias                                Equivalence Test
    Comparison           Mean            SE      Confidence Interval           Equivalent
     AI vs. VM           -0.02         <0.01         -0.03, -0.008                 No
 MAD vs. ENMO            -0.02         <0.01          -0.03, 0.002                 No
   VM vs. MAD            -0.03         <0.01          -0.04, -0.02                 No
  ENMO vs. VM            -0.02          0.01          -0.03, 0.002                 No
   VA vs. ENMO            0.06          0.01           0.04, 0.07                  No
   AI vs. ENMO            0.03         <0.01            0.01, 0.04                 No
    MAD vs. AI            0.02         <0.01           0.007, 0.02                 No
     VA vs. AI           -0.02         <0.01          -0.04, -0.02                 No
    VM vs. VA            <0.01         <0.01          -0.004, 0.02                Yes
    MAD vs. VA            0.04         <0.01            0.03, 0.05                 No
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -0.0176, 0.0176 (5% of the
mean) to determine equivalence.


Table 13. Equivalence of five metrics for determining number of high activity fragments.
                          Bias                                Equivalence Test
    Comparison           Mean            SE      Confidence Interval           Equivalent
     AI vs. VM            -9.55          1.0         -11.14, -7.94                No
  MAD vs. ENMO            -0.45         0.60           -1.44, 0.53                Yes
    VM vs. MAD           -10.93         1.04         -12.67, -9.19                No
   ENMO vs. VM           -11.39         1.15         -13.29, -9.48                No
   VA vs. ENMO             9.91         1.14          8.01, 11.81                 No
   AI vs. ENMO             1.84         0.65            0.75, 2.93                No
    MAD vs. AI             1.39         0.37            0.77, 2.00                Yes
     VA vs. AI            -8.07         0.97          -9.68, -6.45                No
     VM vs. VA            -1.48         0.79          -2.79, -0.17                No
    MAD vs. VA             9.45         0.99          7.81, 11.10                 No
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -2.781, 2.781 (5% of the mean)
to determine equivalence.


Table 14. Equivalence of five metrics for determining number of activity fragments.
                          Bias                                Equivalence Test
    Comparison           Mean            SE      Confidence Interval           Equivalent
     AI vs. VM            -3.81         0.62          -4.84, -2.77                No
  MAD vs. ENMO            -0.67         1.52           -3.20, 1.86                No
    VM vs. MAD            -6.80         0.90          -8.30, -5.30                No
   ENMO vs. VM            -7.47         1.83         -10.51, -4.42                No
   VA vs. ENMO             8.86         1.89          5.71, 12.01                 No
   AI vs. ENMO             3.66         1.66            0.89, 6.42                No
    MAD vs. AI             2.99         0.51            2.13, 3.84                No
     VA vs. AI            -5.20         0.73          -6.42, -3.99                No
     VM vs. VA             1.39         0.69            0.24, 2.56                No
    MAD vs. VA             8.19         0.91            6.68, 9.71                No
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation; ENMO =
Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -1.916, 1.916 (5% of the mean)
to determine equivalence.


Table 15. Equivalence of five metrics for determining mean duration of high activity fragments.
                          Bias                                Equivalence Test
    Comparison           Mean            SE      Confidence Interval           Equivalent
     AI vs. VM             1.19         0.69            0.79, 1.59                No
  MAD vs. ENMO             0.75         0.69           -0.39, 1.90                No
    VM vs. MAD             1.33         0.26            0.90, 1.76                No
   ENMO vs. VM             2.10         0.83            0.70, 3.46                No
   VA vs. ENMO            -1.99         0.84          -3.38, -0.69                No
   AI vs. ENMO            -0.89         0.70           -2.06, 0.28                No
    MAD vs. AI            -0.13         0.06          -0.24, -0.03                Yes
     VA vs. AI             1.10         0.27            0.65, 1.55                No
     VM vs. VA             0.09         0.12           -0.11, 0.30                No
    MAD vs. VA            -0.13         0.06          -0.23, -0.03                Yes
AI = activity index; VM = vector magnitude; MAD = mean amplitude deviation;
ENMO = Euclidean norm minus one; VA = vertical axis
Confidence intervals were compared to an equivalence bound of -0.273, 0.273 (5% of the mean)
to determine equivalence.


                                           DISCUSSION
        Accelerometers can capture high resolution information about the frequency, intensity,
duration, and timing of children’s physical activity. Disparities in how these data are summarized
(i.e., which metric is used) can impede harmonization across data sources or comparability of
results from different studies or surveillance systems. The present analysis provides preliminary
evidence regarding the convergent validity of five count- and acceleration-based metrics at both
the epoch- and participant-level for capturing both overall activity levels and temporal patterns.
While metrics were often strongly associated, they were not always statistically equivalent.
Continued use and development of various acceleration-based metrics is warranted to provide an
open-source and device-independent alternative to count-based metrics while maintaining
comparability to past and future research, as well as providing reliable estimates of time spent in
activity levels across studies.
        We found strong epoch-level (r=0.69-0.98) and overall (r=0.77-0.99) correlations
between metrics. While there is limited prior research on the comparability of the five
accelerometer metrics used in the present study, particularly in children or utilizing a 5-s epoch,
the findings of the present study are comparable to those of studies that used samples of adults
and longer epochs. Specifically, correlations in the present study were generally similar to or
stronger than those reported by Migueles et al. (2019), who compared both acceleration- and
count-based metrics at the right hip during waking hours in a sample of free-living young adults:
ENMO and MAD (r=0.74 vs 0.98 in the present study), VM and ENMO (r=0.48 vs 0.69), and
VM and MAD (r=0.81 vs 0.72). In contrast, a study of older adults by Karas et al. (2022) found
stronger correlations between activity counts and the acceleration-based metrics ENMO (r=0.87),
MAD (r=0.91), and AI (r=0.97) than the present study (r=0.69, 0.82, and 0.70, respectively).
While the current and past studies support positive associations between metrics across different
age groups, more research is needed to clarify the strength of this association.
         While all metrics were at least moderately associated, relationships were stronger
amongst acceleration-based metrics (all r=0.98) and between the two count-based metrics
(r=0.94), while weaker associations were found between acceleration-based and count-based
metrics (r=0.69-0.75). The limited prior research in this area precludes us from making final
conclusions about this finding, but this indicates that comparisons between data or research using
different types of metrics (acceleration- versus count-based) should be done more cautiously than
comparisons when using similar types of metrics. This finding also supports the increased use of
acceleration-based metrics in future studies; in addition to supporting comparability across
studies because they can be calculated using any device brand, our findings indicate that these
metrics capture similar trends in activity intensity even when the acceleration data are processed
differently.
         In addition to correlating the five metrics to each other, we compared time spent in four
quartiles of activity intensity. While we did not use existing cut-points due to lack of availability
for all five metrics in this population, these quartiles may serve as a proxy for sedentary, light,
moderate, and vigorous intensity. For example, the most used cut-point in this age group for
classifying MVPA is the Evenson cut-point of ≥1003 VA counts∙15 s⁻¹ (≥334 counts∙5 s⁻¹),
compared to the quartile 3 cut-off of 224 counts∙5 s⁻¹. When extrapolated to a 20-min recess
period, differences between metrics in time spent in each quartile were less than 5 minutes.
Whether these differences are too large in magnitude may depend on the research question, but a
review of physical activity interventions reported changes of 1.2 to 2 minutes of MVPA, on
average (Parrish et al., 2020). However, the equivalence tests revealed that metrics always or
almost always captured similar amounts of time spent in quartile 3 and quartile 4. Coupled with
the fact that all metrics were able to similarly rank and classify children as most or least active,
the ability of all five metrics to capture the highest intensity activities is promising because these
are the outcomes of interest for future intervention research.
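
The unit conversions used in this paragraph can be written out as a short worked example. It assumes that a count cut-point scales linearly with epoch length, which is the assumption implied by converting ≥1003 counts per 15 s to ≥334 counts per 5 s. The helper names are hypothetical.

```python
EPOCH_S = 5       # epoch length used in the present study, in seconds
RECESS_MIN = 20   # recess length, in minutes

def rescale_cut_point(counts, from_epoch_s, to_epoch_s):
    """Linearly rescale a count cut-point between epoch lengths.

    Assumes counts accumulate evenly across the epoch.
    """
    return counts * to_epoch_s / from_epoch_s

def epochs_to_minutes(n_epochs, epoch_s=EPOCH_S):
    """Convert a number of epochs to minutes."""
    return n_epochs * epoch_s / 60

evenson_5s = rescale_cut_point(1003, 15, 5)        # about 334 counts per 5 s
epochs_per_recess = RECESS_MIN * 60 // EPOCH_S     # 5-s epochs in a 20-min recess
five_min_in_epochs = 5 * 60 // EPOCH_S             # epochs corresponding to 5 minutes
```

On this scale, a between-metric difference of 5 minutes corresponds to 60 of the 240 epochs in a recess period, which gives a sense of how the differences compare with the 1.2 to 2 minute intervention effects cited above.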
        In addition to activity intensity, accelerometers are capable of capturing rich information
about temporal features of physical activity but whether different accelerometer metrics similarly
capture these patterns was previously unknown. The fragmentation indices used in the present
study have proven useful in samples of older adults but have been underutilized in younger
samples. While not directly comparable due to the focus on recess, the number of activity fragments
and high activity fragments in the current study were compared to prior work from Wanigatunga
et al. (2019), and transition probabilities in the current study were compared to prior work from
Schrack et al. (2019). These comparisons indicated that children's recess activity was more
fragmented than adults’ free-living PA, which is to be expected.
        Findings from the activity fragmentation metrics in the current study also corroborate
findings from Bailey et al. (1995), in that children exhibit highly transient activity, particularly at
higher intensities. In their study, Bailey et al. (1995) used direct observation and reported that the
majority of high intensity activities generally lasted no longer than 15 s. The current study found
that average high activity bout duration was also quite short, ranging from approximately 23 to
33 s across metrics (Table 9). This
supports the idea that accelerometer-measured activity fragmentation metrics are able to detect
highly fragmented bouts, similar to what has been seen in previous literature. This may be an
advantageous approach for further characterizing recess PA overall and by group (e.g., by sex)
with the overarching goal of informing activity-promoting interventions.


        All five accelerometer metrics captured similar trends in fragmentation indices, as
indicated by correlation coefficients from 0.53 to 0.99. The strongest associations were found for
number of high activity bouts (≥0.89), indicating this metric could be used comparably in future
studies employing different processing techniques. Conversely, mean duration of high activity
fragments resulted in the weakest correlations between metrics (≥0.53, lowest for ENMO vs
VM), indicating this index should be used and compared between studies more cautiously.
Despite overall associations, count-based metrics demonstrated slightly higher fragmentation
than acceleration-based metrics, and
many comparisons were not equivalent between metrics. Notably, because we did not have a
criterion measure, we cannot conclude whether any particular metric (or type of metric) most
accurately captured the temporal patterns of children’s behavior.
        Despite overall associations, many metrics were not statistically equivalent for the
fragmentation indices. However, future analysis is needed to better understand what equivalence
bounds would be relevant or meaningful in this sample and setting, as the equivalence bounds
used in the present study (5% of the mean) may have been too stringent. For example, bias for
mean duration of high activity bouts ranged from 0.1 to 2.1-s, which may be an acceptable level
of difference for this outcome. Those interested in applying different equivalence bounds can do
so by comparing the equivalence bound of interest to the confidence intervals reported in Tables
11-15. For instance, the confidence interval for the comparison of mean duration of high activity
bouts determined using AI and VM was 0.79-1.59, which would be equivalent if using an
equivalence bound of ±5.0-s. Associating the fragmentation indices with outcomes like weight
status or cardiometabolic health would elucidate clinically relevant equivalence bounds to be
used in future research.
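
For readers who wish to explore alternative equivalence bounds, the following is a minimal sketch that works directly from the reported confidence intervals. It computes, for each comparison, the smallest symmetric bound that would contain the interval, and then lists the comparisons that would be judged equivalent under a bound chosen by the reader. The intervals are copied from Table 15 and are in the units of that table; the function names are hypothetical.

```python
def minimum_bound(ci_low, ci_high):
    """Smallest symmetric bound b such that [ci_low, ci_high] fits inside [-b, +b]."""
    return max(abs(ci_low), abs(ci_high))

# Confidence intervals from Table 15 (mean duration of high activity fragments).
table_15_ci = {
    "AI vs. VM": (0.79, 1.59),
    "MAD vs. ENMO": (-0.39, 1.90),
    "VM vs. MAD": (0.90, 1.76),
    "ENMO vs. VM": (0.70, 3.46),
    "VA vs. ENMO": (-3.38, -0.69),
    "AI vs. ENMO": (-2.06, 0.28),
    "MAD vs. AI": (-0.24, -0.03),
    "VA vs. AI": (0.65, 1.55),
    "VM vs. VA": (-0.11, 0.30),
    "MAD vs. VA": (-0.23, -0.03),
}

required = {name: minimum_bound(lo, hi) for name, (lo, hi) in table_15_ci.items()}

def equivalent_under(bound):
    """Names of comparisons whose whole interval lies strictly inside the bound."""
    return sorted(name for name, needed in required.items() if needed < bound)

with_reported_bound = equivalent_under(0.273)  # the bound reported under Table 15
with_wider_bound = equivalent_under(2.0)       # an arbitrary wider bound, for illustration
```

With the reported bound only MAD vs. AI and MAD vs. VA qualify, in line with Table 15. The wider bound of 2.0 is used purely for illustration and is not a recommended value.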


Strengths & Limitations
        The use of an unstructured, free play environment is a strength of the current study, as
this non-laboratory setting offers greater external validity. While all five accelerometer metrics
have been previously validated and compared to criterion measures (e.g., heart rate, energy
expenditure), the current study is the first of its kind to compare metrics to each other in this
population and setting. However, the current study is not without limitations. First, only hip-
worn devices were employed in the present study. Accelerations produced from the hip, and
therefore the results of the current study, may not be generalizable to wrist-worn devices in
children. While the hip is the most common wear location (Migueles et al., 2017), wrist-worn
devices are becoming increasingly popular to improve wear compliance. Second, only one recess
period was selected per child for analyses. Each recess period was 20 minutes long and occurred
during the warmer spring months in Michigan. These are factors that contribute to ideal PA
conditions and may have influenced the pattern and intensity of PA. However, the primary
purpose of the study was not to describe typical recess PA, but simply to compare PA and
temporal patterns recorded by the five accelerometer metrics. Lastly, the use of recess periods
alone is only a snapshot of the entire PA profile and is not generalizable to 24-hour PA patterns.
Future Directions
        The present study provides preliminary support for the comparability of five
accelerometer metrics for capturing activity intensity and temporal characteristics. However,
there may be other metrics, like Monitor-Independent Movement Summary (MIMS) units, which
should be included in future analyses. Further, fragmentation indices which capture other
temporal characteristics that are more relevant to children could be created, such as median bout
duration as reported by Bailey et al. (1995). Finally, further work is needed to determine
meaningful equivalence bounds for these temporal metrics.


                                           REFERENCES
Aadland, E., Andersen, L. B., Anderssen, S. A., Resaland, G. K., & Kvalheim, O. M. (2020).
        Accelerometer epoch setting is decisive for associations between physical activity and
        metabolic health in children. Journal of sports sciences, 38(3), 256–263.
        https://doi.org/10.1080/02640414.2019.1693320
Aguilar-Farías, N., Brown, W. J., & Peeters, G. M. (2014). ActiGraph GT3X+ cut points for
        identifying sedentary behaviour in older adults in free-living environments. Journal of
        science and medicine in sport, 17(3), 293–299.
        https://doi.org/10.1016/j.jsams.2013.07.002
Aittasalo, M., Vähä-Ypyä, H., Vasankari, T., Husu, P., Jussila, A.-M., & Sievänen, H. (2015).
        Mean amplitude deviation calculated from raw acceleration data: a novel method for
        classifying the intensity of adolescents’ physical activity irrespective of accelerometer
        brand. BMC Sports Science, Medicine and Rehabilitation, 7(1).
        https://doi.org/10.1186/s13102-015-0010-0
Albaum, E., Quinn, E., Sedaghatkish, S., Singh, P., Watkins, A., Musselman, K., & Williams, J.
        (2019). Accuracy of the Actigraph wGT3x-BT for step counting during inpatient spinal
        cord rehabilitation. Spinal Cord, 57(7), 571–578.
        https://doi.org/10.1038/s41393-019-0254-8
Ainsworth, B. E., Haskell, W. L., Herrmann, S. D., Meckes, N., Bassett, D. R., Jr, Tudor-Locke,
        C., Greer, J. L., Vezina, J., Whitt-Glover, M. C., & Leon, A. S. (2011). 2011
        Compendium of Physical Activities: a second update of codes and MET values. Medicine
        and science in sports and exercise, 43(8), 1575–1581.
        https://doi.org/10.1249/MSS.0b013e31821ece12
Arundell, L., Hinkley, T., Veitch, J., & Salmon, J. (2015). Contribution of the After-School
        Period to Children's Daily Participation in Physical Activity and Sedentary
        Behaviours. PloS one, 10(10), e0140132. https://doi.org/10.1371/journal.pone.0140132
Bai, J., Di, C., Xiao, L., Evenson, K. R., LaCroix, A. Z., Crainiceanu, C. M., & Buchner, D. M.
        (2016). An activity index for raw accelerometry data and its comparison with other
        activity metrics. PLoS ONE, 11(8). https://doi.org/10.1371/journal.pone.0160644
Bailey, R. C., Olson, J., Pepper, S. L., Porszasz, J., Barstow, T. J., & Cooper, D. M. (1995). The
        level and tempo of children's physical activities: an observational study. Medicine and
        science in sports and exercise, 27(7), 1033–1041.
        http://dx.doi.org/10.1249/00005768-199507000-00012
        Retrieved from https://escholarship.org/uc/item/03n3p8bz
Beighle, A., Morgan, C. F., le Masurier, G., & Pangrazi, R. P. (2006). Children’s Physical
        Activity During Recess and Outside of School. In Journal of School Health (Vol. 76,
        Issue 10). American School Health Association.
Berman, N., Bailey, R., Barstow, T. J., & Cooper, D. M. (1998). Spectral and bout detection
       analysis of physical activity patterns in healthy, prepubertal boys and girls. American
       journal of human biology : the official journal of the Human Biology Council, 10(3),
       289–297. https://doi.org/10.1002/(SICI)1520-6300(1998)10:3<289::AID-
       AJHB4>3.0.CO;2-E
Biddle, S. J. H., Gorely, T., Pearson, N., & Bull, F. C. (2011). An assessment of self-reported
       physical activity instruments in young people for population surveillance: Project
       ALPHA. International Journal of Behavioral Nutrition and Physical Activity, 8.
       https://doi.org/10.1186/1479-5868-8-1
Bjorklund, D. F., & Brown, R. D. (1998). Physical play and cognitive development: integrating
       activity, cognition, and education. Child development, 69(3), 604–606.
Blaes, A., Ridgers, N. D., Aucouturier, J., Van Praagh, E., Berthoin, S., & Baquet, G. (2013).
       Effects of a playground marking intervention on school recess physical activity in French
       children. Preventive medicine, 57(5), 580–584.
       https://doi.org/10.1016/j.ypmed.2013.07.019
Bornstein, D. B., Beets, M. W., Byun, W., Welk, G., Bottai, M., Dowda, M., & Pate, R. (2011).
       Equating accelerometer estimates of moderate-to-vigorous physical activity: In search of
       the Rosetta Stone. Journal of Science and Medicine in Sport, 14(5), 404–410.
       https://doi.org/10.1016/j.jsams.2011.03.013
Brown, W. H., Pfeiffer, K. A., McIver, K. L., Dowda, M., Addy, C. L., & Pate, R. R. (2009).
       Social and environmental factors associated with preschoolers' nonsedentary physical
       activity. Child development, 80(1), 45–58. https://doi.org/10.1111/j.1467-
       8624.2008.01245.x
Buchowski, M. S., Acra, S., Majchrzak, K. M., Sun, M., & Chen, K. Y. (2004). Patterns of
       physical activity in free-living adults in the Southern United States. European journal of
       clinical nutrition, 58(5), 828–837. https://doi.org/10.1038/sj.ejcn.1601928
Butte, N. F., Ekelund, U., & Westerterp, K. R. (2012). Assessing physical activity using
       wearable monitors: Measures of physical activity. Medicine and Science in Sports and
       Exercise, 44(SUPPL. 1). https://doi.org/10.1249/MSS.0b013e3182399c0e
Button, B. L. G., Clark, A. F., Martin, G., Graat, M., & Gilliland, J. A. (2020). Measuring
       temporal differences in rural Canadian children’s moderate-to-vigorous physical activity.
       International Journal of Environmental Research and Public Health, 17(23), 1–14.
       https://doi.org/10.3390/ijerph17238734
Carlson, J. A., Engelberg, J. K., Cain, K. L., Conway, T. L., Mignano, A. M., Bonilla, E. A.,
       Geremia, C., & Sallis, J. F. (2015). Implementing classroom physical activity breaks:
       Associations with student physical activity and classroom behavior. Preventive
       medicine, 81, 67–72. https://doi.org/10.1016/j.ypmed.2015.08.006
Chan, Y. H. (2003). Biostatistics 104: correlational analysis. Singapore Med J, 44(12), 614-619.
Chastin, S. F., Ferriolli, E., Stephens, N. A., Fearon, K. C., & Greig, C. (2012). Relationship
       between sedentary behaviour, physical activity, muscle quality and body composition in
       healthy older adults. Age and ageing, 41(1), 111–114.
       https://doi.org/10.1093/ageing/afr075
Chen, K. Y., & Bassett, D. R., Jr (2005). The technology of accelerometry-based activity
       monitors: current and future. Medicine and science in sports and exercise, 37(11 Suppl),
       S490–S500. https://doi.org/10.1249/01.mss.0000185571.49104.82
Clevenger, K. A., Grady, S. C., Erickson, K., & Pfeiffer, K. A. (2020). Use of a spatiotemporal
       approach for understanding preschoolers’ playground activity. Spatial and Spatio-
       Temporal Epidemiology, 35. https://doi.org/10.1016/j.sste.2020.100376
Clevenger, K. A., McKee, K. L., & Pfeiffer, K. A. (2021). Classroom Location, Activity Type,
       and Physical Activity During Preschool Children’s Indoor Free-Play. Early Childhood
       Education Journal. https://doi.org/10.1007/s10643-021-01164-7
Clevenger, K. A., Moore, R. W., Suton, D., Montoye, A. H. K., Trost, S. G., & Pfeiffer, K. A.
       (2018). Accelerometer responsiveness to change between structured and unstructured
       physical activity in children and adolescents. Measurement in Physical Education and
       Exercise Science, 22(3), 224–230. https://doi.org/10.1080/1091367X.2017.1419956
Clevenger, K. A., Pfeiffer, K. A., & Montoye, A. H. K. (2020a). Cross-generational comparability of hip-
       and wrist-worn ActiGraph GT3X+, wGT3X-BT, and GT9X accelerometers during free-
       living in adults. Journal of Sports Sciences, 38(24), 2794–2802.
       https://doi.org/10.1080/02640414.2020.1801320
Clevenger, K. A., Pfeiffer, K. A., & Montoye, A. H. K. (2020b). Cross-Generational
       Comparability of Raw and Count-Based Metrics from ActiGraph GT9X and wGT3X-BT
       Accelerometers during Free-Living in Youth. Measurement in Physical Education and
       Exercise Science, 24(3), 194–204. https://doi.org/10.1080/1091367X.2020.1773827
Cohen, J. (1968). Weighted kappa: nominal scale agreement with provision for scaled
       disagreement or partial credit. Psychological bulletin, 70(4), 213–220.
       https://doi.org/10.1037/h0026256
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ:
       Lawrence Erlbaum Associates, Publishers.
Corder, K., van Sluijs, E. M., Wright, A., Whincup, P., Wareham, N. J., & Ekelund, U. (2009). Is
       it possible to assess free-living physical activity and energy expenditure in young people
       by self-report? The American journal of clinical nutrition, 89(3), 862–870.
       https://doi.org/10.3945/ajcn.2008.26739
Crotti, M., Foweather, L., Rudd, J. R., Hurter, L., Schwarz, S., & Boddy, L. M. (2020).
        Development of raw acceleration cut-points for wrist and hip accelerometers to assess
        sedentary behaviour and physical activity in 5–7-year-old children. Journal of Sports
        Sciences, 38(9), 1036–1045. https://doi.org/10.1080/02640414.2020.1740469
de Baere, S., Lefevre, J., de Martelaer, K., Philippaerts, R., & Seghers, J. (2015). Temporal
        patterns of physical activity and sedentary behavior in 10-14 year-old children on
        weekdays. BMC Public Health, 15(1). https://doi.org/10.1186/s12889-015-2093-7
Del Pozo Cruz, B., & Del Pozo-Cruz, J. (2021). Associations between activity fragmentation and
        subjective memory complaints in middle-aged and older adults. Experimental
        gerontology, 148, 111288. https://doi.org/10.1016/j.exger.2021.111288
Dixon, P. M., Saint-Maurice, P. F., Kim, Y., Hibbing, P., Bai, Y., & Welk, G. J. (2018). A
        Primer on the Use of Equivalence Testing for Evaluating Measurement
        Agreement. Medicine and science in sports and exercise, 50(4), 837–845.
        https://doi.org/10.1249/MSS.0000000000001481
Doherty, A., Jackson, D., Hammerla, N., Plötz, T., Olivier, P., Granat, M. H., White, T., van
        Hees, V. T., Trenell, M. I., Owen, C. G., Preece, S. J., Gillions, R., Sheard, S., Peakman,
        T., Brage, S., & Wareham, N. J. (2017). Large Scale Population Assessment of Physical
        Activity Using Wrist Worn Accelerometers: The UK Biobank Study. PloS one, 12(2),
        e0169649. https://doi.org/10.1371/journal.pone.0169649
Duncan, M. J., Wilson, S., Tallis, J., & Eyre, E. (2016). Validation of the Phillips et al.
        GENEActiv accelerometer wrist cut-points in children aged 5–8 years old. European
        Journal of Pediatrics, 175(12), 2019–2021. https://doi.org/10.1007/s00431-016-2795-6
Evenson, K. R., Catellier, D. J., Gill, K., Ondrak, K. S., & McMurray, R. G. (2008). Calibration
        of two objective measures of physical activity for children. Journal of Sports Sciences,
        26(14), 1557–1565. https://doi.org/10.1080/02640410802334196
Fairclough, S. J., Noonan, R., Rowlands, A. V., Van Hees, V., Knowles, Z., & Boddy, L. M.
        (2016). Wear Compliance and Activity in Children Wearing Wrist- and Hip-Mounted
        Accelerometers. Medicine and science in sports and exercise, 48(2), 245–253.
        https://doi.org/10.1249/MSS.0000000000000771
Farrahi, V., Niemelä, M., Kangas, M., Korpelainen, R., & Jämsä, T. (2019). Calibration and
        validation of accelerometer-based activity monitors: A systematic review of machine-
        learning approaches. In Gait and Posture (Vol. 68, pp. 285–299). Elsevier B.V.
        https://doi.org/10.1016/j.gaitpost.2018.12.003
Freedson, P., Bowles, H. R., Troiano, R., & Haskell, W. (2012). Assessment of physical activity
        using wearable monitors: Recommendations for monitor calibration and use in the field.
       Medicine and Science in Sports and Exercise, 44(SUPPL. 1).
       https://doi.org/10.1249/MSS.0b013e3182399b7e
Freedson, P. S., Melanson, E., & Sirard, J. (1998). Calibration of the Computer Science and
       Applications, Inc. accelerometer. Medicine and science in sports and exercise, 30(5),
       777–781. https://doi.org/10.1097/00005768-199805000-00021
Freedson, P., Pober, D., & Janz, K. F. (2005). Calibration of accelerometer output for children.
       Medicine and Science in Sports and Exercise, 37(11 SUPPL.).
       https://doi.org/10.1249/01.mss.0000185658.28284.ba
Gamer, M., Lemon, J., & Singh, I. (2010). irr: Various Coefficients of Interrater Reliability
       and Agreement.
Gardner, F. (2000). Methodological issues in the direct observation of parent-child interaction: do
       observational findings reflect the natural behavior of participants? Clinical child and
       family psychology review, 3(3), 185–198. https://doi.org/10.1023/a:1009503409699
Ginsburg, K. R., American Academy of Pediatrics Committee on Communications, & American
       Academy of Pediatrics Committee on Psychosocial Aspects of Child and Family Health
       (2007). The importance of play in promoting healthy child development and maintaining
       strong parent-child bonds. Pediatrics, 119(1), 182–191.
       https://doi.org/10.1542/peds.2006-2697
Goran, M. I., Gower, B. A., Nagy, T. R., & Johnson, R. K. (1998). Developmental changes in
       energy expenditure and physical activity in children: evidence for a decline in physical
       activity in girls before puberty. Pediatrics, 101(5), 887–891.
       https://doi.org/10.1542/peds.101.5.887
Groffik, D., Fromel, K., & Badura, P. (2020). Composition of weekly physical activity in
       adolescents by level of physical activity. BMC Public Health, 20(1).
       https://doi.org/10.1186/s12889-020-08711-8
Grydeland, M., Hansen, B. H., Ried-Larsen, M., Kolle, E., & Anderssen, S. A. (2014).
       Comparison of three generations of ActiGraph activity monitors under free-living
       conditions: Do they provide comparable assessments of overall physical activity in 9-year
       old children? BMC Sports Science, Medicine and Rehabilitation, 6(1).
       https://doi.org/10.1186/2052-1847-6-26
Gubbels, J. S., Kremers, S. P., van Kann, D. H., Stafleu, A., Candel, M. J., Dagnelie, P. C., Thijs,
       C., & de Vries, N. K. (2011). Interaction between physical environment, social
       environment, and child characteristics in determining physical activity at child
       care. Health psychology : official journal of the Division of Health Psychology, American
       Psychological Association, 30(1), 84–90. https://doi.org/10.1037/a0021586
Hänggi, J. M., Phillips, L. R. S., & Rowlands, A. V. (2013). Validation of the GT3X ActiGraph
        in children and comparison with the GT1M ActiGraph. Journal of Science and Medicine
        in Sport, 16(1), 40–44. https://doi.org/10.1016/j.jsams.2012.05.012
HHS. (2018). Physical Activity Guidelines for Americans 2nd edition.
Hildebrand, M., Hansen, B. H., van Hees, V. T., & Ekelund, U. (2017). Evaluation of raw
        acceleration sedentary thresholds in children and adults. Scandinavian Journal of
        Medicine and Science in Sports, 27(12), 1814–1823. https://doi.org/10.1111/sms.12795
Hildebrand, M., van Hees, V. T., Hansen, B. H., & Ekelund, U. (2014). Age group comparability
        of raw accelerometer output from wrist-and hip-worn monitors. Medicine and Science in
        Sports and Exercise, 46(9), 1816–1824. https://doi.org/10.1249/MSS.0000000000000289
Hills, A. P., King, N. A., & Armstrong, T. P. (2007). The contribution of physical activity and
        sedentary behaviours to the growth and development of children and adolescents:
        implications for overweight and obesity. Sports medicine (Auckland, N.Z.), 37(6), 533–
        545. https://doi.org/10.2165/00007256-200737060-00006
Hislop, J. F., Bulley, C., Mercer, T. H., & Reilly, J. J. (2012). Comparison of epoch and uniaxial
        versus triaxial accelerometers in the measurement of physical activity in preschool
        children: a validation study. Pediatric exercise science, 24(3), 450–460.
        https://doi.org/10.1123/pes.24.3.450
Holmes, R. (2012). The Outdoor Recess Activities of Children at an Urban School: Longitudinal
        and Intraperiod Patterns. American Journal of Play, 4, 327-351.
Howe, C. A., Staudenmayer, J. W., & Freedson, P. S. (2009). Accelerometer prediction of
        energy expenditure: vector magnitude versus vertical axis. Medicine and science in sports
        and exercise, 41(12), 2199–2206. https://doi.org/10.1249/MSS.0b013e3181aa3a0e
Husu, P., Suni, J., Vähä-Ypyä, H., Sievänen, H., Tokola, K., Valkeinen, H., Mäki-Opas, T., &
        Vasankari, T. (2016). Objectively measured sedentary behavior and physical activity in a
        sample of Finnish adults: a cross-sectional study. BMC public health, 16(1), 920.
        https://doi.org/10.1186/s12889-016-3591-y
Johansson, E., Ekelund, U., Nero, H., Marcus, C., & Hagströmer, M. (2015). Calibration and
        cross-validation of a wrist-worn Actigraph in young preschoolers. Pediatric
        obesity, 10(1), 1–6. https://doi.org/10.1111/j.2047-6310.2013.00213.x
Johansson, E., Larisch, L. M., Marcus, C., & Hagströmer, M. (2016). Calibration and Validation
        of a Wrist- and Hip-Worn Actigraph Accelerometer in 4-Year-Old Children. PloS
        one, 11(9), e0162436. https://doi.org/10.1371/journal.pone.0162436
John, D., & Freedson, P. (2012). ActiGraph and Actical physical activity monitors: a peek under
        the hood. Medicine and science in sports and exercise, 44(1 Suppl 1), S86–S89.
        https://doi.org/10.1249/MSS.0b013e3182399f5e
John, D., Tang, Q., Albinali, F., & Intille, S. (2019). An Open-Source Monitor-Independent
        Movement Summary for Accelerometer Data Processing. Journal for the measurement of
        physical behaviour, 2(4), 268–281. https://doi.org/10.1123/jmpb.2018-0068
Karas, M., Muschelli, J., Leroux, A., Urbanek, J. K., Wanigatunga, A. A., Bai, J., Crainiceanu,
        C. M., & Schrack, J. A. (2022). Comparison of accelerometry-based measures of physical
        activity. MedRxiv, 2022.03.16.22272518. https://doi.org/10.1101/2022.03.16.22272518
Karaca, A., Demirci, N., Yılmaz, V., Hazır Aytar, S., Can, S., & Ünver, E. (2021). Validation of
        the ActiGraph wGT3X-BT Accelerometer for Step Counts at Five Different Body
        Locations in Laboratory Settings. Measurement in Physical Education and Exercise
        Science. https://doi.org/10.1080/1091367X.2021.1948414
Katzmarzyk, P. T., Denstel, K. D., Beals, K., Bolling, C., Wright, C., Crouter, S. E., McKenzie,
        T. L., Pate, R. R., Saelens, B. E., Staiano, A. E., Stanish, H. I., & Sisson, S. B. (2016).
        Results from the United States of America’s 2016 report card on physical activity for
        children and youth. Journal of Physical Activity and Health, 13(11), S307–S313.
        https://doi.org/10.1123/jpah.2016-0321
Kim, Y., Beets, M. W., & Welk, G. J. (2012). Everything you wanted to know about selecting
        the “right” Actigraph accelerometer cut-points for youth, but...: A systematic review. In
        Journal of Science and Medicine in Sport (Vol. 15, Issue 4, pp. 311–321).
        https://doi.org/10.1016/j.jsams.2011.12.001
Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical
        Software, 28(5), 1–26. https://doi.org/10.18637/jss.v028.i05
Lakens, D. (2017). Equivalence tests: A practical primer for t-tests, correlations, and
        meta-analyses. Social Psychological and Personality Science, 1, 1–8.
        https://doi.org/10.1177/1948550617697177
Larson, T. A., Normand, M. P., Morley, A. J., & Hustyi, K. M. (2014). The Role of the Physical
        Environment in Promoting Physical Activity in Children Across Different Group
        Compositions. Behavior Modification, 38(6), 837–851.
        https://doi.org/10.1177/0145445514543466
Latorre-Román, P. A., Martínez-Redondo, M., Salas-Sánchez, J., García-Pinillos, F., & Pérez-
        Jiménez, I. (2017). Suid-Afrikaanse Tydskrif vir Navorsing in Sport. South African
        Journal for Research in Sport, Physical Education and Recreation, 39(3), 57–66.
Lee, I. M., Shiroma, E. J., Lobelo, F., Puska, P., Blair, S. N., Katzmarzyk, P. T., & Lancet
        Physical Activity Series Working Group (2012). Effect of physical inactivity on major
        non-communicable diseases worldwide: an analysis of burden of disease and life
       expectancy. Lancet (London, England), 380(9838), 219–229.
       https://doi.org/10.1016/S0140-6736(12)61031-9
Leitzmann, M. F., Park, Y., Blair, A., Ballard-Barbash, R., Mouw, T., Hollenbeck, A. R., &
       Schatzkin, A. (2007). Physical Activity Recommendations and Decreased Risk of
       Mortality. https://jamanetwork.com/
Loprinzi, P. D., Cardinal, B. J., Loprinzi, K. L., & Lee, H. (2012). Benefits and environmental
       determinants of physical activity in children and adolescents. In Obesity Facts (Vol. 5,
       Issue 4, pp. 597–610). https://doi.org/10.1159/000342684
Luke, A., Dugas, L. R., Durazo-Arvizu, R. A., Cao, G., & Cooper, R. S. (2011). Assessing
       physical activity and its relationship to cardiovascular risk factors: NHANES 2003-
       2006. BMC public health, 11, 387. https://doi.org/10.1186/1471-2458-11-387
McKenzie, T. L., Sallis, J. F., Elder, J. P., Berry, C. C., Hoy, P. L., Nader, P. R., Zive, M. M., &
       Broyles, S. L. (1997). Physical activity levels and prompts in young children at recess: a
       two-year study of a bi-ethnic sample. Research quarterly for exercise and sport, 68(3),
       195–202. https://doi.org/10.1080/02701367.1997.10607998
McGarty, A. M., Penpraze, V., & Melville, C. A. (2016). Calibration and Cross-Validation of the
       ActiGraph wGT3X+ Accelerometer for the Estimation of Physical Activity Intensity in
       Children with Intellectual Disabilities. PloS one, 11(10), e0164928.
       https://doi.org/10.1371/journal.pone.0164928
McHugh, M. L. (2012). Interrater reliability: the kappa statistic. Biochemia medica, 22(3), 276–
       282.
Meyer, U., Roth, R., Zahner, L., Gerber, M., Puder, J. J., Hebestreit, H., & Kriemler, S. (2013).
       Contribution of physical education to overall physical activity. Scandinavian journal of
       medicine & science in sports, 23(5), 600–606. https://doi.org/10.1111/j.1600-
       0838.2011.01425.x
Montoye, A. H. K., Clevenger, K. A., Pfeiffer, K. A., Nelson, M. B., Bock, J. M., Imboden, M.
       T., & Kaminsky, L. A. (2020). Development of cut-points for determining activity
       intensity from a wrist-worn ActiGraph accelerometer in free-living adults. Journal of
       Sports Sciences, 2569–2578. https://doi.org/10.1080/02640414.2020.1794244
Montoye, A. H. K., Nelson, M. B., Bock, J. M., Imboden, M. T., Kaminsky, L. A., MacKintosh,
       K. A., McNarry, M. A., & Pfeiffer, K. A. (2018). Raw and Count Data Comparability of
       Hip-Worn ActiGraph GT3X+ and Link Accelerometers. Medicine and Science in Sports
       and Exercise, 50(5), 1103–1112. https://doi.org/10.1249/MSS.0000000000001534
Mota, J., Silva, P., Santos, M. P., Ribeiro, J. C., Oliveira, J., & Duarte, J. A. (2005). Physical
       activity and school recess time: Differences between the sexes and the relationship
        between children’s playground physical activity and habitual physical activity. Journal of
        Sports Sciences, 23(3), 269–275. https://doi.org/10.1080/02640410410001730124
Neishabouri, A., Nguyen, J., Samuelsson, J., et al. (2022). Quantification of Acceleration as
        Activity Counts in ActiGraph Wearables. Research Square.
        https://doi.org/10.21203/rs.3.rs-1370418/v1
Nettlefold, L., McKay, H. A., Warburton, D. E. R., McGuire, K. A., Bredin, S. S. D., & Naylor,
        P. J. (2011). The challenge of low physical activity during the school day: At recess,
        lunch and in physical education. British Journal of Sports Medicine, 45(10), 813–819.
        https://doi.org/10.1136/bjsm.2009.068072
Nyström, C., Pomeroy, J., Henriksson, P., Forsum, E., Ortega, F. B., Maddison, R., Migueles, J.
        H., & Löf, M. (2017). Evaluation of the wrist-worn ActiGraph wGT3x-BT for estimating
        activity energy expenditure in preschool children. European journal of clinical
        nutrition, 71(10), 1212–1217. https://doi.org/10.1038/ejcn.2017.114
Osborn, W., Simm, P., Olds, T., Lycett, K., Mensah, F. K., Muller, J., Fraysse, F., Ismail, N.,
        Vlok, J., Burgner, D., Carlin, J. B., Edwards, B., Dwyer, T., Azzopardi, P., Ranganathan,
        S., & Wake, M. (2018). Bone health, activity and sedentariness at age 11-12 years: Cross-
        sectional Australian population-derived study. Bone, 112, 153–160.
        https://doi.org/10.1016/j.bone.2018.04.011
Palmberg, L., Rantalainen, T., Rantakokko, M., Karavirta, L., Siltanen, S., Skantz, H.,
        Saajanaho, M., Portegijs, E., & Rantanen, T. (2020). The Associations of Activity
        Fragmentation With Physical and Mental Fatigability Among Community-Dwelling 75-,
        80-, and 85-Year-Old People. The journals of gerontology. Series A, Biological sciences
        and medical sciences, 75(9), e103–e110. https://doi.org/10.1093/gerona/glaa166
Parrish, A. M., Chong, K. H., Moriarty, A. L., Batterham, M., & Ridgers, N. D. (2020).
        Interventions to Change School Recess Activity Levels in Children and Adolescents: A
        Systematic Review and Meta-Analysis. Sports medicine (Auckland, N.Z.), 50(12), 2145–
        2173. https://doi.org/10.1007/s40279-020-01347-z
Pate, R. R., Dowda, M., Brown, W. H., Mitchell, J., & Addy, C. (2013). Physical activity in
        preschool children with the transition to outdoors. Journal of physical activity &
        health, 10(2), 170–175. https://doi.org/10.1123/jpah.10.2.170
Pellegrini, A. D., & Smith, P. K. (1998). Physical activity play: the nature and function of a
        neglected aspect of playing. Child development, 69(3), 577–598.
Peterson, N. E., Sirard, J. R., Kulbok, P. A., DeBoer, M. D., & Erickson, J. M. (2015).
        Validation of Accelerometer Thresholds and Inclinometry for Measurement of Sedentary
        Behavior in Young Adult University Students. Research in nursing & health, 38(6), 492–
        499. https://doi.org/10.1002/nur.21694
Pfeiffer, K. A., McIver, K. L., Dowda, M., Almeida, M. J. C. A., & Pate, R. R. (2006).
        Validation and calibration of the actical accelerometer in preschool children. Medicine
        and Science in Sports and Exercise, 38(1), 152–157.
        https://doi.org/10.1249/01.mss.0000183219.44127.e7
Pope, Z. C., Huang, C., Stodden, D., McDonough, D. J., & Gao, Z. (2020). Effect of children’s
        weight status on physical activity and sedentary behavior during physical education,
        recess, and after school. Journal of Clinical Medicine, 9(8), 1–10.
        https://doi.org/10.3390/jcm9082651
Puyau, M. R., Adolph, A. L., Vohra, F. A., & Butte, N. F. (2002). Validation and calibration of
        physical activity monitors in children. Obesity research, 10(3), 150–157.
        https://doi.org/10.1038/oby.2002.24
Rachele, J. N., McPhail, S. M., Washington, T. L., & Cuddihy, T. F. (2012). Practical physical
        activity measurement in youth: a review of contemporary approaches. World journal of
        pediatrics : WJP, 8(3), 207–216. https://doi.org/10.1007/s12519-012-0359-z
Rastogi, T., Backes, A., Schmitz, S., et al. (2020). Advanced analytical methods to assess
        physical activity behaviour using accelerometer raw time series data: a protocol for a
        scoping review. Systematic Reviews, 9, 259. https://doi.org/10.1186/s13643-020-01515-2
Ridgers, N. D., Stratton, G., & Fairclough, S. J. (2005). Assessing physical activity during recess
        using accelerometry. Preventive Medicine, 41(1), 102–107.
        https://doi.org/10.1016/j.ypmed.2004.10.023
Ridgers, N. D., Stratton, G., Fairclough, S. J., & Twisk, J. W. (2007). Long-term effects of a
        playground markings and physical structures on children's recess physical activity
        levels. Preventive medicine, 44(5), 393–397. https://doi.org/10.1016/j.ypmed.2007.01.009
Ridgers, N. D., Fairclough, S. J., & Stratton, G. (2010). Variables associated with children's
        physical activity levels during recess: the A-CLASS project. The international journal of
        behavioral nutrition and physical activity, 7, 74. https://doi.org/10.1186/1479-5868-7-74
Ridgers, N. D., Timperio, A., Cerin, E., & Salmon, J. (2014). Compensation of physical activity
        and sedentary time in primary school children. Medicine and Science in Sports and
        Exercise, 46(8), 1564–1569. https://doi.org/10.1249/MSS.0000000000000275
Ridgers, N. D., Timperio, A., Crawford, D., & Salmon, J. (2012). Five-year changes in school
        recess and lunchtime and the contribution to children’s daily physical activity. British
        Journal of Sports Medicine, 46(10), 741–746. https://doi.org/10.1136/bjsm.2011.084921
Romanzini, M., Petroski, E. L., Ohara, D., Dourado, A. C., & Reichert, F. F. (2014). Calibration
        of ActiGraph GT3X, Actical and RT3 accelerometers in adolescents. European journal
        of sport science, 14(1), 91–99. https://doi.org/10.1080/17461391.2012.732614
Rooney, L. (2018). Contribution of Physical Education and Recess towards the overall Physical
         Activity of 8-11 year old children. Journal of Sport and Health Research, 10(2), 303-316.
         [8]. http://www.journalshr.com/index.php/issues/70-vol-10-n2-may-august-2018/307-
         rooney-l-mckee-d-2018-contribution-of-physical-education-and-recess-towards-the-
         overall-physical-activity-of-8-11-year-old-children-journal-of-sport-and-health-research-
         102303-316
Routen, A. C., Upton, D., Edwards, M. G., & Peters, D. M. (2012). Discrepancies in
         accelerometer-measured physical activity in children due to cut-point non-equivalence
         and placement site. Journal of sports sciences, 30(12), 1303–1310.
         https://doi.org/10.1080/02640414.2012.709266
Rowlands, A. V. (2018). Moving forward with accelerometer-assessed physical activity: Two
         strategies to ensure meaningful, interpretable, and comparable measures. In Pediatric
         Exercise Science (Vol. 30, Issue 4, pp. 450–456). Human Kinetics Publishers Inc.
         https://doi.org/10.1123/pes.2018-0201
Rowlands, A. V., Dawkins, N. P., Maylor, B., Edwardson, C. L., Fairclough, S. J., Davies, M. J.,
         Harrington, D. M., Khunti, K., & Yates, T. (2019). Enhancing the value of
         accelerometer-assessed physical activity: meaningful visual comparisons of data-driven
         translational accelerometer metrics. Sports Medicine - Open, 5(1).
         https://doi.org/10.1186/s40798-019-0225-9
Rowlands, A. V., Edwardson, C. L., Davies, M. J., Khunti, K., Harrington, D. M., & Yates, T.
         (2018). Beyond Cut Points: Accelerometer Metrics that Capture the Physical Activity
         Profile. Medicine and Science in Sports and Exercise, 50(6), 1323–1332.
         https://doi.org/10.1249/MSS.0000000000001561
Rowlands, A. V., & Eston, R. G. (2007). The Measurement and Interpretation of Children's
         Physical Activity. Journal of sports science & medicine, 6(3), 270–276.
Rowlands, A. V., Rennie, K., Kozarski, R., Stanley, R. M., Eston, R. G., Parfitt, G. C., & Olds, T.
         S. (2014). Children’s physical activity assessed with wrist- and hip-worn accelerometers.
         Medicine and Science in Sports and Exercise, 46(12), 2308–2316.
         https://doi.org/10.1249/MSS.0000000000000365
Rowland, T. W. (1998). The biological basis of physical activity. Medicine and science in sports
         and exercise, 30(3), 392–399. https://doi.org/10.1097/00005768-199803000-00009
Sallis, J. F., Prochaska, J. J., & Taylor, W. C. (2000). A review of correlates of physical activity
         of children and adolescents. Medicine and science in sports and exercise, 32(5), 963–975.
         https://doi.org/10.1097/00005768-200005000-00014
Schrack, J. A., Kuo, P. L., Wanigatunga, A. A., Di, J., Simonsick, E. M., Spira, A. P., Ferrucci,
         L., & Zipunnikov, V. (2019). Active-to-Sedentary Behavior Transitions, Fatigability, and
        Physical Functioning in Older Adults. Journals of Gerontology - Series A Biological
        Sciences and Medical Sciences, 74(4), 560–567. https://doi.org/10.1093/gerona/gly243
Stenholm, S., Pulakka, A., Leskinen, T., Pentti, J., Heinonen, O. J., Koster, A., & Vahtera, J.
        (2021). Daily Physical Activity Patterns and Their Association With Health-Related
        Physical Fitness Among Aging Workers—The Finnish Retirement and Aging Study. The
        Journals of Gerontology: Series A, 76(7), 1242–1250.
        https://doi.org/10.1093/gerona/glaa193
Strath, S. J., Pfeiffer, K. A., & Whitt-Glover, M. C. (2012). Accelerometer use with children,
        older adults, and adults with functional limitations. Medicine and Science in Sports and
        Exercise, 44(SUPPL. 1). https://doi.org/10.1249/MSS.0b013e3182399eb1
Stratton, G., & Mullan, E. (2005). The effect of multicolor playground markings on children's
        physical activity level during recess. Preventive medicine, 41(5-6), 828–833.
        https://doi.org/10.1016/j.ypmed.2005.07.009
Sugiyama, T., Leslie, E., Giles-Corti, B., & Owen, N. (2009). Physical activity for recreation or
        exercise on neighbourhood streets: associations with perceived environmental
        attributes. Health & place, 15(4), 1058–1063.
        https://doi.org/10.1016/j.healthplace.2009.05.001
Sylvia, L. G., Bernstein, E. E., Hubbard, J. L., Keating, L., & Anderson, E. J. (2014). Practical
        guide to measuring physical activity. Journal of the Academy of Nutrition and Dietetics,
        114(2), 199–208. https://doi.org/10.1016/j.jand.2013.09.018
Telama, R. (2009). Tracking of physical activity from childhood to adulthood: A review. In
        Obesity Facts (Vol. 2, Issue 3, pp. 187–195). https://doi.org/10.1159/000222244
Telford, R. M., Telford, R. D., Olive, L. S., Cochrane, T., & Davey, R. (2016). Why Are Girls
        Less Physically Active than Boys? Findings from the LOOK Longitudinal Study. PLoS
        ONE, 11(3), e0150041. https://doi.org/10.1371/journal.pone.0150041
Troiano, R. P. (2005). A timely meeting: Objective measurement of physical activity. Medicine
        and Science in Sports and Exercise, 37(11 SUPPL.).
        https://doi.org/10.1249/01.mss.0000185473.32846.c3
Troiano, R. P., Berrigan, D., Dodd, K. W., Mâsse, L. C., Tilert, T., & McDowell, M. (2008).
        Physical activity in the United States measured by accelerometer. Medicine and Science
        in Sports and Exercise, 40(1), 181–188. https://doi.org/10.1249/mss.0b013e31815a51b3
Trost, S. G., Fees, B. S., Haar, S. J., Murray, A. D., & Crowe, L. K. (2012). Identification and
        validity of accelerometer cut-points for toddlers. Obesity, 20(11), 2317–2319.
        https://doi.org/10.1038/oby.2011.364
Trost, S. G., Loprinzi, P. D., Moore, R., & Pfeiffer, K. A. (2011). Comparison of accelerometer
        cut points for predicting activity intensity in youth. Medicine and Science in Sports and
        Exercise, 43(7), 1360–1368. https://doi.org/10.1249/MSS.0b013e318206476e
Trost, S. G., & O’Neil, M. (2014). Clinical use of objective measures of physical activity.
        British Journal of Sports Medicine, 48(3), 178–181.
        https://doi.org/10.1136/bjsports-2013-093173
Trost, S. G., Pate, R. R., Sallis, J. F., Freedson, P. S., Taylor, W. C., Dowda, M., & Sirard, J.
        (2002). Age and gender differences in objectively measured physical activity in youth.
        Medicine and Science in Sports and Exercise, 34(2). http://www.acsm-msse.org
Vähä-Ypyä, H., Vasankari, T., Husu, P., Suni, J., & Sievänen, H. (2015). A universal, accurate
        intensity-based classification of different physical activities using raw data of
        accelerometer. Clinical Physiology and Functional Imaging, 35(1), 64–70.
        https://doi.org/10.1111/cpf.12127
Vähä-Ypyä, H., Vasankari, T., Husu, P., Mänttäri, A., Vuorimaa, T., Suni, J., & Sievänen, H.
        (2015). Validation of cut-points for evaluating the intensity of physical activity with
        accelerometry-based Mean Amplitude Deviation (MAD). PLoS ONE, 10(8).
        https://doi.org/10.1371/journal.pone.0134813
van Hees, V., Fang, Z., Zhao, J., Heywood, J., Mirkes, E., Sabia, S., & Migueles, J. (2022).
        GGIR: Raw Accelerometer Data Analysis (R package version 2.7-1).
        https://doi.org/10.5281/zenodo.1051064, https://CRAN.R-project.org/package=GGIR
van Hees, V. T., Fang, Z., Langford, J., Assah, F., Mohammad, A., da Silva, I. C. M., Trenell,
        M. I., White, T., Wareham, N. J., & Brage, S. (2014). Autocalibration of accelerometer
        data for free-living physical activity assessment using local gravity and temperature:
        An evaluation on four continents. Journal of Applied Physiology, 117, 738–744.
        https://doi.org/10.1152/japplphysiol.00421.2014
van Hees, V. T., Gorzelniak, L., Dean León, E. C., Eder, M., Pias, M., Taherian, S., Ekelund, U.,
        Renström, F., Franks, P. W., Horsch, A., & Brage, S. (2013). Separating Movement and
        Gravity Components in an Acceleration Signal and Implications for the Assessment of
        Human Daily Physical Activity. PLoS ONE, 8(4).
        https://doi.org/10.1371/journal.pone.0061691
Vanderloo, L. M., Di Cristofaro, N. A., Proudfoot, N. A., Tucker, P., & Timmons, B. W. (2016).
        Comparing the Actical and ActiGraph Approach to Measuring Young Children's Physical
        Activity Levels and Sedentary Time. Pediatric Exercise Science, 28(1), 133–142.
        https://doi.org/10.1123/pes.2014-0218
Vanderloo, L. M., Tucker, P., Johnson, A. M., & Holmes, J. D. (2013). Physical activity among
        preschoolers during indoor and outdoor childcare play periods. Applied Physiology,
        Nutrition, and Metabolism = Physiologie appliquée, nutrition et métabolisme, 38(11),
        1173–1175. https://doi.org/10.1139/apnm-2013-0137
Wanigatunga, A. A., Di, J., Zipunnikov, V., Urbanek, J. K., Kuo, P. L., Simonsick, E. M.,
       Ferrucci, L., & Schrack, J. A. (2019). Association of Total Daily Physical Activity and
       Fragmented Physical Activity with Mortality in Older Adults. JAMA Network Open,
       2(10). https://doi.org/10.1001/jamanetworkopen.2019.12352
Wanigatunga, A. A., Ferrucci, L., & Schrack, J. A. (2019). Physical activity fragmentation as a
       potential phenotype of accelerated aging. Oncotarget, 10(8).
       www.oncotarget.com
Wanigatunga, A. A., Simonsick, E. M., Zipunnikov, V., Spira, A. P., Studenski, S., Ferrucci, L.,
       & Schrack, J. A. (2018). Perceived Fatigability and Objective Physical Activity in Mid-
       to Late-Life. The Journals of Gerontology: Series A, Biological Sciences and Medical
       Sciences, 73(5), 630–635. https://doi.org/10.1093/gerona/glx181
Welk, G. J., Corbin, C. B., & Dale, D. (2000). Measurement issues in the assessment of physical
       activity in children. Research Quarterly for Exercise and Sport, 71(2 Suppl), S59–S73.
Welk, G. J., McClain, J., & Ainsworth, B. E. (2012). Protocols for evaluating equivalency of
       accelerometry-based activity monitors. Medicine and Science in Sports and Exercise,
       44(SUPPL. 1). https://doi.org/10.1249/MSS.0b013e3182399d8f
Yu, H., Kulinna, P. H., & Mulhearn, S. C. (2021). The Effectiveness of Equipment Provisions on
       Rural Middle School Students' Physical Activity During Lunch Recess. Journal of
       Physical Activity & Health, 18(3), 287–295. https://doi.org/10.1123/jpah.2019-0661