IDENTIFYING AND MEASURING COMMON ELEMENTS OF NATURALISTIC DEVELOPMENTAL BEHAVIORAL INTERVENTIONS

By

Kyle M. Frost

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Psychology Master of Arts

2018

ABSTRACT

IDENTIFYING AND MEASURING COMMON ELEMENTS OF NATURALISTIC DEVELOPMENTAL BEHAVIORAL INTERVENTIONS

By

Kyle M. Frost

Evidence-based interventions for young children with autism spectrum disorder (ASD) share theoretical origins in developmental and behavioral theories, and have been acknowledged to share key strategies (Schreibman et al., 2015). However, the extent to which these interventions share strategies has not been examined in research to date. In addition, there is no standardized measure for assessing intervention implementation that was developed for use across different interventions. This paper presents two studies, the first of which had the goal of developing a comprehensive taxonomy of strategies of caregiver-mediated NDBI and refining an observational rating scheme using quantitative feedback from experts. Twenty strategies comprised the comprehensive taxonomy, 11 of which were determined to be common elements using quantitative methods. From these items, an 8-item observational rating scheme, the NDBI-Fi, was developed. The goal of study two was to establish preliminary reliability and validity of the NDBI-Fi by rating caregiver-child interaction videos from various completed intervention trials. Results lend support to the utility of the NDBI-Fi as a measure of caregiver use of intervention strategies across NDBI models.

ACKNOWLEDGEMENTS

This project could not have been completed without support from numerous mentors, collaborators, family, and friends. Foremost thanks to my mentor, Brooke Ingersoll, for consistent enthusiasm, support, expertise, and encouragement.
Second, I would like to thank Kaylin Russell, who dedicated so much of her time to learning the NDBI-Fi. Third, I want to thank my family, Benji, Chelsea, and my lab mates, for being the best support system someone could ask for. Last, but certainly not least, this project was completed with the expert input of numerous experts in ASD intervention (in alphabetical order), as well as their trainees and colleagues: Jessica Brian, Susan Bryson, Geraldine Dawson, Helen Flanagan, Grace Gengoux, Ann Kaiser, Connie Kasari, So Hyun Kim, Rebecca Landa, Catherine Lord, Mendy Minjarez, Jennifer Nietfeld, Sarah Rieth, Sally Rogers, Stephanie Shire, Isabel Smith, and Aubyn Stahmer. I am so thankful for the expertise you all have lent to this endeavor.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    Treatment fidelity
    Caregiver-Mediated Intervention
    Evaluating Intervention Outcomes
    Measuring common elements to advance intervention research
STUDY 1
METHOD
    Development of an Intervention Taxonomy
    Qualitative Review
    Refinement of Observational Rating Scheme
    Measures
RESULTS AND DISCUSSION
    Clarity
    Observability
    Content Validity
    Development of NDBI-Fi Rating Scheme
STUDY 2
METHOD
    Participants
    Measures
        NDBI-Fi
        Caregiver-child Interaction (CCX)
        Established NDBI Fidelity
        Brief Observation of Social Communication Change (BOSCC)
        Mullen Scales of Early Learning (MSEL)
    Analysis Plan
        Reliability
        Validity
        Sensitivity
RESULTS AND DISCUSSION
    Reliability
    Validity
        Concurrent validity
        Convergent and discriminant validity
        Predictive validity
    Sensitivity
CONCLUSION
    Limitations and Future Directions
APPENDICES
    APPENDIX A: Taxonomy of NDBI Strategies
    APPENDIX B: NDBI-Fi Item Definitions and Rating Anchors
    APPENDIX C: Tables
    APPENDIX D: Figures
    APPENDIX E: Frequency Distributions of Individual NDBI-Fi Items
REFERENCES

LIST OF TABLES

Table 1. Characteristics of Established NDBI Fidelity Measures
Table 2. Content Validity Ratios for Intervention Taxonomy Items
Table 3. Participant Demographics
Table 4. Mean, standard deviation, and normality of NDBI-Fi items and Average Score
Table 5. Reliability of individual NDBI-Fi items and Average Rating

LIST OF FIGURES

Figure 1. Study 1 Method Flowchart
Figure 2. Frequency distribution of NDBI-Fi Average Score
Figure 3. Frequency distribution of NDBI-Fi average ratings by group assignment

INTRODUCTION

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that emerges in early childhood and is characterized by deficits in social communication and the presence of restricted and repetitive interests and/or behaviors (American Psychiatric Association, 2013).
According to the National Research Council (2001), intervention for ASD should be intensive, at least 25 hours per week, and begin immediately after the diagnosis is given. Current best practices for the treatment of young children with ASD include interventions that integrate developmental approaches, which focus broadly on child-centered activities and adult responsiveness, and behavioral approaches, which focus on teaching skills via contingencies (Zwaigenbaum et al., 2015). In (National Research Council, 2001; Zwaigenbaum et al., 2015). There is a growing evidence base for several such manualized interventions, broadly classified as Naturalistic Developmental Behavioral Interventions or NDBI (Schreibman et al., 2015). While individual NDBI were developed in different labs and emphasize different theoretical perspectives, they share several common elements, including child-led teaching episodes, environmental arrangement, natural reinforcement, use of prompting techniques, turn-taking, imitation, modeling, and caregiver involvement (Schreibman et al., 2015). Although experts agree that there are several shared strategies, this question has not been addressed quantitatively.

Despite a growing evidence base for the efficacy of NDBI, there are current limitations in our knowledge of treatment mechanisms and active ingredients in these interventions. NDBI are usually studied as comprehensive treatment packages, without dismantling of their component parts. Although numerous common elements exist among these manualized interventions, researchers do not articulate or measure these elements in the same way, and often define different components as fundamental to their interventions.

Treatment fidelity

Measuring treatment fidelity, or adherence to the intervention protocol, is essential for understanding the active ingredients of these treatments and interpreting the results of intervention trials (Wainer & Ingersoll, 2013).
However, when treatment adherence or fidelity is reported in publications, it tends to be a summary rating, such as overall percent adherence to the treatment protocol. In terms of evaluating treatment outcomes in RCTs, fidelity to specific intervention techniques is often not linked directly to intervention outcomes; therefore, it is unclear what specific active ingredients result in improvements in child social communication. Further, among NDBI, measures of treatment fidelity used for research are often unpublished; therefore, it is not known which strategies contribute to the overall rating. Finally, to our knowledge, NDBI intervention fidelity measures have not been examined psychometrically in a published study, and therefore it is not known whether they are valid, reliable across short time intervals, or sensitive to change. Without common terminology to describe intervention components and a common measurement tool for reporting fidelity, researchers cannot easily compare intervention ingredients across studies. This limits our ability to understand the active ingredients of NDBI, and to identify specific elements that lead to positive outcomes.

Caregiver-Mediated Intervention

These issues are compounded when the NDBI is delivered by a caregiver rather than trained therapists. Caregiver-mediated interventions, in which the caregiver is taught to child receives, and creates opportunities for children to learn in a variety of settings and activities (Bearss, Burrell, Stewart, & Scahill, 2015). In addition, collateral effects have been found for reduci and self-efficacy (Estes et al., 2014; Ingersoll, Wainer, Berger, Pickard, & Bonter, 2016; Tonge et al., 2006), although some studies have found no reduction in parent stress (Kasari, Gulsrud, Paparella, Hellemann, & Berry, 2015).
For these reasons, a number of NDBI, such as Project ImPACT (Ingersoll & Dvortcsak, 2009) and the Social ABCs (Brian, Smith, Zwaigenbaum, & Bryson, 2017; Brian, Smith, Zwaigenbaum, Roberts, & Bryson, 2016), have been designed specifically to be delivered by caregivers, and other NDBI have been tested for efficacy as caregiver-mediated interventions (Kaiser, Hancock, & Nietfeld, 2000; Kaiser & Roberts, 2013; Kasari et al., 2015; Kasari et al., 2014; Rogers et al., 2012).

Although relatively few randomized controlled trials (RCTs) of caregiver-mediated NDBI have been completed (McConachie & Diggle, 2007; Oono, Honey, & McConachie, 2013), studies show that caregivers who receive training in NDBI can successfully implement intervention techniques with their children, and that children demonstrate improvements in specific intervention targets such as language use and joint attention (Bradshaw, Koegel, & Koegel, 2017; Brian et al., 2017; Gulsrud, Hellemann, Shire, & Kasari, 2016; Ingersoll & Wainer, 2013; Ingersoll et al., 2016; Kasari et al., 2015; Patterson, Elder, Gulsrud, & Kasari, 2014). Evidence has also been found for improvement in the quality of caregiver-child interactions in the areas of shared attention and parental synchrony (Oono et al., 2013), both of which can be considered facets of caregiver responsiveness, which is a focus in NDBI and developmental interventions.

In addition, recent research has linked caregiver intervention fidelity to child outcomes. For example, a recent study found that children whose caregivers were trained in the Social ABCs NDBI showed increases in vocal responsiveness as well as vocal initiations, and that caregiver fidelity to the intervention protocol predicted child responsiveness over and above the effect of treatment group (Brian et al., 2017).
Specific components of caregiver fidelity have also been shown to predict child language gains (Ingersoll & Wainer, 2013) as well as joint engagement (Gulsrud et al., 2016). However, another study did not find relationships between improvement in parent fidelity and child improvement on standardized measures (Rogers et al., 2012).

In addition, not all studies have shown that caregiver-mediated NDBI improve child outcomes (Oosterling et al., 2010; Rogers et al., 2012). Oosterling et al. (2010) investigated a low-intensity caregiver-mediated NDBI and found no main effects of treatment; however, they noted that care as usual in the Netherlands is of high quality, so a low-dose caregiver-mediated intervention may not have provided significant improvement above and beyond the high-quality care children otherwise received. Another RCT of a brief, low-dose caregiver-mediated NDBI found no effect of group assignment on outcomes; however, the community intervention control group received significantly more intervention hours than those in the study treatment group, thus confounding the results (Rogers et al., 2012). Overall, limitations of these studies make it difficult to ascertain whether null effects were due to lack of efficacy, individual differences in treatment response, low fidelity of implementation, and/or good quality of community care received by the control group. Direct comparison of caregiver fidelity and child outcomes is important for understanding the efficacy and active ingredients of caregiver-mediated interventions.

Evaluating Intervention Outcomes

Studies of caregiver-mediated interventions for young children with ASD often measure caregiver and child outcomes based on behavioral coding of video-recorded caregiver-child interactions (CCXs). Observational methods are especially useful when nonverbal (Bakeman & Quera, 2011).
Observations that take place in the home provide a useful sample of behavior (Gardner, 2000). Best practice recommendations for the assessment of young children include a focus on direct observation as well as evaluation of the functioning of the child in natural contexts (Bagnato, 2005; Division for Early Childhood, 2014), making the CCX a useful and practical focus of intervention outcome. In addition, CCXs allow researchers to examine intervention response through multiple lenses, by looking at change in caregiver behavior (e.g. increased use of intervention strategies over time), change in child behavior (e.g. improvements in social communication skills over time), and how both play a role in shaping the interaction. This is evidenced by recent research spent in joint engagement (Kaale, Smith, Nordahl-Hansen, Fagerland, & Kasari, 2017). This study showed that maternal behaviors were predictive of time spent in a joint-engaged state, whereas child behaviors were not, although mothers' behaviors were often related to child behaviors in the same domain (i.e. maternal positive affect related to child positive affect). This study highlights the importance of considering both caregiver and child behavior concurrently when evaluating CCXs.

Despite a need for characterizing caregiver behaviors to evaluate research outcomes, there is a lack of consistency in how caregiver behaviors are measured across studies. For example, published intervention trials have evaluated caregiver behavior in various ways, including intervention fidelity (Casenhiser, Shanker, & Stieben, 2013; Gulsrud et al., 2016), ratings of caregiver responsiveness (Karaaslan & Mahoney, 2015; Mahoney & Solomon, 2016; Patterson et al., 2014; Shire, Gulsrud, & Kasari, 2016), as well as parental synchrony (Green et al., 2010; Hudry et al., 2013; Pickles et al., 2015) evaluating caregiver behavior across studies.
Further, many NDBI involve strategies that caregivers use instinctively to some degree. A recent pilot RCT showed that, at baseline, 20% of parents were considered to have met overall fidelity, before receiving training in an NDBI (Stahmer et al., 2017). Thus, caregivers enter research studies with different repertoires of behavior, and some caregivers who naturally perform these strategies may not substantially improve in fidelity with training.

Measuring common elements to advance intervention research

A standardized way to evaluate caregiver fidelity during CCXs would allow for cross-study evaluation (e.g. meta-analysis) of NDBI techniques and treatment fidelity, as well as consortium-style collaborative research studies with larger sample sizes and more power, which are essential for evaluating the active ingredients of interventions (Tate et al., 2016). In addition, a standardized measure could be applied to better characterize similarities among treatment groups (Godfrey, Chalder, Ridsdale, Seed, & Ogden, 2007), including active treatment and treatment-as-usual control groups. Significant overlap in strategies used in research and community settings, as well as the recognition that caregivers vary widely in their use of strategies before training, may clarify the small effect sizes and null results found in some RCTs of caregiver-mediated NDBI (Oosterling et al., 2010; Rogers et al., 2012).

However, despite the similarity of key behaviors taught to caregivers across NDBI, there is currently no standardized set of common intervention elements, nor a standardized measure for assessing intervention implementation by caregivers.
Development of an intervention taxonomy, or comprehensive set of intervention strategies, can support our understanding of evidence-based interventions by providing the field with standardized language, and a standardized way to describe and compare intervention ingredients across studies (Barth & Liggett-Creel, 2014; Chorpita, Daleiden, & Weisz, 2005; Lokker, McKibbon, Colquhoun, & Hempel, 2015; McHugh, Murray, & Barlow, 2009). Accordingly, common elements of evidence-based interventions have been examined in the context of many types of behavioral treatments, including those targeting disruptive behavior disorders (Garland, Hawley, Brookman-Frazee, & Hurlburt, 2008; Kaehler, Jacobs, & Jones, 2016), obesity (Tate et al., 2016), bipolar disorder (Miklowitz, Goodwin, Bauer, & Geddes, 2008), trauma (Strand, Hansen, & Courtney, 2013), and parenting skills (Barth & Liggett-Creel, 2014).

The goals of this research were: 1) to develop a comprehensive taxonomy of common elements of caregiver-mediated NDBI based on review of existing fidelity measures and qualitative feedback from an expert panel; 2) to develop and refine an observational rating scheme via quantitative feedback from experts; and 3) to establish preliminary reliability and validity of the new measure based on ratings of CCX videos from various completed intervention trials.

STUDY 1

The purpose of this study was to develop a comprehensive taxonomy of common elements of caregiver-mediated NDBI and to determine the common elements across NDBI using quantitative data. We expected that a broad set of items could be clearly defined based on input from intervention manuals and fidelity rating schemes, as well as qualitative feedback from an expert panel. Next, we expected that a subset of these items would emerge as common across the NDBI under consideration.

METHOD

Development of an Intervention Taxonomy

Guided by the methodology proposed by McKenzie et al.
(1999) to establish content validity, a multistep process was used to determine the common elements of NDBI (Figure 1). The first author requested published and unpublished NDBI fidelity measures from doctoral-level intervention developers and experts in order to develop a broad taxonomy of NDBI strategies. Several authors of the Schreibman et al. (2015) paper, as well as known colleagues who have conducted RCTs of the interventions identified by Schreibman et al., were invited by email to collaborate on this work. A total of 11 research teams (14 individuals) were approached, with some having more than one expert individual per site. One research team did not respond. Interventions examined included Early Achievements (Landa, Holman, O'Neill, & Stuart, 2011), Early Start Denver Model (ESDM; Rogers & Dawson, 2010), Enhanced Milieu Teaching (EMT; Kaiser et al., 2000; Kaiser & Hester, 1994), Joint Attention, Symbolic Play, Engagement & Regulation (JASPER; Kasari, Freeman, & Paparella, 2006; Kasari, Gulsrud, Wong, Kwon, & Locke, 2010), Pivotal Response Training (PRT; Schreibman & Koegel, 2005), Project ImPACT (Ingersoll & Dvortcsak, 2009), and Social ABCs (Brian et al., 2017; Brian et al., 2016). Each of these interventions has been examined in a research context, and has demonstrated some evidence of efficacy as a therapist-delivered and/or caregiver-mediated intervention. While the intervention approaches used in this study do not represent a comprehensive list of all interventions that could be characterized as NDBI, those with expertise in the above interventions agreed to collaborate on this endeavor.

Examination of treatment fidelity forms across research groups and interventions revealed a range of ways in which fidelity is measured. The number of items rated across interventions varied substantially, from as few as 6 to as many as 32, suggesting variability in the comprehensiveness of these measures. While some research teams utilize interval coding methods (e.g.
rating presence or absence of a behavior during each one-minute interval), others use more global measures (e.g. a 1-5 Likert-type rating ranging from little-to-no use of strategies to high-quality implementation). Across research teams, scores are generally averaged and converted to an overall percent rating of treatment fidelity. Characteristics of these fidelity forms are summarized in Table 1.

A preliminary taxonomy of intervention elements was established by examining the content of available NDBI fidelity rating forms and treatment manuals. The taxonomy was inclusive of items that were intervention-specific (i.e. not common across all interventions), as well as those shared among most or all interventions. In cases when additional descriptors or clarification were needed, published and unpublished intervention manuals were reviewed and discussed with colleagues in order to define strategies. The taxonomy was refined over several iterations, integrating informal feedback and conversation with colleagues with expertise in NDBI. A total of 20 items were defined (Appendix A). Of the 20 items, 9 focused on promoting child engagement in an activity with the adult, 3 focused on adult modeling of skills, 2 focused on encouraging spontaneous communication, and 6 focused on direct teaching strategies. Of the 6 items focusing on direct teaching strategies, 5 represented different individual components of a multistep teaching procedure. Each strategy was formally defined based on the content of the examined fidelity forms and manuals; examples and non-examples were generated for each item to further clarify the definition of the strategy.

Qualitative Review

The content of the 20-item preliminary taxonomy was refined based on expert feedback using an adapted Delphi Method.
The preliminary taxonomy was sent to 13 collaborators with expertise in intervention research and development for open-ended critique and commentary in the form of tracked-changes edits and comments in a word processing document. Three of the original 13 individuals did not provide written feedback; two individuals shared the taxonomy with a colleague to provide additional feedback, and one individual asked a colleague to provide feedback in her place. Thus, suggestions and revisions were obtained from a total of 12 individuals. The definitions and examples were subsequently revised. The most substantive change to content was for items pertaining to and Imitating the child. In the initial draft, these two items were Child choice of activity and Imitation and joining in the activity; however, expert critique made clear that imitation was distinct from joining the child, whereas child choice and joining the child were both key aspects Additional changes included clarifying terminology, clarifying and adding to examples, and the addition of a glossary to define key terms. These items were then resent to collaborators in the form of a survey for additional feedback and refinement.

Refinement of Observational Rating Scheme

Next, a group of experts provided quantitative feedback on the refined item list in order to reduce items to the common elements and increase the content validity of the item set. Collaborators were asked to nominate additional raters with expertise in their respective interventions as needed such that each intervention would have 4 representatives, for a total of 28 raters. Specifically, collaborators were asked to nominate individuals who they would consider Two intervention developers nominated fewer than 4 respondents; therefore, a total of 25 individuals were contacted with the survey link. Of these, 21 individuals responded (85%).
In order to prevent over-representation of any one intervention, survey responses from 2-3 experts per intervention were used, with additional responses dropped from analysis. Individuals with fewer years of experience were dropped first; where there was an equal amount of intervention experience, one was chosen at random by flipping a coin. A total of 19 responses were analyzed, representing 7 NDBI.

Measures

A Qualtrics survey link was distributed to expert collaborators and their nominees. Respondents were presented with the text for the 20 revised items, and 3 questions per item. Questions included Likert-scale ratings of clarity and the extent to which items could be rated on video by an expert and by a well-trained non-expert. In addition, respondents rated the extent to which each item was a part of a given intervention using the following scale, adapted from Lawshe (1975):

Essential (3): This item is a component of [intervention], and it is described explicitly in the intervention manual. Interventionists use it consistently during sessions.

Useful, but non-essential (2): This item is good clinical practice, and interventionists use it when providing [intervention], but it is not described in the intervention manual.

Neutral (1): I would not discourage use of this strategy when providing [intervention], but interventionists do not typically use it, and it is not described in the intervention manual.

Conflicting (0): This item conflicts with the [intervention] intervention protocol. Intervention trainees and caregivers are discouraged from using this strategy.

A content validity ratio (CVR) was calculated for each item, using the following formula:

\[ \mathrm{CVR} = \frac{n_e - N/2}{N/2} \]

where $n_e$ = number of respondents rating the item "Essential", and $N$ = the total number of respondents. The CVR was used to quantitatively evaluate the extent to which each item was characteristic of NDBI.
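As a minimal sketch of the CVR calculation described above, assuming the standard Lawshe (1975) form of the ratio, the snippet below computes the CVR for one item from a tally of "Essential" ratings. The function name and the example tallies are hypothetical illustrations, not data or code from this study.

```python
def content_validity_ratio(n_essential: int, n_total: int) -> float:
    """Lawshe-style content validity ratio: (n_e - N/2) / (N/2).

    n_essential: number of respondents who rated the item "Essential".
    n_total: total number of respondents who rated the item.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_essential <= n_total:
        raise ValueError("n_essential must lie between 0 and n_total")
    half = n_total / 2
    return (n_essential - half) / half


# Hypothetical tallies for a 19-person panel (illustrative only).
for n_essential in (19, 14, 13, 9):
    cvr = content_validity_ratio(n_essential, 19)
    print(f"{n_essential} of 19 rated Essential -> CVR = {cvr:.2f}")
```

Under this formula, the ratio runs from -1 (no respondent rates the item "Essential") to +1 (every respondent does), and is 0 when exactly half do. With 19 respondents, an item would need at least 14 "Essential" ratings for its CVR to exceed 0.42.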
A negative CVR indicates that fewer than 50% of raters classified an item as "Essential", whereas a positive CVR indicates that greater than 50% of raters classified an item as "Essential". The published recommended cutoff for achieving statistically significant agreement with our sample size (0.42) was used to determine which items would be retained in the final measure (Lawshe, 1975; Veneziano & Hooper, 1997).

RESULTS AND DISCUSSION

Clarity

Item clarity was rated on a 5-point Likert scale, ranging from extremely clear (1) to extremely unclear (5); items with a rating greater than or equal to 3.0 were considered in need of revision or exclusion from the final measure. Across all items, the average clarity rating was 1.68, with scores ranging from 1.21 to 2.16. Therefore, no items were eliminated or further refined due to lack of clarity.

Observability

Respondents were asked to rate how well each item could be rated from a 10-minute video by an NDBI expert and by a well-trained non-expert, with response options ranging from extremely well (1) to not well at all (5). As with the clarity ratings, items rated greater than or equal to 3.0 were considered in need of revision or exclusion from the final measure. Across all items, the average rating was 1.47 for NDBI experts (range: 1.11 to 1.84), and 1.98 for non-experts (range: 1.32 to 2.84). No items were eliminated or further refined due to difficulty rating from video.

Content Validity

CVRs were calculated for each individual item in two ways: 1) considering the number of respondents indicating a score of "Essential" only; and 2) considering the respondents who indicated a score of "Essential" or "Useful, but non-essential" (Table 2). Examination of only addition of - on general clinical skills when providing intervention, rather than using only elements specific to a single manualized treatment.

When considering only items rated "Essential", 10 of the 20 items exceeded the statistically significant cutoff of 0.42.
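The screening rules reported in this section can be sketched in a few lines, under stated assumptions: an item is flagged when its mean clarity or observability rating is 3.0 or higher, and an item is treated as a common element when its CVR exceeds 0.42. The item labels, ratings, and tallies below are hypothetical placeholders rather than study data, and the helper names are illustrative.

```python
REVISION_THRESHOLD = 3.0  # mean ratings at or above this flag an item for revision
CVR_CUTOFF = 0.42         # cutoff reported in the text for this panel size


def flag_for_revision(mean_ratings: dict[str, float]) -> list[str]:
    """Return items whose mean clarity/observability rating is 3.0 or higher."""
    return [item for item, rating in mean_ratings.items()
            if rating >= REVISION_THRESHOLD]


def common_elements(essential_counts: dict[str, int], n_total: int) -> list[str]:
    """Return items whose CVR, (n_e - N/2) / (N/2), exceeds the cutoff."""
    half = n_total / 2
    return [item for item, n_essential in essential_counts.items()
            if (n_essential - half) / half > CVR_CUTOFF]


# Hypothetical values for three placeholder items and a 19-person panel.
clarity = {"item_a": 1.21, "item_b": 2.16, "item_c": 3.40}
essential = {"item_a": 17, "item_b": 14, "item_c": 9}

print(flag_for_revision(clarity))
print(common_elements(essential, 19))
```

The strict comparison follows the wording "exceeded the cutoff"; with 19 respondents no achievable CVR equals 0.42 exactly, so a non-strict comparison would select the same items.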
This suggests that there are numerous common elements of NDBI, including those focusing on increasing child engagement, modeling new skills, encouraging spontaneous communication, and teaching new skills. One additional item, Supporting a correct response using prompts, fell just below the cutoff (CVR = 0.37) and was examined further and refined based on feedback from the expert panel. Specifically, some interventions used a particular prompting hierarchy that was precluded by the original wording of the item; therefore, this item was modified to contain more generic language and was included in the final set of common items. As expected, these items, described below, encompass strategies found in developmental and relational interventions (e.g. child-led activities) as well as those found in applied behavior analytic interventions (e.g. direct teaching episodes based on principles of operant conditioning). This lends support to the content validity of this set of items.

When considering both "Essential" items (i.e. those considered important and defined in the treatment manual) and "Useful, but non-essential" items (i.e. those which are commonly used and considered good clinical practice, but not necessarily in the intervention manual), only one of the original 20 identified items did not exceed the statistically significant cutoff of 0.42: Imitating the child. This suggests that trained clinicians consistently use many strategies that are not explicit components of the specific NDBI being delivered, although they are a manualized component of at least one other NDBI. In other words, many strategies seem to be shared among NDBI in practice, but not necessarily among treatment manuals and fidelity forms. In addition, the presence of these common practices may compromise direct comparison of different interventions, and obscure our understanding of which treatment strategies promote improvement in child outcomes.

Development of NDBI-Fi Rating Scheme

The 10 quantitatively-derived common items as well as the revised item regarding use of prompting strategies during direct teaching were used to develop the NDBI-Fi measure.
More specifically, an 8-item rating scheme was developed. Each item is described briefly here, and the full item text and rating anchors can be viewed in Appendix B. Face to Face refers to how often the adult is directly in front of the child and at a similar height. Follow Child's Lead focuses on the extent to which the adult joins the child in a child-chosen activity. Displaying positive affect and animation rates the extent to which the adult uses exaggerated positive vocal tone, facial expressions, gestures, etc. The focus of Modeling appropriate language is on how often the adult makes developmentally appropriate comments on the activity, rather than giving commands, asking rhetorical questions, or remaining silent. Responding to attempts to communicate rates the extent to which the adult responds to the child's attempts to communicate. Using communicative temptations refers to how often the adult nonverbally elicits communication using one of several techniques paired with wait time for the child to communicate.

Several items focus on direct teaching episodes, which comprise a multi-step procedure based on principles of operant conditioning with an antecedent-behavior-consequence (A-B-C) structure. These included Clear and appropriate teaching opportunities, Motivating and relevant teaching opportunities, Supporting a correct response using prompts, and Providing contingent natural and social reinforcement. Together, these components make up a direct teaching episode, and were collapsed into a single item, Quality of direct teaching, for the purposes of the rating scheme. Frequency of direct teaching rates how many times the adult completes a direct teaching episode with an A-B-C structure. Together, the Frequency and Quality items account for how often and how well caregivers used direct teaching strategies.

An observational rating scheme and scoring manual were developed for the NDBI-Fi. A macro-level rating scheme (i.e.
a 1-5 rating scale) rather than a micro-level discrete coding system was designed, both to align with many of the existing fidelity measures and to increase the likelihood that the measure would not be burdensome or costly to use. This type of rating can be accomplished without specialized computer software, and in a relatively short amount of time. Not only do many existing NDBI fidelity measures use these types of rating schemes, but research also suggests that they yield similar information in much less time than fine-grained coding approaches (Bakeman & Quera, 2011). The NDBI-Fi manual includes practical considerations for rating, such as a recommended system for note-taking and the number of passes in which to rate videos. It also specifies item definitions, examples and non-examples, and includes a glossary and descriptive anchors for assigning ratings. The rating scheme was piloted on a small set of videos by two raters in order to refine the descriptive rating anchors, and to achieve inter-rater reliability. Scoring differences were discussed, and items and rating anchors were refined in order to improve clarity and ease of scoring.

STUDY 2

Study 2 piloted an intervention-independent rating scheme, the NDBI-Fi, whose 8 items were based on the common intervention elements defined in Study 1. To accomplish this goal, the rating scheme was applied to videos of families who participated in completed or ongoing RCTs of NDBI, and the reliability, validity, and sensitivity of the measure were examined.

METHOD

Participants

This study involved analyzing existing data from completed or ongoing treatment trials of caregiver-mediated NDBI with children with ASD aged 7 years or younger.
Videos were contributed from several sites¹ (including Michigan State University, University of California San Diego, and Weill Cornell Medical College), representing two interventions: Project ImPACT (Ingersoll & Dvortcsak, 2009) and JASPER (Kasari et al., 2006; Kasari et al., 2010). All families consented for their videos to be used for research purposes. This study was approved by the Institutional Review Board at Michigan State University. The study sample involved 60 parent-child dyads. Demographic information is reported in Table 3. At intake, children were an average of 35.5 months old (SD = 13.4).

¹ Additional data are being obtained from other sites conducting research with other treatment models; institutional review boards and data use agreements are pending.

Measures

NDBI-Fi. The NDBI-Fi is an 8-item observational rating scheme based on the common elements of NDBI ascertained in Study 1. Two raters, the first author of this paper and an undergraduate research assistant familiar with observational rating schemes but inexperienced in intervention, independently coded videos and held consensus meetings to discuss discrepancies in ratings until inter-rater reliability was met. Raters were considered reliable when 3 consecutive videos were rated with the following criteria for agreement: 1) at least 7 out of 8 items were within 1 point; 2) no items were greater than 2 points apart; and 3) the average score was within 0.5 points (i.e. +/- 0.25 points). The primary rater was kept blind to intervention type (specific NDBI) when possible, as well as group assignment (study treatment vs. control) for all videos.

Caregiver-child Interaction (CCX). Collaborators contributed caregiver-child interaction (CCX) videos from existing treatment trials of their respective caregiver-mediated NDBI.
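The rater training criteria listed above under the NDBI-Fi measure could be checked programmatically. The sketch below is one possible reading of those criteria; the function names and the example ratings are hypothetical, and "within 0.5 points" is interpreted here as an absolute difference in the two raters' average scores of at most 0.5.

```python
from typing import Iterable, Sequence, Tuple

Ratings = Sequence[int]  # one rater's eight item ratings (1-5) for one video


def video_meets_criteria(rater_a: Ratings, rater_b: Ratings) -> bool:
    """Apply the three agreement criteria to a single double-rated video."""
    if len(rater_a) != 8 or len(rater_b) != 8:
        raise ValueError("each rater must supply 8 item ratings")
    diffs = [abs(a - b) for a, b in zip(rater_a, rater_b)]
    within_one = sum(d <= 1 for d in diffs) >= 7      # at least 7 of 8 items within 1 point
    none_far_apart = all(d <= 2 for d in diffs)       # no item more than 2 points apart
    mean_gap = abs(sum(rater_a) / 8 - sum(rater_b) / 8)
    return within_one and none_far_apart and mean_gap <= 0.5


def raters_are_reliable(videos: Iterable[Tuple[Ratings, Ratings]], run_length: int = 3) -> bool:
    """True once `run_length` consecutive videos meet the agreement criteria."""
    streak = 0
    for rater_a, rater_b in videos:
        streak = streak + 1 if video_meets_criteria(rater_a, rater_b) else 0
        if streak >= run_length:
            return True
    return False


# Hypothetical ratings, for illustration only.
first = [3, 4, 3, 2, 3, 1, 4, 4]
close = [3, 3, 3, 2, 4, 1, 4, 4]   # two items differ by 1 point
far = [3, 4, 3, 2, 3, 1, 4, 1]     # one item differs by 3 points
print(raters_are_reliable([(first, close)] * 3))
print(raters_are_reliable([(first, close), (first, far), (first, close), (first, close)]))
```

Resetting the streak after any video that fails a criterion reflects the requirement that the three qualifying videos be consecutive.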
All CCX videos involved an approximately 10-minute free play interaction between the child and the caregiver at the post-intervention time point, and included families in the treatment and control groups (e.g. waitlist, treatment-as-usual). Sites were asked to select English-speaking participants within the treatment and control groups at random, using an online random number generator (https://www.random.org/integer-sets). A total of 60 post-timepoint videos were contributed from 3 intervention trials, including 37 from families who received treatment and 23 from controls. A subset of pre-post video pairs and follow-up data from an ongoing RCT at Michigan State University was used to examine sensitivity to change and predictive validity (n = 24 families).

Established NDBI Fidelity. Caregiver percent fidelity to the intervention protocol was supplied by collaborators from two sites for each CCX video using the established fidelity measure for their respective intervention. Established NDBI Fidelity was compared to fidelity ratings on the new NDBI-Fi measure to determine convergent validity.

Brief Observation of Social Communication Change (BOSCC). The BOSCC was used to code child behaviors in the CCXs to assess predictive validity. The BOSCC is an observational coding scheme that was developed as a sensitive outcome measure for intervention studies in ASD (Grzadzinski et al., 2016). It captures change in child social communication and repetitive behaviors in the context of a short play interaction. Flow charts with specific decision points are used to score each of 15 items. Item codes range from 0 to 5, where 0 represents more typical behavior and 5 represents more atypical behavior. The BOSCC has been used in a number of studies, and has shown promise as a tool for assessing change in child behavior (Kitzerow, Teufel, Wilker, & Freitag, 2016; Nordahl-Hansen, Fletcher-Watson, McConachie, & Kaale, 2016; Pijl et al., 2016).
Mullen Scales of Early Learning (MSEL). Age equivalent scores from the MSEL (Mullen, 1995) were used to characterize the developmental level of the sample and to evaluate discriminant validity. The MSEL has four domains that evaluate verbal development (Receptive Language and Expressive Language) and nonverbal development (Fine Motor and Visual Reception).

Analysis Plan

Behavioral rating schemes must be both reliable and valid (Chorney, McMurtry, Chambers, & Bakeman, 2015). Observer agreement in particular is essential when using observational measures with multiple raters (Bakeman & Quera, 2011). Validity was examined in line with published recommendations (2003) for developing a new measure. This included investigating between-group differences that are consistent with the fidelity construct (criterion validity), and the correlations between measures of similar and dissimilar constructs (convergent and discriminant validity). In addition, we examined the extent to which the measure predicted relevant outcomes (predictive validity). In order to assess whether the NDBI-Fi has potential for use in future intervention trials, we also examined whether the measure captured change over the course of a treatment study (sensitivity). In addition, the basic properties of the item distributions were examined.

Reliability. Cronbach's alpha was used to evaluate the internal consistency of the items within the NDBI-Fi. In addition, a total of 49 videos from two sites were coded by two raters. Intra-class correlations (ICCs) were used to evaluate agreement between coders on individual items as well as overall score. ICCs were selected because they can be used for ordinal and interval data, and incorporate the magnitude of disagreement in order to estimate inter-rater reliability (Hallgren, 2012). A single-measures, two-way mixed design based on absolute agreement was used.

Validity.
To address concurrent validity, an independent samples t-test was used to determine if caregivers who received training differed from those who did not at the post-timepoint. We expected that caregivers who were in the active study treatment group would receive a significantly higher NDBI-Fi rating than caregivers who were in control groups. Convergent and discriminant validity were examined by conducting Pearson correlations. We expected that overall ratings for the Established NDBI Fidelity would be significantly correlated with the NDBI-Fi Average Rating with a medium to large effect size. Next, we expected that the NDBI-Fi would not be related to child chronological age or child developmental age equivalent (i.e. a small effect size, r < 0.2).

In order to assess predictive validity, a subset of videos (n = 21) of the same dyad pre- and post-training were rated using the NDBI-Fi. These same dyads had their pre-treatment and follow-up videos rated using the BOSCC, which evaluates child response to intervention. Increases in caregiver use of intervention strategies from pre- to post-treatment should theoretically relate to improvements in child social communication from pre-treatment to follow-up. Change scores for the NDBI-Fi were calculated by subtracting pre-treatment from post-treatment ratings, such that a positive change score indicated improvement. Change scores for the BOSCC were calculated by subtracting follow-up scores from pre-treatment scores, such that a positive change score indicated improvement. Pearson correlations were used to evaluate the extent to which change in caregiver scores on the NDBI-Fi related to improvement in child social communication skills.

Sensitivity. To evaluate the sensitivity of the NDBI-Fi, a subset of videos of the same dyad pre- and post-training were rated (n = 24).
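The two change-score conventions described above for predictive validity run in opposite directions, because higher NDBI-Fi ratings and lower BOSCC scores both indicate better outcomes. A minimal Python sketch follows, assuming a textbook Pearson correlation; all function names and scores are hypothetical and are not study data.

```python
from math import sqrt
from typing import Sequence


def ndbi_fi_change(pre: float, post: float) -> float:
    """Post minus pre: a positive value means caregiver strategy use increased."""
    return post - pre


def boscc_change(pre: float, follow_up: float) -> float:
    """Pre minus follow-up: BOSCC codes are higher for more atypical behavior,
    so a positive value means the child improved."""
    return pre - follow_up


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for four dyads, for illustration only.
ndbi_pre, ndbi_post = [2.5, 3.0, 2.8, 3.1], [3.5, 3.4, 3.9, 3.2]
boscc_pre, boscc_follow_up = [40, 35, 38, 30], [33, 33, 29, 30]

caregiver_change = [ndbi_fi_change(a, b) for a, b in zip(ndbi_pre, ndbi_post)]
child_change = [boscc_change(a, b) for a, b in zip(boscc_pre, boscc_follow_up)]
print(f"r = {pearson_r(caregiver_change, child_change):.2f}")
```

Coding both change scores so that positive values indicate improvement means a positive correlation corresponds to the hypothesized relationship.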
A paired samples t-test was used to assess for significant change in caregiver use of strategies from pre- to post-training. We expected that, on average, caregivers would score significantly higher on the NDBI-Fi after receiving training.

RESULTS AND DISCUSSION

The NDBI-Fi Average Score (M = 3.26, SD = 0.66) was adequately normally distributed (Figure 2), with skewness of -0.14 (SE = 0.31) and kurtosis of -0.71 (SE = 0.61). Two individual items deviated from normality according to skewness and kurtosis values (Table 4), with one representing a low-frequency behavior with positive skew (6. Communicative Temptations) and one representing a high-frequency behavior with negative skew (7. Frequency of Direct Teaching). Frequency distributions of individual items are included in Appendix E.

Reliability

The 8 NDBI-Fi items had a Cronbach's alpha of 0.77, thereby demonstrating good internal consistency. Inter-item correlations ranged from -0.11 to 0.71. The single measures ICC for the NDBI-Fi Average Rating was 0.79, demonstrating excellent reliability (Cicchetti, 1994). Individual item ICCs ranged from 0.52 to 0.85 (Table 5), with 3 items with fair reliability, 2 items with good reliability, and 3 items with excellent reliability (Cicchetti, 1994). ICCs suggest that, while absolute agreement between raters for the Average Rating was excellent, some items were more difficult to rate consistently than others. Notably, one of the two raters did not have any direct intervention experience. Despite her limited experience, she was able to learn the rating scheme and reach reliability according to our training criteria. For raters without intervention experience, it is possible that more stringent training criteria may be needed in order to obtain high inter-rater reliability across all measure items.

Validity

Concurrent validity.
An independent samples t-test was used to compare post-timepoint ratings for caregivers in the active study treatment groups (n = 37) and control groups (n = 23). On average, caregivers who received training (M = 3.50, SD = 0.61) received higher NDBI-Fi Average Ratings than caregivers in the study control groups (M = 2.89, SD = 0.57), with a large effect size, t(58) = 3.81, p < 0.001, d = 1.01. As expected, this shows that parents who have received training in an NDBI demonstrate greater adherence to common NDBI strategies than parents in study control groups. However, there was also overlap in the frequency distributions of trained and untrained caregivers, with some untrained caregivers demonstrating high fidelity, and some trained caregivers demonstrating low fidelity (Figure 3). Of trained caregivers, 54% exceeded a cutoff of 3.5 on the average NDBI-Fi rating. Of untrained caregivers, 17% exceeded a cutoff of 3.5, which is consistent with the 20% reported by Stahmer et al. (2017). These data suggest that caregiver strategy use prior to training is important in understanding the change in treatment dose a child receives as a result of being assigned to the treatment group in an RCT. In other words, some children assigned to the control condition (i.e. those with caregivers who naturally use NDBI strategies) may actually receive a similar dose of treatment to children assigned to the active study treatment.

Convergent and discriminant validity. A Pearson correlation showed that the NDBI-Fi Average Rating correlated significantly with individual intervention fidelity collected at two sites² with a large effect size (r = 0.45, p = 0.001). As expected, caregivers who performed the interventions at higher fidelity also received higher ratings on the NDBI-Fi.
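For readers who wish to check the concurrent validity result above against the reported group summaries, the sketch below recomputes the effect size and test statistic from the means, standard deviations, and group sizes given in the text. It assumes a pooled-variance (Student) t-test and a pooled-SD Cohen's d; the source does not state which variants were used, and the function names are hypothetical. Because the inputs are rounded to two decimals, the output should land close to, but not exactly on, the reported t and d.

```python
from math import sqrt
from typing import Tuple


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference using the pooled SD (an assumed variant)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Pooled-variance t statistic and its degrees of freedom."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error, n1 + n2 - 2


# Summary statistics as reported in the text (rounded to two decimals).
trained = dict(m1=3.50, sd1=0.61, n1=37)
control = dict(m2=2.89, sd2=0.57, n2=23)

d = cohens_d(**trained, **control)
t, df = student_t(**trained, **control)
print(f"d = {d:.2f}, t({df}) = {t:.2f}")
```

Working from summary statistics rather than raw ratings is enough for a plausibility check of this kind, but it cannot substitute for reanalysis of the original data.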
Pearson correlations revealed that the NDBI-Fi Average Rating did not significantly correlate with either developmental level, as measured by averaging the age equivalent score across the four MSEL domains (r = 0.19, p = 0.11), or child chronological age at the start of the study (r = 0.16, p = 0.18). This is consistent with our prediction and lends support to the validity of the NDBI-Fi.

² Data are being collected at additional sites.

Predictive validity. A small subset of the sample who had received training had pre-post data for the NDBI-Fi as well as pre-follow-up data for the BOSCC (n = 21). Improvement in NDBI-Fi Average Rating from pre- to post-intervention did not significantly correlate with improvement on the BOSCC from pre-intervention to follow-up for the Social Communication (SC) subscale (r = 0.23, p = 0.31) or Total score (r = 0.37, p = 0.10). However, given that these analyses were underpowered, it is more useful to evaluate this relationship based on the effect size. Hemphill (2003) suggested that correlation coefficients between 0.2 and 0.3 fall within the middle third of effect sizes reported in psychological studies, thus suggesting some relationship between caregiver intervention fidelity and child improvement. These results should be interpreted cautiously and considered preliminary, given the small sample with which this analysis was conducted. Although the BOSCC has demonstrated preliminary reliability, validity, and sensitivity to change (Grzadzinski et al., 2016), research has not yet linked concurrent adult behavior with child behavior using this outcome measure. Further, although the BOSCC has been demonstrated to capture change in some samples (Grzadzinski et al., 2016; Kitzerow et al., 2016), this finding has not been replicated in others (Fletcher-Watson et al., 2016; Nordahl-Hansen et al., 2016).
Therefore, although we expected that improvement in NDBI-Fi Average Rating from pre- to post-intervention would be significantly associated with child improvement on the BOSCC from pre-intervention to follow-up testing, there are several reasons this might not have been observed. For example, child social communication skill may not have been captured adequately in a short observation (10 minutes) due to illness, fatigue, time of day, or challenging behavior.

Sensitivity

As expected, caregivers who had been trained on Project ImPACT scored significantly higher at post-intervention on the NDBI-Fi Average Rating (M = 3.65, SD = 0.54) than at pre-intervention (M = 2.81, SD = 0.55), t(23) = 5.93, p < 0.001, d = 1.53. This indicates that the NDBI-Fi is sensitive to change over the short-term treatment period associated with most caregiver-mediated NDBI. Therefore, it may be a useful instrument for quantifying change in a research context.

CONCLUSION

Various NDBI for young children with ASD have been independently developed and validated. While researchers acknowledge common strategies across these treatments (Schreibman et al., 2015), to our knowledge this study represents the first attempt to quantitatively evaluate the extent to which individual strategies are shared across manualized treatment packages. Study 1 involved development of a comprehensive taxonomy of intervention techniques through the examination of treatment fidelity forms and manuals and input from individuals with expertise in various NDBI. This large collaborative effort yielded a list of 20 defined strategies, refined by expert clinical scientists, with accompanying examples and non-examples to illustrate the strategies. Given the differences in terminology often used across NDBI models, these refined definitions may be useful in translating information among research teams and in the community.
Findings demonstrated that, of these 20 items, 11 are strategies that are shared across NDBI, and that these strategies can be measured using an intervention-independent fidelity rating scheme. Although evidence is preliminary, the NDBI-Fi has the potential to facilitate multisite research that cuts across interventions by providing a mechanism for evaluating change in common NDBI strategies during intervention trials. Study 1 also revealed that there are many strategies which are not explicit components of multiple NDBI manuals but are consistently used by clinicians with expertise in different NDBI. In other words, clinicians seem to use strategies that are part of at least one NDBI, although not necessarily the NDBI they deliver. In fact, using the broader criteria in which respondents rated their use of manualized strategies and good clinical practice, only 1 item was not considered a common element based on the statistically significant cutoff used for this study.

This has several implications. First, it is clear that there is substantial overlap among strategies delivered across different NDBI. This overlap in strategies makes comparison among NDBI challenging, given that differences in treatment manuals may not completely reflect differences in strategy use. This is further reflected by the variability found in existing fidelity measures used in research, which ranged in comprehensiveness from 6-item to 32-item rating schemes. In addition, it is possible that some of these strategies considered good clinical practice are used by practitioners regardless of treatment model, NDBI or otherwise. Given that comparison groups in RCTs often receive treatment-as-usual, this highlights the importance of understanding what usual care entails. If community practitioners are indeed using many similar strategies to those delivered as part of intervention trials, this may help explain some of the modest effect sizes found in some RCTs.
In addition, pilot testing of the NDBI-Fi showed variability in scores of caregivers with and without training in an NDBI, with some untrained caregivers demonstrating use of several NDBI strategies, and some trained caregivers demonstrating limited use of strategies. This has implications for interpretation of efficacy trials, insofar as it affects the extent to which randomization to the study treatment group indicates meaningful manipulation of caregiver behavior. In other words, dose of intervention appears to vary substantially across participants in both treatment and control conditions. In future research, it will be important to consider how change in caregiver fidelity of implementation relates to child outcomes, in addition to between-group comparisons.

In practice, this finding has implications for the use of stepped-care models in caregiver-mediated interventions for ASD (Phaneuf & McIntyre, 2011; Wainer & Ingersoll, 2015; Wood, McLeod, Klebanoff, & Brookman-Frazee, 2015). Caregivers who do not intuitively use many of these strategies may have the most to gain from training and may require a higher level of support to be successful. On the other hand, caregivers who do intuitively use some NDBI strategies may benefit from less supportive training, or training targeting other areas of need.

Last, research in implementation science has documented barriers to providing evidence-based interventions (EBIs) in the community for social services more broadly (Osterling & Austin, 2008; Pagoto et al., 2007) and for ASD interventions specifically (Pickard, Kilgore, & Ingersoll, 2016; Wood et al., 2015). Research suggests that practitioners have concerns about the use of packaged treatment manuals, perhaps due to the perceived inflexibility of treatment manuals, or difficulty knowing which treatment manual(s) to use. The present study identifies a core set of strategies shared across NDBI.
It also suggests that there may not be a need for training in more than one NDBI, given the demonstrated overlap across treatment models.

Limitations and Future Directions

This study was limited to examining common strategies used across a selection of NDBI for young children with ASD. Future research should attempt to evaluate this measure across additional NDBI, and on a greater number of CCX videos, which would lend further support to the validity of the measure. Data on inter-rater reliability suggest that while training a non-expert in rating caregiver fidelity can be achieved, it yields reliability estimates that are acceptable but could be improved. In addition, future research should attempt to clarify if and how often these intervention techniques are utilized by clinicians with expertise in other areas, such as more structured applied behavior analysis interventions, special education, and speech-language pathology. Understanding the extent to which community clinicians use similar strategies is important in determining the quality of services children receive as part of usual care. In addition, this would help clarify comparisons between NDBI and other treatment models, and between NDBI and usual care.

APPENDICES

APPENDIX A
Taxonomy of NDBI Strategies

APPENDIX B
NDBI-Fi Item Definitions and Rating Anchors

APPENDIX C
Tables

Table 1. Characteristics of Established NDBI Fidelity Measures.

Intervention                    Items   Subscales   Rating scale   Type of coding        Total Score
Early Achievements              21      0           1-5            Global
ESDM                            13      0           1-5            Per Activity
EMT                             22      0           0-2 or 3       Global
JASPER                          32      7           0-5            Global
PRT¹                            8       3           0-1            Interval (1-minute)
PRT²                            6       0           0-1            Interval (2-minute)
Project ImPACT³                 29      5           1-5            Global
Project ImPACT for Toddlers¹    19      7           1-5            Global
Social ABCs                     10      0           0-1            Interval (1-minute)

Notes.
¹ University of California San Diego site, ² Stanford University site, ³ Michigan State University site.

Table 2. Content Validity Ratios for Intervention Taxonomy Items.

Item                                                                 Essential or Useful   Essential
1   Face-to-                                                         0.89                  0.68
2   Setting up the activity space                                    0.89                  0.37
3                                                                    0.89                  0.89
4   Imitating the child                                              0.37                  0.05
5   Supporting turn-taking                                           0.79                  0.26
6   Displaying positive affect and animation *                       1.00                  0.68
7   Engaging the child in play routines                              0.79                  0.16
8   Engaging the child in social routines                            0.79                  -0.37
9   Managing problem behavior and dysregulation                      1.00                  0.37
10  Modeling appropriate language *                                  1.00                  0.58
11  Modeling gestures and JA                                         0.47                  0.05
12  Modeling new play acts                                           0.79                  0.37
13  Responding to attempts to communicate *                          0.89                  0.89
14  Using communicative temptations *                                1.00                  0.79
15  Pace and frequency of direct teaching opportunities *            0.89                  0.58
16  Varying difficulty of direct teaching target                     0.68                  0.05
17  Using clear and appropriate teaching opportunities *             0.79                  0.79
18  Providing motivating and relevant teaching opportunities *       1.00                  1.00
19  Supporting a correct response using prompts *                    0.68                  0.37
20  Providing contingent natural and social reinforcement *          0.89                  0.79

Note: * denotes items included in the NDBI-Fi Measure; Bold text denotes items exceeding the statistically significant cutoff of 0.42.

Table 3. Participant Demographics.
Children

Gender                                    n      %
  Male                                    50     83.3
  Female                                  10     16.7
Race                                      n      %
  White/Caucasian                         38     63.3
  Black/African-American                  7      11.7
  Asian/Pacific-Islander                  6      10.0
  Biracial/Mixed Race                     1      1.7
  Other                                   5      8.3
  Missing                                 3      5.0
Ethnicity                                 n      %
  Hispanic/Latinx                         9      15.0
  Not Hispanic/Latinx                     50     83.0
  Missing                                 1      1.7
MSEL Subscale AE (months)                 M      SD
  Visual Reception                        24.2   9.2
  Fine Motor                              23.8   7.9
  Receptive Language                      19.9   9.5
  Expressive Language                     19.7   9.2

Caregivers

Gender                                    n      %
  Male                                    3      5.0
  Female                                  57     95.0
Mother's highest complete education       n      %
  Graduate/Professional degree            17     28.3
  Bachelor's degree                       16     26.7
  Associate's degree                      4      6.7
  High school degree/GED                  18     30.0
  Did not complete high school            1      1.7
  Missing                                 4      6.7
Father's highest complete education       n      %
  Graduate/Professional degree            20     33.3
  Bachelor's degree                       12     20.0
  Associate's degree                      4      6.7
  High school degree/GED                  12     20.0
  Did not complete high school            0      0.0
  Missing                                 12     20.0

Note. MSEL = Mullen Scales of Early Learning, AE = age equivalent.

Table 4. Mean, standard deviation, and normality of NDBI-Fi items and Average Score.

                                                   Skewness           Kurtosis
NDBI-Fi Item                      Mean    SD       Statistic   SE     Statistic   SE
1. Face to Face                   2.63    1.30     0.31        0.31   -1.03       0.61
2. Follow Child's Lead            3.47    1.42     -0.71       0.31   -0.73       0.61
3. Positive Affect                3.62    1.32     -0.59       0.31   -0.93       0.61
4. Modeling Language              3.20    1.08     -0.31       0.31   -0.79       0.61
5. Responding to Communication    3.31    1.06     -0.16       0.31   -0.42       0.61
6. Communicative Temptations      1.44    1.06     1.76        0.31   2.99        0.61
7. Frequency of Direct Teaching   3.92    0.91     -1.09       0.31   1.70        0.61
8. Quality of Direct Teaching     3.87    0.89     -0.60       0.31   0.08        0.62
Average Score                     3.18    0.71     -0.14       0.31   -0.72       0.61

Note. SE = Standard Error.

Table 5. Reliability of individual NDBI-Fi items and Average Rating.

NDBI-Fi Item                      ICC     Cronbach's alpha
1. Face to Face                   0.83    0.91
2. Follow Child's Lead            0.63    0.78
3. Positive Affect                0.79    0.88
4. Modeling Language              0.57    0.72
5. Responding to Communication    0.52    0.75
6. Communicative Temptations      0.68    0.81
7. Frequency of Direct Teaching   0.85    0.73
8. Quality of Direct Teaching     0.55    0.35
Average Score                     0.79    0.91

APPENDIX D
Figures

Figure 1. Study 1 Method Flowchart.

Development of broad taxonomy of NDBI intervention strategies
  1. Review of existing fidelity forms and treatment manuals
  2. Development of item definitions and examples
  3. Qualitative feedback from expert panel regarding item definitions and examples
Item reduction to common, content-valid items
  4. Quantitative survey data from expert respondents on strategy use, item clarity, and ability to rate from video
  5. Calculation of content-validity ratios to determine item inclusion
Refinement of NDBI-Fi rating scheme
  6. Piloting and subsequent development of rating anchors
  7. Development of scoring manual and coding conventions

Figure 2. Frequency distribution of NDBI-Fi Average Score. (Histogram of all cases; x-axis: NDBI-Fi Average Rating in 0.5-point bins from 1.5 to 5.0; y-axis: frequency count.)

Figure 3. Frequency distribution of NDBI-Fi average ratings by group assignment. (Histogram with separate Control and Treatment series; x-axis: NDBI-Fi Average Rating in 0.5-point bins from 1.5 to 5.0; y-axis: frequency count.)

APPENDIX E
Frequency Distributions of Individual NDBI-Fi Items

Frequency count by NDBI-Fi Item Rating:

NDBI-Fi Item                      1     2     3     4     5
1. Face to Face                   14    15    14    9     8
2. Follow Child's Lead            7     7     8     18    20
3. Positive Affect                2     11    9     15    23
4. Modeling Language              2     14    13    25    6
5. Responding to Communication    2     9     22    18    9
6. Communicative Temptations      39    16    3     2     0
7. Frequency of Direct Teaching   2     2     11    30    15
8. Quality of Direct Teaching     0     3     10    29    16

REFERENCES

American Psychiatric Association (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.
Bagnato, S. J. (2005). The authentic alternative for assessment in early intervention: An emerging evidence-based practice. Journal of Early Intervention, 28(1), 17-22.
Bakeman, R., & Quera, V. (2011). Sequential analysis and observational methods for the behavioral sciences. Cambridge University Press.
Barth, R. P., & Liggett-Creel, K. (2014). Common components of parenting programs for children birth to eight years of age involved with child welfare services. Children and Youth Services Review, 40, 6-12.
Bearss, K., Burrell, T. L., Stewart, L., & Scahill, L. (2015). Parent Training in Autism Spectrum Clinical child and family psychology review, 18(2), 170-182.
Bradshaw, J., Koegel, L. K., & Koegel, R. L. (2017). Improving Functional Language and Social Motivation with a Parent-Mediated Intervention for Toddlers with Autism Spectrum Disorder. J Autism Dev Disord, 47(8), 2443-2458. doi:10.1007/s10803-017-3155-8
Brian, J. A., Smith, I. M., Zwaigenbaum, L., & Bryson, S. E. (20 spectrum disorder. Autism Research.
Brian, J. A., Smith, I. M., Zwaigenbaum, L., Roberts, W., & Bryson, S. E. (2016). The Social ABCs care Feasibility, acceptability, and evidence of promise from a multisite study. Autism Research, 9(8), 899-912.
Casenhiser, D. M., Shanker, S. G., & Stieben, J. (2013). Learning through interaction in children with autism: preliminary data from a social-communication-based intervention. Autism, 17(2), 220-241. doi:10.1177/1362361311422052
Chorney, J. M. L., McMurtry, C. M., Chambers, C. T., & Bakeman, R. (2015). Developing and Modifying Behavioral Coding Schemes in Pediatric Psychology: A Practical Guide. In J Pediatr Psychol (Vol. 40, pp. 154-164).
Chorpita, B. F., Daleiden, E. L., & Weisz, J. R. (2005).