IMPROVING JUVENILE RISK ASSESSMENT MEASUREMENT MODELS:
    A PSYCHOMETRIC COMPARISON OF SCORING METHODS
                                  By
                     Mary Katherine Kitzmiller
                         A DISSERTATION
                              Submitted to
                      Michigan State University
              in partial fulfillment of the requirements
                           for the degree of
                 Psychology – Doctor of Philosophy
                                 2022


                                             ABSTRACT
         Juvenile risk assessments are standardized rating instruments that measure criminogenic
risk in court-involved youth. Juvenile court practitioners use scores from risk assessments to
inform judicial decisions throughout case processing. It is critically important that risk scores
accurately reflect court-involved youths’ latent level of criminogenic risk; both artificially high
and low scores incur significant detriments to youths, courts, and communities. In light of the
consequences of risk misevaluation, there is an urgent need to develop and evaluate alternate
juvenile risk assessment measurement models.
         The current study aspired to improve measurement of criminogenic risk through the
development of a Novel Scoring Algorithm, which innovated upon current juvenile risk
assessment scoring in two ways: (1) it adjusted the weights of assessment items and domain
sub-scores to reflect their correlation with latent constructs of criminogenic risk; and (2) it integrated
the mitigating impact of prosocial protective factors into cumulative risk scores. In a
sample of 559 youths who entered the supervision of a county-level juvenile circuit court for the
first time, the Novel Scoring Algorithm outperformed the current method of scoring (i.e.,
summing all unweighted risk factors) in both absolute and relative model fit. However, the Novel
Scoring Algorithm yielded no incremental improvement in diagnostic accuracy, affirming the
Scoring-as-Usual method as an acceptable procedure for assessing likelihood of recidivism in
court-involved youth. Implications for effectively and equitably managing risk are discussed.


                                       ACKNOWLEDGMENTS
         It takes a village to write a dissertation, and I want to recognize my village for their
contributions to this work. First, thank you to my advisor and co-Chair, Dr. Caitlin Cavanagh. I
have grown so much as a researcher and a writer because of your thoughtful and compassionate
mentorship. You are my favorite editor, co-author, and giver-of-advice, and I know that our work
together will not end here. I also want to thank my co-Chair, Dr. Rebecca Campbell, for your
persistent advocacy on my behalf over the last five years. To my committee members, Drs. Julie
Krupa and Cris Sullivan, thank you for believing in this project from start to finish and for your
thoughtful feedback along the way. It has been a joy to learn from all of you.
         Special thanks to all of the past and current student members of the Juvenile Risk
Assessment Team, who have worked tirelessly over the last 18 years to produce the high-quality
data that I had the privilege of using in this dissertation. Thank you to our juvenile court
collaborators, especially Scott, for many thought-provoking conversations on risk assessment
over the years. Your relentless commitment to upholding best practice is laudable and humbling.
         To my Eco-Community family, thank you for supporting and challenging me to get this
right. Jen, Isi, and Funmi, your friendship has gotten me through the most difficult parts of this
process. The times spent with you all – the Saturday morning long runs, the Monday night reality
TV watch parties, and the midday trips to Sparty’s – are the parts of this journey that I cherish
the most. Thank you.
         To my parents, thank you for nourishing my love of learning and encouraging me to
forge my own path. Dad, your genuine interest and investment in my research means the world
to me. Mom, I am forever changed by your love, support, and encouragement. I wish you were
here for this moment.
       To my partner, Jacob. Time and time again, you have grounded me with patience,
perspective, and love. I am so incredibly grateful that you were willing to embark on this
Michigan odyssey together. I know we will look back on this season of life with fondness.
       Onward.


                                               TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
LITERATURE REVIEW
  Developmental & Ecological Perspectives on Juvenile Delinquency
    Differential Involvement in Juvenile Delinquency.
    Differential Selection in the Juvenile Justice System.
  Development of Juvenile Risk Assessment
    Risk-Needs-Responsivity Model.
  Innovations in Juvenile Risk Assessment
    Ecologically Informed Measurement.
    Integration of Strengths-Based Assessment.
  Advantages of Juvenile Risk Assessment Utilization
    Accurate Assessment.
    Consistent Appraisal.
    Improved Service Delivery.
  Challenges to Risk Assessment Utilization
    Racism in Risk Assessment.
    Gender Bias in Risk Assessment.
  Juvenile Risk Assessment & Systematic Misevaluation
    False Negatives.
    False Positives.
THE CURRENT STUDY
  Innovation I: Adjust Item and Domain Weights
  Innovation II: Integrate Risk and Protective Factors
PLAN OF WORK
  Methods
    Sample.
    Measures.
    Data Collection.
    Analytic Plan.
RESULTS
  Phase Ia: Development of Novel Scoring Algorithm
  Phase Ib: Development of Scoring-as-Usual Method
  Comparing Risk Scores Across Models
  Phase II: Evaluation of Scoring Methods
    Absolute Fit.
    Relative Fit.
    Diagnostic Accuracy.
    Summary of Evaluation.
  Phase III: Cohort Comparisons
    Gender Variation in Diagnostic Accuracy.
    Racial/Ethnic Variation in Diagnostic Accuracy.
    Court Division Variation in Diagnostic Accuracy.
DISCUSSION
  Patterns in Diagnostic Accuracy
    Cohort Comparisons.
  Moving Towards Equitable Decision-Making
  Recommendations for Effective Risk Management
    Importance of Protective Factors.
    Identifying Extraneous Assessment Items.
  Summary
  Strengths & Limitations
  Conclusions
APPENDICES
  Appendix A: Frequency of YLS and PFRJR Item Endorsement
  Appendix B: Correlations Between YLS and PFRJR Assessment Items
  Appendix C: Summary of the Novel Scoring Algorithm
  Appendix D: Summary of the Scoring-as-Usual Method
REFERENCES


                                                   LIST OF TABLES
Table 1. Demographics and charge types of study sample
Table 2. Sample YLS/CMI scores, PFRJR scores, and recidivism rates
Table 3. Latent variable model evaluation metrics and criteria for acceptable fit
Table 4. Composite risk estimates using the Novel Scoring Algorithm and Scoring-as-Usual method
Table 5. Relative and absolute fit indices for the Novel Scoring Algorithm and Scoring-as-Usual method
Table 6. Diagnostic accuracy of the Novel Scoring Algorithm and Scoring-as-Usual method
Table 7. Diagnostic accuracy across sample cohorts
Table 8. Comparing within-model diagnostic accuracy across cohorts
Table 9. DeLong tests comparing between-model performance across sample cohorts
Table 10. Frequency of YLS and PFRJR item endorsement
Table 11. Correlations between YLS and PFRJR assessment items
Table 12. Summary of the Novel Scoring Algorithm
Table 13. Summary of the Scoring-as-Usual method


                                      LIST OF FIGURES
Figure 1. Factor model describing the relationship between assessment items and domains on the YLS/CMI and PFRJR
Figure 2. Relationship between composite risk scores estimated by the Novel Scoring Algorithm and Scoring-as-Usual method


                                         INTRODUCTION
         In 2019, an estimated 690,000 minors were arrested in the United States for the first time
(Office of Juvenile Justice & Delinquency Prevention [OJJDP], 2019). Adolescents are uniquely
primed to engage in law breaking by virtue of their psychosocial characteristics (e.g.,
impulsivity, susceptibility to peer pressure) (Steinberg et al., 2015). However, systems of
oppression have reinforced disparate outcomes at every stage of juvenile case processing,
including arrest, conviction, and detention (Birckhead, 2012; Piquero, 2008). A substantial body
of literature has attributed these outcomes to “differential selection”: the juvenile justice system
upholds systems of oppression by imposing more punitive forms of court supervision on
marginalized youths (Piquero, 2008). One of the most documented mechanisms for differential
selection lies in discretion-based methods of risk evaluation, wherein case processing decisions
reflect youths’ perceived threat of future harm and receptivity to available services (Mulvey &
Iselin, 2008).
         In order to promote fair and equitable justice administration for court-involved youths, 46
states have instituted statutes that support or require juvenile risk assessment utilization (Juvenile
Justice Geography, Policy, Practice & Statistics [JJGPS], 2020). Juvenile risk assessments are
standardized instruments that estimate the likelihood of recidivism based upon empirically
validated criteria. Assessment items reflect criminogenically linked characteristics of
individual youths (e.g., personality, attitudes) and their proximal social environment (e.g.,
school, family, community) (Andrews & Bonta, 2010). Subsequently generated risk scores,
which correspond to the unweighted sum of all risk factors identified, can inform several
important judiciary decisions, including type and duration of court supervision (Vincent et al.,
2012).
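         As a minimal illustration of the unweighted scoring described above, the sketch below sums
binary risk indicators into a cumulative risk score. The function name and item names are
hypothetical placeholders chosen for illustration; they are not drawn from any actual instrument.

```python
def unweighted_risk_score(items):
    """Return the count of endorsed risk factors (each coded 1 = present, 0 = absent)."""
    for name, value in items.items():
        if value not in (0, 1):
            raise ValueError(f"{name} must be coded 0 or 1, got {value!r}")
    return sum(items.values())


# Hypothetical item names, for illustration only.
example_items = {
    "prior_court_contact": 1,
    "truancy": 0,
    "delinquent_peers": 1,
    "family_conflict": 1,
}
example_score = unweighted_risk_score(example_items)
```

Under this approach every endorsed item contributes equally to the total, regardless of how
strongly it relates to the underlying construct of criminogenic risk.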
         Juvenile risk assessment utilization is considered favorable over discretion-based
methods of risk evaluation for a number of reasons: (1) risk assessments are more accurate in
predicting general delinquent recidivism (Bonta & Andrews, 2007; Oleson et al., 2011); (2) they
ensure youth are evaluated against a consistent set of empirically supported criteria (Peck & Jennings,
2016; St. John et al., 2020); and (3) they are often administered with a separate, but
complementary, protective factors assessment, which facilitates holistic case planning (Vincent
et al., 2012). Court jurisdictions that utilize juvenile risk assessments witness lower rates of
recidivism and higher rates of treatment compliance, signaling the importance of these tools in
facilitating effective case management (Schwalbe, 2007).
         Despite these advantages, advocates of justice system reform have raised concern that
juvenile risk assessments sustain, rather than prevent, discriminatory judicial decisions (Green,
2020). Risk scores reflect population-level inequities, legitimizing inappropriately punitive, and
ultimately harmful, justice system sanctions directed towards marginalized youths (Harcourt,
2010; Miron et al., 2021). Furthermore, juvenile risk assessments were developed and calibrated
using predominantly male delinquent samples, which calls into question their appropriateness for
measuring criminogenic risk among status offenders and girls (Onifade et al., 2009; Van Voorhis
et al., 2010). While it is unlikely that risk assessment will eradicate structural oppression upheld
through the juvenile justice system, courts can reduce related harm by ensuring that risk scores
accurately and consistently reflect youths’ latent trait of criminogenic risk.
         Inaccurate, inconsistent risk assessment scores directly inhibit effective service delivery.
Processing decisions based on artificially low risk scores enable youth to re-enter their
communities with unaddressed criminogenic needs, placing them at higher likelihood of
reoffending (McCarter, 2016). On the other hand, processing decisions based on artificially high
scores justify the prescription of inappropriately restrictive or intensive services. In addition to
misallocating court resources, these inappropriate services may harm youth by damaging their
self-perception, disrupting their at-home routines, and increasing their association with higher
risk peers (Cecile & Born, 2009; Gatti et al., 2009; Leve & Chamberlain, 2005). Current methods
of juvenile risk assessment scoring may contribute to systematic misevaluation of court-involved
youth. Therefore, there is an urgent need to develop more precise, strengths-based, and ecologically
informed juvenile risk assessment measurement models.
        The overall objective of this dissertation was to develop and evaluate a novel juvenile
risk assessment scoring algorithm (hereinafter “Novel Scoring Algorithm”) with the intention of
improving the current method of measuring criminogenic risk (hereinafter “Scoring-as-Usual
Method”). Using second-order Confirmatory Factor Analysis (CFA), analyses drew upon risk
assessment records from 559 court-involved youths who had been formally petitioned to juvenile
court for the first time. Results have immediate implications for accurately and equitably
measuring criminogenic risk in youth via juvenile risk assessment.
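        The contrast between the two scoring approaches can be sketched in simplified form. The
sketch below is an illustration under stated assumptions, not the algorithm developed in this
dissertation: the weights stand in for the kind of standardized loadings a factor model would
estimate, all item names and values are invented, and subtracting a weighted protective score is
only one assumed way of representing the mitigating role of protective factors.

```python
def scoring_as_usual(risk_items):
    """Unweighted sum of endorsed risk factors (1 = present, 0 = absent)."""
    return sum(risk_items.values())


def weighted_composite(risk_items, risk_weights, protective_items, protective_weights):
    """Weighted risk total minus weighted protective total (assumed functional form)."""
    risk = sum(risk_weights[name] * value for name, value in risk_items.items())
    protection = sum(
        protective_weights[name] * value for name, value in protective_items.items()
    )
    return risk - protection


# Hypothetical items and weights, for illustration only.
risk_items = {"item_a": 1, "item_b": 1, "item_c": 0}
risk_weights = {"item_a": 0.8, "item_b": 0.4, "item_c": 0.6}
protective_items = {"strength_a": 1}
protective_weights = {"strength_a": 0.5}

usual_score = scoring_as_usual(risk_items)
composite_score = weighted_composite(
    risk_items, risk_weights, protective_items, protective_weights
)
```

Two youths with the same unweighted total can receive different composite scores under the
weighted form, depending on which items are endorsed and which strengths are present.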


                                     LITERATURE REVIEW
Developmental & Ecological Perspectives on Juvenile Delinquency
        Juvenile delinquency is common during adolescence; an estimated 1,909 minors are
arrested each day in the United States (OJJDP, 2019). Key social and physiological
characteristics prime adolescents to engage in law breaking in ways that distinguish them from
adults (Littlefield et al., 2010; Kuhn, 2009; Cauffman & Steinberg, 2000). Adolescents are
entrusted with more responsibilities and freedoms than they once were as children; however,
they lack psychosocial maturity, rendering them unable to regulate strong emotions, foresee
future consequences, and resist peer pressure (Cauffman et al., 2016; O’Brien et al., 2011;
Sebastian et al., 2010). As a result, adolescents are drawn to high-risk behaviors, which in many
cases includes delinquency. Fortunately, most teens’ law breaking is contained within
adolescence, even among those who commit serious crimes; as a result, juvenile delinquency has
been termed both temporary and situational (Moffitt, 1993).
        While all young people experience roughly the same psychosocial changes during
adolescence, contact and interactions with the juvenile justice system vary notably by
race/ethnicity, socioeconomic status, and gender. Rates of disproportionate minority contact
(DMC) have been reported at every stage of justice system involvement, with racial disparities
widening as youths advance through the stages of court processing (i.e., arrest, formal
processing, conviction, incarceration) (Piquero, 2008; Zane & Pupo, 2021). Similarly, while
socioeconomic indicators are not reported nationally, some jurisdictions indicate that as many as
60% of youths under court supervision live below the poverty line (Birckhead, 2012). While girls
are underrepresented within the general delinquent population, the juvenile justice system has
historically failed to respond to the unique social context which primes them towards
delinquency, further inhibiting successful rehabilitation (Hubbard & Matthews, 2008). These
disparate experiences exemplify how the juvenile justice system is reflective of deeply
entrenched social, cultural, and historical inequities; therefore, solely conceptualizing them within
the framework of individual development is limiting.
        Community psychologists have advocated for ecologically informed models of
understanding, addressing, and preventing juvenile delinquency (Fountain & Mahmoudi, 2021;
Javdani & Allen, 2016; Roesch, 1988). Ecological inquiry here refers to an umbrella of
multidisciplinary theories and concepts which describe human behavior as the product of
reciprocal interactions between an individual and their environment (McBride & McCoy, 1981).
Person-environment interactions are often identified and studied within different contextual
systems, including the immediate social environment (e.g., family, school, peers) as well as
broader social systems (e.g., community, society, culture) (Bronfenbrenner, 1977; Kumpfer &
Turner, 1990). A well-substantiated body of research has documented how ecological contexts
can increase and reduce the likelihood of deviant behavior in adolescence, including substance
use (Hawkins et al., 2004), violence (Gorman-Smith et al., 1996; Tarter et al., 2002), and
delinquency (Moon et al., 2010; Windle, 2000).
        The disparate contact and treatment of youth in the juvenile justice system can be better
understood from an ecological perspective. Although scholars have used different terms to
describe structural forms of oppression (e.g., systemic, institutional), these terms center the idea
that white supremacist, capitalistic, and patriarchal values are codified into our society’s policies,
laws, practices, structures, and institutions (Homan, 2019; Rucker & Richeson, 2021). Structural
forms of oppression interact to produce differential access to power and essential resources,
resulting in differential access to high-quality education, safe housing, employment
opportunities, healthcare, and wealth (Bailey et al., 2017; Bonilla-Silva, 1997; Jones, 2000;
Powell, 2007). Within the context of juvenile delinquency, structural oppression can be
understood as both an on-ramp to law breaking (i.e., differential involvement in crime) as well as
a contributing factor to disparate justice system outcomes (i.e., differential selection in the justice
system).
         Differential Involvement in Juvenile Delinquency. Structural oppression exercised
through federal subsidies, predatory mortgage lending restrictions, and subsidized housing
locations has created and maintained racially segregated neighborhoods marked by
“disinvestment and concentrated poverty” (Williams & Mohammed, 2013; Powell, 2007).
Youths who reside within these neighborhoods are often affected by chronic unemployment,
inadequate living conditions, and under-resourced schools. In the absence of legally viable
pathways to achieve upward social and economic mobility, these youths may engage in law
breaking as a means of survival and financial security (Nunn, 2001). Furthermore, racial
profiling, increased neighborhood surveillance, and other heavy-handed police tactics render
racially and socioeconomically marginalized youths acutely vulnerable to contact with law
enforcement resulting in arrest (Feinstein, 2015).
         Differential Selection in the Juvenile Justice System. Structural oppression is also
enacted through the operations of the juvenile court system. The juvenile court was developed
under the legal doctrine of parens patriae (the State as Parent): unlike the criminal justice
system, actions of the juvenile court are intended to rehabilitate youth from deviant behavior and
facilitate prosocial transitions into adulthood (Bilchik, 1998; Center on Juvenile and Criminal
Justice [CJCJ], 2021). In practice, parens patriae has allowed juvenile court actors near-
unchecked levels of discretion in court processing, viewed originally as a favorable relaxation of
the formal procedures carried out by the criminal justice system (Stohs, 2003). Under discretion-
based methods of risk evaluation, case processing decisions reflect youths’ perceived threat of
future harm and receptivity to available services (Mulvey & Iselin, 2008). However, the
subsequent lack of procedural safeguards has often come at the expense of court-involved youths,
as their experiences and outcomes can vary widely by virtue of the legal actors they encounter.
While the consequences of discretion-based biases in the juvenile justice system are both
complex and intersectional, an abundance of research has documented their distinct harms to both
youth of color and girls.
        Empirical research highlights the ways in which racial discrimination, particularly
anti-Black racism, is carried out through discretion-based methods of risk evaluation. Bridges and
Steen (1998) found that probation officers describe Black and White youth differently in their
unstructured evaluations of criminogenic risk: narratives of Black youth were more likely to
reference negative personality traits (e.g., unremorseful), while narratives of White youth were
more likely to include descriptions of negative environmental influences (e.g., peer pressure).
Furthermore, when decisions are guided by legal actors’ discretion, youths of color are more
likely to be placed in pretrial detention (Bishop & Frazier, 1995; Johnson & Secret, 1990;
Wordes et al., 1994), to be formally petitioned to juvenile court (Bortner & Wornie, 1985; DeJong &
Jackson, 1998), and to receive more punitive sanctions (McGarrell, 1993; Thomas & Sieverdes,
1975), after controlling for the severity of their offense.
        Additionally, the decisions of legal actors have often upheld traditional patriarchal
ideologies at the expense of court-involved girls (Chesney-Lind, 1977). The term judicial
paternalism describes how the actions of the juvenile court both protect and punish young
women for violating gendered behavioral expectations (Chesney-Lind, 1977; Spivak et al.,
2014). The direction of judicial paternalism’s influence on court outcomes is a subject of debate
among scholars: some have found evidence that legal actors apply a chivalrous bias towards girls
in the justice system, granting them leniency from the otherwise imposed penalties of
delinquency (Blackwell et al., 2008; Daly, 1994). Others have asserted that girls under juvenile
court jurisdiction are doubly punished: first for the actual offense, and again for violating
patriarchal expectations for appropriate feminine behavior (Crew, 1991; Erez, 1992; Spohn,
1999). Regardless, the influence of judicial paternalism on risk evaluation inhibits fair, equitable,
and effective case processing for court-involved girls.
         Inappropriate case processing decisions can disrupt normative adolescent development,
inhibit prosocial transitions to adulthood, and predispose youths to persistent criminal trajectories
(Liberman & Fontaine, 2015). In sum, steering justice administration by discretion of legal actors
institutionalizes racism, paternalism, and other forms of oppression, at significant costs to court-
involved youths (Liberman & Fontaine, 2015).
Development of Juvenile Risk Assessment
         Actuarial juvenile risk assessments were developed to remedy the harm incurred by
discretion-based methods of risk evaluation (St. John et al., 2020). Beginning in the 1990s,
several studies systematically investigated the characteristics that distinguished adolescents who
follow persistent criminal trajectories from their peers (Hoge et al., 1996; Howell & Hawkins,
1998). Drawing upon these findings, experts developed several semi-structured interview
paradigms, checklists, and rating tools for the purpose of identifying these characteristics in
court-involved youths (Hoge & Andrews, 2010).
         Risk-Needs-Responsivity Model. The fundamental theory of change guiding juvenile
risk assessments is the Risk-Needs-Responsivity (RNR) Model, a widely adopted approach to
corrections in the criminal justice system that has been generalized to juvenile court settings
(Bonta & Andrews, 2007). The RNR Model steers court responses to delinquency using three
principles: (1) the risk principle, which states that the level of restriction in court-sanctioned
services must match the youth’s cumulative level of criminogenic risk; (2) the needs principle,
which states that the types of services must match the youth’s unique profile of criminogenic
needs; and (3) the responsivity principle, which states that the court must maximize the
likelihood of personal growth by adapting services to relevant individual and community
characteristics (Bonta & Andrews, 2007).
         Juvenile risk assessments play a key role in translating these principles into practice.
First, juvenile risk assessments yield cumulative risk scores, which correspond to the number of
risk factors identified. In accordance with the risk principle, more intensive services should be
reserved for youths with elevated risk scores, while those with lower scores should be eligible for
less involved sanctions (e.g., community service) or dismissed from court supervision altogether
(Andrews et al., 1990). Next, juvenile risk assessments parse out criminogenic risk in distinct
domains, including prior involvement in the justice system, family and peer relations, school
conduct, and leisure time management. The needs principle states that courts should provide a
menu of services and strategies that address these distinct areas of need; a “one size fits all”
approach to intervention will yield limited success (Bonta & Andrews, 2007). Finally, many
juvenile risk assessments identify individual- and community-level strengths that may deter
youths from future offending. The responsivity principle states that courts can maximize the
likelihood of treatment success by incorporating identified strengths into case planning (Hoge et
al., 1996).
Innovations in Juvenile Risk Assessment
        Over time, many juvenile risk assessment instruments have evolved to reflect a more
ecologically informed and strengths-based correctional framework (Barnes-Lee, 2020; Jacobs et
al., 2020). Both innovations represent significant gains in the evaluation and treatment of court-
involved youths over discretion-based methods of risk evaluation.
        Ecologically Informed Measurement. Standardized juvenile risk assessments measure
relevant criminogenic characteristics at both the individual (e.g., antisocial attitudes, emotional
regulation) and microsystemic (e.g., family, school, peer group) level (Bronfenbrenner, 1979).
Identifying areas of need within youths’ immediate social environment enables court
practitioners to involve other relevant actors and settings in rehabilitation efforts (Singh &
Azman, 2020). For example, family members, school personnel, and coaches may play key roles
in youths’ treatment success by monitoring changes in behavior and participating in therapeutic
interventions (Singh & Azman, 2020).
        Integration of Strengths-Based Assessment. Many juvenile risk assessments are paired
with a separate, but complementary, assessment of criminogenic protective factors (Hoge et al.,
1996). Identifying protective factors represents a critical step towards integrating strengths-based
assessment into judicial decision-making (Nissen, 2006). Strengths-based assessments measure
positive qualities, capacities, and resources that could play meaningful roles in youths’
desistance from crime (Nissen, 2006). Strengths-based assessments are not designed to replace
conventional “deficit-based” risk assessments; rather, they help practitioners identify and treat
problem behaviors from a more wholistic and humanistic perspective (Graybeal, 2001; Nissen,
2006). In practice, juvenile case managers may use protective factors to enhance treatment
responsivity by referring youths into programs that showcase their talents, goals, interests, and
abilities (Barnes-Lee, 2020; de Vogel et al., 2011; Ward & Brown, 2004).
Advantages of Juvenile Risk Assessment Utilization
         Since their inception, juvenile risk assessments have become an increasingly integral
component of risk evaluation across the United States: as of 2020, 46 states have implemented
statutes which support or require their utilization (JJGPS, 2020). Their popularity can be
attributed to several favorable outcomes over discretion-based methods of evaluation:
         Accurate Assessment. Estimates from juvenile risk assessments have proven more
accurate than discretion-based methods in distinguishing chronic juvenile offenders from their
peers (Grove & Meehl, 1996; Harris, 2006; Oleson et al., 2011). Improved accuracy in risk
evaluations allows courts to operate in alignment with the RNR’s risk principle: time- and cost-
intensive resources are allocated only to the youth who exhibit the greatest cumulative risk
(Bonta & Andrews, 2007). Accordingly, meta-analytic evidence indicates that over time, juvenile
risk assessment utilization lowers courts’ reliance on incarceration without compromising public
safety (Viljoen et al., 2019).
         Consistent Appraisal. Juvenile risk assessments are composed of clearly operationalized
risk factors; when implemented correctly, a youth’s risk score should not depend on the legal actor
administering the assessment (Peck & Jennings, 2016). This represents significant improvement
over discretion-based methods of risk evaluation, wherein case processing decisions can vary
based upon the mental heuristics, political agendas, and personal biases of the evaluator (Oleson
et al., 2011).
         Improved Service Delivery. Juvenile risk assessments facilitate wholistic and
humanistic case management by identifying a host of criminogenic risks and protective factors at
multiple ecological levels (Bonta & Andrews, 2007). Case managers can maximize the potential
for treatment success by matching youths to services that address areas of need and leverage
existing talents, community resources, and capacities. As a result, jurisdictions that utilize
juvenile risk assessments witness higher rates of treatment compliance and lower rates of
recidivism (Schwalbe, 2007).
Challenges to Risk Assessment Utilization
        While juvenile risk assessments have improved court outcomes in several regards, some
scholars have raised concern that risk assessment scores serve as proxy indicators for racism,
trauma, poverty, and other consequences of structural oppression (Harcourt, 2010; Miron et al.,
2021). As previously noted, structural oppression predisposes youth to differential delinquent
involvement; consequently, many characteristics of structural oppression are measured, either
directly or indirectly, as predictors of reoffending via juvenile risk assessment (Vincent &
Viljoen, 2020). Accordingly, a substantial body of literature has investigated the ways in which
juvenile risk assessments may sustain, rather than prevent, discriminatory judicial decision-
making, with explicit attention to consequences for youth of color and girls:
        Racism in Risk Assessment. Speaking specifically to the relationship between racism
and juvenile risk assessment, Brown (2007) writes: “Each category [of criminogenic risk] builds
a bias towards youth of color by neglecting how urban geographies affect these standardized
measures”. For example, juvenile risk assessments flag defiance towards legal authority and pro-
criminal attitudes as criminogenic risk factors (Hoge, 2020). However, these attitudes and beliefs
may be logical responses for youth of color, especially Black youth, whose communities have
been generationally harmed by mass incarceration, racial profiling, and police-sanctioned murder
(Glover, 2008; Outland, 2021; Tucker, 2014). By virtue of their positionality, youth of color may
be systematically classified as high risk, regardless of their actual likelihood of reoffending. In
this regard, juvenile risk assessments may do little more than discretion-based methods of risk
evaluation to prevent racially disparate sentencing decisions.
        Gender Bias in Risk Assessment. Most widely utilized juvenile risk assessment
instruments were developed and calibrated using predominantly male samples, which may come
at a significant detriment to court-involved girls (Belisle & Salisbury, 2021). Feminist
criminologists have identified distinct gendered factors and life experiences which prime girls
towards and away from delinquency. Specifically, court-involved girls are more frequently
exposed to trauma and victimization, which can be tied to their offense directly (e.g., running
away from an abusive home) or indirectly (e.g., substance use as a coping mechanism for post-
traumatic stress disorder) (Kerig & Becker, 2012). These experiences also hold relevance when
measuring criminogenic risk, as related features of trauma may be flagged as familial
dysfunction (e.g., poor relations with mother/father) or emotion dysregulation (e.g., short
attention span) (Van Voorhis et al., 2010). Importantly, girls’ delinquency is categorically less
chronic, violent, and severe when compared to boys’, highlighting how juvenile risk assessment
scores may conflate actual likelihood of reoffending with non-criminogenic trauma (Holtfreter &
Morash, 2003). By failing to account for gendered socialization processes and life experiences,
juvenile risk assessments may justify overly restrictive sentencing decisions directed towards
girls.
Juvenile Risk Assessment & Systematic Misevaluation
        These critiques bring to light how certain groups of youth are at heightened risk for
systematic misevaluation via juvenile risk assessment. Directly inhibiting the RNR’s risk
principle, systematic misevaluation occurs when youths’ cumulative risk scores are misaligned
with their actual likelihood of reoffending. While no risk assessment will predict recidivism with
complete accuracy, misevaluation that is systematic indicates that certain criminogenic
characteristics are consistently and methodologically mismeasured. Two forms of misevaluation,
as well as their consequences, are discussed below (Butcher et al., 2014):
         False Negatives. When juvenile risk assessments underestimate criminogenic risk (i.e.,
artificially low scores), youths’ actual likelihood of reoffending exceeds the level estimated by
their risk score. In these circumstances, courts may automatically divert youths from formal
processing, rendering them ineligible for rehabilitative treatment and wraparound services. While
it may not be the ideal venue for service delivery, contact with the juvenile justice system can
catalyze youths’ first opportunity to receive mental healthcare, addiction recovery treatment, and
intensive school support (Pullmann et al., 2006). Accordingly, systematic misevaluation of these
“false negatives” allows youths to re-enter their communities with unaddressed criminogenic
needs (McCarter, 2016); as a result, these youths are at elevated likelihood for reoffending.
         False Positives. When juvenile risk assessments overestimate criminogenic risk (i.e.,
artificially high scores), youths’ actual likelihood of reoffending falls short of the level estimated
by their risk score. In these circumstances, courts prescribe inappropriately punitive, cost- and
time-intensive sanctions to youths who do not need them. Instead of helping, these sanctions may
unintentionally raise the criminogenic risk level of these “false positives”. Prescription of highly
restrictive sanctions (e.g., juvenile detention) promotes the development of deviant self-
perception and disrupts participation in normative at-home routines (Cecile & Born, 2009; Gatti
et al., 2009). Furthermore, association with higher risk peers through court-sanctioned
programming may increase their propensity towards delinquent behavior (Leve & Chamberlain,
2005). Ergo, systematic misevaluation of these “false positives” indicates that the juvenile risk
assessment is misallocating court resources, at a potential detriment to youths’ well-being.
                                     THE CURRENT STUDY
        Given these severe consequences, it is critical that juvenile risk assessment scores closely
align with youths’ latent level of criminogenic risk. Juvenile risk assessment scores correspond
to the unweighted sum of all risk factors identified, a process hereinafter referred to as the
Scoring-as-Usual method. Despite its near universal application, the Scoring-as-Usual method
may generate imprecise risk estimates, enabling systematic misevaluation (Butcher et al., 2014;
Vincent et al., 2012). The current study sought to improve measurement of criminogenic risk
through the development of a Novel Scoring Algorithm, tailored to local patterns in data from a
county-level analytic sample.
        The Novel Scoring Algorithm was developed using initial scores from court-involved
youth who received the Youth Level of Service/Case Management Inventory (YLS/CMI), the
most widely utilized proprietary juvenile risk assessment instrument (Andrews & Bonta, 2010;
Schwalbe, 2007). The YLS/CMI is a 41-item assessment that measures criminogenic risk in
eight domains: Prior Dispositions/Offenses, Education, Leisure & Recreation, Attitudes &
Orientation, Personality & Behavior, Peer Relations, Substance Abuse, and Family & Parenting
(Andrews & Bonta, 2010). Innovations to YLS/CMI scoring are described below:
Innovation I: Adjust Item and Domain Weights
        Early developers of juvenile risk assessments distinguished between “static risk factors”,
which are less amenable to change and more significantly tied to reoffending (e.g., prior offense
history), and “dynamic risk factors” which are more easily remedied through effective court-
sanctioned intervention (e.g., substance use) (OJJDP, 2015). Furthermore, assessment items on
the YLS/CMI identify the same behavior at different frequencies or intensities (e.g., occasional
versus chronic substance usage). Theoretically, dynamic risk factors at low frequencies should
have lower correlation with criminogenic risk, relative to static risk factors at high frequencies
(Hoge & Andrews, 1996). However, the Scoring-as-Usual method constrains all risk factors to
contribute equally to estimates of criminogenic risk. The Novel Scoring Algorithm tailored item
weights to their correlation with latent domains of criminogenic risk, based upon the patterns in
juvenile risk assessment scores from the county-level analytic sample.
        Weighting was doubly necessary in the present context, as risk domain subscales on the
YLS/CMI are unequally sized. For example, the Leisure & Recreation subscale encompasses
three risk factors while the Personality & Behavior subscale encompasses seven. By failing to
weight domain sub-scores, cumulative risk estimates are inherently biased towards domains with
more risk factors (McNeish & Wolf, 2020). The Novel Scoring Algorithm weighted domain sub-
scores to ensure that their relative contribution to cumulative risk estimates was empirically
grounded, rather than reflective of arbitrary scale composition.
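        The size bias can be illustrated with a short worked equation. The notation is assumed here for illustration only: x_dj is the dichotomous indicator for item j in domain d, n_d is the number of items in domain d, and S is the unweighted cumulative score. Under unweighted summation, the most a domain can contribute is simply its item count, so the ratio of item counts is also the ratio of maximum influence.

```latex
% Unweighted cumulative score across D domains
S = \sum_{d=1}^{D} \sum_{j=1}^{n_d} x_{dj}, \qquad x_{dj} \in \{0, 1\}

% Maximum contribution of any one domain equals its item count
\max \sum_{j=1}^{n_d} x_{dj} = n_d

% Subscale sizes reported above: Personality & Behavior (7) vs. Leisure & Recreation (3)
\frac{n_{\text{Personality}}}{n_{\text{Leisure}}} = \frac{7}{3} \approx 2.33
```

        In other words, without domain weights the Personality & Behavior subscale can move the cumulative score more than twice as far as the Leisure & Recreation subscale, purely because of how many items each contains.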
Innovation II: Integrate Risk and Protective Factors
        Contemporary juvenile risk assessments are advantageous over other forms of risk
evaluation because they include a separate, complementary assessment of criminogenic
protective factors. While the protective factors identified can be used to facilitate wholistic case
planning, they are omitted from estimates of cumulative criminogenic risk in Scoring-as-Usual
procedures (Barnes-Lee, 2020). This creates significant challenges for interpretation and
eliminates the potential for protective factors to enhance risk evaluation: the extent to which the
presence of protective factors mitigates the influence of risk factors is unclear. To remedy this
challenge, the Novel Scoring Algorithm integrated risk and protective factors into a single
estimate of criminogenic risk.
                                          PLAN OF WORK
        In an effort to improve juvenile risk assessment measurement models, the overall objective
of this dissertation was to develop and evaluate a Novel Scoring Algorithm which innovates
upon the conventions of the Scoring-as-Usual method. In light of these innovations, it was
hypothesized that risk estimates generated by the Novel Scoring Algorithm would significantly
improve the measurement of criminogenic risk (i.e., model fit) and prediction of recidivism (i.e.,
diagnostic accuracy) of the juvenile risk assessment instrument, over and above the Scoring-as-
Usual method. To maximize responsivity to local criminogenic patterns, the Novel Scoring
Algorithm was informed by official risk assessment and recidivism records collected from a
county-level juvenile circuit court. Data sources and analytic methods are discussed below.
Methods
        Sample. The study drew upon archival risk assessment and recidivism records from a
pooled sample of 559 court-involved youth between the ages of 10 and 18 (mean [M] = 14.6
years, standard deviation [SD] = 1.4 years). Thirty-nine youths in the sample exceeded 16 years
of age at the time of initial risk assessment which, at the time of data collection, was the
maximum age of juvenile court jurisdiction. These 39 youth were initially petitioned to criminal
court, and later waived to juvenile court via prosecutorial discretion. They were ultimately
retained in the current analytic sample so that the present results encompass all youth who were
evaluated via the YLS/CMI during the window of data collection.
        Youth entered the supervision of a juvenile circuit court in a single mid-sized Midwestern
county via the truancy (34.2%) or delinquency (65.8%) division. The truancy division is a
specialized branch of the court which offers targeted services designed to remove barriers to
school attendance and promote educational success. Youth with chronically unexcused absences
were referred to truancy court by school truancy officers. Alternatively, youth processed in the
delinquency division came into court contact through conventional means (i.e., via arrest or
police referral), and were matched to individualized treatment plans designed to reduce
likelihood of future delinquent involvement. Despite these distinctions, youth in the truancy and
delinquency divisions participated in the same juvenile risk assessment process. Youth processed
through both divisions were retained in analyses to emulate the generalist application and
interpretation of juvenile risk assessment scores. However, to account for distinctions in the
selection and treatment of youth across divisions, post-hoc analyses examined differences in the
performance of the risk assessment instrument between truant and delinquent subsamples.
        The analytic sample represents all youth who were formally petitioned to juvenile court
and adjudicated as delinquent or truant for the first time between September 2015 and December
2018. Sample demographic and charge information, collected via self-report at the time of initial
risk assessment, is reported in Table 1. Ten youths in the analytic sample (1.8%) were missing
indicators of race/ethnicity. Indicators of race/ethnicity were found to be missing at random, as
these cases did not otherwise differ from the sample in cumulative number of risk factors, protective
factors, or recidivism rates. Accordingly, they were retained in all aggregated analyses, as well as in
gender and court division comparisons. However, they were omitted from analyses comparing
risk assessment performance across racial/ethnic cohorts.
   Table 1.
   Demographics and charge types of study sample.
   Characteristic                        N (%)
   Gender
       Girls                             194 (34.7%)
       Boys                              365 (65.3%)
   Race/Ethnicity
       Caucasian/White                   149 (26.7%)
       Hispanic/Latinx                   57 (10.2%)
       African American/Black            231 (41.3%)
       Multi-Racial                      107 (19.1%)
       Other                             5 (<1.0%)
   Charge Type
       Status                            204 (36.5%)
       Property                          151 (27.0%)
       Person                            117 (20.9%)
       Sex                               29 (5.2%)
       Public Ordinance                  24 (4.3%)
       Weapon                            14 (2.5%)
       Drug                              7 (1.3%)
       Other                             9 (1.6%)
        Measures. The core constructs of the proposed study were represented by the risk
assessment and recidivism measures utilized by the county-level juvenile circuit court at the time
of data collection.
        The Youth Level of Service/Case Management Inventory (YLS/CMI) is an adapted
youth version of the Level of Service Inventory – Revised (LSI-R), an instrument designed to
evaluate criminogenic risk in court-involved adults (Andrews & Bonta, 2010). The psychometric
properties of the YLS/CMI have been rigorously investigated (see Andrews et al., 1986; Shields
& Simourd, 1991; Simourd et al., 1991, 1994); across these studies, 41 items have been retained,
having consistently demonstrated significant correlation with juvenile reoffending. Using factor
analysis, these items have been grouped into eight domains of criminogenic risk: Prior
Dispositions/Offenses, Education, Leisure & Recreation, Attitudes & Orientation, Personality &
Behavior, Peer Relations, Substance Abuse, and Family & Parenting. In line with criminological
theory, these domains represent both static and dynamic characteristics of the youth and their
proximal social environment (OJJDP, 2015).
        Each of the eight domains on the YLS/CMI includes between 3 and 7 risk factors, which
are scored dichotomously using a set of concretized, pre-determined criteria (see Appendices A
and B for a list of YLS/CMI items and intradomain bivariate correlations between items). In the
current research setting, youth are classified at one of three risk levels based upon the cumulative
number of unweighted risk factors identified across domains: low risk (8 or fewer risk factors),
moderate risk (9 to 22 risk factors), or high risk (23 or more risk factors). This risk level
classification informs several judicial decisions, including eligibility for diversion, duration of
court supervision, and level of restrictiveness in sanctions. Descriptive information regarding
sample risk scores is presented in Table 2.
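        The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the court's software: the function and constant names are hypothetical, the cut points are the ones reported for the current research setting, and the example responses are invented.

```python
from typing import Iterable

# Cut points reported for the current research setting.
LOW_MAX = 8        # 8 or fewer risk factors -> low risk
MODERATE_MAX = 22  # 9 to 22 risk factors   -> moderate risk; 23 or more -> high risk


def scoring_as_usual(item_flags: Iterable[int]) -> int:
    """Unweighted count of dichotomous (0/1) risk factors that were endorsed."""
    return sum(1 for flag in item_flags if flag)


def risk_level(total: int) -> str:
    """Map a cumulative risk score onto the three risk level classifications."""
    if total <= LOW_MAX:
        return "low"
    if total <= MODERATE_MAX:
        return "moderate"
    return "high"


# Hypothetical youth with 10 endorsed risk factors out of 15 items shown.
example_flags = [1] * 10 + [0] * 5
total = scoring_as_usual(example_flags)
print(total, risk_level(total))  # prints: 10 moderate
```

        Because every endorsed item adds exactly one point, two youths with very different risk profiles receive the same classification whenever their item counts match.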
        The Protective Factors for Reducing Juvenile Reoffending (PFRJR) is a novel strengths-
based assessment. It was developed in response to a growing desire to understand how prosocial
characteristics may reduce odds of reoffending and increase responsivity to treatment in court-
involved youth (Barnes-Lee & Campbell, 2020). The 22-item scale maps on to seven of the eight
domains of the YLS/CMI (Prior Dispositions/Offenses, Education, Leisure & Recreation,
Attitudes & Orientation, Personality & Behavior, Peer Relations, Substance Abuse, and Family
& Parenting), and includes an additional domain identifying community-level strengths (see
Appendices A and B for a list of PFRJR items and intradomain bivariate correlations between
items) (Barnes-Lee, 2020). The factor structure of the PFRJR was confirmed via cross-validation
and found to have strong internal consistency (Barnes-Lee, 2020). Like the YLS/CMI, the
PFRJR is administered as part of the initial risk assessment and scored as a summative checklist
of dichotomous protective factor items. Importantly, scores from the PFRJR are not integrated
into estimates of cumulative criminogenic risk using the Scoring-as-Usual method, and do not
influence the youth’s risk level classification. Rather, PFRJR scores are used at case managers’
discretion to enhance treatment responsivity. Descriptive information regarding sample
protective factor scores is presented in Table 2.
        The Novel Scoring Algorithm and Scoring-as-Usual method were evaluated, in part,
based on their diagnostic accuracy in correctly classifying youth as recidivant or desistant.
Recidivism here refers to any additional adult or juvenile petitions (including felonies,
misdemeanors, or criminal violations of probation) received within the 24 months immediately
following the date of the youth’s initial risk assessment. Given that the sample encompasses
youth whose initial risk assessment took place between September 2015 and December 2018, the
two-year window for measuring recidivism concluded in December 2020. Table 2 summarizes
sample recidivism rates, both overall and by risk level classification.
 Table 2.
 Sample YLS/CMI scores, PFRJR scores, and recidivism rates.
                                                              Risk Classification N (%)
                       Mean (SD)        Range        Low Risk      Moderate Risk        High Risk
 YLS/CMI Score          16.1 (6.8)       1-42       80 (14.3%)      368 (66.1%)        108 (19.4%)
 PFRJR Score             7.6 (5.2)       0-22            –                –                  –
                                Overall              Low Risk      Moderate Risk        High Risk
 Recidivism Rate                 44.4%                 30.0%            44.8%             54.6%
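        The 24-month follow-up rule used to define recidivism can be expressed as a small date check. This sketch assumes that petitions are available as calendar dates and that the window closes on the same calendar day 24 months after the initial assessment, inclusive. Neither boundary convention is specified in the text, and the function names are hypothetical.

```python
import calendar
from datetime import date
from typing import Iterable


def add_months(d: date, months: int) -> date:
    """Return the date `months` later, clamping the day for shorter months."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_recidivant(assessment_date: date,
                  petition_dates: Iterable[date],
                  months: int = 24) -> bool:
    """True if any new petition falls after the assessment and within the window.

    Assumption: the window is (assessment_date, assessment_date + months],
    i.e. the closing day itself still counts.
    """
    cutoff = add_months(assessment_date, months)
    return any(assessment_date < p <= cutoff for p in petition_dates)


# A youth assessed in December 2018 is followed through December 2020.
print(add_months(date(2018, 12, 15), 24))  # prints: 2020-12-15
```

        Applying one fixed window to every youth keeps the follow-up period equal across the sample, regardless of when the initial assessment took place.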
        Data Collection. Official risk assessment and recidivism records were obtained through a
collaborative research partnership involving Michigan State University and the county-level
juvenile circuit court.
        The initial YLS/CMI and PFRJR were administered together via structured interview
between a highly trained case manager and a recently adjudicated youth. Case managers scored
each item dichotomously based upon youths’ self-report, using a set of predetermined criteria.
Criteria operationalize each risk and protective factor into concrete, easily identifiable terms. For
example, to fulfill the criteria for chronic drug use, youth must disclose illegal drug use at least
twice per week or have a drug-related problem in one or more major life areas (e.g., drug-related
arrest, school/employment citations, withdrawal symptoms). Case managers calculated
cumulative risk scores using the Scoring-as-Usual method. Each risk assessment was reviewed
by a trained research assistant for quality and completion, and entered into a secure database
housed on the court’s computers. All identifying information was removed from the risk
assessment records prior to analysis.
         Recidivism records include any additional juvenile or adult petitions accrued within two
years of the initial risk assessment date. Adult petitions were obtained and synthesized with
juvenile petitions through an integrated data management system involving the criminal and
juvenile branches of the county circuit court. This information is obtained, merged with risk
assessment records, and evaluated by the Michigan State University research team on an annual
basis.
         Analytic Plan. This dissertation was executed in three sequential analytic phases:
development, evaluation, and cohort comparison. All latent variable modeling was conducted in
Mplus Version 8.5 (Muthén & Muthén, 2020). Diagnostic testing was conducted in SPSS
Version 27, and post-hoc comparisons of diagnostic accuracy were performed in R (v.4.1.2; R
Core Team, 2022).
         The development of the Novel Scoring Algorithm was achieved through second-order
Confirmatory Factor Analysis (CFA). CFA is a statistical technique used to examine how latent
factors influence responses on measured variables (Kline, 2016; Takane & Deleeuw, 1987).
Second-order CFA is utilized when a general construct (i.e., criminogenic risk) accounts for the
variation between the latent factors (i.e., domains on the YLS/CMI and PFRJR) (Gould, 2015).
CFA is best suited to addressing the specific aim, as opposed to other methods of factor analysis
(e.g., Exploratory Factor Analysis, Principal Components Analysis), because the factor structure
was specified a priori by the nine domains on the YLS/CMI and the PFRJR. CFA is the only
method of factor analysis which analyzes multi-dimensional constructs based on established
theory (Kline, 2016).
        The execution of the second-order CFA model included two levels. At the first level, the
64 collective item indicators on the YLS/CMI and PFRJR were loaded onto nine first-order
factors, reflective of the nine domains of the YLS/CMI and PFRJR (i.e., Prior
Dispositions/Offenses, Education, Leisure & Recreation, Attitudes & Orientation, Personality &
Behavior, Peer Relations, Substance Abuse, Family & Parenting, and Community) (see Figure
1). Factor loadings among the first-order factors (represented by single-arrowed lines between
the nine latent domains and observed item indicators) indicated the magnitude and direction of
the association between each item and its corresponding domain. Factor loadings from the first-
order factors were used as assessment item weights in the Novel Scoring Algorithm.
        At the second level, the nine first-order factors were loaded onto one second-order factor,
reflecting cumulative criminogenic risk. Factor loadings from the second-order factor
(represented by the single-arrowed lines between criminogenic risk and the nine latent domains)
indicated the magnitude and direction of the association between each domain and cumulative
criminogenic risk. Factor loadings from the second-order factor weighted domain scores in the
Novel Scoring Algorithm. Weighting scores at both the item- and domain-level was necessary,
given the variation in number of items per domain. Failing to adjust the weight of the domain
scores would bias cumulative estimates toward the domains with more assessment items.
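        The two-level weighting described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the item names, domain names, and loading values are hypothetical, and the multiply-and-sum combination is only one plausible reading of how first- and second-order loadings act as weights. Protective factor items (suffix "PF") are given negative loadings here purely to show how the direction of a loading could lower the cumulative estimate.

```python
from typing import Dict

# Hypothetical first-order loadings (item weights), grouped by domain.
ITEM_LOADINGS: Dict[str, Dict[str, float]] = {
    "Leisure & Recreation": {
        "no_organized_activities": 0.8,
        "could_use_time_better": 0.6,
        "has_hobby_PF": -0.5,
    },
    "Peer Relations": {
        "delinquent_friends": 0.9,
        "prosocial_friends_PF": -0.4,
    },
}

# Hypothetical second-order loadings (domain weights).
DOMAIN_LOADINGS: Dict[str, float] = {
    "Leisure & Recreation": 0.7,
    "Peer Relations": 0.9,
}


def weighted_domain_score(responses: Dict[str, int],
                          item_loadings: Dict[str, float]) -> float:
    """Sum of 0/1 item responses, each multiplied by its first-order loading."""
    return sum(loading * responses[item] for item, loading in item_loadings.items())


def weighted_total(responses: Dict[str, int]) -> float:
    """Sum of domain scores, each multiplied by its second-order loading."""
    return sum(
        DOMAIN_LOADINGS[domain] * weighted_domain_score(responses, loadings)
        for domain, loadings in ITEM_LOADINGS.items()
    )


# Hypothetical youth: two risk factors and one protective factor endorsed.
responses = {
    "no_organized_activities": 1,
    "could_use_time_better": 0,
    "has_hobby_PF": 1,
    "delinquent_friends": 1,
    "prosocial_friends_PF": 0,
}
score = weighted_total(responses)
print(round(score, 2))
```

        In this toy example the endorsed protective factor offsets part of the Leisure & Recreation risk, and each domain's contribution is then scaled by its own weight rather than by its item count.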
Figure 1.
Factor model describing the relationship between assessment items and domains on the
YLS/CMI and PFRJR.
Note. Items ending in “PF” denote protective factors.
        The Scoring-as-Usual method involves summing all unweighted risk factors identified.
While sum scoring and factor analysis are typically viewed as competing methods, McNeish and
Wolf (2020) argue that both are forms of latent variable modeling. Accordingly, the Scoring-as-
Usual model was estimated as a constrained form of the Novel Scoring Algorithm, wherein all
first- and second-order factor loadings of the YLS/CMI were set to one. The PFRJR was omitted
from this model because protective factors are not included in composite estimates of risk using
the Scoring-as-Usual method.
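        The constraint can be written out in generic second-order factor notation. The symbols below are assumed for illustration, and the linear form is a simplification: a model for dichotomous items would typically link items to factors through a threshold or probit specification rather than a direct linear equation.

```latex
% First-order level: item j for youth i loads on its domain factor d(j)
x_{ij} = \lambda_{j}\,\eta_{i,d(j)} + \varepsilon_{ij}

% Second-order level: each domain factor loads on cumulative criminogenic risk
\eta_{id} = \gamma_{d}\,\xi_{i} + \zeta_{id}

% Scoring-as-Usual constraint, applied to YLS/CMI items and domains only
\lambda_{j} = 1 \;\; \forall j, \qquad \gamma_{d} = 1 \;\; \forall d

% Observed score implied by equal weighting
S_{i} = \sum_{d} \sum_{j \in d} x_{ij}
```

        Framed this way, the two scoring methods differ only in whether the loadings are fixed to one or freely estimated, which allows their fit to be compared on the same indices.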
         The Novel Scoring Algorithm and the Scoring-as-Usual method were evaluated on
several criteria: (1) absolute fit, indicating how well the models explained variation in the
observed data (Kline, 2016); (2) relative or incremental fit, indicating how well the models
improved fit of the data relative to a null model (Kline, 2016); and (3) diagnostic accuracy,
measuring the performance of the models in correctly classifying youth as recidivant or desistant
(Rice & Harris, 2005). Given the known penalties of basing evaluation on one indicator alone,
multiple fit indices were calculated and interpreted. Table 3 describes the evaluation metrics and
their criteria for acceptable fit.
 Table 3.
 Latent variable model evaluation metrics and criteria for acceptable fit.
                                                                                      Criteria for
                             Type of                                                  Acceptable
          Index               Index                      Description                      Fit
 Chi-Square Fit           Absolute fit     Indicates discrepancy in the covariance  p > 0.05
 Index                                     and mean matrices between the
                                           specified model and the observed data.
                                           Non-significant test statistics indicate
                                           low discrepancy, signaling good fit
                                           (McNeish & Wolf, 2020)
 Standardized Root        Absolute fit     Badness-of-fit index; indicates the      SRMR < 0.10
 Mean Square                               overall discrepancy between the
 Residual (SRMR)                           observed and predicted variable
                                           correlations (Kline, 2016)
 Root Mean Square         Absolute fit     Badness-of-fit index; indicates the      RMSEA <
 Error of                                  discrepancy between the specified and    0.08
 Approximation                             observed covariance matrices, adjusting
 (RMSEA)                                   for model complexity (Cangur & Ercan,
                                           2015)
 Comparative Fit            Relative fit     Goodness-of-fit index; indicates         CFI ≥ 0.90
 Index/Tucker Lewis                          whether the specified model improves
 Index (CFI/TLI)                             the fit of the data by 90%, relative to
                                             the null model (Kline, 2016)
 Area Under the             Clinical         Performance metric for model             AUC = 0.55,
 Receiver Operating         performance      discrimination; indicates the specified  0.64, & 0.71
 Characteristic                              model’s diagnostic accuracy in           for small,
 (AUROC)                                     predicting outcomes over chance (Rice    moderate, and
                                             & Harris, 2005)                          large effect
                                                                                      sizes
         Composite scores yielded by juvenile risk assessments may have differential implications
for recidivism by virtue of youths’ demographic (e.g., race/ethnicity, gender) and charge-related
(e.g., truancy, delinquency) characteristics. Differential diagnostic performance directly inhibits
fair and equitable court decision-making (Anderson et al., 2016; Campbell et al., 2018; Onifade
et al., 2009). To detect variation in diagnostic accuracy, a series of cohort comparisons were
conducted for youth across gender, race/ethnicity, and division of the court (i.e., truancy,
delinquency). Within-cohort comparisons were conducted through a series of pairwise tests of
independent-group area differences, performed on both the Novel Scoring Algorithm and the
Scoring-as-Usual method. Between-model comparisons were conducted through a series of
DeLong tests (DeLong et al., 1988).
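A within-cohort comparison of two independent groups can be sketched as a z-test on the difference between their AUCs. The text does not name the standard-error estimator that was used, so the Hanley and McNeil (1982) approximation below is an assumption, and the group sizes and AUC values in the usage lines are hypothetical placeholders rather than study data.

```python
from math import erfc, sqrt


def auc_standard_error(auc, n_pos, n_neg):
    """Hanley & McNeil (1982) approximation to the standard error of an AUC.

    n_pos is the number of recidivant cases and n_neg the number of desistant cases.
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(variance)


def compare_independent_aucs(auc_1, n_pos_1, n_neg_1, auc_2, n_pos_2, n_neg_2):
    """Two-sided z-test for the difference between AUCs from independent groups."""
    se_1 = auc_standard_error(auc_1, n_pos_1, n_neg_1)
    se_2 = auc_standard_error(auc_2, n_pos_2, n_neg_2)
    z = (auc_1 - auc_2) / sqrt(se_1 ** 2 + se_2 ** 2)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p from the standard normal
    return z, p_value


# Hypothetical cohorts: group 1 (80 recidivant, 100 desistant), group 2 (200, 150).
z, p_value = compare_independent_aucs(0.60, 80, 100, 0.66, 200, 150)
```

Because the two cohorts contain different youth, their AUC estimates are treated as independent and the variances simply add; the paired DeLong test is reserved for comparing two models scored on the same youth.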
         These metrics of absolute fit, relative fit, and diagnostic accuracy were used to identify
and select the model that most accurately measures criminogenic risk in court-involved youth.
For the purpose of this inquiry, in order for the Novel Scoring Algorithm to be retained as a
procedure for estimating likelihood of recidivism, it must improve metrics in all three domains
(i.e., absolute fit, relative fit, diagnostic accuracy) over the Scoring-as-Usual method.
                                               RESULTS
Phase Ia: Development of Novel Scoring Algorithm
         Congruent with the multidimensional structure of the YLS/CMI and PFRJR, the Novel
Scoring Algorithm loaded 63 discrete binary assessment items (i.e., the risk and protective factor
items) onto nine latent first-order factors (i.e., the risk and protective factor domains). The nine
latent first-order factors were subsequently loaded onto a single second-order factor, reflective of
composite criminogenic risk. The first factor loading for each first- and second-order latent
variable was constrained to one, representing the unit loading identification (ULI) constraint
(Kline, 2016). ULI constraints scale the latent factors to the YLS/CMI’s and PFRJR’s units of
measurement. In a single sample analysis, the indicator selected as the ULI is arbitrary and holds
no bearing on the fit of the model (Kline, 2016). All latent variable means, intercepts, and error
variances were freely estimated, meaning that they reflected the corresponding parameters
observed in the data.
         To account for redundancy between discrete variables, modification indices
recommended covariances between the following pairs of items: disruptive classroom behavior
with problems with teachers (r(557) = 0.57, p < 0.05), passing (protective factor) with low
achievement (r(557) = -0.66, p < 0.05), involvement in organized activities (protective factor)
with lack of organized activities (r(557) = -0.84, p < 0.05), lack of positive friends with lack of
positive peer acquaintances (r(557) = 0.57, p < 0.05), some delinquent friends with some
delinquent peer acquaintances (r(557) = 0.64, p < 0.05), consistent supervision (protective factor)
with inadequate supervision (r(557) = 0.58, p < 0.05), and actively seeking help with not seeking
help (r(557) = 0.50, p < 0.05). It is ill-advised to add paths to the model based upon modification
indices without first consulting theory. Nevertheless, these covariances are logically justified, as
the item pairs represent the same characteristic at different intensities (e.g., delinquent peers
versus delinquent friends) or the same characteristic at opposite ends of the spectrum (e.g.,
actively seeking help versus not seeking help). The theoretical relationship between these items
is further confirmed by the strong positive and inverse correlations observed between each item
pair (see Appendix B). A summary of the modified model, including all first- and second-order
factor loadings, is presented in Appendix C.
         Using the Novel Scoring Algorithm, possible composite estimates of criminogenic risk
ranged from -39.67 (i.e., having all protective factors and no risk factors) to 53.86 (i.e., having
all risk factors and no protective factors). However, observed estimates within the sample ranged
from -36.14 to 38.14 (M = 8.26; SD = 16.64 points). Table 4 describes the sample scores
rendered by the Novel Scoring Algorithm, in aggregate and across gender, racial/ethnic, and
court division cohorts.
Phase Ib: Development of Scoring-as-Usual Method
         The Scoring-as-Usual model loaded the 41 items of the YLS/CMI onto eight latent first-
order factors. The ninth first-order factor, as modeled in the Novel Scoring Algorithm, represents
the Community domain on the PFRJR, which is composed solely of protective factors. The eight
first-order factors were in turn loaded onto one second-order factor, reflective of composite
criminogenic risk. Based upon modification indices, the Scoring-as-Usual method additionally
specified covariances between the three following pairs of assessment items: disruptive
classroom behavior with problems with teachers, some delinquent friends with some delinquent
acquaintances, and lack of positive friends with lack of positive peer acquaintances. The four
additional covariances specified in the Novel Scoring Algorithm were not relevant to the
Scoring-as-Usual model, as they involved protective factors.
        To emulate the process of summing the unweighted total of all YLS/CMI assessment
items, all factor loadings of the Scoring-as-Usual model were fixed to one. A
summary of the model, including all first- and second-order factor loadings, is presented in
Appendix D. Using the Scoring-as-Usual method, possible composite estimates of criminogenic
risk ranged from 0 (i.e., having no risk factors) to 41 (i.e., having all risk factors). However,
observed estimates within the sample ranged from 1 to 35, with a mean score of 16.18 and a
standard deviation of 6.76 points. Table 4 describes the sample scores rendered by the Scoring-
as-Usual method, in aggregate and across gender, racial/ethnic, and court division cohorts.
  Table 4.
  Composite risk estimates using the Novel Scoring Algorithm and Scoring-as-Usual method.
                                    Novel Scoring Algorithm
                                   Mean Risk Score Standard Deviation                    Range
  All Youth                                8.26                  16.64              -36.14 – 38.14
  Gender
      Girls                                8.32                  16.16              -29.61 – 38.14
      Boys                                 8.22                  16.91              -36.14 – 37.02
  Race/Ethnicity
      African American/Black              10.61                  15.96              -34.16 – 38.14
      Caucasian/White                      3.80                  17.50              -36.14 – 35.16
      Multi-Racial                        10.23                  16.21              -28.99 – 36.86
      Hispanic/Latinx                      8.77                  14.39              -31.99 – 27.95
      Other                              -11.23                  21.40              -34.08 – 23.97
  Division
      Delinquency                          8.81                  17.33              -36.14 – 38.14
      Truancy                              7.19                  15.10              -28.99 – 34.41
                                    Scoring-as-Usual Method
                                   Mean Risk Score Standard Deviation                    Range
  All Youth                               16.18                   6.76                   1 – 35
  Gender
      Girls                               15.68                   6.63                   2 – 31
      Boys                                16.45                   6.83                   1 – 35
  Race/Ethnicity
      African American/Black              17.60                   6.70                   2 – 35
      Caucasian/White                     14.12                   6.01                   1 – 29
      Multi-Racial                        16.62                   6.63                   1 – 29
      Hispanic/Latinx                     15.63                   6.85                   2 – 25
      Other                               10.60                   6.11                   2 – 19
 Division
     Delinquency                           17.03                 6.94                  1 – 35
     Truancy                               14.56                 6.10                  1 – 30
Comparing Risk Scores Across Models
         A Pearson correlation was conducted to estimate the linear relationship between the
composite risk scores generated by the two measurement models. The correlation was strong,
positive, and statistically significant (r(557) = 0.91, p < 0.01), indicating that only 17.19% (s = 1 –
r²) of the variance in scores differed between the Novel Scoring Algorithm and the Scoring-as-
Usual method. Figure 2 below visualizes the strong, positive linear relationship between scores
generated by the two models.
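The unshared-variance figure follows directly from the reported correlation. As a brief worked check (variable names are illustrative):

```python
r = 0.91                # reported Pearson correlation between the two sets of scores
shared = r ** 2         # proportion of variance the two scoring methods share
unshared = 1 - shared   # proportion of variance that differs between methods

print(f"shared: {shared:.2%}, unshared: {unshared:.2%}")
```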
         Taken together, these results indicate that youths’ relative position on the spectrum of
criminogenic risk remained largely unchanged, regardless of the scoring method employed. The
high degree of shared variance between the two models highlights the strong, inverse
relationship between risk and protective factors. In other words, the youth whose risk scores
were lowered by protective factors using the Novel Scoring Algorithm were already low risk
using the Scoring-as-Usual method. Concurrently, patterns in unit weighting reflect the observed
relationship between each assessment item and its intradomain constituents. As a result, the
youth whose risk scores were raised by heavily weighted items using the Novel Scoring
Algorithm were already high risk using the Scoring-as-Usual method.
Figure 2.
Relationship between composite risk scores estimated by the Novel Scoring Algorithm and
Scoring-as-Usual method.
[Scatterplot. X-axis: Scoring-as-Usual Method (0 to 40). Y-axis: Novel Scoring Algorithm (-40 to 50).]
Note. Open circles denote youth who desisted and Xs denote youth who recidivated.
Phase II: Evaluation of Scoring Methods
         Absolute Fit. Absolute fit indices are used to evaluate how well a latent variable model
explains variation in the observed data (Kline, 2016). The Novel Scoring Algorithm and the
Scoring-as-Usual method were compared on absolute fit using the chi-square fit index, the
Standardized Root Mean Square Residual (SRMR), and the Root Mean Square Error of
Approximation (RMSEA) (see Table 3 for description of fit indices and thresholds for acceptable
fit).
        Likely penalized by the large sample size, both the Novel Scoring Algorithm and the
Scoring-as-Usual method yielded significant chi-square values (Novel Scoring Algorithm:
χ²(1,874) = 3,072.35, p < 0.05; Scoring-as-Usual: χ²(808) = 1,932.63, p < 0.05) (Bentler & Bonett,
1980). Significant chi-square values indicate discrepancies between the covariance matrices in
the specified models and the observed data (McNeish & Wolf, 2020). Additionally, both models
exceeded the threshold for acceptable SRMR (Novel Scoring Algorithm: SRMR = 0.11; Scoring-
as-Usual: SRMR = 0.17), indicating significant discrepancies between the observed and
predicted variable correlations (Kline, 2016).
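The degrees of freedom attached to each chi-square statistic can be cross-checked from the number of observed items and the number of free parameters listed in Table 5. The sketch below assumes that degrees of freedom are counted from the covariance structure alone (unique variances and covariances minus free parameters); the function name is illustrative.

```python
def model_df(n_items, n_free_params):
    """Unique variances and covariances among the items, minus free parameters."""
    n_moments = n_items * (n_items + 1) // 2
    return n_moments - n_free_params


novel_df = model_df(n_items=63, n_free_params=142)  # 2,016 moments - 142 parameters
usual_df = model_df(n_items=41, n_free_params=53)   # 861 moments - 53 parameters
```

Under this assumption the results match the degrees of freedom tabulated in Table 5 (1,874 and 808).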
        However, both models yielded acceptable RMSEA values (Novel Scoring Algorithm:
RMSEA = 0.03; Scoring-as-Usual: RMSEA = 0.05), indicating that, after adjusting for model
complexity, the specified and observed covariance matrices in both models were comparable.
Even though both models yielded acceptable RMSEA values, the constraints imposed in the
Scoring-as-Usual method significantly worsened the absolute fit of the data (ΔRMSEA > 0.015)
(Chen, 2007; Cheung & Rensvold, 2002). Taken together, the analysis of absolute fit indicated
that the Novel Scoring Algorithm better explained variation in the observed data over and above
the Scoring-as-Usual method.
         Relative Fit. Relative or incremental fit indices are used to evaluate how well a latent
variable model improves fit of the data over a null model (Kline, 2016). The Novel Scoring
Algorithm and the Scoring-as-Usual method were compared on relative fit using the
Comparative Fit Index/Tucker Lewis Index (CFI/TLI) (see Table 3 for description of fit indices
and thresholds for acceptable fit). Only the Novel Scoring Algorithm yielded acceptable relative
fit (CFI/TLI = 0.94), indicating that the estimated model improved overall fit by 94% over a null
model. Conversely, the Scoring-as-Usual method failed to achieve acceptable fit (CFI/TLI =
0.83). Using thresholds recommended by Chen (2007) and Cheung & Rensvold (2002), the
constraints imposed in the Scoring-as-Usual method significantly worsened the relative fit of the
data (ΔCFI/TLI > 0.01). Taken together, the analysis of relative fit indicated that the Novel
Scoring Algorithm yielded greater improvement over a null model when compared to the
Scoring-as-Usual method. Both absolute and relative fit indices for the measurement models are
summarized in Table 5.
 Table 5.
 Relative and absolute fit indices for the Novel Scoring Algorithm and Scoring-as-Usual
 method.
                            # param    χ² Est.     df       p     SRMR   RMSEA   CFI/TLI
 Novel Scoring Algorithm      142     3,072.35    1,874    .00     .11     .03      .94
 Scoring-as-Usual              53     1,932.63      808    .00     .17     .05      .83
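The decision rules from Table 3 can be applied mechanically to the indices in Table 5. The sketch below is one hedged way to do so; the dictionary keys and function names are illustrative, the p-values are entered as printed (.00), and the difference cutoffs (0.015 for RMSEA, 0.01 for CFI/TLI) are those cited in the text from Chen (2007) and Cheung and Rensvold (2002).

```python
# Thresholds for acceptable fit, as listed in Table 3.
THRESHOLDS = {
    "chi_square_p": lambda v: v > 0.05,
    "srmr": lambda v: v < 0.10,
    "rmsea": lambda v: v < 0.08,
    "cfi_tli": lambda v: v >= 0.90,
}

# Fit indices as reported in Table 5.
MODELS = {
    "Novel Scoring Algorithm": {
        "chi_square_p": 0.00, "srmr": 0.11, "rmsea": 0.03, "cfi_tli": 0.94,
    },
    "Scoring-as-Usual": {
        "chi_square_p": 0.00, "srmr": 0.17, "rmsea": 0.05, "cfi_tli": 0.83,
    },
}


def acceptable_fit(indices):
    """Return True/False per index according to the Table 3 thresholds."""
    return {name: THRESHOLDS[name](value) for name, value in indices.items()}


results = {model: acceptable_fit(indices) for model, indices in MODELS.items()}

# Change in fit when the Scoring-as-Usual constraints are imposed.
delta_rmsea = MODELS["Scoring-as-Usual"]["rmsea"] - MODELS["Novel Scoring Algorithm"]["rmsea"]
delta_cfi = MODELS["Novel Scoring Algorithm"]["cfi_tli"] - MODELS["Scoring-as-Usual"]["cfi_tli"]
rmsea_worsened = delta_rmsea > 0.015
cfi_worsened = delta_cfi > 0.01
```

Applied this way, both models pass on RMSEA and fail on chi-square and SRMR, and only the Novel Scoring Algorithm passes on CFI/TLI.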
        Diagnostic Accuracy. The final step in the evaluation process compared the diagnostic
accuracy of the measurement models in correctly classifying youth as either recidivant (i.e.,
received one or more petitions in the two years post-initial risk assessment) or desistant (i.e., did
not receive any petitions in the two years post-initial risk assessment) (Rice & Harris, 2005).
Overall diagnostic accuracy was assessed using Area Under the Curve (AUC) values derived
from a univariate logistic regression predicting recidivism from composite estimates of
criminogenic risk. The direction and magnitude of risk misclassification was further probed
through an analysis of model sensitivity (i.e., true positive rate) and 1 – specificity (i.e., false
positive rate). Results for the Novel Scoring Algorithm and Scoring-as-Usual method are
presented below.
        Drawing upon the study sample of 559 youth, composite criminogenic risk scores derived
from the Novel Scoring Algorithm significantly predicted recidivism in the first two
years following the youths’ initial contact with the court. The estimated odds ratio indicated that
a one-unit increase in composite risk score increased the odds of recidivism by 3.00%
(Exp[B] = 1.03, p < 0.01, 95% CI [1.02, 1.04]) (see Table 6 for model summary).
         Next, a confusion matrix was generated to assess the accuracy of the Novel Scoring
Algorithm as a classification method. Overall, the model correctly classified 59.53% of court-
involved youths as either recidivant or desistant. Model sensitivity was 0.59, indicating that
59.43% of youth who reoffended were correctly classified as recidivant (i.e., “true positives”).
Additionally, 1 – specificity was 0.54, indicating that 53.58% of youth who did not reoffend
were incorrectly classified as recidivant (i.e., “false positives”). The Area Under the Curve
(AUC) index, which summarizes the overall diagnostic accuracy of the Novel Scoring
Algorithm, was 0.64, which equates to a moderate effect size in violence risk assessment
literature (Rice & Harris, 2005).
         Drawing upon the same sample of 559 youth, estimates of criminogenic risk derived
from the Scoring-as-Usual method significantly predicted recidivism in the first two years
following the youths’ initial contact with the court. The estimated odds ratio indicated that each
additional unweighted risk factor increased the odds of recidivism by 8.00% (Exp[B] =
1.08, p < 0.01, 95% CI [1.05, 1.11]) (see Table 6 for model summary).
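The link between a logistic regression coefficient, its odds ratio, and a predicted probability can be sketched as follows. The coefficients are the rounded values printed in Table 6 and the mean scores come from Table 4, so results are approximate; the function and variable names are illustrative.

```python
from math import exp


def odds_ratio(b):
    """Multiplicative change in the odds of recidivism per one-unit score increase."""
    return exp(b)


def predicted_probability(intercept, slope, score):
    """Logistic transformation of the linear predictor."""
    return 1 / (1 + exp(-(intercept + slope * score)))


novel = {"intercept": -0.15, "slope": 0.03, "mean_score": 8.26}
usual = {"intercept": -1.18, "slope": 0.08, "mean_score": 16.18}

novel_or = odds_ratio(novel["slope"])   # about 1.03, i.e. roughly 3% higher odds per point
usual_or = odds_ratio(usual["slope"])   # about 1.08, i.e. roughly 8% higher odds per point

# Predicted probability of recidivism for a youth at the sample mean score.
novel_p = predicted_probability(novel["intercept"], novel["slope"], novel["mean_score"])
usual_p = predicted_probability(usual["intercept"], usual["slope"], usual["mean_score"])
```

The odds ratio describes a change in odds rather than in probability. Because the two scales differ (one point on the weighted composite is not one additional risk factor), the two odds ratios are not directly comparable.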
         Next, a confusion matrix was generated to assess the accuracy of the Scoring-as-Usual
procedure as a classification method. Overall, the model correctly classified 61.33% of court-
involved youths as either recidivant (i.e., received at least one petition in the two years following
initial court contact) or desistant (i.e., did not receive any additional petitions in the two years
following initial court contact). Model sensitivity was 0.62, indicating that 61.66% of youth who
did reoffend were correctly classified as recidivant (i.e., “true positives”). However, 1 –
specificity was 0.47, indicating that 47.17% of youth who did not reoffend were incorrectly
classified as recidivant (i.e., “false positives”). The Area Under the Curve (AUC) index for the
Scoring-as-Usual method was 0.65, which again falls within disciplinary standards of a moderate
effect size (Rice & Harris, 2005).
 Table 6.
 Diagnostic accuracy of the Novel Scoring Algorithm and Scoring-as-Usual method.
                                       Novel Scoring Algorithm
                                                                               95% CI for Odds
                                                                                     Ratio
                                       B (SE)          Wald    Odds Ratio     Lower        Upper
 Intercept                         -0.15** (0.09)      2.40       0.86
 Criminogenic Risk Score            0.03** (0.01)     30.26       1.03          1.02        1.04
                                          Predicted Desistant             Predicted Recidivant
 Observed Desistant                               123                              142
 Observed Recidivant                               83                              208
                                            Scoring-as-Usual
                                                                               95% CI for Odds
                                                                                     Ratio
                                       B (SE)          Wald    Odds Ratio     Lower        Upper
 Intercept                         -1.18** (0.23)     25.32
 Criminogenic Risk Score            0.08** (0.01)     34.07       1.08          1.05        1.11
                                          Predicted Desistant             Predicted Recidivant
 Observed Desistant                               140                              125
 Observed Recidivant                               90                              201
 **p < 0.01
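As a cross-check on the confusion matrices in Table 6, the sketch below recomputes the classification rates from the printed cell counts. The function and key names are illustrative. With these counts, overall accuracy and the false positive rate reproduce the values reported in the text; the reported 0.59 and 0.62 correspond to TP / (TP + FP), whereas TP / (TP + FN) evaluates to roughly 0.71 and 0.69.

```python
def classification_metrics(tn, fp, fn, tp):
    """Summary rates from a 2x2 confusion matrix (positive = recidivant)."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "true_positive_rate": tp / (tp + fn),        # observed recidivants flagged
        "false_positive_rate": fp / (fp + tn),       # observed desistants flagged
        "positive_predictive_value": tp / (tp + fp), # flagged youth who recidivated
    }


# Cell counts as printed in Table 6.
novel = classification_metrics(tn=123, fp=142, fn=83, tp=208)
usual = classification_metrics(tn=140, fp=125, fn=90, tp=201)

for name, metrics in (("Novel Scoring Algorithm", novel), ("Scoring-as-Usual", usual)):
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```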
        The AUC estimates yielded by the Novel Scoring Algorithm (AUC = 0.64) and Scoring-
as-Usual method (AUC = 0.65) were compared against each other using a DeLong test (DeLong
et al., 1988). In diagnostic testing, DeLong tests are used to compare AUC estimates derived
from different predictors (i.e., composite risk estimates from the two measurement models) on
the same set of data (DeLong et al., 1988). Results from the DeLong test indicated no statistically
significant difference between the AUC estimates produced by the Novel Scoring Algorithm and
the Scoring-as-Usual method (z = 0.32, p = 0.75, 95% CI [-0.02, 0.02]).
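For readers unfamiliar with the procedure, the following is a minimal pure-Python sketch of a paired comparison of two AUCs in the spirit of DeLong et al. (1988). It is not the software routine used in the study; the function names and the eight-case toy data are hypothetical and serve only to show the mechanics.

```python
from math import erfc, sqrt


def _kernel(pos_score, neg_score):
    """1 if the recidivant case outranks the desistant case, 0.5 on ties, else 0."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0


def _components(scores, labels):
    """AUC plus its structural components for positive and negative cases."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(_kernel(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_kernel(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01


def _cov(u, v):
    """Sample covariance (n - 1 denominator)."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)


def two_sided_p(z):
    """Two-sided p-value from a standard normal z statistic."""
    return erfc(abs(z) / sqrt(2))


def delong_test(scores_a, scores_b, labels):
    """Compare two correlated AUCs computed on the same cases."""
    auc_a, v10_a, v01_a = _components(scores_a, labels)
    auc_b, v10_b, v01_b = _components(scores_b, labels)
    m, n = len(v10_a), len(v01_a)
    variance = (
        (_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
        + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n
    )
    z = (auc_a - auc_b) / sqrt(variance)
    return auc_a, auc_b, z, two_sided_p(z)


# Hypothetical toy data: 1 = recidivated, 0 = desisted.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.6, 0.5, 0.3, 0.2, 0.7]
scores_b = [0.7, 0.9, 0.5, 0.3, 0.6, 0.4, 0.1, 0.8]
auc_a, auc_b, z, p_value = delong_test(scores_a, scores_b, labels)
```

The covariance terms are what distinguish this paired test from a comparison of independent groups: because both models score the same youth, their AUC estimates are correlated and the variance of the difference shrinks accordingly.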
         Summary of Evaluation. In sum, the Novel Scoring Algorithm better models the
observed variation in measured juvenile risk assessment data, as evidenced by appreciably better
indices of relative and absolute fit. These findings suggest that risk and protective factors covary
with one another to different degrees, and failing to account for this in risk measurement creates
significant psychometric imprecision. Concurrently, the Novel Scoring Algorithm held no
relative advantage over the Scoring-as-Usual method in accurately distinguishing youth who
recidivated from those who desisted. Due to these comparable rates of diagnostic accuracy,
results of the evaluation ultimately affirm Scoring-as-Usual as an acceptable method of
estimating likelihood of recidivism.
Phase III: Cohort Comparisons
         Most juvenile risk assessments, including the YLS/CMI, are designed to be administered
and interpreted using the same procedure, regardless of youths’ demographic (i.e., gender,
race/ethnicity) or charge-related (i.e., delinquency, truancy) characteristics. Accordingly, it is
critical to ensure that the psychometric properties and diagnostic accuracy of these generalist risk
assessment instruments are consistent for all youth. Due to small cell sizes across gender,
racial/ethnic, and court division cohorts, comparing the relative and absolute fit of the YLS/CMI
between these groups via multigroup CFA lies beyond the scope of this dissertation. However, a
series of subgroup analyses were conducted to detect variation in overall diagnostic accuracy
between and within the Novel Scoring Algorithm and the Scoring-as-Usual method (see Table
7). Assessing diagnostic accuracy holds the most immediate relevance for equitable justice
administration, as results denote patterns in under- and overestimation of risk.
 Table 7.
 Diagnostic accuracy across sample cohorts.
                                    Novel Scoring Algorithm
                             Two-Year            Sensitivity             1 - Specificity
                          Recidivism Rate (True Positive Rate)      (False Positive Rate)      AUC
 All Youth                    52.33%                 0.59                     0.54             0.64
 Gender
     Girls                    43.46%                 0.35                     0.22             0.60
     Boys                     56.99%                 0.81                     0.60             0.66
 Race/Ethnicity
     African American/Black   63.16%                 0.88                     0.71             0.65
     Caucasian/White          40.27%                 0.08                     0.04             0.59
     Multi-Racial             51.40%                 0.67                     0.46             0.68
     Hispanic/Latinx          50.87%                 0.66                     0.71             0.57
     Other                     0.00%                  --                        --               --
 Division
     Delinquency              60.33%                 0.87                     0.71             0.63
     Truancy                  36.70%                 0.32                     0.11             0.67
                                    Scoring-as-Usual Method
                             Two-Year            Sensitivity            1 – Specificity
                          Recidivism Rate (True Positive Rate)      (False Positive Rate)      AUC
 All Youth                    52.33%                 0.62                     0.47             0.65
 Gender
     Girls                    43.46%                 0.35                     0.22             0.59
     Boys                     56.99%                 0.75                     0.51             0.67
 Race/Ethnicity
     African American/Black   63.16%                 0.90                     0.71             0.65
     Caucasian/White          40.27%                 0.03                     0.01             0.56
     Multi-Racial             51.40%                 0.38                     0.70             0.69
     Hispanic/Latinx          50.87%                 0.66                     0.40             0.62
     Other                     0.00%                  --                        --               --
 Division
     Delinquency              60.33%                 0.87                     0.77             0.61
     Truancy                  36.70%                 0.32                     0.13             0.67
        Gender Variation in Diagnostic Accuracy. When analyzed independently, both models
produced statistically comparable rates of diagnostic accuracy between boys and girls (see Table
8). Additionally, when comparing models against each other, the Novel Scoring Algorithm and
the Scoring-as-Usual method predicted recidivism to equivalent degrees of diagnostic accuracy
for both girls and boys (see Table 9). Taken together, these findings reflect the results of the
overall sample: the Novel Scoring Algorithm held no relative advantage over the Scoring-as-
Usual method in predicting recidivism for youth across gender.
         While AUC is considered the best overall indicator of diagnostic accuracy, taking stock of
model sensitivity (i.e., true positive rate) and 1 – specificity (i.e., false positive rate) offers
greater insight on the direction and magnitude of risk misclassification (Mossman, 1994; Swets
et al., 2000; Rice & Harris, 2005). Model sensitivity among boys was elevated relative to the full
sample (Novel Scoring Algorithm: Sensitivity = 0.81; Scoring-as-Usual: Sensitivity = 0.75),
indicating that most recidivist boys are correctly identified using both scoring methods.
Additionally, 1 – specificity among boys was slightly elevated relative to the full sample (Novel
Scoring Algorithm: 1 – Specificity = 0.60; Scoring-as-Usual: 1 – Specificity = 0.51), indicating
that over half of desistant boys are incorrectly identified as recidivist using both scoring methods
(see Table 7). In other words, composite juvenile risk assessment scores slightly overpredicted
recidivism in court-involved boys, regardless of the scoring method employed.
         Conversely, model sensitivity among girls fell short of the level estimated for the full
sample (Novel Scoring Algorithm & Scoring-as-Usual: Sensitivity = 0.35), signaling that most
girls who recidivate are not correctly identified using either scoring method. Concurrently, 1 –
specificity among girls also fell short of the level estimated for the full sample (Novel Scoring
Algorithm & Scoring-as-Usual: 1 – Specificity = 0.22), indicating that fewer than 25% of desistant
girls are incorrectly classified as recidivist using both scoring methods. Taken together,
composite juvenile risk assessment scores underpredicted recidivism in court-involved girls,
regardless of the scoring method employed.
         Racial/Ethnic Variation in Diagnostic Accuracy. When analyzed independently, both
models produced statistically comparable rates of diagnostic accuracy for youth across
racial/ethnic groups (i.e., African American/Black, Caucasian/White, Multi-Racial,
Hispanic/Latinx, and Other) (see Table 8). Additionally, when comparing models against each
other, the Novel Scoring Algorithm and the Scoring-as-Usual method predicted recidivism to
equivalent degrees of diagnostic accuracy for all racial/ethnic groups (see Table 9). In
congruence with previous findings, these results suggest that the Novel Scoring Algorithm
neither improves nor worsens the overall diagnostic accuracy of the Scoring-as-Usual method for
youth across race/ethnicity.
        For African American/Black youth, both models correctly identified most recidivists
(Novel Scoring Algorithm: Sensitivity = 0.88; Scoring-as-Usual: Sensitivity = 0.90); however,
nearly three quarters of desistant youth were incorrectly classified as recidivist (Novel Scoring
Algorithm & Scoring-as-Usual: 1 – Specificity = 0.71). A similar, but less extreme, pattern was
observed among Multi-Racial youth (Novel Scoring Algorithm: Sensitivity = 0.67, 1 –
Specificity = 0.46; Scoring-as-Usual: Sensitivity = 0.38, 1 – Specificity = 0.70) and
Hispanic/Latinx youth (Novel Scoring Algorithm: Sensitivity = 0.66, 1 – Specificity = 0.71;
Scoring-as-Usual: Sensitivity = 0.66, 1 – Specificity = 0.40). Taken together, composite risk
estimates overpredicted recidivism among youth of color, with the most significant degree of
misevaluation observed in African American/Black youth.
        Conversely, model sensitivity among Caucasian/White youth was the lowest of all
racial/ethnic groups (Novel Scoring Algorithm: Sensitivity = 0.08; Scoring-as-Usual: Sensitivity
= 0.03), signaling that the overwhelming majority of Caucasian/White youth who recidivate are
not correctly identified using either scoring method. Additionally, 1 – specificity among
Caucasian/White youth was also exceedingly low (Novel Scoring Algorithm: 1 – Specificity =
0.04; Scoring-as-Usual: 1 – Specificity = 0.01), indicating that fewer than 5% of desistant
Caucasian/White youth are incorrectly classified as recidivist. Accordingly, composite risk
estimates underpredicted recidivism among Caucasian/White youth, regardless of the scoring
method employed.
        Court Division Variation in Diagnostic Accuracy. When analyzed independently, both
models produced statistically comparable rates of diagnostic accuracy for youth across court
divisions (i.e., delinquency, truancy) (see Table 8). Additionally, when comparing models
against each other, the Novel Scoring Algorithm and the Scoring-as-Usual method predicted
recidivism to equivalent degrees of diagnostic accuracy for both truant and delinquent youth (see
Table 9). Once more, these results indicate that the Novel Scoring Algorithm and the Scoring-as-
Usual method are equally accurate predictors of recidivism across court division.
        Model sensitivity among delinquent youth was elevated relative to the full sample (Novel
Scoring Algorithm & Scoring-as-Usual: Sensitivity = 0.87), indicating that 87% of
recidivist delinquent youth are correctly identified using both scoring methods. Additionally, 1 –
specificity among delinquent youth was elevated relative to the full sample (Novel Scoring
Algorithm: 1 – Specificity = 0.71; Scoring-as-Usual: 1 – Specificity = 0.77), indicating that over
two thirds of desistant delinquent youth were incorrectly classified as recidivist using both scoring methods.
In sum, these results indicate that composite risk estimates overpredicted recidivism among
delinquent youth, regardless of the scoring method employed.
        Conversely, model sensitivity among truant youth fell short of the level estimated for the
full sample (Novel Scoring Algorithm & Scoring-as-Usual: Sensitivity = 0.32), signaling that
only 32% of truant youth who recidivate are correctly identified using both scoring methods.
However, 1 – specificity among truant youth also fell short of the level estimated for the full
sample (Novel Scoring Algorithm: 1 – Specificity = 0.13; Scoring-as-Usual: 1 – Specificity =
0.11), indicating that no more than 13% of desistant truant youth are incorrectly classified as
recidivist. Taken together, these results indicate that composite risk estimates underpredicted
recidivism among truant youth, regardless of the scoring method employed.
Table 8.
Comparing within-model diagnostic accuracy across cohorts.

Novel Scoring Algorithm
                                                                                      95% CI for Z-Score
Cohort comparison                               AUC Difference     Z-Score       p       Lower     Upper
Gender
    Boys & Girls                                          0.06        1.24      0.22     -0.16      0.04
Race/Ethnicity
    African American/Black & Caucasian/White Youth        0.06        0.98      0.33     -0.18      0.06
    African American/Black & Multi-Racial Youth           0.03        0.67      0.51     -0.17      0.08
    African American/Black & Hispanic/Latinx Youth        0.08        0.94      0.35     -0.25      0.08
    Caucasian/White & Multi-Racial Youth                  0.11        1.22      0.22     -0.23      0.52
    Caucasian/White & Hispanic/Latinx Youth               0.02        0.24      0.81     -0.16      0.20
    Multi-Racial & Hispanic/Latinx Youth                  0.11        1.16      0.25     -0.08      0.29
Division
    Delinquency & Truancy                                -0.04       -0.78      0.44     -0.14      0.06

Scoring-as-Usual Method
                                                                                      95% CI for Z-Score
Cohort comparison                               AUC Difference     Z-Score       p       Lower     Upper
Gender
    Boys & Girls                                          0.08        1.54      0.12     -0.18      0.02
Race/Ethnicity
    African American/Black & Caucasian/White Youth        0.09        1.41      0.16     -0.21      0.03
    African American/Black & Multi-Racial Youth           0.03        0.42      0.68     -0.15      0.10
    African American/Black & Hispanic/Latinx Youth        0.03        0.37      0.71     -0.20      0.13
    Caucasian/White & Multi-Racial Youth                  0.13        1.83      0.07     -0.27      0.01
    Caucasian/White & Hispanic/Latinx Youth               0.06        0.63      0.53     -0.23      0.12
    Multi-Racial & Hispanic/Latinx Youth                  0.07        0.81      0.07     -0.11      0.25
Division
    Delinquency & Truancy                                 0.06        2.24      0.21     -0.16      0.04
Table 9.
DeLong tests comparing between-model performance across sample cohorts.
                                                                  95% CI for Z-Score
                           AUC Difference     Z-Score       p     Lower       Upper
All Youth                        0.01          -0.32      0.75     -0.02       0.02
Gender
  Boys                           0.01           0.32      0.75     -0.02       0.02
  Girls                          0.01           0.57      0.57     -0.03       0.05
Race/Ethnicity
  African American/Black        <0.01           0.02      0.98     -0.03       0.03
  Caucasian/White                0.03           0.21      0.21     -0.01       0.07
  Hispanic/Latinx                0.05           1.31      0.19     -0.13       0.03
  Multi-Racial                   0.01           0.69      0.49     -0.06       0.03
Division
  Delinquency                    0.01           1.57      0.12    <-0.01       0.04
  Truancy                       <0.01           0.28      0.78     -0.04       0.03
                                            DISCUSSION
        Juvenile risk assessment has become an increasingly integral component of evaluating
and treating court-involved youths (JJGPS, 2020). The purpose of this study was to develop and
evaluate a Novel Scoring Algorithm for estimating composite criminogenic risk, based upon
patterns of risk and protective factors in a county-level sample. Composite criminogenic risk
estimates generated from the Novel Scoring Algorithm were highly correlated with those
generated from Scoring-as-Usual (r(557)=0.91, p <0.01), indicating substantial shared variance
between the two scoring methods. Put simply, the Novel Scoring Algorithm generally replicated,
rather than altered, youths’ Scoring-as-Usual risk scores in relation to their peers.
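As a worked check on the phrase "substantial shared variance," squaring the reported correlation gives the proportion of variance that the two sets of composite scores share. This is simple arithmetic on the coefficient reported above, not an additional analysis:

```latex
r^{2} = (0.91)^{2} \approx 0.83
```

That is, roughly 83% of the variance in one set of composite scores is shared with the other.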
        Indices of absolute and relative model fit favored the Novel Scoring Algorithm
(χ²(1,874) = 3,072.35, p < 0.01, SRMR = 0.11, RMSEA = 0.03; CFI/TLI = 0.94), highlighting
significant psychometric imprecision incurred by the Scoring-as-Usual method (χ²(808) =
1,932.63, p < 0.01, SRMR = 0.17, RMSEA = 0.05; CFI/TLI = 0.83). However, differences in
AUC estimates rendered by the two models were not statistically significant, indicating that the
Novel Scoring Algorithm (AUC=0.64) holds no relative advantage over the Scoring-as-Usual
method (AUC=0.65) in classifying youth as recidivist or desistant. AUC values derived from
both the Novel Scoring Algorithm and the Scoring-as-Usual method were remarkably similar to
average meta-analytic estimates for third generation juvenile risk assessment instruments
(AUC=0.65, k=21, N=4,965) (Schwalbe, 2007). While both scoring methods predicted juvenile
recidivism with expected levels of diagnostic accuracy, results ultimately do not support the full
hypothesized advantages of the Novel Scoring Algorithm over the Scoring-as-Usual method.
        Taken together, results ultimately affirm the Scoring-as-Usual method as an acceptable
method of estimating likelihood of recidivism in court-involved youths. Nevertheless, the
magnitude and form of risk misclassification observed across gender, court division, and
racial/ethnic cohorts highlight the penalties of juvenile risk assessment utilization on fair and
equitable decision-making. Drawing from the factor structure of the Novel Scoring Algorithm,
preliminary recommendations for risk management and measurement are discussed.
Patterns in Diagnostic Accuracy
        In studies of prediction, AUC corresponds conceptually to the probability that a random
score drawn from one sample (e.g., youth who recidivate) exceeds another score drawn from a
separate sample (e.g., youth who desist) (Mossman, 1994; Swets et al., 2000; Rice & Harris,
2005). Following conversion procedures from Cohen’s (1988) thresholds, AUC estimates of
0.55, 0.64, and 0.71 respectively correspond to small, moderate, and large effect sizes in violence
risk assessment literature (Rice & Harris, 2005). Ergo, the AUC estimates yielded by the Novel
Scoring Algorithm (AUC=0.64) and the Scoring-as-Usual method (AUC=0.65) correspond to a
moderate effect size (Rice & Harris, 2005). As previously noted, these estimates additionally fall
in line with meta-analytic estimates of both juvenile risk assessment performance (AUC=0.64)
(Schwalbe, 2007), and adult risk assessment performance (AUC=0.67) (Gendreau et al., 1996).
Taken together, both scoring methods predicted recidivism to an expected degree of overall
diagnostic accuracy.
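The two ideas in this paragraph can be illustrated with a minimal Python sketch. The first function computes AUC directly as the proportion of recidivist/desistant score pairs in which the recidivist score is higher (ties counted as one half). The second converts Cohen's d to AUC using the relation AUC = Φ(d / √2), which holds when both score distributions are normal with equal variance; it is assumed here to be the conversion behind the quoted thresholds. The function names and the example scores are hypothetical and are not study data.

```python
from itertools import product
from statistics import NormalDist


def auc_from_scores(recidivist_scores, desistant_scores):
    """Share of (recidivist, desistant) pairs where the recidivist score is
    higher; tied pairs count as one half."""
    wins = 0.0
    for r, d in product(recidivist_scores, desistant_scores):
        if r > d:
            wins += 1.0
        elif r == d:
            wins += 0.5
    return wins / (len(recidivist_scores) * len(desistant_scores))


def auc_from_cohens_d(d):
    """AUC implied by Cohen's d under equal-variance normal distributions."""
    return NormalDist().cdf(d / 2 ** 0.5)


# Hypothetical risk scores, for illustration only.
recidivists = [12, 15, 9, 20]
desisters = [8, 12, 5, 14]
print(auc_from_scores(recidivists, desisters))

# Cohen's (1988) small, moderate, and large thresholds for d.
for d in (0.2, 0.5, 0.8):
    print(d, round(auc_from_cohens_d(d), 3))
```

Under these assumptions the converted values come out at approximately 0.556, 0.638, and 0.714, which agree with the thresholds quoted above to within rounding.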
        For the aggregated sample, model sensitivity (i.e., true positive rate) was 0.59 for the
Novel Scoring Algorithm and 0.62 for the Scoring-as-Usual method, indicating that 59% and
62% of youth who recidivated were correctly identified as recidivist by the respective models.
Concurrently, 1 – specificity (i.e., false positive rate) was 0.54 for the Novel Scoring Algorithm
and 0.47 for the Scoring-as-Usual method, indicating that 54% and 47% of desistant youth were
incorrectly identified as recidivist by the respective models. While the differences in overall
diagnostic accuracy between models were not statistically significant (see Table 9), these
comparisons indicate that the Scoring-as-Usual method yielded slightly more true positives and fewer
false positives when compared to the Novel Scoring Algorithm. These results further affirm the
Scoring-as-Usual method as the preferred method of estimating likelihood of recidivism in court-
involved youths.
        Cohort Comparisons. While results from the at-large sample affirm the Scoring-as-
Usual method as a valid method of predicting recidivism, it is additionally important to
investigate how certain subgroups (i.e., gender, racial/ethnic, court division cohorts) fare.
Juvenile risk assessments were developed, in part, to alleviate discriminatory, paternalistic, and
otherwise harmful biases incurred through discretionary court decision-making (Peck &
Jennings, 2016). However, these standardized risk assessment instruments may conflate
likelihood of recidivism with related characteristics of structural oppression (e.g., racism,
poverty, trauma), justifying inappropriate treatment outcomes for marginalized youth (Green,
2007; Harcourt, 2010; Holtfreter & Morash, 2003).
        Cohort comparisons of overall diagnostic accuracy (i.e., AUC) revealed no statistically
significant gender, racial/ethnic, or court division differences between measurement models (see
Tables 8 and 9). However, upon assessing forms of misclassification, several patterns emerged:
composite risk scores overpredicted recidivism among boys, youth of color, and youth processed
in the delinquency division of the court, regardless of the scoring method employed.
Concurrently, composite risk scores underpredicted recidivism among girls, Caucasian/White
youth, and youth processed in the truancy division. Both forms of risk misevaluation directly
inhibit effective service delivery and diminish the likelihood of successful rehabilitation.
         Previous research on gender in juvenile risk assessment contextualizes the observed
differences between boys and girls. Many third-generation juvenile risk assessment instruments,
including the original version of the YLS/CMI, were developed and validated in the 1990s, when
girls represented approximately 1 in 5 juvenile court cases (Office of Juvenile Justice &
Delinquency Prevention [OJJDP], 2019). While girls still account for far fewer arrests, in the
time since then, they have become the fastest growing cohort in the juvenile justice system
(OJJDP, 2019; Schwartz & Steffensmeier, 2012). Accordingly, some facets of measured risk on
the YLS/CMI may be less sensitive to how girls present criminogenic risk. Research suggests
that girls are often socialized into delinquency through distinct pathways (e.g., via intimate
partner relationships), which are drawn out of focus or omitted entirely from generalist juvenile
risk assessment instruments (Eklund et al., 2010; Kerig, 2014). Courts may benefit from utilizing
gender-responsive evaluation approaches to estimate criminogenic risk more accurately in girls
(Van Voorhis et al., 2010).
         Some feminist scholars have cautioned against general application of juvenile risk
assessment, as features of non-criminogenic trauma may be incorrectly flagged as risk
(Holtfreter & Morash, 2003). Girls may be acutely vulnerable to risk overprediction, given the
elevated prevalence of previous trauma and victimization experiences (Hennessey et al., 2004).
The present results were not consistent with this prior literature: girls’ risk scores
underpredicted their actual likelihood of recidivism, such that only 35% of those who recidivated
were correctly identified. The low percentage of true positives among girls may instead reflect
the court’s failure to adequately address girls’ criminogenic needs. Girls enter juvenile court
supervision with qualitatively different risk profiles, as identified via initial juvenile risk
assessment, with greater needs centered in familial and behavioral domains (Kitzmiller et al.,
2022). Effective court-sanctioned intervention for girls should generally be both: (1) minimally
restrictive, given that most girls enter court supervision with low to moderate cumulative levels
of risk; and (2) complementary to these distinct differences in types of needs. It is possible that
the court is failing to provide effective intervention in one or both of these regards, thus
increasing girls’ actual criminogenic risk level over the course of court supervision (De La Rue
& Ortega, 2019).
          Finally, results indicate that juvenile risk assessment scores slightly overpredicted
recidivism in boys, relative to the aggregated sample. As previously noted, juvenile risk
assessments may be more attuned to typical features of criminogenic risk in boys. For example,
several risk factors indirectly or directly identify externalizing behaviors (e.g., disruptive
classroom behavior, explosive episodes, physical aggression), which are more commonly
observed coping mechanisms in adolescent boys (Hoffmann & Su, 1998; Maschi et al., 2008).
These externalizing behaviors, left unchecked, closely resemble delinquency (Maschi et al.,
2008); however, they also represent relatively normative characteristics of psychosocial
immaturity, which tend to subside naturally as youth enter late adolescence and early adulthood
(Liu, 2004). While both boys and girls experience these normative psychosocial changes, it is
perhaps less likely that girls would be flagged for externalizing characteristics of criminogenic
risk at the onset of court supervision. In congruence with the current study’s findings, measuring
externalizing behaviors via juvenile risk assessment may provide well-reasoned impetus for
referral to adjacent wraparound services; however, it may additionally contribute to
overestimation of risk.
          Results observed across court divisions closely resemble those across gender: namely,
juvenile risk assessment scores overpredicted recidivism among delinquent youth, and
underpredicted recidivism for truant youth, regardless of the scoring method employed. It is
important to note that gender and court division are intertwined: in the current sample, girls
represent 50.79% of youth in the truancy division compared to 35.79% of youth in the
delinquency division (χ²(1) = 33.11, p < 0.01). Previous research substantiates this pattern: while
boys and girls commit status offenses (e.g., truancy) at comparable frequencies, girls are more
likely than boys to fall under juvenile court jurisdiction for a status offense (Chesney-Lind &
Shelden, 2004; Onifade et al., 2010). Accordingly, it is possible that the court’s failure to
adequately mitigate criminogenic risk in girls is reflected again in the rates of diagnostic
accuracy for the truancy division.
        Concurrently, while the YLS/CMI has made significant inroads in predicting repeat
delinquent offending, less is known regarding its appropriateness in truancy specialty courts
(Onifade et al., 2010). Truancy has increasingly been addressed through the juvenile court
system, rather than the education system, in an effort to stymie future criminogenic development
more effectively (Baker et al., 2001; Onifade et al., 2010). In the current sample, youth processed
via the truancy division recidivated at the lowest rate (36.70%) relative to all other cohorts. Even
so, their initial risk scores indicate that this recidivism rate is higher than expected. Thus,
addressing truancy in a juvenile justice context may be ineffective and iatrogenic. This
speculation is in line with other research highlighting the harmful effects of overprocessing low
risk youth (Cecile & Born, 2009; Gatti et al., 2009). While the current study is not designed to
examine the effects of truancy court specifically, future research should investigate whether
truancy courts likewise represent a form of ineffective, and ultimately harmful, overprocessing.
        Variation by race/ethnicity. Results observed across race/ethnicity underscore one of the
most central criticisms of risk assessment: namely, that risk assessments forecast future justice
system contact, which is deeply informed by racism and other overlapping systems of oppression
(Green, 2020; Hannah-Moffat et al., 2009; Maurutto & Hannah-Moffat, 2007). As a result, risk
scores overpredicted recidivism among youth of color, justifying the continued overprescription
of restrictive court sanctions to this cohort. While evidence of artificially high risk scores was
observed in all youth of color, African American/Black youth appear to be acutely vulnerable to
overprediction of recidivism: out of the 86 African American/Black youth who desisted, 61 were
incorrectly identified as recidivist using both scoring methods. These findings are supported by
previous research which notes the unique effects of anti-Black racism on standardized juvenile
risk assessment instruments (Miller et al., 2021).
         Because many facets of measured risk appear to be linked to racial marginalization,
juvenile risk assessment scores demonstrated exceedingly poor accuracy in correctly identifying
recidivist Caucasian/White youth. Of the 60 Caucasian/White youth who recidivated, five were
correctly identified using the Novel Scoring Algorithm and two were correctly identified using
Scoring-as-Usual. The high prevalence of “false negatives” within this cohort may represent a
missed opportunity to provide much needed intervention and wraparound services, as
Caucasian/White youth with artificially low risk scores may re-enter their communities with
unaddressed criminogenic needs.
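The counts cited in the two preceding paragraphs can be tied back to the rates reported in the Results with a short Python sketch. Sensitivity is the share of recidivists who were correctly flagged, and 1 – specificity is the share of desisters who were incorrectly flagged. The helper name `proportion` is hypothetical; the counts are those stated in the text.

```python
def proportion(count, total):
    """Return count / total as a float, guarding against an empty cohort."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# Counts as reported in the paragraphs above.
black_desisters = 86        # African American/Black youth who desisted
black_false_positives = 61  # of those, classified as recidivist (both methods)
white_recidivists = 60      # Caucasian/White youth who recidivated
white_true_positives = {"Novel Scoring Algorithm": 5, "Scoring-as-Usual": 2}

# 1 - specificity: false positives divided by all desisters.
false_positive_rate = proportion(black_false_positives, black_desisters)

# Sensitivity: true positives divided by all recidivists.
sensitivities = {
    method: proportion(tp, white_recidivists)
    for method, tp in white_true_positives.items()
}

print(f"1 - Specificity (African American/Black): {false_positive_rate:.2f}")
for method, value in sensitivities.items():
    print(f"Sensitivity (Caucasian/White, {method}): {value:.2f}")
```

Rounded to two decimals, these reproduce the values reported for these cohorts: 0.71 for 1 – specificity among African American/Black youth, and sensitivities of 0.08 and 0.03 among Caucasian/White youth.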
Moving Towards Equitable Decision-Making
         The widespread utilization of juvenile risk assessments reflects a growing effort towards
implementing evidence-based evaluation and treatment standards in juvenile court settings
(National Research Council, 2013; Singh et al., 2014; Vincent et al., 2012). However, the
patterns of misclassification yielded from the Novel Scoring Algorithm and the Scoring-as-Usual
method suggest that risk assessments may provide evidence-based justification for deeply
entrenched oppressive ideologies upheld through the justice system (Butcher & Kretschmar,
2020). One of the goals of this dissertation is to support equitable decision-making through
improved juvenile risk assessment measurement. Ergo, eliminating risk assessment from juvenile
justice administration altogether is likely not the appropriate solution; after all, risk assessments
provide court practitioners with valuable and consistent information regarding youths’
criminogenic risks and needs (Oleson et al., 2011; Peck & Jennings, 2016).
        Some scholars posit that juvenile risk assessments can minimize contribution to systems-
level inequity through the process of community norming. Community norming is the process by
which an off-the-shelf risk assessment instrument is modified from its original form to improve
performance, based upon local patterns of risk and recidivism (Lovins et al., 2018). While most
criminal and juvenile justice agencies use off-the-shelf instruments without local norming or
validation, some research suggests that juvenile risk assessments’ performance is highly variable
across jurisdictions (Wright et al., 1984). In response, experts posit that community norming via
data mining and machine learning techniques will become hallmark characteristics as risk
assessments enter their fifth generation (Duwe, 2014; Wormith, 2017).
         While the process of community norming is not currently standardized, frequent steps
include: (1) collecting responses from a large pool of potential items drawn from existing off-the-shelf
measures (Barnoski & Drake, 2007); (2) selecting items with strong predictive association with
the outcome of interest via stepwise logistic regression (Austin & Tu, 2004; Hamilton et al.,
2015); (3) weighting items appropriately based on predictive association (Hamilton et al., 2015);
(4) conducting thorough review by subject matter experts; and (5) ensuring robustness to change
over time via cross-validation (Silver et al., 2000). While the process of community norming lies
well beyond the scope of the current results, the factor model yielded from the Novel Scoring
Algorithm provides a promising launching point to refine risk measurement and management in
court-involved youths.
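To make steps (3) and (5) of the community norming process more concrete, the following Python sketch weights binary assessment items by their association with recidivism in a training subset and then checks predictive accuracy (AUC) on held-out cases. It is an illustration under stated assumptions, not the procedure used in the cited studies or in the present analysis: a smoothed log odds ratio stands in for stepwise logistic regression, the fold assignment is a simple interleaving, and the tiny dataset and all function names are hypothetical.

```python
import math


def item_weights(items, outcomes):
    """Weight each binary item by its log odds ratio with the outcome,
    adding 0.5 to every cell of the 2x2 table to avoid division by zero."""
    weights = []
    for j in range(len(items[0])):
        a = b = c = d = 0.5
        for row, y in zip(items, outcomes):
            if row[j] and y:
                a += 1   # item endorsed, recidivated
            elif row[j]:
                b += 1   # item endorsed, desisted
            elif y:
                c += 1   # item not endorsed, recidivated
            else:
                d += 1   # item not endorsed, desisted
        weights.append(math.log((a * d) / (b * c)))
    return weights


def auc(pos_scores, neg_scores):
    """Share of (recidivist, desister) pairs ranked correctly; ties count half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def cross_validated_auc(items, outcomes, k=2):
    """Fit weights on k-1 folds, score the held-out fold, and average the AUCs."""
    n = len(items)
    aucs = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [i for i in range(n) if i not in held_out]
        w = item_weights([items[i] for i in train], [outcomes[i] for i in train])
        score = lambda row: sum(wj * xj for wj, xj in zip(w, row))
        pos = [score(items[i]) for i in sorted(held_out) if outcomes[i]]
        neg = [score(items[i]) for i in sorted(held_out) if not outcomes[i]]
        aucs.append(auc(pos, neg))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: item 0 tracks the outcome, item 1 is unrelated to it.
items = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 1), (1, 0), (0, 1), (0, 0)]
outcomes = [1, 1, 0, 0, 1, 1, 0, 0]
print(item_weights(items, outcomes))
print(cross_validated_auc(items, outcomes))
```

In this toy example the informative item receives a large positive weight and the unrelated item a weight of zero. A real norming effort would require a far larger local sample and the expert review described in step (4).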
Recommendations for Effective Risk Management
        Prior to discussing the implications for effective risk management, it is first important
to discuss the conceptual implications of the Novel Scoring Algorithm. Through CFA, the Novel
Scoring Algorithm estimated latent criminogenic risk from the shared covariance among
measured risk and protective factor items and domains. The Novel Scoring Algorithm yielded
favorable absolute and relative model fit over the Scoring-as-Usual method, signaling that its
parameters appropriately represent the observed covariance between item indicators.
Accordingly, first-order factor loadings reflect shared covariance between a given assessment
item and the other items within its domain. Assessment items with large first-order factor
loadings have a stronger and more predictable “pull” on their constituents while those with small
factor loadings have little bearing on other related facets of risk (Comrey & Lee, 1992). Taken
together, the observed factor loadings generated by the Novel Scoring Algorithm can help court
practitioners expedite risk reduction by centering areas with large factor loadings in treatment,
while bringing items with low factor loadings out of focus.
        Importance of Protective Factors. In six of the seven domains that include both risk
and protective factors, the assessment items with the largest factor loadings were protective
factors (Education: positive relationships with teachers (λ = -0.86); Peer Relations: close bonds
with positive peers (λ = -0.86); Family & Parenting: strong family management (λ = -0.93);
Attitudes & Orientation: prosocial attitudes (λ = -0.88); Personality & Behavior: low aggression
(λ = -0.97); Substance Abuse: low availability to drugs (λ = -0.69)). Only within the Leisure &
Recreation domain was the assessment item with the largest factor loading a risk factor (i.e.,
could make better use of time (λ = 0.87)).
        The presence of a protective factor denotes two similar, but distinct, pieces of
information about a court-involved youth. First, it indicates that the youth does not have an
unaddressed need in an area which may contribute to repeat offending. This inverse association
with risk factors was clearly apparent in the present study, as evidenced by the strong, negative
correlations observed between risk and protective factor items in Appendix B. Perhaps more
importantly, protective factors indicate that the youth has an existing strength in an area that may
play a role in their desistance from delinquency (Fergus & Zimmerman, 2005). The presence of a
strength, coupled with the absence of a deficit, likely explains why protective factors were often
the items with the largest “pull” on others within a risk domain. These results suggest that
rehabilitative efforts are best devoted to cultivating new and existing strengths, rather than
mitigating youths’ deficits.
        The importance of protective factors is substantiated by the growing development and
implementation of strengths-based, restorative approaches to curriculum design and
programming for court-involved youths. Protective factors identified via juvenile risk assessment
provide a menu of youths’ goals, capabilities, and assets which, in turn, can be incorporated into
individually tailored treatment plans (Rennie & Dolan, 2010). Related studies have shown
favorable effects of strengths-based programming on youths’ self-efficacy and relationships with
program staff (Akiva et al., 2017). While their effects on recidivism have yet to be systematically
investigated, the current results provide preliminary support that strengths-based programming
may additionally yield promising returns on criminogenic risk score reduction.
        Identifying Extraneous Assessment Items. Concurrently, assessment items with low
factor loadings should be brought out of focus from rehabilitative treatment, as they have weak
association with other related facets of criminogenic risk. In concert with other community
norming processes (e.g., incremental changes in predictive validity, cross-validation), items with
low factor loadings may additionally be considered for removal from composite estimates of
criminogenic risk. The results of the current study do not serve as conclusive evidence for
assessment item removal; rather, this discussion contextualizes the following risk factors in the
extant literature and weighs their implications for equitable decision-making. Given that juvenile
risk assessments overpredicted recidivism among cohorts of youth that are already
overrepresented in the juvenile justice system (e.g., boys, delinquent youth, and youth of color),
identifying risk items which artificially raise composite scores is an issue of immediate
importance.
        Using Comrey and Lee’s (1992) criteria, standardized factor loadings that fall below 0.38
are considered poor indicators of the specified latent construct. Results from the second-order
CFA revealed that the following eight items fell below this threshold: three or more current
convictions (λ = 0.30), chronic alcohol use (λ = 0.27), substance use linked to offense(s) (λ = 0.34),
poor relations with father (λ = 0.33), poor relations with mother (λ = 0.34), not seeking help
(λ = 0.30), inadequate guilt feelings (λ = 0.35), and inflated self-esteem (λ = 0.09).
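The screening rule described above amounts to a simple threshold filter, sketched below in Python. The loadings are those reported in the text; two of the larger loadings reported earlier are included for contrast. The threshold of 0.38 is the one stated in this section. Comparing on absolute value, so that strongly negative protective-factor loadings are not flagged, is an assumption of this sketch, and the function name is hypothetical.

```python
POOR_LOADING_THRESHOLD = 0.38  # cutoff as stated in the text

# Standardized factor loadings reported in this section.
loadings = {
    "three or more current convictions": 0.30,
    "chronic alcohol use": 0.27,
    "substance use linked to offense(s)": 0.34,
    "poor relations with father": 0.33,
    "poor relations with mother": 0.34,
    "not seeking help": 0.30,
    "inadequate guilt feelings": 0.35,
    "inflated self-esteem": 0.09,
    # Two large loadings reported earlier, included for contrast.
    "strong family management": -0.93,
    "could make better use of time": 0.87,
}


def flag_weak_items(item_loadings, threshold=POOR_LOADING_THRESHOLD):
    """Return item names whose absolute loading falls below the threshold,
    ordered from weakest to strongest."""
    weak = [name for name, lam in item_loadings.items() if abs(lam) < threshold]
    return sorted(weak, key=lambda name: abs(item_loadings[name]))


weak_items = flag_weak_items(loadings)
print(len(weak_items))
for name in weak_items:
    print(f"{name}: {loadings[name]:.2f}")
```

Applied to the loadings listed here, the filter flags the same eight items named above and leaves the two contrast items unflagged.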
        Within the Prior/Current Offenses domain, three or more current convictions was
endorsed in 9 (1.6%) cases; it is therefore not likely a frequent contributor to composite risk
estimates. Nevertheless, many scholars contend that quantifying risk using prior and
current justice system involvement biases risk assessment tools against people of color
(Harcourt, 2010). Regarding the item at hand, the number of convictions on a youth’s current
docket reflects both their participation in delinquency and the decisions of justice officials (e.g.,
police decision to arrest, prosecutor decision to approve petitions) (Skeem & Lowenkamp,
2016). The weak factor loading attributed to this item suggests that youth who had three or
more current convictions did not necessarily have previous justice system contact. Therefore, by
considering omission of three or more current convictions, the court could reduce the impact of
differential selection on youths’ risk scores without losing other related information on their
previous justice system involvement.
        Within the Substance Abuse domain, assessment items chronic alcohol use (N=16; 2.9%)
and substance use linked to offense(s) (N=86; 15.4%) yielded weak factor loadings.
Experimentation with alcohol is widely considered to be a common feature of adolescent risk
taking, with little to no serious or long-term consequences (Bonomo et al., 2001). However,
youth who consume alcohol with high frequency are more likely to report adverse outcomes
concerning the justice system (e.g., trouble with police) and beyond (e.g., trouble at school or
work, trips to the emergency room, trouble at home) (Colder et al., 2002). Concurrently, youth
who consume alcohol chronically in early adolescence are more likely to perpetrate or be
victimized by violence in adulthood (Popovici et al., 2012). The weak factor loading attributed to
chronic alcohol use indicates that this characteristic does not predictably covary with other facets
of measured risk pertaining to substance abuse; however, in accordance with the extant literature,
chronic alcohol use in adolescents may signal elevated risk of future justice system contact, and
may be an important indicator for referral to alcohol dependence treatment (Popovici et al.,
2012).
        The risk factor substance use linked to offense(s) diverges from the other items within the
Substance Abuse domain, as it pertains to a characteristic of the offense, rather than the youths’
self-reported behavior. Quantifying risk based upon characteristics of the offense introduces
opportunity for penalty based on differential selection; the endorsement criterion hinges upon a
decision from justice officials to arrest and petition the youth for a substance use-related charge.
Prior research indicates that self-reported rates of substance use are consistent across
racial/ethnic cohorts (Rosenberg, 2018). Despite this, youth of color, particularly African
American/Black youth, are disproportionately arrested, adjudicated, and incarcerated for
substance use-related charges (Rosenberg, 2018; Rovner, 2016). In the present sample, less than
one quarter (22.43%) of youth who used substances occasionally or chronically met the criteria
for substance use linked to offense(s), indicating that this assessment item has little bearing on
youths’ habitual substance usage, and thus provides little information on their need for substance
use-related treatment. Taken together, by considering omission of substance use linked to
offense(s), the court could further reduce the impact of differential selection without losing other
relevant information on youths’ substance use tendencies.
        Within the Family & Parenting domain, assessment items poor relations with mother
(N=152; 27.2%) and poor relations with father (N=346; 61.9%) yielded weak factor loadings. A
substantial body of literature holds that dysfunctional family environments can cause, sustain, or
worsen adolescent delinquent involvement (Simons et al., 2005; Stern & Smith, 1995); in turn,
mobilizing the family as a therapeutic influence is among the most common goals of juvenile
court intervention (Buel, 2002; Diamond et al., 2011; Woolfenden et al., 2001; Woolfenden et
al., 2002). However, estimating family risk through family configuration (e.g., relationships with
biological parents) reflects the antiquated notion that so-called “broken homes” can be identified
based upon kinship form (Parsons, 1943; Wells & Rankin, 1991). This assumption has been
widely critiqued by race and gender scholars, who argue that, among other deficiencies, it fails to
account for support from extended family and community networks, a common feature in
African American/Black communities (Collins, 1990; Love & Morris, 2019; Stack, 1974).
Indeed, the results of the current study indicate that the relationship between the child and their
biological parents holds little bearing on other measured components of family risk (e.g.,
inadequate supervision, inappropriate discipline, inconsistent parenting). Accordingly, the court
may consider removing poor relations with mother and poor relations with father as indicators of
family risk.
         The three remaining assessment items include not seeking help (N=299; 53.5%),
inadequate guilt feelings (N=91; 16.3%), and inflated self-esteem (N=32; 5.7%). These
characteristics all pertain to attitudes, personality, and behavioral tendencies, which may be more
difficult for court practitioners to accurately assess in on-the-job risk evaluations. While little is
known about participants’ experience of risk assessment specifically, participating in justice system
procedures can be stressful and traumatizing for both youth (Branson et al., 2017; Ko et al.,
2008; Pilnik & Kendall, 2012) and adults (Covington, 2022; Maschi et al., 2011). The resulting
fear and confusion may further obscure youths’ true personality traits and behavioral tendencies.
It is also worth noting that probation officers have interpreted youths’ behavior differently based
upon race: report narratives of African American/Black youth were more likely to include
descriptions of negative personality traits, while narratives of Caucasian/White youth were more
likely to include descriptions of negative environmental influences (Bridges & Steen, 1998).
Importantly, these three items are not exhaustive of all potentially difficult-to-assess personality
and behavioral characteristics. However, their lack of predictable covariance with other facets of
related risk flags them as potential areas to consider for omission.
Summary
        It is critical that juvenile court processing decisions are appropriately tailored to youths’
latent cumulative level of criminogenic risk to reduce recidivism. Aspiring to improve courts’
measurement of risk, the present study compared two juvenile risk assessment measurement
models: one derived from the unweighted sum score of all endorsed risk factors (i.e., Scoring-as-
Usual method), and one weighted to correspond to a freely estimated factor model (i.e., Novel
Scoring Algorithm). While the Novel Scoring Algorithm improved overall model fit to the data,
its composite risk estimates predicted recidivism with diagnostic accuracy equivalent to that of
the Scoring-as-Usual method. Accordingly, these results endorse Scoring-as-Usual as an
acceptable method of predicting recidivism.
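        To make the contrast between the two measurement models concrete, the following is a
minimal Python sketch rather than the study's actual implementation. It assumes that a weighted
score can be approximated as a loading-weighted sum of endorsed items. The weights are the
standardized loadings reported for the Prior/Current Offenses domain in Table 12; the two youths'
responses and all function names are hypothetical. The Novel Scoring Algorithm is a second-order
factor model that also weights domains, which this sketch omits.

```python
# Hypothetical illustration: unweighted sum score vs. loading-weighted score.
# Weights are the standardized loadings for Prior/Current Offenses (Table 12).
# The item responses below are invented for the example.

LOADINGS = {
    "three_or_more_prior_convictions": 0.98,
    "two_or_more_prior_failures_to_comply": 0.80,
    "prior_probation": 0.85,
    "prior_custody": 0.95,
    "three_or_more_current_convictions": 0.30,
}


def sum_score(responses):
    """Scoring-as-Usual analogue: count of endorsed items."""
    return sum(responses.values())


def weighted_score(responses, loadings):
    """Simplified weighted analogue: each endorsed item counts by its loading."""
    return sum(loadings[item] * endorsed for item, endorsed in responses.items())


# Two hypothetical youths, each endorsing exactly one item.
youth_a = {item: 0 for item in LOADINGS}
youth_a["prior_custody"] = 1

youth_b = {item: 0 for item in LOADINGS}
youth_b["three_or_more_current_convictions"] = 1

# Identical under sum scoring, different once items are weighted.
print(sum_score(youth_a), sum_score(youth_b))
print(weighted_score(youth_a, LOADINGS), weighted_score(youth_b, LOADINGS))
```

Under sum scoring the two hypothetical youths are indistinguishable, whereas the weighted
analogue separates them because prior custody loads far more strongly on the domain than three
or more current convictions.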
        Both measurement models yielded rates of diagnostic accuracy which fell in line with
meta-analytic estimates for third generation juvenile risk assessment tools (Schwalbe, 2007).
However, the form and magnitude of risk misclassification varied widely in accordance with
demographic and charge-related characteristics: juvenile risk assessment scores overpredicted
recidivism among boys, youth of color, and youth processed via delinquency division.
Concurrently, juvenile risk assessment scores underpredicted recidivism among girls,
Caucasian/White youth, and youth processed via truancy division.
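        Over- and underprediction can be expressed as group-specific error rates. The Python
sketch below assumes a binary high-risk classification and a binary recidivism outcome. The
records, group labels, and function name are hypothetical, and the study's own procedure for
comparing misclassification across cohorts may differ.

```python
from collections import defaultdict


def misclassification_by_group(records):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Each record is (group, predicted_high_risk, recidivated).
    A false positive is an overprediction; a false negative is an underprediction.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, predicted, observed in records:
        if predicted and observed:
            counts[group]["tp"] += 1
        elif predicted and not observed:
            counts[group]["fp"] += 1
        elif not predicted and observed:
            counts[group]["fn"] += 1
        else:
            counts[group]["tn"] += 1

    rates = {}
    for group, c in counts.items():
        non_recidivists = c["fp"] + c["tn"]
        recidivists = c["fn"] + c["tp"]
        fpr = c["fp"] / non_recidivists if non_recidivists else float("nan")
        fnr = c["fn"] / recidivists if recidivists else float("nan")
        rates[group] = (fpr, fnr)
    return rates


# Invented records: group_1 is mostly overpredicted, group_2 mostly underpredicted.
records = [
    ("group_1", True, False), ("group_1", True, True),
    ("group_1", True, False), ("group_1", False, False),
    ("group_2", False, True), ("group_2", False, True),
    ("group_2", True, True), ("group_2", False, False),
]

rates = misclassification_by_group(records)
for group, (fpr, fnr) in sorted(rates.items()):
    print(f"{group}: false positive rate={fpr:.2f}, false negative rate={fnr:.2f}")
```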
        While the current results cannot parse apart the mechanisms responsible for the divergent
patterns in risk misclassification, it is possible that juvenile risk assessments may not be
responsive to the characteristics that prime girls and status offenders to reoffend.
Furthermore, ineffective court intervention may yield iatrogenic effects among these cohorts,
rendering them more likely to recidivate upon court supervision exit than they were at entry. The
overprediction of recidivism among boys, youth of color, and youth processed via delinquency division
suggests that certain characteristics of measured risk may correspond to non-criminogenic
features of adolescent development and to consequences of structural racism. These findings
highlight an urgent need to critically examine juvenile risk assessment items and eliminate those
with marginal implications for risk and recidivism.
         While the process of community norming lies well beyond the scope of the current study,
parameter estimates yielded by the Novel Scoring Algorithm serve as an optimal launching point
for this work. Specifically, estimates indicate that leveraging new and existing protective factors
in juvenile programming and case management may be the most efficient means of expedient
risk reduction. Additionally, the following eight risk factors yielded marginal covariance with
other related indicators of risk: three or more current convictions (λ=0.30), chronic alcohol use
(λ=0.27), substance use linked to offense(s) (λ=0.34), poor relations with father (λ=0.33), poor
relations with mother (λ=0.34), not seeking help (λ=0.30), inadequate guilt feelings (λ=0.35),
and inflated self-esteem (λ=0.09). In concert with other community norming procedures (e.g.,
cross-validation, stepwise logistic regression), these items may be considered for removal to
reduce artificially high composite risk scores.
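        The flagging step described above can be sketched in Python as a simple filter on
standardized loadings. The loadings are copied from Table 12 (a subset of risk items only). The
cutoff of 0.40 is an assumption: this section does not restate the threshold used in the study,
and 0.40 is chosen here only because it reproduces the eight flagged items. Function and
variable names are illustrative.

```python
# Standardized first-order loadings copied from Table 12 (subset of risk items).
STD_LOADINGS = {
    "Three or more current convictions": 0.30,
    "Chronic alcohol use": 0.27,
    "Substance use linked to offense(s)": 0.34,
    "Poor relations with father": 0.33,
    "Poor relations with mother": 0.34,
    "Not seeking help": 0.30,
    "Inadequate guilt feelings": 0.35,
    "Inflated self-esteem": 0.09,
    "No personal interests": 0.40,
    "Short attention span": 0.41,
    "Problems with peers": 0.49,
    "Prior custody": 0.95,
}

# Assumed cutoff; the study's actual threshold is not restated in this section.
WEAK_LOADING_CUTOFF = 0.40


def flag_weak_items(loadings, cutoff=WEAK_LOADING_CUTOFF):
    """Return items whose loading is strictly below cutoff, weakest first."""
    weak = [item for item, value in loadings.items() if value < cutoff]
    return sorted(weak, key=lambda item: loadings[item])


weak_items = flag_weak_items(STD_LOADINGS)
for item in weak_items:
    print(f"{item}: {STD_LOADINGS[item]:.2f}")
```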
Strengths & Limitations
         The results of the current study are bolstered by several strengths. First, the measurement
of criminogenic risk and recidivism is highly ecologically valid, as the data collected represent
official juvenile risk assessment and recidivism records retained by court practitioners in the
field. Relatedly, the assessment instrument utilized (i.e., the YLS/CMI) is among the most
widely adopted actuarial juvenile risk assessment tools in juvenile court settings. Taken together,
the findings have immediate implications for understanding and refining local measurement
of criminogenic risk among court-involved youth. Additionally, given the popularity of the
YLS/CMI, findings create opportunity for cross-validation in different settings.
        Concurrently, the current study employs a novel methodological approach to measuring
criminogenic risk via juvenile risk assessment. Sum scoring is among the most common methods
of estimating a variable of interest that is not directly measurable (e.g., criminogenic risk) (Bauer
& Curran, 2015). However, sum scoring may be insufficient depending on the context and the
stakes involved (McNeish & Wolf, 2020). The current study is the first of its kind to weigh the
tradeoffs in psychometric precision and predictive validity incurred by sum scoring in juvenile
risk assessment. Results affirm that sum scoring yields no detriment in estimating youths’
likelihood of recidivism when compared to a freely estimated factor model. Accordingly, results
serve as a necessary robustness check on a near-universally utilized method of estimating
composite criminogenic risk.
        Despite these strengths, findings from the current study are tempered by several
methodological and theoretical shortcomings. The data collected represent patterns in risk
assessment scores and recidivism from a single juvenile circuit court jurisdiction. Utilizing a
single county sample optimizes the study’s responsivity to the local ecology of delinquency, and
therefore maximizes the relevance of implications on court practices. However, patterns in risk
assessment scores and recidivism vary widely by geography (Feld, 1991); thus, the parameter
estimates yielded by the Novel Scoring Algorithm and their implications for recidivism are not
generalizable beyond the single county sample. Future research drawing from additional court
jurisdictions is warranted to rigorously evaluate the factor structure and diagnostic performance
of the YLS/CMI.
         Results are further tempered by premising prediction of recidivism solely on youths’
initial juvenile risk assessment score. Functionally, the initial risk assessment score is analogous
to a court’s first impression of a newly adjudicated youth, and therefore has the greatest
influence over processing and treatment decisions. However, many facets of criminogenic risk in
adolescents are subject to change over time. For instance, scores among those initially classified
as high risk may decline over the period of court supervision, either naturally or in response to
effective intervention. Likewise, scores among youths initially classified as low risk may
increase over the period of court supervision, sometimes in response to iatrogenic court
responses. In any case, the initial risk score may not align with the youths’ true likelihood of
recidivism at the end of their period of court supervision. In the current study, misclassification
caused by fluctuations in criminogenic risk over time was indistinguishable from measurement
error in the risk assessment tool. Future research may disentangle the confounding effect of risk
score fluctuations by instead premising prediction of recidivism on youths’ final risk assessment
score.
         Finally, the comparison of diagnostic performance across sample cohorts was limited in
its lack of intersectional scope. Youth hold multiple identities, each of which may have
compounding or contradicting implications for risk misevaluation. For example, results indicate
that African American/Black girls in the delinquency division are simultaneously vulnerable to
artificially low and high composite risk scores. It is likely, therefore, that discussions of
race/ethnicity, gender, and court division are overly simplistic, and obscure heterogeneous
within-cohort patterns. Future research should consider replicating analyses with larger samples,
allowing intersectional cohort comparisons to be drawn.
Conclusions
        Over the last few decades, courts have increasingly relied upon juvenile risk assessments
to inform case processing and treatment decisions (JJGPS, 2020). Results of the current study
both affirm and challenge their continued use. First, findings suggest that the Scoring-as-Usual
method, the near-universally implemented procedure for calculating composite risk, predicts
recidivism with acceptable levels of diagnostic accuracy, based upon disciplinary standards (Rice
& Harris, 2005). Concurrently, findings highlight distinctly different patterns in risk
misevaluation based upon youths’ demographic and charge characteristics, suggesting that risk
scores provide evidence-based justification for deeply entrenched oppressive ideologies upheld
through the justice system. Importantly, this central criticism of risk assessment reflects system-
level inequities and will likely persist without systems-level change. Nonetheless, the present
results provide preliminary evidence that courts may be able to reduce immediate harms by
leveraging protective factors in case management, while drawing other extraneous facets of risk
out of focus.


APPENDICES


                Appendix A: Frequency of YLS and PFRJR Item Endorsement
Table 10.
Frequency of YLS and PFRJR item endorsement.
                                                          Frequency of Endorsement
Assessment Item                               n     Endorsed (%)        Not Endorsed (%)
Prior/Current Offenses
    Three or more prior convictions          559       7 (1.3%)           552 (98.7%)
    Two or more prior failures to comply     559      21 (3.8%)           538 (96.2%)
    Prior probation                          559      51 (9.1%)           508 (90.9%)
    Prior custody                            559      37 (6.6%)           522 (93.4%)
    Three or more current convictions        559       9 (1.6%)           550 (98.4%)
Education
    Low achievement                          559     453 (81.0%)          106 (19.0%)
    Problems with teachers                   559     222 (39.7%)          337 (60.3%)
    Problems with peers                      559     253 (45.3%)          306 (54.7%)
    Disruptive classroom behavior            559     264 (47.2%)          295 (52.8%)
    Disruptive behavior on school property   559     338 (60.5%)          221 (39.5%)
    Truancy                                  559     424 (75.8%)          135 (24.2%)
    Passing*                                 559      83 (14.8%)          476 (85.2%)
    High achievement*                        559      28 (5.0%)           531 (95.0%)
    Positive relationships with teachers*    559     214 (38.3%)          345 (61.7%)
    Commitment to school/education*          559     229 (41.0%)          330 (59.0%)
Leisure & Recreation
    Lack of organized activities             559     387 (69.2%)          172 (30.8%)
    Could make better use of time            559     454 (81.2%)          105 (18.8%)
    No personal interests                    559      51 (9.1%)           508 (90.9%)
    Involvement in organized activities*     559     162 (29.0%)          397 (71.0%)
    Positive personal interests*             559     337 (60.3%)          222 (39.7%)
    Religiosity*                             559     156 (27.9%)          403 (72.1%)
Peer Relations
    Lack of positive peer acquaintances      559     209 (37.4%)          350 (62.6%)
    Lack of positive friends                 559     242 (43.3%)          317 (56.7%)
    Some delinquent peer acquaintances       559     425 (76.0%)          134 (24.0%)
    Some delinquent friends                  559     344 (61.5%)          215 (38.5%)
    Close bonds with positive peers*         559     168 (30.1%)          391 (69.9%)
Substance Abuse
    Occasional drug use                      559     365 (65.3%)          194 (34.7%)
    Chronic drug use                         559     187 (33.5%)          372 (66.5%)
    Chronic alcohol use                      559      16 (2.9%)           543 (97.1%)
    Substance abuse interferes with life     559     152 (27.2%)          407 (72.8%)
    Substance use linked to offense(s)       559      86 (15.4%)          473 (84.6%)
    Low availability to drugs*               559     130 (23.3%)          429 (76.7%)
    Actively abstaining from drugs/alcohol*  559     214 (38.3%)          345 (61.7%)
Family & Parenting
    Inadequate supervision                   559     227 (40.6%)          332 (59.4%)
    Difficulty in controlling behavior       559     348 (62.3%)          211 (37.7%)
    Inappropriate discipline                 559     256 (45.8%)          303 (54.2%)
    Inconsistent parenting                   559     235 (42.0%)          324 (58.0%)
    Poor relations with father               559     346 (61.9%)          213 (38.1%)
    Poor relations with mother               559     152 (27.2%)          407 (72.8%)
    Consistent supervision*                  559     203 (36.3%)          356 (63.7%)
    Strong family management*                559     151 (27.0%)          408 (73.0%)
    Consistent parenting*                    559     111 (27.3%)          294 (72.4%)
    Strong adult bonds*                      559     313 (56.0%)          246 (44.0%)
Attitudes & Orientation
    Not seeking help                         559     299 (53.5%)          260 (46.5%)
    Actively rejecting help                  559      74 (13.2%)          485 (86.8%)
    Defies authority                         559      67 (12.0%)          492 (88.0%)
    Antisocial/pro-criminal attitudes        559     178 (31.8%)          381 (68.2%)
    Callous, little concern for others       559      76 (13.6%)          483 (86.4%)
    Actively seeking help*                   559      79 (19.5%)          279 (81.3%)
    Positive response to authority*          559     185 (33.1%)          374 (66.9%)
    Prosocial attitudes*                     559     133 (23.8%)          426 (76.2%)
Personality & Behavior
    Short attention span                     559     341 (61.0%)          218 (39.0%)
    Poor frustration tolerance               559     416 (74.4%)          143 (25.6%)
    Verbally aggressive/intimidating         559     364 (65.1%)          195 (34.9%)
    Explosive episodes                       559     263 (47.0%)          296 (53.0%)
    Physically aggressive                    559     264 (47.2%)          295 (52.8%)
    Inadequate guilt feelings                559      91 (16.3%)          468 (83.7%)
    Inflated self-esteem                     559      32 (5.7%)           527 (94.3%)
    Low aggression*                          559     135 (24.2%)          424 (75.8%)
    Strong social skills*                    559     122 (21.8%)          437 (78.2%)
Community
    Perceived safety*                        559     397 (71.0%)          162 (29.0%)
    Access to resources*                     559     343 (61.4%)          216 (38.6%)
    Positive adults*                         559     275 (49.2%)          284 (50.8%)
*Denotes that item is a protective factor.


            Appendix B: Correlations Between YLS and PFRJR Assessment Items
Table 11.
Correlations between YLS and PFRJR assessment items.
                                          Prior/Current Offenses
                                                  1                  2                 3             4           5
1. Three or more prior convictions               --
2. Two or more failures to comply              .32*                 --
3. Prior probation                             .30*               .36*                --
4. Prior custody                               .23*               .36*              .57*            --
5. Three or more current convictions           .11*                .05               .01           .02           --
                                                   Education
                                               1      2      3      4      5      6      7      8      9     10
1. Low achievement                            --
2. Problems with teachers                    .16*    --
3. Problems with peers                       .12*   .33*    --
4. Disruptive classroom behavior             .10*   .57*   .36*    --
5. Disruptive behavior on school property    .10*   .27*   .35*   .37*    --
6. Truancy                                   .29*   .06    .02    .03    .01     --
7. Passing                                  -.66*  -.13*  -.13*  -.17*  -.14*  -.28*    --
8. High achievement                         -.39*  -.10*  -.08   -.12*  -.10*  -.20*   .48*    --
9. Positive relationships with teachers     -.27*  -.37*  -.17*  -.30*  -.16*  -.23*   .31*   .22*    --
10. Commitment to school/education          -.24*  -.19*  -.07   -.14*  -.11*  -.33*   .34*   .24*   .45*    --
                                           Leisure & Recreation
                                               1               2              3              4           5         6
1. Lack of organized activities                --
2. Could make better use of time             .41*              --
3. No personal interests                     .10*            .09*             --
4. Involvement in organized activities       -.84*          -.45*           -.12*            --
5. Positive personal interests               -.22*          -.22*           -.30*          .28*          --
6. Religiosity                               -.24*          -.15*           -.09*          .26*        .18*       --
                                                Peer Relations
                                                     1                 2                3             4          5
1. Lack of positive peer acquaintances              --
2. Lack of positive friends                       .59*                 --
3. Some delinquent peer acquaintances             .17*              .20*                --
4. Some delinquent friends                        .22*              .27*              .64*           --
5. Close bonds with positive peers               -.35*              -.48*            -.27*         -.28*         --
                                              Substance Abuse
                                               1      2      3      4      5      6      7
1. Occasional drug use                        --
2. Chronic drug use                          .48*    --
3. Chronic alcohol use                       .08    .15*    --
4. Substance abuse interferes with life      .39*   .50*   .14*    --
5. Substance use linked to offense(s)        .25*   .32*   .17*   .27*    --
6. Low availability to drugs                -.60*  -.38*  -.09*  -.29*  -.22*    --
7. Actively abstaining from drugs/alcohol   -.52*  -.45*  -.11*  -.37*  -.22*   .49*    --
                                                  Family & Parenting
                                               1      2      3      4      5      6      7      8      9     10
1. Inadequate supervision                     --
2. Difficulty in controlling behavior        .28*    --
3. Inappropriate discipline                  .21*   .38*    --
4. Inconsistent parenting                    .24*   .27*   .48*    --
5. Poor relations with father                .09*   .14*   .15*   .08     --
6. Poor relations with mother                .05    .24*   .17*   .12    .02     --
7. Consistent supervision                   -.58*  -.41*  -.18*  -.19*  -.12*  -.18*    --
8. Strong family management                 -.16*  -.54*  -.34*  -.31*  -.23*  -.17*   .40*    --
9. Consistent parenting                     -.17*  -.44*  -.37*  -.42*  -.21*  -.11*   .39*   .63*    --
10. Strong adult bonds                      -.03   -.32*  -.10*  -.10*  -.07   -.18*   .25*   .42*   .30*    --
                                               Attitudes & Orientation
                                               1      2      3      4      5      6      7      8
1. Not seeking help                           --
2. Actively rejecting help                   .13*    --
3. Defies authority                          .04    .13*    --
4. Antisocial/pro-criminal attitudes         .04    .18*   .23*    --
5. Callous, little concern for others        .07    .11*   .16*   .24*    --
6. Actively seeking help                    -.50*  -.16*  -.12*  -.09*  -.09*    --
7. Positive response to authority           -.18*  -.17*  -.26*  -.20*  -.10*   .22*    --
8. Prosocial attitudes                      -.15*  -.16*  -.15*  -.34*  -.10*   .24*   .49*    --
                                               Personality & Behavior
                                               1      2      3      4      5      6      7      8      9
1. Short attention span                       --
2. Poor frustration tolerance                .25*    --
3. Verbally aggressive/intimidating          .23*   .47*    --
4. Explosive episodes                        .18*   .41*   .43*    --
5. Physically aggressive                     .13*   .32*   .37*   .33*    --
6. Inadequate guilt feelings                 .08*   .08    .08    .02    .17*    --
7. Inflated self-esteem                     -.01    .04    .10*   .14*  <.01    .04     --
8. Low aggression                           -.24*  -.55*  -.58*  -.49*  -.46*  -.07   -.03     --
9. Strong social skills                     -.24*  -.37*  -.27*  -.25*  -.20*  -.08   <.01    .42*    --
                            Community
                                               1      2      3
1. Perceived safety                           --
2. Access to resources                       .47*    --
3. Positive adults                           .41*   .41*    --
*p < 0.05


                     Appendix C: Summary of the Novel Scoring Algorithm
Table 12.
Summary of the Novel Scoring Algorithm.
                                     First-Order Factor Loadings
                                                 Unstd. Est.      Std. Est.  p
                                                    (S.E.)          (S.E.)
Prior/Current Offenses BY
   Three or more prior convictions                1.00 (.00)      .98 (.13) .00
   Two or more prior failures to comply           .82 (.14)       .80 (.08) .00
   Prior probation                                .87 (.14)       .85 (.07) .00
   Prior custody                                  .97 (.16)       .95 (.07) .00
   Three or more current convictions              .31 (.18)       .30 (.18) .09
Education BY
   Low achievement                                1.00 (.00)      .57 (.06) .00
   Problems with teachers                         1.03 (.13)      .58 (.05) .00
   Problems with peers                            .87 (.12)       .49 (.05) .00
   Disruptive classroom behavior                  .97 (.13)       .55 (.05) .00
   Disruptive behavior on school property         1.00 (.14)      .56 (.05) .00
   Truancy                                        .89 (.12)       .51 (.06) .00
   Passing*                                      -1.31 (.11)     -.74 (.05) .00
   High achievement*                             -1.32 (.16)     -.75 (.06) .00
   Positive relationships with teachers*         -1.53 (.16)     -.86 (.03) .00
   Commitment to school/education*               -1.39 (.15)     -.78 (.04) .00
Leisure & Recreation BY
   Lack of organized activities                   1.00 (.00)      .69 (.05) .00
   Could make better use of time                  1.26 (.11)      .87 (.05) .00
   No personal interests                          .58 (.11)       .40 (.07) .00
   Involvement in organized activities*          -1.07 (.05)     -.74 (.04) .00
   Positive personal interests*                  -1.08 (.11)     -.74 (.05) .00
   Religiosity*                                   -.61 (.11)     -.42 (.07) .00
Peer Relations BY
   Lack of positive peer acquaintances            1.00 (.00)      .65 (.04) .00
   Lack of positive friends                       1.03 (.07)      .67 (.04) .00
   Some delinquent peer acquaintances             1.05 (.10)      .68 (.05) .00
   Some delinquent friends                        1.05 (.09)      .68 (.04) .00
   Close bonds with positive peers*              -1.33 (.09)     -.86 (.03) .00
Substance Abuse BY
   Occasional drug use                            1.00 (.00)      .59 (.03) .00
   Chronic drug use                               1.07 (.06)      .63 (.03) .00
   Chronic alcohol use                            .45 (.12)       .27 (.10) .00
   Substance abuse interferes with life           .89 (.06)       .53 (.04) .00
   Substance use linked to offense(s)             .57 (.08)       .34 (.06) .00
   Low availability to drugs*                    -1.17 (.06)     -.69 (.03) .00
   Actively abstaining from drugs/alcohol*       -1.02 (.06)     -.60 (.03) .00
Family & Parenting BY
   Inadequate supervision                         1.00 (.00)      .49 (.05) .00
   Difficulty in controlling behavior             1.75 (.19)      .86 (.03) .00
   Inappropriate discipline                       1.22 (.15)      .60 (.04) .00
   Inconsistent parenting                         1.20 (.15)      .59 (.04) .00
  Poor relations with father                      .67 (.13)       .33 (.06) .00
  Poor relations with mother                      .70 (.14)       .34 (.06) .00
  Consistent supervision*                        -1.47 (.12)     -.72 (.04) .00
  Strong family management*                      -1.88 (.20)     -.93 (.02) .00
  Consistent parenting*                          -1.82 (.20)     -.90 (.03) .00
  Strong adult bonds*                            -1.23 (.16)     -.61 (.04) .00
Attitudes & Orientation BY
  Not seeking help                                1.00 (.00)      .30 (.06) .00
  Actively rejecting help                         1.55 (.32)      .47 (.06) .00
  Defies authority                                2.13 (.44)      .64 (.05) .00
  Antisocial/pro-criminal attitudes               2.02 (.41)      .61 (.04) .00
  Callous, little concern for others              1.56 (.36)      .47 (.07) .00
  Actively seeking help*                         -1.44 (.25)     -.44 (.06) .00
  Positive response to authority*                -2.78 (.53)     -.84 (.03) .00
  Prosocial attitudes*                           -2.92 (.57)     -.88 (.03) .00
Personality & Behavior BY
  Short attention span                            1.00 (.00)      .41 (.06) .00
  Poor frustration tolerance                      1.91 (.29)      .79 (.04) .00
  Verbally aggressive/intimidating                1.99 (.30)      .82 (.03) .00
  Explosive episodes                              1.69 (.27)      .70 (.04) .00
  Physically aggressive                           1.61 (.26)      .67 (.04) .00
  Inadequate guilt feelings                       .86 (.23)       .35 (.08) .00
  Inflated self-esteem                            .23 (.26)       .09 (.11) .40
  Low aggression*                                -2.35 (.35)     -.97 (.03) .00
  Strong social skills*                          -2.19 (.33)     -.90 (.05) .00
Community BY
  Perceived safety*                               1.00 (.00)      .68 (.05) .00
  Access to resources*                            1.13 (.13)      .77 (.05) .00
  Positive adults*                                1.40 (.15)      .96 (.06) .00
                                    Second-Order Factor Loadings
                                                 Unstd. Est.      Std. Est.  p
Criminogenic Risk BY
  Prior/Current Offenses                          1.00 (.00)      .35 (.05) .00
  Education                                       1.43 (.32)      .86 (.02) .00
  Leisure & Recreation                            1.53 (.32)      .75 (.04) .00
  Peer Relations                                  1.81 (.35)      .95 (.03) .00
  Substance Abuse                                 1.80 (.36)      .72 (.03) .00
  Family & Parenting                              1.25 (.27)      .86 (.02) .00
  Attitudes & Orientation                         .81 (.22)       .91 (.03) .00
  Personality & Behavior                          .84 (.21)       .69 (.03) .00
  Community                                      -1.01 (.23)     -.50 (.05) .00
*Denotes that item is a protective factor.
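
As a rough illustration of how the standardized loadings in Table 12 could act as item weights, the Python sketch below computes a loading-weighted sub-score for the Attitudes & Orientation domain. The function name, the 0/1 item coding, and the plain weighted sum are assumptions made for illustration only; they are not the study's scoring code. Because the protective items (marked *) load negatively, endorsing them lowers the sub-score.

```python
# Standardized first-order loadings for Attitudes & Orientation, copied from Table 12.
ATTITUDES_LOADINGS = {
    "Not seeking help": 0.30,
    "Actively rejecting help": 0.47,
    "Defies authority": 0.64,
    "Antisocial/pro-criminal attitudes": 0.61,
    "Callous, little concern for others": 0.47,
    "Actively seeking help*": -0.44,
    "Positive response to authority*": -0.84,
    "Prosocial attitudes*": -0.88,
}


def weighted_subscore(responses, loadings):
    """Sum loading * response over all items in the domain.

    `responses` maps item name -> 0/1 (a hypothetical coding, assumed here);
    items missing from `responses` are treated as not endorsed.
    """
    unknown = set(responses) - set(loadings)
    if unknown:
        raise KeyError(f"Items not in this domain: {sorted(unknown)}")
    return sum(weight * responses.get(item, 0) for item, weight in loadings.items())


# Hypothetical youth: two risk items and one protective item endorsed.
youth = {
    "Defies authority": 1,
    "Antisocial/pro-criminal attitudes": 1,
    "Prosocial attitudes*": 1,
}
score = weighted_subscore(youth, ATTITUDES_LOADINGS)
print(round(score, 2))  # 0.64 + 0.61 - 0.88
```

In this sketch the single protective item offsets most of the weight contributed by the two risk items, which is the kind of mitigation the weighted approach is meant to capture.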

                     Appendix D: Summary of the Scoring-as-Usual Method
Table 13.
Summary of the Scoring-as-Usual method.
                                      First-Order Factor Loadings
                                                Std. Est.   Unstd. Est. S.E.   p
Prior/Current Offenses BY
   Three or more prior convictions                 .98         1.00     .00  999.00
   Two or more prior failures to comply            .80         1.00     .00  999.00
   Prior probation                                 .85         1.00     .00  999.00
   Prior custody                                   .95         1.00     .00  999.00
   Three or more current convictions               .30         1.00     .00  999.00
Education BY
   Low achievement                                 .57         1.00     .00  999.00
   Problems with teachers                          .58         1.00     .00  999.00
   Problems with peers                             .49         1.00     .00  999.00
   Disruptive classroom behavior                   .55         1.00     .00  999.00
   Disruptive behavior on school property          .56         1.00     .00  999.00
   Truancy                                         .51         1.00     .00  999.00
Leisure & Recreation BY
   Lack of organized activities                    .69         1.00     .00  999.00
   Could make better use of time                   .87         1.00     .00  999.00
   No personal interests                           .40         1.00     .00  999.00
Peer Relations BY
   Lack of positive peer acquaintances             .65         1.00     .00  999.00
   Lack of positive friends                        .67         1.00     .00  999.00
   Some delinquent peer acquaintances              .68         1.00     .00  999.00
   Some delinquent friends                         .68         1.00     .00  999.00
Substance Abuse BY
   Occasional drug use                             .85         1.00     .00  999.00
   Chronic drug use                                .91         1.00     .00  999.00
   Chronic alcohol use                             .39         1.00     .00  999.00
   Substance abuse interferes with life            .76         1.00     .00  999.00
   Substance use linked to offense(s)              .49         1.00     .00  999.00
Family & Parenting BY
   Inadequate supervision                          .49         1.00     .00  999.00
   Difficulty in controlling behavior              .86         1.00     .00  999.00
   Inappropriate discipline                        .60         1.00     .00  999.00
   Inconsistent parenting                          .59         1.00     .00  999.00
   Poor relations with father                      .33         1.00     .00  999.00
   Poor relations with mother                      .34         1.00     .00  999.00
Attitudes & Orientation BY
   Not seeking help                                .30         1.00     .00  999.00
   Actively rejecting help                         .47         1.00     .00  999.00
   Defies authority                                .64         1.00     .00  999.00
   Antisocial/pro-criminal attitudes               .61         1.00     .00  999.00
   Callous, little concern for others              .47         1.00     .00  999.00
Personality & Behavior BY
   Short attention span                            .41         1.00     .00  999.00
   Poor frustration tolerance                      .79         1.00     .00  999.00
Table 13 (cont’d).
                                     First-Order Factor Loadings
                                               Std. Est.   Unstd. Est. S.E.   p
  Verbally aggressive/intimidating                .82         1.00     .00  999.00
  Explosive episodes                              .70         1.00     .00  999.00
  Physically aggressive                           .67         1.00     .00  999.00
  Inadequate guilt feelings                       .35         1.00     .00  999.00
  Inflated self-esteem                            .09         1.00     .00  999.00
                                    Second-Order Factor Loadings
                                               Std. Est.   Unstd. Est. S.E.   p
Criminogenic Risk BY
  Prior/Current Offenses                          .35         1.00     .00  999.00
  Education                                       .86         1.00     .00  999.00
  Leisure & Recreation                            .75         1.00     .00  999.00
  Peer Relations                                  .95         1.00     .00  999.00
  Substance Abuse                                 .72         1.00     .00  999.00
  Family & Parenting                              .86         1.00     .00  999.00
  Attitudes & Orientation                         .91         1.00     .00  999.00
  Personality & Behavior                          .69         1.00     .00  999.00
*Denotes that item is a protective factor.
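
Because every unstandardized loading in Table 13 is fixed at 1.00, each item contributes equally to its domain and each domain contributes equally to the total. A minimal Python sketch of that unit-weighted sum follows, using two of the domains listed above. The function name and the set-of-endorsed-items input are hypothetical choices for illustration, not the instrument's official scoring procedure.

```python
# Item lists for two domains, copied from Table 13 (other domains omitted for brevity).
DOMAIN_ITEMS = {
    "Education": [
        "Low achievement",
        "Problems with teachers",
        "Problems with peers",
        "Disruptive classroom behavior",
        "Disruptive behavior on school property",
        "Truancy",
    ],
    "Leisure & Recreation": [
        "Lack of organized activities",
        "Could make better use of time",
        "No personal interests",
    ],
}

UNIT_WEIGHT = 1.00  # every loading is fixed to 1.00 under Scoring-as-Usual


def unit_weighted_scores(endorsed):
    """Return (per-domain scores, total score) for a set of endorsed item names."""
    domain_scores = {
        domain: sum(UNIT_WEIGHT for item in items if item in endorsed)
        for domain, items in DOMAIN_ITEMS.items()
    }
    return domain_scores, sum(domain_scores.values())


# Hypothetical youth with three items rated as present.
endorsed_items = {"Truancy", "Low achievement", "No personal interests"}
domain_scores, total = unit_weighted_scores(endorsed_items)
print(domain_scores, total)
```

Under this scheme an item with a weak standardized loading (for example, .09 for inflated self-esteem) counts exactly as much as one with a strong loading, which is the contrast the two tables are meant to show.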

REFERENCES
Akiva, T., Li, J., Martin, K. M., Horner, C. G., & McNamara, A. R. (2017). Simple interactions:
        Piloting a strengths-based and interaction-based professional development intervention
        for out-of-school time programs. Child & Youth Care Forum, 46(3), 285-305.
Andrews, D. A., & Bonta, J. (2010). The psychology of criminal conduct (5th ed.). New
        Providence, NJ: LexisNexis.
Andrews, D. A., Kiessling, J. J., Mickus, S., & Robinson, D. (1986). The construct validity of
        interview-based risk assessment in corrections. Canadian Journal of Behavioural
        Science, 18(4), 460.
Austin, P. C., & Tu, J. V. (2004). Bootstrap methods for developing predictive models. The
        American Statistician, 58(2), 131-137.
Bailey, Z. D., Krieger, N., Agénor, M., Graves, J., Linos, N., & Bassett, M. T. (2017). Structural
        racism and health inequities in the USA: Evidence and interventions. The Lancet,
        389(10077), 1453–1463. https://doi.org/10.1016/S0140-6736(17)30569-X
Baker, M. L., Sigmon, J. N., & Nugent, M. E. (2001). Truancy Reduction: Keeping Students in
        School. Juvenile Justice Bulletin.
Barnes-Lee, A. R. (2020). Development of protective factors for reducing juvenile reoffending: a
        strengths-based approach to risk assessment. Criminal Justice and Behavior, 47(11),
        1371-1389.
Barnes-Lee, A. R., & Campbell, C. A. (2020). Protective factors for reducing juvenile
        reoffending: an examination of incremental and differential predictive validity. Criminal
        Justice and Behavior, 47(11), 1390-1408.
Barnoski, R., & Drake, E. (2007). Washington’s Offender Accountability Act: Department of
        Corrections’ static risk instrument. Washington State Institute for Public Policy.
Bauer, D., & Curran, P. (2015). The discrepancy between measurement and modeling in
        longitudinal data analysis. Advances in multilevel modeling for educational research:
        Addressing practical issues found in real-world applications, 3-38.
Beaulac, J., Bouchard, D., & Kristjansson, E. (2009). Physical activity for adolescents living in a
        disadvantaged neighbourhood: Views of parents and adolescents on needs, barriers,
        facilitators, and programming. Leisure/Loisir, 33(2), 537-561.
Belisle, L. A., & Salisbury, E. J. (2021). Starting with girls and their resilience in mind:
        Reconsidering risk/needs assessments for system-involved girls. Criminal Justice and
        Behavior, 48(5), 596-616.
Birckhead, T. R. (2012). Delinquent by reason of poverty. Washington University Journal of Law & Policy, 38, 53.
Bishop, D. M., & Frazier, C. E. (1995). Race effects in juvenile justice decision-making:
       Findings of a statewide analysis. Journal of Criminal Law & Criminology, 86, 392.
Blackwell, B. S., Holleran, D., & Finn, M. A. (2008). The impact of the Pennsylvania sentencing
       guidelines on sex differences in sentencing. Journal of Contemporary Criminal
       Justice, 24(4), 399-418.
Bonomo, Y., Coffey, C., Wolfe, R., Lynskey, M., Bowes, G., & Patton, G. (2001). Adverse
       outcomes of alcohol use in adolescents. Addiction, 96(10), 1485-1496.
Bonta, J., & Andrews, D. A. (2007). Risk-need-responsivity model for offender assessment and
       rehabilitation. Rehabilitation, 6(1), 1-22.
Bonilla-Silva, E. (1997). Rethinking racism: Toward a structural interpretation. American
       Sociological Review, 62(3), 465–480. https://doi.org/10.2307/2657316
Bortner, M. A., & Wornie, L. R. (1985). The preeminence of process: An example of refocused
       justice research. Social Science Quarterly, 66(2), 413.
Branson, C. E., Baetz, C. L., Horwitz, S. M., & Hoagwood, K. E. (2017). Trauma-informed
       juvenile justice systems: A systematic review of definitions and core components.
       Psychological Trauma: Theory, Research, Practice, and Policy, 9(6), 635.
Bridges, G. S., & Steen, S. (1998). Racial disparities in official assessments of juvenile
       offenders: Attributional stereotypes as mediating mechanisms. American Sociological
       Review, 63(4), 554–570. https://doi.org/10.2307/2657267
Bronfenbrenner, U. (1979). The ecology of human development. Harvard University Press.
Butcher, F., Kretschmar, J. M., Lin, Y., Flannery, D. J., & Singer, M. I. (2014). Analysis of the
       validity scales in the trauma symptom checklist for children. Research on Social Work
       Practice, 24(6), 695-704.
Cangur, S., & Ercan, I. (2015). Comparison of model fit indices used in structural equation
       modeling under multivariate normality. Journal of Modern Applied Statistical
       Methods, 14(1), 14.
Cauffman, E., Cavanagh, C., Donley, S., & Thomas, A. G. (2016). A developmental perspective
       on adolescent risk-taking and criminal behavior. The Handbook of Criminological
       Theory, 100-120.
Cauffman, E., & Steinberg, L. (2000). (Im)maturity of judgment in adolescence: Why
       adolescents may be less culpable than adults. Behavioral Sciences & the Law, 18, 741-
       760.
Cécile, M., & Born, M. (2009). Intervention in juvenile delinquency: Danger of iatrogenic
        effects? Children and Youth Services Review, 31(12), 1217-1221.
Chesney-Lind, M. (1977). Judicial paternalism and the female status offender: Training women
        to know their place. Crime & Delinquency, 23(2), 121-130.
Chesney-Lind, M., & Sheldon, R. G. (2004). Young women, delinquency and juvenile justice.
Colder, C. R., Campbell, R. T., Ruel, E., Richardson, J. L., & Flay, B. R. (2002). A finite
        mixture model of growth trajectories of adolescent alcohol use: predictors and
        consequences. Journal of Consulting and Clinical Psychology, 70(4), 976.
Collins, P. H. (1990). Black feminist thought: Knowledge, consciousness, and the politics of
        empowerment. New York: Routledge.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ:
        Lawrence Erlbaum Associates.
Covington, S. (2022). Creating a trauma-informed justice system for women. Wiley handbook on
        what works with female offenders.
Crew, B. K. (1991). Sex differences in criminal sentencing: Chivalry or patriarchy?
Daly, K. (1994). Gender, crime, and punishment. Yale University Press.
Development Services Group, Inc. (2015). Risk and needs assessment for youths. Washington,
        D. C.: Office of Juvenile Justice and Delinquency Prevention. Available at
        https://www.ojjdp.gov/mpg/litreviews/RiskandNeeds.pdf
DeJong, C., & Jackson, K. C. (1998). Putting race into context: Race, juvenile justice processing,
        and urbanization. Justice Quarterly, 15(3), 487-504.
De La Rue, L., & Ortega, L. (2019). Intersectional trauma-responsive care: A framework for
        humanizing care for justice involved girls and women of color. Journal of Aggression,
        Maltreatment & Trauma, 28(4), 502-517.
de Vogel, V., de Vries Robbé, M., de Ruiter, C., & Bouman, Y. H. (2011). Assessing protective
        factors in forensic psychiatric practice: Introducing the SAPROF. International Journal of
        Forensic Mental Health, 10(3), 171–177.
Diamond, B., Morris, R. G., & Caudill, J. W. (2011). Sustaining families, dissuading crime: The
        effectiveness of a family preservation program with male delinquents. Journal of
        Criminal Justice, 39(4), 338-343.
Draelos, R. (2019, February 23). Measuring performance: AUC (AUROC). Glass Box. Available
        at https://glassboxmedicine.com/2019/02/23/measuring-performance-auc-auroc/
Duwe, G. (2014). The development, validity, and reliability of the Minnesota screening tool
        assessing recidivism risk (MnSTARR). Criminal Justice Policy Review, 25(5), 579-613.
Eklund, J. M., Kerr, M., & Stattin, H. (2010). Romantic relationships and delinquent behaviour
        in adolescence: The moderating role of delinquency propensity. Journal of
        Adolescence, 33(3), 377-386.
Erez, E. (1992). Dangerous men, evil women: Gender and parole decision-making. Justice
        Quarterly, 9(1), 105-126.
Feinstein, R. (2015). A qualitative analysis of police interactions and disproportionate minority
        contact. Journal of Ethnicity in Criminal Justice, 13(2), 159-178.
Feld, B. C. (1991). Justice by geography: Urban, suburban, and rural variations in juvenile
        justice administration. Journal of Criminal Law & Criminology, 82, 156.
Fergus, S., & Zimmerman, M. A. (2005). Adolescent resilience: A framework for understanding
        healthy development in the face of risk. Annual Review of Public Health, 26, 399-419.
Fountain, E. N., & Mahmoudi, D. (2021). Mapping juvenile justice: Identifying existing
        structural barriers to accessing probation services. American Journal of Community
        Psychology, 67(1-2), 116-129.
Gatti, U., Tremblay, R. E., & Vitaro, F. (2009). Iatrogenic effect of juvenile justice. Journal of
        Child Psychology and Psychiatry, 50(8), 991-998.
Gendreau, P., Little, T., & Goggin, C. (1996). A meta‐analysis of the predictors of adult offender
        recidivism: What works! Criminology, 34(4), 575-608.
Glover, K. S. (2008). Citizenship, hyper-surveillance, and double-consciousness: Racial profiling
        as panoptic governance. In Surveillance and governance: Crime control and beyond.
        Emerald Group Publishing Limited.
Gorman-Smith, D., Tolan, P. H., Zelli, A., & Huesmann, L. R. (1996). The relation of family
        functioning to violence among inner-city minority youths. Journal of Family
        Psychology, 10(2), 115.
Graybeal, C. (2001). Strengths-based social work assessment: Transforming the dominant
        paradigm. Families in Society, 82(3), 233-242.
Green, B. (2020, January). The false promise of risk assessments: epistemic reform and the limits
        of fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and
        Transparency (pp. 594-606).
Grove, W. M., & Meehl, P. E. (1996). Comparative efficiency of informal (subjective,
        impressionistic) and formal (mechanical, algorithmic) prediction procedures: The
        clinical-statistical controversy. Psychology, Public Policy, and Law, 2, 293-323.
Hamilton, M. (2015). Risk-needs assessment: Constitutional and ethical challenges. American
        Criminal Law Review, 52, 231.
Harcourt, B. E. (2010, September). Risk as a proxy for race. University of Chicago Law &
        Economics Olin Working Paper 535; University of Chicago Public Law Working Paper
        323. Retrieved from http://ssrn.com/abstract=1677654
Harris, P. (2006). What community supervision officers need to know about actuarial risk
        assessment and clinical judgment. Federal Probation, 70(2), 8-14.
Hawkins, J. D., Van Horn, M. L., & Arthur, M. W. (2004). Community variation in risk and
        protective factors and substance use outcomes. Prevention Science, 5(4), 213-220.
Hennessey, M., Ford, J. D., Mahoney, K., Ko, S. J., & Siegfried, C. B. (2004). Trauma among
        girls in the juvenile justice system. Los Angeles, CA: National Child Traumatic Stress
        Network.
Hoffmann, J. P., & Su, S. S. (1998). Stressful life events and adolescent substance use and
        depression: Conditional and gender differentiated effects. Substance Use &
        Misuse, 33(11), 2219-2262.
Hoge, R. D. (2020). The Youth Level of Service/Case Management Inventory. In Handbook of
        violence risk assessment (pp. 191-205). Routledge.
Hoge, R. D., Andrews, D. A., & Leschied, A. W. (1996). An investigation of risk and protective
        factors in a sample of youthful offenders. Journal of Child Psychology and
        Psychiatry, 37(4), 419-424.
Hoge, R., & Andrews, D. A. (2010). Evaluation for risk of violence in juveniles. Oxford
        University Press.
Holtfreter, K., & Morash, M. (2003). The needs of women offenders. Women & Criminal
        Justice, 14(2-3), 137-160.
Homan, P. (2019). Structural sexism and health in the United States: A new perspective on
        health inequality and the gender system. American Sociological Review, 84(3), 486-516.
Howell, J. C., & Hawkins, J. D. (1998). Prevention of youth violence. Crime and Justice, 24,
        263-315.
Hubbard, D. J., & Matthews, B. (2008). Reconciling the differences between the “gender-
        responsive” and the “what works” literatures to improve services for girls. Crime &
        Delinquency, 54(2), 225-258.
Jacobs, L. A., Ashcraft, L. E., Sewall, C. J., Folb, B. L., & Mair, C. (2020). Ecologies of juvenile
        reoffending: a systematic review of risk factors. Journal of Criminal Justice, 66, 101638.
Javdani, S., & Allen, N. E. (2016). An ecological model for intervention for juvenile justice-
        involved girls: Development and preliminary prospective evaluation. Feminist
        Criminology, 11(2), 135-162.
Jones, C. P. (2000). Levels of racism: A theoretic framework and a gardener’s tale. American
        Journal of Public Health, 90(8), 1212–1215.
Juvenile Justice Geography, Policy, Practice & Statistics. (2020). Juvenile justice services: Risk
        assessment. National Center for Juvenile Justice. Available:
        http://www.jjgps.org/juvenile-justice-services#risk-assessment
Kerig, P. K., & Becker, S. P. (2012). Trauma and girls’ delinquency. Delinquent Girls, 119-143.
Kerig, P. K. (2014). Introduction: for better or worse: intimate relationships as sources of risk or
        resilience for girls' delinquency. Journal of Research on Adolescence, 24(1), 1-11.
Kitzmiller, M. K., Hoskins, K., & Cavanagh, C. (2022). Examining Sex-Based Measurement
        Invariance in the Youth Level of Service/Case Management Inventory. Crime &
        Delinquency, 00111287211073677.
Ko, S. J., Ford, J. D., Kassam-Adams, N., Berkowitz, S. J., Wilson, C., Wong, M., ... & Layne,
        C. M. (2008). Creating trauma-informed systems: Child welfare, education, first
        responders, health care, juvenile justice. Professional Psychology: Research and
        Practice, 39(4), 396.
Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). New York,
        NY: The Guilford Press.
Kuhn, D. (2009). Adolescent thinking. In R.M. Lerner & L. Steinberg (Eds.), Handbook of
        Adolescent Psychology. Hoboken, NJ: Wiley & Sons.
Kumpfer, K. L., & Turner, C. W. (1990). The social ecology model of adolescent substance
        abuse: Implications for prevention. International Journal of the Addictions, 25(sup4),
        435-463.
Leve, L. D., & Chamberlain, P. (2005). Association with delinquent peers: Intervention effects
        for youth in the juvenile justice system. Journal of Abnormal Child Psychology, 33(3),
        339-34.
Liberman, A., & Fontaine, J. (2015). Reducing harms to boys and young men of color from
         criminal justice system involvement. Washington, DC: Urban Institute.
Littlefield, A. K., Sher, K. J., & Steinley, D. (2010). Developmental trajectories of impulsivity
         and their association with alcohol use and related outcomes during emerging and young
         adulthood. Alcoholism: Clinical and Experimental Research, 34(4), 1409–1416. doi:
         10.1111/j.1530-0277.2010.01224.x.
Liu, J. (2004). Childhood externalizing behavior: Theory and implications. Journal of Child and
         Adolescent Psychiatric Nursing, 17(3), 93-103.
Long, J., & Sullivan, C. J. (2017). Learning more from evaluation of justice interventions:
         Further consideration of theoretical mechanisms in juvenile drug courts. Crime &
         Delinquency, 63(9), 1091-1115.
Love, T. P., & Morris, E. W. (2019). Opportunities diverted: intake diversion and
         institutionalized racial disadvantage in the juvenile justice system. Race and Social
         Problems, 11(1), 33-44.
Lovins, B. K., Latessa, E. J., May, T., & Lux, J. (2018). Validating the Ohio risk assessment
         system community supervision tool with a diverse sample from Texas. Corrections, 3(3),
         186-202.
Mandrekar, J. N. (2010). Receiver operating characteristic curve in diagnostic test
         assessment. Journal of Thoracic Oncology, 5(9), 1315-1316.
Maschi, T., Morgen, K., Bradley, C., & Hatcher, S. S. (2008). Exploring gender differences on
         internalizing and externalizing behavior among maltreated youth: Implications for social
         work action. Child and Adolescent Social Work Journal, 25(6), 531-547.
Maschi, T., Gibson, S., Zgoba, K. M., & Morgen, K. (2011). Trauma and life event stressors
         among young and older adult prisoners. Journal of Correctional Health Care, 17(2), 160-
         172.
McBride, D. C., & McCoy, C. B. (1981). Crime and drug‐using behavior: An areal
         analysis. Criminology, 19(2), 281-302.
McCarter, S. A. (2016). Holistic representation: A randomized pilot study of wraparound
         services for first-time juvenile offenders to improve functioning, decrease motions for
         review, and lower recidivism. Family Court Review, 54(2), 250-260.
McGarrell, E. F. (1993). Trends in racial disproportionality in juvenile court processing: 1985-
         1989. Crime & Delinquency, 39(1), 29-48.
McNeish, D., & Wolf, M. G. (2020). Thinking twice about sum scores. Behavior Research
         Methods, 52, 2287-2305.
McNeish, D., & Wolf, M. G. (2020). Dynamic fit index cutoffs for Confirmatory Factor Analysis
        models.
Miller, W. T., Campbell, C. A., Papp, J., & Ruhland, E. (2021). The contribution of static and
        dynamic factors to recidivism prediction for Black and White youth
        offenders. International Journal of Offender Therapy and Comparative Criminology,
        0306624X211022673.
Miron, M., Tolan, S., Gómez, E., & Castillo, C. (2021). Evaluating causes of algorithmic bias in
        juvenile criminal recidivism. Artificial Intelligence and Law, 29(2), 111-147.
Moffitt, T. E. (1993). Adolescence-limited and life-course-persistent antisocial behavior:
        A developmental taxonomy. Psychological Review, 100, 674–701.
Moon, S. S., Patton, J., & Rao, U. (2010). An ecological approach to understanding youth
        violence: The mediating role of substance use. Journal of Human Behavior in the Social
        Environment, 20(7), 839-856.
Mossman, D. (1994). Assessing predictions of violence: Being accurate about accuracy. Journal
        of Consulting and Clinical Psychology, 62(4), 783.
Muthén, L. K., & Muthén, B. O. (1998-2020). Mplus User's Guide. Sixth Edition. Los Angeles,
        CA: Muthén & Muthén.
National Research Council (NAS). (2013). Reforming juvenile justice: A developmental
        approach. Washington, D. C.: The National Academies Press.
Nissen, L. (2006). Bringing strength-based philosophy to life in juvenile justice. Reclaiming
        Children and Youth, 15(1), 40.
Nunn, K. B. (2001). The child as other: Race and differential treatment in the juvenile justice
        system. DePaul Law Review, 51, 679.
O’Brien, L., Albert, D., Chein, J., & Steinberg, L. (2011). Adolescents prefer more immediate
        rewards when in the presence of their peers. Journal of Research on Adolescence, 21(4),
        747–753. doi: 10.1111/j.1532-7795.2011.00738.
Office of Juvenile Justice & Delinquency Prevention. (2019). Estimated number of arrests by
        offense and age group. OJJDP Statistical Briefing Book. Available:
        https://www.ojjdp.gov/ojstatbb/crime/ucr.asp?table_in=1
Oleson, J. C., VanBenschoten, S. W., Robinson, C. R., & Lowenkamp, C. T. (2011). Training to
        see risk: Measuring the accuracy of clinical and actuarial risk assessments among federal
        probation officers. Federal Probation, 75, 52.
Onifade, E., Smith Nyandoro, A., Davidson, W. S., & Campbell, C. (2010). Truancy and patterns
        of criminogenic risk in a young offender population. Youth Violence and Juvenile Justice,
        8(1), 3-18.
Outland, R. (2021). Why Black and Brown Youth Fear and Distrust Police: An Exploration of
        Youth Killed by Police in the US (2016/2017), Implications for Counselors and Service
        Providers. Open Journal of Social Sciences, 9(04), 222.
Parsons, T. (1943). The kinship system of the contemporary United States. American
        Anthropologist, 45(1), 22-38.
Peck, J. H., & Jennings, W. G. (2016). A critical examination of “being Black” in the juvenile
        justice system. Law and Human Behavior, 40(3), 219.
Pilnik, L., & Kendall, J. R. (2012). Victimization and trauma experienced by children and youth:
        implications for legal advocates.
Piquero, A. R. (2008). Disproportionate minority contact. The Future of Children, 59-79.
Popovici, I., Homer, J. F., Fang, H., & French, M. T. (2012). Alcohol use and crime: findings
        from a longitudinal sample of US adolescents and young adults. Alcoholism: Clinical and
        Experimental Research, 36(3), 532-543.
Powell, J. A. (2007). Structural racism: Building upon the insights of John Calmore- A tribute to
        John O. Calmore’s work. North Carolina Law Review, 86(3), 791–816.
Pullmann, M. D., Kerbs, J., Koroloff, N., Veach-White, E., Gaylor, R., & Sieler, D. (2006).
        Juvenile offenders with mental health needs: Reducing recidivism using
        wraparound. Crime & Delinquency, 52(3), 375-397.
Rennie, C. E., & Dolan, M. C. (2010). The significance of protective factors in the assessment of
        risk. Criminal Behaviour and Mental Health, 20(1), 8-22.
Rice, M. E., & Harris, G. T. (2005). Comparing effect sizes in follow-up studies: ROC Area,
        Cohen's d, and r. Law and Human Behavior, 29(5), 615-620.
Roesch, R. (1988). Community psychology and the law. American Journal of Community
        Psychology, 16(4), 451-463.
Rosenberg, L. (2018). Community services for mental illnesses and substance use disorders: The
        moral test of our time. The Journal of Behavioral Health Services & Research, 45(2),
        157-159.
Rucker, J. M., & Richeson, J. A. (2021). Toward an understanding of structural racism:
        Implications for criminal justice. Science, 374(6565), 286-290.
Schwalbe, C. S. (2007). Risk assessment for juvenile justice: A meta-analysis. Law and Human
        Behavior, 31(5), 449.
Sebastian, C., Viding, E., Williams, K. D., & Blakemore, S. J. (2010). Social brain development
        and the affective consequences of ostracism in adolescence. Brain and Cognition, 72,
        134–145.
Shields, I. W., & Simourd, D. J. (1991). Predicting predatory behavior in a population of
        incarcerated young offenders. Criminal Justice and Behavior, 18(2), 180-194.
Silver, E., Smith, W. R., & Banks, S. (2000). Constructing actuarial devices for predicting
        recidivism: A comparison of methods. Criminal Justice and Behavior, 27(6), 733-764.
Simourd, D. J., Hoge, R. D., Andrews, D. A., & Leschied, A. W. (1994). An empirically-based
        typology of male young offenders. Canadian Journal of Criminology, 36(4), 447-461.
Singh, P. S. J., & Azman, A. (2020). Dealing with Juvenile Delinquency: Integrated Social Work
        Approach. Asian Social Work Journal, 5(2), 32-43.
Singh, J. P., Desmarais, S. L., Hurducas, C., Arbach-Lucioni, K., Condemarin, C., Dean, K., ... &
        Otto, R. K. (2014). International perspectives on the practical application of violence risk
        assessment: A global survey of 44 countries. International Journal of Forensic Mental
        Health, 13, 193–206. http://dx.doi.org/10.1080/14999013.2014.922141
Skeem, J. L., & Lowenkamp, C. T. (2016). Risk, race, and recidivism: Predictive bias and
        disparate impact. Criminology, 54(4), 680-712.
Spivak, A. L., Wagner, B. M., Whitmer, J. M., & Charish, C. L. (2014). Gender and status
        offending: Judicial paternalism in juvenile justice processing. Feminist
        Criminology, 9(3), 224-248.
Spohn, C. (1999). Gender and sentencing of drug offenders: Is chivalry dead? Criminal Justice
        Policy Review, 9(3-4), 365-399.
Schwartz, J., & Steffensmeier, D. (2012). Stability and change in girls’ delinquency and the
        gender gap: Trends in violence and alcohol offending across multiple sources of
        evidence. In Delinquent Girls (pp. 3-23). Springer, New York, NY.
Simons, R. L., Simons, L. G., Burt, C. H., Brody, G. H., & Cutrona, C. (2005). Collective
        efficacy, authoritative parenting and delinquency: A longitudinal test of a model
        integrating community‐and family‐level processes. Criminology, 43(4), 989-1029.
Stack, C. (1974). All our kin: Strategies for survival in a Black community. New York: Harper
        and Row.
Steinberg, L., Cauffman, E., & Monahan, K. C. (2015). Psychosocial maturity and desistance
        from crime in a sample of serious juvenile offenders. OJJDP Juvenile Justice Bulletin.
Stern, S. B., & Smith, C. A. (1995). Family processes and delinquency in an ecological
        context. Social Service Review, 69(4), 703-731.
St. John, V., Murphy, K., & Liberman, A. (2020). Recommendations for addressing racial bias in
        risk and needs assessment in the juvenile justice system. Child Trends.
Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve
        diagnostic decisions. Psychological Science in the Public Interest, 1(1), 1-26.
Takane, Y., & De Leeuw, J. (1987). On the relationship between item response theory and factor
        analysis of discretized variables. Psychometrika, 52(3), 393-408.
Tarter, R. E., Kirisci, L., Vanyukov, M., Cornelius, J., Pajer, K., Shoal, G. D., & Giancola, P. R.
        (2002). Predicting adolescent violence: Impact of family history, substance use,
        psychiatric history, and social adjustment. American Journal of Psychiatry, 159(9),
        1541-1547.
Thomas, C. W., & Sieverdes, C. M. (1975). Juvenile court intake: An analysis of discretionary
        decision-making. Criminology, 12(4), 413-43.
Tucker Sr, R. B. (2014). The color of mass incarceration. Ethnic Studies Review, 37(1), 135-149.
Van Voorhis, P., Wright, E. M., Salisbury, E., & Bauman, A. (2010). Women’s risk factors and
        their contributions to existing risk/needs assessment: The current status of a gender-
        responsive supplement. Criminal Justice and Behavior, 37(3), 261-288.
Viljoen, J. L., Jonnson, M. R., Cochrane, D. M., Vargen, L. M., & Vincent, G. M. (2019). Impact
        of risk assessment instruments on rates of pretrial detention, postconviction placements,
        and release: A systematic review and meta-analysis. Law and Human Behavior, 43(5),
        397–420. https://doi.org/10.1037/lhb0000344
Vincent, G. M., Guy, L. S., & Grisso, T. (2012). Risk assessment in juvenile justice: A
        guidebook for implementation.
Vincent, G. M., & Viljoen, J. L. (2020). Racist Algorithms or Systemic Problems? Risk
        Assessments and Racial Disparities. Criminal Justice and Behavior, 0093854820954501.
Ward, T., & Brown, M. (2004). The good lives model and conceptual issues in offender
        rehabilitation. Psychology, Crime & Law, 10(3), 243–257.
Wells, L. E., & Rankin, J. H. (1991). Families and delinquency: A meta-analysis of the impact of
        broken homes. Social Problems, 38(1), 71-93.
Williams, D. R., & Mohammed, S. A. (2013). Racism and health I: Pathways and scientific
       evidence. American Behavioral Scientist, 57(8), 1152–
       1173. https://doi.org/10.1177/0002764213487340
Windle, M. (2000). A latent growth curve model of delinquent activity among
       adolescents. Applied Developmental Science, 4(4), 193-207.
Woolfenden, S., Williams, K. J., & Peat, J. (2001). Family and parenting interventions in
       children and adolescents with conduct disorder and delinquency aged 10‐17. Cochrane
       Database of Systematic Reviews, (2).
Woolfenden, S. R., Williams, K., & Peat, J. K. (2002). Family and parenting interventions for
       conduct disorder and delinquency: A meta-analysis of randomized controlled
       trials. Archives of Disease in Childhood, 86(4), 251-256.
Wordes, M., Bynum, T. S., & Corley, C. J. (1994). Locking up youth: The impact of race on
       detention decisions. Journal of Research in Crime and Delinquency, 31(2), 149-165.
Wormith, J. S. (2017). Automated offender risk assessment. Criminology & Public Policy, 16,
       281.
Wright, K. N., Clear, T. R., & Dickson, P. (1984). Universal Applicability of Probation Risk‐
       Assessment Instruments: A Critique. Criminology, 22(1), 113-134.
Zane, S. N., & Pupo, J. A. (2021). Disproportionate Minority Contact in the Juvenile Justice
       System: A Systematic Review and Meta-Analysis. Justice Quarterly, 1-26.