Multiracial identity response as a predictor of preterm birth among nulliparous, singleton birthing people in the US: An application of machine learning algorithms
Background: Preterm birth (PTB) is a significant cause of neurological or respiratory complications and infant death. Early identification of pregnant people at risk for PTB enables timely interventions and personalized pregnancy management to prevent potential complications. Over the past ten years, the multiracial population in the US has grown substantially. Multiracial disaggregation has been suggested as a factor that could help explain disparities in PTB rates, but it remains unclear whether classifying people into granular racial groups helps predict PTB.
Objectives: This study aims to build four predictive models for PTB among nulliparous, singleton birthing people and to investigate which of 31 race/ethnicity groups, including multiracial identities, are important predictors of PTB.
Methods: We used population-based, cross-sectional data from U.S. birth records in 2019. Medical and socioeconomic factors potentially associated with PTB that are available within the first 16 weeks of pregnancy, together with race/ethnicity groups (including multiracial groups), were compared between nulliparous, singleton birthing people delivering preterm (<37 weeks of gestation) and at term (≥37 weeks of gestation). Four prediction models were built: logistic regression with all variables, logistic regression with selected main effect variables and two-way interaction variables, a Decision Tree, and a Random Forest. A Random Forest model trained on an oversampled dataset was used to assess the relative importance of risk factors.
Results: Of the analytic sample (N=1,032,465), 97,555 individuals experienced PTB and 24,041 were classified as multiracial. The range of areas under the receiver operating characteristic curves (AUC) for all models with the oversampled dataset was 57. The accuracy range of all models with the oversampled dataset was 62 to 65. The mean decrease in accuracy shown in the variable importance plot indicated that some multiracial groups were important predictors of PTB relative to socioeconomic factors.
Conclusions: The results support the idea that several granular multiracial groups could be considered meaningful predictors of PTB.
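The evaluation steps named in the abstract (oversampling, AUC, and mean decrease in accuracy) can be illustrated with a minimal Python sketch. This is not the thesis's code. It assumes simple random oversampling of the minority class, since the abstract does not say which oversampling method was used, and it uses a model-agnostic permutation measure as a stand-in for the Random Forest mean decrease in accuracy. The function names, the toy rows, and the `predict` rule are all hypothetical.

```python
import random

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes are equal in size.
    Assumption: plain random oversampling; the thesis may have used another method."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]

def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs where the positive
    scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def mean_decrease_accuracy(predict, rows, labels, column, repeats=20, seed=0):
    """Average drop in accuracy after shuffling one column.
    A larger drop suggests the model relies more on that column."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        drops.append(base - accuracy(predict, permuted, labels))
    return sum(drops) / repeats

# Hypothetical toy data: column 0 determines the outcome, column 1 is noise.
rows = [[0, i % 3] for i in range(8)] + [[1, 0], [1, 2]]
labels = [r[0] for r in rows]
balanced_rows, balanced_labels = random_oversample(rows, labels)

def predict(row):
    """Hypothetical classifier that looks only at column 0."""
    return int(row[0] >= 1)

print(len(balanced_rows), sum(balanced_labels))
print(mean_decrease_accuracy(predict, balanced_rows, balanced_labels, column=0))
print(mean_decrease_accuracy(predict, balanced_rows, balanced_labels, column=1))
```

In this toy setup, shuffling the noise column leaves accuracy unchanged, while shuffling the informative column lowers it. That is the sense in which a variable importance plot ranks predictors.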
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution 4.0 International
- Material Type: Theses
- Authors: Kim, Heesu
- Thesis Advisors: Gartner, Danielle
- Committee Members: Johnson, Candice; Picasso, Catalina
- Date Published: 2025
- Subjects: Epidemiology
- Program of Study: Epidemiology - Master of Science
- Degree Level: Masters
- Language: English
- Pages: 36 pages
- Permalink: https://doi.org/doi:10.25335/r6js-yp27