MULTIRACIAL IDENTITY RESPONSE AS A PREDICTOR OF PRETERM BIRTH AMONG NULLIPAROUS, SINGLETON BIRTHING PEOPLE IN THE US: AN APPLICATION OF MACHINE LEARNING ALGORITHMS

By
Heesu Kim

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
Epidemiology – Master of Science
2025

ABSTRACT

Background: Preterm birth (PTB) is a significant cause of neurological or respiratory complications and infant death. Early identification of pregnant people at risk for PTB enables timely interventions and personalized pregnancy management to prevent potential complications. Over the past ten years, the multiracial population in the US has experienced significant growth. Multiracial disaggregation has been suggested as a factor that could help explain disparities in PTB rates, but it remains unclear whether classifying people into granular racial groups helps predict PTB.

Objectives: This study aims to build four predictive models for preterm birth and to investigate which of 31 race/ethnicity groups, including multiracial identities, are important predictors of PTB among nulliparous, singleton birthing people.

Methods: We used population-based, cross-sectional data from U.S. birth records in 2019. Medical and socioeconomic factors potentially associated with PTB that are available within the first 16 weeks of pregnancy, together with race/ethnicity groups (including multiracial groups), were compared between nulliparous, singleton birthing people delivering preterm (<37 weeks of gestation) and term (≥37 weeks of gestation). Logistic regression with all variables, logistic regression with selected main effect variables and two-way interaction variables, a Decision Tree, and a Random Forest model were employed to build the prediction models. A Random Forest model trained on an oversampled dataset was used to assess the relative importance of risk factors.
Results: Among the analytic sample (N=1,032,465), 97,555 individuals experienced PTB and 24,041 were classified as multiracial. The area under the receiver operating characteristic curve (AUC) of all models trained with oversampled data was 0.57, and their accuracy ranged from 0.62 to 0.65. The mean decrease in accuracy shown in the importance plot indicated that some multiracial groups were important predictors of PTB compared with socioeconomic factors.

Conclusions: This study's results supported the idea that several granular multiracial groups could be considered meaningful predictors of PTB.

TABLE OF CONTENTS
INTRODUCTION
METHODS
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
APPENDIX A: TABLES
APPENDIX B: FIGURES

INTRODUCTION

Preterm birth (PTB), which is defined as birth before the completion of 37 weeks of gestation (1,2), is the leading cause of infant morbidity and mortality in the world (28).
In the US, preterm birth affects approximately 10% of live-born deliveries, and recent data from the National Vital Statistics System (NVSS) found an increase in preterm birth prevalence from 2016 to 2022, even after accounting for variation during COVID-19 (4). In addition, infants born prematurely face a heightened risk of immediate health issues, including neurodevelopmental disabilities and respiratory and gastrointestinal complications, as well as enduring challenges such as cardiovascular and metabolic disorders (5). Therefore, identifying risk factors for preterm birth is essential for effective early prevention. Early identification of pregnant people at risk for PTB enables timely interventions and personalized pregnancy management to prevent potential complications. Some proximal factors, including infection or inflammation, vascular disease, and uterine overdistension, are suspected to have a relationship with preterm birth (3). With efforts to identify risk factors, PTB prediction has garnered significant attention in recent decades (15,16). According to the American College of Obstetricians and Gynecologists (ACOG) clinical management guidelines, it is hard to predict the spontaneous PTB of singleton infants in nulliparous people, and there remains controversy and uncertainty regarding what screening tests to use (20). For example, there appear to be no advantages in using screening tests such as short cervix or endovaginal ultrasonography (29, 30). Because of the lack of screening tests, maternal demographics like granular race group or socioeconomic factors can become strong predictors of PTB in clinical settings. Various risk stratification models have been developed based on demographic factors and medical obstetric history (34,35). Since these characteristics are readily available, they are easily applicable in clinical practice (11). 
One of Healthy People 2030’s objectives is to ‘Eliminate health disparities, achieve health equity, and attain health literacy to improve the health and well-being of all’ (10). Race and ethnicity provide one dimension to evaluate health disparities in PTB. For example, African American people experience a rate of PTB that is more than 1.5 times that of non-Hispanic White people (14.7% vs 9.5%, respectively) (41). Additionally, American Indian and Alaska Native (AIAN) infants experienced higher rates of preterm birth (11.5% vs. 9.1%) and low birth weight (8.0% vs. 6.9%) than non-Hispanic White (NHW) infants (42).

Although there is a growing body of research on racial and ethnic disparities in health outcomes, including PTB, most researchers have studied the health of minority monoracial (one race only) groups. A growing demand exists to comprehensively understand the health and health outcomes of the multiracial (two or more races) population. Over the past ten years, the multiracial population in the US has experienced significant growth. According to the U.S. Census Bureau, the number of individuals identifying as multiracial increased by 276% between 2010 and 2020, rising from 9 million to 33.8 million people, while the White alone group decreased by 8.6% since 2010 (9). Multiracial disaggregation has been suggested as a factor that could help explain disparities in PTB rates (6,12,22), but studies focusing on the role of multiracial identity in predicting preterm birth remain limited.

Starting in 2016, all 50 states, along with the District of Columbia, Puerto Rico, Guam, the Northern Mariana Islands, and the U.S. Virgin Islands, reported race data in alignment with the revised 1997 Office of Management and Budget (OMB) standards (27). These standards permit reporting of at least five race categories, either as single races (i.e., reported alone) or as combinations of multiple races (i.e., more than one race). Building upon this change, the 2003 revision of the U.S.
Standard Certificate of Live Birth allowed the reporting of multiple races for each parent (26). The standards for collecting racial information were fully implemented in all U.S. states in 2016, creating an environment where multiracial research is more feasible using birth certificates.

This study aims to build four predictive models for preterm birth and to investigate which of the 31 race/ethnicity groups are significant predictors of PTB among nulliparous, singleton birthing people.

METHODS

1 Target Population and Study Sample

This study used US nationwide live birth certificates from 2019 provided by the NVSS. Observations were included if they were singleton births to nulliparous birthing people aged 18-44. The 18–44 age range was used to reflect the natural reproductive age while excluding extreme age groups (teenagers and people aged 45 and older), who have a significantly higher risk of PTB. Observations were excluded if their multiracial group had fewer than 10 preterm births or if any variable of interest was missing. The analytic sample included 1,032,465 births (Figure 1).

2 Variables

The outcome variable was preterm birth (PTB), defined as less than 37 weeks of gestation, estimated as weeks from an obstetric estimate of conception to the delivery date. The analyses included 28 variables, 18 of which represented race and ethnicity and 10 of which represented socioeconomic factors or medical conditions that could be risk factors for PTB and are available to clinicians for decision-making in the 16th week of pregnancy. We used 16 weeks as a reference point because preterm birth prediction models are often built using data collected before the first trimester ends (45). Of the 31 race/ethnicity groups, 18 remained after the exclusion criteria were applied. Racial/ethnic identity was self-reported in the birth certificate dataset.
The 10 socioeconomic factors and medical conditions were included because prior research suggests their relationship with PTB and they were available in the dataset (2,19,29,37) (Table 1).

3 Analysis Plan

3.1 Descriptive study

Table 2 presents frequency counts and percentages for all categorical variables.

3.2 Analytic study

To determine which model best fits the data, we created a training dataset comprising a random sample of 80 percent (825,972) of the observed births. The remaining 20 percent of observations were used as the test dataset for model evaluation. Oversampling was used when building the models. Oversampling is a technique that equalizes the sizes of two groups (those with and without PTB) to help train machine learning models more evenly. There was a significant data imbalance because PTB occurs in approximately 10% of births. Without oversampling, it is hard for learning algorithms to learn the patterns of PTB, and their performance decreases. Therefore, we utilized an oversampling method to mitigate the imbalance and improve model performance.

For the first objective, we created four machine learning models. The four statistical algorithms compared were: 1) a logistic regression model with all variables (logistic regression model 1), 2) a logistic regression model with interaction terms (logistic regression model 2), 3) a Decision Tree model (DT), and 4) a Random Forest model (RF). The relative performance of the four models in the test dataset was assessed by examining overall accuracy, sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and balanced accuracy. Accuracy is the number of correct predictions over the total number of predictions. Sensitivity is the percentage of individuals with the outcome who are correctly predicted as positive (true positives). Specificity is the percentage of individuals without the outcome who are correctly predicted as negative (true negatives), and balanced accuracy is the sum of sensitivity and specificity divided by 2.
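As a minimal sketch of the oversampling step and the confusion-matrix metrics defined above, the following Python example duplicates minority-class records at random until the two classes are balanced and then computes accuracy, sensitivity, specificity, and balanced accuracy. The analyses in this thesis were run in R; the function names `random_oversample` and `classification_metrics`, the toy labels, and the choice of simple random duplication with replacement are illustrative assumptions rather than the implementation used here.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows (sampled with replacement) until both
    classes have the same number of records. Assumed variant: plain random
    oversampling; the thesis does not specify which variant was used."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(labels) if label == 1]
    neg = [i for i, label in enumerate(labels) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and balanced accuracy for 0/1 labels
    (1 = preterm, 0 = term). Requires at least one record of each class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical toy data: 1 preterm record among 10 births (about 10% PTB).
labels = [1] + [0] * 9
rows = [[i] for i in range(10)]

rows_bal, labels_bal = random_oversample(rows, labels)
print(sum(labels_bal), len(labels_bal))  # 9 preterm records out of 18 in total

# A trivial classifier that predicts "term" for every birth.
always_term = [0] * len(labels)
metrics = classification_metrics(labels, always_term)
print(metrics)
```

In this toy example, the classifier that always predicts term birth reaches 0.9 accuracy with zero sensitivity and a balanced accuracy of 0.5, which illustrates why accuracy alone is uninformative on imbalanced data.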
The area under the receiver operating characteristic curve is calculated from the true positive rate (TPR) and false positive rate (FPR) at every possible threshold.

3.2.1 Logistic regression

Logistic regression is a widely used statistical method in clinical research to examine the relationships between patient characteristics and binary outcomes. These models are a specific type of generalized linear model, estimated using maximum likelihood. In generalized linear models, the expected outcome is modeled as a function of a linear combination of predictor variables, with logistic regression using the logit function (14). The logistic regression results provide odds ratios that describe the associations between the dependent and independent variables. Additionally, the model generates an estimated probability for the outcome, which can be applied for classification and prediction. We used the resulting model for the prediction of PTB.

This study included two types of logistic regression models for prediction. Logistic regression model 1 included all 28 variables listed in Table 1 as main effects. The second logistic regression included all variables and two-way interactions. A univariable analysis was conducted to assess the statistical relationship between each individual main effects variable and PTB, excluding race and ethnicity groups. Variables exhibiting a significant relationship with the outcome (p ≤ 0.05) were selected for the final model. Because all variables’ p-values were below this threshold, we included them all. Spearman’s correlations were used to evaluate collinearity between variables. Two variables were considered correlated when the Spearman’s coefficient was higher than 0.7. No pair of variables met this criterion. The final logistic model was determined through the Akaike Information Criterion (AIC) and a backward stepwise elimination approach, where variables with a p-value greater than 0.05 were excluded from the model.
All variables remained. Finally, two-way interactions between variables with p-values ≤0.05, except for the race and ethnicity variables, were evaluated. Of the 45 two-way interactions assessed, 23 were added to the final logistic regression model.

3.2.2 Decision Tree and Random Forest

The third prediction model was a Decision Tree. A Decision Tree is a model developed by asking and answering questions about independent variables, where each node represents a question about one independent variable. After answering the questions at each node, the algorithm reaches its destination, a leaf, which returns the predicted result. Such a tree is fitted by splitting nodes to minimize a particular loss function (17). Hyperparameter tuning was used to increase the performance of the Decision Tree (38). Three parameters were adjusted in the Decision Tree analysis in this study: the complexity parameter, the minimum number of observations for splitting a node, and the maximum depth of the tree. The best combination of these three parameters was selected through cross-validation and grid search. The optimal hyperparameters were chosen based on the AUC metric, which is a helpful measure for evaluating the performance of classification models (13). AUC reflects the balance between sensitivity (true positive rate) and specificity (one minus the false positive rate) rather than just accuracy.

Unfortunately, a Decision Tree can have high variance, meaning that its predictions are highly sensitive to fluctuations in the training data and thus prone to overfitting. Overfitting results in the model capturing noise or random fluctuations in the training data rather than the underlying trend, which diminishes its predictive ability. Bootstrap aggregating, or bagging, can mitigate this issue. A bagging method trains multiple trees on different bootstrap samples of the same data and then averages these models.
This technique smooths out the prediction and reduces the variance. We chose a Random Forest model as the bagging model, in which Decision Trees are trained with the vital restriction that, at each new split, only a few randomly selected features are available as candidates for the split. By aggregating less correlated trees, a Random Forest can further reduce variance (17).

There were two steps for creating our final classification Random Forest model. First, we specified a number of trees and drew a bootstrap sample from the training data for each tree. Second, a Decision Tree was trained on each bootstrap sample until a minimum node size that we specified was reached. At each node, this second step involved randomly selecting m variables (a number that we specified) from all the predictor variables, picking the best split among these variables, and splitting the node into two according to the best split. To predict a new observation, the predictions of all the trees in the Random Forest are collected, and a majority vote decides the final prediction (17).

Three parameters were adjusted in the Random Forest analysis in our study: the number of trees, the number of variables randomly drawn at each split, and the minimum node size. As for the number of variables drawn, the default in a classification model is the square root of the number of variables (√28 ≈ 5.3), so we chose five. The best combination of the number of trees and minimum node size was selected through cross-validation and grid search. The optimal hyperparameters were chosen based on AUC.

We chose the mean decrease accuracy as the metric to gauge variable importance in the Random Forest. First, it is easy to interpret. For example, if the mean decrease accuracy value for a variable called A is 0.3, it means that when the information in variable A is removed (i.e., when its values are shuffled), the model's accuracy decreases by 0.3 (43).
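The shuffling idea behind mean decrease accuracy can be illustrated with a small, model-agnostic Python sketch. It is a simplification under stated assumptions: Random Forest software typically computes this measure tree by tree on out-of-bag observations, whereas the version below shuffles one column at a time in a single evaluation set and averages the resulting drop in accuracy. The names `permutation_importance` and `toy_model`, and the toy data, are hypothetical and are not taken from the thesis analysis.

```python
import random


def accuracy(model, X, y):
    """Share of rows for which the model's prediction equals the label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy model: it uses feature 0 only and ignores feature 1.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0


X = [[i / 19, (i * 7) % 20] for i in range(20)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the toy model never reads feature 1, shuffling that column leaves accuracy unchanged and its importance is exactly zero, while shuffling feature 0 lowers accuracy and yields a positive importance.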
The second reason was that the "permutation accuracy importance" measure is an advanced way to evaluate variable importance in Random Forests. It works by randomly shuffling the values of a predictor variable, disrupting its original relationship with the response variable. This disruption leads to a decrease in prediction accuracy, highlighting the importance of the variable in the model’s performance (44).

4 IRB and Statistical packages

This research was exempt from IRB oversight because it used publicly available, deidentified data. Statistical analyses were conducted using R (v4.4.2; R Foundation for Statistical Computing).

RESULTS

1 Descriptive study

A total of 1,032,465 birthing people were included in the analysis, of whom 97,555 (9.45%) experienced PTB. Many birthing people were aged between 25 and 29, and most were born in the US. Birthing people were most often overweight and had private insurance. They also typically had their first prenatal care visit within the first 2 months of pregnancy (Table 2).

2 Evaluating Preterm Birth Prediction: Methodologies and the Role of Multiracial Groups

2.1 Classification of preterm birth using four methods to identify the best model

Table 3 shows the PTB prediction models' accuracy, sensitivity, specificity, AUC, and balanced accuracy. With the original dataset, accuracy and AUC had the same values in the two logistic regression models and the Random Forest model. The Decision Tree’s accuracy was 0.9, similar to the other models. The four models using oversampled data showed better AUC and sensitivity than the four models using the original data. The AUC of each of the four models using the oversampled dataset was 0.57. Accuracy with the oversampled dataset was 0.62, 0.63, 0.62, and 0.65 across the four models and was highest for the Random Forest (0.65).
2.2 Effects of multiracial groups on PTB using a Random Forest model

Figure 2 illustrates the ranked importance of variables in the Random Forest model derived from the oversampled dataset. Nine race and ethnicity groups had higher mean decrease accuracy than insurance type and cigarette smoking before the second trimester. Of those nine, five were multiracial (e.g., non-Hispanic Black and White, non-Hispanic Asian and White, non-Hispanic Asian and NHOPI and White, non-Hispanic AIAN and White).

To explore the impact of race on predicted preterm birth probabilities, we created three hypothetical observations and scored them with the Random Forest model trained on the oversampled dataset. The only difference between the three observations was race (non-Hispanic Asian; non-Hispanic Black and Asian; non-Hispanic Asian and NHOPI and White), while all other variables remained constant. As shown in Table 4, the prediction for the non-Hispanic Asian birthing person indicated no possibility of preterm birth, that is, a 100% predicted probability of term birth. In contrast, for the non-Hispanic Black and Asian birthing person, the model predicted a 45.1% probability of PTB despite identical values for all other covariates. Moreover, the model predicted a 27.4% probability of PTB for the non-Hispanic Asian and NHOPI and White birthing person.

DISCUSSION

In this study, we conducted a comprehensive analysis of the determinants of preterm birth (PTB) using population-based, cross-sectional data on 1,032,465 people with a live birth in the U.S. in 2019, with 28 variables, including sociodemographic factors, maternal medical history, and detailed race and ethnicity categories. Using machine learning techniques, we established prediction models for PTB and investigated the importance of granular racial and ethnic groups.
Moreover, this study evaluated the performance of four machine learning models in predicting PTB; all except the Decision Tree model had the same accuracy and AUC when using the original dataset. However, when using the oversampled dataset, the Random Forest had the highest accuracy. Still, accuracy and AUC were poor across the four machine learning models. One similar study evaluated the clinical potential of spontaneous PTB prediction models, and the models’ performances were similarly poor (AUC 0.51-0.56) for nulliparous women (39). Generally, a previous PTB is considered the most critical risk factor for a future PTB (40). Indeed, Meertens et al. showed that having prior PTB data increases the performance of PTB prediction models. Because our target population was nulliparous, singleton birthing people, no data on previous PTB exist for them, which could be one reason why the predictive ability of our models is low.

Finally, we identified the order of importance of variables based on results from a Random Forest with the oversampled dataset. Among the 28 variables analyzed, non-Hispanic Black and White, non-Hispanic Asian and White, non-Hispanic Asian and NHOPI and White, and non-Hispanic AIAN and White ranked higher than insurance type or cigarette smoking before the 2nd trimester based on the mean decrease accuracy metric. The results of our analysis were similar to a previous study that used 2016-2019 Medical Expenditure Panel Survey (MEPS) data to predict foregone preventive dental care for adults and demonstrated that combining distinct multiracial groups into one group resulted in the lowest model performance among stratified race and ethnicity groups (36). Moreover, we used the Random Forest model to estimate the probability of PTB for three hypothetical observations, holding all other variables constant.
We estimated that these three observations had substantial racial/ethnic differences in PTB probability. Such differences would be obscured if the groups were aggregated into a single monoracial Asian group.

This study makes two contributions to the literature. First, several studies demonstrate the importance of race disaggregation for understanding many health conditions, such as respiratory diseases or type 2 diabetes (7,8,21). Some studies also identify variations in PTB across granular multiracial categories and show the need for detailed racial/ethnic subcategories (6,12,23). In this study, we determined that multiracial groups are important predictors of PTB. We showed that when we use granular race/ethnicity groups to predict PTB, the predicted probability changes. Thus, it may be helpful to predict PTB by granular groups. Second, researchers often aggregate multiracial subgroups together due to small sample sizes (24). Two previous studies examining PTB race aggregation used rate comparison methods and conventional statistics (6,23). However, we used an oversampling approach to balance preterm birth and term birth to avoid the sample size requirements of conventional statistics. Moreover, the Random Forest method can handle high-dimensional data with many variables combined in non-linear fashions to predict outcomes or detect new patterns (25). This method can better identify the importance of granular racial groups that was previously masked by small sample sizes or data complexity.

This study has certain limitations. First, we did not conduct subgroup analyses of PTB. Preterm birth can be classified based on its etiology or gestational age. In terms of etiology, PTB can be indicated, which results from medical intervention due to maternal or fetal complications such as severe preeclampsia or non-reassuring fetal heart rate, or spontaneous, which occurs due to spontaneous preterm labor or preterm premature rupture of membranes (PPROM).
Based on gestational age, PTB is further divided into early PTB (birth before 32+0 weeks of gestation) and late PTB (birth between 32+1 and 36+6 weeks of gestation) (31,32). The risk factors of each type of PTB, by etiology or gestational age, could be different since the pathophysiology of PTB can differ. If we divided PTB into its subgroups to create a prediction model, the model's performance might improve.

Second, we could not include adequate measurement of relevant socioeconomic or medical conditions. We included only the variables that could both be obtained within the first 16 weeks of pregnancy (to support clinical relevance of findings) and that are included on birth certificates. Other potentially relevant, yet unmeasured, covariates include alcohol consumption and previous medical history. Similar to the first limitation, adding more variables to the models may further improve their power to predict PTB for nulliparous, singleton birthing people.

Third, measurement errors may occur. Measurement error is one of the main factors that undermine data validity. A previous study found that the sensitivity and positive predictive value (PPV) of birth certificate data varied widely across items, ranging from 0% to 100% (46), suggesting the presence of inaccuracies that could introduce measurement error. We did not explore the impact that measurement error could have had on this study, but we acknowledge its potential role in biasing the reported results.

Fourth, additional machine learning algorithms are available, such as the artificial neural network, the support vector machine, extreme gradient boosting, and the multi-layer perceptron (MLP) (11,33). Other machine learning methods might perform better with this data and could be explored in future research.

Lastly, we need to consider what the race/ethnicity category on the birth certificate actually represents.
Roth discussed the multiple dimensions of race, including racial identity, self-classification, observed race, reflected race, phenotype, and racial ancestry (47). However, we do not know which of these dimensions people rely on when responding to birth certificate questionnaires. There are also some misclassification cases between birth certificate data and hospital data (48). Therefore, caution is needed when conducting research that takes into account race/ethnicity groups as indicated on birth certificates.

CONCLUSION

This study developed four prediction models for preterm birth (PTB) and demonstrated that specific multiracial and ethnic groups might serve as meaningful predictors of PTB risk among nulliparous, singleton birthing people. The findings suggest that considering granular racial/ethnic identity could enhance the accuracy of PTB risk prediction. However, these four models are still hard to apply in a clinical setting. Future improvements in these prediction models, incorporating more diverse variables and advanced techniques, could better detect possible PTB in clinical settings, enabling more personalized and timely interventions for at-risk individuals.

REFERENCES

1. Spong C. Y. (2013). Defining "term" pregnancy: recommendations from the Defining "Term" Pregnancy Workgroup. JAMA, 309(23), 2445–2446. https://doi.org/10.1001/jama.2013.6235
2. Purisch, S. E., & Gyamfi-Bannerman, C. (2017). Epidemiology of preterm birth. Seminars in perinatology, 41(7), 387–391. https://doi.org/10.1053/j.semperi.2017.07.009
3. Goldenberg, R. L., Culhane, J. F., Iams, J. D., & Romero, R. (2008). Epidemiology and causes of preterm birth. Lancet (London, England), 371(9606), 75–84. https://doi.org/10.1016/S0140-6736(08)60074-4
4. Martin, J. A., & Osterman, M. J. K. (2024). Shifts in the Distribution of Births by Gestational Age: United States, 2014-2022.
National vital statistics reports : from the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System, 73(1), 1–11.
5. Saigal, S., & Doyle, L. W. (2008). An overview of mortality and sequelae of preterm birth from infancy to adulthood. Lancet (London, England), 371(9608), 261–269. https://doi.org/10.1016/S0140-6736(08)60136-1
6. Brown, C. C., Moore, J. E., & Tilford, J. M. (2023). Rates Of Preterm Birth And Low Birthweight: An Analysis Of Racial And Ethnic Populations. Health affairs (Project Hope), 42(2), 261–267. https://doi.org/10.1377/hlthaff.2022.00656
7. Pleis, J. R., & Barnes, P. M. (2008). A comparison of respiratory conditions between multiple race adults and their single race counterparts: an analysis based on American Indian/Alaska Native and white adults. Ethnicity & health, 13(5), 399–415. https://doi.org/10.1080/13557850801994839
8. Springer, Y. P., Filardo, T. D., Woodruff, R. S., & Self, J. L. (2024). Racial and Ethnic Disaggregation of Tuberculosis Incidence and Risk Factors Among American Indian and Alaska Native Persons-United States, 2001-2020. American journal of public health, 114(2), 226–236. https://doi.org/10.2105/AJPH.2023.307498
9. U.S. Census Bureau. (2021, August 12). Improved race, ethnicity measures show U.S. is more multiracial. Census.gov. https://www.census.gov/library/stories/2021/08/improved-race-ethnicity-measures-reveal-united-states-population-much-more-multiracial.html
10. Huang, D. T., Uribe, A., & Talih, M. (2024). Measuring progress toward target attainment and the elimination of health disparities in Healthy People 2030. Vital and Health Statistics, 2(211). https://doi.org/10.15620/cdc/164019
11. Lee, K. S., & Ahn, K. H. (2020). Application of Artificial Intelligence in Early Diagnosis of Spontaneous Preterm Labor and Birth. Diagnostics (Basel, Switzerland), 10(9), 733. https://doi.org/10.3390/diagnostics10090733
12. Hamilton, B. E., & Ventura, S. J. (2007).
Characteristics of births to single- and multiple-race women: California, Hawaii, Pennsylvania, Utah, and Washington, 2003. National vital statistics reports : from the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System, 55(15), 1–20.
13. Fawcett, T. (2005). Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering, 17(3), 328–336. https://doi.org/10.1109/TKDE.2005.50
14. Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression (2nd ed.). Wiley.
15. Narice, B. F., Labib, M., Wang, M., Byrne, V., Shepherd, J., Lang, Z. Q., & Anumba, D. O. (2024). Developing a logistic regression model to predict spontaneous preterm birth from maternal socio-demographic and obstetric history at initial pregnancy registration. BMC pregnancy and childbirth, 24(1), 688. https://doi.org/10.1186/s12884-024-06892-3
16. Mirzamoradi, M., Mokhtari Torshizi, H., Abaspour, M., Ebrahimi, A., & Ameri, A. (2024). A Neural Network-based Approach to Prediction of Preterm Birth using Non-invasive Tests. Journal of biomedical physics & engineering, 14(5), 503–508. https://doi.org/10.31661/jbpe.v0i0.2201-1449
17. Hastie, T., Tibshirani, R., & Friedman, J. (2009). Model inference and averaging. In The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.
18. Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). Wiley & Sons, Inc.
19. Prediction and Prevention of Spontaneous Preterm Birth: ACOG Practice Bulletin, Number 234. (2021). Obstetrics and gynecology, 138(2), e65–e90. https://doi.org/10.1097/AOG.0000000000004479
20. Springer, Y. P., Filardo, T. D., Woodruff, R. S., & Self, J. L. (2024). Racial and Ethnic Disaggregation of Tuberculosis Incidence and Risk Factors Among American Indian and Alaska Native Persons-United States, 2001-2020. American journal of public health, 114(2), 226–236.
https://doi.org/10.2105/AJPH.2023.307498 21. Koyama, A. K., Bullard, K. M., Onufrak, S., Xu, F., Saelee, R., Miyamoto, Y., & Pavkov, M. E. (2023). Risk Factors Amenable to Primary Prevention of Type 2 Diabetes Among Disaggregated Racial and Ethnic Subgroups in the U.S. Diabetes care, 46(12), 2112–2119. https://doi.org/10.2337/dci23-0056 22. Blebu, B. E., Waters, O., Lucas, C. T., & Ro, A. (2022). Variations in Maternal Factors and Preterm Birth Risk among Non-Hispanic Black, White, and Mixed-Race Black/White Women in the United States, 2017. Women's health issues : official publication of the Jacobs Institute of Women's Health, 32(2), 140–146. https://doi.org/10.1016/j.whi.2021.10.010 23. Brown, C. C., & DuBois, D. (2024). Racial/Ethnic Disparities in Pregnancy- Associated Death: The Critical Importance of Disaggregation by Cause of Death and Race/Ethnicity. American journal of public health, 114(7), 666–668. https://doi.org/10.2105/AJPH.2024.307700 24. Miotto, R., Wang, F., Wang, S., Jiang, X., & Dudley, J. T. (2018). Deep learning for healthcare: review, opportunities and challenges. Briefings in bioinformatics, 19(6), 1236–1246. https://doi.org/10.1093/bib/bbx044 25. National Center for Health Statistics. (n.d.). 2003 revisions of the U.S. Standard Certificates of Live Birth and Death and fetal death report. Centers for Disease Control and Prevention. Retrieved from https://www.cdc.gov/nchs/nvss/vital_certificate_revisions.htm 26. Office of Management and Budget. (1997). Revisions to the standards for the classification of federal data on race and ethnicity. Federal Register, 62(210), 58782– 58790. 27. Walani S. R. (2020). Global burden of preterm birth. International journal of gynaecology and obstetrics: the official organ of the International Federation of Gynaecology and Obstetrics, 150(1), 31–33. https://doi.org/10.1002/ijgo.13195 28. Kleinrouweler, C. E., Cheong-See, F. M., Collins, G. S., Kwee, A., Thangaratinam, S., Khan, K. S., Mol, B. 
W., Pajkrt, E., Moons, K. G., & Schuit, E. (2016). Prognostic models in obstetrics: available, but far from applicable. American journal of obstetrics and gynecology, 214(1), 79–90.e36. https://doi.org/10.1016/j.ajog.2015.06.013 29. Esplin, M. S., Elovitz, M. A., Iams, J. D., Parker, C. B., Wapner, R. J., Grobman, W. A., Simhan, H. N., Wing, D. A., Haas, D. M., Silver, R. M., Hoffman, M. K., Peaceman, A. M., Caritis, S. N., Parry, S., Wadhwa, P., Foroud, T., Mercer, B. M., Hunter, S. M., Saade, G. R., Reddy, U. M., … nuMoM2b Network (2017). Predictive Accuracy of Serial 18 Transvaginal Cervical Lengths and Quantitative Vaginal Fetal Fibronectin Levels for Spontaneous Preterm Birth Among Nulliparous Women. JAMA, 317(10), 1047–1056. https://doi.org/10.1001/jama.2017.1373 30. Orzechowski, K. M., Boelig, R., Nicholas, S. S., Baxter, J., & Berghella, V. (2015). Is universal cervical length screening indicated in women with prior term birth?. American journal of obstetrics and gynecology, 212(2), 234.e1–234.e2345. https://doi.org/10.1016/j.ajog.2014.08.029 31. Brown, H. K., Speechley, K. N., Macnab, J., Natale, R., & Campbell, M. K. (2014). Neonatal morbidity associated with late preterm and early term birth: the roles of gestational age and biological determinants of preterm birth. International journal of epidemiology, 43(3), 802–814. https://doi.org/10.1093/ije/dyt251 32. Hendler, I., Goldenberg, R. L., Mercer, B. M., Iams, J. D., Meis, P. J., Moawad, A. H., MacPherson, C. A., Caritis, S. N., Miodovnik, M., Menard, K. M., Thurnau, G. R., & Sorokin, Y. (2005). The Preterm Prediction Study: association between maternal body mass index and spontaneous and indicated preterm birth. American journal of obstetrics and gynecology, 192(3), 882–886. https://doi.org/10.1016/j.ajog.2004.09.021 33. Wong, K., Tessema, G. A., Chai, K., & Pereira, G. (2022). 
Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015. Scientific reports, 12(1), 19153. https://doi.org/10.1038/s41598-022-23782-w 34. Koivu, A., & Sairanen, M. (2020). Predicting risk of stillbirth and preterm pregnancies with machine learning. Health information science and systems, 8(1), 14. https://doi.org/10.1007/s13755-020-00105-9 35. Liu, Y., Liu, J., & Shen, H. (2024). Machine learning model-based preterm birth prediction and clinical nomogram: A big retrospective cohort study. International journal of gynaecology and obstetrics: the official organ of the International Federation of Gynaecology and Obstetrics, 10.1002/ijgo.16036. Advance online publication. https://doi.org/10.1002/ijgo.16036 36. Schuch, H. S., Furtado, M., Silva, G. F. D. S., Kawachi, I., Chiavegatto Filho, A. D. P., & Elani, H. W. (2023). Fairness of Machine Learning Algorithms for Predicting Foregone Preventive Dental Care for Adults. JAMA network open, 6(11), e2341625. https://doi.org/10.1001/jamanetworkopen.2023.41625 37. Wang, R., Shi, Q., Jia, B., Zhang, W., Zhang, H., Shan, Y., Qiao, L., Chen, G., & Chen, C. (2022). Association of Preterm Singleton Birth With Fertility Treatment in the US. JAMA network open, 5(2), e2147782. https://doi.org/10.1001/jamanetworkopen.2021.47782 19 38. Sam'an, M., Farikhin, & Munsarif, M. (2025). An improved decision tree model through hyperparameter optimization using a modified gray wolf optimization for diabetes classification. Computer methods in biomechanics and biomedical engineering, 1–17. Advance online publication. https://doi.org/10.1080/10255842.2025.2460178 39. Meertens, L. J. E., van Montfort, P., Scheepers, H. C. J., van Kuijk, S. M. J., Aardenburg, R., Langenveld, J., van Dooren, I. M. A., Zwaan, I. M., Spaanderman, M. E. A., & Smits, L. J. M. (2018). 
Prediction models for the risk of spontaneous preterm birth based on maternal characteristics: a systematic review and independent external validation. Acta obstetricia et gynecologica Scandinavica, 97(8), 907–920. https://doi.org/10.1111/aogs.13358 40. Koullali, B., Oudijk, M. A., Nijman, T. A., Mol, B. W., & Pajkrt, E. (2016). Risk assessment and management to prevent preterm birth. Seminars in fetal & neonatal medicine, 21(2), 80–88. https://doi.org/10.1016/j.siny.2016.01.005 41. Hamilton, B. E., Martin, J. A., & Osterman, M. J. K. (2024). Births: Provisional data for 2023. Vital statistics rapid release (Vol. 35). Centers for Disease Control and Prevention. https://www.cdc.gov/nchs/data/vsrr/vsrr035.pdf 42. Martin, J. A., Hamilton, B. E., Osterman, M. J. K., & Driscoll, A. K. (2019). Births: Final Data for 2018. National vital statistics reports : from the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System, 68(13), 1–47. 43. Genuer, R., & Poggi, J. M. (2020). Random forests with R. Springer. https://doi.org/10.1007/978-3-030-56485-8 44. Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological methods, 14(4), 323–348. https://doi.org/10.1037/a0016973 45. Arabi Belaghi, R., Beyene, J., & McDonald, S. D. (2021). Prediction of preterm birth in nulliparous women using logistic regression and machine learning. PloS one, 16(6), e0252025. https://doi.org/10.1371/journal.pone.0252025 46. Josberger, R. E., Wu, M., & Nichols, E. L. (2019). Birth Certificate Validity and the Impact on Primary Cesarean Section Quality Measure in New York State. Journal of community health, 44(2), 222–229. https://doi.org/10.1007/s10900-018-0577-y 47. Roth, W. D. (2016). The multiple dimensions of race. Ethnic and Racial Studies, 39(8), 1310–1338. 
https://doi.org/10.1080/01419870.2016.1140793
48. Reid, C. N., Obure, R., Salemi, J. L., Ilonzo, C., Louis, J., Rubio, E., & Sappenfield, W. M. (2023). Race and Ethnicity Misclassification in Hospital Discharge Data and the Impact on Differences in Severe Maternal Morbidity Rates in Florida. International journal of environmental research and public health, 20(9), 5689. https://doi.org/10.3390/ijerph20095689

APPENDIX A: TABLES

Table 1. Independent variables

| Number | Input variables | Conceptualization of the variables |
|---|---|---|
| 1 | Hispanic | Yes, No |
| 2 | Non Hispanic American Indian/Alaska Native (AIAN) | Yes, No |
| 3 | Non Hispanic AIAN & Asian & White (AIAsW) | Yes, No |
| 4 | Non Hispanic AIAN & White (AIW) | Yes, No |
| 5 | Non Hispanic Asian | Yes, No |
| 6 | Non Hispanic Asian & Native Hawaiian and other Pacific Islander (NHOPI) (ANH) | Yes, No |
| 7 | Non Hispanic Asian & White (AW) | Yes, No |
| 8 | Non Hispanic Asian & NHOPI & White (AsNHW) | Yes, No |
| 9 | Non Hispanic Black | Yes, No |
| 10 | Non Hispanic Black & AIAN (BAI) | Yes, No |
| 11 | Non Hispanic Black & AIAN & White (BAIW) | Yes, No |
| 12 | Non Hispanic Black & Asian (BAs) | Yes, No |
| 13 | Non Hispanic Black & Asian & White (BAsW) | Yes, No |
| 14 | Non Hispanic Black & NHOPI (BNH) | Yes, No |
| 15 | Non Hispanic Black & White (BW) | Yes, No |
| 16 | Non Hispanic NHOPI | Yes, No |
| 17 | Non Hispanic NHOPI & White (NHW) | Yes, No |
| 18 | Non Hispanic White | Yes, No |
| 19 | Maternal age | (1) 18-19 years (2) 20-24 years (3) 25-29 years (4) 30-34 years (5) 35-39 years (6) 40-44 years |
| 20 | Nativity | (1) Born in the U.S. (2) Born outside the U.S. |
| 21 | Education | (1) High school-level degree (2) More than a High school-level degree |
| 22 | Cigarettes smoked before 2nd trimester | Yes, No |
| 23 | Body mass index (BMI) | (1) Underweight (2) Normal (3) Overweight |
| 24 | Pre-pregnancy diabetes | Yes, No |
| 25 | Pre-pregnancy hypertension | Yes, No |
| 26 | Insurance type | (1) Medicaid (2) Private insurance (3) Other |
| 27 | Month of 1st prenatal visit | (1) 1-month Visit (2) 2-month Visit (3) 3-month Visit (4) 4-month Visit |
| 28 | Infertility treatment used | Yes, No |

Table 2. Basic characteristics of the analytic population

| Variables | Term birth (N = 934,910), n (%) | Preterm birth (N = 97,555), n (%) |
|---|---|---|
| Age | | |
| 18 – 19 years | 79,138 (8.5) | 10,180 (10.4) |
| 20 – 24 years | 266,841 (28.5) | 28,656 (29.4) |
| 25 – 29 years | 280,532 (30) | 26,073 (26.7) |
| 30 – 34 years | 221,138 (23.7) | 21,318 (21.9) |
| 35 – 39 years | 75,123 (8) | 9,242 (9.5) |
| 40 – 44 years | 12,138 (1.3) | 2,086 (2.1) |
| Nativity | | |
| Born in the U.S. | 740,300 (79.2) | 78,473 (80.4) |
| Born outside the U.S. | 194,610 (20.8) | 19,082 (19.6) |
| Education | | |
| High school-level degree | 295,986 (31.7) | 37,228 (38.2) |
| More than a High school-level degree | 638,924 (68.3) | 60,327 (61.8) |
| Cigarettes before 2nd trimester | | |
| Yes | 54,359 (5.8) | 6,914 (7.1) |
| Body Mass Index | | |
| Underweight | 35,063 (3.8) | 4,018 (4.1) |
| Normal | 434,138 (46.4) | 40,256 (41.3) |
| Overweight | 465,709 (49.8) | 53,281 (54.6) |
| Pre-pregnancy diabetes | | |
| Yes | 6,075 (0.65) | 2,095 (2.1) |
| Pre-pregnancy hypertension | | |
| Yes | 14,818 (1.6) | 3,869 (4) |
| Payor at Delivery | | |
| Medicaid | 318,412 (34) | 39,520 (40.5) |
| Private insurance | 547,838 (59) | 51,358 (52.7) |
| Other | 68,660 (7) | 6,677 (6.8) |
| Prenatal Care Visit Timing | | |
| No visit before 16 weeks | 108,137 (11.6) | 15,303 (15.7) |
| 1-month Visit | 52,268 (5.6) | 5,855 (6) |
| 2-month Visit | 405,069 (43.3) | 37,080 (38) |
| 3-month Visit | 295,837 (31.6) | 29,722 (30.5) |
| 4-month Visit | 73,599 (7.9) | 9,595 (9.8) |
| Infertility treatment used | | |
| Yes | 20,020 (2.1) | 2,882 (3) |
| Race & Ethnicity | | |
| Hispanic | 205,153 (21.9) | 22,015 (22.6) |
| Non Hispanic AIAN | 5,514 (0.6) | 649 (0.66) |
| Non Hispanic AIAN & Asian & White (AIAsW) | 90 (0.01) | 10 (0.01) |
| Non Hispanic AIAN & White (AIW) | 3,336 (0.36) | 338 (0.3) |
| Non Hispanic Asian | 75,968 (8.1) | 6,912 (7.1) |
| Non Hispanic Asian & NHOPI (ANH) | 460 (0.05) | 60 (0.06) |
| Non Hispanic Asian & White (AW) | 5,163 (0.55) | 495 (0.5) |
| Non Hispanic Asian & NHOPI & White (AsNHW) | 524 (0.06) | 77 (0.08) |
| Non Hispanic Black | 110,928 (11.87) | 17,699 (18.1) |
| Non Hispanic Black & AIAN (BAI) | 528 (0.06) | 81 (0.08) |
| Non Hispanic Black & AIAN & White (BAIW) | 638 (0.07) | 66 (0.07) |
| Non Hispanic Black & Asian (BAs) | 624 (0.07) | 75 (0.08) |
| Non Hispanic Black & Asian & White (BAsW) | 230 (0.02) | 29 (0.03) |
| Non Hispanic Black & NHOPI (BNH) | 128 (0.01) | 17 (0.02) |
| Non Hispanic Black & White (BW) | 9,464 (1.01) | 1,042 (1.1) |
| Non Hispanic NHOPI | 1,813 (0.2) | 256 (0.26) |
| Non Hispanic NHOPI & White (NHW) | 522 (0.06) | 44 (0.05) |
| Non Hispanic White | 513,827 (55) | 47,690 (48.9) |

Values are n (% of column total).

Table 3. Performance of the models when performed on the training data, original and oversampling datasets

| Model | Measures | Original data | Oversampling |
|---|---|---|---|
| Logistic regression model 1 | Accuracy | 0.91 | 0.62 |
| Logistic regression model 1 | Sensitivity | 0.0008 | 0.50 |
| Logistic regression model 1 | Specificity | 99.9 | 0.64 |
| Logistic regression model 1 | AUC | 0.5 | 0.57 |
| Logistic regression model 1 | Balanced Accuracy | 0.5 | 0.57 |
| Logistic regression model 2 | Accuracy | 0.91 | 0.63 |
| Logistic regression model 2 | Sensitivity | 0.0002 | 0.49 |
| Logistic regression model 2 | Specificity | 99.9 | 0.65 |
| Logistic regression model 2 | AUC | 0.5 | 0.57 |
| Logistic regression model 2 | Balanced Accuracy | 0.5 | 0.57 |
| Decision Tree model | Accuracy | 0.9 | 0.62 |
| Decision Tree model | Sensitivity | 0.002 | 0.5 |
| Decision Tree model | Specificity | 99.9 | 0.63 |
| Decision Tree model | AUC | 0.5 | 0.57 |
| Decision Tree model | Balanced Accuracy | 0.5 | 0.57 |
| Random Forest model | Accuracy | 0.91 | 0.65 |
| Random Forest model | Sensitivity | 0 | 0.47 |
| Random Forest model | Specificity | 1 | 0.67 |
| Random Forest model | AUC | 0.5 | 0.57 |
| Random Forest model | Balanced Accuracy | 0.5 | 0.57 |

Table 4. Predicted probability of preterm birth for three virtual observations

All three virtual observations share the following characteristics and differ only in race/ethnicity group: more than high school education; no cigarette smoking before pregnancy; private insurance; age 30 years; born outside the US; first prenatal visit in the 3rd month; normal BMI; no diabetes or hypertension history; no infertility treatment.

| Observation race/ethnicity group | Term birth probability (%) | Preterm birth probability (%) |
|---|---|---|
| Non Hispanic Asian | 100% | 0% |
| Non Hispanic Black & Asian | 54.9% | 45.1% |
| Non Hispanic Asian & NHOPI & White | 72.6% | 27.4% |

APPENDIX B: FIGURES

Figure 1. A snapshot of the flow chart for the derivation of the analytic sample

Figure 2. Importance figures of a Random Forest model using an oversampling dataset
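
As a quick arithmetic check on Table 3, balanced accuracy is the mean of sensitivity and specificity. The Python sketch below recomputes it from the rounded oversampling values in that table. The dictionary and function names are illustrative assumptions and are not part of the thesis analysis; small differences from the reported 0.57 are expected because the inputs are already rounded.

```python
# Sensitivity and specificity copied from the oversampling column of Table 3.
# The variable and function names here are hypothetical, chosen for illustration.
table3_oversampling = {
    "Logistic regression model 1": (0.50, 0.64),
    "Logistic regression model 2": (0.49, 0.65),
    "Decision Tree model": (0.50, 0.63),
    "Random Forest model": (0.47, 0.67),
}


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Return the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# Recompute balanced accuracy for each model from the rounded table values.
balanced = {
    model: balanced_accuracy(sens, spec)
    for model, (sens, spec) in table3_oversampling.items()
}

for model, value in balanced.items():
    print(f"{model}: balanced accuracy = {value:.3f}")
```

Each recomputed value falls within rounding distance of the 0.57 reported in Table 3, which suggests the sensitivity, specificity, and balanced accuracy columns for the oversampling models are mutually consistent.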